Off the Mark

habitually probing generalist


Some things read this week, 21 - 27 October 2007

October 28th, 2007 · 9 Comments

Note: Not much read due to being at the ASIS&T Annual Meeting in Milwaukee until Wed. evening.

Wednesday, 24 Oct

Shepherd, Simon. “Concepts and architectures for next-generation information search engines.” International Journal of Information Management 27(1), Feb 2007: 3-8.

This is a short but interesting article in a copy of a journal I picked up for free at ASIS&T. While the prototype has great-sounding potential, the article is a bit too upbeat for me, e.g., “…future search engines will be able to solve the problems of both synonymy and polysemy” (3, emphasis in original).

In his description of Google PageRank he states “… due to its ability to present Web pages in a rank order that puts the pages the user is most likely to want to see at the top of the list” (3, emphasis in original). So I should trust someone who cannot get this correct? Google most certainly does not put pages in an order that the user will most likely want to see first. It puts the pages in an order that a typical user may want to see first. These are two entirely different beasts altogether! One is a real flesh-and-blood user with a real query while the other is a statistical fiction with no means whatsoever of expressing, much less having, an information need.
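To make the "typical user" point concrete, here is a minimal sketch of the textbook power-iteration form of PageRank. It is an illustration only, not Google's production ranking and not anything taken from Shepherd's article. The function name `pagerank`, the `toy_web` link graph, and the damping value of 0.85 are all assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank (illustrative sketch only).

    `links` maps each page to the list of pages it links to.
    Note what is absent from the signature: no user and no query.
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's damped score among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its score evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page web, invented for this example.
toy_web = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}

scores = pagerank(toy_web)
ordering = sorted(scores, key=scores.get, reverse=True)
print(ordering)
```

The only input is the link graph, so in this simplified form the resulting order is identical for every person who asks. Whatever a real engine layers on top, the score itself knows nothing about an individual's information need.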

“The theoretical problems for small-scale examples have been solved and the basic mathematics is understood. It remains to implement the algorithms ‘in anger’ on real databases” (5).

So scalability is not an issue at all? Perhaps he ought to read Harel (see below).

“We have achieved Latent Semantic Indexing which seeks to identify semantic links between documents even where such links are by no means obvious even to a human reader, …” (6).

I realize that the key word here is going to be “obvious,” but this statement makes absolutely no sense to me. I can parse it out in English well enough. I just find it completely meaningless unless one really waffles about their use of “by no means” and “obvious.” If a human cannot identify the semantic links then are they there? It is humans that construct meaning. Can a machine specify meanings between items when it cannot even recognize meaning in the first place?

Again, it looks interesting. I also have no doubt that it would be an improvement over Google. The idea of backlinks is intriguing also, although I have questions around what constitutes a “reference” to another document (it can also work on the local computer). But no algorithm can solve synonymy and/or polysemy! That is not how language works. Perhaps with a large enough text corpus these algorithms (if scalable?) can do an amazingly good job at addressing both of these issues. But solve them?

Thursday, 25 Oct

Harel, David. Computers Ltd.: What They Really Can’t Do. Oxford: Oxford University Press, 2000.

  • Ch. 3: Sometimes we can’t afford to do it [for LIS452]

Tonkin, Emma (2007) Signal and Noise: Social Construction and Representation. In Lussky, Joan, Eds. Proceedings 18th Workshop of the American Society for Information Science and Technology Special Interest Group in Classification Research, Milwaukee, Wisconsin. [Word doc available at DLIST]

Zelle, John M. Python Programming: An Introduction to Computer Science. Wilsonville, Or: Franklin, Beedle, 2004.

  • Ch. 13: Algorithm Design and Recursion

Friday, 26 Oct

Davis, Hayley and Talbot J. Taylor, eds. Redefining Linguistics. London: Routledge, 1990.

  • Ch. 1: Davis, Hayley G. Introduction.
  • Ch. 2: Harris, Roy. On Redefining Linguistics.

Danskin, Alan. “Tomorrow never knows”: the end of cataloguing? World Library and Information Congress: 72nd IFLA General Conference and Council, 20-24 August 2006, Seoul, Korea. [pdf] Found via Cataloging Futures. [Oops. Wrong link. Thanks, Chris!]

A much more positive view of changes needed in the cataloging arena. Lays out the current challenges to traditional cataloging and then answers the question whether cataloging is relevant in the short- to medium-term and in the long-term. Argues that cataloging is about establishing a context for each resource, despite the horrible failure of the OPAC to make use of this navigational potential.

While I agree, this is one of those areas where it is not so much the OPAC designers’ fault. Some portion of it is, of course, but more of the problem resides in our rules systems: AACR2, MARC21, etc. Have a look at Barbara Tillett’s work on bibliographic relationships and especially the following Vellucci article:

Vellucci, Sherry L. “Bibliographic relationships.” In: Weihs, Jean, ed. The Principles and future of AACR: Proceedings of the International Conference on the Principles and Future Development of AACR, Toronto, Canada, Oct. 23-25, 1997: 105-146. [RDA to FRBR mapping and they say a mapping of RDA to FRAD is due.]

But these sorts of relationships and mappings cannot be afterthoughts if they are to work as they should; they must be integral to the system from the beginning. Even if they are being added midway, that is not the same. JSC documentation says that they considered FRBR from the beginning. Perhaps. But the main problem is that FRBR (as a complete E-R model) is not complete. Both FRBR and RDA are being done piecemeal. And we are to get a coherent system from that process?

Friday - Saturday, 26-27 Oct

Davis, Hayley and Talbot J. Taylor, eds. Redefining Linguistics. London: Routledge, 1990.

  • Ch. 3: Love, Nigel. The Locus of Languages in a Redefined Linguistics.

Tags: ASIS&T Annual Meeting · ASIST · Articles · Books · Cataloging · Conferences · Language and word issues · My Life · Technology

9 responses so far ↓

  • 1 Irvin // Oct 28, 2007 at 5:59 pm

    Re the RDA-FRBR mapping table: it’s interesting (and concerning) how many RDA elements map to multiple FRBR entities or have question marks against them. In other words the data model on which RDA is built is unclear. What I think is happening is that the content rules (RDA) are being defined first and then the (abstract) data model extrapolated back from them. It would make more sense from a machine processing point of view to go the other way: nail down the data model first, then the elements, then the content rules. Of course, I understand this is not easy in practice: we have a legacy situation to deal with and the primary task is to produce a replacement for AACR.

    However, this becomes a critical issue with the plans to express RDA as a DC application profile. Without a clear data model it’s not possible to express RDA data as a DC profile and express it in RDF. Pete Johnston’s response to the RDA Scope/Structure and Element Analysis papers is really worth reading on this:
    http://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind0708&L=dc-rda&T=0&F=&S=&P=50

  • 2 Irvin // Oct 29, 2007 at 12:18 am

    The Vellucci paper is also available at http://epe.lac-bac.gc.ca/100/200/300/jsc_aacr/bib_rel/r-bibrel.pdf

  • 3 Mark // Oct 29, 2007 at 6:20 am

    Thanks for the links and comments, Irvin. I added the link in with the article.

    I agree that our legacy situation is part of the issue. A big part even. But as I see the timeline (admittedly as an outsider to the process) it appears that the decisions to harmonize, if you will, took place well after work was begun. And the fact that we’re trying to harmonize with a standard that isn’t and will not be complete before RDA is complete is just baffling (to be polite).

  • 4 Irvin // Oct 30, 2007 at 12:14 am

    Well, I’m an outsider too but I think ‘harmonize’ is probably the right way to express the current relationship, rather than FRBR truly forming the theoretical basis of RDA. But each successive draft does seem more FRBR-influenced; the difference in this respect between the new and old chapters is one of the things that make it difficult to read at the moment.

    BTW there’s an interesting post and comments on the question of bibliographic models at Karen Coyle’s blog…

    http://kcoyle.blogspot.com/2007/10/bibliographic-er.html
    …particularly the OpenLibrary and AustLit projects that are mentioned there.

  • 5 Nathan // Nov 2, 2007 at 8:15 am

    “We have achieved Latent Semantic Indexing which seeks to identify semantic links between documents even where such links are by no means obvious even to a human reader, …” (6).

    Mark, to this you responded:

    “I realize that the key word here is going to be “obvious,” but this statement makes absolutely no sense to me. I can parse it out in English well enough. I just find it completely meaningless unless one really waffles about their use of “by no means” and “obvious.” If a human cannot identify the semantic links then are they there? It is humans that construct meaning. Can a machine specify meanings between items when it cannot even recognize meaning in the first place?”

    I am trying to think this through myself. If a machine has been programmed *by a human* to specify meanings between items, why might this not be possible - at least to certain degrees (I do not believe AI will ever fully replace humans)? Further, you say, “It is humans that construct meaning”, and I understand what you are saying, but when a chemist or physicist, for example, draws conclusions from evidence in the world, does this not mean that the world “makes sense”, in some sense at least, i.e. that it itself has meaning that should not be completely separated / lobotomized from our actually being able to *recognize* (not necessarily “create” ex nihilo) it, but perhaps ought to be distinguished nonetheless? Here I think we come back to the issue of the “theoretical-substantial context” (Hjorland’s “mind-independent reality”) and the Integrationist’s emphasis of “the social structural context”. I think they must be compatible, though I am not sure how to best make this case to the wider library community as a whole (I do think it’s important if we are not to lose touch with reality).

  • 6 Some things read this week, 28 October - 3 November 2007 // Nov 3, 2007 at 6:12 pm

    [...] 7: Love, Nigel. Integrating Languages. The Love was highly similar to his other article I read last week, The Locus of Languages in a Redefined Linguistics. In fact, whole paragraphs were the same as was [...]

  • 7 Mark // Nov 4, 2007 at 4:20 pm

    Irvin, I agree that successive revisions are (seemingly) becoming better harmonized. But, as you say, this sort of revision on top of other kinds makes it very difficult to track the changes between versions.

  • 8 Mark // Nov 8, 2007 at 7:45 pm

    Hi Nathan, Sorry for the delay. Amongst my issues and time constraints of the past week I have been pondering this and composing replies. Unfortunately, things went sour so I’m not sure if I can be coherent at all, nor will I remember all the points I wanted to bring in.

    First, let me state that, as of the present anyway, I have decided that “meaning” is probably the hardest concept to define that we have. It is certainly the hardest one I am aware of.

    Second, I am well aware that it is highly polysemous. But, personally, I no longer recognize most of those meanings. “Meaning” has, in effect, come to have little (or perhaps, less) meaning for me. But the meaning that remains is critically important.

    Third, I believe that much of the problem with the varied uses of “meaning” arises from primarily metaphoric uses of the term. That is, most of the definitions of meaning are metaphoric, and parasitic on the use I make of it. Of course, that is simply my opinion and not necessarily correct in any way, subjective or objective.

    Fourth, I have no idea what other terms to use to replace the senses of meaning to which I object.

    Fifth, I cannot adequately define what “meaning” means, even to myself. See the first point.

    That all said, I will try to dance around the idea enough to perhaps give you some sense of what I take “meaning” to mean.

    Often, we (myself included) talk about finding or discovering meaning, along with other types of locational contexts, such as locating meaning. As far as I am concerned, all of this kind of talk is highly metaphorical. These sorts of verbs are not properly used with meaning in a direct way, but only metaphorically. Often they fully suffice to convey (see what I mean?) what we mean when we use them. But, often, they get in the way and lead us astray as to what meaning actually is. That said, I have no good idea what kinds of verbs can be used in a literal sense with “meaning.”

    Meaning is also used in ideas/fundamental questions such as the meaning of a life, or the meaning of the universe. As much as I love Douglas Adams’s work, I think the answer “42” is meant to show the utter incoherence of such an idea, at least phrased using the concept of meaning. Having been raised in the church, and having received a pretty good undergraduate education in philosophy (at an age to appreciate it to its fullest) I once thought I understood such questions. I most certainly no longer do. I fully understand the desire, nay demand, to ask such questions. And although I don’t think they really have answers, I do think they could be formulated better, and that they are important.

    The best I can say about meaning at the moment is that it is created (and that is not quite right) by humans. Whether apes, dolphins, extraterrestrials, etc. “make” meaning I am completely agnostic on. I would like to believe that humans are not the only creatures that make and use meaning in their lives.

    Meaning is created by the connections we make between things due to our experiences in the world, be it “external” or “internal.” We create it by making connections between things, ideas, feelings, events, other people, ourselves, …. The list is virtually inexhaustible as are the kinds of connections that we can and do make.

    That is why I do not think that computers can make or even find meaning. Sure, there is no doubt that a computer can recognize what it has been programmed to find (to be overly simplistic) but that is not the same thing. A computer can at best point to something that is meaningful, but only a human can actually “find” the meaning in it. [Dangit! I had a great citation for this point just yesterday. Damn you, mind!! Aha! Found it.]

    “Data mining techniques may discover implicit information in explicit data but these techniques do not necessarily guarantee that the discovered information is relevant, significant and trustworthy.” Uta Priss, “Alternatives to the “Semantic Web”: Multi-Strategy Knowledge Representation” in ISKO 7 /AKO 8, p. 305.

    Substitute meaning for information and, although it doesn’t exactly translate, you’ll get a glimpse of my point. It’s a lot like Swanson’s undiscovered public knowledge. Machines can “locate” and point at things which may, in fact, be meaningful (in the right context) when interpreted by a human. But the computer cannot understand, it cannot know, and thus it cannot recognize real meaning (although it may point at possible meaning) and nothing can be meaningful to it.

    Now, I do not mean anything related to intentions, as in “I meant to say such and such,” nor do I mean meaning in the sense of a dictionary’s gloss on a lemma. Even if a computer has the entire OED in memory and a good program to spit out appropriate senses in relation to words used in other programs, the computer still does not know what the words mean and, more importantly, still cannot construct meaning on a larger scale.

    I am going to stop now as I realize that I am failing miserably. The best I can say is that I am trying to discuss “meaning” in, perhaps, its deepest sense and not in any of its more common senses. It is also, in my opinion, the hardest word/idea/concept to define. It is also the case that I am almost completely hamstrung by over 2000 years of Western culture, 100s of years of Western pedagogy, 100s of years of unquestioned metalinguistic practices such as the compilation and use of dictionaries, and lay perceptions of everyday linguistic practices (which are heavily influenced by all of the preceding).

    I am unprepared, for many reasons, to currently address the rest of your comments. I will say that I think we probably agree in the gross details, and quite likely in the finer ones. But I would completely restate what you said in a different way. That is, I do think the meaning we construct needs to (usually) connect to the world in some real sense. If you are saying that only science has the answer–I do not think you are–then I disagree. I also would not accept, except in the loosest way of speaking, that the world “makes sense.” It only makes sense, that is, has meaning, to us. It makes sense in something in between but closer to the looser sense for other living entities in that they are (often) able to successfully flourish in it. But that is not the meaning to which I refer.

    I do agree that the mind independent reality needs to (often) mesh with the social structural context. But, as I imagine you have experienced, it quite often does not.

    As usual, thanks for making me think, Nathan. I still hope to get back to more of your comments from a while back. And on this topic, I have no doubt that it is a highly transitional work-in-progress. I doubt that I’ll ever have the answer but I hope to come much closer. A first hope is being able to even express my concerns such that someone else can appreciate them, whether or not they agree.

  • 9 Irvin // Nov 13, 2007 at 11:20 pm

    Re: ‘But each successive draft does seem more FRBR-influenced;’

    The RDA transformation to the FRBR model is now complete.

    “A new organization for RDA”
    http://www.collectionscanada.ca/jsc/rda-new-org.html

    As Obi-Wan might have said, “We’ve just taken the first step into a larger world”