18 thoughts on “On Assumptions about language use in tagging”

  1. I didn’t read this but mainly because academic discussion (not yours, in general) of tagging makes me hostile. I have no problem with anecdotal evidence on this subject, unless it is used to bolster a paper, in which case that is pretty bullshit. esp. since there are so many places to get data.

    honestly (though you’ll hate this) have you ever met any non-librarian who has ever tagged things more for social use than personal? i can’t think of one example. tags like “toread” etc that are obviously for personal use prove things like that to me.

    i guess part of it is i don’t understand why we are so principally concerned with WHY people tag. I don’t care why catalogers catalog. I do care if they are accurate. I would be better served on some research on how to integrate tagging with “real” cataloging, and how to get people excited about tagging to be excited about catalogs.

  2. I love anecdotes. Most of us do. My problem with them lies in their being used, explicitly or otherwise, to support assumptions and arguments when, as you say, “there are so many places to get data.” And even if you have to work for the data that’s ok, too.

    I honestly have no real idea why people tag things as they do; not even myself often. But I fully agree that most people are tagging personally, which is why I was so upset at much of the early discussion of tagging. Because some fool had labeled it “social tagging,” people assumed that it was done for social reasons. Another bad assumption forced by a certain use of language on those too unwitting to question.

    I fully agree with your last paragraph!!

    Honestly, I do my utmost to not discuss tagging. I find it a tertiary interest at best, and mainly for what it can allow me to do. But it is related to my primary interests, and when I define those interests as a whole then it fits (or should fit) squarely into them. Thus, I keep an eye on it.

    The other side of this specific coin (this post, that is) is all the reading that I have been doing on language lately, and seeing how various uses of it are constraining and shaping what people actually claim.

    Most of us are competent language users. This in NO way implies that any of us actually know jack about how it is used on a theoretical level. Now, one could argue that “Who cares? We get by you said.” Yes, I said that.

    But everything we do in LIS is premised on the use of language. And our systems are primarily built on an extremely flawed view of language. The primary view on which our systems are built is one in which the actual use of language by humans has been completely, violently, and explicitly removed.

    This reduction, especially in machines, CAN be useful. Clearly, it has been somewhat successful despite one’s view of our current systems. They may work poorly, or with training, but they work.

    The beginning and the end of everything we do in LIS (and cognate fields) is the human use of language by individuals in a specific context. We can ill afford to keep pretending as if it does not just because it is conceptually easier for us to research, to understand, and to implement information systems if we do pretend otherwise.

    There are large parts of our field that I know very little about and I try to stay out of them. When I happen to stumble into them and say something silly then I am grateful when someone (Christina, e.g.) suggests that perhaps I might have something to learn.

    I am no expert! I doubt I will ever be an expert, or, perhaps I should say, claim to be one. But I know enough about a fair few things to recognize when someone is making ridiculous claims–stated explicitly or otherwise–that all catalogers (or all librarians) are the same, that the language-using behavior of those who tag and those who don’t (as long as they aren’t librarians) is the same, etc. Especially when all they have is anecdote.

    And I am going to call “Bullshit!” as often as I can.

    If you (anyone) could get me into a booth with a pint or two, or even a 5-minute talk in a hallway at a conference, I have no doubt I could do a better job eviscerating our tools and theories as implemented (so, praxis) of “information organization” than most other librarians.

    But I have learned about these issues because I care, because I want to help make them better. I won’t say fix because there is no “fix.”

    We have a lot to learn and a lot to improve upon. We need to find a way to employ language to make our systems relevant to those who use them, and so that people will use them. And I most certainly do not mean relevant in a Google most-popular, heavily linked to sense. I mean as meeting a real need in their lives as they go about the business of creating meaning, of answering questions, of becoming informed, amazed, blindsided, educated, less lonely, or whatever else they may use them for.

    Our systems have lots of problems. They also have lots of potential. We know an awful lot about some of these things. Many of them have been unimplementable until recently, some of them have been poorly implemented. We also have an awful lot to learn. And I will not idly stand by while people use anecdote–in a blog post or a formal paper–to put forth bad, or at least unfounded, arguments using assumptions that are, quite possibly, specious.

    If I can keep trying to educate myself then so can others. They can put down the damn latest pop culture phenom for a few minutes and go learn a little about what we do know and what we don’t know. And more importantly, what both those are founded on. They might be surprised.

    Me. I’m surprised we manage to do anywhere near as good as we do, in all truth. It absolutely makes me slack-jawed, drooling in amazement. But that’s just me.

    And now I’ll shut up before I really write something I might regret.

    We’ve had our discussions, Jenny, about reading the literature. Neither of us are going to change our minds, and I think we know we agree a lot, too. I’m not saying that anyone has to try and educate themselves about these issues.

    All I can really hope for is that when they do talk about them using anecdote, faulty assumptions, or by being “forced” to make silly claims due to a faulty theory of language that they are open to me, or someone else, saying “Hold on just a second here.” [*Very* misplaced hope, I know.]

    I’ll probably be doing a LOT of that for a while as I piece together my own views and ideas on these things. This problem space consists of several highly complex topics that are all interrelated, and much of our field and our society is based on faulty understandings of them. I doubt I’ll ever really have the grasp I want on them, but if I can work towards one myself, and show a few others some of the problems in the meantime so that they can add their thinking, if they so desire, then perhaps a little progress can be made.

    But honestly, I do not want anyone who *assumes* that taggers and non-tagging tag users are the same, but different from librarians, having *anything* to do with the design of my information systems. But that’s just me, too.

  3. I am fairly certain that the vast majority of catalogers do not think in subject headings, either.

    Ummm…I must be in that small minority…NOT that I have the whole thing memorized, of course. Probably, the more accurate summarization is that I often think in MARC subfields since I don’t know all of LCSH by heart.

    But my tags on my LiveJournal account are not LCSH compliant if only because they are personal. I’m converting my LibraryThing tags to LCSH, though, because I consider it an “official” catalogue, even though it’s my personal collection. I make a distinction between the two. Personally, I’m all for public tagging in the library catalogue, but only in addition to LCSH. I think patrons should be able to log into their account and tag at will. It would be up to the library to decide if those tags are visible to other patrons and if those tags are also searchable. I like the notion of visible public tagging; I like the thought of a library catalog that has a search field for LCSH and for public tags; I also like the idea of something linking particular LCSHs with particular public tags if the system can detect that many people are using the same tag. It would make an interesting visual display for sure.

    But I don’t know how to create such a thing, let alone implement it. Maybe it’s already being done at Amazon…

  4. Sure wish I had left those first 2 points out. It is not that I consider them unimportant; they are important. But my main concern is the one that so far people are choosing not to respond to. Plus, if I had left them out the post would be a bit more focused and/or tighter.

    OK, Jenn. I could certainly be wrong. No question there. And on one use of the sentence “I think in subject headings” I have to agree. But, in reality, I think this is a loose use of language. We all talk like this much of the time, and it often suffices for what we mean.

    The reason that I deny that this sentence, except in the loose sense, is meaningful is that we have no idea what we think “in.” Many things have been postulated over the last several millennia. Is it pure logic? Is it some kind of language of mind? Is it whatever natural language that one speaks? Is it in bare concepts? We (science) do not know. All we have is our individual experiential feeling of what it seems we think in. [I don’t think we ever will know either. No matter the level of brain functioning we are able to fathom and describe. But this is a vastly different conversation.]

    An analogy can be made to an adult learning to be proficient at a second natural language. At some point, they start thinking “in” the second language, and not in their primary language with a subsequent translation into the second. It is in something like this sense that we can be said to think in subject headings.

    But then this can be said of any discipline, argot, 2nd language, etc. Doctors think in, what, symptoms, diseases? Chemists think in molecular structures? Structural engineers think in stresses and forces? Programmers think in C++ (or whichever language). OK. But this is basically meaningless in an analytical sense.

    When I say we do not think in subject headings this is what I mean. Sure, we become adept at doing subject analysis, recalling specific subject headings, moving into a way of thinking that uses subject headings far more proficiently than the non-librarian, etc. But this in no, truly meaningful, way involves us thinking “in” subject headings.

    Again, there is a sense in which one can be said to think in subject headings, or most anything else. Just as a book or music can be said to be content. But I maintain that these uses are, while on one level meaningful, very loose and useful only in general conversation. But when it is time to speak of these things in an analytical mode then we need to use these concepts and our language in a more formal way.

    Now many people may think they aren’t doing analysis, and they aren’t. But they often should be. If you want to sit around and talk about what is wrong with our information retrieval systems (or any other part of our discipline), or the differences between “professionals” and users, and you want change, and particularly if you want improvement then you best be doing analysis. Otherwise you are on the dangerous brink of making things worse.

    Now, there are certainly levels of analysis. I am not advocating that every discussion needs to start out by completely operationalizing its definitions (many cannot be!), or that everything needs to be sketched out in a formal logic (shudder!), or ….

    But far too many people are throwing around words/concepts that they either know little about, think they know, know for themselves and have no understanding of what they mean to others (often something different), etc. (or some combination of these). And the beauty of it all is that our discipline is faced with a plethora of these words/concepts and, dare I say it, a plethora of these people.

    Just like research, meaningful dialogue and meaningful thinking and analysis is hard. Hard is hard, but often well worth the effort.

    I do admit that there is a meaningful way in which librarians’ facility with subject headings is vastly different from our users’, and that this difference is important. But, to me, this difference can only be made meaningful by something a little more formal than “I think in subject headings.” Because that is silly in this context. It is shorthand. It is loose. It is lazy (in this context). IMO.

    As I said, I could be wrong. But until someone can explain to me how that simple statement is supposed to be meaningful in a useful way that makes a difference, one that doesn’t come to the same thing as I am trying to say, then I feel my parsing is important.

    I am not trying to be pedantic or split semantic hairs here. I just feel that it is often important to be careful about how we use our words/concepts. And much of the reading that I am doing lately–in several areas of our discipline–is convincing me that much of our current problems reside in exactly this failure.

  5. Pingback: Book, music, communication, content, social

  6. Ooh Mark! I didn’t mean my comment negatively. I don’t mind discussion of tagging in this way (in fact I find it interesting, obviously) and while it may be a tertiary interest of yours it is a primary interest of mine. Both because I am personally interested and because it could be a job security issue, you know. 🙂 (I don’t believe that but some people in our profession do.)

    I think if people are really interested in why people tag, it’s fine with me that they pursue that. I just go to every tagging session at every conference I go to and it’s ALWAYS this question. It’s like going to every cataloging session and it being about why catalogers catalog–I can’t imagine anyone accepting that as an academic subject for discussion at a conference for EVERY panel.

    I usually find too, that almost everyone giving these lectures is someone who 1. is directly involved with the making of this kind of software or 2. wouldn’t tag if their life depended on it. I just wish that in the discourse I didn’t feel like my day to day online practices were the specimen in library literature. Like it’s SO WEIRD and foreign to these authors that people tag.

    I mainly agree with you, though I think the solution is that librarians actually become like the users. Maybe if more librarians had to use libraries, library services, and online retrieval like our users did the literature would be a little different.

    Now I will go write a little on my totally anecdotal conference presentation! 🙂

  7. Jenny, I hope I didn’t misunderstand you (again), but maybe I did a little. Why is it we have such a hard time understanding each other electronically?

    Anyway … most of what I wrote in response to your comment was not directed at you. I just took off on a related tangent that was festering.

    I guess I should clarify “tertiary interest.” I have so many freaking interests that things highly relevant and/or interesting to me easily get moved back to tertiary. Tagging is one of those. It is highly relevant to my interests but it is a very young research area, which makes good, solid research hard(er) to find. So besides the (temporarily) more important things occupying the foreground, the lack of material on the topic also moves it down the list.

    Also, being a young area means that lots of writing on the topic is/was on blogs. Not to say that there is no thoughtful work on blogs, because there is, but much of it will be purely anecdotal and based on a lot of unexplicated assumptions.

    I definitely agree about the (at least seemingly) main gist of conference presentations on tagging. I go to quite a few, too, and I know you’ve been to more of them. That does seem to be THE question.

    And, yes, I think many librarians have little view of our systems as users. Most of their interaction (in some cases, all) is as the intermediary or the constructor/maintainer of the system so that whenever they are using it they are evaluating it from one of those perspectives.

    In many ways (perhaps this makes me a “bad” librarian), most of my interaction with our systems (except much of my cataloging) is as a user. I use the hell out of some of our systems, and they drive me nuts!

    Even as a cataloger–or esp. as a cataloger–using the online catalog to check a record I have just input is a freakin’ nightmare.

    Back to being a user, we have a library browser toolbar here (as you probably know). I have it installed in Firefox on the PC but not the Mac (small screen real estate). But I rarely use it. There are some drop-downs to choose things like the Lib Gateway or Hours or Dept. Libs or Reserves, etc. and Catalog or Databases or Journals, etc.

    But there is no way to select the sort of search you want to run within a “catalog.” I’m guessing the default catalog search is a keyword search. Maybe these are the default searches that most of our patrons run, but I’d really, really like to see some serious empirical data for these decisions. And I’d love to get my hands on the damn relevance ranking algorithms and damn the committee who “agreed” to them. It drives me insane when I drop in a definite title–especially a short one with only meaningful “keywords” in it–and get 1000s of freakin’ hits for a known item, which is usually nowhere near the top. For Christ’s sake! How much more relevant can a two- or three-word phrase be than to match a title exactly?

    Please, people. Show me the quality evidence that proves I’m a weirdo. I know I’m weird. But I simply cannot imagine how a direct match on a title with only meaningful keywords in it does not bring that to the #1 position in a list.

    I seriously do not think anyone will ever be able to explain the sense of that to me. Certainly they could explain the algorithm and I would understand why it fails to do what I expect in this case. But I doubt they will be able to rationalize (to me) the reason it is coded that way.
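
    To make the expectation concrete, here is a minimal sketch of a ranking rule in which an exact title match always outranks a phrase match within a title, which in turn outranks keyword-anywhere overlap. It is not how any real catalog ranks results; the record layout (`title`, `other`), the function names, and the score values are all assumptions made up for illustration.

```python
import re


def normalize(text):
    """Lowercase and keep only letter/digit runs, joined by single spaces."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def score(query, record):
    """Hypothetical scoring: exact title > phrase in title > keyword overlap."""
    q = normalize(query)
    title = normalize(record["title"])
    if not q:
        return 0.0
    if q == title:
        return 3.0  # the known item: title matches the query exactly
    if f" {q} " in f" {title} ":
        return 2.0  # query appears as a whole-word phrase inside the title
    # Fall back to "keyword anywhere": share of query terms found in the record.
    words = set((title + " " + normalize(record.get("other", ""))).split())
    terms = q.split()
    return sum(term in words for term in terms) / len(terms)


def rank(query, records):
    """Return records ordered best-first; ties keep their original order."""
    return sorted(records, key=lambda r: score(query, r), reverse=True)
```

    With a rule like this, a two- or three-word query that matches a title exactly cannot be pushed down the list by records that merely contain the same words somewhere.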

    Now, I really do not mean to disparage the librarians who make these sorts of decisions. I really don’t, even if I often curse them under my breath. 😉

    But research–if it can be believed and seeing as it comes from many disciplines I think maybe we can–shows that most users will make the least effort. So if the default is a keyword anywhere search then, clearly, analyzing your search logs will show that users use KW searches primarily.

    Change that default and some interesting things happen. Change your displays and search capabilities and even “stranger” things happen as some of the early reports coming out of catalogs powered by Endeca are showing.

    And–perhaps it’s Voyager’s problem–but why the heck can I not limit by library, or anything else, on the Quick Search page if I want to search by Call Number? Is it that dang odd that I might want to shelf browse a specific location?

    And, yes, there are ways around that. I know. But isn’t it all the rage nowadays to not make the user think? Don’t make me game the system to make it do something it ought to. I’m already thinking about how to get at what I want. I shouldn’t be forced to think of another way of doing it.

    I also know I can go to the Advanced Search and implement my limit and then search by Classification. [Notice the change from Call Number to Classification. ??] But run one of those (without any limits, especially, as an exercise) and have a look at the results.

    It is taking the class no. from both the 082 in the Bib and from the MFHD and then ordering them alphabetically! Alphabetically. WTF? Who dreams this crap up? So since the call no. is being pulled from the 082 and it may well not be the assigned call no. you get something like the following on a search for 401.9 (remember, titles are alpha):

    401.9 OC3a
    401.9 H838A
    616.855 L26
    401.9 C12A
    401.9 M23A
    400 J26

    I cannot even begin to fathom how this can even begin to be considered useful. I can at least sort of try to understand the KW anywhere thing, but this? Completely baffling!

    I’m glad it’s pulling the records with the Class no. on the bib but shouldn’t those be ranked at the end of the list of items actually classed in 401.9? And what is with the alpha for Christ’s sake.
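
    The ordering being asked for can be sketched in a few lines: put the items actually classed in the number searched first, in call number order, and put the items that only matched on the bib’s class number at the end. This is only an illustration using the six call numbers listed above; the function names are made up, the match on class number is exact, and the comparison is a simplification of real shelf-list filing rules (class number compared as a decimal, the rest compared as an uppercased string).

```python
from decimal import Decimal


def shelf_key(call_no):
    """Sort key: Dewey class number as a decimal, then the cutter as text."""
    class_no, _, cutter = call_no.partition(" ")
    return (Decimal(class_no), cutter.upper())


def order_results(call_numbers, wanted_class):
    """Items classed in wanted_class first (shelf order), all others after."""
    classed_here = [c for c in call_numbers if c.split()[0] == wanted_class]
    elsewhere = [c for c in call_numbers if c.split()[0] != wanted_class]
    return sorted(classed_here, key=shelf_key) + sorted(elsewhere, key=shelf_key)


# The result list from the search above, in the order the catalog returned it.
results = [
    "401.9 OC3a",
    "401.9 H838A",
    "616.855 L26",
    "401.9 C12A",
    "401.9 M23A",
    "400 J26",
]

for call_no in order_results(results, "401.9"):
    print(call_no)
```

    Run on that list, the four 401.9 items come out together in cutter order (C12A, H838A, M23A, OC3a), followed by 400 J26 and 616.855 L26.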

    So to achieve my desire to “shelf browse” I am forced to otherwise retrieve an item with that number and then click the call no. to get into the shelf browse function. I truly do not get it.

    And, since it works this way, I have no way that I can discover of shelf browsing a specific library. Because if I use the Quick Search with a library limit, find my item and then click on the Call # to get shelf browse I am dumped into the full catalog. My limit has been removed.


    By the way, I can fully defend wanting to do this search as a user, but esp. as a cataloger, who, by the way, is a seriously heavy user of the catalog. You know, just in case anyone forgot to consider that.

    Again, Jenny, most not in direct response to you. 🙂

    Hey now! If I had the money I’d definitely be at that “totally anecdotal conference presentation” of yours!! Of course, I might have to ask a question at the end about its being based only on anecdote. 🙂 OK, teasing on that one. Seriously, though, I am sorry I’m going to miss it. You’ll be awesome!

  8. RE: toolbar I think all digital project development in libraries is a struggle between librarians who want to meet our users halfway and people who think everything digital is evil. I think this hinders those projects in the points you brought up. I always thought that toolbar was WAY more useful if you WORKED at the library than if you were a student there. It comes out of that “users only want one box, like google.” which was the gimmick at conference 2 yrs ago.

    I think the fact that you USE these services more than most librarians is exactly WHY you find issues with them and they don’t. I think that makes you a good librarian!

    I would point out 1. most people don’t search by call number. That’s a working-a-reference-desk-in-the-math-library statistical fact. 2. Everything recall related to MFHD in that library is sort of *&$%^$ed because of the way the data came into the database and 3. supporting an advanced user who DOES want to shelf browse and someone who does just want the one google search box is really difficult.

    I think the problem you’re seeing is that libraries are hampered more than say, web apps, by really structured data and lots of it. And bureaucracy. Moving to more useful systems will always screw some of the data when you have 20 million records.

    I will most definitely send you slides, although I come from the “as little on the slide as possible” school of presentation. It’s pricey to fly to Monterey, so I hear you!

  9. Hey Jenny. I do agree that not many users probably search by call # or class #, nor do many (perhaps more?) want to shelf browse.

    I wonder if those who do (or would if they knew how or understood what it is) would prefer to do so in the entire catalog or by location (primarily)? Both options should be (easily) available, though.

    Nonetheless, unless we have completely abandoned ourselves to the (imho, dumbass) memes of one search box and research equals a keyword or 2 dumped in that box, then give me my darn searches already. It simply cannot be that hard.

    I also don’t get the difference between the drop-down options from the Gateway to those on the Quick Search page. Since supposedly “everyone” will use the default anyway, then what is the point of limiting what’s in the drop-down on the Gateway page. Put the damn Call # option in there already for the utter oddball such as myself who will use the drop-down.

    When I end up with carpal tunnel from the 2 extra clicks I have to make every time I do my job I am coming after the knuckleheads who made that decision.

    Purely anecdotal on my part I know, but I maintain that catalogers are probably the heaviest users of the catalog (as individuals, not in aggregate). Sure, “users” come first, but can we make a concession on our own behalf?

    Yes, I have the client to work in, but I also need to use the OPAC and am SUPPOSED to check every record I add in the OPAC also. Call # is frequently the most convenient thing to use due to workflow reasons.

    Clearly, librarians are all of one mind. Maybe if I could learn to think in “OPAC record display” then I wouldn’t even need to look them up.


    I’m working hard on the “as little as possible” school of slide design, but I’m not quite there yet. I have learned to spread it out across multiple slides, though.

  10. As one of the organizers of Library Camp NYC and an attendee of the session in question, I’d like to thank you, actually.

    That the session provoked further ongoing conversation outside Library Camp is profoundly gratifying – I just wish others who attended the session would jump in as well, so we could all learn.

  11. Welcome Steven, and thank you! Your comment is very gratifying.

    One never knows how to, or even whether they should, comment on a blog report of something they were not in attendance at. It can quickly bite you in the nether regions.

    I can only hope that I was clear enough that I realized that was a danger, and made explicit that I was not in attendance, and that my comments were addressed at what I took from the report and not from the event itself.

    Again, thanks!

  12. I’m still working through the discussion here, but a quick observation:

    There was a notable tag-related anxiety in the room: Namely, that none of a particular library’s patrons would tag, given the opportunity. One suggestion in response to this was the use of tag databases (like LibraryThing). A counterpoint raised by one of the catalogers in attendance was that, in their collection, LibraryThing tags seemed to basically be exactly the same as the cataloged subject information (that is, cut-and-pasted content). This was the context for the last point you quoted, which itself was a quick summary (and I am greatly indebted to the note-takers of the talk) of several different conversations happening at once.

    I recognize your position with regard to Nicole’s original post, but I think the above situation has some (ahem) anecdotal value of its own.

    After some sleep (and a more thorough perusal of what followed in the comments), I’ll try to come back. Very interesting read!

  13. Hi Benjamin and welcome.

    I see a lot of tag-related anxiety myself in various places, esp. amongst catalogers. Not sure what drives it, except perhaps some of the early talk about tags replacing subject headings. And then there’s the Calhoun Report….

    As for who will tag, how often, in what situations, etc. I have no idea. Those sorts of social questions are definitely outside my areas.

    The point about LT tags being the same as subject headings may well be valid; for some libraries and at this point. But that is a fairly simple empirical issue to determine.
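
    As a rough illustration of how simple that check could be, the sketch below computes, for one record, the share of its distinct tags that also appear among its subject headings or their “--” subdivisions. It assumes the tags and headings for a record are already available as plain lists of strings; the function names and the normalization are made up for this example, and nothing here reflects how LibraryThing or any catalog exposes its data.

```python
import re


def normalize(term):
    """Lowercase and keep only letter/digit runs, joined by single spaces."""
    return " ".join(re.findall(r"[a-z0-9]+", term.lower()))


def tag_overlap(tags, headings):
    """Share of distinct tags that match a heading or one of its subdivisions."""
    tag_set = {normalize(t) for t in tags} - {""}
    heading_set = set()
    for heading in headings:
        heading_set.add(normalize(heading))
        # Also compare against each piece of a subdivided heading.
        for part in heading.split("--"):
            heading_set.add(normalize(part))
    if not tag_set:
        return 0.0
    return len(tag_set & heading_set) / len(tag_set)
```

    Averaged over a sample of records, a figure near 1.0 would support the “cut-and-pasted” observation and a much lower one would not.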

    No doubt there is a lot of anecdote in this, esp. in my comments, many of which wandered fairly “off topic,” which is fine by me–since it was mainly me doing the wandering.

    Even my main theses are mostly conjecture. I said pretty much the same but with different words. I cited no real empirical data and used at least one pure anecdote. As I said in the comments above, I am not against anecdote properly used.

    I also spent much of my time arguing the opposite for “effect,” if you will. My point about me using the card catalog at 5 is pure anecdote, but I hope it makes one reconsider that “only librarians get subject headings” line. Me, I think there is value in realizing that we are more proficient in their use and just what does that mean for our patrons’ use of our systems.

    Similarly, I can find a nugget of insight or value in everything I criticized. I can only hope that others who may subscribe to these beliefs and assumptions can do the same.

  14. catching up but yes it was so irritating to me when i worked there that you couldn’t look up call number easily. esp. at somewhere like math, where I KNEW call number ranges.

Comments are closed.