habitually probing generalist


Blogs as a means of preservation selection for the WWW

December 19th, 2005 · No Comments

I recently read an article pointed to by Jeffrey Pomerantz at his blog PomeRantz:

Gouge, Marianne K. "Blogs as a Means of Preservation Selection for the World Wide Web."

I referenced this article in my comments on the article for our 1st Virtual Journal Club discussion.  Here is what I said:

I read another article that addresses and provides an answer to this question.  It is interesting, although badly edited, but it is ultimately, in my humble opinion, wrong.  [Gouge, Marianne K. "Blogs as a Means of Preservation Selection for the World Wide Web."]  The author [Gouge] relies on blog aggregators and a set of archival
criteria to determine what is of quality and should be saved.  What this boils down to is another instantiation of Google page ranking.  What the masses like (point at) determines quality.  This is such utter nonsense that I am continually amazed at the large numbers of intelligent people taken in by it.  The best that this sort of ranking can give us is what is of a certain kind of "quality" within pop culture.  That is, what is the most popular.  But to equate the popular with quality is just…well, let’s just call me elitist on this and move on.  I will probably write about the Gouge article on my blog seeing as I have now read its 40 pages twice.  Worth a read.  Just wrong, although it is a start and may be part of the answer.

But the fact remains that there is much of quality that has nothing
to do with pop culture.  And then there is the pop culture that is of
quality but not massively popular.  How do we identify those?

I intend to make a further critique of this paper here.  What I want folks to understand is that I consider this paper to be important, despite my negative critique.  The author is only using what is already a widely accepted method of measurement of quality and relevance on the web to determine if this method might also be useful in assisting in archival collection development decisions regarding blogs.

While I will be critiquing the specific paper, I will mostly be critiquing the underlying ideas.  I do consider her use of these metrics to be a reasonable one in that they are already widely accepted.  My goal is to show up the flaws in the assumptions underlying this method of determining relevance and quality.  I also understand that, pragmatically, we may not have much choice but to use these sorts of measurements.  That is, there may be no better practical way to accomplish some of the uses these sorts of measurements are put to.  My hope is that as our technologies improve we may achieve a more useful means of having computers assist humans in making decisions about relevance and quality.  I also hope that the present method(s) of primarily measuring linking do not keep researchers from attempting other means of recognizing quality and relevance.

I think the verb ‘recognizing’ is a much better one than ‘measuring’ in this arena.  First, one must be able to recognize these qualities of relevance and quality before one can measure them.  While quality may be inherent in something, that is still a completely unresolved issue in aesthetics.  Measuring quality seems to me like a completely futile effort.  Certainly, we can compare two things to each other and possibly rank one higher than the other on a personal scale of quality.  But to do this routinely, and for everything in a certain area is simply not possible.  I am often asked about my favorite band or artist, movie, color, and so on.  Now, I will not deny that I had these when I was younger; they also changed frequently.  And while I can certainly give you a (changing) list of my favorite artists, how can I even begin to rank Ani DiFranco vs. Ella Fitzgerald vs. Mark Knopfler?  Or rank American Beauty vs. Grave of the Fireflies vs. True Stories vs. Lola rennt?  They are all completely different in that they have different qualities and different relevances to me and my life.  And these both shift over time.

but then what kind of scale 
compares the weight of two beauties
the gravity of duties
or the ground speed of joy?
tell me what kind of gauge
can quantify elation?
what kind of equation
could i possibly employ?

school night
¤ Ani DiFranco ¤ reckoning

Relevance is certainly not inherent in anything.  There are far too many variables in determining relevance to someone or a situation.  And the research that shows that relevance and the question itself change while one is undergoing the information seeking process should be enough to convince us that measuring relevance, particularly for others, is a completely useless undertaking.  Now, I agree that we can attempt to say what might be of relevance to a particular person in a particular situation.  But that is rarely what you see claimed.  The claim is that they are measuring relevance, often for large groups of people at the same time.

When returning results for a search, Google combines PageRank (our measure of a page’s importance) with sophisticated text-matching techniques to display pages that are both important and relevant to each search. Google counts the number of votes a page receives to determine its PageRank, interpreting a link from page A to page B as a vote by page A for page B. Votes cast by pages that are themselves "important" weigh more heavily and help to make other pages "important." Please note that ranking of sites in our search results is completely automated, and we do not manually assign keywords to sites. [Google Help Center from search on PageRank.  Emphasis mine.]
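To make the vote-counting in that description concrete, here is a minimal sketch of the textbook, simplified form of PageRank.  It is not Google's production system.  The damping factor of 0.85, the iteration count, and every page name in the toy graph are assumptions made purely for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of rank.

    `links` maps each page to the list of pages it links to.
    The damping value and iteration count are illustrative assumptions.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the same small baseline.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A link is a "vote": the page splits its rank among its targets.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical toy web: three fans and a hub all point at one popular page,
# while nobody links to the niche essay, whatever its merits.
links = {
    "popular": ["hub"],
    "hub": ["popular"],
    "fan1": ["popular"],
    "fan2": ["popular"],
    "fan3": ["popular"],
    "niche_essay": [],
}
ranks = pagerank(links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page:12s} {score:.3f}")
```

In this toy graph the page everyone points at comes out on top, and the unlinked essay ends up with the same score as any other page nobody links to.  Nothing in the calculation ever looks at what a page actually says.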

Clearly, Google is claiming that they are measuring importance and relevance to each search.  But if I do a search on "bass" how are they going to handle that?  And, yes, I agree I could do a far better search by excluding terms, such as "fish" or "guitar" or "drum" to narrow my results.  But, then I only have to look at studies of user search strategies to know how rarely people do so.  Don’t pull out your librarian’s bag of tricks on me.  We’re talking about normal people here.

Anyway, I’ll cover this and more.  On to the article.  Note: There are several sections of the paper which I will not comment on.

I don’t know if the copy available at UNC’s SILS Electronic Theses and Dissertations is a draft copy or what, but it is very poorly edited.  Now I’ll admit, I have gone back and re-read some of my previous papers and been horrified to find typos and a few sentences that don’t make sense.  I’m sure we all have.  And this paper is highly readable.  I don’t believe that there was a single sentence I couldn’t make sense of.  It is mostly punctuation (not a strength of mine either) and spelling errors or typos.  A lot of them.

The purpose of the paper is to determine if various blog aggregators can assist in the determination of relevance and the archival quality of blogs to aid in archival collection development decisions.  It looks at three aggregators, Daypop, Blogdex, and BlogPulse.  There is also a cross-aggregator comparison to see which is best at indexing material that meets the criteria the author has selected for determining archival quality.

The method tested by the author is intended to be a third way between the "save everything" approach and selective decision making.  I wholeheartedly agree that we need a third way, or even better a plethora of third ways.  This is a useful attempt at finding an intermediate way.

Defining Credibility on the Web

As the author acknowledges, "Assessing credibility on the web is a difficult task" (14).  Some of the means used to judge credibility include: usefulness, relevance, interestingness, timeliness, comprehensiveness, "visual quality and design, URL domain name, date of material, and personal judgment if the source appears to be authoritative" (14).  She cites a Stanford study that shows that half of web users primarily determine credibility based on visual design.  Clearly a better method of determining credibility is needed.  User education and perhaps technology-driven solutions may help.  The problem in the end is that credibility, just as beauty or relevance, is completely in the eye of the beholder.

A quick thought that I’d like you to keep in the back of your mind is this:  What impact does this use of visual design to determine credibility have on what is linked to?  If over half of web users are using primarily visual design elements to determine what is credible, what does this imply for any system measuring relevance and/or quality based on links?

‘Distributed credibility’ is another concept discussed by the author.  This is when an online community displaces individual judgment by pooling intelligence and expertise to cross-check one another resulting in a so-called ‘collective intelligence’ (16).  The author (Gouge) claims that, "It is from this distributed credibility that the blogging community provides a means of selection" (16).

[The citation for 'distributed credibility' is Burbules, N.C. (2001). "Paradoxes of the Web: The Ethical Dimensions of Credibility." Library Trends, 49(3), 441-453.  Unfortunately I do not have this issue, but I'm going to have to have a look at this article in its entirety as I only believe in distributed credibility to a point.  Speaking of credibility, I have to admit that this article being in Library Trends gives it a lot more credibility for me than many other places it could've been.]

This distributed credibility though boils down to another form of the wisdom of crowds.  Karen Schneider of Free Range Librarian also questions this, although admittedly in a different context:

I’m cautious about invoking the wisdom of crowds–you mean the same people who reelected George Bush?–but to see why a person gave a recipe on Epicurious a particular rating is helpful to me in selecting recipe. (Knowing when and how someone isn’t wise is useful, too, like the people who complain that a macaroon recipe doesn’t work and make it clear that they were using marzipan and not almond paste.) Plus the ratings become part of the metadata.

The problem with the form of distributed credibility of the blogging community is that there is little of the transparency that Karen rightly claims is necessary.  "What," you ask?  "Certainly there is credibility as we link to each other and, often even, state why," you vehemently claim.  "Just look at the biblioblogosphere!"  Well, yes.  You are correct.  But only because we already have a good understanding of who is credible in this community.  Outsiders have no idea.  And some link somewhere in a search result cannot give anyone that knowledge of credibility that we in the community already have.

Then there are the issues of interest (relevance), gatekeeping, and so on.  I have already discussed some of this in response to some questions by Walt Crawford earlier this year.  But the simple point here is that with regard to the top 4 or 5 library bloggers, who I know are extremely credible individuals, much of what they write is of little relevance to me.  But these kinds of measurement would lead one to the conclusion that only what they and a few more say is of importance.  And that is complete and utter nonsense.  I want to make very clear here that I do not mean to imply that these folks think or feel this way.  There are very few and highly limited domains in which we can individually be good judges of credibility.  We do need help.  But mechanical determination is not the be-all and end-all of a solution.  It is a necessary evil and will improve over time.  But we must keep in mind that it is ultimately a human judgment that is required before we all become convinced that what Google (or any similar ranking system) returns is relevant, credible and of quality.

Applying Credibility to the Blogging Community

This section talks about gaining social capital by people linking to you.  And while there is a small bit of usefulness in this concept, it is in reality not much more than a popularity contest.  Blogrolling is mentioned as a similar phenomenon.  "From the blogroll a pattern emerges. The most popular and authoritative blogs list each other" (17).  I’m imagining the author has spent little time looking at political blogs or at feminist bloggers’ complaints about the "man’s world" of much of the blogosphere.  Considering that she uses the phrase "most popular" she is somewhat off the hook.  But to put "most popular and authoritative" in the same sentence, as related concepts no less, is beyond the pale.  Only someone’s agent or Jimmy Wales could make such a claim with a straight face.

"If someone posts erroneous, [sic] information the author’s mistake will quickly be brought to his/her attention, therefore keeping the information provided within the blog accurate and credible" (18).  Maybe this is so in a limited selection of professional areas.  In broader, more subjective, areas this results in flame wars and endless debate.  I guess I shouldn’t say "more subjective."  Even librarianship is highly subjective.  We have no agreed-upon theory of librarianship, and while we do have various professional statements of principle, it is extremely clear that there is much latitude in interpreting these, much less subscribing to them in the first place.

As an example, I have been following the Chief Illiniwek conflict in various venues since before coming to UIUC, but especially since.  I have become so completely confused by all the various "facts" about the situation from the multiple sides in this issue (there are at least 3) that I am left with an opinion on the matter based solely on the ethics of this fight wasting so much of the time and resources of people whose goal should be the education of themselves and others.  And while that may be an acceptable basis on which to form an opinion, knowing the "facts" about the symbol vs. mascot issue, the dance, the outfit, the opinion of folks of Native Illinoisan descent, etc. should all count for something in making one’s decision.  But credibility, quality, facts and so on, do not rise to the top.  It simply ends up being like liberal trolls on conservative blogs and vice versa.  To tie this directly back to blogs and the claims made for them, much of my information comes from the internal electronic bulletin boards of my library school.  Now, we are all librarians or library students; surely we would do our research and only cite facts and, even if not, the credible and authoritative will rise to the top.  Guess again folks.

Archival Criteria

In this section, the author discusses a few different statements and criteria for determining archival value.  She also makes several statements about various difficulties and writes them off as too difficult.  These are determining "value for research" and "intellectual quality" (22).  But, if we are going to write both of those off, then why are we even undertaking any of this?  What are we archiving, and why?  Then there is this odd connection made to ephemera which is somehow supposed to make the value determination easier.  And while I agree that much about blogs is ephemeral, I don’t see how this is supposed to help.

In the end, the author settles on 5 main criteria:

  • A. Is the Information Original or Unique?
  • B. Is the Information Documenting Issues of Current Social or Political Interest?
  • C. Does the Information have Wider Application, a Resonance beyond the Local Community?
  • D. Is the Information Exemplar of Its Type for Reasons of Design, Language, Topic, or Origin?
  • E. Does the Information Advance the Development of a Meaningful Organic Collection? (24-25)

My biggest issues are with Criteria B and C. 

Criterion C boils down to this: the information should be of interest outside of the blogosphere.  It can still be very narrow in its audience, say knitting.  OK, I agree.  But what is there that doesn’t meet this criterion?  Possibly 10% at most.  And even the most intense internal blogosphere navel-gazing might be of interest to some on "the outside."  I maintain that this criterion is practically meaningless.  Then there is the issue of why it would even matter if we accepted it.  If the goal is to archive some portion of the blogosphere for some future purpose, it is very likely that a blog’s current relevance to those on "the outside" is totally irrelevant to any archival decision.  I simply fail to see how this criterion can be relevant to any decision making, except in the rarest of cases.  Which means it fails as a primary criterion in my book.

Criterion B is more dangerous, on several levels.  While C can’t exclude much, B will exclude a large amount of high quality, credible, and relevant material.  B is set, in this study and in one aggregator in particular, to a 30-day cut-off.  If it isn’t about something that happened in the last 30 days it fails to meet this criterion.
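A minimal sketch of such a cut-off, as I read it, shows how bluntly it behaves.  The function name, the sample posts, and their topic dates are all hypothetical; only the 30-day window comes from the study.

```python
from datetime import date, timedelta


def meets_currency_cutoff(topic_date, today, window_days=30):
    """Return True if the topic's date falls within the last `window_days` days."""
    age = today - topic_date
    return timedelta(0) <= age <= timedelta(days=window_days)


# Hypothetical posts, each paired with the date of the event it discusses.
today = date(2005, 12, 19)
posts = [
    ("reaction to last week's news", date(2005, 12, 12)),
    ("careful essay on a decades-old event", date(1975, 6, 1)),
    ("well-sourced piece on a centuries-old event", date(1805, 3, 15)),
]

kept = [title for title, topic_date in posts
        if meets_currency_cutoff(topic_date, today)]
print(kept)
```

Only the first post survives the filter.  The other two are dropped on the date of their subject alone, before credibility or quality is ever considered.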

I have a hard time even beginning to explain my concerns with this.  It all seems so self-evident that it becomes hard to put into words.  We are talking about archives here.  Why are we only considering items about "current events?"  I have seen some amazing writing on blogs about things that are decades, centuries, and even millennia in the past, things that are credible, interesting, of intellectual quality, and of value for future research.  Why should these not be considered for archiving?  There is so much more here but I hope it is simply self-evident as I don’t feel like teasing this apart for myself or anyone else.  I simply accept it as completely wrong-headed.  If you don’t, then fine, maybe we can have a discussion and I’ll do the work involved to further explicate the whys of its wrongness, but for now, it’s just wrong.

It might be fully acceptable if this paper were about making archival decisions about the popular in pop culture, but it’s not, or at least never claims to be.

Methods

I’m not going to critique the author’s specific methods.  I already critiqued the general method above in my discussion prior to talking about the paper.  But I do want to comment on a few things in this section.

"As a link is repeated creating a burst it could be considered an indication of authority or quality of content and therefore be selected for preservation from this method as well" (27).  And yes, I see the "could" in this sentence.  I do appreciate it.  But I also want to add that it could just as well imply only that it is the topic of the moment and have nothing to do with authority or quality.  How many people bought Britney Spears’s 2nd album, and how fast?

Released in May 2000, Oops!… I Did It Again also debuted at number one in the U.S. and Canada, and was a similarly huge hit like her debut. It sold over 1.3 million units during its first week in the U.S., making it the fastest-selling album by a female artist in history. Within a year of release, it had shipped over nine million copies in the U.S. alone (and would go on to ship another million on top of that). Wikipedia.

So repeated linking implies quality and authority, huh?  [And yes, I am well aware that I could use a far more authoritative source here than Wikipedia, but the scale is what is relevant, not the exact numbers.]

"Once the content is posted the distributed network the blog will select the most relevant and credible information" [sic] (27).  I can easily come up with so many counterexamples to this claim about credibility that it’s not worth the effort to find any.  They are everywhere.  And yes, of course, if someone links to something that does imply relevance.  But why?  How?  That is often impossible to determine unless one spends the time to become a member of that community.  How many of you know what the title of my blog represents?  Why did I quote some lyrics from Blossoms & Blood, and why those lyrics, as a post?  While I am not making the claim that my blog or that post are archive worthy, the point is that those things all have immense relevance for me.  But how many of you know what that relevance is?  [And yes, I'm aware that you probably don't care.  It is an example; if you don't care about mine, come up with one of your own.  It really is quite simple.]

I’m trying to leave the philosophical issues of relevance, quality and so on, out of this already complicated discussion, but there is always that level to which we could descend.

Further Study and Conclusions

"Each of the aggregators was able to indicate a high number of links that were of archival quality particularly in the quality of currency and with appeal to a wider audience. An improvement to the aggregators would be to address the ability to select a greater amount of content that is unique and of exemplar quality" (34).  Great.  They are good at the criteria that I consider to be generally useless.  My confidence is certainly inspired.

Weighting of links coming from experts or those with more influence is suggested as a solution to make the selection of the unique and exemplary better.  And while this solution makes some sense, and is what is done already in some ways by some search algorithms, it is also highly flawed.  Are we again saying that only what is linked to by the top 4 or 5 folks in the biblioblogosphere is unique or exemplary?  I fail to see this on a daily basis.  Again, that is not to disparage what they do.  It is only to state that they have no special claim on the unique, the exemplary, the interesting, the credible, and so on.  There is much outstanding work being done that they never see or link to, just as in any area of the blogosphere.

I have been fairly critical of many of the ideas in and behind this paper.  And I am.  I have also touched on some of the reasons why.  I do want to restate that I consider this paper to be a valuable contribution.  It evaluates a widespread methodology, it provides some data (which one is free to interpret differently), and it provides some criteria for archival decisions (with which one may disagree).  It is a try.  And for that, I sincerely applaud it.

Are we really willing to hand over all determinations of quality and relevance to our machines or even to some mass human determination of them?  I, for one, am not.

Tags: Articles · Librariana · Web/Tech · Weblogs