Danville for New Years Eva

Well, I’m here in exciting Danville, IL for New Years Eva.  Eva got my room comped so I only had to pay for my tickets to the show and the buffet, which were affordable.  What a sweetie!

I’ve had dinner and I’m killing time in my hotel room reading until the doors open at 9.  I brought The Riddle of Gender: Science, Activism, and Transgender Rights by Deborah Rudacille.  Eva’s sister and my good friend, Gina, is an anthro professor at ISU and she teaches a lot of gender studies stuff.  I took 3 classes with her, including Cross-cultural Perspectives on Women, Sex and Gender.

Anyway, looking forward to the show, and seeing Gina because it’s been a while.  And then there is always the previous post. ;-)

I do hate missing the party in town though!  I’m not popular enough, nor do I want to be, to have to choose between friends; it really sucks!  But once I found out Eva got me a free room it was hard to say ‘No.’

Well, it’s just about 9 PM, so I’ll say my last Happy New Year to everyone and head downstairs.

Peace and love, and whatever else you really need in 2006.

I’ll Miss You Till I Meet You

I tried again, I went last night.
Another date was just not right,
And as I drove myself back home,
A little voice said just be alone,
But sometimes, I think I see you in a crowd,
It’s not picture perfect, you’re just meant for me somehow,

And I’ll miss you till I meet you,
Oh I’ll miss you till I meet you,
I miss you all the time.


But there are days I think of you
Saying ‘hey, that’s beautiful,
Yeah, I see it too.’
It all goes by so fast, like waving hands.
You want to capture things,
find someone who understands,

And I’ll miss you till I meet you,
Oh I’ll miss you till I meet you,
I miss you all the time.

Can you keep me awake?
I thought you could help,
Just to feel my way,
Find my better self.

Oh I’ll miss you, Oh I’ll miss you, I miss you all the time.

Dar Williams. "I’ll Miss You Till I Meet You."  My Better Self

"The other common theme to this record is that the songs all put stuff I find important out on the table. Less metaphor and more me. As to the album’s name," she goes on, "it’s best illustrated by the image on the cover, that your better self is not always the one you plan out or even motivate yourself to be."

"As much as I love to control what I write and perform, I know my better self is not an intentional construction," Williams explains. "It’s a spontaneous creation that I stumble across while I try to tell the truth," she candidly admits. (from About page – emphasis mine)

"Just to feel my way, Find my better self."  "It’s a search, more than an actual fact…."  (From the interview about the title name)

Thank you Dar, for the hope!  And thank you Em, for believing in me.

My goals for next year

Travis Ennis is Preparing for the new year.  I gave up resolutions a long time ago but some goals are certainly called for. 

Here are some of my goals for 2006 (in no particular order):

Graduate (May)
Attend my daughter’s college graduation (May)
Start on the CAS degree (out of my hands for now—application is in)
Submit an article or conference presentation proposal
Learn to inline skate (anyone local want to teach me this spring?)
Start exercising again
Stop the "bad" voice in my head
Finish my cat & class education bibliography
Learn some basic XML

Find some hope

Here’s to a great 2006 for everyone!

Things I got from those who love me

I got to hang out with my daughter on Friday and then I went back over to my ex’s on Christmas day.   I’ve been having a problem with my back for a while now and it was really bothering me on Christmas Eve, but I hope to get a massage today.  And I hope that helps!

Monday I watched four movies, but only one was worth talking about and I don’t feel much like talking.  I guess I should say "writing," since that is really what I don’t feel like.  I have so many things on my list to write about and 4 or 5 already started but I’m just not in the mood.

All in all, it has been a good year for me.  I really can’t complain.  I know that I’ve made progress.  But I also often don’t care, or more importantly, don’t know or feel why it should matter.  All of my big fears are still "out there" somewhere, sometime.

Here are the gifts I got from my family:

Movies:

Music:

Books:

Aren’t Amazon Wish Lists wonderful things?

But by far, the most important present I got this year is the love and concern of my friends, particularly those here and in Bloomington/Normal.  My family is heavily dispersed and I don’t hear from some of them very often.  I know that they love and care about me, but it is often difficult to feel.  But the look in someone’s eyes, the turn of a head, the timbre in a voice.  These are often small things, but they can mean so very much.

I think I’m going to a local party here in Urbana for New Years.  It’ll be my 1st New Years party in…?  Hell, I don’t know.  A very long time. 

I could also go to Danville to see my friend Eva Hunter who is playing a gig at some hotel and there’s a brunch the next day.  But I have no idea how much it is.  You have to actually contact the hotel for ticket/room details, and it is a couples package.  And do I really need another expensive reminder that I’m not?

Heck, unless the weather is totally crappy, I can walk to Mary and Jenny’s party.

I guess I should get this posted, finish my 2nd breakfast, and see if I can get a massage appointment.  I have to return the movies anyway.

Well, here it is another day (Wed) and I haven’t posted this.  I did get an hour massage yesterday.  She had to spend the whole time on my back it was so screwed up.  I watched more movies yesterday and even one today.  I also finished 2 books yesterday, Effi Briest and The Myths We Live By.

Today I went and spent about 4 hours with my daughter.  I miss her so much.  I really wish she was closer so I could spend more time talking with her.  She is such a wise young woman, and she could really help me grow into the person I want to be.

Anyway, on one hand, I’m sorry for the lack of posting and substance.  I have a lot of stuff in the queue, and even more things picked out.  I’m just not in the mood to do any of them.  So on the other, I’m not sorry in the least.  I guess I’m taking an unintended blogging break whether I want to or not.

"Break" isn’t over until mid-Jan so hopefully I’ll catch the fire again and write some things of merit before then.

I hope everyone is enjoying themselves over these holidays and I wish you all a Happy New Year!  Whichever one(s) you celebrate.

So am I normal?

Trying to clean out some of the stuff I have in Bloglines, I came across this article and accompanying quizzes that I found at 3 Quarks Daily back in early June:  "How male or female is your brain?" at the Guardian Unlimited.

It is by Simon Baron-Cohen, director of the Autism Research Centre at the University of Cambridge and measures your Empathy Quotient and Systemising Quotient.

Baron-Cohen’s theory is that the female brain is predominantly hard-wired for empathy, and that the male brain is predominantly hard-wired for understanding and building systems. He calls it the empathising-systemising (E-S) theory.

Empathising is the drive to identify another person’s emotions and thoughts, and to respond to these with an appropriate emotion. The empathiser intuitively figures out how people are feeling, and how to treat people with care and sensitivity.

Systemising is the drive to analyse and explore a system, to extract underlying rules that govern the behaviour of a system; and the drive to construct systems.

So my EQ is supposedly 16 (maybe I was being hard on myself since I don’t consider myself very empathic), but 0-32 is lower than average.  People with Asperger syndrome or high-functioning autism average 20.  The average woman scores 47 and the average man 42.

My SQ was 24.  The average is 20-39, with the average woman scoring 24 and the average man 30.

Plotted out I have an extreme S brain, but only due to my extremely low EQ score.  Actually, I think my SQ should probably be higher, that is, more male-like.  Maybe I was cutting myself some slack on this one too.

Oh well.  They tell me not to worry.  There are lots of explanations for such low EQ scores supposedly.

Combine this with my theoretically less scientifically derived "traditionally feminine personality" and it’s no wonder my society has me completely confused.

I do think that I am far better at empathizing than this quiz reflected though.  It asked about how easily or naturally various empathic responses and intuitions come to me.  But it did not ask about how much effort and attention I put into achieving those goals, only if it was easy or effortless.  For many years I acted only on what naturally occurred to me in this arena.  But for the last several years I have been actively working on just these skills.  It is hard work, and I often fail.  But I have improved greatly.

As in anything, the effort that one puts into something, not the natural gift, is what matters, at least on a moral plane, if not an aesthetic one.  And in this area, as in most if not all, I’ll choose the moral any day.

Movies I’ve watched lately

I went and saw Pride & Prejudice with my friend Em yesterday.  It was quite good, particularly the cinematography.  Lots of beautiful scenes.

It was also a bit more romantic than the book.  When I first saw Donald Sutherland on screen as Mr. Bennet, I went "huh…?"  But he was actually quite good in the role.  Brenda Blethyn was wonderfully obnoxious as Mrs. Bennet.  (Heh, she has the same birthday as me, except for the year.  I’m not quite that old.)  Keira Knightley was actually pretty good as Elizabeth.  As much as I like Bend It Like Beckham, I was concerned with whether she could actually act.  She’s also much better looking as a long-haired brunette.

I was wondering where I had seen the actress who plays Caroline Bingley, Kelly Reilly, before.  Turns out she played Wendy in L’Auberge espagnole, which I even own.

Last night I rented the movie, Trixie, with Emily Watson.  Unfortunately, the disc was filthy.  My DVD player wouldn’t even recognize the wide screen side.  The "normal" side was heavily pixelated during the opening credits but it settled down by the time the movie started.  So I settled in to a very quirky little movie.  But the worst was yet to come.  About 20 minutes from the end, the disc just got stupid.  The sound started dropping out, it would pause itself, or it was heavily pixelated.  So much so that I was unable to finish watching it.  <grrr>

While it is not the best movie I have seen it is nowhere near as bad as the review at imdb.

Emily Watson also plays Lena in Punch-Drunk Love.  I liked her in that, so I wanted to see her in some other films.  Thus, Trixie.  Trixie is an "eccentric, unconventional woman whose naive aspirations to rise from her job as a security guard to full-fledged private eye lead her into a tangled mess."  She also has a habit of mangling the English language, primarily by mixing metaphors, for example, "Even if I am between a rock and the deep blue sea, I am gonna fix this thing."

My favorite line from the movie, which has finally clued me in to "what I want to be when I grow up," comes when she is introducing herself to the corrupt Senator.  "I’m a private defective."

Hehehe.  Private defective.  Now I have a real aspiration in life!  I’m so very lucky I wasn’t drinking or eating when she spit that line out.  Private defective.  The heck with knowing how it ends, I’ll rewatch it (from another source) just to hear her say that line again.

Very quirky, but very spotty.  I wish she wasn’t chewing gum throughout the movie.  She is a very good looking woman, but not with a mouthful of gum. 

I took it back today and they politely gave me a free rental to make up for it.  I took a real quick look around and grabbed god is great, AND I’M NOT with Audrey Tautou.  I’m going to watch that after dinner.

Well, it was OK, but not great.  It can’t be all bad seeing as it opened with Ella doing Porter’s "What Is This Thing Called Love."

I’ve watched a few others lately, but they’re all things I already owned.

Warm ‘n’ Toasty

Now I like my flower pic banner, but this one is more fitting for the season.

Besides updating some software on 7 computers today at work I got to learn some about Adobe Photoshop Elements.  I followed along as best I could with this post by Paul Stamatiou, HOW TO: Make a Blog Header Graphic.  He uses full Photoshop and I neither needed nor wanted every bit he talks about.  This was the second one I did today, but the other is more Spring-like.  I have a whole crap load of interesting images that I want to experiment with, and I’d like to find out how to do it all on my Apple, so there’ll be more experimenting.

We do teach a tech class on image editing software so this was really professional development.

The flames photo is a portion of a wonderful photo (Fire 1 by hazboun) that I found at stock.xchng

Don’t just check, READ, your sources

Well, well.  I walked over to the LIS Library in the godawful cold yesterday to get a copy of the Burbules article cited by Gouge just knowing that it was going to cause me to have to rethink my position, be a man, learn, grow, and all that assorted stuff.

Guess what?  One really should read an entire article if one is going to cite it to support one’s position.  I mean c’mon.  It is only 13 (small) pages, with citations.

I guess this is where the philosophy degree comes in handy.  One must actually summarize someone else’s argument and understand it, applying the principle of charity, before either agreeing or disagreeing with it.  And if you are going to use it to support your thesis, then the previous goes double.

Principle of charity:  "Interpreting a person N means making the best possible sense of N, and this means assigning meanings so as to maximize the overall truth of N’s utterances" (Cambridge Dictionary of Philosophy, 2nd ed., p. 547).  Or, if you prefer a web resource.

The correlate of, "Emphasis is placed on seeking to understand rather than on seeking contradictions or difficulties" is to not use something to support your position that is not in agreement with it.

Here is what Gouge used Burbules to say:

Burbules offers another concept of how to assess credibility on Web [sic]. He suggests that by linking Web sites together and collectively screening the addition of new material, [online communities] pool their intelligence and expertise to make credibility judgments and to cross-check one another. . . One might term this an instance of ‘distributed  credibility’ in that it displaces an individual judgment with a collective intelligence (Burbules, 2001). By applying Burbules idea to the blogging community, this process of networking prevents one blog as standing as the definitive authority. Instead the collective intelligence creates authority. It is from this distributed credibility that the blogging community provides a means of selection (Gouge, 16).

Why yes, Burbules did say what was quoted.  But then this was seriously broadened past what he was actually claiming, and possibly more importantly, all of the caveats and dangers of such communities that he goes on to point out were left out completely.

The first point is that Burbules was referring to a different kind of "community" than blog aggregators.  He was referring to an actual form of online community, the web ring, whereas blog aggregators are in no sense a community.  One could argue that the aggregator is just tapping into the collective intelligence of the ring, but aggregators do not just aggregate things from these communities.  Communities exist in many different states, and some are healthier than others.  What if it is a community of Holocaust deniers?  Does their collective intelligence make them credible?  This use of Burbules’ point just doesn’t wash, as I will explain further in a moment.

An interesting part of the same paragraph, which belongs where Gouge introduced the ellipsis, goes on to say:

This phenomenon is interesting both as an epistemic exercise and as an instantiation of social constructionism at work. However, obviously it is imperfect since shared wisdom can also mean shared misconceptions or biases. While less hierarchical and more democratic than relying on invisible editor/archivists to make judgments on one’s behalf, this approach has the vices of its virtues (447).

That is the complete text that was replaced by an ellipsis by Gouge.  It goes on to say a bit further in the same paragraph:

The greatest danger of such communities, as with communities generally, is that they can become exclusionary, hostile to unconventional, or radical challenges to their presumptions and practices (Burbules, 2000). From a credibility standpoint, this means that serious questioning—the kind of questioning that can only come from "outside" a given epistemic framework—is less likely to occur, and it is more likely that over time the shared preconceptions of such communities, even when they have been originally valid, will eventually become credibility blinders.

What I have tried to show here is how the most common responses to credibility issues online, while valuable and reasonable within certain constraints, ultimately turn out to be paradoxical and self-defeating. This does not make them useless, but it suggests a limit to how clear and reliable such credibility judgments can be (447-448).

See, Burbules’ larger point is that judgments of credibility cannot be made solely on issues of objectivity and truth.  His point is an ethical one.  "In the end, the best safeguard is to check one’s judgments against the judgments of a community with which one has confidence: choosing that reference group prudently is as much a moral matter, involving issues of respect and trust, as a matter of expertise" (453).  Seems I said something similar.  Far less elegant, but very similar.  One must know and trust the community first.

It seems credibility is an ethical issue.  It has moral consequences.  I may just have to write about the Burbules article seeing as at least one person has misunderstood it. 

Burbules, N.C. (2001). "Paradoxes of the Web: The Ethical Dimensions of Credibility." Library Trends, 49(3), 441-453.  Highly recommended.

And now a bit more on Google and PageRank from the new Google Newsletter for Librarians:

PageRank evaluates two things: how many links there are to a web page from other pages, and the quality of the linking sites. With PageRank, five or six high-quality links from websites such as www.cnn.com and www.nytimes.com would be valued much more highly than twice as many links from less reputable or established sites.

For the sake of argument, I’ll allow for a moment that Google is able to determine quality via their crawler and indexer.  But, please, what makes something that is linked to by a news site of more value?  In some instances, especially if you are looking for news items, then the fact that news sites link to something should increase its possible relevance to your search.  But a large percentage of things that I’m looking for will probably never be linked to by a news site.  How and why does Google rank some sites as more valuable than others?  That is a question of utmost importance.

As a rule, Google tries to find pages that are both reputable and relevant. If two pages appear to have roughly the same amount of information matching a given query, we’ll usually try to pick the page that more trusted websites have chosen to link to. Still, we’ll often elevate a page with fewer links or lower PageRank if other signals suggest that the page is more relevant. For example, a web page dedicated entirely to the civil war is often more useful than an article that mentions the civil war in passing, even if the article is part of a reputable site such as Time.com.

Oooh, now they’re giving us reputable and relevant by using "trusted websites."  Like www.nytimes.com?

All I can say is that this article for librarians which purports to answer the question, "How does Google collect and rank results?," generates more questions for me than it even begins to answer.  I sincerely hope it does the same for any librarian that reads it.

Blogs as a means of preservation selection for the WWW

I recently read an article pointed to by Jeffrey Pomerantz at his blog PomeRantz:

Gouge, Marianne K. "Blogs as a Means of Preservation Selection for the World Wide Web."

I referenced this article in my comments on the article for our 1st Virtual Journal Club discussion.  Here is what I said:

I read another article that addresses and provides an answer to this question.  It is interesting, although badly edited, but it is ultimately, in my humble opinion, wrong.  [Gouge, Marianne K. "Blogs as a Means of Preservation Selection for the World Wide Web."]  The author [Gouge] relies on blog aggregators and a set of archival criteria to determine what is of quality and should be saved.  What this boils down to is another instantiation of Google page ranking.   What the masses like (point at) determines quality.  This is such utter nonsense that I am continually amazed at the large numbers of intelligent people taken in by it.  The best that this sort of ranking can give us is what is of a certain kind of "quality" within pop culture.  That is, what is the most popular.  But to equate the popular with quality is just…well, let’s just call me elitist on this and move on.  I will probably write about the Gouge article on my blog seeing as I have now read its 40 pages twice.  Worth a read.  Just wrong, although it is a start and may be part of the answer.

But the fact remains that there is much of quality that has nothing to do with pop culture.  And then there is the pop culture that is of quality but not massively popular.  How do we identify those?

I intend to make a further critique of this paper here.  What I want folks to understand is that I consider this paper to be important, despite my negative critique.  The author is only using what is already a widely accepted method of measurement of quality and relevance on the web to determine if this method might also be useful in assisting in archival collection development decisions regarding blogs.

While I will be critiquing the specific paper, I will mostly be critiquing the underlying ideas.  I do consider her use of these metrics to be a reasonable one in that they are already widely accepted.  My goal is to show up the flaws in the assumptions underlying this method of determining relevance and quality.  I also understand that, pragmatically, we may not have much choice but to use these sorts of measurements.  That is, there may be no better practical way to accomplish some of the uses these sorts of measurements are put to.  My hope is that as our technologies improve we may achieve a more useful means of having computers assist humans in making decisions about relevance and quality.  I also hope that the present method(s) of primarily measuring linking do not keep researchers from attempting other means of recognizing quality and relevance.

I think the verb ‘recognizing’ is a much better one than ‘measuring’ in this arena.  First, one must be able to recognize these qualities of relevance and quality before one can measure them.  While quality may be inherent in something, that is still a completely unresolved issue in aesthetics.  Measuring quality seems to me like a completely futile effort.  Certainly, we can compare two things to each other and possibly rank one higher than the other on a personal scale of quality.  But to do this routinely, and for everything in a certain area is simply not possible.  I am often asked about my favorite band or artist, movie, color, and so on.  Now, I will not deny that I had these when I was younger; they also changed frequently.  And while I can certainly give you a (changing) list of my favorite artists, how can I even begin to rank Ani DiFranco vs. Ella Fitzgerald vs. Mark Knopfler?  Or rank American Beauty vs. Grave of the Fireflies vs. True Stories vs. Lola rennt?  They are all completely different in that they have different qualities and different relevances to me and my life.  And these both shift over time.

but then what kind of scale 
compares the weight of two beauties
the gravity of duties
or the ground speed of joy?
tell me what kind of gauge
can quantify elation?
what kind of equation
could i possibly employ?

school night
¤ Ani DiFranco ¤ reckoning

Relevance is certainly not inherent in anything.  There are far too many variables in determining relevance to someone or a situation.  And the research that shows that relevance and the question itself change while one is undergoing the information seeking process should be enough to convince us that measuring relevance, particularly for others, is a completely useless undertaking.  Now, I agree that we can attempt to say what might be of relevance to a particular person in a particular situation.  But that is rarely what you see claimed.  The claim is that they are measuring relevance, often for large groups of people at the same time.

When returning results for a search, Google combines PageRank (our measure of a page’s importance) with sophisticated text-matching techniques to display pages that are both important and relevant to each search. Google counts the number of votes a page receives to determine its PageRank, interpreting a link from page A to page B as a vote by page A for page B. Votes cast by pages that are themselves "important" weigh more heavily and help to make other pages "important." Please note that ranking of sites in our search results is completely automated, and we do not manually assign keywords to sites. [Google Help Center from search on PageRank.  Emphasis mine.]

Clearly, Google is claiming that they are measuring importance and relevance to each search.  But if I do a search on "bass" how are they going to handle that?  And, yes, I agree I could do a far better search by excluding terms, such as "fish" or "guitar" or "drum" to narrow my results.  But, then I only have to look at studies of user search strategies to know how rarely people do so.  Don’t pull out your librarian’s bag of tricks on me.  We’re talking about normal people here.
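
To make the "votes" language in that Google passage concrete, here is a minimal sketch in Python of the textbook link-voting iteration.  It is a toy under stated assumptions, not Google's actual ranking code: the damping value of 0.85 comes from the published description of PageRank rather than from the help text quoted above, and the five page names are made up for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy link-voting rank: links maps each page to the pages it links to."""
    # Collect every page that appears as a source or a target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal importance

    for _ in range(iterations):
        # Every page keeps a small baseline share regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current importance among the pages it "votes" for.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its weight evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical five-page web (all names invented for this sketch).
toy_web = {
    "news_site": ["cited_essay"],                      # one link from a heavily linked page
    "cited_essay": ["news_site"],
    "blog_one": ["news_site", "linked_by_blogs"],
    "blog_two": ["news_site", "linked_by_blogs"],
    "linked_by_blogs": ["news_site"],                  # two links, both from unlinked blogs
}

ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page:16s} {score:.3f}")
```

In this toy web the essay with a single link from the heavily linked "news_site" ends up ranked well above the page that two obscure blogs link to.  Notice that nothing in the arithmetic knows what any page is about, or whether a search for "bass" means the fish or the guitar; it only counts who points at whom.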

Anyway, I’ll cover this and more.  On to the article.  Note: There are several sections of the paper which I will not comment on.

I don’t know if the copy available at UNC’s SILS Electronic Theses and Dissertations is a draft copy or what, but it is very poorly edited.  Now I’ll admit, I have gone back and re-read some of my previous papers and been horrified to find typos and a few sentences that don’t make sense.  I’m sure we all have.  And this paper is highly readable.  I don’t believe that there was a single sentence I couldn’t make sense of.  It is mostly punctuation (not a strength of mine either) and spelling errors or typos.  A lot of them.

The purpose of the paper is to determine if various blog aggregators can assist in the determination of relevance and the archival quality of blogs to aid in archival collection development decisions.  It looks at three aggregators, Daypop, Blogdex, and BlogPulse.  There is also a cross-aggregator comparison to see which is best at indexing material that meets the criteria the author has selected for determining archival quality.

The method tested by the author is intended to be a third way between the "save everything" approach and selective decision making.  I wholeheartedly agree that we need a third way, or even better a plethora of third ways.  This is a useful attempt at finding an intermediate way.

Defining Credibility on the Web

As the author acknowledges, "Assessing credibility on the web is a difficult task" (14).  Some of the means used to judge credibility include: usefulness, relevance, interestingness, timeliness, comprehensiveness, "visual quality and design, URL domain name, date of material, and personal judgment if the source appears to be authoritative" (14).  She cites a Stanford study that shows that half of web users primarily determine credibility based on visual design.  Clearly a better method of determining credibility is needed.  User education and perhaps technology-driven solutions may help.  The problem in the end is that credibility, just as beauty or relevance, is completely in the eye of the beholder.

A quick thought that I’d like you to keep in the back of your mind is this:  What impact does this use of visual design to determine credibility have on what is linked to?  If over half of web users are using primarily visual design elements to determine what is credible, what does this imply for any system measuring relevance and/or quality based on links?

‘Distributed credibility’ is another concept discussed by the author.  This is when an online community displaces individual judgment by pooling intelligence and expertise to cross-check one another resulting in a so-called ‘collective intelligence’ (16).  The author (Gouge) claims that, "It is from this distributed credibility that the blogging community provides a means of selection" (16).

[The citation for 'distributed credibility' is Burbules, N.C. (2001). "Paradoxes of the Web: The Ethical Dimensions of Credibility." Library Trends, 49(3), 441-453.  Unfortunately I do not have this issue, but I'm going to have to have a look at this article in its entirety as I only believe in distributed credibility to a point.  Speaking of credibility, I have to admit that this article being in Library Trends gives it a lot more credibility for me than many other places it could've been.]

This distributed credibility though boils down to another form of the wisdom of crowds.  Karen Schneider of Free Range Librarian also questions this, although admittedly in a different context:

I’m cautious about invoking the wisdom of crowds–you mean the same people who reelected George Bush?–but to see why a person gave a recipe on Epicurious a particular rating is helpful to me in selecting recipe. (Knowing when and how someone isn’t wise is useful, too, like the people who complain that a macaroon recipe doesn’t work and make it clear that they were using marzipan and not almond paste.) Plus the ratings become part of the metadata.

The problem with the form of distributed credibility of the blogging community is that there is little of the transparency that Karen rightly claims is necessary.  "What," you ask?  "Certainly there is credibility as we link to each other and, often even, state why," you vehemently claim.  "Just look at the biblioblogosphere!"  Well, yes.  You are correct.  But only because we already have a good understanding of who is credible in this community.  Outsiders have no idea.  And some link somewhere in a search result cannot give anyone that knowledge of credibility that we in the community already have.

Then there are the issues of interest (relevance), gatekeeping, and so on.  I have already discussed some of this in response to some questions by Walt Crawford earlier this year.  But the simple point here is that in regards to the top 4 or 5 library bloggers, whom I know are extremely credible individuals, much of what they write is of little relevance to me.  But these kinds of measurement would lead one to the conclusion that only what they and a few more say is of importance.  And that is complete and utter nonsense.  I want to make very clear here that I do not mean to imply that these folks think or feel this way.  There are very few and highly limited domains in which we can individually be good judges of credibility.  We do need help.  But mechanical determination is not the be-all and end-all of a solution.  It is a necessary evil and will improve over time.  But we must keep in mind that it is ultimately a human judgment that is required before we all become convinced that what Google (or any similar ranking system) returns is relevant, credible and of quality.

Applying Credibility to the Blogging Community

This section talks about gaining social capital by people linking to you.  And while there is a small bit of usefulness in this concept, it is in reality not much more than a popularity contest.  Blogrolling is mentioned as a similar phenomenon.  "From the blogroll a pattern emerges. The most popular and authoritative blogs list each other" (17).  I’m imagining the author has spent little time looking at political blogs or at feminist bloggers’ complaints about the "man’s world" of much of the blogosphere.  Considering that she uses the phrase "most popular" she is somewhat off the hook.  But to put "most popular and authoritative" in the same sentence, as related concepts no less, is beyond the pale.  Only someone’s agent or Jimmy Wales could make such a claim with a straight face.

"If someone posts erroneous, [sic] information the author’s mistake will quickly be brought to his/her attention, therefore keeping the information provided within the blog accurate and credible" (18).  Maybe this is so in a limited selection of professional areas.  In broader, more subjective, areas this results in flame wars and endless debate.  I guess I shouldn’t say "more subjective."  Even librarianship is highly subjective.  We have no agreed upon theory of librarianship, and while we do have various professional statements of principle it is extremely clear that there is much latitude in interpreting these, much less subscribing to them in the first place.

As an example, I have been following the Chief Illiniwek conflict in various venues since before coming to UIUC, but especially since.  I have become so completely confused by all the various "facts" about the situation from the multiple sides in this issue (there are at least 3) that I am left to having an opinion on the matter based solely on the ethics of this fight wasting so much of the time and resources of people whose goal should be the education of themselves and others.  And while that may be an acceptable basis on which to form an opinion, knowing the "facts" about the symbol vs. mascot issue, the dance, the outfit, the opinion of folks of Native Illinoisan descent, etc. should all count for something in making one’s decision.  But credibility, quality, facts and so on, do not rise to the top.  It simply ends up being like liberal trolls on conservative blogs and vice versa.  To tie this directly back to blogs and the claims made for them, much of my information comes from the internal electronic bulletin boards of my library school.  Now we are all librarians or library students, surely we would do our research and only cite facts and, even if not, the credible and authoritative will rise to the top.  Guess again folks.

Archival Criteria

In this section, the author discusses a few different statements and criteria for determining archival value.  She also makes several statements about various difficulties and writes them off as too difficult.  These are determining "value for research" and "intellectual quality" (22).  But, if we are going to write both of those off, then why are we even undertaking any of this?  What are we archiving, and why?  Then there is this odd connection made to ephemera which is somehow supposed to make the value determination easier.  And while I agree that much about blogs is ephemeral, I don’t see how this is supposed to help.

In the end, the author settles on 5 main criteria:

  • A. Is the Information Original or Unique?
  • B. Is the Information Documenting Issues of Current Social or Political Interest?
  • C. Does the Information have Wider Application, a Resonance beyond the Local Community?
  • D. Is the Information Exemplar of Its Type for Reasons of Design, Language, Topic, or Origin?
  • E. Does the Information Advance the Development of a Meaningful Organic Collection? (24-25)

My biggest issues are with Criteria B and C. 

Criterion C boils down to this: the information should be of interest outside of the blogosphere.  It can still be very narrow in its audience, say knitting.  OK, I agree.  But what is there that doesn’t meet this criterion?  Possibly 10% at most.  And even the most intense internal blogosphere navel-gazing might be of interest to some on "the outside."  I maintain that this criterion is practically meaningless.  Then there is the issue of why it would even matter if we accepted it.  If the goal is to archive some portion of the blogosphere for some future purpose, it is very likely that a blog’s current relevance to those on "the outside" is totally irrelevant to any archival decision.  I simply fail to see how this criterion can be relevant to any decision making, except in the rarest of cases.  Which means it fails as a primary criterion in my book.

Criterion B is more dangerous, on several levels.  While C can’t exclude much, B will exclude a large amount of high quality, credible, and relevant material.  B is set, in this study and in one aggregator in particular, to a 30-day cut-off.  If it isn’t about something that happened in the last 30 days it fails to meet this criterion.

I have a hard time even beginning to explain my concerns with this.  It all seems so self-evident that it becomes hard to put into words.  We are talking about archives here.  Why are we only considering items about "current events?"  I have seen some amazing writing on blogs about things that are decades, centuries, and even millennia in the past, things that are credible, interesting, of intellectual quality, and of value for future research.  Why should these not be considered for archiving?  There is so much more here but I hope it is simply self-evident as I don’t feel like teasing this apart for myself or anyone else.  I simply accept it as completely wrong-headed.  If you don’t, then fine, maybe we can have a discussion and I’ll do the work involved to further explicate the whys of its wrongness, but for now, it’s just wrong.

It might be fully acceptable if this paper were about making archival decisions about the popular in pop culture, but it’s not, or at least never claims to be.

Methods

I’m not going to critique the author’s specific methods.  I already critiqued the general method above in my discussion prior to talking about the paper.  But I do want to comment on a few things in this section.

"As a link is repeated creating a burst it could be considered an indication of authority or quality of content and therefore be selected for preservation from this method as well" (27).  And yes, I see the "could" in this sentence.  I do appreciate it.  But, I also want to add that it could also only imply that it is the topic of the moment and have nothing to do with authority or quality.  How many people bought Britney Spears 2nd album and how fast?

Released in May 2000, Oops!… I Did It Again also debuted at number one in the U.S. and Canada, and was a similarly huge hit like her debut. It sold over 1.3 million units during its first week in the U.S., making it the fastest-selling album by a female artist in history. Within a year of release, it had shipped over nine million copies in the U.S. alone (and would go on to ship another million on top of that). Wikipedia.

So repeated linking implies quality and authority, huh?  [And yes, I am well aware that I could use a far more authoritative source here than Wikipedia, but the scale is what is relevant, not the exact numbers.]

"Once the content is posted the distributed network the blog will select the most relevant and credible information" [sic] (27).  I can easily come up with so many counterexamples to this claim about credibility that its not worth the effort to find any.  They are everywhere.  And yes, of course, if someone links to something that does implies relevance.  But why?  How?  That is often impossible to determine unless one spends the time to become a member of that community.  How many of you know what the title of my blog represents?  Why did I quote some lyrics from Blossoms & Blood, and why those lyrics, as a post?  While I am not making the claim that my blog or that post are archive worthy, the point is that those things all have immense relevance for me.  But how many of you know what that relevance is?  [And yes, I'm aware that you probably don't care.  It is an example, if you don't care about mine come up with one of your own.  It really is quite simple.]

I’m trying to leave the philosophical issues of relevance, quality and so on, out of this already complicated discussion, but there is always that level to which we could descend.

Further Study and Conclusions

"Each of the aggregators was able to indicate a high number of links that were of archival quality particularly in the quality of currency and with appeal to a wider audience. An improvement to the aggregators would be to address the ability to select a greater amount of content that is unique and of exemplar quality" (34).  Great.  They are good at the criteria that I consider to be generally useless.  My confidence is certainly inspired.

Weighting of links coming from experts or those with more influence is suggested as a solution to make the selection of the unique and exemplary better.  And while this solution makes some sense, and is what is done already in some ways by some search algorithms, it is also highly flawed.  Are we again saying that only what is linked to by the top 4 or 5 folks in the biblioblogosphere is unique or exemplary?  I fail to see this on a daily basis.  Again, that is not to disparage what they do.  It is only to state that they have no special claim on the unique, the exemplary, the interesting, the credible, and so on.  There is much outstanding work being done that they never see or link to, just as in any area of the blogosphere.

I have been fairly critical of many of the ideas in and behind this paper.  And I am.  I have also touched on some of the reasons why.  I do want to restate that I consider this paper to be a valuable contribution.  It evaluates a widespread methodology, it provides some data (which one is free to interpret differently), and it provides some criteria for archival decisions (with which one may disagree).  It is a try.  And for that, I sincerely applaud it.

Are we really willing to hand over all determinations of quality and relevance to our machines or even to some mass human determination of them?  I, for one, am not.

My first Virtual Journal Club posting

I’m trying to keep my chin up as I remind myself that it is a very busy time of the year with semesters ending, traveling, holiday preparations, and so on. With that in mind, I’m still hoping Virtual Journal Club will take off. We had our 1st “meeting” yesterday and we only have a few postings so far. They are good ones though!

We looked at the award-winning article, O’Sullivan, Catherine. “Diaries, On-line Diaries, and the Future Loss to Archives; or, Blogs and the Blogging Bloggers Who Blog Them.” American Archivist 68 (1), Spring/Summer 2005: 53-73.

Feel free to join in at any time. The “meeting,” which isn’t anything of the sort, is only a target date. I’m hoping that as people catch their breath sometime during the Holiday(s) period that they’ll go ahead and post their thoughts. The articles for next month’s “meeting” are also posted.

With that said, here are (some of) my thoughts as I posted them on the O’Sullivan article (with some minor formatting changes allowed by this format):

Let me say up front that I thoroughly enjoyed this article. Unfortunately, it seemed a little light on the substance of interest to this discussion. I really enjoyed the historical overview of diaries, although 95% was irrelevant to our purposes. I found it fascinating as it ties various past and current threads of my reading together for me.

The short and sweet for our present purposes is that diaries have served as important archival materials for an assortment of reasons. Blogs, or at least some, should as well. An important distinction left untouched by the author is the wide variety of blogs that exist. Clearly many of these will be important to future scholars. The author clearly focuses on the blog as diary though. Seeing as she has provided us with an historical overview of diaries, this seems only fitting, even if ultimately limiting to the overall question of the value of archiving blogs.

The author asks three fundamental questions early on: “Will archivists have to adapt archival principles and practice to meet the needs and limitations of electronic records? Will archives have to modify their approach to administrative operations and policy-making procedures in the digital age? How do digital records factor into the collection development policies of collecting archives” (54)?

Clearly the answer to the first two is a definite yes. I would also hope that the archives community has been working on answers to these questions long before blogs came on the scene. [They have.] The last question is more complex, but then that is the question that motivates this discussion. Millions of electronic records are being created and millions are lost every day. The scholar of the future will need access to them in some organized fashion. Currently that is not being provided in any wide-scale, systematic fashion.

The diary as site of identity construction:

“…diaries developed as sites of self-exploration, self-expression, and self-construction. The process of self-monitoring adapted to meet the rhythms and demands of individualism, capitalism, nationalism, and industrialism, the hallmarks of modern society. The diary became a space where an individual’s identity was actively conceived and constructed” (60).

“Diaries were, to a large extent, self-referential and served as repositories of memory” (62). “…diaries acted as sites of memory, intended to preserve the diarist’s past from future oblivion” (62). My blog has, in fact, replaced the journal I had been keeping the last few years. Of course, that means much is left out of the account. Much could be said in the privacy of a Word file locked away safely on my hard drive vs. what I am willing to put on the open web. I know the same is true based on discussions in the biblioblogosphere and elsewhere about blog content. Does this then imply that online diaries are inferior to their print counterparts? Possibly. In many cases, yes. It depends on many factors, including how secure the diarist felt that their diary was and what the consequences of disclosure might be.

More democratic?

Angel has already addressed the more democratic aspect of blogs compared to early diary writing (65). This is, in one sense, a false dichotomy though, as it contrasts the early days of diary keeping to the early days of blogging. Or maybe I should say it is the wrong comparison, as these are what should be contrasted. The more interesting, and again not-quite-right, comparison is between opportunities to publish. This comparison is kind of odd, since most diaries were never meant to be published, while blogs are by definition. The important comment is the one by Todd Levin of Salon.com. He says that “on-line diaries occupy…one of the few places today where a “level ground for publishing” actually exists” (65). That, I believe, is the assertion that Angel is rightly contesting. It is much more affordable for someone to be able to obtain a cheap diary or some recycled paper to write in than it is to afford routine internet access. Both require a certain level of literacy, but blogging requires more forms of literacy. One must be able to read and write to maintain either form of diary, but one must also be conversant in various forms of computer literacy to maintain a blog.

“Seemingly trivial observations can shed light on major historic events. The evidential value that diaries possess for a particular age, or a particular diarist, cannot be overestimated” (64). With the value of the blog-as-diary as a given, what then does this mean for the archival enterprise?

Collection development concerns:

First comment. This article focuses exclusively on diary-like blogs. There are many other forms of blogs that will also be of value to future users of archives.

Admittedly, as the author says, much of the blogosphere is dross, but much is of value also. Value. This is always the $64,000 question. What is of value? And more importantly, what will be of value to “the future?”

I read another article that addresses and provides an answer to this question. It is interesting, although badly edited, but it is ultimately, in my humble opinion, wrong. [Gouge, Marianne K. "Blogs as a Means of Preservation Selection for the World Wide Web."] The author [Gouge] relies on blog aggregators and a set of archival criteria to determine what is of quality and should be saved. What this boils down to is another instantiation of Google page ranking. What the masses like (point at) determines quality. This is such utter nonsense that I am continually amazed at the large numbers of intelligent people taken in by it. The best that this sort of ranking can give us is what is of a certain kind of “quality” within pop culture. That is, what is the most popular. But to equate the popular with quality is just…well, let’s just call me elitist on this and move on. I will probably write about the Gouge article on my blog seeing as I have now read its 40 pages twice. Worth a read. Just wrong, although it is a start and may be part of the answer.

But the fact remains that there is much of quality that has nothing to do with pop culture. And then there is the pop culture that is of quality but not massively popular. How do we identify those?

The answer must be specifically targeted collection efforts based on an organization’s mission. It will take human decision making, perhaps influenced somewhat by various technological “suggestion systems” and also perhaps harvested by a program. But it will take a decision based on an expert human determination in the first place.

Regarding the physical differences:

One aspect of the construction of blogs makes them one of the simpler electronic formats to archive. That is the underlying HTML and CSS that generates a blog when it is rendered by a browser. This format is inherently much simpler to reproduce over the long haul than say Microsoft Word’s proprietary format. It can be stored as simple text and is rendered as a web page only when the time to view it comes. It also retains much of its semantics when viewed as text.

But the nature of the web adds a whole new level of complexity, the hyperlink, over a “simple” stand-alone document format such as MS Word. Trying to archive all outbound and inbound links from a blog, even a single blog post, would be nigh impossible. And even if possible, at least considering inbound links, it would be at best a snapshot in time. But without these links, much is lost in the way of context. How was this blog or post situated within the context of its blog, its community, discipline, time, and so on?
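To make the link problem a little more concrete, here is a minimal sketch in Python of the easy half: pulling the outbound links out of a single stored post. It assumes the post has been saved as plain HTML text; the sample post, the addresses, and the names LinkCollector and outbound_links are all made up for illustration. It only lists where a post points. It does nothing about capturing what is at the other end, and it says nothing at all about inbound links.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a stored post (illustrative helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the post's own address.
                self.links.append(urljoin(self.base_url, value))


def outbound_links(html, base_url):
    """Return the links that point away from the blog's own host."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    own_host = urlparse(base_url).netloc
    return [link for link in parser.links if urlparse(link).netloc != own_host]


# Hypothetical post and addresses, for illustration only.
SAMPLE_POST = """
<p>See <a href="http://example.org/article">this article</a> and
my <a href="/2005/12/earlier-post">earlier post</a>.</p>
"""

if __name__ == "__main__":
    print(outbound_links(SAMPLE_POST, "http://blog.example.com/2005/12/post"))
```

Even in this toy form the list is only good for the day it was made, which is the snapshot problem again.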

Another structural issue with blogs is the various forms of linking done within a blog. Directly related to the individual posts are comments and trackbacks from other blogs, which contribute greatly to a post’s context and serve as a possible (limited) “measure” of importance or influence. Other forms of linking are related to the blog as a whole. Blogrolls; Amazon wishlists; links to Amazon.com for current reads, listening to, etc.; ads; iPod playlists; images within the posts; and so on are all highly contextual and can provide a lot of information about a person. Most of these elements are also highly changeable.

The author states that, “…the physical nature of a manuscript diary reveals something of its history to the reader. The same cannot be said of documents viewed behind the flat, cold, glossy glare of a computer screen” (70). While I understand her claim to a point, I also disagree highly. As restrictive as the blog format seems to be, most blogging software (at least the non-free type) gives one a large amount of leeway regarding customization and layout. The specific layout and design elements utilized, whether someone is using a free account or paid, whether it is hosted or on one’s own domain, whether there is a blogroll, and other factors can tell quite a lot about a person and about the history of the blog. This is particularly the case if the blog has been archived over time. For instance, Jenica changes her title image and color scheme every few months and when she does she usually posts a bit about why. This would be valuable historical information revealing something of the history of her blog in just the sense that “the physical nature of the manuscript diary” does. [I'm not picking on you Jenica! You just make more frequent changes than me, and your "style" is...well, you have style. I don't.]

Revisability of most electronic documents is an issue. But it is one we are going to have to live with. Multiple snapshots over time may help towards mitigating this issue, at least in the case of blogs.
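As a rough illustration of what multiple snapshots buy us, here is a small Python sketch that compares two saved copies of the same post and reports which lines were dropped and which were added. It assumes each snapshot is simply the text of the post as captured on a given day; the function name snapshot_changes and the sample text are invented for the example, and a real archive would obviously need far more than this.

```python
import difflib


def snapshot_changes(older, newer):
    """Return the lines removed from ("- ") or added to ("+ ") a post
    between an older and a newer snapshot of its text."""
    diff = difflib.ndiff(older.splitlines(), newer.splitlines())
    # Keep only removals and additions; skip unchanged lines and the
    # "? " hint lines that ndiff emits for near-matches.
    return [line for line in diff if line.startswith(("- ", "+ "))]


# Hypothetical snapshots of one post, for illustration only.
EARLIER = "Title: My Blog\nI love this book.\n"
LATER = "Title: My Blog\nI like this book.\nUpdated later.\n"

if __name__ == "__main__":
    for change in snapshot_changes(EARLIER, LATER):
        print(change)
```

A comparison like this cannot say why a post was revised, only that it was, but that is already more than a single capture can tell a future reader.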

Future relevance? / Solutions

This actually takes us back to the question of collection development/management. How do we determine relevance, particularly future relevance? The author states that broadening acquisition policies “by developing collecting strategies that include on-line diaries” is where to begin (71). I could not agree more. The first decision to be made must be that these are worth preserving. The second, and admittedly much more difficult step, is to “develop a sound method of appraising” blogs to determine quality and relevance to a specific collection (71). This step is, as I see it, the most difficult and will require the greatest human input and decision making. The third step is to determine how to go about doing the actual acquiring once the previous determinations have been made. It is at this point that questions of copyright rear their multi-faceted heads. And as the author says, all of “[t]his needs to occur before a method of managing and preserving this information is developed” (71). Of course, the development of a good solution to the harvesting problem can include solutions to, or at a minimum work toward, the management issue. Preservation is related but is in reality a completely separate issue, although what exactly is harvested and how it is to be managed will impact on questions of preservation.

So the $64,000 question still remains. How do we determine quality and relevance? And how do we get a scalable answer to these questions? Unfortunately, I do not think there is going to be an adequate scalable solution, except by critically narrowing one’s collection focus. Various ranking algorithms and aggregator services can help to point us in the direction of content that might fit our needs, but they are not the solution. At best, they point us to the popular. And I will always maintain that there is much more of far better quality available and waiting to be discovered than that which is popular. So, a narrow focus implemented by collection developers who are well-versed in the field in which they are collecting and who have the time to actively search and browse, and then evaluate, the material they find or that which is brought to them via other means. Is this idealistic? Certainly. But, it is no more so than that which should drive most human acts that strive to construct and record meaning and the quality products of human effort. Perhaps along the way towards this lofty goal we will discover pragmatic compromises that can be made, just as we do in all such efforts. But to not begin with the ideal in mind is possibly more of a shame than to not begin at all.

Peripheral matter:

One concern that the author attributes to others completely baffles me. “Many communication and information technology specialists believe that blogs, being native to the Web, would lose all meaning and context if taken out of their natural environment” (73). Can anyone explain that concern to me? I can think of so many different angles from which to attack that thinking; and that includes even if we have poor solutions to one of the big context (links of all types) issues. Most blogs are, in the main, either text or pictures. And there is no “natural environment” for text or pictures. They only need to be represented as faithfully as possible, and even then it is only certain aspects that need to be faithfully reproduced. Yes, there is much pointing hither and thither on the web, but at the moment we are discussing blogs as diaries, not link blogs. Thus, while I agree they will lose some context and therefore some meaning, to claim that they “would lose all meaning and context” is simply asinine.

What we know of Aristotle’s thoughts is almost completely, and may be entirely, from lecture notes taken by his students and by students of his students who became teachers. Yet we practically ascribe the foundation of Western civilization to him. Based on what meaning and context left over from this transition from his words’ “natural environment” do we do this? We have the famous cave art at Lascaux, still in its “natural environment” (to some small degree), yet what do we know of its meaning and context? Them darn IT “boys” (usually boys) just drive me batty when they start talking about meaning. Most of them don’t know a thing about natural languages and how they work to construct meaning for humans, nor about any other aspect of human meaning making. I wish the author had provided some examples of people holding this view. I prefer to pick on real people since it is easier to get further information on their views whereas it is much harder to do so from some disembodied group of “[m]any communication and information technology specialists.” Oh well.

I want to thank Lindsey for this and the other suggestions. I thought it was an excellent article overall, but then I also appreciated the history, tangential to our discussion be that as it may.


Feel free to check out the others. Hopefully we’ll have a few more soon, but I understand…’tis the season and all. Maybe next month.