habitually probing generalist

LC Working Group – Structures and Standards, part 3 – Diane Hillmann

May 12th, 2007 · 3 Comments

Diane Hillmann – Research Librarian, Cornell University, Olin Library, also LITA Standards Coordinator

“Structures and Standards for Our Bibliographic Future” [ppt]

Before I begin, let me say “thank you” to Diane Hillmann for archiving and making her presentation available. And since she has, I highly suggest people have a look. Kathryn’s and my notes are basically a transcription of the slides so I may be less thorough on this presentation than I would otherwise.

On another note, especially for a standards coordinator and a librarian, shame on Diane Hillmann for only making these available in MS PowerPoint format. I am not one to say it should not be used, but it should not be the only available format for others.

General questions of balance are part of the consideration of the future of bibliographic control:

  • Standards compliance to standards interoperability
  • Environment? “Library” standards to larger standards world
  • Libraries have a long history of building standards and sharing data, BUT we cannot let our past become a strait-jacket

Imperatives:

  • To operate in a broader web-based arena using standards developed for sharing
  • To expose legacy data and vocabularies for wider use [Hear, hear to both of these!]

A start to this is the recent RDA/DCMI meeting 4/30-5/1 about data models

Outcomes:

  • RDA element vocabulary
  • Decision to expose RDA value vocabularies
  • Decision to develop an RDA Application Profile (AP) for FRBR and FRAD

RDA Element Vocabulary

  • Separate elements, attributes, properties from instructions for application
  • Provides definitions, relationships between elements and sub-elements that can be exposed to humans and machines
  • Explicitly include FRBR entities as defined relationships

RDA Element Vocab would include:

  • Element names; e.g., title proper
  • URIs – persistent and unambiguous
  • Definitions (to support semantic understanding)
  • Relationships (blueprint for processing inferences)
  • History of term changes / versioning
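
To make the list above a little more concrete, here is a minimal sketch of what one entry in such an element vocabulary might look like as a data structure. Everything specific in it is an assumption for illustration: the `example.org` namespace, the `Element` class, the `register` and `ancestors` helpers, and the placeholder definitions are mine, not RDA's or the presentation's. The point is only that once names, URIs, definitions and relationships are explicit, a machine can follow a relationship and draw a simple inference (a title proper is a kind of title).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical namespace, used only for illustration.
BASE = "http://example.org/rda/elements/"


@dataclass
class Element:
    name: str                     # human-readable element name
    uri: str                      # persistent, unambiguous identifier
    definition: str               # supports semantic understanding
    broader: List[str] = field(default_factory=list)  # URIs of parent elements
    history: List[Tuple[str, str]] = field(default_factory=list)  # (date, change note)


registry: Dict[str, Element] = {}


def register(element: Element) -> None:
    """Add an element to the registry, refusing duplicate URIs."""
    if element.uri in registry:
        raise ValueError(f"URI already registered: {element.uri}")
    registry[element.uri] = element


def ancestors(uri: str) -> List[str]:
    """Follow 'broader' links transitively and return every parent URI found."""
    found: List[str] = []
    stack = list(registry[uri].broader)
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.append(parent)
            if parent in registry:
                stack.extend(registry[parent].broader)
    return found


# Placeholder definitions, not the wording of any standard.
register(Element("title", BASE + "title", "A name given to a resource."))
register(Element(
    "title proper",
    BASE + "titleProper",
    "The chief name of a resource.",
    broader=[BASE + "title"],
    history=[("2007-05-12", "illustrative entry created")],
))

# A statement made with "title proper" can also be treated as a "title".
print(ancestors(BASE + "titleProper"))
```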

Why important?

  • Formal representation breaks down the silo around library data so that it is understandable by humans and machines
  • FRBR relationships can be explicitly included (clarity of expression)
  • Extensibility becomes far easier

RDA Value Vocabs

  • RDA is loaded with controlled vocabularies (like AACR2 and MARC21); embedded and difficult to use
  • CVs need to be formally expressed to be effectively used/reused and extended (recent RDA/ONIX joint effort is a step in the right direction): this means human readable concepts, with machine registered URIs, in a traditional thesaurus structure, perhaps encoded in SKOS [there are a couple of slides on this in the presentation] [there are also other XML thesauri applications - Alexandria Digital Library, ...]
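
As a rough illustration of "human readable concepts, with machine registered URIs, in a traditional thesaurus structure, perhaps encoded in SKOS", the sketch below writes two made-up concepts out as SKOS in Turtle syntax using nothing but string formatting. The SKOS namespace and the `skos:Concept`, `skos:prefLabel`, `skos:altLabel` and `skos:broader` terms are real; the `example.org` namespace, the `to_turtle` helper and the two concepts are assumptions of mine and do not come from RDA or from the slides.

```python
from typing import Dict, List

SKOS = "http://www.w3.org/2004/02/skos/core#"
VOCAB = "http://example.org/vocab/"  # hypothetical vocabulary namespace

# A tiny, invented thesaurus fragment: local id -> labels and broader terms.
concepts: Dict[str, Dict[str, object]] = {
    "cartographic": {"label": "cartographic resources"},
    "atlas": {"label": "atlas", "alt": ["atlases"], "broader": ["cartographic"]},
}


def to_turtle(items: Dict[str, Dict[str, object]]) -> str:
    """Serialize concepts as SKOS Turtle. Labels are assumed to contain no quotes."""
    lines: List[str] = [
        f"@prefix skos: <{SKOS}> .",
        f"@prefix ex: <{VOCAB}> .",
        "",
    ]
    for local_id, data in items.items():
        parts = ["a skos:Concept", f'skos:prefLabel "{data["label"]}"@en']
        for alt in data.get("alt", []):
            parts.append(f'skos:altLabel "{alt}"@en')
        for parent in data.get("broader", []):
            parts.append(f"skos:broader ex:{parent}")
        lines.append(f"ex:{local_id} " + " ;\n    ".join(parts) + " .")
    return "\n".join(lines)


turtle = to_turtle(concepts)
print(turtle)
```

A real vocabulary would more likely be built with an RDF library and a registry that mints the URIs; the string approach is used here only to keep the example self-contained.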

RDA Application Profile

  • Documentation of community understanding and intent, key relationships, obligations and constraints.
  • Provides guidance for semantic crosswalks and specs for tools, applications and encoding.
  • Serves as a primary document to capture context around the creation of metadata (decisions and criteria).
  • Specifies appropriate controlled vocabularies and syntax encoding schemes.
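
One way to picture the "obligations and constraints" and "appropriate controlled vocabularies" an application profile records is as a small set of machine-checkable rules. The sketch below is only that, a sketch: the `Rule` class, the `PROFILE` contents, the property names and the allowed values are all hypothetical and are not taken from any actual RDA or DCMI profile.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional


@dataclass(frozen=True)
class Rule:
    required: bool                                  # obligation
    repeatable: bool                                # cardinality constraint
    allowed_values: Optional[FrozenSet[str]] = None  # None means free text


# Hypothetical profile: property name -> rule.
PROFILE: Dict[str, Rule] = {
    "title": Rule(required=True, repeatable=False),
    "carrier": Rule(
        required=True,
        repeatable=False,
        allowed_values=frozenset({"volume", "online resource"}),  # illustrative values
    ),
    "subject": Rule(required=False, repeatable=True),
}


def validate(record: Dict[str, List[str]], profile: Dict[str, Rule]) -> List[str]:
    """Return a list of problems; an empty list means the record fits the profile."""
    problems: List[str] = []
    for prop, rule in profile.items():
        values = record.get(prop, [])
        if rule.required and not values:
            problems.append(f"{prop}: required but missing")
        if not rule.repeatable and len(values) > 1:
            problems.append(f"{prop}: not repeatable")
        if rule.allowed_values is not None:
            for value in values:
                if value not in rule.allowed_values:
                    problems.append(f"{prop}: '{value}' not in controlled vocabulary")
    for prop in record:
        if prop not in profile:
            problems.append(f"{prop}: not defined in profile")
    return problems


good = {"title": ["An Example"], "carrier": ["volume"], "subject": ["Maps", "Atlases"]}
bad = {"carrier": ["scroll"]}
print(validate(good, PROFILE))
print(validate(bad, PROFILE))
```

A specialized community could, in this picture, start from the same rules and add or tighten a few of them, which is roughly the kind of reuse the next set of bullets describes.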

Why is an RDA AP Important?

  • Can serve as a sound basic structure for the rethinking of library metadata structures/applications.
  • Each specialized community can use a related AP and reuse as much of the RDA EV as needed.
  • Further extension can be accommodated with a common structure like an AP to facilitate seamless reuse due to use of identical formal representations in each extension.

Effects on RDA?

“Not a lot but everything.”

  • Easily test data assumptions and instructional clarity
  • Allows specialized usage to evolve within an interoperable framework
  • Does not tie RDA to any specific encoding
  • Does not constrain the historic complexity of traditional library data

What’s in it for the DCMI?

  • DC community work on Application Profiles has been frustrated by a lack of formally declared properties suitable for reuse
  • RDA will provide the Semantic Web community with a plethora of stable, well-tested and generally applicable elements, properties and vocabularies
  • More general participation in development of DC Abstract Model-compliant APs

Moving Forward with an RDA/DC AP

  • Task Force has been established; co-chairs: Gordon Dunsire and Diane Hillmann
  • Both sides looking for funding
  • Important issues still on the table – but participants continuing to work

Problems with Legacy Vocabularies

LCSH, LCC, Name Authority File (NAF), etc.

  • Other communities are “suddenly discovering our legacy” and are interested in using these structures
  • New models are required for licensing in order to welcome these communities’ use of our legacy data [Amen!]
  • Difficult to use outside of MARC-based applications or unavailable for use due to:
    • Lack of URIs
    • Not structured for Web applications
    • No formal representations exist for application or continued development
    • Updating is not automatic

Why Are They Still Unavailable?

[If one is unaware of this conference I suggest they look into it. It is a sad reflection of where we are today that so much said 7 years ago (and some of it much earlier) has been simply ignored. I have only made it about 2/3rds of the way through the proceedings, but it is an eye-opener considering the present context. And if anyone has a copy of the proceedings they'd like to sell me please contact me via my Contact form.]

Can They Be Repurposed?

Yes, but they need to be “Webified” – URIs, explicit structure,…

See Harper & Tillett in upcoming Cataloging & Classification Quarterly edited by Jane Greenberg, “Library of Congress controlled vocabularies and their application to the Semantic Web.”

[Last Fall in Pauline Cochrane's Classification Systems seminar I gave a presentation entitled "Free the Authorities!" My independent study is on Terminology Services. I firmly believe these issues are critical if the library community is to retain much of its relevance in an online world.]

Questions

Q1: [I thought this was Cliff Lynch, Kathryn has it as Schottlaender but questionable] There are other vocabs outside the library world, e.g. gazetteers, etc. What about the openness of this set of CVs?

A: Absolutely! [following from Kathryn's notes] Reference to Weibel, Sutton, Hillmann, Baker proposal to contact legacy vocabulary owners about registering and exposing vocabularies for web reuse to overcome the old business model of publication for development and maintenance. This got little traction, but the time has come to try again.

Q2: [unknown] Are Bade’s world and Hillmann’s worlds conflicting or compatible?

A1: [Hillmann] There is no conflict in the goals, which are similar in terms of metadata needs and the need to keep the special high-need users. The divergence hinges on how those needs are satisfied; we need to find new ways of doing this kind of record access.

A2: [Bade] Each technology is technologically uncertain and when put to work there are many different results and different understandings.

A3: [Schottlaender] There is no conceptual divergence here – you both agree that we need more structured metadata NOT less. You are both stating the same thing. This reminds me of an LoC meeting 10 years ago between DC proponents and catalogers – each had the wrong impressions, different vocabularies but the same goals.

Q3: [Janet Swan Hill] I see a change in the use of vocabulary to passive words: expose, allow, enable, provide. Is this a conceptual problem or a problem with the language itself? This kind of vocabulary doesn’t ensure USE or USEFULNESS.

A: [Hillmann] This is what happens when you bring two cultural experiences together. We are talking about the insertion of library culture (which uses “YOU MUST” imperatives) into a broader set of communities where it doesn’t work that way. This is not a coincidental use of vocabulary. Value must be emphasized in order to enable use, to entice. This language is intentional so people can/will see that something works and has value, and then decide to use it. We can’t do this by saying YOU MUST any longer. “…we have no stick to hit them with…”

[Swan Hill] Yes, but what is missing is “why you want to”

[Hillmann] It is a marketing problem and we know it. Karen Schneider is working to provide a full explanation for consumption (?DCMI/RDA articulation work?)

Next up: Jane Greenberg, UNC

Tags: Cataloging · Information Retrieval · Librariana · Metadata · Standards · Web/Tech · Working Group on the Future of Bibliographic Control

3 responses so far ↓

  • 1 Jodi Schneider // May 13, 2007 at 9:32 am

    Mark, Cataloging Distribution Service still sells the
    Proceedings from the Bicentennial Conference on Bibliographic Control for the New Millennium
    $45 North America, ISBN 0-8444-1046-2
    This page was last updated last week, so I assume it’s current:
    http://www.loc.gov/cds/newfrom.html

  • 2 Jodi Schneider // May 13, 2007 at 9:41 am

    The proceedings also appear to be online in html, via the proceedings homepage
    http://www.loc.gov/catdir/bibcontrol/
    I found full-text links for the sample I looked at.
    So you may not need to wait…

  • 3 Mark // May 13, 2007 at 9:57 am

    Hi Jodi! Yes, I know they are available via the web, but remember that I’m old and have an attraction to dead trees.

    Seriously, I read lots of things electronically, but this is a 536-page publication. Web availability is not useful to me since I want the whole thing for historical perspective, not just one or 2 articles. Also, seeing as they are web pages it would most likely be much greater than 536 pages to print it from the web … and all that clicking!

    I am greatly pleased that it is freely available via the web, though.

    I will try to order it once again. But I tried about 6 months ago and was told they no longer had any available.