Fun with fixed field formats: Biographical information in 008
One of the moments that first got me thinking about creating this blog happened when I was cataloging a serial a year or so ago. I don’t work with serials very often, and if I’m being honest, I’m happy to keep it that way. But that day, as I was creating an original serial record in Connexion, I was faced with an unfamiliar set of fixed fields.
I have a hard enough time remembering all of the fixed field values for monographs, which are mostly what I catalog, so I definitely had to look up some of the fixed field values for serials. I ended up reading through the possible values for “Nature of Contents” in OCLC Bib Formats & Standards, and noticed something kind of odd. One of the possible values is “h” for Biography. It’s allowable for continuing resources, but NOT for books. Presumably, that is because books have their own fixed field just for recording whether something is a biography. But why would that be? Why would books and continuing resources handle biographical information differently in fixed fields?
At this point, it’s probably worth stepping back and talking about what the “fixed fields” are. Those of us who do the majority of our original cataloging in Connexion (I am one of them) are used to thinking of the fixed fields as separate chunks of data, which change based on the type of resource we are cataloging, because that’s how OCLC presents them to us. But in the MARC standard, what OCLC presents as separate elements are actually defined by their position within the Leader and 008 fields. Those fields (and the other 00X fields) are called “control fields,” or sometimes “coded fields,” because the information in them is defined by relative character position instead of by subfields and indicators. For example, the three-character language code occupies characters 35-37 of the 008 string.
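To make the idea of positional data concrete, here is a minimal Python sketch of pulling the language code out of an 008 string. The sample 008 value and the function name `language_code` are made up for illustration; the only thing taken from the standard is that the field is 40 characters long and the language code sits in positions 35-37.

```python
# A made-up 008 string, assembled piece by piece so the positions are visible.
SAMPLE_008 = "".join([
    "230101",   # 00-05: date entered on file (yymmdd)
    "s",        # 06:    type of date
    "2023",     # 07-10: date 1
    "    ",     # 11-14: date 2 (blank here)
    "nyu",      # 15-17: place of publication
    " " * 17,   # 18-34: material-specific positions (left blank in this sample)
    "eng",      # 35-37: language
    " ",        # 38:    modified record
    "d",        # 39:    cataloging source
])


def language_code(field_008: str) -> str:
    """Return 008/35-37. MARC positions start at 00, like Python indexes."""
    if len(field_008) != 40:
        raise ValueError("an 008 field must be exactly 40 characters")
    # Python slices exclude the end index, so 35-37 becomes [35:38].
    return field_008[35:38]


print(language_code(SAMPLE_008))
```

Because MARC numbering starts at 00, the position numbers map directly onto Python string indexes; the only adjustment is the exclusive end of the slice.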
This is where it gets fun: the meaning of the characters in the 008 field varies depending on the resource type. There are always 40 characters, but the definition of characters 18-34 changes per resource type (technically, they are the 19th through 35th characters, because the numbering starts with position 00). For example, 008/34 means Biography for books, Entry convention for continuing resources, and Technique for visual materials.
Nature of contents is 008/24-27 for books, and 008/25-27 for continuing resources. (For continuing resources, position 24 is the similar Nature of Entire Work.) Most of the possible values are the same for both material types, but some, like h for Biography, are not.
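Here is a small Python sketch of what that difference looks like in practice, assuming the positions described above. The helper names (`make_008`, `nature_of_contents`) and the sample values are hypothetical; only the position ranges and the codes "h" (Biography, continuing resources) and "a" (autobiography, books 008/34) come from the format.

```python
# Slices for Nature of contents, written as Python half-open ranges.
NATURE_OF_CONTENTS = {
    "books": slice(24, 28),                 # 008/24-27
    "continuing resources": slice(25, 28),  # 008/25-27
}


def make_008(coded: dict) -> str:
    """Build a sample 40-character 008, overwriting the given positions."""
    chars = list("230101c20239999nyu" + " " * 17 + "eng d")
    for pos, value in coded.items():
        chars[pos:pos + len(value)] = value
    return "".join(chars)


def nature_of_contents(field_008: str, material: str) -> str:
    """Return the Nature of contents characters for the given material type."""
    if len(field_008) != 40:
        raise ValueError("an 008 field must be exactly 40 characters")
    return field_008[NATURE_OF_CONTENTS[material]]


# A serial that is biographical: "h" goes in Nature of contents (008/25).
serial = make_008({25: "h"})
# A book that is an autobiography: "a" goes in Biography (008/34) instead.
book = make_008({34: "a"})

print(repr(nature_of_contents(serial, "continuing resources")))
print(repr(nature_of_contents(book, "books")))
print(repr(book[34]))
```

The same 40 characters only make sense once you know which material type you are looking at: read with the wrong definition, the serial's "h" would land in the second of the book's four Nature of contents positions.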
Okay, so different material types have different 008 definitions, which explains the biography difference between books and serials. But… why? Why would the positionality be different for these two types of materials?
Well, it’s probably because books and serials used to have two separate MARC formats. When MARC II was published in 1969, it only covered books. Other MARC formats were released over the next several years for other material types: serials and maps in 1970, film in 1971, music and manuscripts in 1973.[1] Each of those formats contained fields needed specifically for that material type. Serials and books (along with other material types) have different characteristics, so it made sense that the 008 field defined different values for each position.
The model of different formats for different materials worked well enough for a while, but by the early 1980s, there was a desire to bring the different formats into one comprehensive bibliographic format. The project of “format integration” formally launched in 1983, and was mostly accomplished by 1988, although some lingering work lasted for years after that. According to a report about the subject from the Library of Congress, “Format integration is the validation of data elements for all forms of material, thus removing the restrictions on data elements that currently make them valid only for specific forms of material. The result is a single bibliographic format that contains data elements that can be used to describe many forms of material.”
The variable control fields like 008 were a particular source of concern with format integration. The ultimate solution was that the values of the specific positions would continue to differ based on material type, and would be defined by the type of material coded in the Leader/06-07. Additionally, the 006 field was defined in parallel to 008, so that additional characteristics could be recorded (because 008 is not repeatable). This solution enabled the different types of information recorded in the 008 to remain, even though the new, unified bibliographic format could be used for all types of materials.
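As a rough sketch of how that selection works, the Python function below picks an 008 definition from Leader/06 (type of record) and Leader/07 (bibliographic level). The code groupings follow my reading of the MARC 21 bibliographic format and should be checked against the current documentation; the function name and the sample leaders are hypothetical.

```python
def material_type(leader: str) -> str:
    """Guess which 008 definition applies, based on Leader/06-07."""
    if len(leader) != 24:
        raise ValueError("a MARC leader must be exactly 24 characters")
    record_type = leader[6]  # Leader/06: type of record
    bib_level = leader[7]    # Leader/07: bibliographic level

    if record_type == "a":
        # Language material: serials and integrating resources use the
        # continuing resources definition; everything else is a book.
        return "continuing resources" if bib_level in "bis" else "books"
    if record_type == "t":
        return "books"  # manuscript language material
    if record_type in "cdij":
        return "music"
    if record_type in "ef":
        return "maps"
    if record_type in "gkor":
        return "visual materials"
    if record_type == "m":
        return "computer files"
    if record_type == "p":
        return "mixed materials"
    raise ValueError(f"unrecognized Leader/06 value: {record_type!r}")


# Made-up leaders that differ only in Leader/07.
print(material_type("00000nam a2200000 a 4500"))  # monograph
print(material_type("00000nas a2200000 a 4500"))  # serial
```

Note that the two sample leaders share the same Leader/06 value; it is Leader/07 alone that sends one record to the books definition and the other to continuing resources.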
The 008 field itself was not changed very much during the process of format integration. In particular, the positions that are defined differently per material type (characters 18-34) don’t seem to have been altered. I can’t confirm this, but my suspicion is that once it was decided that 008 would remain different per material type, it was easier to just leave it alone, rather than trying to redefine the characters and create a lot of work to convert existing records.
So that’s it: the different definitions of 008 apparently go all the way back to the earliest days of MARC, when each material type had its own separately defined format. Which seems fitting, in a way. The entire structure of the variable control fields, with each character position bearing a unique definition, makes the most sense when thought of in terms of the data storage and computing restrictions of the 1960s and early 1970s. Of course, none of this explains why we should code for the presence of biographical material at all, but that’s a topic for another blog post...
[1] These dates come from Michele Seikel & Thomas Steele, “How MARC Has Changed: The History of the Format and Its Forthcoming Relationship to RDA,” Technical Services Quarterly (2011) [paywall].