Monday, July 6, 2009

Support Query: Author Name Versioning

Institution: University of Glamorgan
Date: 30th June 2009
Subject: Metadata

A recent query came in from the University of Glamorgan, who were looking at how ‘to build some robustness around the issue of duplicate authors appearing in DSpace when the same author has a variety of author names.’ Different names for the same author arise because different publishers enforce different citation styles and restrictions on how an author’s name appears. If each of these names is entered into the ‘Author’ field of an item’s metadata record, there will be as many ‘Browse by Author’ records in the repository as there are variants of the author’s name.

A response to the query came from Bangor University, who plan to agree a name format with each author and to use this agreed name in the ‘Author’ field. The publisher’s version of the name could then appear in the citation entered into the item’s ‘Citation’ field. Entering the data in this way means only one ‘Browse by Author’ record is ever created, but the name variant still appears in the item record and is therefore searchable by search and discovery services such as Google.
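To illustrate why this keeps the browse list clean, here is a minimal sketch in Python. It is not DSpace code: the record layout, the field names `author` and `citation`, and the function `browse_by_author` are assumptions made up for this example. The point is only that a browse index built from the ‘Author’ field sees one name, while the publisher’s variants survive in the citation text.

```python
from collections import defaultdict

# Hypothetical item records (not the DSpace schema): 'author' holds the
# name agreed with the author, 'citation' keeps the publisher's variant.
items = [
    {
        "title": "Paper A",
        "author": "Jones, Rhys",
        "citation": "R. Jones (2008) Paper A. Journal X.",
    },
    {
        "title": "Paper B",
        "author": "Jones, Rhys",
        "citation": "Jones, R. T. (2009) Paper B. Journal Y.",
    },
]


def browse_by_author(records):
    """Group item titles under the value of each record's 'author' field."""
    index = defaultdict(list)
    for record in records:
        index[record["author"]].append(record["title"])
    return dict(index)


browse = browse_by_author(items)
for name, titles in browse.items():
    print(name, "->", titles)
```

Both items land under a single browse entry, yet a full-text search for either publisher variant would still match the citation field.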

This method offers a very straightforward solution to the problem, but it relies on the person entering the record to recognise which author a variant name belongs to. I put the query forward to the JISC-REPOSITORIES mailing list to see whether other repositories were using any other methods, and whether any of these offered a more automated solution to the problem.

I was previously aware of the Names project, which is developing a name authority system to reliably and uniquely identify individuals and institutions. The project has received further JISC funding and is developing a prototype API, which uses the Zetoc service to identify authors by assigning a unique id to each individual and then associating each variant of the author’s name with this id. Current documentation on this API is available from http://130.88.120.172:8080/help.html, along with some example searches from the prototype. Networking Names was highlighted as another initiative, which looked to identify the components of a “Cooperative Identities Hub” that would store information to help identify unique entities such as individual authors.
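The general idea of a name authority (one id per individual, many name variants pointing at it) can be sketched in a few lines of Python. This is not the Names project API, which I have not reproduced here. The class `NameAuthority`, its methods and the normalisation rule are all hypothetical and chosen only to show the data structure.

```python
import itertools


class NameAuthority:
    """Toy name authority: one unique id per author, many name variants."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._variant_to_id = {}   # normalised variant -> author id
        self._id_to_variants = {}  # author id -> variants as entered

    @staticmethod
    def _normalise(name):
        # Assumed rule: ignore case, full stops, commas and extra spaces.
        cleaned = name.lower().replace(".", " ").replace(",", " ")
        return " ".join(cleaned.split())

    def register(self, *variants):
        """Create a new author id and attach the given name variants."""
        author_id = next(self._ids)
        self._id_to_variants[author_id] = set()
        for variant in variants:
            self.add_variant(author_id, variant)
        return author_id

    def add_variant(self, author_id, variant):
        """Associate one more name variant with an existing author id."""
        self._variant_to_id[self._normalise(variant)] = author_id
        self._id_to_variants[author_id].add(variant)

    def lookup(self, name):
        """Return the author id for a name variant, or None if unknown."""
        return self._variant_to_id.get(self._normalise(name))


authority = NameAuthority()
jones_id = authority.register("Jones, Rhys", "R. Jones", "Jones, R.T.")
print(authority.lookup("Jones, R. T.") == jones_id)
```

A lookup on the name alone cannot tell apart two different people who publish under the same variant, which is presumably why a real authority service draws on further evidence beyond the name string.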

Another development of interest was CoNE (Control of Named Entities), a module which, according to the developer, can sit over DSpace, EPrints or Fedora repository software. It allows you to create an authority record for a number of metadata fields, including Author Name. You can add all the known variants of a field value to the module, but you are prompted to use the confirmed authority record (or name version) for that entry. The module also applies to Journal Titles, which may be of use if some contributors use title abbreviations.

A number of the respondents to the query were EPrints users. This software already offers auto-completion for a variety of fields, including Author Name, using information from a known database or web service, e.g. LDAP. The ‘Creator’ (Author) field is a combination of a ‘name’ and an ‘id’, e.g. an e-mail address, which helps to identify each author. A citation in EPrints, however, is not entered as a separate field but is concatenated from a selection of the other item record metadata fields. Therefore, whatever name is entered as an author will then be used within the citation.
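As a rough illustration of a citation that is assembled from other metadata fields rather than typed in separately, here is a small Python sketch. It does not use EPrints' own citation configuration. The `Creator` class, the `render_citation` function, the house style and the example record are all assumptions invented for this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Creator:
    """A creator as a name plus an id (here, an e-mail address)."""
    name: str
    id: str


def render_citation(creators, year, title, publication):
    """Build a citation string from item metadata in one assumed house style."""
    names = " and ".join(creator.name for creator in creators)
    return f"{names} ({year}) {title}. {publication}."


# Hypothetical item record; the field names are illustrative only.
record = {
    "creators": [
        Creator(name="Jones, Rhys", id="rhys.jones@example.ac.uk"),
        Creator(name="Evans, Sian", id="sian.evans@example.ac.uk"),
    ],
    "year": 2009,
    "title": "A Study of Names",
    "publication": "Journal of Examples",
}

citation = render_citation(
    record["creators"], record["year"], record["title"], record["publication"]
)
print(citation)
```

Because the citation is derived from the creator name, changing the name in the record changes the citation too, so there is no separate place to keep the publisher’s variant.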

This functionality means that it is not possible for two variants of an author name to appear within the same record. If you wanted to stay true to the publisher’s version each time, you would be back to having different ‘Browse by Author’ records again, although in the repository database each name variant would be associated with the same author id. It was pointed out, however, that this functionality does allow a controlled house citation style to be used within each repository record citation. As one respondent said, ‘The form of a citation always depends on the publication in which it appears, not on the publication to which it refers.’ Perhaps, then, the publisher’s version of an author name does not need to be stuck to rigidly, or even reflected within an item record, and an agreed in-house citation style can be used each time.


If anyone would like further clarification of this information, or would like help with any other item record queries, then please do not hesitate to contact the WRN team via wrnstaff@aber.ac.uk