Issues in Science and Technology Librarianship
For many researchers, managing bibliographic and web references is the closest thing to knowledge management they will ever do. This article describes ShaRef, a new approach to reference management that focuses on the user and enhances traditional reference management approaches with collaboration features and lightweight knowledge management. While this is primarily targeted at providing individual users and user groups with a better tool, it also creates a new and interesting link to libraries, because its features enable users to go from their own references directly to the library through the use of OpenURL. Thus, libraries must adjust to these new types of users, who are using new technologies to access a library.

In the context of libraries, a "reference desk" and "reference services" usually refer to library employees helping library users find library resources. For most researchers, however, a "reference" is something more abstract: a pointer such as a bibliography entry pointing to a resource such as a journal article. Library reference services, on the other hand, are often needed to help library users convert very fuzzy references into more concrete references, which then point to actual resources. Thus it could be argued that the "human references" cover a much wider area of locating actual resources, but nevertheless this article concentrates on the abstract references and their management.
Looking up "reference" in the dictionary, one finds, among other entries, "one referred to or consulted," and "something that refers." We will concentrate on the latter definition of a reference as an abstract entity, arguing that most researchers routinely collect references as part of their personal -- rather lightweight -- knowledge management. We present some statistics gathered through a university-wide survey at ETH Zürich, and then present a system which is currently being developed, aimed at bringing together the references collected by researchers, groupware aspects of collaboration among researchers, and the traditional library world.
Surprisingly, when looking into how people manage their reference collections, we found that most of this is done using rather simple tools such as BibTeX or EndNote (over 60% of all respondents used BibTeX or EndNote). While these tools are appropriate for the task of compiling bibliographic entries for inclusion in documents, they have not been built for tasks going beyond this, such as cross-linking references and extending the basic reference model with additional information. A review of the literature showed that libraries, on the other hand, are not very interested in individual users collecting their own references, because from the libraries' point of view, a library catalog compiled and maintained by trained catalogers is far superior to anything an individual user could ever create. It is interesting to note that EndNote supports this library-centered view of the world by integrating library access through Z39.50, enabling EndNote users to import library catalog records with the click of a button.
The ShaRef vision, on the other hand, is to create a tool and an environment with the individual user as the central and most important entity. If the references are less perfect than a complete MARC record, this should not be a problem, as long as the information is sufficient for the individual user's knowledge management requirements. However, the challenging part for the project is to provide as many links to the outside world as possible by implementing sharing functionality with other users (via ShaRef or via web-based publishing), and by providing back-links to a library through OpenURL technology, which enables users to easily find resources in their preferred library.
This, however, is going to change, as the boundaries between traditional publishing and web-based information resources are slowly but surely disappearing. Traditional publishing material is made more web-compatible by assigning Digital Object Identifiers (DOIs), and an increasing number of high-quality and peer-reviewed publications are available online only. Thus, reference management solutions designed for the future should not make the increasingly artificial distinction between traditional publishing and web-based publishing. One difference, however, remains: the average life span of a web-based resource is much shorter, so tools must provide appropriate functionality such as caching and automatic link checking. After all, a broken link is not significantly different from a reference to a book that has gone out of print; both refer to resources that cannot be easily retrieved, even though they may have been archived somewhere.
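Automatic link checking of the kind mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not ShaRef's implementation: the function names, the use of HEAD requests, and the rule that any status from 200 to 399 counts as reachable are all assumptions made for the example.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status code for url, or None if it cannot be reached."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status such as 404.
        return error.code
    except (urllib.error.URLError, OSError, ValueError):
        # DNS failure, refused connection, timeout, or malformed URL.
        return None


def check_links(references, fetch_status=http_status):
    """Map each reference key to 'ok' or 'broken'.

    references is a dict of reference key -> URL. fetch_status can be
    replaced, for example by a stub in tests or by a cache-aware function.
    """
    report = {}
    for key, url in references.items():
        status = fetch_status(url)
        reachable = status is not None and 200 <= status < 400
        report[key] = "ok" if reachable else "broken"
    return report
```

Because the status lookup is passed in as a parameter, the same loop could also consult a local cache before going to the network; that variant is not shown here.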
ShaRef's goal is to take reference management to a level that makes it easy to seamlessly integrate bibliographic and web references. This not only reflects the changing world of publishing, it also makes it easier to manage all references uniformly, so that ShaRef's knowledge management capabilities can be used across all types of references.
Keywords can be used to identify concepts that should be referenced within ShaRef. For example, keywords could be used to index a paper based on its subject. ShaRef makes no attempt to define a given keyword vocabulary, or to structure keywords in any way, so keywords are simply an unstructured set of named concepts that can be referenced throughout ShaRef. Thus, more advanced concepts of structuring keywords such as ontologies (Bechhofer 2004) are not supported by ShaRef, but it is possible to use such a structure as an overlay over ShaRef's keywords and thus connect ShaRef's keywords with an externally defined ontology.
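The idea of a flat keyword set with an external structure laid over it can be illustrated as follows. The sketch assumes references are identified by short keys and that the overlay is a simple mapping from a broader term to narrower keywords; both the data and the overlay format are hypothetical and not taken from ShaRef.

```python
from collections import defaultdict


def build_keyword_index(references):
    """Build a flat index: keyword -> set of reference keys.

    references is a dict of reference key -> set of keywords. The
    keywords themselves carry no structure.
    """
    index = defaultdict(set)
    for ref_key, keywords in references.items():
        for keyword in keywords:
            index[keyword].add(ref_key)
    return index


def lookup(index, overlay, term):
    """Find references for term, widened by an external overlay.

    overlay maps a broader term to the narrower keywords it covers.
    It lives outside the index, so the keywords stay unstructured.
    """
    hits = set(index.get(term, set()))
    for narrower in overlay.get(term, set()):
        hits |= index.get(narrower, set())
    return sorted(hits)


# Hypothetical example data.
references = {
    "vandesompel2001": {"openurl"},
    "uren2003": {"cross-references", "argumentation"},
    "wilde2004": {"survey"},
}
overlay = {"linking": {"openurl", "cross-references"}}

index = build_keyword_index(references)
print(lookup(index, overlay, "linking"))
```

The overlay can be replaced or dropped without touching the stored keywords, which is the point of keeping the two layers separate.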
Cross-references connect references. BibTeX users already know this concept, which can be used to link entries from collections (such as papers from conference proceedings) to the complete volume (the proceedings themselves). ShaRef generalizes this concept by enabling users to make generic cross-references. This can be used to create annotations that point to other references, such as an annotation to a paper stating that the claim made in this paper has been rejected by other publications. ShaRef does not assign semantics to these cross-references and thus goes less far than other systems providing well-defined semantics and even reasoning capabilities (Uren 2003), but ShaRef's approach is lightweight and sufficient to turn isolated references into a web of related metadata resources.
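A generic, untyped cross-reference can be modeled as nothing more than a target plus a free-text note. The following sketch shows one possible shape for such a model; the class and function names are assumptions for illustration and do not describe ShaRef's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class Reference:
    key: str
    title: str
    # Each cross-reference is a (target key, free-text note) pair.
    # No semantics are attached to the link itself.
    cross_refs: list = field(default_factory=list)


def link(source, target, note=""):
    """Record a generic cross-reference from source to target."""
    source.cross_refs.append((target.key, note))


def referring_to(collection, target_key):
    """Return the keys of all references that point at target_key."""
    return sorted(
        ref.key
        for ref in collection
        if any(target == target_key for target, _ in ref.cross_refs)
    )


proceedings = Reference("www2003", "The Twelfth International World Wide Web Conference")
paper = Reference("uren2003", "Scholarly Publishing and Argument in Hyperspace")

# The BibTeX-style case: a paper pointing to its proceedings volume.
link(paper, proceedings, "published in this volume")
print(referring_to([proceedings, paper], "www2003"))
```

The same `link` call covers the annotation case described above: the note simply says something else, such as that a claim has been disputed by the target publication.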
Sharing is also supported on a collaborative basis, where people can decide to manage a set of references collaboratively. ShaRef supports users and user groups through identification and authorization features, and enables user groups to collaboratively manage references. These user groups may be research groups, university departments, or students attending some lecture. However, ShaRef does not require users to share their information; it also allows users to keep their references completely private.
Apart from the fact that researchers will always be able to visit their libraries and get individual help for locating a resource, ShaRef also supports a mechanism for an automated process through the use of OpenURL (Van de Sompel 2001). ShaRef users have to configure their local library's OpenURL resolver, and in turn can access the library's OpenURL service directly from within ShaRef. This way, ShaRef users can get from their personal references to the respective holdings of their library with a few clicks, very often without having to manually enter any additional information at all.
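To make the OpenURL step concrete, the sketch below turns a reference into a resolver link using the key/value style of the original OpenURL proposal (Van de Sompel 2001), with keys such as genre, aulast, atitle, title, volume, issue, and date. The resolver address is a placeholder, and the function is an illustration under these assumptions rather than ShaRef's own code.

```python
from urllib.parse import urlencode

# Hypothetical resolver address; each library publishes its own base URL.
RESOLVER_BASE = "https://resolver.example.org/openurl"

# Keys in the style of the original OpenURL proposal, in a fixed order.
OPENURL_KEYS = (
    "genre", "aulast", "atitle", "title",
    "issn", "volume", "issue", "spage", "date",
)


def build_openurl(reference, base_url=RESOLVER_BASE):
    """Return a resolver link for a reference given as a dict.

    Only fields that are actually filled in are sent, so an incomplete
    personal reference still produces a usable query.
    """
    params = [(key, str(reference[key])) for key in OPENURL_KEYS if reference.get(key)]
    return base_url + "?" + urlencode(params)


reference = {
    "genre": "article",
    "aulast": "Van de Sompel",
    "atitle": "Open linking in the scholarly information environment using the OpenURL framework",
    "title": "D-Lib Magazine",
    "volume": 7,
    "issue": 3,
    "date": 2001,
}
link = build_openurl(reference)
print(link)
```

Configuring a different library then only means changing the base URL; the description of the reference stays the same.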
For this scenario to work, the library's OpenURL resolver must be configured appropriately, and as initial experiments with our library have shown, this is not trivial. A new task for human references within libraries could thus be, at least in part, to set up and maintain the OpenURL resolver. Doing this is an interesting and challenging task, because it involves thinking ahead about how to best serve all possible OpenURL queries with respect to the information sources of the library. In many cases, it may even be possible to directly guide users to the full text of resources that are available online. In other cases, guiding them to the most appropriate records in the library catalog is the best response. Overall, providing a good OpenURL service is key to the library of the future.
However, we are also aware that our model and our tool may not be the most appropriate for every user. Along with our goal of unifying reference handling in an environment that supports reference management and sharing, we have therefore identified a number of non-goals that we explicitly do not want to support, such as library-scale cataloging, advanced ontology management, and advanced query or even reasoning features. Because ShaRef is open and extensible, some of this functionality can nevertheless be added as additional layers on top of ShaRef.
Extensibility means that users can extend ShaRef's data model by defining their own fields, choosing from a small set of predefined field types. ShaRef handles these fields as if they were standard built-in fields.
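User-defined fields drawn from a fixed set of field types might look like the following sketch. The article does not name the predefined types, so the three types here (text, integer, date) and all class and method names are assumptions chosen for illustration.

```python
from datetime import date

# Hypothetical set of predefined field types, each with a converter
# that validates raw input and turns it into a typed value.
FIELD_TYPES = {
    "text": str,
    "integer": int,
    "date": date.fromisoformat,
}


class Schema:
    """Holds user-defined fields on top of whatever fields are built in."""

    def __init__(self):
        self.fields = {}

    def define(self, name, type_name):
        """Register a new field; only predefined types are accepted."""
        if type_name not in FIELD_TYPES:
            raise ValueError(f"unknown field type: {type_name}")
        self.fields[name] = type_name

    def set_value(self, record, name, raw):
        """Convert raw input according to the field's type and store it."""
        convert = FIELD_TYPES[self.fields[name]]
        record[name] = convert(raw)
        return record


schema = Schema()
schema.define("read_on", "date")
schema.define("rating", "integer")

record = schema.set_value({}, "read_on", "2004-11-05")
record = schema.set_value(record, "rating", "4")
print(record)
```

Restricting users to a closed set of types keeps custom fields as easy to validate, display, and export as built-in ones.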
ShaRef is designed to avoid lock-in: it supports various import formats (BibTeX, EndNote, and bookmarks) and also supports these formats as output formats. However, because of the inherent limitations of these formats, some information will be lost when exporting data. Therefore, ShaRef data can also be exported in XML, in which case no information will be lost.
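The difference between a lossy and a lossless export can be shown with a small sketch. The BibTeX side keeps only a fixed list of standard fields, while the XML side writes every field it finds. The XML element and attribute names are invented for this example and are not ShaRef's actual schema; the custom field `private_note` is likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# Standard BibTeX article fields kept by the lossy export.
BIBTEX_FIELDS = ("author", "title", "journal", "year", "volume", "number")


def to_bibtex(key, reference):
    """Export as a BibTeX @article entry; other fields are dropped."""
    lines = [f"@article{{{key},"]
    for name in BIBTEX_FIELDS:
        if name in reference:
            lines.append(f"  {name} = {{{reference[name]}}},")
    lines.append("}")
    return "\n".join(lines)


def to_xml(key, reference):
    """Export every field, including user-defined ones, as XML."""
    root = ET.Element("reference", {"key": key})
    for name, value in reference.items():
        ET.SubElement(root, "field", {"name": name}).text = str(value)
    return ET.tostring(root, encoding="unicode")


reference = {
    "author": "Van de Sompel, Herbert and Beit-Arie, Oren",
    "title": "Open linking in the scholarly information environment using the OpenURL framework",
    "journal": "D-Lib Magazine",
    "year": 2001,
    "volume": 7,
    "number": 3,
    "private_note": "check against our resolver setup",
}

print(to_bibtex("vandesompel2001", reference))
print(to_xml("vandesompel2001", reference))
```

The private note appears only in the XML output, which mirrors the trade-off described above: the established formats interoperate widely but cannot carry everything.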
Because of its openness and extensibility, ShaRef can be used as a foundation for adding additional layers of software. As outlined above, more advanced technologies for keyword handling, for example Topic Maps, and more advanced technologies for handling cross-referencing, such as ClaiMaker (Uren 2003), could easily be added to ShaRef. Since ShaRef also has an API, applications wishing to use ShaRef as a back-end technology can easily do so by using the ShaRef API and providing an interface of their own.
Furthermore, ShaRef supports online and offline modes. In the online mode, the Java client communicates with the server. In offline mode, however, all data resides on the client. This configuration is ideal for traveling or for users who are only interested in the management functionality of ShaRef, but not in the publishing and sharing features.
ShaRef is under construction, but we believe that its blend of features, combining personal knowledge management, collaboration, and reference handling, is unique, and we hope it will make ShaRef a success at ETH Zürich and elsewhere. We are closely collaborating with the local library to integrate ShaRef as well as possible with the library's OpenURL resolver, and we hope that ShaRef will play a useful role in bringing users closer to the library and in making library services more easily accessible to users.
Uren, Victoria, et al. 2003. Scholarly Publishing and Argument in Hyperspace. In The Twelfth International World Wide Web Conference, pp. 244-250, Budapest, Hungary: ACM Press. [Online]. Available: http://www2003.org/cdrom/papers/refereed/p137/p137-uren.html [Accessed November 5, 2004].
Van de Sompel, Herbert and Beit-Arie, Oren. 2001. Open linking in the scholarly information environment using the OpenURL framework. D-Lib Magazine 7(3). [Online]. Available: http://www.dlib.org/dlib/march01/vandesompel/03vandesompel.html [Accessed November 5, 2004].
Wilde, Erik. 2004. Usage and Management of Collections of References. Zürich, Switzerland: Computer Engineering and Networks Laboratory, Swiss Federal Institute of Technology, TIK-Report No. 194. [Online]. Available: http://dret.net/netdret/publications#wil04h [Accessed November 5, 2004].