Issues in Science and Technology Librarianship
Lura E. Joseph
Associate Professor, University Library
University of Illinois
In 2009, University of Illinois at Urbana-Champaign (UIUC) purchased a block of approximately 5,000 UIUC dissertations, authored between 1989 and 1997, that were scanned from microfilm by ProQuest. These were subsequently provided in PDF form both within the UIUC institutional digital repository (IDEALS) to the UIUC community and via the ProQuest platform. Subsequently, approximately 18,000 additional dissertations were digitized from microfilm by ProQuest for UIUC.
Most geology dissertations contain photographs, micrographs, maps and cross-sections (often in color and often oversized), seismic sections, and other figures, images, and plates. This content is often some of the most useful information in a geology dissertation. However, these elements do not reproduce adequately in microform, and so remain inadequate when digitized from microform. Geology dissertations need to be scanned from the originals in high-quality color and grey-scale.
Administrators and other library staff unfamiliar with the discipline of geology may fail to understand the nature and extent of the problem. In order to document the need for this in-house work, a study was conducted to reveal the extent of problems in the ProQuest versions digitized from microform. Of the 439 known geology dissertations from UIUC, 398 have been digitized by ProQuest. This study found that 82% of the digitized dissertations had at least one figure with unacceptable quality and therefore need to be rescanned at high quality. The results have implications for other disciplines that rely on images and oversized plates to convey important information.
In 2010, the University of Illinois at Urbana-Champaign (UIUC) began requiring that Ph.D. dissertations and master's theses be submitted electronically for deposit in the university repository known as IDEALS. In an effort to assert control over historic dissertations, the university library began working toward using IDEALS as the ultimate aggregation and access point for all dissertations and theses (Shreeves and Teper 2012). In 2009, UIUC purchased a block of approximately 5,000 digital dissertations authored between 1989 and 1997. These were digitized from microfilm by ProQuest, and provided in PDF form both within IDEALS (to the UIUC community) and on the ProQuest platform. Subsequently, 18,557 additional dissertations were digitized by ProQuest from microfilm. IDEALS staff members are currently processing these PDF copies for ingest into IDEALS.
Geology dissertations and theses contain valuable information. Most geology dissertations and theses include illustrations and images such as maps, cross-sections, seismic sections, photographs, and micrographs. These illustrations and images are, in most cases, as important as the text, and in some cases they are more important. Maps and cross-sections may be the focus of the dissertation, and could be interpreted without the text. These maps and cross-sections are often consulted by professional geologists and geology students. In some cases, they are more detailed than state and federal survey mapping, and thematic maps such as subsurface maps related to petroleum geology may only be available in dissertations and theses; petroleum industry mapping is usually proprietary. Scott (1990) stressed the importance of preservation of the oversized maps and illustrations in geology theses and dissertations.
Photographs also contain extremely valuable geologic information. A single photograph of an outcrop may explain the geology far better than many words. A photograph may indicate the type of rock, stratigraphic and structural relationships, and the regional context, for example. Photographs record geologic features that will eventually be destroyed or changed. For example, photographs of quarry cliff-faces record geologic features that will soon be destroyed in the quarrying process. Photographs also record surface features such as dunes, beaches, streams, and glacial features that will change over time. A series of photographs taken over decades can be important for recording processes such as the retreat of glaciers; photos in dissertations can be a part of such a series. Aerial photographs are important for recording surface patterns that will certainly change over time. Photos of thin sections are important because the original sections would be difficult or impossible for other researchers to access.
Much or all of the information contained in illustrations and photographs may be lost in conversion to microform. Large maps, cross-sections, and other plates are folded and either placed in pockets or bound into the print volumes. Many of these are hand-colored, especially in older dissertations. In the microform copy of these dissertations, the large images appear as fragments of the whole, with distortion and little to no overlap. While this method preserves the images, it renders the information essentially useless. In addition, because ProQuest follows preservation best practice and creates black-and-white microfilm images, it is impossible to use the color-coded information on maps, cross-sections, and graphs. Furthermore, many geology dissertations contain detailed photographs. These photographs appear either completely or mostly black in the microform version, or lack necessary grey-scale shading. Even dissertations that were created in digital format and contain color appear in black and white in the microfilmed copies if they were submitted to ProQuest in paper format. References for examples of image quality problems in UIUC geology dissertations that were scanned by ProQuest from microfilm are included in the appendix. Butler and Bankole (2013) discussed similar problems with interlibrary loaned articles: science articles containing color or grey-scale images require special consideration when being scanned for patron use. A study of color in dissertations and theses at Pennsylvania State University by Musser and Roberts (2007) found a significant increase in the use of color in dissertations and theses for the years 1995-2004, and noted that "monochromatic microfilming...is inadequate for materials with color illustrations" (p. 220).
Note that the University of Illinois took advantage of the opportunity to acquire a very large number of dissertations in digital format at an affordable price and with minimal staff time by having ProQuest scan from microfilm. Also note that ProQuest is able to produce high-quality scans of original material. An example is the large-scale dissertation and thesis retrospective digitization project that ProQuest did for the University of Wyoming in 2007-2008. At the time, this project was designated as "the world's largest dissertation and theses retrospective conversion project involving color and oversize maps" (Woods and McLean 2010). The resulting images are high quality. Since the dissertations digitized from microfilm are often insufficient for disciplines such as geology, most geology dissertations need to be scanned in high-quality color and grey-scale. In order to document the need for this work, a study was conducted to reveal the extent of problems in the ProQuest versions scanned from microfilm.
First, a complete list of UIUC geology dissertations, theses, and bachelor papers was compiled from a number of sources, since this information was unknown and not apparent from the online catalog records. Next, a database was constructed and each UIUC geology dissertation on the ProQuest platform was examined for image quality. This project was conducted in the same manner as three earlier projects that examined the image quality in Elsevier online geology journals (Erdman 2006; Joseph 2006; Joseph 2012). As with these three studies, this project evaluated whether the black-and-white images were of sufficient quality to convey the necessary information to the user. A dissertation was considered to have a problem if the print version would likely need to be recalled because the scanned version contains images with quality so low that complete information cannot be determined by a specialist in the discipline. In some cases, photos are completely black (figures 1 and 2); in other cases, the subject is somewhat discernible, but a specialist would still need to recall the dissertation in order to interpret the information (figures 3 and 4). Other problems include map legends, maps, and graphs without color or sufficient separation of grey-scale to interpret (figures 5 and 6), and oversize plates reproduced as fragments. (Images are copyrighted by ProQuest and included with its permission.) Every image in each of the 398 scanned dissertations was examined on the ProQuest platform. One image of inadequate quality was considered sufficient to include a dissertation in the count as having inadequate quality.
Figure 1. Black image; no information. Hughes, Paul Warren, 1963, Stratigraphy of the Georgetown Formation, p. 46.
Figure 2. Black image; no information. Colquhoun, Donald John, 1960, Triassic stratigraphy of western central Canada, p. 31.
Figure 3. Lack of grey-scale; not enough information. Ellwood, Robert Brian, 1961, Surficial geology of the Vermilion area, Alberta, Canada, p. 11.
Figure 4. Lack of grey-scale; not enough information. Pierce, Robert William, 1969, Ultrastructure and biostratigraphy of the conodonts of the Monte Cristo Group, Arrow Canyon Range, Clark County, Nevada, p. 41.
Figure 5. Legend, original in color; not enough information. Also, split graphic. McKay, Edward Donald, 1977, Stratigraphy and zonation of Wisconsinan loesses in southwestern Illinois, p. 3.
Figure 6. Graph, original in color; not enough information. Roadcap, George Stewart, 2004, Geochemistry and microbiology of extremely alkaline (PH>12) ground water in the Calumet slag-fill aquifer, p. 53.
Although quality determination rests on individual opinion, the author worked as a geologist for 15 years and is well qualified to make that determination. In addition, a PowerPoint presentation of 20 examples of the various image problems was created. It includes screen shots of low-quality images representing the various types of problems, along with color scans of the same images from the original print dissertations for comparison. The presentation first shows a screen shot of the problem image, followed by a side-by-side comparison of the problem image and the scan from the original. The presentation was shown to five geologists at a geology conference, including geologists from industry, government, and education. It was also shown to a group of geology librarians at a conference, and to three non-geology librarians. There was no disagreement with the author's assessments of image quality. The presentation included the six images in this paper.
Of the 439 known UIUC geology Ph.D. dissertations, 398 (91%) were scanned from microfilm by ProQuest and are available online. Of the 398 scanned dissertations, 326 (82%) have inadequate quality. Of those 326, 284 (87%) have black or dark images or lack grey-scale, 129 (40%) have split plates due to oversize (mostly maps or cross-sections), 102 (31%) have both black/dark images and split plates, 4 (1%) have other quality problems such as blurry text or contours, missing pages, or light images, and 11 (3%) were almost adequate but had some quality problems. The split plates have two implications: it is difficult or impossible (due to distortion) for users to splice them back together from the scanned version, and the oversized items require special treatment if the originals are to be rescanned.
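The reported counts can be cross-checked with a few lines of arithmetic. The Python sketch below is illustrative only and is not part of the study's method. It assumes that the "other problems" and "almost adequate" groups do not overlap with the dark-image or split-plate groups; the text does not state this, but under that reading the categories sum exactly to the 326 inadequate dissertations. The variable and function names are hypothetical.

```python
# Counts taken from the results paragraph above.
KNOWN = 439        # known UIUC geology Ph.D. dissertations
SCANNED = 398      # scanned from microfilm by ProQuest
INADEQUATE = 326   # at least one image of inadequate quality
DARK = 284         # black or dark images, or lacking grey-scale
SPLIT = 129        # oversize plates split into fragments
BOTH = 102         # counted in both DARK and SPLIT
OTHER = 4          # blurry text or contours, missing pages, light images
ALMOST = 11        # almost adequate, but with some problems


def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to a whole number."""
    return round(100 * part / whole)


# Inclusion-exclusion: dissertations with dark images, split plates, or both.
dark_or_split = DARK + SPLIT - BOTH

# Assumption (not stated in the text): OTHER and ALMOST are disjoint
# from the dark/split group and from each other.
total_accounted = dark_or_split + OTHER + ALMOST

print("scanned share of known:", pct(SCANNED, KNOWN), "%")
print("inadequate share of scanned:", pct(INADEQUATE, SCANNED), "%")
for label, count in [("dark", DARK), ("split", SPLIT), ("both", BOTH),
                     ("other", OTHER), ("almost adequate", ALMOST)]:
    print(label, "share of inadequate:", pct(count, INADEQUATE), "%")
print("accounted for:", total_accounted, "of", INADEQUATE)
```

Because 102 dissertations fall into both the dark-image and split-plate groups, the category percentages are not expected to add up to 100%.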
It is no surprise that the images in the versions scanned from microfilm are not useful due to quality problems, since the original microfilm versions are the source of the problems. However, librarians unfamiliar with the discipline of earth sciences may be unaware of the extent of the problems within the earth science dissertations subset. Even the earth sciences librarians may be unaware of the problems, since patrons usually request and use the print versions of the dissertations rather than the microfilm. It is only with the recent availability of online versions scanned from microfilm that the extent of the image problems has become apparent.
There are some advantages to having the ProQuest scanned versions of the dissertations, even with image quality problems: The text is now available to unlimited numbers of users from any place, at any time. Having the text available online allows a user to determine whether the material is useful. The full text of the material is searchable, increasing awareness. A study by Weible (2008) found that over 17% of the sample of UIUC-authored print dissertations were missing from the shelf. The ProQuest scanned versions are especially useful when a print copy is not available.
Yet the poor microfilm scans in the digital dissertations may also exacerbate problems. There are usually only two original print copies of the dissertations, and sometimes only one is still in existence. Many of the older dissertations may be in fragile condition. For example, copies from the former UIUC Geology Library were shelved in an area below Biology labs, and flooding from above was a common occurrence, worsening their condition. Many of the older dissertations have glued-in photographs and fragile, oversized maps in pockets. Many older dissertations include hand-colored maps and other illustrations. The availability of the digital versions in ProQuest and in IDEALS may increase awareness of, and demand for, these print dissertations in order to see the original images. Increased use, both on campus and through Interlibrary Loan, will further stress and threaten these materials. While the dissertations housed in Rare Books High Density shelving are somewhat protected by the policy of in-house use only, some dissertations have incorrect catalog locations (not designated as part of the Rare Books collection); thus they can circulate. Given these problems, it is crucial to create high-resolution digital images of the print originals as soon as possible before they deteriorate further.
This problem undoubtedly exists for other disciplines that rely on high-quality color or grey-scale images, or oversize plates, to convey important information in dissertations. This is the case for many of the physical and life sciences, as well as others such as geography. Any program to rescan dissertations, either in house or contracted out, should consider all subject areas that rely on high-quality images and oversize plates to convey information, including the arts and humanities. It might be necessary for each institution to set its own priorities based on available staff, time, and funds, as well as the size of its dissertation collection, the condition of the various parts of the print collection, and use of the dissertations by subject.
Butler, Barbara, and Bankole, Karen. 2013. If a picture is worth a thousand words, shouldn't we address our image scanning standards? Journal of Interlibrary Loan, Document Delivery and Electronic Reserve 23(2), 81-95. DOI: 10.1080/1072303X.2013.817365.
Erdman, Jacquelyn M. 2006. Image quality in electronic journals: A case study of Elsevier geology titles. Library Collections, Acquisitions, and Technical Services 30(3-4), 169-178. DOI: 10.1016/j.lcats.2006.08.002.
Joseph, Lura E. 2006. Image and figure quality: A study of Elsevier's Earth and Planetary Sciences electronic journal back file package. Library Collections, Acquisitions, and Technical Services 30(3-4), 162-168. DOI: 10.1016/j.lcats.2006.12.002.
Joseph, Lura E. 2012. Improving the quality of online journals: Follow-up study of Elsevier's backfiles image rescanning project. Library Collections, Acquisitions, and Technical Services 36(1-2), 18-23. DOI: 10.1016/j.lcats.2011.08.001.
Scott, Sally J. 1990. Method for evaluating preservation needs of oversized illustrations in geology theses. Geoscience Information Society Proceedings 21, 137-146.
Shreeves, Sarah L., and Teper, Thomas H. 2012. Looking backwards: Asserting control over historic dissertations. College & Research Libraries News 73(9), 532-535.
Woods, Janet and McLean, Austin. 2010. Uncharted territory; The pitfalls and challenges of a large scale dissertation and theses retrospective digitization project. Unpublished paper presented at ETD 2010.
Black or Dark Images in Scanned Version:
Original in Color:
Lack of Grey-Scale; Photos & Micrographs:
Lack of Color or Grey-Scale; Graphs, Maps, etc.:
This work is licensed under a Creative Commons Attribution 4.0 International License.