Issues in Science and Technology Librarianship
Fall 2012
DOI:10.5062/F4Q81B1B

[Refereed]

Literature Use in Engineering and Computer Science Research: An Analysis of Works Cited in Dissertations and Theses

Janet Fransen
Engineering Librarian
University of Minnesota
Minneapolis, Minnesota
Fransen@umn.edu

Copyright 2012, Janet Fransen. Used with permission.

Abstract

Any engineering librarian will tell you that their researchers' literature needs differ from those of researchers in other disciplines: Books are used less, and conference papers more, than in humanities disciplines. This study analyzes literature cited in theses and dissertations submitted over a three-year period by students in three departments of the University of Minnesota's College of Science & Engineering. The results show how literature type and age differ between engineering researchers and computer scientists and their counterparts in humanities and social sciences. The results also illustrate how particular disciplines within engineering differ in their use of literature and in the degree to which they draw on the literature of fields outside their own.

Introduction

New graduate students in engineering and computer science come from a variety of backgrounds. Those entering graduate school immediately after achieving an undergraduate degree may have experience finding the kinds of books and papers they needed for undergraduate liberal arts classes, but may have had little or no exposure to the literature they'll need for their graduate program. Students returning to school after years in the workplace may know about standards and trade resources used in their day-to-day engineering work, but may not be as comfortable with conference papers and journal articles.

Over time, most graduate students will achieve a degree of fluency with the literature they need for their research. Engineering librarians can decrease the time to fluency by building their own knowledge of what past students have cited in their research and sharing that knowledge with their students. At the same time, they will increase their credibility with engineering faculty and administration and position themselves for more informed decision-making on collection development.

As a liaison librarian for Aerospace Engineering & Mechanics (AEM), Electrical & Computer Engineering (ECE), and Computer Science & Engineering (CS), I began this project as a means of determining what new graduate students most needed to know about the literature of their fields in order to provide more effective instruction and user services. I analyzed citations from the bibliographies of dissertations and theses from those three departments published over a three-year period. The results show differences in the type, discipline, and age of the materials used by graduate student researchers in each department.

Context

Institution

The University of Minnesota is a public research university with more than 52,000 enrolled students. The University's College of Science and Engineering includes 11 academic departments and in 2011-12 enrolled 5,046 undergraduates and 2,698 graduate students. Table 1 lists the numbers of faculty members, undergraduates, and graduate students for the departments included in the study.

Table 1. 2011-12 FTE and enrollments.

Department Faculty FTE Undergraduates Graduate Students
Computer Science 35 352 400
Electrical Engineering 39 324 483
Aerospace Engineering 19 153 96
Total 93 829 979

A small percentage of students will seek individual librarian consultations during their programs, so the liaison librarian must find creative ways to become familiar with the different types of research occurring in each department and with the kinds of literature researchers and students at all levels will need. By reviewing the bibliographies of dissertations and theses completed in the recent past, the author sought to answer these research questions:

Literature Review

Citation analysis is the process of gathering citations from a set of source documents, separating each into its component parts, and aggregating those parts to highlight trends in material type, journal title, age, or other factors. Many librarians undertake a citation analysis at the local level as a way of evaluating past collection development practices and informing future decisions regarding the collection. Hoffmann and Doucette (2012) reviewed the methods used in 34 such studies published between 2005 and 2010 alone. All of these were user studies (studies of a particular user group's citation practices), and all sought to inform collection practices. Hoffmann and Doucette caution those approaching a citation analysis project to consider their objectives and choose data to be analyzed accordingly. Their appendix, Guide to Considerations for Citation Analysis Methodologies, offers suggestions for defining the scope of a study. Thinking through decisions before confronting a mountain of citations will save the researcher's time and allow for the creation of a consistent, streamlined dataset.

Careful consideration of the methods can help the researcher avoid some of the problems and pitfalls cited by critics of citation analysis. MacRoberts and MacRoberts (1989) point to the interpretation of the bibliography as a "list of influences." They argue that, among other issues, authors in their sample did not cite all their formal influences, and rarely cited any of their informal influences.

Beile et al. (2004) used citations in education dissertations from three institutions to build core journal lists for each institution, as well as a composite list. They found that the composite list was quite different from each individual list. While bolstering the contention that local citation analysis is necessary to understand the practices of local researchers, this finding demonstrates that single-institution citation analysis studies cannot necessarily be generalized. The study also revealed that each institution's collection included the vast majority of citations by authors from that institution. The authors argue that this finding may just as easily indicate researchers' over-reliance on the local collection as it indicates the quality of the collection.

Even when the stated objective of a citation analysis study is to inform collection development, the effort often results in improvements to user services. Kirkwood (2009) was able to demonstrate the need for more funds to purchase monographic materials for civil engineering students, but also found that the training and experience graduate assistants gained while collecting the data made them more effective at the reference desk. In addition, the process brought to light tools and skills that new graduate students should be taught.

Knowing the types and age range of literature used by researchers in a discipline is a key to understanding the cultural norms of that discipline (Kushkowski et al. 2003). Armed with an understanding of the discipline's use of literature, librarians can work with academic advisors and others to educate new students in those norms. Many citation analysis studies have identified differences among disciplines, and among fields within a discipline. Knievel and Kellsey (2005) demonstrated that nearly three quarters of all citations in the humanities journal articles they analyzed were to monographs, but the percentage of citations to monographs varied greatly among the eight disciplines they investigated. Walcott's analysis of biology theses and dissertations at her institution showed very high use of serials over monographs (1994). Goodrum et al. (2001), Rahm (2008), and Wainer et al. (2010) all contend that conference publications are held in higher regard in Computer Science than in other fields. Thompson (2001) notes the importance of grey literature in engineering, as well as the continuing use of older technical reports from the National Advisory Committee for Aeronautics (NACA) and the National Aeronautics and Space Administration (NASA).

Methods

Because the primary objective of the study was to determine what new graduate students in engineering and computer science need to know, the study analyzed the bibliographies of the most in-depth written work successful graduate students are likely to complete: a thesis or dissertation. Master's theses and Ph.D. dissertations from each of the three departments for the calendar years 2008, 2009, and 2010 were identified using the ProQuest Dissertations & Theses database and the University of Minnesota's institutional repository, the University Digital Conservancy (UDC). Only dissertations and theses available in full text from one of these two online sources were included in the analysis. Counts by department are shown in Table 2.

Table 2. Theses and dissertation totals by department/year.

Calendar Year   AEM Theses   AEM Dissertations   CS Theses   CS Dissertations   ECE Theses   ECE Dissertations   Total
2008            -            2                   -           6                  -            13                  21
2009            1            5                   -           17                 2            21                  46
2010            8            6                   3           18                 8            13                  56
Total           9            13                  3           41                 10           47                  123

Most Master's students in engineering and computer science choose a non-thesis path, so thesis counts do not reflect the number of degrees awarded. Because of the small number of theses (22) relative to dissertations (101), the study compared average citation counts but made no other comparisons of citations in one versus the other.

Lists of source documents for each department were generated using a screen-scraping utility, Needlebase, and by exporting search results to RefWorks. Given the large number of source documents, the author experimented with several means of automatically harvesting citations rather than manually transcribing them. The process used involved copying and pasting each bibliography into a Microsoft Word document and using Visual Basic for Applications (VBA) code to parse the citations into individual table rows for import to Microsoft Access. For a step-by-step description of the process, see the white paper Parsing Citations using Visual Basic for Applications: A Step-by-Step Guide (Fransen 2012).
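The harvesting itself was done with VBA in Microsoft Word, as described in the white paper. The following is not that code, but a minimal Python sketch of the same general idea: splitting a pasted bibliography into one row per citation and guessing a publication year. It assumes entries are numbered in the form "[1]", which will not hold for every bibliography style; the function names and the sample entries are hypothetical.

```python
import re

# Assumption: each entry starts with a bracketed number such as "[12]".
ENTRY_START = re.compile(r"^\[\d+\]\s*", re.MULTILINE)
# Assumption: a publication year is a four-digit number from 1800 to 2099.
YEAR = re.compile(r"\b(1[89]\d{2}|20\d{2})\b")


def split_bibliography(text):
    """Split pasted bibliography text into one string per citation."""
    parts = ENTRY_START.split(text)
    # Collapse internal line breaks and drop empty fragments.
    return [" ".join(p.split()) for p in parts if p.strip()]


def guess_year(citation):
    """Return the last plausible year in the citation, or None."""
    years = YEAR.findall(citation)
    return int(years[-1]) if years else None


# Hypothetical bibliography text, for illustration only.
sample = """[1] J. Smith, "A hypothetical paper," Journal of Examples,
vol. 3, pp. 1-10, 2005.
[2] A. Jones, Hypothetical Book. Example Press, 1998."""

rows = [(c, guess_year(c)) for c in split_bibliography(sample)]
for text, year in rows:
    print(year, text)
```

Taking the last four-digit match as the year is only a heuristic: page ranges or report numbers that look like years would be misread, so rows produced this way would still need manual review.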

Data elements collected for each source document included the title, date submitted, department, advisor, and author. Data elements collected for each citation included the text of the citation, literature type, publication year, and journal or conference title (if applicable).
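The study stored these elements in Microsoft Access. As a rough illustration only, the two record types and a simple tally could be sketched in Python as follows. The class and field names are assumptions chosen to mirror the elements listed above, and the sample records are invented.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class SourceDocument:
    title: str
    date_submitted: str
    department: str
    advisor: str
    author: str


@dataclass
class Citation:
    source: SourceDocument
    text: str
    literature_type: str
    publication_year: Optional[int]
    venue_title: Optional[str] = None  # journal or conference title, if applicable


def count_by_department_and_type(citations):
    """Tally citations by (department, literature type)."""
    return Counter((c.source.department, c.literature_type) for c in citations)


# Hypothetical records, for illustration only.
doc = SourceDocument("Example thesis", "2010-05", "AEM", "A. Advisor", "S. Student")
cites = [
    Citation(doc, "Example report citation", "Report", 1975),
    Citation(doc, "Example article citation", "Journal", 2004, "Journal of Fluid Mechanics"),
    Citation(doc, "Another article citation", "Journal", 1999, "Physics of Fluids"),
]
counts = count_by_department_and_type(cites)
print(counts)
```

Keeping a reference from each citation back to its source document is what makes department-level comparisons possible later.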

Table 3 shows the number of citations as well as the average by department.

Table 3. Citation counts and averages by department.

                              Average Citations per Source Document
Dept    Number of Citations   Overall   Master's   PhD
AEM     1,443                 66        33         88
CS      4,495                 102       50         106
ECE     4,474                 78        33         88
Total   10,411                85        35         95
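The overall averages in Table 3 can be reproduced from the published counts by dividing each department's citations by its number of source documents. The short sketch below does that arithmetic; the variable names are arbitrary, and the only inputs are figures taken from Tables 2 and 3.

```python
# Source-document counts per department (theses + dissertations),
# from the Total row of Table 2.
documents = {"AEM": 9 + 13, "CS": 3 + 41, "ECE": 10 + 47}

# Citation counts per department, from Table 3.
citations = {"AEM": 1443, "CS": 4495, "ECE": 4474}

# Average citations per source document, rounded to whole citations.
averages = {dept: round(citations[dept] / documents[dept]) for dept in documents}

for dept, avg in averages.items():
    print(dept, avg)
```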

In order to provide as rich a picture of literature use as possible, each citation was assigned one of about 40 literature types. For analysis, these types were aggregated into the seven summary literature types that appear as the rows of Table 4.

Literature type was determined using UlrichsWeb or, when the title was not found, by searching Google Scholar for the article title and inspecting the article itself for the required information. UlrichsWeb and OCLC WorldCat provided the ISSN, publisher, and Library of Congress classification for each journal.

Results

Types of Literature

Overall, graduate students in the three departments cited articles in academic journals more often than anything else, as shown in Table 4. Books accounted for only 10 percent of all citations. This is consistent with Musser & Conkling's analysis of citations in engineering journals: In their work, journals comprised 53 percent of the citations, and monographs accounted for 12 percent (1996).

Table 4. Literature format type counts: Total for all departments.

Literature Type Number of Citations Percent of Total
Journal; Academic/Scholarly 4,948 48%
Conference 3,254 31%
Book 1,034 10%
Report 307 3%
Web site 301 3%
Thesis 149 1%
Other 410 4%
Grand Total 10,411 100%

By contrast, although Miller's analysis of biology theses showed similar use of books (11 percent), students in that study referenced journal articles 84 percent of the time, and conference papers a scant 0.8 percent (2011). Kayongo & Helm (2012) noted that dissertation writers in arts and humanities at Notre Dame cited books 73 percent of the time, and 42 percent of social sciences scholars' citations were to books.

Although journal articles and conference papers are cited much more often than books in engineering and computer science, analysis of these data showed marked differences among the three departments (Figure 1).

Figure 1. Comparison of literature types used by department.

Computer scientists were more likely to cite conference papers (42 percent) than any other literature type, including journal articles (35 percent). Unlike in other academic disciplines, conference publication in computer science is considered to be at least on par with, if not superior to, journal publication (Andonie and Dzitac 2010; Wainer et al. 2010).

Aerospace Engineering students were more likely to use reports than were those in Computer Science and Electrical Engineering. Although the share of citations to reports is not exceptionally high relative to other literature types (8 percent), citing reports is common: Of the 22 source documents in Aerospace Engineering, 18 cited at least one report. Reports were most commonly AIAA Papers or NACA/NASA reports.

Library of Congress Classifications

During the data gathering phase, it became apparent that computer scientists cited more literature from other disciplines than students in the other two departments. Therefore, Library of Congress classifications for journals cited were collected and analyzed. Figure 2 illustrates the relatively large number of citations to journals outside of Science and Technology by computer science students.

Figure 2. Journals cited by LC Classification.

Note that Computer Science itself is classified by the Library of Congress as Science, within the Mathematics subclass. This accounts for the high number of citations to journals classified as Mathematics relative to other sciences, as shown in Table 5.

Table 5. LC subclass distribution for journals cited by Computer Science students in the Science classification.

Subclass Description Number of Citations Percent of Total
Mathematics (QA) 396 38%
Science (General) (Q) 194 19%
Physiology (QP) 189 18%
Natural history - Biology (QH) 187 18%
Physics (QC) 36 3%
Geology (QE) 21 2%
Astronomy (QB) 6 1%
Chemistry (QD) 5 0%
Microbiology (QR) 3 0%
Zoology (QL) 2 0%
Total 1,039 100%
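The percentages in Table 5 follow directly from the citation counts. The sketch below recomputes them and also shows how an LC subclass code relates to its top-level class: the first letter of the code names the class, so every subclass in this table belongs to Q (Science). The helper name `lc_class` is an illustrative assumption, not part of the study's tooling.

```python
# Citation counts by LC subclass, copied from Table 5.
subclass_counts = {
    "QA": 396, "Q": 194, "QP": 189, "QH": 187, "QC": 36,
    "QE": 21, "QB": 6, "QD": 5, "QR": 3, "QL": 2,
}


def lc_class(subclass):
    """Return the top-level LC class letter for a subclass such as 'QA'."""
    return subclass[0]


total = sum(subclass_counts.values())
# Share of each subclass, rounded to a whole percent as in the table.
percent = {s: round(100 * n / total) for s, n in subclass_counts.items()}

for s, n in subclass_counts.items():
    print(lc_class(s), s, n, f"{percent[s]}%")
```

Because each share is rounded independently, the rounded percentages need not sum to exactly 100.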

Journals cited in Physiology were largely neuroscience-related, and those in Natural history-Biology were from the bioinformatics and molecular biology sub-disciplines. Computer Science scholars also drew on materials from disciplines outside of Science and Technology (Table 6).

Table 6. LC classification distribution of journals cited by Computer Science students that were not classified as Science or Technology.

Class Description Number of Citations Percent of Total
Social Sciences (H) 80 27%
Medicine (R) 60 20%
Philosophy, Psychology, Religion (B) 58 20%
Language and Literature (P) 43 15%
Geography, Anthropology, Recreation (G) 21 7%
Bibliography, Library Science, Information Resources (Z) 16 5%
History of the Americas (E) 4 1%
General Works (A) 4 1%
Agriculture (S) 4 1%
Political Science (J) 2 1%
Education (L) 2 1%
Total 294 100%

About 40 percent of the social sciences journals cited were related to statistics. As with the Physiology and Natural History-Biology journals in the Science classification, journals classified as Medicine were mostly related to neuroscience, bioinformatics, and molecular biology. Philosophy, Psychology, Religion journals fell entirely into the Psychology subclass, including cognitive science. Language and Literature titles related to linguistics.

Aerospace Engineering journals cited were predominantly in the Science classification. This was surprising since the subclass for Aeronautics and Astronautics is TL, in the Technology classification. Journals in the Science classification fell into the Mathematics and Physics subclasses. The most highly cited journals in these areas are fluids-related: Journal of Fluid Mechanics, Physics of Fluids, and Experiments in Fluids. The University's Aerospace Engineering & Mechanics department does have a strong Fluid Mechanics research area, but it is worth noting that citations to these three journals came from seven of the 22 source documents, with two dissertations accounting for two-thirds of those (Table 7).

Table 7. Citations by Aerospace Engineering students to Journal of Fluid Mechanics, Physics of Fluids, and Experiments in Fluids.

Source Document Citations to Three Major Fluids Journals Classified as QA or QC Percent
Dissertation A 80 34%
Dissertation B 77 33%
Dissertation C 34 14%
Dissertation D 27 11%
Dissertation E 8 3%
Dissertation F 8 3%
Thesis G 1 0%
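The concentration described above can be checked with a few lines of arithmetic. This sketch takes the per-document counts from Table 7 and computes the share contributed by the two most heavily citing documents; the variable names are arbitrary.

```python
# Citations to the three fluids journals per source document, from Table 7.
fluids_citations = {
    "Dissertation A": 80, "Dissertation B": 77, "Dissertation C": 34,
    "Dissertation D": 27, "Dissertation E": 8, "Dissertation F": 8,
    "Thesis G": 1,
}

total = sum(fluids_citations.values())

# Share of all such citations that comes from the two largest contributors.
ranked = sorted(fluids_citations.values(), reverse=True)
top_two_share = sum(ranked[:2]) / total

print(total, f"{top_two_share:.0%}")
```

A check like this is a quick way to see whether a journal's apparent importance in a local study rests on a broad base of users or on a handful of source documents.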

Age

As with literature types and journal classifications, the age profile of materials used differs by department. The averages for Computer Science and Electrical Engineering shown in Table 8 are similar; the average age of Aerospace Engineering literature is substantially older for all types listed. This does not imply that Aerospace Engineering students cited only older literature: 50 percent of the conference papers and journal articles they cited were no more than six years and four years old, respectively. Those ages are similar to the 50 percent mark for Computer Science and Electrical Engineering citations. But the aerospace engineering materials appear to have a longer shelf life: 20 percent of the journal articles cited were more than 23 years old.

Table 8. The literature used by researchers in aerospace engineering has a longer useful life than that used in other disciplines.

Document Type   Avg Age (years)   AEM Avg   AEM Years to 50%   AEM Years to 80%   CS Avg   CS Years to 50%   CS Years to 80%   ECE Avg   ECE Years to 50%   ECE Years to 80%
Conference      6.9               12.0      6                  14                 7.0      6                 10                6.0       5                  8
Journal         11.6              16.1      4                  23                 10.9     7                 15                10.6      7                  15
Book            15.9              20.6      13                 35                 15.0     10                21                14.4      10                 22
Report          14.3              23.4      15                 39                 9.8      6                 14                6.0       3                  8
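The "Years to 50%" and "Years to 80%" columns can be read as cumulative age marks. The article does not spell out the exact calculation, so the sketch below rests on two assumptions: that a citation's age is the source document's submission year minus the cited work's publication year, and that "years to N%" is the smallest age at which at least N percent of citations are that age or newer. The publication years are invented for illustration.

```python
import math


def citation_age(submission_year, publication_year):
    """Assumed definition: age of a cited work when the thesis was submitted."""
    return submission_year - publication_year


def years_to_percent(ages, percent):
    """Smallest age such that at least `percent` of citations are that age or newer."""
    ordered = sorted(ages)
    # Position of the citation that completes the requested share.
    k = math.ceil(percent * len(ordered) / 100) - 1
    return ordered[max(k, 0)]


# Hypothetical publication years cited by a document submitted in 2010.
pub_years = [2009, 2008, 2008, 2006, 2005, 2003, 2001, 1995, 1987, 1972]
ages = [citation_age(2010, y) for y in pub_years]

average_age = sum(ages) / len(ages)
print(average_age, years_to_percent(ages, 50), years_to_percent(ages, 80))
```

In this invented sample a few old items pull the average well above the 50 percent mark, which is the same pattern the Aerospace Engineering figures show: a recent midpoint alongside a long tail of older material.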

Aerospace Engineering students' use of older literature bears further examination, particularly as it relates to reports. As noted by Thompson (2001), NASA and its precursor NACA conducted a great deal of basic research in areas surrounding flight and space travel. Reports of that research are still valuable in the field today. Citation counts spike at the 80-89 year mark (the 1920s) and the 30-39 year mark (the 1970s). The first includes NACA work on wing and airfoil sections. The second set of citations includes research on boundary layers and heat transfer that would have been particularly important for spacecraft launch and reentry.

Because Master's and PhD students use and cite materials outside of the LC subclass for their discipline (and therefore outside of the scope for the liaison/selector for that discipline), the author examined the average ages of journals in subclasses other than that of the dissertation writer's discipline and compared them to the overall average age of journals for the discipline. The data for all subclasses are scattered, particularly since many subclasses have only one or two associated citations. Table 9 shows a selection of subclasses for each discipline for comparison.

Table 9. Average age of journal citations for select subclasses outside the primary subclass for the discipline.

AEM average journal age = 16 years
Average journal age for Motor Vehicles. Aeronautics. Astronautics (TL) subclass = 11 years
Subclass Number of Citations Average Age
Science (General) (Q) 17 42
Chemical Technology (TP) 22 30
Chemistry (QD) 57 21
Mathematics (QA) 176 16
Physics (QC) 262 14
Engineering (General). Civil Engineering (TA) 79 12

 

CS average journal age = 11 years
Average journal age for Mathematics (QA) subclass = 11 years
Subclass Number of Citations Average Age
Statistics (HA) 32 24
Psychology (BF) 57 23
Physics (QC) 36 16
Science (General) (Q) 170 11
Natural History - Biology (QH) 187 9
Electrical engineering. Electronics. Nuclear engineering (TK) 189 8

 

ECE average journal age = 11 years
Average journal age for Electrical engineering. Electronics. Nuclear engineering (TK) subclass = 8 years
Subclass Number of Citations Average Age
Mechanical Engineering and Machinery (TJ) 89 15
Physics (QC) 557 13
Science (General) (Q) 261 12
Natural History - Biology (QH) 116 8
Engineering (General). Civil Engineering (TA) 102 7

Discussion

As assumed at the outset of the study, graduate students in engineering and computer science overall cited academic journal articles more often than any other type of literature. However, this was not true in all disciplines: Computer Science students cited conference papers more frequently than journal articles. Conference papers have generally been considered grey literature, but are increasingly findable on the web through conference web sites, sponsoring society databases, and institutional repositories. It may be that, as Fortnow (2009) suggests, computer science researchers will join the rest of the academy and move toward more polished journal articles as the standard for citation. But at this time it seems just as likely that fields such as electrical engineering will increasingly accept conference papers as a mode of sharing new research and place less emphasis on journal articles. More longitudinal research across different types of publications is needed to determine whether a trend exists.

In light of their future need for journal articles and conference papers, all engineering and computer science graduate students should be taught how to search for such literature in licensed databases, as well as in Google Scholar and on societies' and publishers' web sites. They should understand which search tools will provide the most efficient path to the particular literature they need, and how to find either a particular item from a citation or everything they can on a topic.

All liaisons with collection responsibilities spend some of their selection time on books. Selectors in engineering and computer science are aware that they spend more time managing serials in their collections than books, and this study indicates that their time allocation is appropriate. The data in this study can help explain to others in the library system how engineering selectors' needs and processes differ from those of many of their colleagues.

In addition to highlighting the heavy use of journal articles and conference papers across the engineering disciplines studied here, the data show the importance of reports to aerospace engineering students. Many of the NACA, NASA, and AIAA reports commonly used in aerospace engineering are available online. But many more are available at the University of Minnesota only in microfiche form. Liaisons can help introduce students to the culture of literature in that field by offering instruction on how to find and use papers that have not been digitized.

The wide range of literature cited, particularly in computer science, supports Tucci's assertion that "interdisciplinary work may well present librarians with excellent opportunities to convince those faculty members now practicing self-service that librarians have critical information-seeking knowledge and skills and can help them improve the efficiency and effectiveness of their research" (2011). Clearly, librarians working within a liaison or subject specialist model must make a point of both learning from and educating colleagues about a wide range of disciplines outside their own subject areas.

Liaisons instructing new graduate students need not school them on the use of literature in every other possibly overlapping field, nor gain all that knowledge themselves. But they should make students aware that differences, as well as other databases, exist, and that other librarians are available within the institution to help connect them to the interdisciplinary information they need.

Because budget cuts and new materials may be relevant beyond strict department boundaries, selectors need to make faculty aware of decisions about the collection, particularly journal subscriptions, for other subject areas.

The data suggest (but do not establish) that the age profile of useful literature from other fields may differ from the age profile of literature used within the student's discipline. For example, it is possible that computer scientists working on biology research would need more and older biology review articles than scholars working full time in biology. More study is needed to determine whether scholars in a particular field use their literature in the same way those outside the field use it.

Although the analysis discussed here portrays a certain profile of use for each department studied, those profiles should not be interpreted as complete models of all research for those disciplines, either in general or at the University of Minnesota. Aspects of the data, particularly those relating to interdisciplinary research, may reflect the research grant activities of labs that happened to produce a higher number of dissertations or theses during the timeframe covered. It is important to understand that some areas use literature from other disciplines more than others; it is less important to know what those other disciplines were at the University of Minnesota during the 2008-2010 timeframe.

Conclusion

Conducting this study deepened my knowledge of the subject areas. But more importantly, the study's results changed the way I orient new graduate students to finding literature in their field. I now present some of the findings in my fall orientation sessions with each department's new students, notably the types of literature used in each field, the diversity of subject areas used in computer science, and the use of older reports in aerospace engineering. The final results provide a snapshot that helps new students understand what they need to know as they begin their own information gathering process, and demonstrate how engineering is different from other disciplines, and how sub-disciplines differ from one another.

At the same time, citation analysis of theses and dissertations is only one tool in understanding the literature used in a field. Some writers of these source documents will continue in academia and produce other writing that is likely to draw on the same types of literature. But the vast majority of Master's recipients and more than half of doctoral engineers and computer scientists work in the private for-profit sector (National Science Foundation 2009).

A study by Jeffryes and Lafferty used surveys and focus groups made up of University of Minnesota undergraduates who had participated in the cooperative education program in 2009-10 (2012). Their summary of the types of literature students were asked to use on the job, and their comfort level finding and using that literature, provides a second perspective on what students should be taught about engineering resources to prepare them for the workplace. For example, their research showed that more than three quarters of the students (who were primarily from the Mechanical Engineering department) were asked to find and use standards. This literature type is obviously important in engineering, but was not often cited in theses and dissertations in the fields studied. This second study provides a needed balance to the study described here. Liaisons can use these studies and others like them to build curricula that introduce students at all levels to the culture of their fields.

References

Andonie, R. and Dzitac, I. 2010. How to write a good paper in computer science and how will it be measured by ISI Web of Knowledge. International Journal of Computers, Communications & Control (4):432-46.

Beile, P.M., Boote, D.N., and Killingsworth, E.K. 2004. A microscope or a mirror?: A question of study validity regarding the use of dissertation citation analysis for evaluating research collections. The Journal of Academic Librarianship 30(5):347-53.

Fortnow, L. 2009. Viewpoint: Time for computer science to grow up. Communications of the ACM 52(8):33.

Fransen, J.L. 2012. Parsing citations using Visual Basic for Applications: A step-by-step guide. [Internet]. [Cited November 19, 2012]. Available from: http://purl.umn.edu/127017

Goodrum, A.A., McCain, K.W., Lawrence, S., and Lee, G.C. 2001. Scholarly publishing in the Internet age: A citation analysis of computer science literature. Information Processing & Management 37(5):661-75.

Hoffmann, K. and Doucette, L. 2012. A review of citation analysis methodologies for collection management. College & Research Libraries. [Internet]. [Cited November 19, 2012]. Available from: http://crl.acrl.org/content/early/2011/07/21/crl-254.short

Jeffryes, J. and Lafferty, M. 2012. Gauging workplace readiness: Assessing the information needs of engineering co-op students. Issues in Science and Technology Librarianship (69). [Internet]. [Cited November 19, 2012]. Available from: http://www.istl.org/12-spring/refereed2.html

Kayongo, J. and Helm, C. 2012. Relevance of library collections for graduate student research: A citation analysis study of doctoral dissertations at Notre Dame. College & Research Libraries 73(1):47-6.

Knievel, J.E. and Kellsey, C. 2005. Citation analysis for collection development: A comparative study of eight humanities fields. The Library Quarterly 75(2):142-68.

Kirkwood, P. 2009. Using engineering theses and dissertations to inform collection development decisions especially in civil engineering. American Society for Engineering Education 2009 Annual Conference and Exposition. [Internet]. [Cited November 19, 2012]. Available from: http://tinyurl.com/c8e9l85

Kushkowski, J.D., Parsons, K.A, and Wiese, W.H. 2003. Master's and doctoral thesis citations: Analysis and trends of a longitudinal study. Portal: Libraries and the Academy 3(3):459-7.

MacRoberts, M.H. and MacRoberts, B.R. 1989. Problems of citation analysis: A critical review. Journal of the American Society for Information Science 40(5):342-9.

Miller, L.N. 2011. Local citation analysis of graduate biology theses: Collection development implications. Issues in Science and Technology Librarianship (64). [Internet]. [Cited November 19, 2012]. Available from: http://www.istl.org/11-winter/refereed3.html

Musser, L.R. and Conkling, T.W. 1996. Characteristics of engineering citations. Science & Technology Libraries 15(4):41-9.

National Science Foundation. 2009. Characteristics of doctoral scientists and engineers in the United States: 2006. Arlington, VA: National Science Foundation, Division of Science Resources Statistics. [Internet]. [Cited November 19, 2012]. Available from: http://www.nsf.gov/statistics/nsf09317

Rahm, E. 2008. Comparing the scientific impact of conference and journal publications in computer science. Information Services & Use 28:127-8.

Thompson, L.A. 2001. Grey literature in engineering. Science & Technology Libraries 19(3):57-73.

Tucci, V.K. 2011. Assessing information-seeking behavior of computer science and engineering faculty. Issues in Science and Technology Librarianship (64). [Internet]. [Cited November 19, 2012]. Available from: http://www.istl.org/11-winter/refereed5.html

Wainer, J., de Oliveira, H.P., and Anido, R. 2010. Patterns of bibliographic references in the ACM published papers. Information Processing & Management 47(1):135-42.

Walcott, R. 1994. Local citation studies: A shortcut to local knowledge. Science & Technology Libraries 14(3):37-41.
