During the week of January 16-19, I visited the Smithsonian Tropical Research Institute (STRI) to discuss several matters relating to the Smithsonian Research Online (SRO) program and to offer technical support and training to STRI library staff. I was accompanied from Washington by Digital Services Head Martin Kalfatovic, who was to attend a three-day Encyclopedia of Life meeting at Barro Colorado Island during the same week.
Together we met with Oris Sanjur (STRI Associate Director for Science Administration), Vielka Chang-Yau (STRI head librarian), Angel Aguirre (librarian), Klaus Winter (STRI scientist) and Eldredge Bermingham (STRI Director). Everyone agreed that STRI-authored publication data ought to be collected in one place and that the SIL is doing a good job of coordinating this program across all Institution units. The Director and Associate Director will discuss the specific needs of their unit and report back to the SIL, which will then propose a workflow to meet them.
Meanwhile, I gave a brief introduction to the bibliographic tools EndNote and Zotero for STRI library staff and volunteers. Although we had a training room available, copies of these programs were unfortunately not available to every participant. Still, they were able to see the possibilities of using these tools in day-to-day library services.
Alvin and Vielka review the SRO website and list of Smithsonian-authored publications using the newly installed LCD screen in the STRI library. Photo courtesy of martin_kalfatovic via Flickr.
Finally, I met with Fernando Bouché (Head, Office of Information Technology) and STRI programmer, Carlos Caballero, to discuss the management of publication data, its re-use on the STRI web page and inclusion in the SI Collections search system (EDAN).
STRI scientists publish over 300 scholarly papers every year. Approximately 70% of them are captured automatically by the SRO via websites and associated tools, which circumvents the need for manual data entry. Including the complete corpus of work done there is an essential part of representing the research conducted at the Institution, and the cooperation between the SI Libraries and STRI will bring the project to fruition.
At the recent Berlin 9 Open Access meeting, a pre-conference session on open access publishing featured speakers who detailed the innovations in publishing business models needed both to make scholarship freely available and to ensure sustainability. Among the speakers was Dr. Neil M. Thakur of the National Institutes of Health. His presentation centered on an aspect of open access that I have not seen discussed before. Thakur opened with the central question of how to do more with less, and he listed three options: work longer, work cheaper or create efficiencies in productivity. It was the last (and only realistic) option that he concentrated on. Making scientific publishing more efficient requires open access to the literature, but for reasons that have previously been overlooked.
In the past, advocates for open access to scholarly literature have emphasized two audiences that suffer from a lack of access: scientists who work at under-funded organizations and cannot afford increasingly expensive subscriptions to scholarly journals, and motivated citizen-scientists (sometimes patients with debilitating diseases) who take it upon themselves to learn the technical language of their area of interest but who are locked out of a large body of literature because they lack the resources to pay.
But Thakur brings in a third and until now ignored audience: machines. The development of natural-language computer processing and text-mining services is going to be increasingly useful in science in the near future. Because most researchers now face an information-glut rather than an information-scarcity, it is more and more important for them to be able to scan and review large bodies of publications which cannot be covered by simple linear readings. So this time-scarcity problem can be addressed by making the text of scientific publications open to machine processing and interpretation in order to allow scholars to quickly review publications both past and current based on the frequency of certain terms, their proximity to one another and other algorithms. This machine-to-machine access to scholarly literature is a productivity multiplier, Thakur said in his presentation.
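The kind of machine review described above, based on the frequency of terms and their proximity to one another, can be illustrated with a minimal Python sketch. This is not a method Thakur presented; the function names, the sample texts and the 10-word window are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def term_frequency(text, terms):
    """Count how often each term of interest appears in the text."""
    counts = Counter(tokenize(text))
    return {term: counts[term.lower()] for term in terms}


def min_distance(text, term_a, term_b):
    """Smallest gap, in word positions, between two terms (None if either is absent)."""
    tokens = tokenize(text)
    pos_a = [i for i, t in enumerate(tokens) if t == term_a.lower()]
    pos_b = [i for i, t in enumerate(tokens) if t == term_b.lower()]
    if not pos_a or not pos_b:
        return None
    return min(abs(a - b) for a in pos_a for b in pos_b)


def rank_papers(papers, term_a, term_b, window=10):
    """Keep papers where both terms occur within `window` words of each other,
    ordered by combined term frequency (highest first), then by proximity."""
    ranked = []
    for title, text in papers.items():
        distance = min_distance(text, term_a, term_b)
        if distance is not None and distance <= window:
            freq = term_frequency(text, [term_a, term_b])
            ranked.append((title, sum(freq.values()), distance))
    ranked.sort(key=lambda row: (-row[1], row[2]))
    return ranked


# Hypothetical sample texts, invented for this example.
papers = {
    "Paper A": "Coati sociality increases reproductive success. "
               "Sociality in the coati is common.",
    "Paper B": "Theropod morphology is described from new fossil materials.",
    "Paper C": "The coati forages alone in some seasons, and long paragraphs "
               "of unrelated text about forest canopy structure follow "
               "before sociality is mentioned.",
}

results = rank_papers(papers, "coati", "sociality")
print(results)
```

Only "Paper A" passes the filter here: "Paper B" mentions neither term, and in "Paper C" the two terms are too far apart to suggest that they are discussed together. Real text-mining services go far beyond this, but even this simple filter requires the full text to be open to machines.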
A second presentation was given by Peter Binfield of the Public Library of Science (PLoS), one of the most accomplished open access publishers using the business model in which the author pays an article processing charge. In addition to this new way of doing the business of publishing, in recent years a new journal, PLoS One, has become the largest journal, publishing over 6000 papers in 2010*. (Binfield expects to publish more than 15,000 in 2011.) Despite the high volume, this journal publishes only papers of sound scientific quality, and all manuscripts are peer-reviewed as with any other scientific journal. The key difference is that there is no editorial oversight filtering submissions based on popularity or widespread appeal of the subject matter; no matter the topic, if the science is done properly and it passes review by other scientists, it can be published in PLoS One. This model has become so popular that it has spawned a number of imitators from both commercial and non-profit publishers, and Binfield pointed out that most of them have article processing charges nearly identical to that of PLoS One ($1350).
Interestingly, PLoS One was assigned an Impact Factor® by Thomson Reuters in 2010. Although Binfield says that PLoS doesn’t particularly care for the Impact Factor® as a useful measure of scientific achievement, the inclusion of the journal in this popular metric probably explains the spike in submissions during 2011.
*According to Smithsonian Research Online data, Institution scholars have published more than 65 items in PLoS One including 25+ in 2011.
A recent article in BusinessWeek (http://buswk.co/h8pnfS) profiled a Japanese company that provides homes with some needed extra space. A recent startup, Bookscan, offers scanning of personal book collections in part for customers to more efficiently use their domestic space. As many know, Japanese homes are generally much smaller than North American homes and one can imagine that the elimination or reduction of a bookshelf can be a very valuable expansion of living area.
In addition to services such as those offered by Bookscan, major manufacturers have begun introducing increasingly sophisticated consumer scanning technologies (an example from ION Audio pictured here). There have even been attempts with personal cameras and other equipment (http://bit.ly/fNqHrt) to scan entire volumes for personal use.
One thing is clear: it has never been easier to create personal digital collections than it is now, and that's one reason a Digital Library group was recently formed at the Smithsonian Institution Libraries. In addition to licensing content from commercial publishers and scanning our own books, the SIL is beginning to handle a wide variety of digital materials created outside the professional publishing world. However, the acquisition and management of this type of content in library collections demands forethought and the development of a coherent approach.
Most data management and digital library experts agree that early intervention in the management of digital material (preferably at the time of creation) is the best way to ensure that the content will survive and be usable even in the short term. The SIL's new Digital Library group aims to ensure that electronic materials are easily discovered, usable and integrated into the larger Smithsonian digital world. The group plans to meet regularly, relying on the perspective of all SIL staff to recommend workflows and practices for specific situations and material types. All staff are welcome to attend DL meetings and either contribute their own experience or learn more about the collection and management of this material at SIL.
The Smithsonian Research Online program recently surpassed the mark of 10,000 publications in the Digital Repository. This collection of digital publications by Smithsonian staff represents a broad review of the research done at the Institution. Each year the program (SRO) collects information on nearly 2000 publications by Institution researchers, many of whom later contribute the corresponding digital reprint of their article. This milestone was reached when the paper by Ben Hirsch (STRI) and Jesus Maldonado (NZP), “Familiarity breeds progeny: sociality increases reproductive success in adult male ring-tailed coatis (Nasua nasua)”, was deposited in the collection.
The SRO consists of two basic components: a list of publications authored by Smithsonian researchers and affiliates, and a corresponding digital repository which contains the actual article or chapter in electronic form. The data which SRO collects is not only used by Institution administrators for research assessment purposes, but is also re-used by webmasters and other Smithsonian offices for reports, presentations and other public information services.
These electronic versions of peer-reviewed, scholarly journal articles are therefore made much more widely available to the worldwide research community thanks to indexing and search capabilities provided by the Repository in conjunction with scientific web portals. In addition to making specific articles by Smithsonian scholars easy to find, the Digital Repository indexes the full text of each publication, thereby allowing search engines to retrieve these publications based on technical or geographic terms which may not appear in the title of the publication.
Shortly after adding the Hirsch and Maldonado paper, the Repository added a paper by NMNH paleontologist Matthew Carrano, “New materials of Masiakasaurus knopfleri Sampson, Carrano, and Forster, 2001, and implications for the morphology of the Noasauridae (Theropoda: Ceratosauria)”. This item is part of Smithsonian Contributions to Paleobiology, and its addition to the Repository ensures that Smithsonian Scholarly Press publications are archived in digital form for long-term public access.
Studying library science means, among other things, studying the publishing industry and standard publishing practices. For that reason, research librarians are in a good position to offer new services to university and other scholarly publishers. Several university presses now partner with their libraries for support in the conversion to digital publishing.
Many scholarly presses are tapping into the experience with digital publications that librarians have developed since the 1990s. Libraries are increasingly able to offer services ranging from the preservation of digital content and the creation of descriptive and other metadata, which ensures that publications are easily discovered by researchers, to the management of an efficient digital publication workflow and the publishing of scientific and other data.
A recent meeting co-sponsored by the Society for Scholarly Publishing and the Association of Research Libraries underscores this emerging connection. “Partnering to Publish: Innovative Roles for Societies, Institutions, Presses, and Libraries” was held in Washington, D.C. on November 10th and featured speakers from libraries and publishers highlighting current cooperation between libraries and presses and exploring new opportunities for cost-effective and innovative joint ventures.
The recent creation of several organizations on university campuses with names like the Center for Digital Research and Scholarship and the creation of new library staff positions such as Associate University Librarian for Publishing or Digital Scholarly Publishing Officer are evidence of this trend in new services at universities across North America. The National Institutes of Health library, for example, includes “reference assistants” who help scientific authors with reference verification and management and with manuscript submission. Clark also stated that the advent of online databases, bibliographic management software and other self-service resources has contributed to a drop in requests for basic reference assistance. The inclusion of these kinds of “author services” in addition to traditional reader services in research libraries has been noted elsewhere.1
Recently the Smithsonian Institution Libraries has also become involved, not only by archiving the digital editions of the Smithsonian Scholarly Press, but also by managing the Institution’s membership in CrossRef, a registry of persistent identifiers (DOIs) commonly used to link to academic publications on the web. The SIL is also overseeing the conversion of existing Scholarly Press publications to the increasingly popular ePub format, which is usable on electronic book readers such as Amazon’s Kindle.
1. Christine L. Borgman. 2010. "Research Data: Who will share what, with whom, when, and why?" China-North America Library Conference, Beijing.