Several years ago the National Museum of Natural History hosted a visitor from Science magazine who gave a presentation for authors on strategies for getting published in that journal, which boasts an acceptance rate of just 7%. The session contained some helpful tips, and the speaker ended his talk on a humorous note by speculating (somewhat facetiously, perhaps) on the top 10 future trends in scientific publishing. Referring to the recent growth in co-authorship, he “predicted” that someday the list of co-authors for an article would exceed the length of the article itself. While that may have seemed amusing, few realized that within a short time the Smithsonian Research Online (SRO) database of scholarly publications would add an article with more than 3,000 co-authors and approximately the same number of words in its body text.
That particular paper came from research at the Smithsonian Astrophysical Observatory, and while it may represent an extreme example from a discipline requiring large and complex research teams, it highlights the confusion that has emerged around author identities as published scholarship has grown over the last decade or so. This is especially true given the growing research output from countries like China. A recent paper in the American Journal of Physical Anthropology [1] found that among a sample of 18 million people in the US, there were nearly 900,000 last names, or roughly one last name per 20 people. But among 1.28 billion Chinese there were just over 7,300 last names, or roughly one per 200,000 people, suggesting that eliminating researcher misidentification may become even more difficult as the scholarly literature from other cultures grows.
The growth in co-authorship also presents a challenge for the management of the SRO. It would be ideal if all author names uniformly included surname, first name and middle initial, but because of the variety of sources supplying the data, that is not always possible. Even if standardizing the names of the 500+ Smithsonian researchers in the database were feasible, the task of cleaning up the names of external researchers (co-authors) would be nearly impossible. If, for example, a Smithsonian scientist publishes a paper with 10 co-authors, library staff working on the SRO must pick out the Smithsonian affiliate from the list of names. To their credit, Smithsonian Libraries staff have developed some personal familiarity with Smithsonian authors over time, but a long list of collaborators can be tedious to work through for each publication added.
The scholarly publishing industry has begun to address the issue by establishing a system of unique numeric identifiers that can be assigned to authors so that they can be differentiated if, for example, two authors share a last name and first initial. The ORCID (Open Researcher and Contributor ID) initiative is an effort to coordinate a system of such identifiers. In addition to a researcher’s publication output, this system of identifiers could also be applied to awards and grants, since some of the larger foundations likely make awards to people with similar names.
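To make the idea concrete, here is a minimal Python sketch of how an identifier separates authors that a name string cannot. It is an illustration under stated assumptions, not a description of how the SRO works: the three publication records are invented, and the helper names (is_valid_orcid, group_titles) are hypothetical. The two identifiers are used only as examples of the ORCID format, and the check-digit routine follows the ISO 7064 MOD 11-2 calculation that ORCID documents for its identifiers.

```python
import re
from collections import defaultdict

# ORCID iDs are written as four hyphen-separated groups of four characters;
# the final character is a check digit that may be "X" (standing for 10).
ORCID_PATTERN = re.compile(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]")


def orcid_check_digit(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid):
    """Return True if the string has the ORCID shape and a correct check digit."""
    if not ORCID_PATTERN.fullmatch(orcid):
        return False
    compact = orcid.replace("-", "")
    return orcid_check_digit(compact[:15]) == compact[15]


def group_titles(records, key):
    """Group publication titles by the given record field."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record["title"])
    return dict(groups)


# Hypothetical records: two different people who share "Smith, J.",
# and one person whose name is spelled two different ways.
records = [
    {"name": "Smith, J.", "orcid": "0000-0002-1825-0097", "title": "Paper A"},
    {"name": "Smith, J.", "orcid": "0000-0002-1694-233X", "title": "Paper B"},
    {"name": "Smith, John", "orcid": "0000-0002-1825-0097", "title": "Paper C"},
]

by_name = group_titles(records, "name")
by_orcid = group_titles(records, "orcid")
```

Grouping by name puts Papers A and B under a single "Smith, J." even though they belong to different people, and leaves Paper C stranded under "Smith, John". Grouping by identifier puts Papers A and C together and keeps Paper B separate. The check digit also lets a database reject a mistyped identifier before it is stored.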
Just as digital object identifiers (DOIs) have been implemented by most online publishers, these author identifiers might make tracking research much easier if they were included in standard online publication data. Hopefully one day an author identifier scheme can be applied not only to the SRO data but to Smithsonian research generally. That might make reporting and researcher evaluation much easier.