When it comes to publishing the most impactful scientific research and identifying the best up-and-coming research paths, it takes one to know one.
That's what Northwestern University researchers found when they analyzed nearly six million citations among more than 156,000 published scientific papers.
While most researchers cite older, well-established papers in their field, highly cited papers—papers that other published papers cite the most often and therefore are considered successful—also cite more work that has been published relatively recently.
In fact, that cited work goes on to become highly cited itself, showing that top scientists and engineers are adept at betting on good prospects.
"You could say the best researchers also have the best scientific taste," said Luís Amaral, Erastus Otis Haven Professor of Chemical and Biological Engineering in Northwestern's McCormick School of Engineering and lead author of the research.
Amaral and his co-authors published these and other findings from the analysis on April 15 in the journal Nature Human Behaviour.
Understanding location of citations
When scientists and engineers publish scientific papers, they not only present their findings, but also cite previously published work in the field. When research is cited, it is often seen as a mark of achievement for the cited paper, the journal in which it was published and the researcher who conducted the work. If cited enough times, all three can be referred to as "highly cited."
But citing a research paper isn't necessarily an endorsement of it. Citations can also be used to provide background, identify methodology or even offer correction or criticism of the paper.
To better understand the context in which papers are cited, Amaral and his collaborators examined more than 156,000 papers published in Public Library of Science (PLOS) journals between 2005 and 2016. Because these journals are open access, the researchers could examine the full text to see what is cited and where in the paper each citation appears. That's important, because citations appear in all four sections of a standard research paper: introduction, methods, results and discussion.
The researchers found that 74 percent of citations appeared in the introduction and discussion sections, where a citation isn't necessarily an endorsement. The introduction often gives background on the field and cites older papers, while the discussion looks to the future of the field and often cites younger papers. The researchers also found that papers cited in the methods section were the most highly cited, as well as the oldest. That's likely because there is often broad consensus in the field about which experimental methods are most appropriate.
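For readers curious how such a breakdown might be tallied, here is a minimal sketch in Python. It is not the researchers' code: the record layout, the function name `summarize_by_section` and the sample data are all hypothetical, made up purely for illustration. It computes, for each section, the share of all citations that appear there and the average age of the cited work.

```python
from collections import defaultdict

# Hypothetical records, invented for illustration only:
# (section where the citation appears,
#  publication year of the citing paper,
#  publication year of the cited paper)
citations = [
    ("introduction", 2012, 2001),
    ("introduction", 2012, 2005),
    ("methods", 2012, 1995),
    ("discussion", 2012, 2010),
]

def summarize_by_section(records):
    """Return {section: (share of all citations, mean age of cited work in years)}."""
    counts = defaultdict(int)
    total_age = defaultdict(int)
    for section, citing_year, cited_year in records:
        counts[section] += 1
        total_age[section] += citing_year - cited_year
    n = len(records)
    return {s: (counts[s] / n, total_age[s] / counts[s]) for s in counts}

summary = summarize_by_section(citations)
for section, (share, mean_age) in summary.items():
    print(f"{section}: {share:.0%} of citations, mean cited age {mean_age:.1f} years")
```

With real data, each record would come from parsing a paper's full text and matching each reference to its publication year, which is the step that open-access journals make possible.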
"Papers that describe new methods of research usually get a lot of attention and will acquire a lot of citations," said Julia Poncela-Casasnovas, a postdoctoral fellow in Amaral's group and co-author of the research.
Citing the best research early on
When the researchers traced the citations from the PLOS papers to their original papers using the Web of Science—a trove of 60 million scientific papers—they found that highly cited papers did not uniformly cite just the classic papers in the field. Instead, they cited papers that were much younger—two to four years younger on average, depending on the paper section. That means authors of top scientific papers are generally up-to-date on the latest scientific literature in their field.
Not only were the referenced papers younger, but the researchers also found that those cited papers would often go on to become highly cited themselves.
"Researchers of good papers are better at selecting good references," Poncela-Casasnovas said. "They seem to be good at finding the best research early on, and everyone else will follow."
Debate about merits of citations
This research could contribute to the debate about the merits of counting citations, the researchers say. Since the majority of citations occur in the introduction and discussion sections, where a citation isn't necessarily an endorsement, perhaps those citations could be weighted less than those in the methods section, where a citation is more likely to be an endorsement.
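As a rough illustration of what section-based weighting could look like, consider the Python sketch below. The weights are hypothetical and do not come from the study, which does not propose specific values. The function name `weighted_citation_count` is likewise an assumption made for this example.

```python
# Hypothetical weights, NOT from the study: chosen only to illustrate
# giving methods-section citations more weight than the others.
SECTION_WEIGHTS = {
    "introduction": 0.5,
    "methods": 1.0,
    "results": 0.75,
    "discussion": 0.5,
}

def weighted_citation_count(sections, weights=SECTION_WEIGHTS):
    """Sum the weight of each section in which a paper is cited."""
    return sum(weights[s] for s in sections)

# Sections in which one (made-up) paper was cited by other papers.
cited_in = ["introduction", "introduction", "discussion", "methods"]

raw = len(cited_in)                            # plain citation count
weighted = weighted_citation_count(cited_in)   # section-weighted count
print(raw, weighted)
```

Under these made-up weights, a paper cited four times, mostly in introductions, would score lower than a paper cited four times in methods sections, even though their raw counts are identical.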
This debate will only continue in the future, as more than one million scientific papers are published each year. The next step, perhaps, would be to use natural language processing to better understand the exact contexts in which each citation is made.
"We need a better way to understand just how meaningful the research in these papers is," said Martin Gerlach, a postdoctoral fellow in Amaral's group and a co-author on the paper.