I’m sitting in my office at Université Laval, waiting for an opportunity to speak with my
professor, and procrastinating revising a manuscript. My procrastination,
almost always, is to read the internet, and today I’ve found a new article from
The Atlantic, “The Scientific Paper is Obsolete”.
The main thesis of this article is that the scientific paper
as we know it today has outlived its utility. The author, James Somers, opens
with a description of the niche the scientific paper was invented to fill: a
short, incremental advance published as widely as a book but as readable as a
letter, and permanent where a lecture is ephemeral. I’ve had conversations with academics in social sciences or humanities disciplines who express
their surprise that books, which for argument’s sake are publications longer
than about 100 pages, almost never appear in the list of citations in my
scientific publications. I list 11 publications – scientific papers – on my
C.V. with me as an author (always one of several; I have no sole-author
publications) and I’m first author on 7 of those; this means I did most of the
actual writing. I feel this experience gives me some perspective to evaluate
the article in The Atlantic.
There are the expected jabs at the style and perceived
readability of scientific papers, a criticism so widespread and consistent that
I now mostly ignore it. I get it, you don’t get the enjoyment of reading a
scientific paper that you get out of reading something else, and you put the
blame largely on the abundant jargon and dense prose of typical scientific
papers; James Somers also adds some mentions of “mathematical symbols”, which
is indeed one major feature of many scientific papers that separates them from
written works intended for a wider, non-specialist audience. But that’s the
point – the intended audience of a scientific paper is not the general public,
it’s other experts in that discipline. Know your audience. I guess James Somers does: complaints about the difficult prose of scientific papers, made by scientists and non-scientists alike and aimed at non-scientists, are very popular in popular science articles.
This isn’t to say that a scientific paper cannot be or
should not be highly readable to non-specialists and other members of the general public, but to approach a
scientific paper as a non-specialist and then complain about the jargon is to
miss the point. I think one has to approach a scientific paper from a position
of self-knowledge, in that I have to read a paper outside my area of expertise
in a different (and more difficult) way compared to reading a paper that might
cite my own work.
Another major difference between a scientific paper and
something like an article in The Atlantic
– and these two categories are of similar word-count, on average – is the
abundant citations in a scientific paper. Every fact, every suggestion, every
piece of information in a scientific paper that is not derived directly from
the study itself will be cited; credit is given to the prior work that
established those facts or provided those suggestions (unless the fact or
suggestion is obvious or already widely known and established; we don’t cite
Scheele and Priestley (1772) when
talking about oxygen, for example). I find myself wishing for some citations
and outside attributions while reading this Atlantic
article because James Somers makes so many claims that I would like to dispute.
For example, here’s the third paragraph of the article:
The more sophisticated science becomes, the harder it is to communicate results. Papers today are longer than ever and full of jargon and symbols. They depend on chains of computer programs that generate data, and clean up data, and plot data, and run statistical models on data. These programs tend to be both so sloppily written and so central to the results that it’s contributed to a replication crisis, or put another way, a failure of the paper to perform its most basic task: to report what you’ve actually discovered, clearly enough that someone else can discover it for themselves.
Are papers really longer in 2018 than they were, on average,
in 1998, or 1978, or 1888? Are they more “full of jargon and symbols”? Are the
majority of analytical computer programs “so sloppily written”?
And what replication crisis? Mr. Somers, have you not read
the counterargument to the crisis-in-science narrative by Dr. Fanelli,
recently published in PNAS?
Moving on, one major criticism is that scientific papers are
not a good way to express and describe complex results. Animations, something
computers are quite good at, are useful tools for visualizing such complex
concepts but are very difficult to express on a static sheet of paper, which
the modern PDF (Portable Document Format) emulates. I agree, but I do not agree
with the follow-up point that this renders the PDF hopelessly useless. A
scientific paper is about the words, not the pictures or other visualizations.
It’s about the information. Expressing that information in a way the audience
can understand and use is the key skill of writing a scientific paper, and is
distinct from the skills that create written material intended to be read by as
wide an audience as possible. A scientific paper relies heavily on absolute
honesty, and presenting all of the available and relevant information to allow
the reader to decide independently whether or not to agree with the author’s arguments
and conclusions. A magazine article pushes a particular interpretation of some
phenomenon. A scientific paper pushes the phenomenon and then describes one (or sometimes more) possible
interpretation of that phenomenon, usually in light of similar phenomena and
potential alternative interpretations. A graph is not data, it's an expression of data. An animation is not an argument, it's one support for an argument.
Visualization is a technique, a way to take obscure numbers
and show the patterns they contain. I struggle with it, constantly. The paper I
am procrastinating working on right now has some decent figures* in it and I don’t see a need for a great deal of work on the
visualization side of this paper. I have another project I’m working on that is
at a much earlier stage and my current activities there are primarily concerned
with visualization. I’m at the “data exploration” stage, where I throw the
metaphorical spaghetti of the data at the metaphorical wall and see what
sticks. That means lots and lots of images, mostly graphs I get my computer to
make for me, and some scribbles on paper in my notebook.
*A figure
is any image in a scientific paper, a photograph or map or, most commonly, a
graph illustrating the mathematical relationship between two or more
parameters. I tend to write papers by making the figures first, but that's a personal style and subjective workflow thing, and certainly not universal among scientists.
Back to The Atlantic:
It’ll be some time before computational notebooks replace PDFs in scientific journals, because that would mean changing the incentive structure of science itself. Until journals require scientists to submit notebooks, and until sharing your work and your data becomes the way to earn prestige, or funding, people will likely just keep doing what they’re doing.
This is more interesting to me than the preceding description
of competing formats for “computational notebooks”. I have seen suggestions
from other people that concentrate on changing other aspects of scientific
publishing, often the abolition of for-profit publishing companies (e.g. here),
but these suggestions and discussions do not express a dissatisfaction with the
basic unit of scientific communication, the scientific paper. What would my job
look like if both scientific papers and the way in which they are disseminated
were to go away? Would I just be uploading lumps of code and data tables to some
institutional server, whenever I feel like my analyses have answered some tiny
question? Does my "Literature Cited" section just become a link-dump?
“At this point, nobody in their sane mind challenges the fact that the praxis of scientific research is under major upheaval,” Pérez, the creator of Jupyter [one of the competing calculation notebooks – MB], wrote in a blog post in 2013. As science becomes more about computation, the skills required to be a good scientist become increasingly attractive in industry. Universities lose their best people to start-ups, to Google and Microsoft. “I have seen many talented colleagues leave academia in frustration over the last decade,” he wrote, “and I can’t think of a single one who wasn’t happier years later.”
I had to look up the definition of “praxis”; I think it’s
exactly what I was talking about: what does my job look like if the scientific
paper and scientific publishing are drastically changed? Dr. Pérez apparently
thinks my job would not change much. I’m not so sure.
There’s also a problem in that paragraph with a possible
logical fallacy: confirmation bias. Lots of sad people leave, and then you find
a few of them later and they’re happier. Well, good! Happier people is a good
thing. But to then claim that it was the act of leaving that made them happier,
and then extend that by implication that everybody should consider leaving, is
to stretch beyond the available information into unsupported (and idealistic)
speculation. If the only people who left were the unhappy people, then what
about the happy people who stayed? Would they have also become even more happy
had they left? Did the people who stayed unhappy, or became more unhappy after leaving, avoid talking to you?
At this point I’m wandering away from the discussion about
scientific papers. And I think the article did, too. It concludes with a weak
suggestion that maybe some new tools will be useful (who could disagree with
that? Tools are useful by definition) and that, hey Galileo, right?
I remain unconvinced of the impending death of the
scientific paper. What I got out of this article was a description of some computer
programmers and physicists with generally poor social skills but good ideas and
skills related to generating and analyzing data. And that somehow this means the
time I spend teaching ESL graduate students how to write better English, in the
demanding, highly technical style of current scientific
communication, is wasted.