Sat, 06 Apr 2019
Why non-use of ArXiv refs in a bibliography is unethical
It has become quasi-obligatory since the late 1990s for cosmology
research articles to be posted at
the ArXiv preprint server, making
them publicly available under green open
access. Many of the other astronomy, physics and mathematics articles
needed for cosmology research are also available at ArXiv. In
practice, this means that almost all post-mid-late-1990s literature
cited in cosmology research articles is available on ArXiv.
Many of these articles are posted before external
peer-review by research journals, so they are literally "preprints",
while others are posted after acceptance by a journal, but usually
before they appear in paper versions of the journals, for those
journals that are still printed on paper, or as online "officially
published" articles. However, most of these "preprints" are cited
before they are formally published, because they're
hot-off-the-press, state-of-the-art results, or, to put it in plain
English rather than advertising jargon, they're useful new results
that need to be taken into account. Several journals, including MNRAS
and A&A, insist on hiding the fact that references are easily
obtainable without paywall blocks by requiring all references that
have peer-reviewed bibliometry data to have their ArXiv
identifiers removed from the list of references (bibliography) of
any research paper!
The reason cited by colleagues (there doesn't seem to be a formal
public justification by MNRAS/A&A) for excluding ArXiv identifiers
from the bibliography for articles that are already formally published
is to restrict citations as much as possible to the peer-reviewed
literature. But this is nonsense: including both the peer-reviewed
identifying information (year, journal name, volume, first page)
and the ArXiv identifier informs the reader that the
article is peer-reviewed, while also guaranteeing
that the article is available to the reader (at least) under green
open access. So that reason is unconvincing.
Another reason cited by colleagues is that the journal versions are
more valid than the preprints, since the journal versions have usually
been updated following peer-review and following language editor and
proof-reader requests for corrections. This reason has some validity,
but in practice is weak. Article authors quite frequently update their
preprint on ArXiv to match the final accepted version of their article
(in content, not in the particular details of layout, to reduce the
chance of copyright complaints by the journals), because they know
that many people will access the green open access version, and they
want to reduce the risk that readers will refer to an out-of-date
preprint version. Other authors only post their article on ArXiv once
it is already accepted, in which case no significant revision is
needed to match the content of the accepted version.
If the reasons for hiding ArXiv references are weak, what are the
reasons for including ArXiv references?
- For articles that are not provided in open access mode by the publishers,
either immediately or after an embargo period
(such as those of many cosmology journals, including JCAP, PRD, and CQG,
which seem to block all of their
articles behind paywalls unless open access charges are paid by the
authors at an appropriate step of submitting the article for publication),
removing/omitting ArXiv references from a reference list
blocks access to the research articles for:
1. scientists (physicists, mathematicians) in institutes who do not
pay for subscriptions to astronomy/cosmology journals;
2. astronomers in institutes who do not
pay for subscriptions to maths/physics journals containing articles
with justifications of mathematical techniques or physics that is
not published in astronomy journals;
3. scientists (astronomers, physicists, mathematicians) in
institutes/universities who do not pay for global subscriptions to
the publishers of the journals referred to;
4. scientists in poor countries who do not pay for any journal
subscriptions at all;
5. the general public (including former
astronomy/cosmology students who retain an interest in
cosmology research and have the competence to understand
research articles) who do not have access to any research
institute or university journal subscriptions.
Arguments 1, 2, and 3 are practical problems; these researchers will generally
know that they can search ArXiv and the
and after 30–120 seconds will find out if the article is available
on ArXiv, or possibly by open access on the journal website.
Argument 4 here can be considered as a form of racism. There are several
Nobel prizes explicitly related to
contributions to physics,
actually got a Nobel prize rather than merely having
his name cited in the topics of Nobel prize awards, but the
reality of today's economic/political/sociological setup is
that the budgets of many Indian astronomy research institutes are
far lower than those of rich-country institutes, so excellent scientists
of high international reputation, and their undergraduate and
postgraduate students, have to do research without
access to any paid journal subscriptions.
Argument 5 could be considered as arrogance, elitism, and/or bad
public relations in the Internet epoch.
A&A now has a short embargo (12 months?) for paywall blocks
on articles, after which all articles become gold open access (with no
extra charges to authors); MNRAS has a longer embargo, and other journals
are under pressure to shift to open access. So what are the arguments
for including ArXiv identifiers for peer-reviewed articles
that are available under open access by the publishers?
- It would require a lot of extra administrative effort by authors to
update their .bib files depending on the dates on which articles become
open access after an embargo;
- It would require a lot of extra administrative effort by authors to
modify their .bib files to separate out journals whose articles are never
open access from those with an embargo period;
- Authors at institutions with some or many journal subscriptions
generally don't notice whether or not a cited article
is behind a paywall, because the publishers' servers usually have
IP filters that automatically recognise authors' computers as having
authorisation to access the articles.
- Although big journal publishers can probably be relied
on, to some degree, to maintain their article archives in
the long term, we know that the group of people running ArXiv have
solid experience in long-term archiving and backing up
(data storage redundancy) practices, and they have no
conflict between commercial motivations and scientific
- A typical article has anywhere from 30 to 100 or so references.
Each of those also has from 30 to 100 or so "second-level" references.
And so on. Even if the n-th level references are to a large degree
redundant, a complete survey of the third or fourth level of references
could easily cover 1000–10,000 articles. Nobody is going to
read that many background articles, or even their abstracts.
Obviously, in practice, a reader can only trace back a modest number
of references, and a modest number of references in those references.
So for those articles that the reader can find after a little effort,
either on ArXiv despite the ArXiv identifier being
omitted, or as publisher-provided online articles, the hiding of the
ArXiv identifier (and lack of a clickable ArXiv link) increases the
time it takes the reader to find the abstract and decide whether or not to
read further. Even though the slowdown might only be an extra minute,
multiplying that extra minute by the number of references to be potentially
checked leads to a large number of minutes. Adding unnecessary "administrative"
work for the reader is obstructive.
So that's why you should include ArXiv references in the
bibliographies of your research articles. You can set up a LaTeX
command so that, if the journal asks you to remove them from the
official version, you can switch them off at the final stage for that
version only, rather than wasting time trying to convince the
journal of the ethical arguments above. But in your ArXiv versions
and other versions that you might distribute to colleagues, you should
favour the more ethical versions, which include the ArXiv references.
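
One way such a LaTeX command might look is sketched below. This is only an illustration under stated assumptions: the switch name \ifshowarxiv, the command name \arxivref, the citation key Example2019 and the placeholder identifier 0000.00000 are all made up for this example, and the bibliography is written by hand rather than generated from a .bib file. A journal's own BibTeX style may handle ArXiv identifiers differently.

```latex
\documentclass{article}
\usepackage{hyperref}

% Hypothetical switch: true for the ArXiv version and copies sent to
% colleagues; set to \showarxivfalse for the journal's "official" version.
\newif\ifshowarxiv
\showarxivtrue

% Hypothetical helper: prints ", arXiv:<id>" as a clickable link when the
% switch is on, and prints nothing at all when it is off.
\newcommand{\arxivref}[1]{%
  \ifshowarxiv
    , \href{https://arxiv.org/abs/#1}{arXiv:#1}%
  \fi
}

\begin{document}

A result that needs to be taken into account \cite{Example2019}.

\begin{thebibliography}{9}
% Placeholder entry: the author, journal details and identifier are invented.
\bibitem{Example2019}
  A.~Author, 2019, Journal Name, 1, 1\arxivref{0000.00000}
\end{thebibliography}

\end{document}
```

With this kind of setup, changing the single line \showarxivtrue to \showarxivfalse would remove every ArXiv identifier from the reference list without touching the individual entries.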