Topological acceleration

Sat, 06 Apr 2019

Why non-use of ArXiv refs in a bibliography is unethical

It has become quasi-obligatory since the late 1990s for cosmology research articles to be posted at the ArXiv preprint server, making them publicly available under green open access. Many of the other astronomy, physics and mathematics articles needed for cosmology research are also available at ArXiv. In practice, this means that almost all literature from the mid-to-late 1990s onwards that is cited in cosmology research articles is available on ArXiv.

Many of these articles are posted before external peer-review by research journals, so they are literally "preprints", while others are posted after acceptance by a journal, but usually before they appear in paper versions of the journals (for those journals that are still printed on paper) or as online "officially published" articles. However, most of these "preprints" are cited before they are formally published, because they're hot-off-the-press, state-of-the-art results, or to put it in plain English rather than advertising jargon, they're useful new results that need to be taken into account. Several journals, including MNRAS and A&A, insist on hiding the fact that references are easily obtainable without paywall blocks: they require that every reference that has peer-reviewed bibliographic data has its ArXiv identifier removed from the list of references (bibliography) of any research paper!

The reason cited by colleagues (there doesn't seem to be a formal public justification by MNRAS/A&A) for excluding ArXiv identifiers from the bibliography for articles that are already formally published is to restrict citations as much as possible to the peer-reviewed literature. But this is nonsense: including both the peer-reviewed identifying information (year, journal name, volume, first page) and the ArXiv identifier informs the reader that the article is peer-reviewed, while also guaranteeing that the article is available to the reader (at least) under green open access. So that reason is unconvincing.

Another reason cited by colleagues is that the journal versions are more valid than the preprints, since the journal versions have usually been updated following peer-review and following language editor and proof-reader requests for corrections. This reason has some validity, but in practice is weak. Article authors quite frequently update their preprint on ArXiv to match the final accepted version of their article (in content, not in the particular details of layout, to reduce the chance of copyright complaints by the journals), because they know that many people will access the green open access version, and they want to reduce the risk that readers will refer to an out-of-date preprint version. Other authors only post their article on ArXiv once it is already accepted, in which case no significant revision is needed to match the content of the accepted version.

If the reasons for hiding ArXiv references are weak, what are the reasons for including ArXiv references?

  • For articles that the publishers do not provide in open access mode, either immediately or after an embargo period (this applies to many cosmology journals, including JCAP, PRD, and CQG, which seem to block all of their articles behind paywalls unless open access charges are paid by the authors at an appropriate step of submitting the article for publication), removing or omitting ArXiv references from a reference list blocks access to the research articles for:
    1. scientists (physicists, mathematicians) in institutes that do not pay for subscriptions to astronomy/cosmology journals;
    2. astronomers in institutes that do not pay for subscriptions to maths/physics journals containing articles with justifications of mathematical techniques or physics that is not published in astronomy journals;
    3. scientists (astronomers, physicists, mathematicians) in institutes/universities that do not pay for global subscriptions to the publishers of the journals referred to;
    4. scientists in poor countries whose institutes do not pay for any journal subscriptions at all;
    5. the general public, including former astronomy/cosmology students who retain an interest in cosmology research and have the competence to understand research articles, who do not have access to any research institute or university journal subscriptions.

    Arguments 1, 2, and 3 are practical problems; these researchers will generally know that they can search ArXiv and the ADS and after 30–120 seconds will find out if the article is available on ArXiv, or possibly by open access on the journal website.

    Argument 4 here can be considered as a form of racism. Several Nobel prizes are explicitly related to Bose's contributions to physics, and Chandrasekhar actually got a Nobel prize rather than merely having his name cited in the topics of Nobel prize awards. But the reality of today's economic/political/sociological setup is that the budgets of many Indian astronomy research institutes are far lower than those of rich-country institutes, so excellent scientists of high international reputation, and their undergraduate and postgraduate students, have to do research without access to any paid journal subscriptions.

    Argument 5 could be considered as arrogance, elitism, and/or bad public relations in the Internet epoch.

  • A&A now has a short embargo (12 months?) for paywall blocks on articles, after which all articles become gold open access (with no extra charges to authors); MNRAS has a longer embargo, and other journals are under pressure to shift to open access. So what are the arguments for including ArXiv identifiers for peer-reviewed articles that the publishers make available under open access?

    1. Omitting ArXiv identifiers only for open-access articles would require a lot of extra administrative effort by authors to update their .bib files depending on the dates on which articles become open access after an embargo;
    2. It would also require a lot of extra administrative effort by authors to modify their .bib files to separate out journals whose articles are never open access from those with an embargo period;
    3. Authors at institutions with some or many journal subscriptions generally don't notice whether or not a cited article is behind a paywall, because the publishers' servers usually have IP filters that automatically recognise authors' computers as having authorisation to access the articles;
    4. Although big journal publishers can probably be relied on, to some degree, to maintain their article archives in the long-term, we know that the group of people running ArXiv have solid experience in long-term archiving and backing up (data storage redundancy) practices, and they have no conflict between commercial motivations and scientific aims.
    5. A typical article has anywhere from 30 to 100 or so references. Each of those also has from 30 to 100 or so "second-level" references. And so on. Even if the n-th level references are to a large degree redundant, a complete survey of the third or fourth level of references could easily cover 1000–10,000 articles. Nobody is going to read that many background articles, or even their abstracts. Obviously, in practice, a reader can only trace back a modest number of references, and a modest number of references in those references. So for those articles that the reader can find after a little effort despite the ArXiv identifier being omitted, or as publisher-provided online articles, the hiding of the ArXiv identifier (and lack of a clickable ArXiv link) increases the time the reader needs to find the abstract and decide whether or not to read further. Even though the slowdown might only be an extra minute, multiplying that extra minute by the number of references to be potentially checked leads to a big number of minutes. Adding unnecessary "administrative" work for the reader is obstructive.
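
    As a rough worked version of item 5, under assumptions that are illustrative rather than measured: suppose every article cites $r$ references, no reference is shared between reference lists, and a missing ArXiv link costs $t = 1$ minute per lookup. The number of $k$-th level references $N_k$ and the extra lookup time $T$ for $N$ references would then be:

```latex
% Illustrative assumptions: r references per article, no overlap, t = 1 min per lookup.
\begin{align}
  N_k &= r^k, & 30^3 &= 27\,000, & 100^3 &= 10^6, \\
  T &= N\,t, & 1000 \times 1\,\mathrm{min} &\approx 16.7\,\mathrm{h}, & 10\,000 \times 1\,\mathrm{min} &\approx 167\,\mathrm{h}.
\end{align}
```

    The no-overlap count $r^3$ is only an upper bound; the 1000–10,000 range quoted in item 5 corresponds to heavy overlap between reference lists. Even a reader who checks only 50 references would lose about 50 minutes at one minute each.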

So that's why you should include ArXiv references in the bibliographies of your research articles. You can set up a LaTeX command so that if the journal asks you to remove them in the official version, you do that at the final stage for your "official" version, because you don't want to waste time trying to convince the journal about the ethical arguments above. But in your ArXiv versions and other versions that you might distribute to colleagues, you should favour the more ethical versions, which include the ArXiv references.
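
A minimal sketch of such a switch is given here, assuming that you control how the identifier is printed (for example in a hand-written thebibliography environment, or in a .bst file that you have edited to emit this macro). The names \ifshowarxiv and \arxivref are hypothetical and are not provided by any journal class, and the bibliography entry is a placeholder rather than a real reference:

```latex
\documentclass{article}

% Hypothetical switch: true for the ArXiv/colleague version,
% false for the journal's "official" version.
\newif\ifshowarxiv
\showarxivtrue   % change to \showarxivfalse at the final stage if the journal insists

% Print the ArXiv identifier only when the switch is on.
\newcommand{\arxivref}[1]{\ifshowarxiv\ [arXiv:#1]\fi}

\begin{document}
See \cite{key}.

\begin{thebibliography}{9}
% Placeholder entry: replace with real bibliographic data and identifier.
\bibitem{key} A.~Author, 2019, Journal, volume, page\arxivref{YYMM.NNNNN}
\end{thebibliography}
\end{document}
```

With this approach only one line changes between the two versions, so the ArXiv identifiers stay in your source files and nothing has to be deleted by hand.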

content licence: CC-BY | blog tools: GNU/Linux, emacs, perl, blosxom