Topological acceleration
Mon, 01 Jul 2019
What is the point of publishing a scientific paper if an expert reader has to do so much extra work to independently reproduce the results that s/he is effectively discouraged from doing so?
Reproducibility: brief description
In the present practice of cosmology research, a paper tends to be accepted as "scientific" if the method is described clearly and in sufficient detail, and, in the case of an observational paper, if the observational data are publicly available. However, free-licensed software, efficient management of software evolution via git repositories over the Internet, and Internet communication in general should, in principle, make it possible for an expert reader to reproduce the figures and tables of a research paper with just a small handful of commands in a terminal: commands that download, compile and run scripts and programs provided by the authors of the research article. In practice, this would make it easier for more scientists to verify the method and results, and to improve on them, rather than forcing them to rewrite everything from scratch.
This idea has been floating around for several years. A very nice summary and discussion by Mohammad Akhlaghi includes Akhlaghi's own aim of making a complete research paper reproducible with just a few lines of shell commands, and links to several reproducible astronomy papers from 2012 to 2018, most using complementary methods.
I tend to agree that using Makefiles is most likely to be the optimal overall strategy for reproducible papers. For the moment, I've used a single shell script in 1902.09064.
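A single-script approach of the kind used in 1902.09064 can be sketched roughly as below. This is a hypothetical outline, not the actual contents of that paper's script: the tools checked and the commented-out download/build/run steps are placeholders.

```shell
#!/bin/sh
# Hypothetical sketch of a one-command reproducibility script.
set -e

# 1. Check that native (distribution-packaged) dependencies are present.
for tool in sh tar grep; do
    if ! command -v "$tool" >/dev/null 2>&1; then
        echo "missing required tool: $tool" >&2
        exit 1
    fi
done
DEPS_OK=yes
echo "native dependencies found: proceeding"

# 2. Download and build a pinned version of the research-level code
#    (placeholder URL and version number):
# wget https://example.org/research-code-1.0.tar.gz
# tar xzf research-code-1.0.tar.gz
# (cd research-code-1.0 && make)

# 3. Regenerate the paper's figures and tables:
# ./research-code-1.0/run_analysis --out figures/
```

A Makefile version of the same idea would express steps 2 and 3 as targets with dependencies, so that only out-of-date steps are rerun.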
The software evolution problem
I suspect, unfortunately, that there's a fundamental dilemma in making fully reproducible papers that remain reproducible in the long term, because of software evolution. Akhlaghi's approach is to download and compile all the libraries needed by the authors' software, pinned at the specific versions that were used at the time of preparing the research paper. This would appear to solve the software-evolution problem.
My approach, at least so far in 1902.09064, is to use the versions of libraries and other software recommended by the native operating system (Debian GNU/Linux, in my case), to the extent that these are available, and to download and compile specific versions only of "research-level" software: code that is either not yet available in a standard GNU/Linux family operating system, or evolving too fast to be packaged in those systems.
Download everything: pro
Download everything: con
Prefer native libraries: pro
Prefer native libraries: con
Choosing an approach
While the "download everything" approach is, in principle, preferable in terms of hypothetical reproducibility, it risks being heavy, could have security risks, could be difficult due to dependency hell, and might in the long term not lead to exact reproducibility anyway, for practical reasons (leaving aside theoretical Turing machines). The "prefer native libraries" approach provides, in principle, less reproducibility, but it should be more efficient, secure and convenient, and, in practice, may be sufficient to trace bugs and science errors in scientific software.
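One way to make the "prefer native libraries" approach more traceable is to record, at run time, exactly which distribution package versions were used, so that a reader on a different system can at least compare. A minimal sketch, assuming a Debian-family system; the package names queried are placeholders:

```shell
# Record the exact native package versions used (placeholder package names),
# falling back gracefully on non-Debian systems.
if command -v dpkg-query >/dev/null 2>&1; then
    # Debian family: ask the package manager directly.
    dpkg-query -W -f='${Package} ${Version}\n' bash coreutils 2>/dev/null
else
    # Non-Debian system: fall back to whatever version flags exist.
    echo "dpkg not available; record tool versions manually (e.g. gcc --version)"
fi
VERSIONS_LOGGED=yes
```

Shipping such a version log alongside the paper's scripts documents the environment even when exact reproduction is no longer possible.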
Comments: edit name, title and content in this template: NAME: name; TITLE: title; Please publish my comment on https://cosmo.torun.pl/blog/reproducibility ; content; and send the edited template to blog cosmo torun pl; use of email is for antispam filtering only; your email address will not be published.
Sat, 06 Apr 2019
It has become quasi-obligatory since the late 1990s for cosmology research articles to be posted at the ArXiv preprint server, making them publicly available under green open access. Many of the other astronomy, physics and mathematics articles needed for cosmology research are also available at ArXiv. In practice, this means that almost all literature from the mid-to-late 1990s onwards cited in cosmology research articles is available on ArXiv.
Many of these articles are posted before external peer review by research journals, so they are literally "preprints", while others are posted after acceptance by a journal, but usually before they appear in the journal's paper version (for those journals still printed on paper) or as online "officially published" articles. However, most of these "preprints" are cited before they are formally published, because they are hot-off-the-press, state-of-the-art results: or, to put it in plain English rather than advertising jargon, useful new results that need to be taken into account. Several journals, including MNRAS and A&A, nevertheless insist on hiding the fact that references are easily obtainable without paywall blocks: they require that any reference with peer-reviewed bibliometric data have its ArXiv identifier removed from the list of references (bibliography) of a research paper!
The reason cited by colleagues (there doesn't seem to be a formal public justification by MNRAS/A&A) for excluding ArXiv identifiers from the bibliography for articles that are already formally published is to restrict citations as much as possible to the peer-reviewed literature. But this is nonsense: including both the peer-reviewed identifying information (year, journal name, volume, first page) and the ArXiv identifier informs the reader that the article is peer-reviewed, while also guaranteeing that the article is available to the reader (at least) under green open access. So that reason is unconvincing.
Another reason cited by colleagues is that the journal versions are more valid than the preprints, since the journal versions have usually been updated following peer-review and following language editor and proof-reader requests for corrections. This reason has some validity, but in practice is weak. Article authors quite frequently update their preprint on ArXiv to match the final accepted version of their article (in content, not in the particular details of layout, to reduce the chance of copyright complaints by the journals), because they know that many people will access the green open access version, and they want to reduce the risk that readers will refer to an out-of-date preprint version. Other authors only post their article on ArXiv once it is already accepted, in which case no significant revision is needed to match the content of the accepted version.
If the reasons for hiding ArXiv references are weak, what are the reasons for including ArXiv references?
So that's why you should include ArXiv references in the bibliographies of your research articles. You can set up a LaTeX command so that if the journal asks you to remove them in the official version, you do that at the final stage for your "official" version, because you don't want to waste time trying to convince the journal about the ethical arguments above. But in your ArXiv versions and other versions that you might distribute to colleagues, you should favour the more ethical versions, which include the ArXiv references.
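One hedged way to implement such a toggle is sketched below; the macro names are invented for illustration, and the bibliography entry is a made-up example (check against your journal's class file before relying on this):

```latex
% Toggle for ArXiv identifiers in the bibliography (macro names invented
% for illustration). Switch to \showarxivfalse only for the journal's
% "official" version.
\newif\ifshowarxiv
\showarxivtrue
\newcommand{\eprint}[1]{\ifshowarxiv{} [arXiv:#1]\fi}

% Example bibliography entry using the toggle (invented reference details):
% \bibitem{example} Author A., 2019, Journal, 123, 45 \eprint{1902.09064}
```

With this in place, producing the journal-compliant version is a one-line change rather than a hand edit of every bibliography entry.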
Comments: edit name, title and content in this template: NAME: name; TITLE: title; Please publish my comment on https://cosmo.torun.pl/blog/arXiv_refs ; content; and send the edited template to blog cosmo torun pl; use of email is for antispam filtering only; your email address will not be published.
Tue, 30 Aug 2016
Popular science descriptions of our present understanding of observational cosmology tend to say that we know the age of the Universe to be 13.80 gigayears, with an uncertainty of just 0.02 gigayears (20 megayears). But some of the oldest microlensed stars in the Galactic Bulge, within the central kiloparsec or so of our Galaxy, have best estimated ages of about 14.7 gigayears! The figure at left shows our analysis of the probability distribution of the most likely age of the oldest of these stars. The thin curves show probability densities for the ages of individual stars; several of these peak between about 14.5 and 15 gigayears. The thick curve shows the age of the oldest of these stars, supposing that we choose the individual star ages randomly according to their probability distributions. (This includes possible ages much lower than in the figure; we take the full asymmetric distributions into account.) So could the Universe be a gigayear older than is generally thought? The uncertainties are still big, but this is certainly an exciting prospect for shifting towards a more physically motivated cosmological model.
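The "age of the oldest star" curve can be mimicked with a toy Monte Carlo: draw one age per star from its error distribution and keep the maximum over all stars. A minimal sketch, using simplistic Gaussian errors and invented means and sigmas rather than the real stellar data (the real analysis uses the full asymmetric distributions):

```shell
# Toy Monte Carlo: distribution of the maximum of several stellar age
# estimates. The means and sigmas below are invented for illustration.
OLDEST=$(awk 'BEGIN {
    srand(7)
    n = split("13.9 14.5 14.7 14.2", mu, " ")
    split("0.8 0.9 0.7 1.0", sig, " ")
    trials = 20000; sum = 0
    for (t = 1; t <= trials; t++) {
        best = -1e9
        for (i = 1; i <= n; i++) {
            # Box-Muller transform: one Gaussian deviate per star
            g = sqrt(-2 * log(1 - rand())) * cos(6.283185307 * rand())
            age = mu[i] + sig[i] * g
            if (age > best) best = age
        }
        sum += best
    }
    printf "%.2f", sum / trials
}')
echo "toy mean age of oldest star: $OLDEST Gyr"
```

The maximum of several noisy estimates is biased high relative to any single estimate, which is why the thick curve in the figure peaks above most of the thin curves.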
The more careful descriptions of the age of the Universe give a caveat, a warning of how or why the standard estimate might be wrong: the age estimate depends on fitting the observations using ΛCDM, the standard model of cosmology, which makes a non-standard assumption about gravity. Instead of allowing space to curve differently in regions where matter collapses into galaxies than in places where the Universe becomes more empty, which is what Einstein's general relativity says should happen, the standard model keeps space rigid (apart from uniform expansion). It doesn't allow general relativity to apply properly.
Several of us have been working on theoretical tools and observational analysis to see if we can apply general relativity better than in the standard model. At least so far, we generally find that doing our homework tells us that the would-be mysterious "dark energy" is really, until or unless proven otherwise, just a misinterpretation of space recently becoming negatively curved (on average) as voids and galaxies have formed during the most recent several gigayears.
This is where the age of the Universe comes in. In our new paper, arXiv:1608.06004, my colleagues and I summarise some key numbers that we argue are needed by any of the "backreaction" models similar to ours, which allow space to curve as galaxies and voids form, as required by the Einstein equation of general relativity. These simple constraints show that by fitting a no-dark-energy flat model (the Einstein–de Sitter model) at early times, the age of the Universe should be somewhat less than 17.3 gigayears, and quite likely somewhat more than the ΛCDM estimate of 13.8 gigayears. So we looked at published observations of stellar ages, which individually still have big uncertainties, but together favour the oldest stars having ages of around 14.7 gigayears. As expected, this is somewhere in between the two limits of 13.8 and 17.3 gigayears.
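The 17.3 gigayear limit follows from the simple age formula of the Einstein–de Sitter model. Assuming an early-time background Hubble constant of roughly 37.7 km/s/Mpc (an illustrative value; the fitted value in arXiv:1608.06004 should be checked), the arithmetic works out as:

```latex
t_{\mathrm{EdS}} = \frac{2}{3 H_1}
  \approx \frac{2}{3} \times \frac{977.8~\mathrm{Gyr}}{37.7}
  \approx 17.3~\mathrm{Gyr},
\quad\text{using } 1/(1~\mathrm{km\,s^{-1}\,Mpc^{-1}}) \approx 977.8~\mathrm{Gyr}.
```

For comparison, the same formula with the locally measured Hubble constant of about 70 km/s/Mpc would give only about 9.3 gigayears, which is why the EdS model fitted at early times, not at late times, sets the upper limit.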
So will there be a race between detailed "backreaction" modellers and stellar observers: tight cosmological predictions of the age of the Universe versus accurate spectroscopic measurements of the ages of the oldest Galactic stars (which have to be younger than the Universe, of course)?
Barely had our paper become public on ArXiv when we were reminded by colleagues studying cosmic microwave background (CMB) observations using the Einstein–de Sitter, no-dark-energy, flat cosmological model at early times that they had also found an age of the Universe of something like 14.5 gigayears! Figure 4 (bottom-right) of arXiv:1012.3460 (PRD) shows our colleagues' estimates of the age of the Universe using the CMB and type Ia supernovae observations. Their most likely age is about 14.5 gigayears, give or take about half a gigayear. This is not so very different from the Galactic Bulge star best estimate! So we have very different, independent methods tending to give similar results. The uncertainties are still big. This story is not closed. But an extra gigayear for the age of the Universe may be a clue that helps shift from the precise ΛCDM cosmology to the upcoming generation of accurate cosmology...
Comments: edit name, title and content in this template: NAME: name; TITLE: title; Please publish my comment on https://cosmo.torun.pl/blog/an_extra_gyr ; content; and send the edited template to blog cosmo torun pl; use of email is for antispam filtering only; your email address will not be published.
content licence: CC-BY | blog tools: GNU/Linux, emacs, perl, blosxom