Reproducibility course
Draft content for usosweb
Title
EN: Reproducibility in astronomy research papers
PL: Odtwarzalność w artykułach badawczych astronomicznych
Room allocation
Radioastronomy building seminar room: 1-2 sessions; radioastronomy computer room: remaining sessions.
Prerequisites
(max 65535b)
Undergraduate knowledge of astronomy and experience with astronomy software.
Basic knowledge of Unix-like operating systems and of one or more common scientific programming languages such as Fortran, C, Python and/or R.
ECTS points: 3
Total student workload
(max 65535b)
Lectures/tutorials: 30 h
Individual work: 30 h
Learning outcomes - knowledge
(max 65535b)
(Using the 2019 AS2 parameters - https://www.home.umk.pl/~kmgesicki/AS2-tabela-spojnosci.pdf)
W1: K_W04 - has deeper knowledge of commonly used computational methods in astrophysics by having matched the abstract scientific narrative to a series of plain text source scripts in a reproducible way
W2: K_W05 - is familiar with state-of-the-art research in his/her particular field by learning how to reproducibly download and compile state-of-the-art astronomical source code
W3: K_W08 - knows general principles of scientific project management
Learning outcomes - skills
(max 65535b)
U1: K_U03 - can independently plan, carry out and develop quantified comparisons between models and observations or between analytical and numerical models, and can carry out hypothesis tests and interpret their results
U2: K_U07 - is competent in English at the level needed for reading and understanding astronomical software, in particular free-licensed software, consistent with the requirements defined for level B2+ of the Common European Framework of Reference for Languages (CEFR)
Learning outcomes - social competencies
(max 65535b)
K1: K_K01 - understands the limits of his/her own knowledge and the need to consult with expert peers in open, decentralised software forges, such as those running Forgejo
K2: K_K02 - understands the role of transparency in encouraging one's own and others' intellectual honesty and understands the ethical problems of software licensing
K3: K_K04 - is able to think and act in a collegial way that encourages transparent, participatory decision-making on scientific project management
Teaching methods
(max 65535b)
Lectures, interactive face-to-face tutorials, and asynchronous online communication using the https, ssh, irc and matrix protocols.
Short description
(max 1000b)
This course will present the motivations for a minimalist template for reproducible research papers in astronomy.
Students will learn the tools through discussion and through hands-on use and adaptation of the template.
Ideally, each student will use his/her own branch of the template for his/her own research project.
Description
(max 65535b)
Most contemporary astronomy research papers depend extensively on software pipelines.
Observations are collected and reduced digitally, and analysed in comparison to numerical models.
Analytical calculations are often sufficiently complex that computer algebra systems are needed to reduce the chance of errors.
This heavy dependence on powerful computational hardware and software makes reproducibility of any individual astronomy research paper difficult.
This course will start with an overview of the broader problem of reproducibility in science [1] and the current proposal by several astronomers for carrying out reproducible research projects whose aim, method and results are published in peer-reviewed journals as research papers [2].
An overview of other existing methods is given in Appendix A of [2] on tools including containers such as Docker, package managers, and Jupyter, and in Appendix B of [2] on existing implementations of scientific workflows.
The initial task of the students will be to fully reproduce, on an OS and computer of their choice, an already published paper [3], starting with a source package small enough to fit on a floppy disk.
Guidance on shell-level computing skills [4] and the Maneage system [5] will be provided.
The main aim is for each student to develop his/her own branch of the template far enough that, by the end of the semester, it yields a reproducible draft research paper (pdf file) reflecting the draft status of the student's observational data and/or models.
International scientific collaboration by providing bug reports [6] will be encouraged as part of this course.
The course will make extensive use of git repositories. Synchronous communication during tutorials and asynchronous communication at other times will include channels using the irc protocol (#maneage on https://www.oftc.net) and the matrix protocol (#maneage_community:matrix.org).
A curated list of matrix servers is provided at https://servers.joinmatrix.org .
Bibliography
(max 65535b)
[1] https://cosmo.torun.pl/~boud/Roukema20240311IANCU.pdf
[2] Akhlaghi+2021 https://oadoi.org/10.1109/MCSE.2021.3072860
[3] https://ui.adsabs.harvard.edu/abs/2022CQGra..39u5007B
[4] https://cosmo.torun.pl/foswiki/Cosmo/ProgrammingForCosmologists
[5] https://maneage.org
[6] Tatham 1999 https://www.chiark.greenend.org.uk/~sgtatham/bugs.html
Assessment methods and assessment criteria
(max 65535b)
The scale of points (max 5) will be negotiated by rough consensus
among the participating students and the lecturer, depending on
progress made during the semester. The initial set of parameters is
(N0, N1, N2, N3, N4) = (3, 0.5, 0.5, 0.5, 0.5).
N0 points - A git branch of Maneage with the student's own project
(private access only prior to submission to a journal), through to the
stage of verification (verify.mk) of some minimal results, that can be
reproduced through to the final pdf by (at least) the lecturer.
N1 points - Each properly written bug report (Tatham 1999 [6]) posted on
the Maneage bug reporting site will count for N1 points. If responses
are given by developers within the semester, constructive followup
will be expected for the report to qualify.
N2 points - Each properly written merge request (MR) posted on a Maneage
fork on a git forge will count for N2 points. If responses
are given by developers within the semester, constructive followup
will be expected for the merge request to qualify.
N3 points - Provision of (successive) log files on a POSIX-compatible
OS that show previously unknown bugs in upstream Maneage, and
participation in their analysis, through to final testing that either
establishes a reproducible bug or proposes a fix. Max N3 points.
N4 points - Log files, as for case N3, for project-specific software. Max N4 points.
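As an illustration only, the arithmetic of the scheme above can be sketched in Python under the initial parameters. The function name and its arguments are hypothetical. The sketch assumes that N0 is awarded all-or-nothing, that N3 and N4 can be partially earned as a fraction between 0 and 1, and that the total is capped at the maximum of 5; the details actually applied remain subject to the rough consensus described above.

```python
def course_points(n0_done, bug_reports, merge_requests,
                  n3_fraction, n4_fraction,
                  weights=(3.0, 0.5, 0.5, 0.5, 0.5), cap=5.0):
    """Illustrative total of course points (hypothetical helper).

    n0_done        -- True if the student's Maneage branch reaches verification
                      (assumption: N0 is all-or-nothing)
    bug_reports    -- number of qualifying bug reports (N1 each)
    merge_requests -- number of qualifying merge requests (N2 each)
    n3_fraction    -- share of the N3 maximum earned, from 0 to 1 (assumption)
    n4_fraction    -- share of the N4 maximum earned, from 0 to 1 (assumption)
    weights        -- (N0, N1, N2, N3, N4), initial values from the syllabus
    cap            -- maximum number of points on the scale
    """
    n0, n1, n2, n3, n4 = weights
    if bug_reports < 0 or merge_requests < 0:
        raise ValueError("counts must be non-negative")
    if not (0.0 <= n3_fraction <= 1.0 and 0.0 <= n4_fraction <= 1.0):
        raise ValueError("fractions must lie between 0 and 1")
    total = ((n0 if n0_done else 0.0)
             + n1 * bug_reports
             + n2 * merge_requests
             + n3 * n3_fraction
             + n4 * n4_fraction)
    # The scale has a maximum of 5 points, so any excess is discarded.
    return min(total, cap)


if __name__ == "__main__":
    # Verified branch, two bug reports, one merge request, full N3, no N4.
    print(course_points(True, 2, 1, 1.0, 0.0))
```

With these assumptions, a student with a verified branch, two qualifying bug reports, one merge request and full N3 credit reaches 3 + 1 + 0.5 + 0.5 = 5 points, the maximum of the scale.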
--
BoudRoukema - 31 Oct 2024 + ...