Digital Scholarship@Leiden

Leiden University Libraries & Elsevier Seminars on Reproducible Research: Wrap-up Seminar 2


The second session of the Leiden University Libraries & Elsevier seminars on Reproducible Research discussed the phenomenon of reproducibility mainly on a conceptual level. As one of its central questions, the seminar examined the scope and the limits of the concept of reproducible research.

Did you miss this seminar or would you like to watch some parts (again)?
You can watch the recording of the seminar on this playlist:

If you have not yet registered for the remaining seminar, you can register here. Please subscribe to the UBLeiden YouTube channel so that you do not miss the wrap-up videos for seminar 3.

WRAP-UP Seminar 2

Reproducibility is a convoluted term which has been applied in many different ways, but, in the first lecture, Christof Schöch offered a very clear explanation of some of the central concepts in this context. To avoid confusion, he used ‘repeatable research’ as a more encompassing term, and he proposed a typology of repeatable research which considers three basic variables: the data that are used, the study’s methods and the research question. These variables can be combined to form a three-dimensional space which we can use to classify specific forms of repeated research. Schöch also discussed a number of concrete examples of repeated research. One of these focused on a study conducted in 1887 by T.C. Mendenhall, which was revisited in 2015. Applying existing methodological approaches to other data or to other questions can often lead to interesting new insights.
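The three variables of this typology can be illustrated with a small Python sketch. It is not Schöch's own formalisation: it assumes, for simplicity, that each variable is either the same as in the original study or different from it, and the function names are hypothetical.

```python
from itertools import product

# The three variables named in the lecture summary above.
AXES = ("data", "method", "question")


def position(same_data: bool, same_method: bool, same_question: bool) -> dict:
    """Place a repeated study in the space spanned by the three variables.

    Assumption: each variable is reduced to a binary same/different value.
    """
    flags = (same_data, same_method, same_question)
    return {axis: "same" if flag else "different" for axis, flag in zip(AXES, flags)}


def all_positions() -> list:
    """Enumerate every combination of same/different across the three axes."""
    return [dict(zip(AXES, combo)) for combo in product(("same", "different"), repeat=3)]


# An existing method applied to other data, with the question left unchanged.
reused_method = position(same_data=False, same_method=True, same_question=True)
print(reused_method)
print(len(all_positions()))
```

Under this binary simplification the space contains eight positions, ranging from a study in which all three variables are kept the same to one in which all three are changed.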

In the second lecture, Bart Penders focused mostly on the conceptual boundaries of replication. Is replication actually possible and/or desirable within all academic disciplines? He first explained, using references to H.M. Collins’ Changing Order, that the knowledge that is needed to replicate an experiment cannot always be articulated fully. Such forms of tacit knowledge can, in most cases, be transferred only via face-to-face interactions. The aim of the open science movement is to make the entire scholarly workflow more transparent, and this may make replication more attainable in many fields. Nonetheless, we still need to ask whether replication is actually the right tool for the job in all situations. The ‘job’, in this context, is to safeguard the reliability and the robustness of scientific claims. Collins referred to replication as ‘the Supreme Court of science’, but it ought to be stressed that its jurisdiction is not universal. When considering the ‘rightness of the tool’, we need to take into account that academic disciplines may differ with respect to methodology and epistemic culture. In the humanities and in the interpretative social sciences, we can generally assess the quality of the research on the basis of written publications alone, but in areas such as the natural sciences and the life sciences, we cannot always take the reported findings at face value. In the latter cases, we often need access to other resources in the study’s ‘network of circulating references’, such as data sets, computer code or research protocols. There may also be the practical problem that certain phenomena under study simply cannot be replicated, such as the Big Bang, or pre-Covid vaccine hesitancy.

In the third talk, I illustrated some of the general topics that were raised in the first two lectures by discussing the results of a number of concrete replication studies conducted for projects in the digital humanities. Researchers who replicate a study usually need to develop more specific operational criteria that can help them to evaluate whether and how a study can actually be replicated. In the digital humanities, many researchers have fortunately adopted the general principles of computational reproducibility. Researchers increasingly offer access to the raw or the processed data they have worked with, and many studies have also shared the full code that has been developed. Such open materials evidently make the task of replication much easier. It must be stressed, nonetheless, that repeating all the analytic processes does not mean that we can also verify whether a certain finding is correct or truthful. If there are logical mistakes in the code, for example, these mistakes will also be repeated in the replication. To be able to corroborate or to refute the findings, we generally need to re-collect data or to reconstruct methods.
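The difference between re-running shared code and independently reconstructing a method can be made concrete with a minimal Python sketch. The text, the function names and the deliberate mistake are all invented for illustration and are not taken from any of the studies discussed in the seminar.

```python
def relative_frequency_shared(text: str, word: str) -> float:
    """Stand-in for code shared with a study; contains a deliberate logical mistake."""
    tokens = text.split()
    # Mistake: the comparison is case-sensitive, so "The" is not counted as "the".
    return tokens.count(word) / len(tokens)


def relative_frequency_reconstructed(text: str, word: str) -> float:
    """Independent reconstruction of the same method, written from its description."""
    tokens = [token.lower() for token in text.split()]
    return tokens.count(word.lower()) / len(tokens)


TEXT = "The cat saw the dog and the dog saw the cat"

original = relative_frequency_shared(TEXT, "the")            # the published run
rerun = relative_frequency_shared(TEXT, "the")               # same code, same data
rebuilt = relative_frequency_reconstructed(TEXT, "the")      # reconstructed method

print(original == rerun)    # the re-run agrees with the published run
print(original == rebuilt)  # the reconstruction exposes the discrepancy
```

The re-run matches the original result exactly, which shows that the analysis is computationally reproducible, but only the reconstructed method reveals that the shared code undercounts the word.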