Digital Scholarship@Leiden

How will we exchange digital research data in the future?

The proceedings of the 1st International Conference on FAIR Digital Objects in Leiden offer a vision of a more accessible future for research data.

Research data often ends up in repositories where it can be accessed via download links. As a researcher wanting to build upon earlier work, you can only hope that the data is described well enough to be reused. Imagine instead a future where research data is at your fingertips, just a search away. Did someone else already perform a similar study? Are you looking for data to validate your own study? Simply search the Internet using terms that describe the research you are looking for, and get the actual results back in a format that you can readily work with.

This is the promised future of FAIR Digital Objects, where FAIR stands for Findable, Accessible, Interoperable and Reusable. In this blog post, we explain what FAIR Digital Objects are and report on some of the latest developments seen at the 1st International Conference on FAIR Digital Objects.

What is a FAIR Digital Object?

From 26 to 28 October 2022, the 1st International Conference on FAIR Digital Objects took place in Leiden, The Netherlands. The three days were packed with talks and workshops exploring the status and future of FAIR Digital Objects.

But what is a FAIR Digital Object? To answer that question, let’s first take a step back and look at how information is exchanged over the Internet today. At the core of the Internet is the Internet protocol suite, commonly known as TCP/IP, which describes the exchange of messages between devices that each have an IP address. It is a simple and elegant solution, but it says nothing about what a message means, so it is not well suited for exchanging meaningful data.

A richer version of such a message is a FAIR Digital Object. The protocol that handles the transfer is known as the Digital Object Interface Protocol.1 FAIR Digital Objects are described as “standardized, autonomous, and persistent entities, which contain the information needed about different kinds of digital objects (data, metadata, documents, software, semantic assertions, etc.), to enable both humans and machines to Find, Access, Interoperate, and Reuse (FAIR) these digital objects in highly efficient and cost-effective ways. These entities are independent of continuously changing technologies and the many different ways that are and in future will be organized and structured. In addition, they have built-in mechanisms to support data sovereignty.”2
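To make this description a little more tangible, here is a minimal Python sketch of what such an object might bundle together: a persistent identifier, a machine-readable type, descriptive metadata, and the operations a client may request. This is an illustration only, not an implementation of the Digital Object Interface Protocol; the class name, field names, identifier, and operation names are all assumptions made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    """Illustrative stand-in for a FAIR Digital Object (not a protocol implementation)."""

    pid: str                                         # persistent identifier (made-up value below)
    object_type: str                                 # machine-readable type, e.g. "Dataset"
    metadata: dict = field(default_factory=dict)     # descriptive metadata
    operations: list = field(default_factory=list)   # hypothetical operation names

    def describe(self) -> dict:
        """Return a plain record that a machine could use to find and reuse the object."""
        return {
            "pid": self.pid,
            "type": self.object_type,
            "metadata": self.metadata,
            "operations": sorted(self.operations),
        }


# A hypothetical dataset wrapped as a digital object.
survey = DigitalObject(
    pid="example-prefix/survey-2022",
    object_type="Dataset",
    metadata={"title": "Example survey", "licence": "CC-BY-4.0"},
    operations=["retrieve", "describe"],
)
record = survey.describe()
```

The point of the sketch is that the identifier, the type, and the description travel together with the data, so a machine does not have to guess what it has received.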

As a researcher, why should I care about FAIR Digital Objects?

FAIR Digital Objects and the Digital Object Interface Protocol will be as invisible to a researcher as the Internet protocol suite is now. So why should you, as a researcher, care about these developments? The answer lies in the nature of a FAIR Digital Object, which is basically a piece of data.

FAIR Digital Objects will be described using standards that help computers interpret them. As a researcher, you will need to know these standards to use FAIR Digital Objects efficiently, much as you need to know how to phrase a search query on the Web to get a relevant answer back.

George Strawn, one of the founders of the Internet, said at the conference that FAIR Digital Objects seek to do for computers what the Web does for humans. Researchers of the future will be experts in how to describe their FAIR Digital Objects so that they can be understood by computers and humans.

Some developments at the conference might be relevant to researchers more immediately, such as FAIR Implementation Profiles. A FAIR Implementation Profile “is a collection of FAIR implementation choices made by a community of practice for each of the FAIR Principles.”3 In recent years, research communities have created FAIR Implementation Profiles to declare how they implement the FAIR principles.

A prominent example is the connection of the Cluster of Environmental Research Infrastructures (ENVRI) to the European Open Science Cloud (EOSC). This work is now carried on within the WorldFAIR project, where 11 disciplinary and cross-disciplinary case studies are investigated to advance implementation of the FAIR principles and, in particular, to improve the interoperability and reusability of digital research objects, including data.
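As a rough illustration of what "implementation choices for each of the FAIR Principles" can look like, the Python sketch below records a community's choices per principle and lists the principles for which nothing has been declared yet. The principle identifiers (F1 to R1.3) are those of the published FAIR principles; the community's choices and the function name are invented for this example, and real FAIR Implementation Profiles are more fine-grained than this.

```python
# The 15 FAIR principles, by their published identifiers.
FAIR_PRINCIPLES = [
    "F1", "F2", "F3", "F4",
    "A1", "A1.1", "A1.2", "A2",
    "I1", "I2", "I3",
    "R1", "R1.1", "R1.2", "R1.3",
]

# Hypothetical choices declared by an imaginary research community.
profile = {
    "F1": ["DOI"],          # identifier scheme
    "A1": ["HTTPS"],        # access protocol
    "I1": ["RDF"],          # knowledge representation language
    "R1.1": ["CC-BY-4.0"],  # usage licence
}


def undeclared(profile: dict) -> list:
    """Return the principles for which the community has declared no choice."""
    return [p for p in FAIR_PRINCIPLES if not profile.get(p)]


gaps = undeclared(profile)
```

Writing the choices down in this way makes it easy to see where a community has already converged and where decisions are still open.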

Kristina Hettne from the Centre for Digital Scholarship at Leiden University Libraries has been contributing to FAIR Implementation Profiles from the start and was recently joined by Alessa Gambardella from the Faculty of Science at Leiden University. They will continue to be involved as part of the FAIR Implementation Profile and Practice Group.

During the conference, Erik Schultes and Barbara Magagna led the discussion on whether FAIR Implementation Profiles should be part of the core description of a FAIR Digital Object, which would make discipline-specific standards and vocabularies for describing research data part of the core of the Internet. As researchers explored how to make their data FAIR throughout the conference, the link between FAIR Implementation Profiles and Data Management Plans became a critical discussion point.

Another discussion of note involved the implementation of FAIR Digital Objects using RO-Crate,4 an approach to aggregate research outputs such as data, software, documents, and their descriptions so that not only humans but also machines can understand how they all fit together. Stian Soiland-Reyes presented5 how the principles of Linked Data could be updated for the FAIR Digital Object principles, considering the active nature of research data.
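For readers curious what such an aggregation looks like in practice, the following Python sketch builds a small metadata document in the style of RO-Crate 1.1, where a JSON-LD file named ro-crate-metadata.json describes a root dataset and its parts. The file names, titles, description, and date are made up for this example, and the helper function is hypothetical; treat it as a sketch, not a validated crate.

```python
import json

# A minimal metadata document in the style of RO-Crate 1.1 (example values are invented).
crate = {
    "@context": "https://w3id.org/ro/crate/1.1/context",
    "@graph": [
        {   # descriptor entity: points at the root dataset
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
            "about": {"@id": "./"},
        },
        {   # root dataset: the research output as a whole
            "@id": "./",
            "@type": "Dataset",
            "name": "Example survey crate",
            "description": "Survey responses together with the script that analyses them.",
            "datePublished": "2022-10-28",
            "license": {"@id": "https://creativecommons.org/licenses/by/4.0/"},
            "hasPart": [{"@id": "survey.csv"}, {"@id": "analysis.py"}],
        },
        {"@id": "survey.csv", "@type": "File",
         "name": "Survey responses", "encodingFormat": "text/csv"},
        {"@id": "analysis.py", "@type": ["File", "SoftwareSourceCode"],
         "name": "Analysis script"},
    ],
}


def parts_of_root(crate: dict) -> list:
    """Follow the descriptor to the root dataset and return the names of its parts."""
    by_id = {entity["@id"]: entity for entity in crate["@graph"]}
    root = by_id[by_id["ro-crate-metadata.json"]["about"]["@id"]]
    return [by_id[part["@id"]]["name"] for part in root["hasPart"]]


metadata_json = json.dumps(crate, indent=2)  # text that would be saved as ro-crate-metadata.json
part_names = parts_of_root(crate)
```

Because every part is listed and described in one place, a program can work out how the data file and the script relate without a human reading a README first.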

Take home message

I would like to end this blog post with a statement from one of the founders of the Internet, Robert E. Kahn: “The most important challenge is how to make a FAIR Digital Object describe to the computer all the different ways it can be reused.” Something to think about when you are describing your next dataset!

Find the conference outcomes here.