FAIR data

Researchers are paying increasing attention to the phase of the research lifecycle in which data are published, shared, discovered and reused. One widely advocated way to achieve optimal reuse is to make data FAIR (Findable, Accessible, Interoperable and Reusable) (Force 11, 2014; Wilkinson et al., 2016).

The FAIR guiding principles consist of 15 facets (Dutch Techcentre for Life Sciences, 2016) which describe a continuum of increasing reusability. Importantly, data should be FAIR not only for humans but also for machines, allowing, for instance, automated search for and access to data. Funders like the European Commission have drafted Guidelines on FAIR Data Management for the H2020 programme (European Commission, 2016). Good data management is one way to support the FAIR principles.


Steps toward FAIRer data

In this tour guide, we treat the FAIR principles as guidelines serving a clear higher goal: prepare your data for optimal (re)use from the start, and take the measures that give you the best chance of success. To achieve FAIRness, data objects should at least have:

  • A persistent identifier (PID) for the data object as a whole
    Persistent identifiers like DOIs are the solution to link rot. Link rot is the process by which hyperlinks stop referring to the original source over time because the source has been moved or removed. Without a PID, the data object simply will not be findable, let alone reusable (see 'Data citation').
  • A sufficient set of metadata
    A sufficient and standardised set of metadata (elements which describe the data) will enhance findability, interoperability and reusability. The quality of the descriptive information about the data has a profound impact on their reusability, so the more documentation of the data’s context, the better. As a minimum, there should be enough metadata to make the data findable, but also understandable and reusable by other researchers (see 'Documentation and metadata').
  • A clear licence
    Researchers (and computers) who find a dataset should immediately know what they are allowed to do with it. Stating clear reuse rights is like having a warm 'Welcome' on the doormat of your dataset. The motto is: ‘open if possible, restricted if necessary’ (see 'Data licensing').
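
The three minimum attributes above can also be checked by a machine. What follows is a minimal sketch in Python, not an established FAIR assessment tool: the field names (identifier, title, creators, publication_year, description, licence), the choice of required fields and the simplified DOI pattern are all assumptions made for illustration, and the DOI in the usage example is a placeholder.

```python
import re

# Hypothetical minimum set of descriptive fields (an assumption, not a standard).
REQUIRED_METADATA = ("title", "creators", "publication_year", "description")

# Simplified shape of a DOI, "10.<registrant>/<suffix>"; illustrative only,
# not a complete validator.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def fairness_gaps(record: dict) -> list:
    """Return the minimum attributes that a metadata record is missing.

    An empty list means a PID, the assumed minimum metadata and a licence
    are all present. It does not mean the data are fully FAIR.
    """
    gaps = []

    # 1. A persistent identifier for the data object as a whole.
    pid = str(record.get("identifier") or "")
    if not DOI_PATTERN.match(pid):
        gaps.append("no persistent identifier (DOI) for the data object")

    # 2. A sufficient set of metadata (here: the assumed minimum fields).
    missing = [field for field in REQUIRED_METADATA if not record.get(field)]
    if missing:
        gaps.append("missing metadata: " + ", ".join(missing))

    # 3. A clear licence.
    if not record.get("licence"):
        gaps.append("no licence stated")

    return gaps


# Usage with a made-up record; the DOI below is a placeholder.
example = {
    "identifier": "10.1234/example-dataset",
    "title": "Example survey data",
    "creators": ["A. Researcher"],
    "publication_year": 2017,
    "description": "Illustrative record only.",
    "licence": "CC BY 4.0",
}
print(fairness_gaps(example))
```

A check like this only confirms that the attributes are present. Whether the metadata are rich enough for another researcher to understand and reuse the data still calls for human judgement.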

One way to make sure your data will not become useless in the long term is to choose a data repository which has these attributes built into its infrastructure for submitting datasets. In the interest of FAIR data, researchers are advised to deposit their data, along with all the documentation needed for reuse, in a research data archive that has both the explicit goal and the necessary expertise to store data sustainably and maintain their usability (Van Berchum & Grootveld, 2017).

Making data FAIR is a joint responsibility of researchers and data repositories. In a comprehensive document, the Swiss National Science Foundation explains how the responsibilities of the two differ (SNF, n.d.).

In the chapter on archiving and publishing data, we will guide you in making the FAIRest choice for entrusting your data.

Expert tip