Citing your data

"Persistent identifiers ensure future access to unique published digital objects, such as a text or data set. Persistent identifiers are assigned to data sets by digital archives" (American Sociological Review, Submission Guidelines; Sage Publishing, 2017).

For data products to be uniquely identifiable and attributable to their data creators, two types of identifiers are recommended:

  • A persistent identifier (PID) to your dataset
    Publishing data sets is increasingly recognized as a citable contribution to research. To make your dataset citable, make sure it gets a unique, persistent identifier. The Digital Object Identifier (DOI) is a well-known identifier in academia. Having a PID is an important part of making sure your data meet the F (Findable) and A (Accessible) principles of FAIR data management.
  • A persistent author identifier
    To make your research results even more connected, you can create a personal persistent author identifier. The ORCID iD provides such a persistent digital identifier, distinguishing you from every other contributor and supporting automated linkages among all your professional activities. By creating and using an ORCID iD, you will be able to present your growing body of work through one channel.
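
To give a feel for what these two identifiers look like in practice, here is a minimal Python sketch. It assumes that a DOI can be turned into a link by placing it behind the https://doi.org/ resolver, and that the last character of an ORCID iD is a check character computed with the ISO 7064 MOD 11-2 scheme. The function names are made up for this illustration, and the sample iD 0000-0002-1825-0097 is the fictitious example researcher used in ORCID's own documentation.

```python
import re

# Prefixes people commonly put in front of a DOI (hypothetical helper list).
_DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doi_to_url(doi: str) -> str:
    """Return a resolvable https://doi.org/ link for a DOI string."""
    doi = doi.strip()
    for prefix in _DOI_PREFIXES:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    return "https://doi.org/" + doi


def orcid_checksum_ok(orcid: str) -> bool:
    """Check the format and the final check character of an ORCID iD."""
    if not re.fullmatch(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]", orcid):
        return False
    digits = orcid.replace("-", "")
    total = 0
    for ch in digits[:15]:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    expected = "X" if result == 10 else str(result)
    return digits[15] == expected


if __name__ == "__main__":
    print(doi_to_url("doi:10.4232/1.12848"))
    print(orcid_checksum_ok("0000-0002-1825-0097"))
```

A check like this only catches typing errors. It does not tell you whether the DOI or ORCID iD is actually registered.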

Citing new data types

Citing rapidly changing data is challenging. DataCite has published recommendations on citing new data types. One option is to cite the continuously updated dataset and add an access date and time to the citation. However, if the dataset has changed in the meantime, the citation no longer leads to the resource as it was when cited. This limits the reproducibility of work that uses this form of citation. Another option is to cite a specific "snapshot" (i.e., a copy of the entire dataset made at a specific time), but this requires a unique identifier for each version or snapshot of the data.
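
The two options can be sketched as follows in Python. This is an illustration only: the citation layout, the function names, and the DOIs under the 10.1234 prefix are placeholders invented for this example, not a format prescribed by DataCite or by any repository.

```python
from datetime import datetime, timezone


def cite_live_dataset(creator, year, title, publisher, doi, accessed):
    """Option 1: cite the continuously updated dataset plus access date and time.

    `accessed` should be a timezone-aware datetime; it is reported in UTC.
    """
    stamp = accessed.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    return (
        f"{creator} ({year}). {title}. {publisher}. "
        f"https://doi.org/{doi} (accessed {stamp})"
    )


def cite_snapshot(creator, year, title, version, publisher, snapshot_doi):
    """Option 2: cite one snapshot that has its own version label and identifier."""
    return (
        f"{creator} ({year}). {title} (Version {version}). {publisher}. "
        f"https://doi.org/{snapshot_doi}"
    )


if __name__ == "__main__":
    when = datetime(2024, 3, 1, 9, 30, tzinfo=timezone.utc)
    print(cite_live_dataset("Doe, J.", 2024, "Example sensor readings",
                            "Example Repository", "10.1234/example", when))
    print(cite_snapshot("Doe, J.", 2024, "Example sensor readings",
                        "2024-03-01", "Example Repository", "10.1234/example.v3"))
```

The first form always points to the latest state of the data, so the access time is the only hint about what was actually used. The second form points to a fixed copy, which is why each snapshot needs its own identifier.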

Data citation and impact

"Data citation is the practice of providing a reference to data in the same way as researchers routinely provide a bibliographic reference to other scholarly resources" (Australian National Data Service, n.d.).

The impact of your research may be determined by a wide range of research outputs, such as data sets, software, blog posts, presentations, and tweets. Being able to cite such outputs is important for building a culture in which all types of research output count. The video by Research Data Netherlands (2014) explains data citation and the role of persistent identifiers.

You can see for yourself how data citation is applied in an article on the spatial and temporal dynamics of multidimensional well-being, livelihoods and ecosystem services in coastal Bangladesh (Adams et al., 2016), published in the data journal Scientific Data.

When you scroll to the bottom of the article, you can see that this data paper cites two datasets. The persistent identifiers to these datasets are DOIs.

Furthermore, if you click the author's name (Helen Adams), you can see her ORCID iD.

Expert tips

    1. Deposit your data in a data repository
    When you deposit your data in a (trusted) data repository, a persistent identifier is often assigned to your data sets automatically.

    2. Register for an ORCID iD
    Registering for an ORCID iD is easy. Do it now (ORCID, n.d.)! Or first have a look at this video (Vanhaverbeke, 2017) in which other researchers state how having an ORCID iD benefits them.

    3. Check how FAIR your data are
    Want to know how FAIR your data are? Have a look at the checklist by Jones and Grootveld (2017).

    4. Include persistent identifiers as a variable
    Include the persistent identifier to your dataset as a variable in your data file. For example, the ISSP 2015 data set on Work Orientations (GESIS, n.d.) includes the following variable: variable name: DOI; variable label: "Digital Object Identifier". It has the same value for all cases: doi:10.4232/1.12848. This DOI links directly to the metadata in the GESIS data archive.
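
As a rough sketch of tip 4, the Python example below adds a DOI variable with the same value for every case to a small CSV data file. The function name, the column names `caseid` and `country`, and the sample rows are invented for illustration. Only the DOI value is taken from the GESIS example above.

```python
import csv
import io


def add_doi_column(csv_text: str, doi: str, column: str = "DOI") -> str:
    """Return the CSV text with an extra column holding the same DOI in every row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames or [])
    if column in fieldnames:
        raise ValueError(f"column {column!r} already exists")
    fieldnames.append(column)

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[column] = doi  # identical value for all cases
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    sample = "caseid,country\n1,NL\n2,DE\n"  # made-up example data
    print(add_doi_column(sample, "doi:10.4232/1.12848"))
```

Storing the identifier inside the file means the data can still be traced back to their archived source when the file is passed around without its documentation.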