Consortium of European Social Science Data Archives

7 Feb

CESSDA checklist helps to set up a Persistent Identifier service at your data archive

Persistent Identifiers (PIDs), long-lasting references to digital resources, sound like a very “techy” business, and they are. However, if you are setting up a PID service in your archive or repository, it is not all about the tech. Policies and procedures matter.

One deliverable of the CESSDA Tools and Services Working Group is a checklist for current or future CESSDA ERIC Service Providers (SPs) that are required to set up a PID service. Brigitte Hausstein (GESIS - Leibniz Institute for the Social Sciences) and Laurence Horton (University of Toronto) produced a set of questions to help identify and address the challenges and considerations that are part of establishing a PID service.

Having set up PID services at GESIS (da|ra) and LSE Library respectively, and drawing on the expertise present in the CESSDA working group, they identified a set of questions for repository managers and data stewards. These questions run from the very first considerations when beginning to set up a service through to creating a PID.

The checklist begins with the basics: identifying criteria for a service and the Service Provider options available. We then move to preparations for introducing a PID system into an archive, covering the administrative and legal requirements, technical infrastructure, workflows, structure, granularity, metadata questions, and the associated policies and procedures required to answer them. The third and final section probes roles and responsibilities within the organisation for the PID service.

One lesson from our experience that we wanted to impart to users is that setting up a PID service involves a significant amount of consideration and discussion of its organisational aspects. For example, what you want PIDs for, where you will apply them, who is responsible for them, and how they will be maintained are all questions that must be considered before attempting to establish a service.

We worked within certain restrictions. The list focuses on social science data archives, or social science related data sets in CESSDA SPs, and operates within the parameters of CESSDA’s PID Policy (Hausstein et al. 2019). This policy restricts the range of PID services CESSDA SPs can choose from to four types: Handle, its derivative DOI (Digital Object Identifier), URN:NBN, and ARK. Although this meant we did not have to address every kind of PID in operation today, it still made it a challenge to frame questions that relate to PIDs in general without focusing on aspects unique to one type of service.
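To make the four permitted PID types more concrete, the following is a minimal sketch of how an archive might tell them apart by their surface syntax. It is not part of the checklist or the CESSDA PID Policy. The function name `classify_pid`, the list of resolver prefixes, and the regular expressions are all assumptions chosen for illustration, and the patterns are rough heuristics rather than validators. The Handle, URN:NBN, and ARK strings in the usage example are made up.

```python
import re

# Resolver and scheme prefixes that are stripped before matching.
# This list is an illustrative assumption, not an exhaustive one.
_RESOLVER_PREFIXES = (
    "https://doi.org/",
    "https://hdl.handle.net/",
    "doi:",
    "hdl:",
)

# Rough heuristic patterns (assumptions for illustration only).
# DOI is checked before Handle because a DOI is a Handle whose prefix starts with "10.".
_PATTERNS = [
    ("DOI", re.compile(r"^10\.\d{4,9}/\S+$")),
    ("Handle", re.compile(r"^\d[\d.]*/\S+$")),
    ("URN:NBN", re.compile(r"^urn:nbn:[a-z]{2}[:-]\S+$", re.IGNORECASE)),
    ("ARK", re.compile(r"^ark:/?\d{5,}/\S+$", re.IGNORECASE)),
]


def classify_pid(identifier):
    """Return a best-guess PID type name, or None if no pattern matches."""
    value = identifier.strip()
    lowered = value.lower()
    # Drop one known resolver or scheme prefix, if present.
    for prefix in _RESOLVER_PREFIXES:
        if lowered.startswith(prefix):
            value = value[len(prefix):]
            break
    # Return the first pattern that matches the remaining string.
    for name, pattern in _PATTERNS:
        if pattern.match(value):
            return name
    return None


if __name__ == "__main__":
    examples = [
        "https://doi.org/10.5281/zenodo.3611333",  # DOI of the checklist itself
        "hdl:20.500.12345/abc",                    # hypothetical Handle
        "urn:nbn:de:0000-example-1",               # hypothetical URN:NBN
        "ark:/12345/x6example",                    # hypothetical ARK
    ]
    for example in examples:
        print(example, "->", classify_pid(example))
```

A check like this only recognises the shape of an identifier. It says nothing about whether the identifier resolves, who maintains it, or at what granularity it was assigned, which are the organisational questions the checklist concentrates on.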

Another challenge was to keep to a reasonable number of questions and a manageable overall length. The checklist is not intended to be used as a comprehensive and exclusive document in setting up a service, but rather to help service managers keep some focus on the broader issues and considerations in setting up a service. The final draft contains thirteen questions and is approximately 1700 words long, including supporting text.

The checklist itself was modelled on one produced by the European Commission to help manage the introduction of the General Data Protection Regulation (GDPR). This was a compliance checklist that took data controllers through a sequence of questions constructed around the regulation. What we found useful was that the checklist did not simply ask for a yes/no response to a question, but also provided basic information on what was needed to produce an informed response, one that either answered the question or identified an issue for which further discussion or help was required. A PID service is not a distant analogy in this sense, as both PIDs and the GDPR require consideration of current practices, organisational goals, and a review of policies and procedures.

A draft of the PID checklist was provided to CESSDA SPs, who were invited to review it and subsequently provided helpful comments to improve the clarity and specificity of the checklist. A final draft was then presented in October 2019 to the CESSDA Service Provider meeting in Bergen, where additional feedback was collected. The checklist was then approved by CESSDA ERIC and published in January 2020.

References:

CESSDA ERIC Checklist for the Usage of Persistent Identifiers. Version 1.0, 2019. https://doi.org/10.5281/zenodo.3611333
CESSDA ERIC Persistent Identifier Policy 2019. Principles, Recommendations and Best Practices. Version 2.0. https://doi.org/10.5281/zenodo.3611327

Authors:

Horton, Laurence (Faculty of Information, University of Toronto) https://orcid.org/0000-0003-2742-6434
Hausstein, Brigitte (GESIS Leibniz Institute for the Social Sciences, Leader of the CESSDA PID Project) https://orcid.org/0000-0001-5430-8201