Data in social sciences

In this tour guide we focus on data generated in social science research, both quantitative and qualitative. Notably, within the field of social sciences, you will often work with data originating from human participants. This can mean that you are handling (sensitive) personal data, which deserve special attention.

The sections below define personal data and introduce our concept of quantitative and qualitative data.


If you collect research data that enable you to identify a person, these are classified as personal data. The General Data Protection Regulation (GDPR, European Union, 2016) defines personal data as any information relating to an identified or identifiable natural person, known as a ‘data subject’. It further specifies that an identifiable natural person is someone who can be identified, either directly or indirectly, by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person. Personal data can include a variety of information, such as names, addresses, phone numbers and IP addresses.
The GDPR applies only to the data of living persons. Data which do not count as personal data do not fall under data protection legislation, though there may still be ethical reasons for protecting this information.

Sensitive personal data

Certain personal data are considered particularly sensitive and thus require specific protection, because they reveal information that may create significant risks for the fundamental rights and freedoms of the individual involved. Examples of sensitive personal data include data revealing religion, sexual orientation, or racial or ethnic origin. Within the GDPR the following categories are defined as ‘special categories of personal data’:

  • Racial or ethnic origin;
  • Political opinions;
  • Religious or philosophical beliefs;
  • Trade union membership;
  • Genetic data;
  • Biometric data;
  • Data concerning health;
  • Data concerning a natural person's sex life or sexual orientation.

Other data may contain sensitive information that does not fall under the special categories of personal data but should still be treated as such, for example confidential business data and secret data concerning state security.
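
As an illustration of how the special categories listed above might be used in practice, the following minimal Python sketch flags variables in a codebook that fall under one of them. The variable names, the `SPECIAL_CATEGORIES` mapping and the `flag_special_categories` function are all hypothetical; a real project would base the mapping on its own codebook and on advice from its data protection officer.

```python
# Hypothetical mapping from survey variable names to GDPR special categories.
# The variable names are invented for illustration only.
SPECIAL_CATEGORIES = {
    "ethnicity": "Racial or ethnic origin",
    "party_vote": "Political opinions",
    "religion": "Religious or philosophical beliefs",
    "union_member": "Trade union membership",
    "health_status": "Data concerning health",
    "sexual_orientation": "Data concerning a natural person's sex life or sexual orientation",
}


def flag_special_categories(variables):
    """Return {variable: category} for every variable found in the mapping."""
    return {v: SPECIAL_CATEGORIES[v] for v in variables if v in SPECIAL_CATEGORIES}


# Example codebook with a mix of ordinary and sensitive variables.
codebook = ["age_group", "religion", "health_status", "income_band"]
flagged = flag_special_categories(codebook)

for name, category in flagged.items():
    print(f"{name}: {category}")
```

A lookup by variable name only catches what has been listed in advance. Free-text answers or interview transcripts can reveal the same kinds of information and would still need to be reviewed by hand.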

As with research data in general, social science data cover a broad range of materials, from structured numerical datasets to interviews, field notes, and documents collected for ethnographic studies. In this tour guide, we distinguish between quantitative and qualitative data, though both can, of course, be collected during the same study. In the table below the main attributes of both types of data are shown. An attribute described in one column may also apply to the other type of data.

(Sensitive) personal data and the tour guide

Tips for handling (sensitive) personal data are present throughout this tour guide. In particular, we would like to point out the following:

  • In the chapter on storing data, you will find measures to protect (sensitive) personal data from unauthorised access with strong passwords and encryption.
  • In the chapter on protecting your data, you will learn how a combination of gaining consent, anonymising data, gaining clarity over who owns the copyright to your data and controlling access to data can enable the ethical and legal sharing of (sensitive) personal data.
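
To give a first impression of what protecting identifiers can look like, the sketch below replaces a direct identifier (a name) with a keyed hash, using only the Python standard library. It is a minimal sketch under stated assumptions: the `pseudonymise` function, the record layout and the 16-character pseudonym length are hypothetical choices, not a method prescribed by this guide. Note that this is pseudonymisation rather than anonymisation: anyone holding the key can reproduce the link to a person, so data treated this way generally still count as personal data under the GDPR.

```python
import hashlib
import hmac
import secrets


def pseudonymise(value, key):
    """Return a keyed SHA-256 digest of value, shortened to 16 hex characters."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


# Illustrative secret key; it is assumed to be stored separately from the data.
key = secrets.token_bytes(32)

# Invented example records; "name" is the direct identifier to be replaced.
records = [
    {"name": "Respondent A", "answer": 4},
    {"name": "Respondent B", "answer": 2},
    {"name": "Respondent A", "answer": 5},
]

# The same name always maps to the same pseudonym, so responses stay linkable.
pseudonymised = [
    {"pid": pseudonymise(r["name"], key), "answer": r["answer"]} for r in records
]

for row in pseudonymised:
    print(row)
```

A keyed hash is used here instead of a plain hash so that the pseudonyms cannot be recreated simply by hashing a list of likely names without the key.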