Anonymisation

I am collecting data on asylum seekers' and refugees' experiences of forced labour. These participants can be considered 'doubly vulnerable'. We want to share these data. How should we protect our participants' anonymity?

Consider:

  • not recording any official identifying data (e.g. Home Office numbers);
  • letting participants choose their own pseudonyms (which should not be disclosive in any way);
  • password-protecting interviewee contact details;
  • not connecting pseudonyms to these password protected interviewee contact details.

Read more about the ethical considerations of this real-life project at the site of the Economic and Social Research Council (ESRC, 2017c).

The best way to protect your participants’ privacy may be not to collect certain identifiable information at all. The second best is anonymisation, which allows data to be shared whilst protecting participants’ personal information. Anonymisation should be considered in the context of the whole project, together with how it can be used alongside informed consent and access controls. For example, if a participant consents to their data being shared, then anonymisation may not be required.

Personal data can be disclosed through two categories of identifiers.

  • Direct identifiers are ones like the participant’s name, address, or telephone numbers that specifically identify them;
  • Indirect identifiers are ones that when they are placed with other information could also reveal an individual, for example, by cross-referencing occupation, salary, age, and location.
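
The cross-referencing risk described above can be illustrated with a minimal sketch. It counts how many records share each combination of indirect identifiers; a combination that occurs only once may point to a single person. The records, field names and the helper `unique_combinations` are hypothetical and only serve as an illustration.

```python
from collections import Counter

# Hypothetical records: no direct identifiers, only indirect ones.
records = [
    {"occupation": "teacher", "age": 34, "location": "Northtown"},
    {"occupation": "teacher", "age": 34, "location": "Northtown"},
    {"occupation": "harbour pilot", "age": 61, "location": "Northtown"},
]


def unique_combinations(rows, keys):
    """Return the combinations of `keys` that occur exactly once in `rows`."""
    counts = Counter(tuple(row[k] for k in keys) for row in rows)
    return [combo for combo, n in counts.items() if n == 1]


# Combinations held by a single record are candidates for closer review.
risky = unique_combinations(records, ["occupation", "age", "location"])
print(risky)
```

A check like this only flags candidates for review; whether a combination is really disclosive still depends on what other information is available about the population studied.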

Pseudonymisation and anonymisation are two distinct terms which fall under different categories in the General Data Protection Regulation (GDPR; European Union, 2016a). Whereas anonymisation, in theory, irreversibly destroys any way of identifying the data subject, pseudonymisation allows the data subject to be re-identified with the use of additional information.

The GDPR defines pseudonymisation as "the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information". To pseudonymise a dataset "the additional information must be kept separately and is subject to technical and organisational measures to ensure that the personal data are not attributed to an identified or identifiable natural person". Directly identifying data is held separately and securely from processed data to ensure non-attribution.
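
As a rough illustration of keeping the additional information separate, the sketch below splits each record into a research record and an entry in a key table. The function `pseudonymise`, the `pid` field and the example field names are assumptions made for this sketch, not part of any standard; in practice the key table would be stored securely and apart from the research data.

```python
import secrets


def pseudonymise(rows, direct_fields):
    """Split rows into (research_data, key_table).

    research_data keeps only the non-identifying fields plus a random
    pseudonymous ID; key_table maps that ID back to the direct identifiers
    and must be stored separately and securely.
    """
    research_data, key_table = [], {}
    for row in rows:
        pid = "P" + secrets.token_hex(4)
        while pid in key_table:  # regenerate in the unlikely event of a clash
            pid = "P" + secrets.token_hex(4)
        key_table[pid] = {field: row[field] for field in direct_fields}
        cleaned = {k: v for k, v in row.items() if k not in direct_fields}
        cleaned["pid"] = pid
        research_data.append(cleaned)
    return research_data, key_table


# Hypothetical example rows.
rows = [
    {"name": "Participant A", "phone": "000 111", "answer": "yes"},
    {"name": "Participant B", "phone": "000 222", "answer": "no"},
]
data, key = pseudonymise(rows, ["name", "phone"])
```

Because the key table still allows re-identification, the result is pseudonymised rather than anonymous data.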

Anonymous data is data that cannot identify individuals in the dataset in any way: neither directly through name or social security number, nor indirectly through background variables, nor through a list of names or an encryption formula and code/scrambling key.

Anonymisation methods

When anonymising, data identifiers need to be removed, generalised, aggregated or distorted. Below, best practices for anonymising quantitative and qualitative data are given.

The best practices for anonymising quantitative data

  • Remove or aggregate variables, or reduce the precision or detailed textual meaning of a variable;
  • Aggregate or reduce the precision of a variable such as age or place of residence. As a general rule, report the lowest level of geo-referencing that will not potentially breach respondent confidentiality;
  • Generalise the meaning of a detailed text variable by replacing potentially disclosive free-text responses with more general text;
  • Restrict the upper or lower ranges of a continuous variable to hide outliers if the values for certain individuals are unusual or atypical within the wider group researched.
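
The practices listed above can be sketched in a few small functions. This is a minimal illustration only: the function names, the band width, the range limits and the number of postal code characters kept are all assumptions that would need to be chosen per dataset.

```python
def age_band(age, width=10):
    """Reduce precision: map an exact age to a band such as '60-69'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def restrict_range(value, lower, upper):
    """Hide outliers: values outside [lower, upper] are set to the nearest limit."""
    return max(lower, min(value, upper))


def coarsen_postcode(code, keep=2):
    """Lower the geographic level: keep the first `keep` characters only."""
    return code[:keep] + "x" * (len(code) - keep)


# Hypothetical respondent with an unusually high income.
respondent = {"age": 63, "income": 250000, "postcode": "33100"}
anonymised = {
    "age_band": age_band(respondent["age"]),
    "income": restrict_range(respondent["income"], 10000, 100000),
    "postcode": coarsen_postcode(respondent["postcode"]),
}
print(anonymised)
```

In this sketch an age of 63 is reported as the band 60-69, and an income above the chosen upper limit is reported as that limit.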

Best practices for anonymising qualitative data

  • Use pseudonyms or generic descriptors to edit identifying information, rather than blanking out that information;
  • Plan anonymisation at the time of transcription or initial write-up (longitudinal studies may be an exception if relationships between waves of interviews need special attention for harmonised editing);
  • Use pseudonyms or replacements that are consistent throughout the research team and the project. For example, using the same pseudonyms in publications and follow-up research;
  • Use 'search and replace' techniques carefully so that unintended changes are not made, and misspelt words are not missed;
  • Identify replacements in text clearly, for example with [brackets] or using XML tags such as <seg>word to be anonymised</seg>;
  • Create an anonymisation log (also known as a de-anonymisation key) of all replacements, aggregations or removals made and store such a log securely and separately from the anonymised data files.
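
Several of these practices (careful search and replace, clearly marked replacements, and an anonymisation log) can be combined in a short sketch. The function `anonymise_text` and the example names are hypothetical. Matching on word boundaries is one possible way to avoid unintended changes; it does not catch misspelt names, which still need a manual read-through.

```python
import re


def anonymise_text(text, replacements):
    """Replace each identifier with a [bracketed] descriptor.

    Returns the edited text and a log of the replacements made. The log
    should be stored securely and separately from the anonymised file.
    """
    log = []
    # Handle longer identifiers first so that a full name such as
    # "St Barbara's Hospice" is replaced before any shorter part of it.
    for original in sorted(replacements, key=len, reverse=True):
        descriptor = f"[{replacements[original]}]"
        # \b restricts matches to whole words, so "Tom" does not hit "Tomorrow".
        pattern = re.compile(r"\b" + re.escape(original) + r"\b")
        text, count = pattern.subn(lambda match: descriptor, text)
        if count:
            log.append(
                {"original": original, "replacement": descriptor, "count": count}
            )
    return text, log


sample = "Tom was in St Barbara's Hospice. Tomorrow Tom rests."
edited, log = anonymise_text(
    sample, {"Tom": "he", "St Barbara's Hospice": "the hospice"}
)
print(edited)
```

The counts in the log give a quick check that the number of replacements matches what a manual review of the transcript would expect.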

Different types of identifiers are listed in the table below. Information that is deemed to be sensitive according to the Finnish Personal Data Act is marked with an asterisk (*). Each identifier is characterised as a direct identifier, a strong indirect identifier or an indirect identifier.

The last column notes the easiest methods for dealing with each type of identifier. Remove means removing the information, Change means changing it to a pseudonym and Categorise means categorisation/classification. In the case of qualitative data, categorising means coarsening the identifying information.

Some identifiers may be both indirect identifiers and strong indirect identifiers. An unusual occupation or occupational status is a strong indirect identifier while a common occupation is just an indirect identifier.

| Identifier type | Direct identifier | Strong indirect identifier | Indirect identifier | Anonymisation method |
|---|---|---|---|---|
| Personal identification number | x | | | Remove |
| Full name | x | | | Remove/Change |
| Email address | x | x | | Remove |
| Phone number | | x | | Remove |
| Postal code | | | x | Remove/Categorise |
| District/part of town | | | x | Categorise |
| Municipality of residence | | | x | Categorise |
| Region | | | x | Categorise |
| Major region | | | x | |
| Municipality type | | | x | |
| Audio file | x | | | Remove |
| Video file displaying person(s) | x | | | Remove |
| Photograph of person(s) | x | | | Remove |
| Year of birth | | x | | Categorise |
| Age | | | x | Categorise |
| Gender | | | x | |
| Marital status | | | x | |
| Household composition | | | x | Categorise |
| Occupation | | (x) | x | Categorise |
| Industry of employment | | | x | |
| Employment status | | | x | |
| Education | | | x | Categorise |
| Field of education | | | x | |
| Mother tongue | | | x | Categorise |
| Nationality | | | x | Categorise |
| Workplace/Employer | | (x) | x | Categorise |
| Vehicle registration number | | x | | Remove |
| Title of publication | | x | | Categorise |
| Web page address | | (x) | x | Remove |
| Student ID number | | x | | Remove |
| Insurance number | | x | | Remove |
| Bank account number | | x | | Remove |
| IP address | | x | | Remove |
| Health-related information * | | (x) | x | Categorise/Remove |
| Ethnic group * | | (x) | x | Categorise/Remove |
| Crime or punishment * | | | x | Categorise/Remove |
| Membership in a trade union * | | | x | Categorise |
| Political or religious allegiance * | | | x | Categorise |
| Other position of trust or membership | | (x) | x | Categorise/Remove |
| Need for social welfare * | | | x | Categorise/Remove |
| Social welfare services and benefits received * | | | x | Categorise/Remove |
| Sexual orientation * | | | x | Remove |

Source: Finnish Social Science Data Archive (2017b).

Expert tips

    1. Data access controls
    In situations where (sensitive) personal data are not fully anonymised, data can still be archived and shared by regulating or limiting access to the data. Access controls can permit control down to an individual file level, meaning that mixed levels of access control can be applied to a data collection. You will learn more about choosing the appropriate data access category for your data files in the chapter on archiving and publishing data (see 'Access categories').

    2. Irreversible anonymisation
    In some countries, anonymisation needs to be irreversible and the original data deleted. Be sure to check the national requirements.

    3. Anonymisation tools
    The UK Data Archive (n.d.) has developed a text anonymisation helper tool (downloaded as a .zip file), with installation instructions provided on a wiki. It is an add-on MS Word macro that aids the anonymisation of qualitative data.

    4. Reading tip
    This factsheet by OpenAIRE (2017) guides you in balancing open access and data protection and advises on what to do when anonymisation isn't possible.

Case study

In a research study investigating how couples manage their households during recessions (Gush and Laury, 2015), finding the right balance between confidentiality and usefulness of the data was a real challenge (UK Data Service, 2022a). The archiving challenges in this project were anonymising the data and applying optimal access conditions.

Careful judgement was required to apply the level of anonymisation most appropriate for this particular data. The research team members went through the transcripts and removed certain types of identifying data such as names, places of work, and geographic areas. Regarding access conditions, it was decided to make the data available using a Special Licence (UK Data Service, 2022b; see 'Licensing your data' for other possible licences). Under this kind of licence, a potential user is required not only to register with the UK Data Service, but also to complete a detailed application form and agree to additional restrictions on data handling and usage. The use of the Special Licence then made it possible to apply a minimal level of anonymisation, thus reducing the loss of data quality.

An exercise in anonymising qualitative data

Follow the steps to see whether you recognise direct and indirect identifiers in an interview transcript and whether you know how to deal with them accordingly.

Anonymisation

Mr Tom Jeavons, aged 63, was suffering from metastatic cancer resulting from a primary site in the bladder. His wife, Sue (58), had been his main carer for many months as he struggled with severe pain, anxiety and other symptoms. Eventually, she received support from the hospice at home team, based at their nearby hospice – St Barbara. 11 days before his death, he was admitted to their inpatient unit, where he died. The case was identified by the staff there as a “critical case”, involving palliative sedation and the difficulties staff experienced in controlling his complex symptoms. Other interviews carried out were with the hospice consultant, Dr Jane O’Connor and three nurses: Elaine McDonald, Claire Smith, and Mark Ferguson. Mr and Mrs Jeavons’ GP, Dr Paul Hyde, was also interviewed which added a different medical perspective, making this an unusual case.

Central themes in all of the interviews were his intractable and distressing symptoms and the repeated requests from Mr Jeavons for euthanasia. His wife mentions earlier discussions with Mr Jeavons about the possibility of going to a Dignitas clinic, but he was already too ill to travel. She also expresses how concerned she was about what Mr Jeavons’s adult children might witness when he was dying in the hospice.

Source: Data collection by Seymour (2010-2012).

Read through the interview script and consider what anonymisation would be needed before archiving this transcript for future sharing.

TRANSCRIPT SYMBOLS:

INT = Interviewer
RESP = Respondent
[?] = Unintelligible

INT: So, really, it’s as I said to you: I want you to tell me what you can remember about Mr Jeavons’ care in the last week of his life ... or about Mr Jeavons in the last week of his life.

RESP: Yeah, erm, 11 days, Tom was in St Barbara’s Hospice for the last 11 days of his life so...

INT: So if you’d like to talk about that period...

RESP: Yeah.

INT: ...that’d be great.

RESP: Prior to him going in, and we was coping with his care at home, but then he was becoming less and less mobile: he couldn’t go to the toilet; he had a frame, and everything that you added in that was, it was a step to help him but a downward step to the end of how he could cope. We had a Bariatric bed brought into the other room but he insisted in sleeping in his chair. We had St Barbara’s here and, erm, the GP, and, er, we also had him assessed at home as to whether or not we could care for him completely at home. And Tom was about 20-something stone, so he wasn’t easy to manoeuvre and, and the one thing that concerned me was the fact that, erm, they needed four people to move him, you know, if he wanted to go to the toilet or if he wanted to go on a bedpan or anything, and we had the bed in there – which he wouldn’t sleep in. And, erm, basically the, logistically trying to be able to do everything for him and keep him comfortable, we’d have to wait for an on-call four nurses – could be in the middle of the night – and, and sort of the idea of being able to cope, erm, for his safety and wellbeing was, was really compromised. He didn’t want to go into St Barbara’s, he didn’t want to die in hospital, erm, but I just felt I had to take that decision to say, erm, when the guy came out to assess him, erm, he said, ‘We can do it but, you know, you’ve got to say what you’re going to do at three o’clock on Saturday, early hours of Saturday morning, and he wants to go on the bedpan or you need to change him or whatever.’ And, and it, I had to let logic and let my heart... be ruled by my head.

INT: Mm.

RESP: So we got him into St Barbara’s, and he went in on the Friday, 11 days before he died, and, erm... when, when he went in – because he couldn’t move – from, from a few days before that he wasn’t able to move to get to the toilet or anything and we got commodes and things like that and, you know, with having young, young girls in here, we couldn’t find him somewhere that he could be private...

INT: Mm.

RESP: ...and that was a bit of a problem for him, because he was a very private man in that, in that way. Erm, so we went into St Barbara’s on the Friday and they decided that what they were going to do was going to fit him with a catheter. Well, unfortunately, it was so traumatic for him because all Tom’s waterworks had retracted...

INT: Ah.

RESP: ...so much, but there was a determination on the, on the part of the staff to try and make it easier for him to have this catheter put in. Well, it wasn’t, it was counter-productive really because, erm, his son came to see his dad, and I was there, and we went out the room and this nurse had spent about an hour and a half trying to get this catheter in. They tried to do it at home, erm, and failed...

INT: Mm.

RESP: ...and of course he was incredibly sensitive, incredibly tender and everything else, and everything had shrunken and retracted so far back it was nigh impossible to actually, to do it without causing him any distress.

INT: Mm.

RESP: So they left it at home but we tried to get it done, erm, in the hospital, they tried to do it, and this lady, erm, had succeeded in getting a catheter in, but he was traumatised by it – there was no other word, he was traumatised – and when myself and his son went back into the room after about an hour and a half, waiting for this thing to, to be finished, er, he actually said to me and to his son, ‘Just go away and leave me alone.’ And that, unfortunately, was the last time his son saw him, so, Darren lives way over in Seatown. So unfortunate his son’s last memory was that. So he stuck with the catheter but the catheter didn’t really feel that comfortable, and every time he passed water he was actually yelling in pain. Er, two or three days later they actually took the catheter out and just put him on a pad and, and let him just wee, because, to be honest, did it matter? You know, and to put him through it, he was traumatised with his catheter fitting, and, you know, obviously they’re trying to make life easier and more comfortable, erm, but it was, as I say, it was counter-productive.

Anyway, erm... I came home, had a shower, went back in and he was a little bit calmer. Erm... before he went in, erm, he wasn’t eating very much or drinking very much, because his, his requirement for food – he kept asking for, for help to die, because he’d enough – he was, he was really, there was no quality; he was in such a lot of pain; he was on such a lot of drugs, and he, he just really, there was no value to him just languishing as he was. Erm, and so it was basically decided that if, if he wanted a drink... a drink would always be there if he wanted one, but there’d be no encouragement, erm, because as, as St Barbara’s said, ‘We can’t kill him,’ you know, quite [?], ‘We can’t...’ you know, ‘There’s nothing we can’t... we can keep him out of pain; we can keep him calm, erm, but we can’t kill him.’ Erm, and I remember him saying to Dr O’Connor ‘Just put the boot in, Dr O’ Connor.’ ... ‘Just put the boot...’ [?], he’d had enough. Anyway ... [ ] I cannot criticise the care that they gave him at St Barbara’s because it was, you know, fantastic.

The answer below shows which direct and indirect identifiers need to be anonymised. Each one is followed by a number in brackets, together with a note on how the anonymisation can be done in that case.

TRANSCRIPT SYMBOLS:

INT = Interviewer
RESP = Respondent
[?] = Unintelligible
[ ] = Edited to maintain anonymity [1 - Added to clarify anonymisation of transcript]

Mr Tom Jeavons [2 - Delete and replace with [This gentleman]], aged 63, [3 - Delete] was suffering from metastatic cancer resulting from a primary site in the bladder [4 - Delete]. His wife, Sue [5 - Delete] (58), [6 - Delete] had been his main carer for many months as he struggled with severe pain, anxiety and other symptoms. Eventually, she received support from the hospice at home team, based at their nearby hospice – St Barbara. [7 - Delete] 11 days before his death, he was admitted to their inpatient unit, where he died. The case was identified by the staff there as a “critical case”, involving palliative sedation and the difficulties staff experienced in controlling his complex symptoms. Other interviews carried out were with the hospice consultant, Dr Jane O’Connor [8 - Delete] and three nurses: Elaine McDonald, Claire Smith and Mark Ferguson [9 - Delete]. Mr and Mrs Jeavons’ [10 - Delete and replace with [The couple’s]] GP, Dr Paul Hyde, [11 - Delete] was also interviewed which added a different medical perspective, making this an unusual case.

Central themes in all of the interviews were his intractable and distressing symptoms and the repeated requests from Mr Jeavons [12 - Delete and replace with [the patient]] for euthanasia. His wife mentions earlier discussions with Mr Jeavons [13 - Delete and replace with [her husband]] about the possibility of going to a Dignitas clinic, but he was already too ill to travel. She also expresses how concerned she was about what Mr Jeavons’s [14 - Delete and replace with [his]] adult children might witness when he was dying in the hospice.

INT: So, really, it’s as I said to you: I want you to tell me what you can remember about Mr Jeavons’ [15 - Delete and replace with [your husband’s]] care in the last week of his life ... or about Mr Jeavons [16 - Delete and replace with [your husband]] in the last week of his life.

RESP: Yeah, erm, 11 days, Tom [17 - Delete and replace with [he]] was in St Barbara’s Hospice [18 - Delete and replace with [the hospice]] for the last 11 days of his life so...

INT: So if you’d like to talk about that period...

RESP: Yeah.

INT: ...that’d be great.

RESP: Prior to him going in, and we was coping with his care at home, but then he was becoming less and less mobile: he couldn’t go to the toilet; he had a frame, and everything that you added in that was, it was a step to help him but a downward step to the end of how he could cope. We had a Bariatric bed brought into the other room but he insisted in sleeping in his chair. We had St Barbara’s [19 - Delete and add [hospice at home]] here and, erm, the GP, and, er, we also had him assessed at home as to whether or not we could care for him completely at home. And Tom [20 - Delete and replace with [he]] was about 20-something stone, so he wasn’t easy to manoeuvre and, and the one thing that concerned me was the fact that, erm, they needed four people to move him, you know, if he wanted to go to the toilet or if he wanted to go on a bedpan or anything, and we had the bed in there – which he wouldn’t sleep in. And, erm, basically the, logistically trying to be able to do everything for him and keep him comfortable, we’d have to wait for an on-call four nurses – could be in the middle of the night – and, and sort of the idea of being able to cope, erm, for his safety and wellbeing was, was really compromised. He didn’t want to go into St Barbara’s [21 - Delete and replace with [the hospice]], he didn’t want to die in hospital, erm, but I just felt I had to take that decision to say, erm, when the guy came out to assess him, erm, he said, ‘We can do it but, you know, you’ve got to say what you’re going to do at three o’clock on Saturday, early hours of Saturday morning, and he wants to go on the bedpan or you need to change him or whatever.’ And, and it, I had to let logic and let my heart... be ruled by my head.

INT: Mm.

RESP: So we got him into St Barbara’s [22 - Delete and replace with [the hospice]], and he went in on the Friday, 11 days before he died, and, erm... when, when he went in – because he couldn’t move – from, from a few days before that he wasn’t able to move to get to the toilet or anything and we got commodes and things like that and, you know, with having young, young girls in here, we couldn’t find him somewhere that he could be private...

INT: Mm.

RESP: ...and that was a bit of a problem for him, because he was a very private man in that, in that way. Erm, so we went into St Barbara’s [23 - Delete and replace with [the hospice]] on the Friday and they decided that what they were going to do was going to fit him with a catheter. Well, unfortunately, it was so traumatic for him because all Tom’s [24 - Delete and replace with [his]] waterworks had retracted...

INT: Ah.

RESP: ...so much, but there was a determination on the, on the part of the staff to try and make it easier for him to have this catheter put in. Well, it wasn’t, it was counter-productive really because, erm, his son came to see his dad, and I was there, and we went out the room and this nurse had spent about an hour and a half trying to get this catheter in. They tried to do it at home, erm, and failed...

INT: Mm.

RESP: ...and of course he was incredibly sensitive, incredibly tender and everything else, and everything had shrunken and retracted so far back it was nigh impossible to actually, to do it without causing him any distress.

INT: Mm.

RESP: So they left it at home but we tried to get it done, erm, in the hospital, they tried to do it, and this lady, erm, had succeeded in getting a catheter in, but he was traumatised by it – there was no other word, he was traumatised – and when myself and his son went back into the room after about an hour and a half, waiting for this thing to, to be finished, er, he actually said to me and to his son, ‘Just go away and leave me alone.’ And that, unfortunately, was the last time his son saw him, so, Darren [25 - Delete and replace with [his son]] lives way over in Seatown [26 - Delete and replace with [he lives some distance away]]. So unfortunate his son’s last memory was that. So he stuck with the catheter but the catheter didn’t really feel that comfortable, and every time he passed water he was actually yelling in pain. Er, two or three days later they actually took the catheter out and just put him on a pad and, and let him just wee, because, to be honest, did it matter? You know, and to put him through it, he was traumatised with his catheter fitting, and, you know, obviously they’re trying to make life easier and more comfortable, erm, but it was, as I say, it was counter-productive.

Anyway, erm... I came home, had a shower, went back in and he was a little bit calmer. Erm... before he went in, erm, he wasn’t eating very much or drinking very much, because his, his requirement for food – he kept asking for, for help to die, because he’d enough – he was, he was really, there was no quality; he was in such a lot of pain; he was on such a lot of drugs, and he, he just really, there was no value to him just languishing as he was. Erm, and so it was basically decided that if, if he wanted a drink... a drink would always be there if he wanted one, but there’d be no encouragement, erm, because as, as St Barbara’s [27 - Delete and replace with [the hospice]] said, ‘We can’t kill him,’ you know, quite [?], ‘We can’t...’ you know, ‘There’s nothing we can’t... we can keep him out of pain; we can keep him calm, erm, but we can’t kill him.’ Erm, and I remember him saying to Dr O’Connor [28 - Delete and replace with [the doctor]] ‘Just put the boot in, Dr O’ Connor [29 - Delete and replace with [doctor]].’ ... ‘Just put the boot...’ [?], he’d had enough. Anyway ... [ ] I cannot criticise the care that they gave him at St Barbara’s [30 - Delete and replace with [the hospice]] because it was, you know, fantastic.