De-identify datasets
De-identifying data
De-identifying data means taking steps to ensure that an individual cannot be identified from the personal or health information you have collected.
When you de-identify data, you reduce the risk of a privacy breach. Legal requirements under privacy legislation do not apply to data that has been properly de-identified.
Not every dataset is suitable for de-identification. The risk of re-identification can be higher when a dataset is combined with other datasets. Consider this when handling unit record level data, where each record relates to a single individual.
If you're not sure that data can be, or has been, properly de-identified, you should treat it as you would personal information. Also, the data may not be suitable for open release.
Avoid re-identifying data
You can avoid data being re-identified by:
- presenting and sharing aggregated rather than specific results or raw data
- checking whether any elements of what you've recorded could allow someone's identity to be inferred or derived.
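The two checks above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the field names, the choice of `age_band` and `postcode` as potentially identifying fields, and the minimum cell size of 3 are all hypothetical and are not thresholds set by any legislation or regulator. Passing checks like these does not by itself mean a dataset is properly de-identified.

```python
from collections import Counter

# Hypothetical unit records; every value here is invented for illustration.
records = [
    {"age_band": "30-39", "postcode": "2000", "condition": "asthma"},
    {"age_band": "30-39", "postcode": "2000", "condition": "asthma"},
    {"age_band": "30-39", "postcode": "2000", "condition": "diabetes"},
    {"age_band": "40-49", "postcode": "2000", "condition": "asthma"},
    {"age_band": "40-49", "postcode": "2010", "condition": "diabetes"},
    {"age_band": "80-89", "postcode": "2010", "condition": "epilepsy"},
]

# Assumption: these fields could identify someone when taken together.
QUASI_IDENTIFIERS = ("age_band", "postcode")
# Assumption: an illustrative threshold, not a legal standard.
MIN_CELL_SIZE = 3


def aggregate_counts(rows, key, min_cell_size=MIN_CELL_SIZE):
    """Count rows per value of `key`, replacing small counts with None."""
    counts = Counter(row[key] for row in rows)
    return {
        value: (n if n >= min_cell_size else None)
        for value, n in counts.items()
    }


def risky_combinations(rows, quasi_identifiers=QUASI_IDENTIFIERS,
                       k=MIN_CELL_SIZE):
    """Return combinations of quasi-identifiers shared by fewer than k rows."""
    groups = Counter(
        tuple(row[field] for field in quasi_identifiers) for row in rows
    )
    return {combo: n for combo, n in groups.items() if n < k}


# Aggregated results: counts below the threshold are suppressed (None).
print(aggregate_counts(records, "condition"))
# Combinations that occur so rarely that a person might be singled out.
print(risky_combinations(records))
```

In this sketch, only the aggregated counts would be shared, and any combination reported by `risky_combinations` would prompt a closer look before release, for example by using broader age bands or larger geographic areas.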
Remember that when you reuse datasets across multiple projects, there's a risk that someone could be identified by linking the datasets together.
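The linking risk can be illustrated with a minimal Python sketch. Both datasets, their field names and the helper `unique_links` are hypothetical and invented for this example. The first dataset has had names removed, yet it still shares `age_band` and `postcode` with a second dataset that does contain names.

```python
# Hypothetical datasets from two separate projects; all values are invented.
health_survey = [  # names removed
    {"age_band": "40-49", "postcode": "2010", "condition": "diabetes"},
    {"age_band": "30-39", "postcode": "2000", "condition": "asthma"},
    {"age_band": "30-39", "postcode": "2000", "condition": "diabetes"},
]
membership_list = [  # still contains names
    {"name": "Person A", "age_band": "40-49", "postcode": "2010"},
    {"name": "Person B", "age_band": "30-39", "postcode": "2000"},
    {"name": "Person C", "age_band": "30-39", "postcode": "2000"},
]

# Assumption: the fields the two datasets happen to have in common.
SHARED_FIELDS = ("age_band", "postcode")


def index_by_fields(rows, fields):
    """Group rows by their values for the given fields."""
    grouped = {}
    for row in rows:
        grouped.setdefault(tuple(row[f] for f in fields), []).append(row)
    return grouped


def unique_links(deidentified, identified, shared_fields=SHARED_FIELDS):
    """Return (identified_row, deidentified_row) pairs that match one-to-one."""
    left = index_by_fields(deidentified, shared_fields)
    right = index_by_fields(identified, shared_fields)
    return [
        (right[key][0], left[key][0])
        for key in left
        if len(left[key]) == 1 and len(right.get(key, [])) == 1
    ]


links = unique_links(health_survey, membership_list)
for person, survey_row in links:
    print(person["name"], "can be linked to", survey_row["condition"])
```

Here one record matches exactly one named person on the shared fields, so that person's health information is revealed even though the survey itself holds no names. The other two survey records match two people each, so this simple check cannot tell them apart.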
Resources
- De-identification Decision-Making Framework - Office of the Australian Information Commissioner and Data61
- Reasonably ascertainable identity - Information and Privacy Commission NSW.