Technology Commercialisation

AI, medical data, and privacy: When can de-identification still leave a fingerprint?

February 21, 2025

As artificial intelligence (AI) continues to make breakthroughs in patient care, it is important to take a beat and evaluate the privacy concerns that any healthtech or medtech business seeking to use de-identified information will inevitably face.  

In particular, is de-identified patient data (e.g. an x-ray scan image) really 'anonymous', or actually a digital fingerprint that might still point to an individual's identity?

This article provides a check-up on the Office of the Australian Information Commissioner's (OAIC) 'Guidance on privacy and developing and training generative AI models', and recent developments in this space.

De-identification dilemma: When is data really de-identified?

De-identification is often seen as the salve for privacy concerns – making data 'safe' for use.  It involves removing or altering information that identifies an individual or is reasonably likely to enable their identification (e.g. name, date of birth, address).  

When that treatment is comprehensive and irreversible, de-identified information will not constitute 'personal information' for the purposes of the Privacy Act 1988 (Cth), and will therefore fall outside the scope of that Act.  In certain circumstances, de-identified information can be used with lower legal risk.
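
To make the mechanics concrete, the sketch below (illustrative only, with hypothetical field names of our own choosing) shows the basic step of removing direct identifiers from a patient record in Python.  It also deliberately highlights the limitation discussed next: stripping obvious fields does nothing about identifying information embedded in the scan itself.

```python
# Illustrative only: removing direct identifiers from a patient record.
# Field names here are hypothetical; real datasets will differ.

DIRECT_IDENTIFIERS = {"name", "date_of_birth", "address", "medicare_number"}

def strip_direct_identifiers(record: dict) -> dict:
    """Return a copy of the record with direct identifier fields removed.

    Note: this does nothing about indirect identifiers (e.g. a rare
    condition) or information burned into the scan image itself.
    """
    return {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}

record = {
    "name": "Jane Citizen",
    "date_of_birth": "1980-01-01",
    "address": "1 Example St, Brisbane",
    "scan_path": "scans/chest_xray_001.png",
    "diagnosis": "rare condition X",
}

print(strip_direct_identifiers(record))
# {'scan_path': 'scans/chest_xray_001.png', 'diagnosis': 'rare condition X'}
```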

However, de-identified patient data could still be personal information in ways that are yet to be fully understood.  

In the context of x-ray scans, some direct identifiers are 'burned-in' (i.e. the information is recorded in the pixel data of the scan itself rather than in separate metadata, so it cannot be removed simply by deleting data fields).  

Physically cropping an image to remove the accompanying text is often used as a method to de-identify a scan.  But an image may still be identifiable if someone knows the patient or their medical history, or if the condition is rare and distinctive enough to be recognised from the scan alone.  For example, retina and facial scans can themselves be used as methods of authentication.  If a retina scan was used to train an AI model, could that information still be considered identifiable, having regard to facial and retina scanning technology?  
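
By way of illustration, the minimal sketch below uses the open-source pydicom library (our choice of tool, not one referenced above) to blank common identifying metadata tags in a medical image file.  Note the limitation it flags: text burned into the pixels needs image-level redaction (such as cropping or masking), which metadata editing cannot achieve.

```python
# Minimal sketch using the open-source pydicom library (an assumption,
# not a tool referenced in this article). It blanks common patient
# metadata tags; text burned into the pixel data itself can only be
# removed by editing the image (e.g. cropping or masking).
import pydicom

def deidentify_metadata(path_in: str, path_out: str) -> None:
    ds = pydicom.dcmread(path_in)

    # Blank direct identifiers stored as metadata (non-exhaustive list).
    for keyword in ("PatientName", "PatientID",
                    "PatientBirthDate", "PatientAddress"):
        if keyword in ds:
            setattr(ds, keyword, "")

    ds.remove_private_tags()  # vendor-specific tags may also identify

    # Burned-in annotations live in the pixels, not the metadata.
    if str(ds.get("BurnedInAnnotation", "")).upper() == "YES":
        print("Warning: burned-in text present; image-level redaction needed")

    ds.save_as(path_out)

deidentify_metadata("scan_raw.dcm", "scan_deidentified.dcm")
```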

AI models may also be able to identify an individual from 'de-identified' medical scans like MRIs or CTs based on features that are, quite literally, under the surface.  

Fingerprint or false alarm?

Fingerprints are one of the most recognisable forms of biometric data. They are unique to each individual, and therefore treated as an identifier in their own right.  

Similarly, a medical scan could hold unique, individualised features (e.g. specific blood vessel structures, teeth and bone alignment): features that a human reviewer may not notice, but that an AI model might know to look for.  

Even if an image does not come with a name tag, the body’s unique structures still hold some secrets that the right AI algorithm might uncover.
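
To illustrate the 'fingerprint' risk in the abstract, the sketch below uses synthetic data (random vectors standing in for the output of a real imaging model) to show how nearest-neighbour matching could re-link a 'de-identified' scan to a named record in an identified reference set.

```python
# Illustrative only: synthetic feature vectors stand in for the output
# of a real imaging model. The point: if anatomy yields a stable feature
# vector, nearest-neighbour search can re-link an "anonymous" scan to an
# identified reference scan.
import numpy as np

rng = np.random.default_rng(0)

# A hypothetical identified reference set: one feature vector per patient.
reference = {f"patient_{i}": rng.normal(size=128) for i in range(100)}

# A "de-identified" scan: the same patient's anatomy imaged again, so its
# features sit close to their reference vector, plus some noise.
true_id = "patient_42"
query = reference[true_id] + rng.normal(scale=0.1, size=128)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

best = max(reference, key=lambda pid: cosine(reference[pid], query))
print(best)  # patient_42 -- the "anonymous" scan is re-linked
```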

Where AI is not simply taking the pulse of patient data but diagnosing something more personal, medical images that are currently considered 'anonymous' could be hiding something much more powerful.  

If de-identified scans can be used to identify an individual, even without explicit identifiers, that data should be treated with the same caution as biometric markers.  

The AI upshot: Privacy or progress?

There is no doubt that AI is revolutionising healthcare, from diagnosing disease via less invasive means to predicting medical conditions before a symptom arises.  

But with every AI advancement, it is important to check that privacy is not left behind in the waiting room.

In a speech on 13 February 2025, Australian Information Commissioner Elizabeth Tydd delivered a privacy prescription for the digital age.  The formula: transparency, regulatory cohesion, and regulatory effectiveness.  

The OAIC might even consider introducing privacy principles that recognise and protect the unique biometric 'fingerprint' in every medical image.  We will provide updates as they come to hand.

What is the treatment plan for this privacy dilemma?

In the meantime, it is crucial that medical technology companies eager to use de-identified patient data to train AI models tread carefully with both the data and the law.

A 'privacy by design' approach prescribes that developers of technology which uses personal information (including medical data) should adopt:

  • verifiable data provenance and governance measures, ensuring that personal information is collected only by lawful and fair means.  For example:
    • depending on the circumstances, it may be unlawful to create datasets by scraping or otherwise compiling information available online (even if accessible to the public);
    • if the information is provided by a third party (e.g. a hospital or health service), examine: if it was collected directly from the individual or indirectly through another source; the circumstances of collection; and whether the AI use is for the primary purpose of collection or a related secondary purpose; and
    • if the information contains any sensitive information (e.g. health and biometric information), patient consent is generally required before collection;
  • robust de-identification processes, ensuring that data is stripped of all direct identifiers, as well as indirect identifiers that could lead to re-identification;
  • regular audits to assess the risk of re-identification, particularly as AI models become more sophisticated (a simple illustrative check is sketched after this list);
  • transparent consent mechanisms, allowing patients to opt-in or opt-out of the use of their data for AI training;
  • use of only de-identified data sets for training of AI models, with strict access controls; and
  • that any data used is securely stored and protected from unauthorised access.
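
As one concrete example of the 'regular audits' item above, the sketch below (illustrative, with hypothetical quasi-identifier fields) counts how many records share each combination of indirect identifiers.  Combinations held by only one or a few patients are flagged as re-identification risks, in the spirit of a k-anonymity check.

```python
# Illustrative k-anonymity-style check over hypothetical quasi-identifiers.
# Records whose quasi-identifier combination is shared by fewer than k
# individuals are flagged as re-identification risks.
from collections import Counter

QUASI_IDENTIFIERS = ("postcode", "year_of_birth", "condition")

def flag_risky_records(records: list[dict], k: int = 5) -> list[dict]:
    counts = Counter(tuple(r[q] for q in QUASI_IDENTIFIERS) for r in records)
    return [r for r in records
            if counts[tuple(r[q] for q in QUASI_IDENTIFIERS)] < k]

records = [
    {"postcode": "4000", "year_of_birth": 1980, "condition": "asthma"},
    {"postcode": "4000", "year_of_birth": 1980, "condition": "asthma"},
    {"postcode": "4870", "year_of_birth": 1975, "condition": "rare condition X"},
]

for r in flag_risky_records(records, k=2):
    print("At risk:", r)
# Only the unique combination (4870, 1975, rare condition X) is flagged.
```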

Avoid testing the patience of your patients

Where complex AI technologies and data supply chains are involved, steps taken to de-identify personal information may not always be effective.  

If there is any doubt about whether the Privacy Act applies to specific AI-related activities, best practice is to err on the side of caution and assume it does apply.

The AI world is changing fast, and it is vital to properly diagnose how medical data is being de-identified, protected, and used.  

While advancing healthcare outcomes, medtech companies need to balance innovation with responsibility, ensuring that privacy remains a top priority.  Follow the treatment plan above, or contact our team for assistance.

Author

Hannah Fas | Senior Associate | +61 7 3338 7507 | hfas@tglaw.com.au

A special thanks to Summer Clerk Kyuree Han for her assistance in putting this article together.
