Healthcare data is among the most sensitive data there is. No one wants their healthcare history made public. And artificial intelligence (AI) presents yet another source of risk, as organizations use large volumes of data to develop and train AI models and applications for purposes such as patient support and medical research.
However, even if doctors and insurance companies follow HIPAA guidelines to protect patient data, there are other sources of healthcare data that don't fall under the HIPAA umbrella.
In this guide, we'll talk about the limitations of HIPAA and how organizations, including those that HIPAA does not cover, can ensure that they adequately protect private healthcare data.
HIPAA doesn't cover everything
When you think of protecting healthcare information, HIPAA is the first thing that comes to mind. We've all had to sign HIPAA statements at the doctor's office.
However, HIPAA might not cover every organization that gathers and uses healthcare data.
What is HIPAA and what does it cover?
The Health Insurance Portability and Accountability Act (HIPAA) of 1996 establishes federal standards to protect sensitive health information from disclosure without the patient's consent.
HIPAA applies to healthcare providers and health plans (doctors and insurance companies), as well as other entities that support those providers and plans, such as the billing service that a medical practice uses to send bills and process payments.
HIPAA loopholes and their privacy risks
However, other types of organizations might legitimately gather healthcare data without being subject to HIPAA. And if they misuse, share, or otherwise fail to protect your healthcare information, you might find your private healthcare data used in ways that you never approved or wanted.
You might fill out a healthcare waiver before you board an amusement park ride, undergo a spa treatment, or get a tattoo. These waivers ask about your current health conditions and medications, but the companies that require them might not count as healthcare providers.
And it might be fun to get your DNA tested and figure out whether some of your traits and tics are genetic, but the companies that run those tests are also not subject to HIPAA regulations. They could very well sell this treasure trove of health details to other organizations to plug into an AI application.
Or you might answer a survey from a student who is writing a paper about health issues in their hometown, and plans to feed that information to a machine learning algorithm to generate their analysis. Once again, the information gathered in this type of scenario is not subject to HIPAA.
Protect patients regardless of regulatory loopholes
Your organization might or might not fall under HIPAA. But if you gather, share, or use any kind of personal healthcare data, you must protect that data and ensure that patients have control over how it is used.
Here, at a glance, are some basic steps to ensure data security and patient control:
- Consult patients about sharing and using data - Before you gather and use data, inform individuals of what you are gathering and how you plan to use it.
- Discuss the risks associated with data sharing - As you explain how you plan to use their data, make sure that they understand the risks of sharing data. Also explain how you plan to mitigate those risks.
- Obtain informed consent for data usage - Once they understand the risks, and understand exactly how you will use their healthcare data, get their consent for that usage. Make sure that they understand that they are providing consent and what they are consenting to.
- Adhere to privacy regulations - Even if you are not subject to regulations such as HIPAA, follow those regulations closely. Your policies and processes should meet, and ideally exceed, their requirements.
- Consider allowing individuals to see the output of AI models - Patients might feel more secure about sharing data if they can see for themselves how it is used. Give them early access to that new chatbot or those analytic charts so that they can see the value and also can verify that it doesn't reveal their identity.
Let's look at some standards and tools that can help you in your efforts to protect healthcare data, especially when you plan to use it for AI.
RAISE benchmarks
The Responsible Artificial Intelligence Institute (RAI Institute) is a non-profit organization that provides tools for responsible AI oversight and compliance.
In late 2023, it introduced the Responsible Artificial Intelligence Safety and Effectiveness (RAISE) benchmarks, which help organizations establish policies for the safe and secure use of AI:
- RAISE AI Policy Benchmark
- RAISE LLM Hallucinations Benchmark
- RAISE Vendor Alignment Benchmark
Following these benchmarks can help ensure that you have policies and procedures in place to protect data that you use for AI.
RAISE AI Policy Benchmark
This benchmark evaluates the comprehensiveness of a company's AI policies. It measures their scope and alignment with the RAI Institute's model enterprise AI policy.
This policy is based on the AI Risk Management Framework (RMF) from the National Institute of Standards and Technology (NIST).
It guides organizations to frame AI policies that address risks from generative AI and large language models (LLMs).
RAISE LLM Hallucinations Benchmark
Hallucinations are a well-known risk of AI. AI tools are capable of spouting information that is misleading, incorrect, or both. And they can do so with an air of perfect confidence.
This benchmark helps organizations to assess the risk of hallucinations and to take measures to minimize that risk.
RAISE Vendor Alignment Benchmark
Even if you have an effective AI policy, organizations that you work with might not.
This benchmark assesses whether a supplier's policies align with the ethical and responsible AI policies of the organizations that purchase from it.
For example, is the AI developer that you share data with as committed as you are to protecting that data?
The benchmark ensures that a vendor's AI practices are in line with the values and expectations of the businesses it serves, so that data maintains the same level of protection throughout its lifecycle.
GDPR protections
In the European Union (EU), the General Data Protection Regulation (GDPR) provides strict regulations for overall data privacy.
Unlike HIPAA, GDPR applies to all personal data, not just healthcare information.
GDPR also applies to any organization that collects personal data from EU residents, regardless of where that organization is based, and is not limited to healthcare providers and plans.
To better protect personal healthcare data, even if you don't do business in the EU, you should ensure that your data privacy controls meet the more stringent GDPR standards.
Use synthetic data to safeguard patient privacy
Synthetic data tools allow you to use realistic data that is free from personal information. You can use these tools to strip out identifying information from private healthcare data before you feed it into an AI algorithm.
Tonic Structural can de-identify data in databases and text files. Structural scans the data to identify sensitive values. You can then quickly replace all of those values with realistic alternatives (Michael becomes John, identifiers are scrambled, and so on).
If you instead have large volumes of files such as PDFs, images, and even .docx files—like patient notes—then Tonic Textual can help. Textual also detects and replaces sensitive values. You can then download the redacted files to use for AI development and model training.
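To make the idea concrete, here is a minimal sketch of the kind of substitution that these tools automate, applied to both a structured record and a free-text note. This is not the Tonic API: the record, the note, the patterns, and the replacement values are all hypothetical, and a real de-identification tool detects sensitive values automatically and generates realistic, internally consistent substitutes rather than fixed placeholders.

```python
import re

# Hypothetical patient record and clinical note; all names, IDs, and
# patterns below are made up for illustration only.
record = {"name": "Michael Alvarez", "mrn": "MRN-483920", "diagnosis": "Type 2 diabetes"}
note = "Patient Michael Alvarez (MRN-483920) reports improved glucose control."

# Simple replacement map and pattern; a production tool would generate
# realistic substitutes and keep them consistent across datasets.
name_map = {"Michael Alvarez": "John Keller"}
mrn_pattern = re.compile(r"MRN-\d{6}")

def deidentify_text(text: str) -> str:
    """Replace known names and medical record numbers in free text."""
    for real, fake in name_map.items():
        text = text.replace(real, fake)
    return mrn_pattern.sub("MRN-000000", text)

def deidentify_record(rec: dict) -> dict:
    """Replace sensitive fields in a structured record."""
    clean = dict(rec)
    clean["name"] = name_map.get(rec["name"], "REDACTED")
    clean["mrn"] = "MRN-000000"
    return clean

print(deidentify_record(record))
print(deidentify_text(note))
```

The de-identified output keeps the shape and realism of the original data, which is what makes it usable for AI development and model training without exposing the underlying patients.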
Conclusion
Personal healthcare data is one source that has been used for AI training and development. It is highly sensitive, and HIPAA does not fully cover all possible healthcare data sources and uses.
Regardless of whether your organization is legally required to follow HIPAA, it is imperative to protect personal healthcare data that you collect, share, and use. You absolutely need to develop and follow stringent data privacy and protection policies.
You can also use Tonic.ai's synthetic data tools, Tonic Structural and Tonic Textual, to identify, remove, and replace sensitive values in healthcare data. The de-identified data is then safe to use for purposes that include AI tool and model development.
To learn more about Structural de-identification and Textual file redaction, connect with our team or start a free trial of Structural or Textual.