The role of AI and data in healthcare
As generative artificial intelligence advances, what once required vast resources and extensive expertise can now be achieved with smaller, more agile teams. At the heart of these innovations lies data, the essential ingredient that fuels AI-driven experiences. However, in heavily regulated industries like healthcare, leveraging data for AI presents unique challenges. Unlike less regulated sectors, where customer data can be used under looser guidelines, healthcare organizations must navigate stringent regulations, such as those established under the Health Insurance Portability and Accountability Act (HIPAA). This is where Tonic.ai comes into play.
Tonic.ai specializes in generating high-quality synthetic data that mimics real-world data while safeguarding patient privacy. Our platform not only helps companies ensure compliance with HIPAA but also goes beyond traditional data masking techniques. For example, Tonic.ai can help you build high-quality, HIPAA-compliant data pipelines to power your retrieval-augmented generation (RAG) experiences with data containing PHI. By packaging our synthetic data with Expert Determination services, we provide healthcare companies with a comprehensive solution that combines the de-identification of sensitive data with the rigorous standards required by HIPAA.
Today, Tonic.ai works with some of the nation's largest healthcare organizations, such as United Healthcare, CVS Health, Walgreens, Signify Health, Syneos Health, and many others.
Understanding HIPAA and healthcare data
HIPAA, enacted in 1996, establishes strong safeguards for protected health information (PHI) in the U.S. While HIPAA does not specifically mention AI, it applies to AI’s use in healthcare contexts. For instance, if a HIPAA-covered entity, like a healthcare provider, uses PHI to train AI models, it must comply with HIPAA’s privacy and security rules and ensure that the PHI is properly de-identified, as outlined by the U.S. Department of Health and Human Services. Similarly, if an AI company processes PHI on behalf of a HIPAA-covered entity, it becomes a business associate and is required to adhere to HIPAA regulations, including prohibitions against using non-de-identified health data to train generative AI models. This ensures that patients' health information remains protected and secure.
HIPAA de-identification: Safe Harbor vs. Expert Determination
HIPAA recognizes the utility of healthcare information and allows for PHI to be de-identified in two different ways, as specified in §164.514(a)-(b) of the regulations: Safe Harbor and Expert Determination.
Safe Harbor: a conservative approach
The Safe Harbor method involves removing every instance of 18 specified types of identifiers, such as names, Social Security numbers, and email addresses, from the dataset so that the data is safe from re-identification. While Safe Harbor provides strong privacy guarantees, it is very prescriptive, leaving less room for nuance in determining how to de-identify PHI. It can be an appropriate method when the de-identified data doesn’t need to stay highly faithful to the original data. But for use cases that do require a high degree of fidelity, for example, data used to train ML models, it may not be a suitable approach.
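To make the prescriptive nature of Safe Harbor concrete, here is a minimal Python sketch of rule-based de-identification on a single structured record. The field names (`name`, `ssn`, `zip`, and so on) and the function `safe_harbor_sketch` are hypothetical, the sketch covers only a handful of the 18 identifier types, and it is not a compliance tool or a description of how any Tonic.ai product works.

```python
from datetime import date

# Hypothetical field names for direct identifiers; a real schema will differ,
# and the full Safe Harbor list has 18 identifier types, not just these.
DIRECT_IDENTIFIER_FIELDS = {"name", "email", "phone", "ssn", "medical_record_number"}


def safe_harbor_sketch(record: dict) -> dict:
    """Return a copy of `record` with a few Safe Harbor-style rules applied."""
    cleaned = {}
    for key, value in record.items():
        if key in DIRECT_IDENTIFIER_FIELDS:
            continue  # drop direct identifiers outright
        if key == "zip":
            continue  # simplest conservative choice: drop the ZIP code entirely
        if isinstance(value, date):
            cleaned[key + "_year"] = value.year  # keep only the year of a date
        elif key == "age" and isinstance(value, int) and value > 89:
            cleaned[key] = "90+"  # aggregate ages over 89 into one category
        else:
            cleaned[key] = value
    return cleaned


example = {
    "name": "Jane Doe",
    "ssn": "000-00-0000",
    "zip": "94107",
    "age": 93,
    "admitted": date(2023, 5, 17),
    "diagnosis": "J45",
}
print(safe_harbor_sketch(example))
```

For the example record, only the age bucket, the admission year, and the diagnosis code survive. That loss of detail (exact dates, geography, exact ages) is what limits fidelity for use cases like model training.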
Expert Determination: a tailored solution
Expert Determination in HIPAA adopts a customized approach tailored to the specific dataset and intended use case, typically resulting in data with much higher utility. An expert, applying generally accepted statistical and scientific principles, determines a de-identification approach that ensures a very small risk of re-identification. However, the expert provides only the rules and process for de-identifying the data; the data owner is responsible for actually performing the de-identification, which can be both time-consuming and expensive without the appropriate software. Furthermore, changes in the data, the population on which the data is based, or new research findings can invalidate the determination, necessitating periodic reviews and updates. This is where Tonic.ai comes into play.
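As a rough illustration of the kind of statistical reasoning involved, the sketch below computes the size of the smallest group of records that share the same combination of quasi-identifiers, the quantity behind the k-anonymity measure. This is an assumption chosen for illustration: HIPAA does not prescribe a specific metric, an expert's actual analysis is broader, and the column names and the function `smallest_group_size` are hypothetical.

```python
from collections import Counter


def smallest_group_size(records: list, quasi_identifiers: list) -> int:
    """Size of the smallest group sharing the same quasi-identifier values.

    A result of k means every record is indistinguishable from at least
    k - 1 others on these columns. Returns 0 for an empty dataset.
    """
    groups = Counter(
        tuple(record[column] for column in quasi_identifiers) for record in records
    )
    return min(groups.values()) if groups else 0


# Hypothetical rows: birth year and 3-digit ZIP prefix act as quasi-identifiers.
rows = [
    {"birth_year": 1980, "zip3": "941", "diagnosis": "J45"},
    {"birth_year": 1980, "zip3": "941", "diagnosis": "E11"},
    {"birth_year": 1975, "zip3": "100", "diagnosis": "I10"},
]
print(smallest_group_size(rows, ["birth_year", "zip3"]))  # 1: one row is unique
```

A smallest group of size 1 means at least one person is unique on those columns. Because that number shifts as records are added or the underlying population changes, a determination may need to be revisited over time.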
Tonic.ai and Expert Determination
Using Tonic.ai's Textual platform in conjunction with Expert Determination offers a robust solution for healthcare organizations looking to leverage sensitive patient data in generative AI applications. The process begins by using Tonic Textual’s proprietary Named Entity Recognition (NER) models to identify PHI within unstructured data. An expert then assesses the efficacy of these models, taking into account the relationships and underlying PHI in the source data, to determine whether the redacted data complies with HIPAA regulations. Once the expert validates the approach based on the model's quality and the data itself, Textual can be employed to effectively de-identify the data. This allows organizations to safely and securely use their data for advanced AI development, all while maintaining the highest standards of privacy and regulatory compliance.
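The detect, evaluate, then redact flow described above can be sketched in a few lines of Python. This is a toy stand-in and not Tonic Textual's API: simple regular expressions take the place of NER models, and the names `PATTERNS`, `redact`, and `removal_rate` are hypothetical.

```python
import re

# Toy detectors standing in for NER models (an assumption for illustration).
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text: str):
    """Replace each detected entity with a [LABEL] token; return text and counts."""
    counts = {}
    for label, pattern in PATTERNS.items():
        text, found = pattern.subn(f"[{label}]", text)
        counts[label] = found
    return text, counts


def removal_rate(known_phi: set, redacted_text: str) -> float:
    """Fraction of known PHI strings that no longer appear after redaction."""
    if not known_phi:
        return 1.0
    removed = sum(1 for item in known_phi if item not in redacted_text)
    return removed / len(known_phi)


note = "Jane Doe can be reached at 555-867-5309 or jane@example.com."
redacted, counts = redact(note)
print(redacted)
print(removal_rate({"Jane Doe", "555-867-5309", "jane@example.com"}, redacted))
```

In this example the phone number and email address are replaced, but the patient name passes through untouched, so only two of the three known PHI strings are removed. Gaps like this are why detection quality has to be measured on representative, labeled data before anyone relies on the redacted output.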
The importance of Expert Determination in healthcare data
When opting for Expert Determination, it’s essential to understand exactly what the end result will be, under which circumstances it applies, and the conditions under which the determination is no longer valid. This ensures that your data remains compliant while retaining its usefulness for AI applications.
Conclusion: Developing with confidence
Handling healthcare data under HIPAA requires not just a commitment to legal compliance, but also a deep understanding of the implications for patient privacy and data utility. Safe Harbor offers a straightforward, prescriptive method of de-identification, while Expert Determination offers a more tailored and flexible solution. Tonic.ai's products support both methods of achieving HIPAA compliance, enabling you to choose the solution that best aligns with your specific data privacy requirements. By partnering with Tonic.ai, healthcare organizations can leverage synthetic data alongside Expert Determination to achieve both compliance and high data utility, empowering them to innovate without compromising on security or privacy. As you navigate these complexities, ensure that your data strategies are not only compliant but also resilient and adaptable to future challenges.