
Using synthesized data for Expert Determination in HIPAA

Author
Adam Kamor, PhD
September 4, 2024

    The role of AI and data in healthcare

    As generative artificial intelligence advances, work that once required vast resources and extensive expertise can now be done by smaller, more agile teams. At the heart of these innovations lies data, the essential ingredient that fuels AI-driven experiences. However, in heavily regulated industries like healthcare, leveraging data for AI presents unique challenges. Unlike less regulated sectors, where customer data can be used under looser guidelines, healthcare organizations must navigate strict regulations, such as those established under the Health Insurance Portability and Accountability Act (HIPAA). This is where Tonic.ai comes into play.

    Tonic.ai specializes in generating high-quality synthetic data that mimics real-world data while safeguarding patient privacy. Our platform not only helps companies comply with HIPAA but also goes beyond traditional data masking techniques. For example, Tonic.ai can help you build high-quality, HIPAA-compliant data pipelines that power retrieval-augmented generation (RAG) experiences with data containing PHI. By packaging our synthetic data with Expert Determination services, we provide healthcare companies with a comprehensive solution that combines the de-identification of sensitive data with the rigorous standards required by HIPAA.

    Today, Tonic.ai works with some of the nation's largest healthcare organizations, such as United Healthcare, CVS Health, Walgreens, Signify Health, Syneos Health, and many others. 

    Understanding HIPAA and healthcare data

    HIPAA, enacted in 1996, establishes strong safeguards for protected health information (PHI) in the U.S. While HIPAA does not specifically mention AI, it applies to AI’s use in healthcare contexts. For instance, if a HIPAA-covered entity, like a healthcare provider, uses PHI to train AI models, it must comply with HIPAA’s privacy and security rules and ensure that the PHI is properly de-identified, as outlined by the U.S. Department of Health and Human Services. Similarly, if an AI company processes PHI on behalf of a HIPAA-covered entity, it becomes a business associate and is required to adhere to HIPAA regulations, including prohibitions against using health data that has not been de-identified to train generative AI models. These requirements keep patients' health information protected and secure.

    HIPAA de-identification: Safe Harbor vs. Expert Determination

    HIPAA recognizes the utility of healthcare information and allows for PHI to be de-identified in two different ways, as specified in §164.514(a)-(b) of the regulations: Safe Harbor and Expert Determination.

    Safe Harbor: a conservative approach

    The Safe Harbor method involves removing any instance of 18 specific identifiers from the dataset to ensure the data is safe from re-identification. While Safe Harbor provides strong privacy guarantees, it is very prescriptive, leaving less room for nuance in determining how to de-identify PHI. This can be an appropriate method to use in instances where the de-identified data doesn’t need to meet a high degree of fidelity to the original data. But for use cases that do require a high degree of fidelity, for example, data used to train ML models, it may not be a suitable approach.
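
    To make the prescriptive nature of Safe Harbor concrete, here is a minimal sketch of what a rule-based pass over a single structured record might look like. The field names, the `safe_harbor_record` helper, and the ZIP prefix set are hypothetical placeholders, and the sketch covers only a few of the 18 identifier categories: direct identifiers are dropped, dates are reduced to the year, ages of 90 and over are grouped, and ZIP codes are cut to three digits (or to 000 where the three-digit area has 20,000 or fewer people). It is not a complete Safe Harbor implementation.

```python
from datetime import date

# Hypothetical field names for illustration only. A real dataset needs its own
# mapping that covers all 18 Safe Harbor identifier categories.
DIRECT_IDENTIFIER_FIELDS = {"name", "street_address", "phone", "email", "ssn", "mrn"}

# Placeholder set of 3-digit ZIP prefixes covering 20,000 or fewer people.
# In practice this list would be derived from current Census data.
SMALL_POPULATION_ZIP3 = {"036", "059", "102"}


def safe_harbor_record(record: dict) -> dict:
    """Return a copy of `record` with Safe Harbor style rules applied."""
    out = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIER_FIELDS:
            continue  # drop direct identifiers entirely
        if field == "zip":
            zip3 = str(value)[:3]
            out["zip3"] = "000" if zip3 in SMALL_POPULATION_ZIP3 else zip3
        elif field == "age":
            out["age"] = "90+" if value >= 90 else value
        elif isinstance(value, date):
            out[field + "_year"] = value.year  # keep only the year
        else:
            out[field] = value
    return out


if __name__ == "__main__":
    patient = {
        "name": "Jane Doe",
        "zip": "03601",
        "age": 93,
        "admitted": date(2023, 5, 17),
        "diagnosis": "J45",
    }
    print(safe_harbor_record(patient))
```

    The loss of fidelity is visible in the output: the admission date collapses to a year and the exact age disappears, which is often acceptable for test data but can hurt a model that depends on timing or age.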

    Expert Determination: a tailored solution

    Expert Determination in HIPAA takes a customized approach tailored to the specific dataset and intended use case, typically resulting in data with much higher utility. An expert, applying generally accepted statistical and scientific principles, determines an approach to de-identifying the data that ensures a very small risk of re-identification. However, the expert only provides the rules and process; the data owner is responsible for actually performing the de-identification, which can be both time-consuming and expensive without the appropriate software. Furthermore, changes in the data, the population on which the data is based, or new research findings can invalidate the determination, necessitating periodic reviews and updates. This is where Tonic.ai comes into play.
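
    HIPAA does not prescribe a specific statistic for Expert Determination, so the following is only an illustration of the kind of measurement an expert might consider. It counts how many records share each combination of quasi-identifiers (a k-anonymity style check) and flags records that fall into groups smaller than a chosen threshold. The function names, field names, and the threshold are assumptions made for this sketch.

```python
from collections import Counter


def _key(record, quasi_identifiers):
    """Tuple of the record's quasi-identifier values, used as a group key."""
    return tuple(record[q] for q in quasi_identifiers)


def smallest_group_size(records, quasi_identifiers):
    """Size of the smallest group of records sharing the same quasi-identifier values.

    Returns 0 for an empty dataset.
    """
    groups = Counter(_key(r, quasi_identifiers) for r in records)
    return min(groups.values()) if groups else 0


def records_below_k(records, quasi_identifiers, k):
    """Records whose quasi-identifier combination is shared by fewer than k records."""
    groups = Counter(_key(r, quasi_identifiers) for r in records)
    return [r for r in records if groups[_key(r, quasi_identifiers)] < k]


if __name__ == "__main__":
    data = [
        {"zip3": "303", "birth_year": 1980, "sex": "F"},
        {"zip3": "303", "birth_year": 1980, "sex": "F"},
        {"zip3": "303", "birth_year": 1975, "sex": "M"},
    ]
    qi = ["zip3", "birth_year", "sex"]
    print(smallest_group_size(data, qi))
    print(records_below_k(data, qi, k=2))
```

    A check like this also shows why a determination can go stale: if new rows arrive or the underlying population shifts, group sizes change and the earlier risk estimate may no longer hold.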

    Tonic.ai and Expert Determination

    Using Tonic.ai's Textual platform in conjunction with Expert Determination offers a robust solution for healthcare organizations looking to leverage sensitive patient data in generative AI applications. The process begins by using Tonic Textual’s proprietary Named Entity Recognition (NER) models to identify PHI within unstructured data. An expert then examines the efficacy of these models, taking into account the relationships and underlying PHI in the source data, and determines whether the redacted data complies with HIPAA regulations. Once the expert validates the approach based on the model's quality and the data itself, Textual can be employed to de-identify the data. This allows organizations to safely and securely use their data for advanced AI development, all while maintaining the highest standards of privacy and regulatory compliance.
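
    The detect, evaluate, redact flow described above can be sketched in a few lines. This example does not use Tonic Textual or its API: two regular expressions stand in for a trained NER model, and the `detect`, `redact`, and `recall` helpers are hypothetical names chosen for illustration. The point is the shape of the evaluation, in which detected spans are compared against a hand-labeled sample before redaction is trusted.

```python
import re

# Stand-in detector: two regexes in place of a trained NER model.
PATTERNS = {
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
}


def detect(text):
    """Return sorted (start, end, label) spans found by the stand-in detector."""
    spans = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            spans.append((match.start(), match.end(), label))
    return sorted(spans)


def redact(text, spans):
    """Replace each span with [LABEL]. Assumes spans do not overlap.

    Works right to left so earlier offsets stay valid after each replacement.
    """
    for start, end, label in sorted(spans, reverse=True):
        text = text[:start] + f"[{label}]" + text[end:]
    return text


def recall(predicted, labeled):
    """Fraction of hand-labeled spans fully covered by some predicted span."""
    if not labeled:
        return 1.0
    hits = sum(
        any(p_start <= start and end <= p_end for p_start, p_end, _ in predicted)
        for start, end, _ in labeled
    )
    return hits / len(labeled)


if __name__ == "__main__":
    note = "Seen on 3/14/2024, call 555-123-4567 for Jane Doe."
    found = detect(note)
    print(redact(note, found))

    def span(value, label):
        start = note.index(value)
        return (start, start + len(value), label)

    # Hand-labeled ground truth, including a name the stand-in detector cannot find.
    truth = [span("3/14/2024", "DATE"), span("555-123-4567", "PHONE"), span("Jane Doe", "NAME")]
    print(recall(found, truth))
```

    In this toy note the name goes undetected, so recall falls short of 1.0 and the redacted text still contains PHI. Gaps of this kind are what a review of model quality is meant to surface before the data is released.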

    The importance of Expert Determination in healthcare data

    When opting for Expert Determination, it’s essential to understand exactly what the end result will be, under which circumstances it applies, and the conditions under which the determination is no longer valid. This ensures that your data remains compliant while retaining its usefulness for AI applications.

    Conclusion: Developing with confidence

    Handling healthcare data under HIPAA requires not just a commitment to legal compliance, but also a deep understanding of the implications for patient privacy and data utility. Safe Harbor offers a straightforward, prescriptive method of de-identification, while Expert Determination offers a more tailored and flexible solution. Tonic.ai's products support both methods of achieving HIPAA compliance, enabling you to choose the solution that best aligns with your specific data privacy requirements. By partnering with Tonic.ai, healthcare organizations can leverage synthetic data alongside Expert Determination to achieve both compliance and high data utility, empowering them to innovate without compromising on security or privacy. As you navigate these complexities, ensure that your data strategies are not only compliant but also resilient and adaptable to future challenges.

    Adam Kamor, PhD
    Co-Founder & Head of Engineering
