Integrating Tonic Structural with Your Existing Tech Stack

What do I know about my business? What do I know about my customers?

Before you launch any new initiative, whether it's application development, marketing, or machine learning / AI, the starting point is always to take stock of what you know. That stock-taking begins with data discovery.

Too often, however, you are unable to proceed. Although you know what data is available, for valid reasons you might be barred from using it. Typical reasons include data sensitivity concerns, a consent agreement that does not allow re-use, or company policy that restricts the required access.

In many organisations, that might be the end of it. Effectively, the computer says no. If this is happening in your organisation, it sounds like you haven’t heard of Tonic Structural.

Tonic Structural allows you to create data resources that mimic your existing real-world datasets while addressing security and privacy concerns. Structural maintains the statistical properties and relationships in your data, but removes, or transforms beyond recognition, personally identifiable information (PII) and other unwanted attributes.

In this article, we focus on how you can integrate Tonic Structural into your systems and processes, and use its extensive APIs to maximise efficiency and value.

Understanding Tonic Structural

You can think of Tonic Structural as a data transformation platform that works with a wide range of data sources, including standard relational databases, Snowflake, Databricks, and flat files.

The output of the transformation is a destination database or the equivalent, which is structurally identical to the original, but populated with fully sanitised data, in line with your policies and requirements. It’s a powerful platform that lets you build and innovate in ways that were previously closed to you. You can now:

Use production-like data to build and enhance applications.
Develop proprietary machine learning models that are based on previously unavailable private data.
Use representative data to test performance.
Use actual but suitably redacted input to diagnose faults.

Structural is highly configurable. You select the required transformations that work best for you based on the automated privacy report that Structural generates and your own needs and constraints. A transformation might produce a realistic value, scramble the text, or null out the value. The configuration can also preserve data formats (such as timestamps), relationships between columns (for example, city and state always match) and between records (for example, a given name value always gets the same replacement).
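To make the idea of consistent replacement concrete, here is a minimal Python sketch. It is not Structural's implementation: the function names, the replacement pools, and the keyed-hash approach are all assumptions chosen for illustration. It shows how the same input can always map to the same replacement, and how a city and state can be replaced as one unit so that they always match.

```python
import hashlib
import hmac

# Hypothetical replacement pools; a real generator would draw from far larger ones.
REPLACEMENT_NAMES = ["Alex", "Blake", "Casey", "Devon", "Emery", "Finley"]
CITY_STATE_PAIRS = [("Austin", "TX"), ("Denver", "CO"), ("Portland", "OR")]


def _stable_index(value: str, secret: bytes, size: int) -> int:
    """Derive a repeatable index in range(size) from a value and a secret key."""
    digest = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big") % size


def consistent_replacement(value: str, secret: bytes, pool: list[str]) -> str:
    """Return a replacement from pool; the same value and secret give the same result."""
    return pool[_stable_index(value, secret, len(pool))]


def replace_city_state(city: str, state: str, secret: bytes) -> tuple[str, str]:
    """Replace city and state together so the output pair is always a valid match."""
    index = _stable_index(f"{city}|{state}", secret, len(CITY_STATE_PAIRS))
    return CITY_STATE_PAIRS[index]
```

Because the mapping depends only on the input value and the secret, every occurrence of a given name receives the same replacement across tables and runs, which is what keeps joins and aggregates meaningful in the output.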

If you don't need a full database, Structural can also output representative but referentially intact data subsets.
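As a rough illustration of what "referentially intact" means, the following Python sketch keeps a chosen set of child rows and then pulls in every parent row they reference. The table shapes, column names, and the function itself are hypothetical and are not part of Structural; they only show the property that a subset must preserve.

```python
def subset_with_parents(orders, customers, keep_order_ids):
    """Keep the chosen orders plus every customer that those orders reference."""
    kept_orders = [o for o in orders if o["id"] in keep_order_ids]
    needed_customer_ids = {o["customer_id"] for o in kept_orders}
    kept_customers = [c for c in customers if c["id"] in needed_customer_ids]
    return kept_orders, kept_customers


# Hypothetical sample data: orders point at customers through customer_id.
customers = [{"id": 1, "name": "A"}, {"id": 2, "name": "B"}, {"id": 3, "name": "C"}]
orders = [
    {"id": 10, "customer_id": 1},
    {"id": 11, "customer_id": 2},
    {"id": 12, "customer_id": 2},
    {"id": 13, "customer_id": 3},
]

kept_orders, kept_customers = subset_with_parents(orders, customers, {11, 12})
```

Every foreign key in the subset still resolves to a row in the subset, so applications and tests that join the two tables behave as they would against the full database.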

With our workspace inheritance option, you can use the same source data to produce slightly different output—different transformations or different subsets that are larger or smaller or that highlight different areas.

Structural can write the output data to a database server, or, for even easier re-use, to a data volume in a container repository.

The result is a suitably anonymised yet representative data resource that you can integrate with your development, test, machine learning, and AI processes and pipelines.

Assessing your current tech stack

Before you move forward with your Tonic Structural integration, ask yourself these key questions:

How will you use Tonic Structural?
What opportunities are there to streamline the process as much as possible?

Some examples:

Developers automatically ‘pull down’ containerised representative data subsets when they update their code. To ensure consistency, you can tag and version the containers.
Synthetic, anonymised, and redacted databases are used as part of your CI/CD pipelines and regression tests.
Machine learning models are kept up-to-date with the latest obfuscated data.

Tonic Structural integration options

Structural is designed with interoperability in mind. Some key features include:

A comprehensive REST API. Anything that you can do from the web console, you can execute from the REST API. REST offers a common interface that should be acceptable in nearly any context. You can sequence the API calls to support even the most complex orchestrations.
Post-execution webhooks and scripts for notifications and to initiate downstream processes.
Sandboxed ‘workspace’ environments that allow independent configuration, parallelisation, and control of multiple processes.
A comprehensive authorisation system that allows granular enforcement of least privilege.
SSO integration with all major providers.
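Sequencing API calls, as described above, might look something like the following Python sketch: start a data generation job for a workspace, then poll until it finishes. The endpoint paths, response field names, status values, and authorisation header format are placeholders assumed for illustration; consult the Structural API documentation for the real ones. The HTTP transport is passed in as a function so that the orchestration logic can be exercised without a live server.

```python
import json
import time
import urllib.request
from typing import Callable, Optional

Sender = Callable[[str, str, Optional[dict]], dict]

# Assumed terminal status values; the real API may use different names.
TERMINAL_STATUSES = ("Completed", "Failed", "Canceled")


def make_http_sender(base_url: str, api_key: str) -> Sender:
    """Build a sender that issues JSON requests (the header format is an assumption)."""
    def send(method: str, path: str, body: Optional[dict] = None) -> dict:
        data = json.dumps(body).encode("utf-8") if body is not None else None
        request = urllib.request.Request(
            base_url.rstrip("/") + path,
            data=data,
            method=method,
            headers={
                "Authorization": f"Apikey {api_key}",
                "Content-Type": "application/json",
            },
        )
        with urllib.request.urlopen(request) as response:
            return json.loads(response.read().decode("utf-8"))
    return send


def run_generation(send: Sender, workspace_id: str,
                   poll_seconds: float = 30.0, max_polls: int = 120) -> str:
    """Start a job and poll until it reaches a terminal status, then return that status."""
    # Hypothetical path and response field ("id").
    job = send("POST", f"/hypothetical/workspaces/{workspace_id}/generate", None)
    job_id = job["id"]
    for _ in range(max_polls):
        # Hypothetical path and response field ("status").
        status = send("GET", f"/hypothetical/jobs/{job_id}", None)["status"]
        if status in TERMINAL_STATUSES:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError(f"Job {job_id} did not finish after {max_polls} polls")
```

In a pipeline, you would call `run_generation(make_http_sender(base_url, api_key), workspace_id)` and fail the build if the returned status is anything other than the success value.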

The diagram below illustrates a workflow including a number of these features.

Example

You’re a major ad-tech specialist. You have a number of high-value customers for whom you support individualised segmentation and bid models. Ad-tech data is highly dynamic—typically discarded after 30 days. Keeping models up-to-date is critical.

Structural allows you to configure individual workspaces for each customer. Ad-tech data is usually anonymised already, but Structural lets you enforce that anonymisation, so that third-party distribution limitations are honoured. You could even offer this enforcement as a service. The data can be further transformed to remove superfluous or undesirable attributes.

The transformed data is used to train the bespoke models. You can integrate both the step that initiates the data transformation and the step that writes the job status back into your CI/CD pipeline. Once the data is ready, the model updates can be triggered. You can use webhooks to ensure that dashboards are up-to-date and show the build status, which can also be exposed to the customer. The diagram below provides a visual representation of the workflow.

Error handling is critical to the process. In a highly distributed system, site reliability engineers can be notified as soon as errors occur, which shortens response times, particularly in an industry where time is money. The visibility of customer data can be tightly controlled, or even removed entirely. Finally, you can version control the data resources to allow for quantitative analysis of the improvements.
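The notification flow in this example could be sketched as a small routing function that receives a job notification and decides what happens next. The payload fields (`status`, `workspace`, `error`), the status values, and the action names are all hypothetical; a real webhook payload will have its own shape, so treat this only as an outline of the decision logic.

```python
import json


def route_job_event(raw_body: bytes) -> dict:
    """Decide the follow-up action for a job notification (payload fields are assumed)."""
    event = json.loads(raw_body)
    status = event.get("status")
    workspace = event.get("workspace", "unknown")

    if status == "Completed":
        # Fresh data is ready, so the customer's model can be retrained.
        return {"action": "trigger_model_update", "workspace": workspace}
    if status == "Failed":
        # Surface the failure to the on-call engineer straight away.
        return {
            "action": "notify_sre",
            "workspace": workspace,
            "detail": event.get("error", ""),
        }
    # Any other status only refreshes the build dashboard.
    return {"action": "update_dashboard", "workspace": workspace, "status": status}
```

Keeping the routing in a pure function like this makes it easy to unit test each path before you connect it to a real webhook endpoint, a pager, or a dashboard.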

Tips for a smooth Tonic Structural integration

When you select a solution, coverage and extensibility are critical. Because successful integration is often a combinatorial exercise, you need tooling that offers the widest range of integrations—it won’t be long before a new requirement or data source gets added to the mix. It also likely won’t be long before you either congratulate yourself for your foresight in selecting a future-proofed tool, or you wish that you’d given the possibility more consideration.

Tonic Structural can be applied to 18 different types of data sources, with new sources always on our road map.

Similarly, the choice of REST as an interface means that you can integrate Structural within any programming framework.

Finally, our SSO support is as comprehensive as you’ll find, and we’re always responding to new providers.

Conclusion

Tonic Structural can genuinely unlock your data for you. The careful thought that we put into our integration endpoints means that you can quickly hook it into your current environment and then carry on with your core business while it does the heavy lifting for you.

To learn more about Tonic Structural’s integration into your workflows, connect with our team.
