Tonic.ai product updates: July 2024

Author

Chiara Colombi

July 31, 2024

We're excited to share the latest updates and announcements designed to improve your experience with our products. This month's issue includes:

July 2024 headline: Tonic Textual’s new Pipeline workflow for LLMs 💠
The data quality is in the details
- Customizable and AI-enhanced sensitivity scans capture more in Tonic Structural 🔍
- Tonic Ephemeral’s usage report provides granular visibility into value 📊
Meeting your data where it is
- SAP ASE and Yugabyte connectors in Tonic Structural 🔌
- Free credits with Tonic.ai products on AWS 🤑
- Deploying Tonic Ephemeral on Azure and Google Cloud ☁️
- Textual Snowflake Native App enhancements ❄️

💠 Textual's new Pipeline workflow for LLMs 💠

Ever feel like your RAG system is a drag? It could be the data you're using to power it. We recently released a major update to Tonic Textual that enables you to create scalable unstructured data pipelines to feed your vector database.

Textual Pipelines allow you to easily prepare and augment your unstructured data for RAG and LLM applications. After integrating directly with your knowledge store, Textual automatically parses and normalizes your unstructured data in any file format. While the data is in flight, our NER models spring into action to extract semantic metadata and to detect and then redact or synthesize sensitive data. Finally, you can customize your chunking strategy and use the embedding model of your choice to load clean, secured, and enriched data into your vector store.
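To make the redact-then-chunk ordering described above concrete, here is a minimal Python sketch. It is not Textual's API: the function names (`redact`, `chunk`, `prepare`) are hypothetical, a single email regex stands in for the NER models, and fixed-size character chunks with overlap stand in for a configurable chunking strategy. The embedding and vector store steps are left out.

```python
import re

# Stand-in for an NER model (assumption): one regex that finds email addresses.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact(text: str) -> str:
    """Replace each detected email address with a placeholder token."""
    return EMAIL_PATTERN.sub("[EMAIL]", text)


def chunk(text: str, size: int, overlap: int) -> list[str]:
    """Split text into fixed-size character chunks that overlap by `overlap`."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last chunk already reaches the end of the text
        start += step
    return chunks


def prepare(document: str, size: int = 200, overlap: int = 20) -> list[str]:
    """Redact first, then chunk, so no chunk ever holds the raw sensitive text."""
    return chunk(redact(document), size, overlap)
```

Redacting before chunking is the design choice worth noting: if the order were reversed, a sensitive value could be split across a chunk boundary and slip past detection.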

With Textual, you can release your RAG systems into production with confidence, knowing that they will always have access to the freshest, highest-quality data, and that any sensitive information is protected from unintended eyes or data leakage. Sign up for a free trial to get started today.

The data quality is in the details...

Structural's customizable sensitivity scan 🔍

Tonic Structural’s sensitivity scans are configured to catch the most common types of sensitive data that we see across our customers. However, proprietary data often includes sensitive data types that are unique to your organization. To help you detect the data that is sensitive specifically for you, we’ve introduced the ability to add custom logic that identifies sensitive data in your workspaces.

Structural users can add rules to the sensitivity scan to automate generator recommendations for columns that the standard scan doesn’t detect. Each rule pairs text matching criteria for the column name (for example, STARTS WITH or CONTAINS) with a generator recommendation of the user’s choice.
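As a rough illustration of how rules like these behave, the Python sketch below matches column names against STARTS WITH and CONTAINS criteria and returns the paired generator recommendation. This is not Structural's implementation or API: the `Rule` class, the `recommend` function, the generator names, the case-insensitive matching, and the first-match-wins ordering are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    match_type: str  # "STARTS WITH" or "CONTAINS"
    pattern: str     # text to look for in the column name
    generator: str   # generator to recommend (placeholder names only)


def matches(rule: Rule, column_name: str) -> bool:
    """Check one rule against a column name (case-insensitive by assumption)."""
    name = column_name.lower()
    pattern = rule.pattern.lower()
    if rule.match_type == "STARTS WITH":
        return name.startswith(pattern)
    if rule.match_type == "CONTAINS":
        return pattern in name
    raise ValueError(f"unsupported match type: {rule.match_type}")


def recommend(rules: list[Rule], column_name: str) -> Optional[str]:
    """Return the generator from the first matching rule, or None if none match."""
    for rule in rules:
        if matches(rule, column_name):
            return rule.generator
    return None
```

For example, with a rule `Rule("STARTS WITH", "emp_", "ExampleIdGenerator")`, calling `recommend` on a column named `emp_number` returns `"ExampleIdGenerator"`, while a column named `created_at` returns `None`.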

While you flex this new feature in your workspaces, we’re already hard at work on the next iteration: enabling you to manage custom rules based on the row-level data. 👀

Screenshot: editing a custom sensitivity scan rule

Structural's AI-enhanced sensitivity scan 🦾

In addition to making Structural’s sensitivity scan customizable, we’re also uplevelling it with AI, incorporating machine learning algorithms to catch more of your sensitive data automatically. It’s early days yet, and we’re taking care to roll this out slowly—we’re in the business of keeping sensitive data out of AI, not tossing AI and PII into a mixing bowl willy-nilly.

If you’d like to learn more or join the customers testing out this feature, our Product team would love to connect with you. Your participation will help to ensure the feature accurately catches all of your sensitive data while providing the best possible generator recommendations. Connect with our team to learn more.

Track ROI with Ephemeral's usage report 📊

Tonic Ephemeral, our new solution for spinning up (and down) ephemeral databases on demand, now offers a usage report. This report includes information on database uptime and resource allocation so you can audit usage and manage costs. If you haven’t explored Ephemeral yet, you can get started for free today. Once you’ve poofed some databases into and out of existence, run the report and let us know what you think!

Meeting your data where it is...

SAP ASE and Yugabyte now on Structural 🔗

Our native database support in Tonic Structural continues to expand with the addition of two new connectors that epitomize our mission of offering far-reaching support for data of all shapes and sizes, from legacy enterprise systems to newer, cloud-native technologies. Please put your hands together and give a warm welcome to SAP ASE and YugabyteDB!

SAP ASE, formerly known as Sybase ASE, is a relational database server originally developed by Sybase Inc. and now maintained by SAP. Native support for de-identifying SAP ASE data with Tonic Structural is in active development. Interested in hooking up your data now? Connect with our team to get early access.

YugabyteDB is a high-performance transactional distributed SQL database for cloud-native applications. If you’re already taking advantage of this newer DB, or are considering making the switch, make sure to talk to your Tonic rep about integrating Yugabyte with Tonic.

Get credits for purchasing Tonic on AWS 🤑

Considering purchasing one of Tonic.ai’s products through the AWS Marketplace? In partnership with AWS, we’re now offering free credits on AWS for customers who purchase through the marketplace. Earn up to 2% back as AWS credits for signing an annual contract and transacting through the AWS Marketplace. Existing customers are also eligible for this offer for renewal contracts. In addition to the credits, transacting via AWS Marketplace can simplify procurement processes and allows you to apply a portion of Tonic.ai’s license costs to draw down existing AWS commitments. Ask your Tonic.ai representative for more details. This program runs until June 30, 2025.

Deploying Ephemeral on Azure and Google Cloud ☁️

Tonic Ephemeral now supports Microsoft Azure and Google Cloud, in addition to AWS, for self-hosted deployments. Ephemeral makes it easy for developers to spin up fully populated test and development databases for ephemeral test environments, so they can work more efficiently while keeping costs under control. We're currently offering free trials of Ephemeral; sign up to get started today.

Textual Snowflake Native App enhancements ❄️

The new Textual Snowflake Native App is here! We recently released a major update to the Textual Native App that brings Textual’s powerful pipelining capability to Snowflake. You can now use the TEXTUAL_PARSE() UDF in Snowflake worksheets to automatically parse documents from S3, structure the data into a common Markdown format, and load it into a Snowflake table. Then, use the TEXTUAL_REDACT() UDF to redact your sensitive text data and maintain data privacy. With this new workflow, your private data is prepared, enriched, and protected in Snowflake, ready to power your AI systems on Snowflake Cortex. If you’re building enterprise AI on Snowflake, check out our listing in the Snowflake Marketplace or contact us for a demo.

Small Updates; Big Impacts

Often it's the little things that matter most. Here's a roundup of our smaller releases:

- We’ve migrated our Oracle connector to Structural’s new data pipeline. In internal tests, we’re seeing speed improvements of up to 2x! Who knew Oracle could move so fast? 🤯
- Self-hosted instances of Structural can now schedule sensitivity scans to run automatically on a weekly basis. By default, the weekly scans are enabled and run each Sunday at midnight. 🧙
- Structural can now detect the following sensitivity types defined by the HIPAA Safe Harbor method: medical record numbers, health plan beneficiary numbers, account numbers, certificate and license numbers, web URLs, full face photographic images and similar images, and biometric identifiers, including finger and voice prints. Clean ALL the PHI! 🧹
- The Salesforce connector on Structural now supports the Continuous generator, the Algebraic generator, and WHERE clauses in the subsetting target table configuration.
- When configuring a workspace to write output to an Ephemeral snapshot, you can now optionally configure the compute resources. By default, the resources are based on the size of the source database.
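If you want to estimate when the next default weekly scan would fall, the Python sketch below computes the next Sunday at midnight from a given moment. It is only an illustration, not how Structural schedules scans: the function name is hypothetical, "midnight" is assumed to mean 00:00 at the start of Sunday, and time zones are ignored (naive datetimes).

```python
from datetime import datetime, timedelta


def next_sunday_midnight(now: datetime) -> datetime:
    """Return the first Sunday 00:00 strictly after `now` (naive datetimes)."""
    midnight_today = now.replace(hour=0, minute=0, second=0, microsecond=0)
    # datetime.weekday(): Monday is 0, Sunday is 6.
    days_ahead = (6 - now.weekday()) % 7
    candidate = midnight_today + timedelta(days=days_ahead)
    if candidate <= now:
        # Already at or past this Sunday's midnight, so move to next week.
        candidate += timedelta(days=7)
    return candidate
```

Called with `datetime(2024, 7, 31, 15, 30)`, a Wednesday, the function returns `datetime(2024, 8, 4, 0, 0)`.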

As always, we'd love to hear your feedback on our products. What do you need? What do you love? What could be better? Send us a note at hello@tonic.ai! And for all the latest updates, be sure to check out our complete release notes in our product docs.

Chiara Colombi

Director of Product Marketing

A bilingual wordsmith dedicated to the art of engineering with words, Chiara has over a decade of experience supporting corporate communications at multi-national companies. She once translated for the Pope; it has more overlap with translating for developers than you might think.