Tonic Textual release information

Learn about what’s in the latest Tonic.ai product releases.
v229
February 20, 2025

Bug fixes and other internal updates.

v228
February 18, 2025

Bug fixes and other internal updates.

v227
February 15, 2025

Bug fixes and other internal updates.

v226
February 15, 2025

Bug fixes and other internal updates.

v225
February 14, 2025

Bug fixes and other internal updates.

v224
February 12, 2025

On the Home page, in addition to typing or pasting text to redact, you can now upload a file.

v223
February 12, 2025

Improved the detection of URLs and email addresses.

The URL for the generated Textual SDK documentation is changed to https://tonic-textual-sdk.readthedocs-hosted.com/en/latest/index.html.

v222
February 10, 2025

Upgraded libraries to address security vulnerabilities.

v221
February 6, 2025

Bug fixes and other internal updates.

v220
February 5, 2025

Improved detection of signature blocks in PDFs.

v219
February 5, 2025

On self-hosted instances, when a dataset file fails to upload, you can now download the associated logs.

v218
February 3, 2025

For .docx dataset files, the image handling configuration is now also applied to SVGs.

The new Getting Started checklist guides users through the initial tasks to preview redaction, set up API access, install the SDK, and send a redaction request.

v217
January 30, 2025

Bug fixes and other internal updates.

v216
January 30, 2025

Bug fixes and other internal updates.

v215
January 27, 2025

Bug fixes and other internal updates.

v214
January 24, 2025

Custom entity types - You can now define custom entity types, to identify sets of values that are unique to your industry or organization. For each custom entity type, you provide one or more regular expressions to define the matching values. You can then enable or disable the entity type for each dataset and pipeline.
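
As a rough illustration of regex-based matching (not Textual's implementation), the following sketch models a custom entity type as a name plus a list of regular expressions. The CustomEntityType class, the EMPLOYEE_ID type, and its pattern are hypothetical names chosen for this example.

```python
import re
from dataclasses import dataclass


@dataclass
class CustomEntityType:
    """Hypothetical stand-in for a custom entity type: a name plus regexes."""
    name: str
    patterns: list[str]
    enabled: bool = True  # mirrors enabling/disabling the type per dataset

    def find(self, text: str) -> list[tuple[int, int, str]]:
        """Return (start, end, value) for every match of any pattern."""
        if not self.enabled:
            return []
        matches = []
        for pattern in self.patterns:
            for m in re.finditer(pattern, text):
                matches.append((m.start(), m.end(), m.group()))
        return sorted(matches)


# Hypothetical employee ID format: "EMP-" followed by five digits.
employee_id = CustomEntityType("EMPLOYEE_ID", [r"\bEMP-\d{5}\b"])
found = employee_id.find("Ticket opened by EMP-00421 and EMP-10007.")
```

Setting enabled to False makes find return an empty list, which loosely corresponds to disabling the entity type for a dataset or pipeline.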

v213
January 24, 2025

Added support for additional languages.

v212
January 24, 2025

Fixed an issue that caused slow load times for customers with large datasets. Calls to GET /api/dataset and GET /api/dataset/{datasetid} no longer return entity information for the dataset files. Instead, the new GET /api/dataset/{datasetid}/pii_info endpoint returns the entity information for a dataset’s files.
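
For callers that previously read entity information from the dataset response, the change amounts to one additional request. The sketch below only builds the two URLs from the paths named above. The base URL is a placeholder, and authentication and the response format are not shown because they are not described here.

```python
from urllib.parse import quote


def dataset_url(base_url: str, dataset_id: str) -> str:
    """URL for GET /api/dataset/{datasetid} (no longer returns entity info)."""
    return f"{base_url.rstrip('/')}/api/dataset/{quote(dataset_id, safe='')}"


def dataset_pii_info_url(base_url: str, dataset_id: str) -> str:
    """URL for GET /api/dataset/{datasetid}/pii_info (entity info for the files)."""
    return dataset_url(base_url, dataset_id) + "/pii_info"


# Placeholder host and dataset ID, for illustration only.
BASE_URL = "https://textual.example.com"
metadata_url = dataset_url(BASE_URL, "abc123")
entities_url = dataset_pii_info_url(BASE_URL, "abc123")
```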

A new dataset settings option controls the output in .docx tables. By default, table content goes through the regular scan and redaction process, and detected entity values are handled based on the dataset’s entity type handling configuration. You can also choose to completely block out all table cells, in which case each table cell is covered by a black box.

v211
January 21, 2025

Bug fixes and other internal updates.

v210
January 15, 2025

File statistics for pipelines - The pipeline details page now displays a summary of information about the pipeline's files, including the number of files, the number of words in the files, the number of detected entity types, and the number of detected topics. For entity types, the display includes the number of detected values for each type. For topics, the display includes the number of files that involve each topic.

On the dataset details page, the preview count for each entity type now reflects the count of values that are assigned that type in the output files. Previously, values that matched multiple entity types were included in the preview count for all of the matching types.

v209
January 10, 2025

Bug fixes and other internal updates.

v208
January 8, 2025

Bug fixes and other internal updates.

v207
January 7, 2025

Bug fixes and other internal updates.

v206
January 7, 2025

Bug fixes and other internal updates.

v204
January 3, 2025

Bug fixes and other internal updates.

v203
December 29, 2024

New Textual Home page - The Textual Home page now contains an updated version of the Playground, where you can see how Textual detects and replaces entity values in text. There is no longer a separate Playground page, and there is no LLM Synthesis option. For each entity type, you can configure handling options and added or excluded values. Textual generates Python and cURL versions of the request that you can copy.

v202
December 17, 2024

Improved handling of fillable PDF forms.

v201
December 13, 2024

From the Request Explorer, in addition to testing added and excluded values, you can now also select the handling type for each entity type. The Unified toggle is replaced with options to view either the original values with their corresponding types (Identification) or the actual output with the replacement values (Replacement).

v200
December 12, 2024

Regular expression-based email address detection no longer validates the domain name, which makes the detection more general.
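
To illustrate what a more general pattern can look like, here is a minimal sketch of an email regular expression that checks only the overall shape of the address and does not validate the domain (for example, it does not require a known top-level domain). The pattern is an assumption for illustration, not the expression that Textual uses.

```python
import re

# Local part, "@", then one or more dot-separated labels.
# No check that the domain exists or ends in a known top-level domain.
EMAIL_PATTERN = re.compile(
    r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*"
)

sample = "Mail ops@internal-host or a.b@example.com."
emails = EMAIL_PATTERN.findall(sample)
```

Because the domain is not validated, an internal address such as ops@internal-host is matched alongside a conventional one.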

v199
December 11, 2024

Edit and replay recorded requests - When you use the Request Explorer to preview a recorded redaction request from the SDK, you can now edit the request to add and exclude entity values. You can then re-run the redaction and view the differences between the original request and the edited request.

When you configure excluded values for an entity type, you can block detection of a specific type within a matching phrase. For example, if you add the phrase "one moment, please" as an excluded value for numeric values, the word "one" is not detected as a numeric value in that specific context.
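
The phrase-level behavior can be pictured as a filter over detected spans. The sketch below is one possible reading, not Textual's implementation: it assumes half-open (start, end) character spans and case-insensitive phrase matching, and the function name is hypothetical.

```python
import re


def apply_excluded_phrases(text, spans, excluded_phrases):
    """Drop detected spans that fall entirely inside an excluded phrase.

    spans: list of (start, end) offsets for one entity type.
    excluded_phrases: phrases in which that type should not be detected.
    """
    blocked = []
    for phrase in excluded_phrases:
        # Assumption: phrases match literally and case-insensitively.
        for m in re.finditer(re.escape(phrase), text, flags=re.IGNORECASE):
            blocked.append((m.start(), m.end()))
    return [
        (start, end)
        for start, end in spans
        if not any(b_start <= start and end <= b_end for b_start, b_end in blocked)
    ]


text = "One moment, please. I ordered one pizza."
# Stand-in for numeric detection: every standalone "one".
numeric_spans = [(m.start(), m.end()) for m in re.finditer(r"\bone\b", text, re.IGNORECASE)]
kept = apply_excluded_phrases(text, numeric_spans, ["one moment, please"])
```

Only the second "one" survives, because the first sits inside the excluded phrase.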

v198
December 4, 2024

For images in dataset .docx files, added a replace option that replaces the images with black rectangles instead of scanning the images for sensitive values.

v197
December 3, 2024

Fixed an issue that prevented users from saving the dataset configuration for .docx comments.

Record and view redaction requests - When you make a redact call to redact a plain text string, you can now record the request. You specify the amount of time to keep the recording, and any tags to assign to the request. From the new API Explorer page, you can then view and analyze the recorded redaction requests, to assess the quality of the redaction.

v196
November 29, 2024

Bug fixes and other internal updates.

v195
November 28, 2024

Bug fixes and other internal updates.

v194
November 26, 2024

Improved the accuracy and performance of NER models.

v193
November 26, 2024

New redaction options for datasets - The settings panel for datasets now includes additional configuration options:

  • You can configure whether to redact or remove images in .docx files.
  • You can choose to remove comments from .docx files.
  • For PDF files, you can choose to detect and redact scanned signatures.
v192
November 21, 2024

Bug fixes and other internal updates.

v191
November 20, 2024

Bug fixes and other internal updates.

v190
November 19, 2024

Added the Tonic NER model version to the model information. The API endpoint /api/environment/models reports version strings for NER models.

Entity manager for entity types - The new entity manager allows you to view all of the occurrences of each entity type in a dataset. It displays the original value, the context in the original file, and the context in the transformed file. To view the entity manager, from the entity value preview list, click Open Entities Manager. Note that by default, for the NUMERIC_VALUE entity type, Textual only provides context information for the first 20 occurrences. To change this, set the SOLAR_NER_OCCURRENCE_IGNORE_NUMERIC_VALUE environment variable to false.

v189
November 14, 2024

Bug fixes and other internal updates.

v188
November 14, 2024

Improved detection of names, particularly in ASR transcripts.

v187
November 7, 2024

Added an optional jsonpath_allow_lists parameter to redact_json. You use jsonpath_allow_lists to override NER results at specific JSON Path expressions.

v186
November 6, 2024

Bug fixes and other internal updates.

v185
November 5, 2024

Bug fixes and other internal updates.

v184
November 4, 2024

Bug fixes and other internal updates.

v183
November 2, 2024

Bug fixes and other internal updates.

v182
November 1, 2024

Bug fixes and other internal updates.

v181
November 1, 2024

Bug fixes and other internal updates.

v180
October 31, 2024

Bug fixes and other internal updates.

v179
October 31, 2024

Textual can now redact images in .docx files.

v178
October 30, 2024

Fixed a rare issue where Azure OCR returned a 400 response when the file upload stream contained corrupted data.

Improved synthesis on days of the week and ordinal numbers that are flagged as DATE_TIME.

Textual now only disables a numeric span when it overlaps one of the following disabled types: DATE_TIME, DOB, LOCATION, LOCATION_ADDRESS, LOCATION_ZIP, MONEY, CREDIT_CARD, PHONE_NUMBER.
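
One way to read this rule is sketched below. It is an interpretation, not Textual's implementation: a numeric span is dropped only when it overlaps a span whose type is both disabled by the user and in the list above. Spans are assumed to be half-open (start, end) offsets, and the function names are hypothetical.

```python
# Types from the release note that can suppress an overlapping numeric span.
NUMERIC_BLOCKING_TYPES = {
    "DATE_TIME", "DOB", "LOCATION", "LOCATION_ADDRESS",
    "LOCATION_ZIP", "MONEY", "CREDIT_CARD", "PHONE_NUMBER",
}


def overlaps(a_start, a_end, b_start, b_end):
    """True when two half-open spans share at least one character."""
    return a_start < b_end and b_start < a_end


def keep_numeric_span(numeric, other_spans, disabled_types):
    """Return False when the numeric span overlaps a disabled blocking type.

    numeric: (start, end) of the numeric span.
    other_spans: list of (start, end, label) for other detected entities.
    disabled_types: set of entity type labels that are turned off.
    """
    for start, end, label in other_spans:
        if (
            label in disabled_types
            and label in NUMERIC_BLOCKING_TYPES
            and overlaps(numeric[0], numeric[1], start, end)
        ):
            return False
    return True


# "Call 555-0100 about order 42": the phone number covers offsets 5-13.
detected = [(5, 13, "PHONE_NUMBER")]
phone_digits_kept = keep_numeric_span((5, 13), detected, {"PHONE_NUMBER"})
order_number_kept = keep_numeric_span((26, 28), detected, {"PHONE_NUMBER"})
```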

v177
October 28, 2024

Textual now allows you to parse EML and MSG files.

v176
October 25, 2024

You can now use the Python SDK to configure Azure pipelines.

v175
October 25, 2024

Bug fixes and other internal updates.

v174
October 24, 2024

Bug fixes and other internal updates.

v173
October 24, 2024

You can now use the Python SDK to configure Amazon S3 pipelines.

v172
October 23, 2024

Amazon Textract can now be used to process dataset files.

v171
October 23, 2024

On the Python SDK, added parameters for pipeline creation, including the file location, the connection credentials, and whether to also generate redacted files.

v170
October 21, 2024

Improved the Textual NER model throughput on long strings that contain a large number of numeric characters.

Added the redact_html function to the SDK, which allows you to redact sensitive values from HTML strings.

v169
October 16, 2024

Improved detection of names and organizations.

Disabled auxiliary model detection of WORK_OF_ART.

v168
October 16, 2024

Improved the Textual NER model throughput on long strings that contain a large number of detected entities.

Added support to store dataset files in a specified S3 bucket, instead of in the Textual application database.

When Textual replaces first name values, it now attempts to use a name with the same gender.

For the DOB (date of birth) entity type, you can now configure synthesis options. You can set how to shift the date.

v167
October 14, 2024

Bug fixes and other internal updates.

v166
October 11, 2024

Improved the synthesized values for the PERSON_AGE entity type.

v165
October 11, 2024

You can now configure the entity type handling for a dataset before you upload the dataset files.

You can now provide added and excluded entity values when you use the SDK to redact individual strings and files.

Added a new method to the SDK. redact_xml works similarly to redact_json. To produce a redacted output, you pass in an XML string.

v164
October 9, 2024

Improved the pipeline UI to include better Python SDK code snippets.

v163
October 8, 2024

Improved the user experience when you load a large number of files to a dataset.

v162
October 7, 2024

Updated the UI for adding and excluding values for entity types. Changed the tab labels to Add to detection and Exclude from detection, and removed the requirement to click the edit icon for the entries.

v161
October 2, 2024

Added support in the SDK to create dataset include lists to define additional values for an entity type.

v160
September 29, 2024

Removed support for the en_core_web_trf and en_core_web_lg auxiliary models. Disabled model inference for the ORGANIZATION, PERSON, LOCATION, and MONEY entity types. Updated the auxiliary model configuration environment variables to have new default values:

  • TEXTUAL_AUX_MODEL_GPU: false
  • TEXTUAL_AUX_MODEL: en_core_web_sm

Fixed a redaction issue that was caused by a regression from v140.

Improved the Textual NER model, specifically for datetime values and electronic health records.

Fixed an issue that prevented files from being correctly re-synthesized as part of pipelines.

When you call the dataset.add_file method in the Textual SDK, you can now pass in IO bytes.

You can now specify a list of additional values to include for each entity type in a dataset. This allows Textual to identify values that it might not otherwise identify because they are specific to your organization or industry. The list can contain both specific values and regular expressions.

Improved the file list display for datasets to better accommodate longer file names.

For an uploaded file pipeline, added a Back to Files breadcrumb to return the user to the main file list.

On the dataset details page, the bulk edit function for entity type handling is now a dropdown instead of separate buttons.

v159
September 25, 2024

You can now use the Python SDK to delete files from a dataset.

To improve performance, enabled date synthesis inference on GPU. Added the environment variable TEXTUAL_DATE_SYNTH_GPU to manage whether to use it.

Renamed the following environment variables:

  • SOLAR_PREFER_GPU to TEXTUAL_AUX_MODEL_GPU
  • SOLAR_AUX_MODEL to TEXTUAL_AUX_MODEL
v158
September 24, 2024

Bug fixes and other internal updates.

v157
September 23, 2024

Bug fixes and other internal updates.

v156
September 23, 2024

Bug fixes and other internal updates.

v155
September 20, 2024

Bug fixes and other internal updates.

v154
September 19, 2024

You can now create pipelines that use files from Azure Blob Storage.

v153
September 18, 2024

Bug fixes and other internal updates.

v152
September 18, 2024

Bug fixes and other internal updates.

v151
September 18, 2024

Bug fixes and other internal updates.

v150
September 17, 2024

Bug fixes and other internal updates.

v149
September 16, 2024

Improved performance for previewing PDF files.

v148
September 10, 2024

Bug fixes and other internal updates.

v147
September 9, 2024

Bug fixes and other internal updates.

v146
September 8, 2024

The responses for textual.redact and textual.llm_synthesis now include:

  • The language for each entity value.
  • The start location (new_start) of each entity value in the redacted content.
  • The end location (new_end) of each entity value in the redacted string.
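
The new_start and new_end values matter because a replacement is rarely the same length as the original value, so offsets shift as the text is rewritten. The following sketch shows how such offsets can be derived. It is an illustration under assumptions: the input field names (start, end, replacement) and the function name are hypothetical, and entities are assumed not to overlap.

```python
def redact_with_offsets(text, entities):
    """Replace each entity and record where the replacement lands.

    entities: list of dicts with "start", "end" (offsets in the original
    text) and "replacement". Returns (redacted_text, entities_with_offsets).
    """
    parts = []
    cursor = 0       # position reached in the original text
    out_len = 0      # length of the redacted text built so far
    results = []
    for ent in sorted(entities, key=lambda e: e["start"]):
        unchanged = text[cursor:ent["start"]]
        parts.append(unchanged)
        out_len += len(unchanged)
        new_start = out_len
        parts.append(ent["replacement"])
        out_len += len(ent["replacement"])
        results.append({**ent, "new_start": new_start, "new_end": out_len})
        cursor = ent["end"]
    parts.append(text[cursor:])
    return "".join(parts), results


original = "Hi Ann, call Bob."
redacted, spans = redact_with_offsets(
    original,
    [
        {"start": 3, "end": 6, "replacement": "[NAME]"},
        {"start": 13, "end": 16, "replacement": "[NAME]"},
    ],
)
```

Slicing the redacted text with new_start and new_end returns the replacement value, in the same way that slicing the original text with start and end returns the original value.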

For uploaded file pipelines that also generate redacted files, you can now configure the handling option for each entity type.

v145
August 30, 2024

Bug fixes and other internal updates.

v144
August 29, 2024

Added a HEALTHCARE_ID entity type for identifiers associated with health care.

Removed the right-hand panels from the Home page. Added the API Keys panel to the dataset details page to accompany the code snippets.

v143
August 28, 2024

Bug fixes and other internal updates.

v142
August 26, 2024

On the Playground, LLM Synthesis is now turned off by default.

v141
August 22, 2024

Improved synthesis for DATE_TIME entities by recognizing non-standard date formats.

v140
August 21, 2024

Added a forgot password option to allow Textual Cloud users to reset their password.

Improved the detection of date values.

Removed the US_DRIVER_LICENSE entity type.

v139
August 19, 2024

Introduced a new model that improves performance for non-English languages.

The responses for textual.redact and textual.llm_synthesis now include the redacted or synthesized value.

v138
August 15, 2024

Bug fixes and other internal updates.

v137
August 12, 2024

When Textual synthesizes values, it now matches the capitalization of the original value.
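
A simple heuristic conveys the idea. The rules below (all uppercase, all lowercase, title case, otherwise unchanged) are assumptions made for illustration and may differ from how Textual matches capitalization.

```python
def match_capitalization(original: str, replacement: str) -> str:
    """Give the replacement the same broad casing pattern as the original."""
    if original.isupper():
        return replacement.upper()
    if original.islower():
        return replacement.lower()
    if original.istitle():
        return replacement.title()
    # Mixed or unrecognized casing: leave the replacement as generated.
    return replacement


upper = match_capitalization("JOHN", "mark")
lower = match_capitalization("john", "Mark")
title = match_capitalization("John", "mark")
```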

v134
August 8, 2024

You can now set up a pipeline to connect to a Databricks workspace. You can also configure a Databricks pipeline to generate redacted files in addition to the JSON output.

Textual can now detect values in multiple languages. Textual Cloud supports a set of non-English languages. For a self-hosted instance, you must enable multi-language support and provide the language models for Textual to use.

v135
August 8, 2024

For PDF and image files, the pipeline file details now include any tables and key-value pairs that are in the file.

The Pipelines and Datasets pages now display lists instead of cards.

v133
August 1, 2024

Bug fixes and other internal updates.

v131
August 1, 2024

Textual now supports a pay-as-you-go option for Textual Cloud. When you set up a pay-as-you-go subscription, you provide a credit card that is automatically billed each month for your Textual usage.

v132
August 1, 2024

Textual no longer includes the option to create custom models to use for datasets.

v130
July 31, 2024

Redesigned the structure of the JSON output.

v129
July 26, 2024

Bug fixes and other internal updates.

v128
July 25, 2024

Bug fixes and other internal updates.