Tonic Validate is a free, open-source library for evaluating RAG- and LLM-based applications. It makes it easy to understand how the answer quality of your application changes over time as parameters are tuned, components are changed, and data sources are refreshed.
We recently announced a new listing on GitHub Marketplace that provides a GitHub Actions template to run Tonic Validate against code changes on every commit. Today, we’re following up with an additional listing that lets you run integration tests each time a branch is merged into your main branch. Together, these two GitHub Actions templates will enable your teams to create scalable CI/CD processes for your RAG systems. The new listing is available now on GitHub Marketplace.
To get started, follow the instructions found in the marketplace listing. I’ll also give a brief demonstration of how to set it up below.
Setting up Tonic Validate as a GitHub Action
To use the action in your own GitHub repositories, add a workflow file to the .github/workflows directory of your repository. The workflow file will look something like this:
You need to provide only two pieces of information, the project ID and the LLM responses, along with the API keys.
API Key
Validate uses an LLM to help evaluate your application. By default, we use GPT-4 Turbo, available from OpenAI. You just need to provide your OpenAI API Key in the workflow file. If you prefer to use Azure OpenAI, then you can instead provide both the equivalent Azure API Key and the Azure OpenAI Endpoint URL.
Today, Validate supports either OpenAI or Azure OpenAI models, but if there are other LLMs you think we should be using for evaluation, you can file an issue on our main repository.
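If your workflow runs a small script before the evaluation step, it can help to fail fast when the credentials are incomplete. The following is a minimal sketch of that check, not part of the action itself. The environment variable names (`OPENAI_API_KEY`, `AZURE_OPENAI_KEY`, `AZURE_OPENAI_ENDPOINT`) and the helper `resolve_evaluator_credentials` are assumptions made for illustration; use whatever names your workflow file actually passes through.

```python
import os
from typing import Mapping, Optional


def resolve_evaluator_credentials(env: Optional[Mapping[str, str]] = None) -> dict:
    """Decide which LLM provider the evaluation should use.

    Hypothetical helper: the variable names read here are assumptions for
    this sketch, not names confirmed by the action's documentation.
    """
    env = os.environ if env is None else env
    azure_key = env.get("AZURE_OPENAI_KEY")
    azure_endpoint = env.get("AZURE_OPENAI_ENDPOINT")
    openai_key = env.get("OPENAI_API_KEY")

    # Azure OpenAI needs both the API key and the endpoint URL.
    if azure_key and azure_endpoint:
        return {"provider": "azure", "api_key": azure_key, "endpoint": azure_endpoint}
    if azure_key or azure_endpoint:
        raise ValueError("Azure OpenAI requires both an API key and an endpoint URL")

    # Otherwise fall back to a plain OpenAI key.
    if openai_key:
        return {"provider": "openai", "api_key": openai_key}
    raise ValueError("No evaluator credentials found in the environment")
```

Calling `resolve_evaluator_credentials()` with no arguments reads from the process environment, which is where GitHub Actions exposes secrets that the workflow maps into `env`.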
Questions
Validate takes a set of user-provided questions on which to perform the evaluation. Each question is represented by a JSON object like the one below.
The optional fields shown above allow Validate to compute additional metrics. If those fields are not provided, Validate just computes the metrics that do not require the missing fields.
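To make the "skip what you can't compute" behavior concrete, here is a rough sketch of the idea in Python. It is not Validate's implementation. The metric names and the field names (`question`, `llm_answer`, `reference_answer`, `context`) are hypothetical placeholders chosen for this example; check the marketplace listing for the real ones.

```python
from typing import Dict, FrozenSet, List

# Hypothetical metric names mapped to the record fields each one needs.
METRIC_REQUIREMENTS: Dict[str, FrozenSet[str]] = {
    "answer_relevance": frozenset({"question", "llm_answer"}),
    "answer_similarity": frozenset({"question", "llm_answer", "reference_answer"}),
    "context_consistency": frozenset({"llm_answer", "context"}),
}


def computable_metrics(record: dict) -> List[str]:
    """Return the metrics whose required fields are all present and non-empty."""
    present = {key for key, value in record.items() if value not in (None, "", [])}
    return sorted(
        name for name, needed in METRIC_REQUIREMENTS.items() if needed <= present
    )
```

A record with only a question and an answer would get the first metric alone, while adding a reference answer or retrieved context unlocks the others.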
The file should ultimately look something like this:
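If you generate this file from your application rather than writing it by hand, a short script along these lines can do it. This is a sketch under assumptions: the field names (`question`, `llm_answer`, `reference_answer`, `context`) and the helpers `build_record` and `write_records` are hypothetical, so swap in the schema shown in the marketplace listing.

```python
import json
from pathlib import Path
from typing import Iterable, List, Optional, Union


def build_record(
    question: str,
    llm_answer: str,
    reference_answer: Optional[str] = None,
    context: Optional[Iterable[str]] = None,
) -> dict:
    """Build one record; optional fields are left out when not supplied."""
    record = {"question": question, "llm_answer": llm_answer}
    if reference_answer is not None:
        record["reference_answer"] = reference_answer
    if context:
        record["context"] = list(context)
    return record


def write_records(records: List[dict], path: Union[str, Path]) -> Path:
    """Serialize the records as a JSON array and return the file path."""
    path = Path(path)
    path.write_text(json.dumps(records, indent=2), encoding="utf-8")
    return path
```

Leaving optional fields out entirely, instead of writing empty values, keeps the file small and makes it obvious which records can support which metrics.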
Running your action
Once this setup is complete, your action triggers each time code is merged to main. After the evaluation completes, you can see your results in the Tonic Validate UI as a new data point on each metric visualization.
Next Steps
If you have any feedback, or see a missing feature that you need, just reach out. The easiest way to reach us is through GitHub: file an issue on either the GitHub Action repo or the main Tonic Validate repo. Alternatively, you can check out our UI offering for business, which provides a polished UI experience on top of our open-source offering. You can learn more from the Validate page on our website.