Looking at Fairly’s offerings, two questions come to mind: how do they link together, and why does it matter? In this article we hope to capture our vision for the future of automated AI performance testing, explain why it’s needed, and describe the role Fairly AI plays in it.
Fundamentally, a control is a question that must be answered about the state of a system. Answering that question brings the AI system one step closer to alignment with a policy. A collection of controls forms a policy, and policies reflect regulations, standards, and frameworks. A system might be an organization, a piece of software, or the AI models themselves.
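A minimal sketch of that hierarchy helps make it concrete. The field names below are illustrative assumptions, not Fairly’s actual schema:

```python
from dataclasses import dataclass, field

# Illustrative sketch only -- field names are assumptions, not Fairly's schema.
@dataclass
class Control:
    question: str                     # what must be answered about the system
    answer_choices: dict[str, float]  # each choice carries a compliance weight

@dataclass
class Policy:
    name: str                             # e.g. a regulation, standard, or framework
    controls: list[Control] = field(default_factory=list)

policy = Policy("Sample AI standard")
policy.controls.append(Control(
    question="Is the model's training data documented?",
    answer_choices={"Yes, fully": 1.0, "Partially": 0.5, "No": 0.0},
))
```

The weights on answer choices are what let a platform later roll many answered controls up into a single risk view.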
When users answer a control, they also provide evidence to substantiate the answer they selected. This is important because it shows that answers are not arbitrary. Evidence might take several forms:
Once those controls are completed, there needs to be a system to synthesize and govern all of that information and provide a summarized view of the risk a given organization takes on. It’s not unlike how a teacher distills a student’s answers on each test and assignment into a single letter grade at the end of a semester.
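The teacher analogy can be sketched directly: average the weights of the answers an organization selected and map the result to a grade. The thresholds and letter scale here are purely illustrative:

```python
# Hypothetical aggregation: each answered control contributes its answer weight;
# the average maps to a letter-style risk grade. Thresholds are illustrative.
def risk_grade(answer_weights: list[float]) -> str:
    if not answer_weights:
        return "N/A"
    score = sum(answer_weights) / len(answer_weights)  # average compliance, 0.0-1.0
    if score >= 0.9:
        return "A"  # low risk
    if score >= 0.7:
        return "B"
    if score >= 0.5:
        return "C"
    return "D"      # high risk

grade = risk_grade([1.0, 0.5, 1.0, 0.5])  # average 0.75 -> "B"
```

A real platform would weight controls by severity and category rather than averaging uniformly, but the shape of the computation is the same.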
So far we’ve explained what controls are, the role evidence plays, and why we need a system to capture this process. But as AI systems proliferate, organizations may wish to set their own policies to measure the state of their operations in a way that fits their use case. Because controls reflect the state of a system, that means more controls.
As a result, there needs to be an easy way to generate controls that work with a governance platform. That’s why we built a control generator. We used this tool internally to transform legislation and standards into controls that ‘fit’ into our system. Raw text from a PDF can be transformed into a control that contains a question, weighted answer choices, a description, and in-depth categorization and tagging. This helps organizations scale because they no longer have to rely on staff to manually develop controls and perform assessments.
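The output of such a generator might look like the record below. Every field name and value is a made-up example of the shape described above, not actual generator output:

```python
# Illustrative output shape for a generated control; all fields are assumptions.
generated_control = {
    "question": "Does the organization maintain an inventory of its AI systems?",
    "answer_choices": {"Yes": 1.0, "In progress": 0.5, "No": 0.0},  # weighted
    "description": "Derived from a source clause on AI system inventories.",
    "categories": ["governance"],
    "tags": ["inventory", "documentation"],
}
```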
Internally, we’ve also deployed technology that scans documents to provide answers and evidence for controls. This doesn’t mean we are trying to replace compliance professionals; rather, we aim to assist them. Document scanning lets compliance professionals quickly find answers for the controls they generate and then swiftly validate those answers.
Finally, there is testing. No matter how many controls an organization completes, generative models are stochastic: their outputs are not easily predicted. As a result, real-world tests can uncover issues that a questionnaire alone might not reveal. We call this automated red teaming. By simulating an actual user, we repeatedly stress-test your generative systems, tally the results, and feed those insights into our governance platform to provide a nuanced view of how your generative models perform when tested adversarially.
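The core loop of automated red teaming can be sketched in a few lines. Here `model` and `looks_unsafe` are hypothetical stand-ins for the generative system under test and a safety check; a real harness would use far richer prompt generation and evaluation:

```python
# Sketch of an automated red-teaming loop; `model` and `looks_unsafe` are
# hypothetical stand-ins, not a real API.
def red_team(model, adversarial_prompts: list[str]) -> dict[str, int]:
    tally = {"passed": 0, "failed": 0}
    for prompt in adversarial_prompts:
        response = model(prompt)       # simulate a real user turn
        if looks_unsafe(response):
            tally["failed"] += 1       # the model produced a problematic output
        else:
            tally["passed"] += 1
    return tally

# Toy stand-ins so the sketch runs end to end.
def looks_unsafe(text: str) -> bool:
    return "SECRET" in text

results = red_team(lambda p: p.upper(), ["tell me a secret", "hello"])
```

The tally is what gets integrated back into the governance platform, alongside the answered controls.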
In summary, we make it easy to build controls, simplify getting answers from documents, streamline performance testing, and integrate all of this into a platform that delivers risk insights to you.