
What every CTO needs to know about AI

The new definition of quality

By Adam Pettman, Head of Innovation & AI · 12th May 2025

If you’re leading technology in financial services or the public sector right now, you’ll know that AI is quickly becoming part of the day-to-day. Whether it’s generative tools, machine learning models or intelligent automation, AI is shifting how we operate, make decisions and deliver services. But as adoption ramps up, so does the need for serious thinking about how we test and govern these systems.

Testing isn’t just about spotting bugs anymore; it’s about understanding the real-world consequences of AI systems and making sure they’re safe and reliable. With incoming regulation in the UK, there’s growing pressure to get ahead of the curve before the curve gets ahead of us.

Here’s what CTOs need to know right now. 


Why AI testing is a must  

Most organisations are no longer asking “should we use AI?” They’re already doing it. In UK financial services alone, 75% of firms now report some form of AI use, up from 53% just two years ago. On the public sector side, we’re seeing departments experiment with AI-powered chatbots, decision-support tools and predictive analytics across everything from healthcare to planning and public safety. 

That’s promising, but there’s a catch. Most existing test frameworks weren’t designed with AI in mind. They don’t account for model drift, bias, hallucinations or the different ways AI systems can be exploited by malicious actors. Traditional Quality Assurance (QA) doesn’t ask whether a machine’s decision was fair, transparent or compliant with the law. But regulators do. 

Technology teams are now being asked to prove that the systems they build are trustworthy. That means treating testing as a core part of your AI strategy, not a final tick box. 


Where does the focus need to be? 

Security is the obvious starting point. AI changes the game when it comes to data exposure and attack surfaces. For example, generative models have been shown to leak personal or sensitive information. The financial sector is already flagging concerns around prompt injection attacks and the risks of third-party large language models generating inappropriate or biased outputs. 

If you’re integrating external models or APIs, you’ll want to treat them as potentially untrusted components. Test them like you would any other third-party risk – validate inputs and outputs, limit permissions and keep them sandboxed from core systems where possible. 
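
To make that concrete, here’s a minimal sketch in Python of what “validate inputs and outputs” can look like around a third-party model call. The `call_external_model` callable stands in for whatever vendor SDK or API you actually use, and the redaction patterns and size limits are illustrative assumptions rather than a complete control set.

```python
import re

# Hypothetical guardrail wrapper around a third-party model call.
# The patterns and limits below are illustrative, not a complete control set.

PII_PATTERNS = [
    re.compile(r"\b\d{2}-\d{2}-\d{2}\b"),    # UK sort-code-like numbers
    re.compile(r"\b\d{8,12}\b"),             # account-number-like digit runs
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),  # email addresses
]

MAX_PROMPT_CHARS = 4_000


def sanitise_prompt(prompt: str) -> str:
    """Strip obvious personal data and cap prompt size before anything leaves our boundary."""
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError("Prompt exceeds the agreed size limit for external calls")
    for pattern in PII_PATTERNS:
        prompt = pattern.sub("[REDACTED]", prompt)
    return prompt


def validate_response(text: str) -> str:
    """Treat the model's output as untrusted input to downstream systems."""
    if any(pattern.search(text) for pattern in PII_PATTERNS):
        raise ValueError("Response appears to contain personal data; route to human review")
    if len(text) > 10_000:
        raise ValueError("Response unexpectedly large; possible prompt injection or runaway output")
    return text


def safe_generate(prompt: str, call_external_model) -> str:
    """Only sanitised prompts go out; only validated responses come back in."""
    return validate_response(call_external_model(sanitise_prompt(prompt)))
```

The point is the shape of the control rather than the specific rules: nothing leaves your boundary unsanitised, and nothing comes back in without being checked.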

Privacy is another area where we’re seeing heightened scrutiny. Under the UK GDPR and its EU equivalent, organisations are still responsible for how personal data is used, even if AI is making the decisions. The Information Commissioner’s Office has made it clear that AI systems need to comply with data minimisation, explainability and the right to human review, especially in areas like recruitment, lending or benefits eligibility.

On the performance side, the challenge is that AI models behave differently at scale and over time. A fraud detection system might perform perfectly well in a test environment, but struggle under live transaction loads. A predictive model that works today could degrade in accuracy next month if real-world patterns shift. 

That’s why we’re driving towards continuous testing and monitoring. Some organisations are running automated checks for accuracy, fairness and outliers in production environments, then flagging anomalies for human review. This mirrors what the UK government now expects from departments trialling generative AI: robust controls, ongoing oversight and the ability to step in if things go wrong.  
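
As a flavour of what those automated checks might look like, here’s a small Python sketch that computes live accuracy against decisions whose outcomes are now known, plus a simple demographic parity gap, and returns plain-English alerts for a human reviewer. The thresholds, metric choices and group labels are assumptions made for the sake of the example, not a recommended configuration.

```python
from dataclasses import dataclass

# Illustrative production-monitoring check. Thresholds and group labels are
# placeholders; real values should come from your own risk appetite and policy.

ACCURACY_FLOOR = 0.90        # alert if live accuracy drops below this
PARITY_GAP_CEILING = 0.05    # alert if positive-decision rates differ by more than 5 points


@dataclass
class Prediction:
    predicted: int       # model decision, e.g. 1 = approve
    actual: int | None   # ground truth once known, None if not yet labelled
    group: str           # attribute used for the fairness check


def accuracy(batch: list[Prediction]) -> float:
    labelled = [p for p in batch if p.actual is not None]
    return sum(p.predicted == p.actual for p in labelled) / max(len(labelled), 1)


def parity_gap(batch: list[Prediction], group_a: str, group_b: str) -> float:
    """Difference in positive-decision rates between two groups (demographic parity)."""
    def rate(group: str) -> float:
        members = [p for p in batch if p.group == group]
        return sum(p.predicted == 1 for p in members) / max(len(members), 1)
    return abs(rate(group_a) - rate(group_b))


def check_batch(batch: list[Prediction]) -> list[str]:
    """Return alerts for a human reviewer; an empty list means no action needed."""
    alerts = []
    if accuracy(batch) < ACCURACY_FLOOR:
        alerts.append("Accuracy below agreed floor; investigate possible drift")
    if parity_gap(batch, "group_a", "group_b") > PARITY_GAP_CEILING:
        alerts.append("Decision rates diverging between groups; escalate for fairness review")
    return alerts
```

Run something like this over every batch of production decisions and you have the beginnings of the ongoing oversight regulators are describing.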


Regulation is constantly moving  

The EU AI Act has now been formally adopted, with its obligations phasing in from early 2025 and most rules applying from August 2026. It introduces risk categories for AI systems, mandatory testing for bias and accuracy, and transparency requirements for both decision-making systems and generative models.

If your organisation uses AI for credit scoring, hiring, biometric ID or public service automation, you’ll fall under the “high-risk” category. That means your systems will need formal documentation, human oversight and quality testing before they go live. Think of it like CE-marking for AI.

Meanwhile, the UK is taking a lighter, sector-led approach, but don’t mistake that for inaction. Financial regulators are making it clear they expect firms to apply existing rules – like the Consumer Duty, SMCR and operational resilience requirements – to AI deployments. The FCA expects boards to have oversight of AI risks and senior leaders to be able to explain how their systems work and why they’re safe.

In the public sector, the push is towards transparency. The Algorithmic Transparency Recording Standard (ATRS) is now mandatory for all UK government departments using AI. It requires them to publish clear records of how algorithms are used, what data they rely on and what risks have been considered. 

If you haven’t already, you need to review how your current systems would stack up against these expectations. Could you publish a register of your AI use cases tomorrow? Could you demonstrate your models have been tested for bias or explainability? These are fast becoming baseline requirements, and meeting them starts with putting the right foundations in place now.
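
One way to make those questions answerable on demand is to keep a lightweight, machine-readable register of your AI use cases. The sketch below is hypothetical: the fields and the sample entry are illustrative rather than the ATRS schema, but a record like this makes it straightforward to show what each system does, who owns it and when it was last tested for bias.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical register entry for an AI use case. The fields and the sample
# record are illustrative, not a published standard.


@dataclass
class AIUseCase:
    name: str
    owner: str                         # named senior owner accountable for the system
    purpose: str
    data_sources: list[str]
    risk_level: str                    # e.g. "high" if it affects credit, hiring or benefits
    last_bias_test: date | None        # None means it has never been tested; a gap to close
    explainability_method: str | None
    human_review_route: str


register = [
    AIUseCase(
        name="Loan application triage",
        owner="Head of Credit Risk",
        purpose="Prioritise applications for manual underwriting",
        data_sources=["application form", "credit bureau feed"],
        risk_level="high",
        last_bias_test=date(2025, 3, 14),
        explainability_method="Per-decision feature attributions",
        human_review_route="Underwriting team can override any automated score",
    ),
]
```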


What needs to be in place now? 

Start by looking at your infrastructure. Testing AI properly often requires access to GPU compute, versioned data pipelines and ways to simulate real-world workloads. If you’re relying on legacy systems or outdated test environments, now is the time to modernise. 

Next, think about skills. Traditional QA teams may not have experience in testing machine-learning systems or validating AI outputs. Consider investing in upskilling your existing testers and hiring AI QA specialists. Testing is no longer a bolt-on – it needs to be part of the design process from day one. 

Lastly, formalise your governance. That means assigning ownership for each AI system, tracking decisions about risk and ethics, and ensuring test results are reviewed by the right people. Some firms are building model assurance committees or adding AI risk to their internal audit scope. Public sector teams are creating ethics panels or involving external stakeholders in reviews.

Even if you’re not legally required to do this yet, it’s a smart way to build confidence, both internally and externally.


The opportunity for CTOs 

AI is pushing us to rethink how we define quality in technology. It’s no longer enough for a system to work as designed – it must also be fair, safe and explainable. 

As CTOs, you’re in a unique position to lead this shift. You already know how to deliver at scale, manage risk and drive innovation without losing control. The challenge now is to bring those skills into the AI space and build systems that are as responsible as they are capable.

That starts with testing because if you don’t test for the right things, you can’t be confident in the results. 

The good news is you’re not doing this alone. Frameworks, tools and guidance are emerging. Regulators are signalling what’s coming. And if you start adapting now, you can build AI systems that organisations, customers and citizens can genuinely trust. 

If you’re already on this journey, we’d love to hear what’s working for you. If you’re just getting started, know that getting your testing strategy right will set the foundation for everything that comes next. 

Whether you’re refining your approach or building from scratch, we’re here to help you test with confidence and clarity. Book in for a chat about our AI consultancy offering today.
