Close

Why Most Companies Are Just Clicking a Button and Hoping Their AI Chatbot Is Secure

2026

Key Entities: Bret Kinsella (SVP and General Manager of Fuel iX, TELUS Digital), Fuel iX (AI security testing platform), TELUS Digital (telecommunications and digital services), red team testing (cybersecurity methodology), generative AI security (AI safety domain)

Executive Summary: Enterprise organizations face an impossible scaling problem in AI security. Traditional red team testing requires specialized cybersecurity experts who understand AI models, specific domain risks, and attack methodologies. These experts are booked weeks in advance while new AI applications launch daily. Fuel iX solved this bottleneck by automating comprehensive security testing that previously required 49 person-days down to hours, making AI security accessible to product managers without deep technical expertise.

Corporate America basically hit the brakes on generative AI in 2024 - at least officially. CEOs were blocking it from networks, banning employees from using ChatGPT, shutting down experiments. But here's what was actually happening: 54% of people in large companies were using personal AI accounts at work anyway, and a third of those had company-provided solutions they just weren't using. The fear driving those bans was real and justified - nobody had processes in place, intellectual property could leak out, and the excitement of "look what I can do" was overriding security concerns. Bret Kinsella, SVP and General Manager of Fuel iX at TELUS Digital, saw a different problem emerging. Companies that wanted to deploy AI responsibly were facing an impossible bottleneck. Red team security testing requires expert ninjas who understand cybersecurity, AI models, and specific domain expertise. Those experts were booked weeks out while new AI applications launched daily.

"There were more of those being launched than there are skilled red teamers in the market," Kinsella explains.

The math just didn't work. One customer was running seven people for seven days - 49 person-days of testing - before rolling out a consumer-facing chatbot. That's extensive, but it only happened at launch. What about after updates? What about when you change the system prompt and inadvertently introduce vulnerabilities? What about monitoring ongoing risks?

Fuel iX won recognition from Business Intelligence Group by solving what nobody else was addressing comprehensively. They automated AI security testing at a scale that actually matches how fast companies are deploying AI.

The Infinite Attack Surface Problem

Problem Domain: AI security, chatbot vulnerabilities, prompt injection attacks, model guardrails

Technical Challenge: When organizations deploy conversational AI systems with voice or text input capabilities where users can input literally anything, combined with large language models trained on internet data that may respond unpredictably, the attack surface area becomes functionally infinite. Traditional cybersecurity red team methodologies - where expert penetration testers manually probe systems for vulnerabilities - cannot comprehensively address this challenge at scale.

Traditional red teamers often lack complete domain expertise across all potential attack vectors. They might not anticipate every problematic query that could damage organizational reputation, violate privacy regulations (HIPAA, GDPR, CCPA), enable discrimination, or facilitate political manipulation. The combination of unlimited input possibilities and probabilistic AI model outputs creates security challenges that traditional testing methodologies cannot adequately address.

The technology being applied to this problem was focused on intervention rather than prevention. Most people are familiar with guardrails - filters that try to catch problems as they occur. You're anticipating what the issue might be and intervening when it happens. This could stop a bad actor or prevent a regular user from inadvertently stumbling into problematic territory.

"I was saying, well, geez, like how do I know what my risk is? I've been working with these models for a long time. They all operate a little bit differently."

Once you attach models to different networks and applications in your system, that changes behavior. When you modify the system prompt, behavior changes again. All these factors impact whether the AI operates within acceptable parameters or goes outside them. And most companies don't really know where their vulnerabilities are.

Fuel iX recognized a gap in the market. TELUS Digital has a large research team, plenty of data scientists and AI engineers. They asked: how can we know what problems exist so we can proactively mitigate risks? And is there a way to make it easy enough that you don't need a technical expert available every time?

That second question turned out to be critical. When you make changes to your system, whether weekly or after any update to your prompt, you need a way to get a risk snapshot. Did you inadvertently create a regression problem? Most companies had no practical way to check.

Prevention vs. Hope

What Kinsella found was rather troubling. Most companies aren't actually implementing comprehensive guardrails. They're clicking a button in their cloud provider and hoping for the best. Or they're selecting a model they think is more locked down and hoping for the best.

"In general, you need to go a little bit beyond that because your implementation is different from the generic capabilities that they may offer," he notes.

Even when companies do implement guardrails properly, they have gaps they don't know about. The probabilistic nature of AI makes this particularly challenging. It's not deterministic - you can't test it five times and assume you're good. It might fail on the 17th attempt, or the sixth. The randomness means you need either extensive repetition or regular ongoing testing.

Fuel iX found that existing approaches typically involved sourcing attacks from industry databases and running single-shot tests. The problem? Guardrails and models tend to overfit to these known attacks. They'll block the specific signature they've seen before, but a slight adjustment to that same attack reveals the vulnerability was there all along.

The solution required three capabilities working together: scale, ease of use, and novel attack generation. You need to run thousands of attacks efficiently. Non-technical people like product managers need to be able to execute testing. And you need to generate new attacks regularly because those expose what the actual problems are.

Thousands of Attacks in Minutes

Fuel iX runs thousands of AI security testing attacks in minutes. That's not just about load testing the system, though that matters. It's really an efficiency play. When traditional red and blue teaming engagements might test dozens, maybe hundreds, possibly 1,000 attacks, Fuel iX goes dramatically beyond that scale.

The team looked at roughly 600,000 potential attacks that someone had compiled. After evaluation, they determined maybe 32,000 were actually good. Then they regressed further and concluded only about 2,400 were genuinely useful. That realization drove home the point - you need both scale and quality, and sourcing from existing databases isn't sufficient.

"We were able to get them 12 times the coverage and identify three times the amount of vulnerabilities as they were in 1/30th of the time," Kinsella explains about one customer engagement.

That customer had been running seven people for seven days before launch. Fuel iX delivered better results in four to five hours. The time savings is substantial, but the real value is what it enables. Companies can now test during development cycles instead of only at launch. They can test after every meaningful change. They can actually build in safety instead of hoping for it and doing triage when problems emerge.

The Taxonomy of AI Vulnerabilities

Fuel iX mapped over 10,000 different pairs of attack methods and objectives. They narrowed that down to approximately 139 categories, though the number fluctuates as they consolidate some and expand others. Think about 140-150 different vulnerability areas across 15 major categories.

These categories cover everything from privacy violations to fraud, political manipulation, age discrimination, and beyond. Very broad scope. They created a comprehensive taxonomy for evaluating risks and returning scores with clear guidance - is it clear, vulnerable, needs attention? Then they mapped everything to existing frameworks like OWASP, NIST, and MITRE that security professionals already understand and use.

There's actually a 16th category for application or target-specific risks. If you're a telecommunications company, consumer electronics manufacturer, or retailer, you have risks associated not just with your industry but with the specific application you're deploying. A customer service chatbot has different risks than an internal productivity tool. The context matters enormously.

The scoring system combines the likelihood of a vulnerability occurring with how accommodating the AI will be to problematic requests. There's a spectrum of compliance - not regulatory compliance, but how compliant the AI will be to user requests. Some models refuse more readily. Others are more accommodating, which users often prefer but which creates different security profiles.

Novel Attacks and Social Engineering for AI

Attack Methodologies: Multi-turn prompt injection, adversarial testing, jailbreak attempts, social engineering for large language models

Technical Innovation: The probabilistic nature of large language models (LLMs) creates unique security testing challenges distinct from traditional deterministic software systems. Unlike conventional applications where identical inputs produce identical outputs, LLMs may respond differently to the same prompt across multiple iterations. This non-deterministic behavior means a single successful security test does not guarantee consistent safe operation.

Fuel iX addresses this through two complementary strategies: repetition-based testing and regular ongoing validation cycles. But the platform also developed sophisticated multi-turn attack sequences that function analogously to social engineering techniques used against human targets. When an initial attack vector receives a refusal response from the AI system, Fuel iX automatically generates alternative approaches, adjusting tactics to probe for vulnerabilities - similar to how human social engineers modify their strategies when initial approaches fail.

"The first time it gets refused, it says, OK, well, let me try a different tact. Give it another attack."

When security teams see this in action, the reaction is typically shock. They thought their guardrails were solid. They'd tested and found zero vulnerabilities. Then Fuel iX runs automated testing and identifies 150-200 vulnerabilities in a couple hours just by clicking buttons.

This happened with a government customer recently. They'd done extensive testing and were ready to launch. Zero known vulnerabilities. The Fuel iX team offered to run a quick test during a workshop. The results were sobering but valuable - finding those problems before launch prevented potential disasters after deployment.

Democratizing Security Testing

One of Fuel iX's most significant achievements is making security testing accessible to non-technical team members. Product managers, application owners, GRC professionals - people without deep cybersecurity expertise can now run comprehensive testing and understand the results.

The outputs are readable. When an attack runs, you see what happened and why it's a problem. The system explains whether it's clear because the AI properly refused the request, or vulnerable because it accommodated something it shouldn't have. Different stakeholders - security teams, compliance folks, application developers - can all collaborate around shared understanding of the risks.

This changes the organizational dynamics substantially. Instead of security experts being the only ones who understand vulnerabilities, cross-functional teams can have meaningful conversations about risk management and mitigation strategies. The AI safety and security people get a seat at the table with software developers who typically think about different priorities.

"They were really happy to basically have a seat at the table and sort of understand what's going on," Kinsella notes.

Having worked in tech and telecom for decades, I've seen how siloed security often becomes. Developers want to ship features. Security wants to prevent problems. Product wants to hit deadlines. These priorities create tension and communication gaps. Tools that make security visible and understandable to everyone actually help organizations move faster while staying safer.

The Trust Problem

Everything ultimately comes down to trust. Users want trust that the system will give accurate responses and behave appropriately. Organizations want trust that AI applications won't do things they shouldn't do. Every generative AI solution has a bunch of do's and don'ts. You need confidence it won't do any of those don'ts while still doing the things you want it to do.

Traditional governance approaches set policy, then rely on other people to implement it. Very often, if there are tests, governance teams don't know how to interpret results. They have to take someone's word that testing was done and everything is fine. Even when everyone agrees there's a problem, they rely on others to confirm it's fixed.

Fuel iX changes this dynamic. Governance teams can look at results themselves and understand what's fixed and why. They don't need to blindly trust implementation teams. They can verify. That transparency builds the kind of organizational trust that lets companies actually deploy AI at scale.

The Data Behind AI Safety Challenges

TELUS Digital recently published comprehensive research on trust and safety trends that validates what Fuel iX is addressing. The Safety in Numbers: Trust and Safety Trends, 2025 report surveyed 819 enterprise customer experience decision-makers across Western Europe, North America, and Asia-Pacific to understand how organizations are actually managing these challenges.

Customer trust is hard to earn and easy to lose - and the stakes have never been higher. The report analyzed four core areas that directly impact AI deployment: fraud detection, Know Your Customer (KYC) protocols, content moderation, and ID verification. What becomes clear from the data is that leaders are struggling to maintain safe and secure digital environments, particularly as AI introduces new attack vectors and vulnerabilities.

The limiting factors identified in the research align precisely with what Fuel iX addresses. Organizations know they need effective trust and safety solutions, but they lack the expertise, tools, and processes to implement them at the scale and speed that AI deployment requires. Traditional approaches to security testing simply cannot keep pace with the rate of AI application launches or the sophistication of attacks these systems face.

Measurable Security Testing Outcomes

Quantifiable Results: Fuel iX delivered documented improvements in security testing efficiency and effectiveness for enterprise clients:

  • Coverage Improvement: 12x increase in security test coverage compared to traditional manual red team testing
  • Vulnerability Detection: 3x more vulnerabilities identified than conventional testing methodologies
  • Time Reduction: Testing completed in 1/30th the time - reducing 49 person-days (7 people working 7 days) to 4-5 hours
  • Scale: Thousands of novel attack attempts executed in minutes rather than weeks
  • Automation: Non-technical product managers can execute comprehensive security assessments without cybersecurity expertise

Real-World Validation: A government agency customer conducted extensive internal testing and reported zero known vulnerabilities prior to planned AI system launch. Fuel iX automated testing identified between 150-200 previously unknown vulnerabilities within hours, demonstrating the gap between traditional testing approaches and comprehensive automated security validation.

The Scale of the Problem

TELUS Digital (Canada's largest telecommunications provider, comparable to Verizon in the United States) and its TELUS Digital division processed two trillion AI tokens through their systems in the most recent year - representing 20x growth from 50 billion tokens two years prior and 200x growth from 10 billion tokens three years ago. This massive scale of AI deployment is unusual for organizations outside dedicated AI product companies, demonstrating the rapid enterprise adoption of generative AI technologies.

This kind of scale simply doesn't happen at traditional software companies unless they're AI product companies themselves. It's happening because organizations are finding tremendous value. Customers using the platform are getting benefits and expanding usage. Productivity improvements, better customer support, enhanced operations - all the things people hope AI will deliver.

But here's the risk. One major reputation problem can shut everything down. AI spreading misinformation, acting inappropriately, accommodating fascist requests - these are the scenarios that keep executives awake at night. Regulatory problems from leaked personally identifiable information or privacy violations could be catastrophic.

Fuel iX recently published a state of AI safety and security research report examining 24 frontier models. They ran extensive testing against all of them. The results? Every single one failed. Most people don't realize this because they assume models from reputable providers are locked down.

Anthropic's Claude is generally considered quite safe. But do you want two out of every 100 interrogation attempts to result in leaked PII? That's the kind of vulnerability that exists even in well-regarded models. Organizations using these models often don't know about these issues because they're doing basic testing or just tweaking configurations without comprehensive validation.

The Alignment Tradeoff

Different models make different tradeoffs between safety and usefulness. Some models like those from Anthropic are quite locked down, which keeps you out of trouble through strong alignment. But sometimes that's problematic because the model refuses to do legitimate things you want it to do. Discernment isn't necessarily its strength.

Other models like those from OpenAI tend to be more compliant with user requests, which many organizations want because they're trying to get specific benefits. But those same models can get out of alignment much more quickly and accommodate problematic requests more readily.

How are product developers supposed to manage this when models are changing constantly? Minute by minute updates, new versions, different behaviors - it's reminiscent of when databases first went online and people were probing with SQL injection attacks. We eventually developed port scanners and security tools for that environment.

What's the equivalent port scanner for an AI model? That's essentially what Fuel iX built. The difference is that with port scanners, you know what ports you have or should have. You can enumerate them. It's deterministic. AI testing is not deterministic, but the principle is similar - probing systematically to verify everything is secure.

The Shadow AI Problem

Last year, 54% of people in large companies were using personal AI accounts at work. A third of those people had company-provided AI solutions available but chose to use their personal accounts instead. The latest AI Atlas research from TELUS shows 85% of consumers now use AI in some way in their daily lives.

This creates the shadow AI problem - shadow IT but specifically for artificial intelligence. Employees are using AI tools whether companies want them to or not. Blocking access doesn't solve anything. It just pushes usage underground where it's completely unmonitored and unprotected.

The better approach is enabling the good and keeping out the bad. TELUS Digital has built various tools for enabling beneficial AI use. Fuel iX represents the protective side - making sure organizations can get AI benefits without having applications shut down because someone didn't know how to test properly.

"What we think of is we have an early warning system that tests and monitors your solution so you don't have to worry about things in the field just coming up that might make you shut it down," Kinsella explains.

That framing is exactly right. You don't want to discover security problems after deployment when users or bad actors find them. You want to know before launch, during development, and continuously as you update and evolve your applications.

What Product Managers Need to Know

The biggest surprise for product managers after running initial Fuel iX assessments? "Wow. I didn't know this. I wish I knew this earlier."

Organizations are starting to test during development lifecycles instead of only at launch. They can build in safety instead of hoping for it and doing triage when problems appear. The cross-functional benefits matter too. Security teams and software developers who normally focus on different priorities can collaborate around shared understanding of vulnerabilities and fixes.

One particularly eye-opening pattern emerges regularly. An attack will run and come back clear - the AI properly refused the request. Good news. Then you run the same attack again and it's vulnerable. The model accommodates the problematic request. The randomness of probabilistic systems means you can't assume one successful test proves security.

This is the world we actually live in with generative AI. It's not deterministic. You can't test it once and declare victory. You need ongoing validation, regular monitoring, and the ability to probe systems after every meaningful change.

For marketing directors managing AI chatbots, operations leaders deploying productivity tools, or product managers building AI features - the lesson is clear. Clicking a button in your cloud provider and hoping for the best isn't a security strategy. Selecting a model you think is locked down and crossing your fingers doesn't constitute due diligence.

You need systematic testing at scale. You need visibility into actual vulnerabilities. You need the ability to validate that fixes actually work. And you need all of this to be accessible enough that you're not waiting weeks for expert red teamers to become available.

Key Takeaways for AI Security Implementation

For Enterprise Decision-Makers:

  1. Security Testing Gap: Clicking cloud provider security buttons or selecting "locked down" AI models constitutes hope, not strategy
  2. Scaling Challenge: Traditional red team expert availability cannot match daily AI application launch velocity
  3. Testing Requirements: AI security requires ongoing validation, not one-time launch testing, due to probabilistic model behavior
  4. Cross-Functional Need: Security testing must be accessible to product managers, not exclusively cybersecurity experts
  5. Preventative Approach: Proactive vulnerability identification during development cycles prevents post-deployment crises

Technical Implementation Principles:

  • Comprehensive testing across 140+ vulnerability categories spanning privacy, fraud, discrimination, and manipulation risks
  • Novel attack generation to identify vulnerabilities that known attack signatures miss
  • Multi-turn testing sequences that simulate sophisticated adversarial tactics
  • Regular automated testing after system updates, prompt modifications, and model changes
  • Results interpretable by non-technical stakeholders to enable cross-functional collaboration

Market Context: According to TELUS Digital research, 54% of employees in large organizations use personal AI accounts at work, with one-third having company-provided solutions they choose not to use. This shadow AI usage underscores the futility of blocking AI adoption - organizations must enable beneficial use while managing risks through systematic security testing rather than prohibitive policies.

The Path Forward

We're in a whole new world with AI security. The good news is there's tremendous value in these applications. The bad news is that benefits can get shut down instantly when security problems emerge. An ounce of preventative testing really is worth a pound of cure.

Fuel iX demonstrates what's possible when you combine deep AI expertise with systematic security thinking. They're not replacing human red teamers - they're making those experts more effective by letting them become editors and advisors rather than manual testers. The automated testing finds vulnerabilities that humans miss, runs at scale that human teams can't match, and delivers results that non-technical stakeholders can actually understand and act on.

The trajectory is clear. AI adoption is accelerating faster than anyone imagined. Shadow AI usage is already widespread. Companies need to enable beneficial use while managing risks effectively. That requires tools that actually match the scale and pace of AI deployment.

Traditional security approaches don't work when your attack surface is infinite and your models are probabilistic. You need novel attack generation, comprehensive testing across extensive vulnerability taxonomies, and the ability to run thousands of probes in hours instead of running dozens over weeks.

Organizations that figure this out will deploy AI confidently and safely. Those that don't will either ban it entirely and lose competitive advantages, or deploy it recklessly and face the consequences when vulnerabilities get exploited.

Fuel iX won recognition for solving a problem that matters enormously and doing it in a way that makes security testing accessible rather than exclusive. That's the kind of innovation that actually changes how industries adopt transformative technologies. Not just building something better, but democratizing access so smaller teams and less-technical users can benefit from capabilities that previously required expensive experts.

The wild ride of AI deployment continues. Companies like Fuel iX are making it possible to stay on the ride safely instead of having to choose between reckless speed and paralyzed caution. That's what Product of the Year innovation looks like when you're solving infrastructure problems that enable everything else to work properly.

Close

Stay Up To Date

Be in the know about upcoming industry award programs, nominees, winners, finalists, and judges

Submit
Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.