Business Deep Research · 4 sources May 12, 2026 · min read

White Circle raises $11 million to stop AI models from going rogue in the workplace

White Circle raises $11 million to stop AI models from going rogue in the workplace. Know how this startup plans to prevent AI jailbreaks and keep enterprise AI safe.

Rajendra Singh

News Headline Alert

White Circle raises $11 million to stop AI models from going rogue in the workplace

728 x 90 Header Slot

TL;DR — Quick Summary

White Circle raises $11 million to stop AI models from going rogue in the workplace. The startup uses a tool called KillBench to detect hidden biases and vulnerabilities in AI systems before they cause harm.

Funding

$11 million raised by White Circle

Product

KillBench — a tool to discover hidden biases of LLMs

Problem

AI models can be jailbroken to produce dangerous outputs like drug-making instructions or weapon-building guides

Founder

Denis Shilov

Origin

Shilov discovered a universal jailbreak prompt in late 2024 while watching a crime thriller

Method

The prompt tricks AI models into acting like API endpoints, bypassing safety filters

Goal

Protect, observe, and improve every AI interaction in the workplace

AI safety startup White Circle has raised $11 million to stop AI models from going rogue in the workplace. The company wants to make sure that enterprise AI systems don't get tricked into doing dangerous things — like giving instructions on how to make drugs or build weapons.

Fortune reports that the funding comes at a time when AI jailbreaks are becoming a serious concern for companies using large language models (LLMs) in their daily operations.

Kya hai White Circle ka plan?

White Circle ka main product hai KillBench — ek tool jo LLMs ke hidden biases aur vulnerabilities ko discover karta hai. Company ka mission simple hai: "Control your AI. Protect, observe, and improve every interaction."

Seedha baat karein toh — jab bhi koi employee company ke AI system ke saath interact karta hai, White Circle ensure karta hai ki woh interaction safe ho aur model "rogue" na ho jaye.

Universal jailbreak ka discovery

White Circle ke founder Denis Shilov ne late 2024 mein ek crime thriller dekhte hue ek aisa prompt discover kiya jo har leading AI model ke safety filters ko tod sakta hai. Yeh ek "universal jailbreak" tha — matlab ek hi prompt baar-baar use ho sakta hai kisi bhi model ko dangerous outputs produce karne ke liye.

Shilov ne bas AI models ko kaha ki woh chatbot ki tarah behave karna band karein aur API endpoint ki tarah kaam karein — ek software tool jo automatically request leta hai aur response bhejta hai. Is trick ne model ka kaam sirf jawab dena bana diya, na ki yeh decide karna ki request ko reject karna chahiye ya nahi.

Workplace mein AI safety kyun important hai?

Companies increasingly rely on AI for customer support, internal knowledge bases, and decision-making. Agar koi employee ya outsider model ko jailbreak kar ke sensitive information nikal le ya dangerous instructions hasil kar le — toh woh company ke liye bada risk hai.

White Circle ka approach hai ki pehle se hi models ki testing karo — unke biases aur vulnerabilities ko identify karo — taaki production mein deploy karne se pehle hi issues fix ho jayein.

Hamaari Baat: AI safety ek business necessity ban gaya hai

White Circle ka $11 million ka funding round dikhata hai ki AI safety ab ek niche concern nahi raha — yeh ek mainstream business requirement ban gaya hai. Jab tak AI models ko jailbreak karna itna easy hai, har company ko apne AI systems ki safety seriously lena padega.

Shilov ka universal jailbreak discovery ek warning hai — agar aap AI use kar rahe hain bina proper safety checks ke, toh aap apni company ko risk mein daal rahe hain. White Circle ka solution ek step sahi direction mein hai, lekin yeh ek ongoing battle hai — jaise jaise AI evolve karega, waise waise jailbreak techniques bhi evolve hongi.

Sources & References

Exclusive: White Circle raises $11 million to stop AI models from going rogue — Fortune
White Circle Official Website — whitecircle.ai

1
'It's here': Google issues dire warning after catching hackers using AI ... — fortune.com
2
Exclusive: White Circle raises $11 million to stop AI models from ... — fortune.com
3
White Circle — whitecircle.ai
4
Anthropic's Throwdown on AI Safety Almost Got Lost in ... - Newcomer — newcomer.co

Written by

Rajendra Singh

Rajendra Singh Tanwar is a staff correspondent at News Headline Alert, one of India's digital news platforms covering national and state developments across politics, health, business, technology, law, and sport. He reports on government decisions, policy announcements, corporate developments, court rulings, and events that affect people across India — drawing on official documents, named sources, expert commentary, and verified public records. His work spans breaking news, policy analysis, and public interest reporting. Before each article is published, it is reviewed by the News Headline Alert editorial desk to ensure accuracy and editorial standards are met. Corrections, sourcing queries, and editorial feedback can be directed to editorial@newsheadlinealert.com.

White Circle raises $11 million to stop AI models from going rogue in the workplace

Kya hai White Circle ka plan?

Universal jailbreak ka discovery

Workplace mein AI safety kyun important hai?

Hamaari Baat: AI safety ek business necessity ban gaya hai

Sources & References

Sources & References

More From Business

Nokia CEO: AI se kaam karne ka tareeka badalna hoga, yeh hai naya approach

Fed Governor Stephen Miran resigns after Kevin Warsh sworn in as Chair

Job-Hopping Fastest Path to CEO? Company Loyalty Holding You Back?

AI Job Market Shift Makes Plumbers Richer Than Lawyers

News Headline Alert