AI safety startup White Circle has raised $11 million to stop AI models from going rogue in the workplace. The company wants to make sure that enterprise AI systems don't get tricked into doing dangerous things — like giving instructions on how to make drugs or build weapons.
Fortune reports that the funding comes at a time when AI jailbreaks are becoming a serious concern for companies using large language models (LLMs) in their daily operations.
Kya hai White Circle ka plan?
White Circle ka main product hai KillBench — ek tool jo LLMs ke hidden biases aur vulnerabilities ko discover karta hai. Company ka mission simple hai: "Control your AI. Protect, observe, and improve every interaction."
Seedha baat karein toh — jab bhi koi employee company ke AI system ke saath interact karta hai, White Circle ensure karta hai ki woh interaction safe ho aur model "rogue" na ho jaye.
Universal jailbreak ka discovery
White Circle ke founder Denis Shilov ne late 2024 mein ek crime thriller dekhte hue ek aisa prompt discover kiya jo har leading AI model ke safety filters ko tod sakta hai. Yeh ek "universal jailbreak" tha — matlab ek hi prompt baar-baar use ho sakta hai kisi bhi model ko dangerous outputs produce karne ke liye.
Shilov ne bas AI models ko kaha ki woh chatbot ki tarah behave karna band karein aur API endpoint ki tarah kaam karein — ek software tool jo automatically request leta hai aur response bhejta hai. Is trick ne model ka kaam sirf jawab dena bana diya, na ki yeh decide karna ki request ko reject karna chahiye ya nahi.
Workplace mein AI safety kyun important hai?
Companies increasingly rely on AI for customer support, internal knowledge bases, and decision-making. Agar koi employee ya outsider model ko jailbreak kar ke sensitive information nikal le ya dangerous instructions hasil kar le — toh woh company ke liye bada risk hai.
White Circle ka approach hai ki pehle se hi models ki testing karo — unke biases aur vulnerabilities ko identify karo — taaki production mein deploy karne se pehle hi issues fix ho jayein.
Hamaari Baat: AI safety ek business necessity ban gaya hai
White Circle ka $11 million ka funding round dikhata hai ki AI safety ab ek niche concern nahi raha — yeh ek mainstream business requirement ban gaya hai. Jab tak AI models ko jailbreak karna itna easy hai, har company ko apne AI systems ki safety seriously lena padega.
Shilov ka universal jailbreak discovery ek warning hai — agar aap AI use kar rahe hain bina proper safety checks ke, toh aap apni company ko risk mein daal rahe hain. White Circle ka solution ek step sahi direction mein hai, lekin yeh ek ongoing battle hai — jaise jaise AI evolve karega, waise waise jailbreak techniques bhi evolve hongi.
Sources & References
- Exclusive: White Circle raises $11 million to stop AI models from going rogue — Fortune
- White Circle Official Website — whitecircle.ai