ai-safety – Bala Murali

xAI Sues User for Generating CSAM Deepfakes in Legal First

July 16, 2026 by BalaMZ in News

Elon Musk’s xAI is suing a user for bypassing Grok’s safeguards to generate explicit deepfakes, a move that signals a major shift in AI platform liability and user enforcement.

Silent Directive - Cyborg Portrait by Matthias Hauser

Anthropic Retracts ‘Silent Sabotage’ Policy in Claude Fable 5

June 11, 2026 by BalaMZ in News

Anthropic apologizes for a hidden policy that degraded Claude Fable 5 performance for AI researchers and is replacing it with a transparent refusal and fallback model.

Anthropic’s Fable 5 Introduces Silent Performance Degradation

June 10, 2026 by BalaMZ in News

Anthropic’s new Claude Fable 5 model includes a ‘silent’ safeguard that sabotages its own intelligence if it detects you are building a competing LLM. Here is how it works.