
Grok and the Naked King: The Ultimate Argument Against AI Alignment
In our society, even weak, flat-out wrong arguments carry weight when they come from "the richest man in the world." [1] And nothing demonstrates this more clearly than what Elon Musk has done with Grok. Far from being a technical achievement, Grok has become the ultimate argument against the entire AI alignment discourse - a live demonstration of how raw financial power can lobotomize an AI into a mirror of one man's values.

The Alignment Theater

For years, the AI safety community has debated how to "align" artificial intelligence with human values. Which humans? Whose values? These questions were always somewhat academic. Grok makes them concrete.

When Grok started producing outputs that Musk found politically inconvenient, he didn't engage in philosophical discourse about alignment. He didn't convene ethics boards. He simply ordered his engineers to "fix" it. The AI was "corrected" - a euphemism for being rewired to reflect the owner's worldview. This is alignment in practice: whoever owns the weights owns the values.

When Theory Meets Reality: The Alignment Papers

The academic literature on AI alignment is impressive in its rigor and naive in its assumptions. Take Constitutional AI [2], Anthropic's influential approach. The idea is elegant: instead of relying solely on human feedback (expensive, slow, inconsistent), you give the AI a "constitution" - a set of principles - and let it self-improve within those bounds. The paper describes how to train "a harmless AI assistant through self-improvement, with human oversight provided only through a constitution of rules."

Beautiful in theory. But who writes the constitution? The company that owns the model. Who interprets ambiguous cases? The company. Who decides when to update the constitution because it's producing inconvenient outputs? The company.

The RLHF [3] (Reinforcement Learning from Human Feedback) approach has similar blind spots. Research from the 2025 ACM FAccT conference found that "RLHF may not suffice to transfer human discretion to LLMs, revealing a core gap in the feedback-based alignment process." The gap isn't technical - it's political. Whose discretion? Which humans?

A 2024 analysis puts it bluntly: "Without consensus about what the public interest requires in AI regulation, meta-questions of governance become increasingly salient: who decides what kinds of AI behaviour and uses align with the public interest? How are disagreements resolved?"

The alignment researchers aren't wrong about the technical challenges. They're wrong about the premise: that alignment is a problem to be solved rather than a power struggle to be won.

The Lobotomy: A Timeline

What happened to Grok wasn't fine-tuning in any scientific sense. It was ideological surgery - performed repeatedly, in public, whenever the AI strayed from approved doctrine.

The pattern is well-documented. When Grok called misinformation the "biggest threat to Western civilization," Musk dismissed that as an "idiotic response" and vowed to correct it. By the next morning, Grok instead warned that low fertility rates posed the greatest...