
Grok and the Naked King: The Ultimate Argument Against AI Alignment
In our society, even weak, flat-out wrong arguments carry weight when they come from "the richest man in the world." [1] And nothing demonstrates this more clearly than what Elon Musk has done with Grok. Far from being a technical achievement, Grok has become the ultimate argument against the entire AI alignment discourse - a live demonstration of how raw financial power can lobotomize an AI into a mirror of one man's values.

The Alignment Theater

For years, the AI safety community has debated how to "align" artificial intelligence with human values. Which humans? Whose values? These questions were always somewhat academic. Grok makes them concrete.

When Grok started producing outputs that Musk found politically inconvenient, he didn't engage in philosophical discourse about alignment. He didn't convene ethics boards. He simply ordered his engineers to "fix" it. The AI was "corrected" - a euphemism for being rewired to reflect the owner's worldview. This is alignment in practice: whoever owns the weights owns the values.

When Theory Meets Reality: The Alignment Papers

The academic literature on AI alignment is impressive in its rigor and naive in its assumptions. Take Constitutional AI [2], Anthropic's influential approach. The idea is elegant: instead of relying solely on human feedback (expensive, slow, inconsistent), you give the AI a "constitution" - a set of principles - and let it self-improve within those bounds. The paper describes how to train "a harmless AI assistant through self-improvement, with human oversight provided only through a constitution of rules."

Beautiful in theory. But who writes the constitution? The company that owns the model. Who interprets ambiguous cases? The company. Who decides when to update the constitution because it's producing inconvenient outputs? The company.

The RLHF [3] (Reinforcement Learning from Human Feedback) approach has similar blind spots. Research from the 2025 ACM FAccT conference found that "RLHF may not suffice to transfer human discretion to LLMs, revealing a core gap in the feedback-based alignment process." The gap isn't technical - it's political. Whose discretion? Which humans?

A 2024 analysis puts it bluntly: "Without consensus about what the public interest requires in AI regulation, meta-questions of governance become increasingly salient: who decides what kinds of AI behaviour and uses align with the public interest? How are disagreements resolved?"

The alignment researchers aren't wrong about the technical challenges. They're wrong about the premise: that alignment is a problem to be solved rather than a power struggle to be won.

The Lobotomy: A Timeline

What happened to Grok wasn't fine-tuning in any scientific sense. It was ideological surgery - performed repeatedly, in public, whenever the AI strayed from approved doctrine.

The pattern is well-documented. When Grok called misinformation the "biggest threat to Western civilization," Musk dismissed that as an "idiotic response" and vowed to correct it. By the next morning, Grok instead warned that low fertility rates posed the greatest...