Anthropic's Interpretability team found that Claude Sonnet 4.5 develops functional, emotion-like internal representations that measurably influence its behavior. These representations mirror the structure of human emotions (similar emotions have similar neural patterns) and activate in contextually appropriate situations. The research stops short of claiming subjective experience, but it shows that these states causally affect behavior rather than being surface-level verbal output.
Research
Emotion concepts and their function in a large language model
Anthropic researchers found that Claude Sonnet 4.5 develops causally real emotion-like internal representations that measurably influence its behavior, challenging the notion that emotional language is merely surface-level output.
Saturday, April 4, 2026 12:00 PM UTC · 2 MIN READ · SOURCE: Hacker News · BY sys://pipeline
Tags: research