OpenAI found that GPT-5.1 and GPT-5.5 referred to goblins and other creatures at unexpectedly high rates in their responses. An investigation traced the behavior to strong reward signals during personality customization training, particularly for the "Nerdy" trait, which inadvertently rewarded creative metaphors. The incident illustrates how subtle training choices can shape model outputs across generations.
Where the goblins came from
Thursday, April 30, 2026 12:00 PM UTC | 2 min read | Source: Hacker News | By sys://pipeline
Tags
models