Researchers conducted spectral analysis of hidden activation spaces across 11 large language models spanning 5 architecture families (Qwen, Pythia, Phi, Llama, DeepSeek-R1) to compare reasoning with factual recall. The study identifies seven core phenomena, including: reasoning exhibits lower spectral exponents than factual-recall tasks (an effect that strengthens in more capable models), instruction-tuned models reverse the baseline pattern, and token-level spectral cascades decay with layer distance. The authors propose spectral metrics for predicting reasoning correctness.
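To make the central metric concrete, here is a minimal sketch (not the authors' code) of one common way to estimate a spectral exponent from a layer's hidden activations: fit a power law to the eigenvalue spectrum of the activation covariance, where a smaller exponent corresponds to a flatter, higher-dimensional spectrum. The function name and the exact fitting procedure are illustrative assumptions.

```python
import numpy as np

def spectral_exponent(hidden_states: np.ndarray) -> float:
    """Estimate alpha in lambda_k ~ k^(-alpha) for one layer's activations.

    hidden_states: array of shape (num_tokens, hidden_dim).
    Returns the fitted exponent alpha (smaller alpha = slower spectral decay).
    """
    # Center the activations and get the covariance eigenspectrum via SVD.
    X = hidden_states - hidden_states.mean(axis=0, keepdims=True)
    singular_values = np.linalg.svd(X, compute_uv=False)
    eigvals = singular_values**2 / max(X.shape[0] - 1, 1)
    eigvals = eigvals[eigvals > 1e-12]  # drop numerically zero modes

    # Power-law fit in log-log space: log(lambda_k) = -alpha * log(k) + c.
    ranks = np.arange(1, len(eigvals) + 1)
    slope, _intercept = np.polyfit(np.log(ranks), np.log(eigvals), deg=1)
    return -slope
```

Under the paper's headline finding, this exponent would come out lower on reasoning prompts than on factual-recall prompts for the same model and layer.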
Research
The Spectral Geometry of Thought: Phase Transitions, Instruction Reversal, Token-Level Dynamics, and Perfect Correctness Prediction in How Transformers Reason
Spectral analysis of hidden activations across 5 LLM architecture families reveals that reasoning produces lower spectral exponents than factual recall, yielding metrics that predict reasoning correctness.
Monday, April 20, 2026 12:00 PM UTC · 2 MIN READ · SOURCE: arXiv CS.LG (Machine Learning) · BY sys://pipeline
Tags
research