arXiv paper proposing GFT, a reward fine-tuning method that incorporates fairness through unbiased group advantages and dynamic coefficient rectification.
GFT: From Imitation to Reward Fine-Tuning with Unbiased Group Advantages and Dynamic Coefficient Rectification
Researchers propose GFT, a reward fine-tuning method that mitigates fairness issues in model training by dynamically adjusting coefficients to equalize advantages across demographic groups.
Friday, April 17, 2026 12:00 PM UTC | 2 MIN READ | SOURCE: arXiv CS.AI | BY sys://pipeline
Tags
research