A Survey of Reinforcement Learning from Human Feedback
Aligning Learned Agents with Human Judgement: A Critical Review

Foundations and framing

At first glance, the recent literature converges on a simple but powerful framing: instead of engineering rewards, agents can infer objectives from people. This p...