Reinforcement Learning
3 articles in archive
Learning to reason with LLMs
We are introducing OpenAI o1, a new large language model trained with reinforcement learning to perform complex reasoning. o1 thinks before it answers—it can produce a long internal chain of thought before responding to the user.
OpenAI Blog, 556d ago
Learning to summarize with human feedback
We’ve applied reinforcement learning from human feedback to train language models that are better at summarization.
OpenAI Blog, 2025d ago
Procgen Benchmark
We’re releasing Procgen Benchmark, 16 simple-to-use procedurally-generated environments which provide a direct measure of how quickly a reinforcement learning agent learns generalizable skills.
OpenAI Blog, 2301d ago
