Reinforcement Learning

3 articles in archive

Learning to reason with LLMs

We are introducing OpenAI o1, a new large language model trained with reinforcement learning to perform complex reasoning. o1 thinks before it answers—it can produce a long internal chain of thought before responding to the user.

OpenAI Blog · 556d ago

Learning to summarize with human feedback

We’ve applied reinforcement learning from human feedback to train language models that are better at summarization.

OpenAI Blog · 2025d ago

Procgen Benchmark

We’re releasing Procgen Benchmark, 16 simple-to-use, procedurally generated environments that provide a direct measure of how quickly a reinforcement learning agent learns generalizable skills.

OpenAI Blog · 2301d ago