AI Safety
Lawyer behind AI psychosis cases warns of mass casualty risks
AI chatbots have been linked to suicides for years. Now one lawyer says they are showing up in mass casualty cases too, and the technology is moving faster than the safeguards.
AI Safety From a Hardware Perspective
Lenovo engineers are thinking about safety issues related to building and deploying personal agents on laptops and PCs.
Character.AI Still Hasn’t Fixed Its School Shooter Problem We Identified in 2024
We can't stress enough how easy it is to find this stuff. The post Character.AI Still Hasn't Fixed Its School Shooter Problem We Identified in 2024 appeared first on Futurism.
Mother Sues OpenAI for Not Telling Police About Mass Shooter Before Deadly Rampage
OpenAI isn't sweeping this one under the rug.
Top OpenAI Executive Quits in Protest
"The announcement was rushed without the guardrails defined."
Children’s Toys Are Shipping With Adult AI Inside Them
This is a massive safety risk.
AI CEOs Worried About Chernobyl-Style Event Where Their Tech Causes a Horrific Catastrophe
It's only a matter of time, they fear.
Reasoning models struggle to control their chains of thought, and that’s good
OpenAI introduces CoT-Control and finds reasoning models struggle to control their chains of thought, reinforcing monitorability as an AI safety safeguard.
Our agreement with the Department of War
Details on OpenAI’s contract with the Department of War, outlining safety red lines, legal protections, and how AI systems will be deployed in classified environments.
Anthropic Defies the Pentagon. Trump Fires Back
The back and forth between Anthropic and the U.S. government highlights broader tensions over AI safety, sovereignty and vendor control in defense applications.
Anthropic Downgrades its AI Safety Policy Amid Market Pressures
The AI vendor previously committed to releasing only models it classified as safe.
Disrupting malicious uses of AI | February 2026
Our latest threat report examines how malicious actors combine AI models with websites and social platforms—and what it means for detection and defense.
Update to GPT-5 System Card: GPT-5.2
GPT-5.2 is the latest model family in the GPT-5 series. The comprehensive safety mitigation approach for these models is largely the same as that described in the GPT-5 System Card and GPT-5.1 System Card. Like OpenAI's other models, the GPT-5.2 models were trained on diverse datasets: information publicly available on the internet, information accessed through third-party partnerships, and information provided or generated by users, human trainers, and researchers.
Strengthening cyber resilience as AI capabilities advance
OpenAI is investing in stronger safeguards and defensive capabilities as AI models become more powerful in cybersecurity. We explain how we assess risk, limit misuse, and work with the security community to strengthen cyber resilience.
How confessions can keep language models honest
OpenAI researchers are testing “confessions,” a method that trains models to admit when they make mistakes or act undesirably, helping improve AI honesty, transparency, and trust in model outputs.
Strengthening our safety ecosystem with external testing
OpenAI works with independent experts to evaluate frontier AI systems. Third-party testing strengthens safety, validates safeguards, and increases transparency in how we assess model capabilities and risks.
AI progress and recommendations
AI is advancing fast. We have the chance to shape its progress—toward discovery, safety, and a better future for everyone.
Strengthening ChatGPT’s responses in sensitive conversations
OpenAI collaborated with 170+ mental health experts to improve ChatGPT’s ability to recognize distress, respond empathetically, and guide users toward real-world support—reducing unsafe responses by up to 80%. Learn how we’re making ChatGPT safer and more supportive in sensitive moments.
Addendum to GPT-5 System Card: Sensitive conversations
This system card details GPT-5’s improvements in handling sensitive conversations, including new benchmarks for emotional reliance, mental health, and jailbreak resistance.
Disrupting malicious uses of AI: October 2025
Discover how OpenAI is detecting and disrupting malicious uses of AI in our October 2025 report. Learn how we’re countering misuse, enforcing policies, and protecting users from real-world harms.
