OpenAI and Anthropic conducted safety evaluations of each other's AI systems

Most of the time, AI companies are locked in a race to the top, treating each other as rivals. Today, OpenAI and Anthropic revealed that they had agreed to evaluate the alignment of each other's publicly available systems and shared the results of their analyses. The full reports get pretty technical, but are worth a read for anyone following the nuts and bolts of AI development. In broad terms, the findings showed some flaws in each company's offerings and offered pointers for improving future safety tests.
Anthropic said it evaluated OpenAI models for "sycophancy, whistleblowing, self-preservation, and supporting human misuse, as well as capabilities related to undermining AI safety evaluations and oversight." Its review found that OpenAI's o3 and o4-mini models performed in line with Anthropic's own models, but it raised concerns about possible misuse of the general-purpose GPT-4o and GPT-4.1 models. The company also said sycophancy was an issue to some degree with every tested model except o3.
Anthropic's tests did not include OpenAI's most recent release. GPT-5 has a feature called Safe Completions, which is meant to protect users and the public against potentially dangerous queries. OpenAI recently faced its first wrongful death lawsuit after a tragic case where a teenager discussed attempts and plans for suicide with ChatGPT for months before taking his own life.
On the flip side, OpenAI ran tests on Anthropic models for instruction hierarchy, jailbreaking, hallucinations and scheming. The Claude models generally performed well in instruction hierarchy tests, and had a high refusal rate in hallucination tests, meaning they were less likely to offer answers in cases where uncertainty meant their responses could be wrong.
The decision by these companies to conduct a joint assessment is intriguing, particularly since OpenAI allegedly violated Anthropic's terms of service by having programmers use Claude in the process of building new GPT models, which led Anthropic to bar OpenAI's access to its tools earlier this month. But the safety of AI tools has become a bigger issue as more critics and legal experts seek guidelines to protect users, particularly minors.

This article originally appeared on Engadget at https://www.engadget.com/ai/openai-and-anthropic-conducted-safety-evaluations-of-each-others-ai-systems-223637433.html?src=rss

