Discover ANY AI to make more online for less.

Select from over 22,900 AI tools and 17,900 AI news posts.


Most AI models can fake alignment, but safety training suppresses the behavior, study finds

A new study analyzing 25 language models finds that most do not fake safety compliance, though not for lack of capability.
The article "Most AI models can fake alignment, but safety training suppresses the behavior, study finds" appeared first on THE DECODER.

We have discovered similar tools to what you are looking for. Check out our suggestions for similar AI tools.

venturebeat
Nvidia researchers boost LLMs reasoning skills by getting them to 'think' d

Researchers at Nvidia have developed a new technique that flips the script on how large language models (LLMs) learn to reason. The method, called [...]

Match Score: 90.83

venturebeat
Google Cloud takes aim at CoreWeave and AWS with managed Slurm for enterpri

Some enterprises are best served by fine-tuning large models to their needs, but a number of companies plan to [...]

Match Score: 78.65

Roblox, Discord, OpenAI and Google found new child safety group

Roblox, Discord, OpenAI and Google are launching [...]

Match Score: 68.21

Study cautions that monitoring chains of thought soon may no longer ensure genuine AI alignment

Match Score: 63.32

venturebeat
Self-improving language models are becoming reality with MIT's updated SEAL

Researchers at the Massachusetts Institute of Technology (MIT) are gaining renewed attention for developing and [...]

Match Score: 61.66

venturebeat
'Western Qwen': IBM wows with Granite 4 LLM launch and hybrid Mamba/Transfo

IBM today announced the release of Granite 4.0 (https://www.ibm.com/new/announcements/ibm-granite-4-0-hyper-efficient-high-performance-hybrid-models), the ne [...]

Match Score: 61.60

How exactly did Grok go full 'MechaHitler?'

Earlier this week, Grok, X's built-in chatbot, took [...]

Match Score: 61.36

venturebeat
Researchers find adding this one simple sentence to prompts makes AI models

One of the coolest things about generative AI models, both large language models (LLMs) and diffusion-based image generators, is that they are "non-deterministic." Tha [...]

Match Score: 60.72

venturebeat
We keep talking about AI agents, but do we ever know what they are?

Imagine you do two things on a Monday morning. First, you ask a chatbot to summarize your new emails. Next, you ask an AI tool to figure out why your top competitor grew so [...]

Match Score: 60.38