venturebeat
The enterprise risk nobody is modeling: AI is replacing the very experts it needs to learn from

For AI systems to keep improving in knowledge work, they need either a reliable mechanism for autonomous self-improvement or human evaluators capable of catching errors and generating high-quality feedback. The industry has invested enormously in the first. It's giving almost no thought to what's happening to the second.

I’d argue that we need to treat the human evaluation problem with just as much rigor and investment as we put into building the model capabilities themselves. New grad hiring at major tech companies has dropped by half since 2019. Document review, first-pass research, data cleaning, code review: Models handle these now. The economists tracking this call it displacement. The companies doing it call it efficiency. Neither is focusing on the future problem.

Why self-improvement has limits in knowledge work

The obvious pushback is reinforcement learning (RL). AlphaZero learned Go, chess, and shogi at superhuman levels without human data and generated novel strategies in the process. Move 37, played by its predecessor AlphaGo in the 2016 match against Lee Sedol, was a move professionals said they would never have played. It didn't come from human annotation. It emerged from AI self-play.

What enables this is the stability of the environment. Move 37 is a novel move within the fixed state space of Go. The rules are complete, unambiguous, and permanent. More importantly, the reward signal is perfect: Win or lose, with no room for interpretation. The system always knows whether a move was good because the game eventually ends with a clear result.

Knowledge work doesn't have either of those properties. The rules in any professional domain are dynamic and continuously rewritten by the humans operating in them. New laws get passed. New financial instruments are invented. A legal strategy that worked in 2022 may fail in a jurisdiction that has since changed its interpretation. Whether a medical diagnosis was right may not be known for years.
Without a stable environment and an unambiguous reward signal, you cannot close the loop. You need humans in the evaluation chain to continue teaching the model.

The formation problem

The AI systems being built today were trained on the expertise of people who went through exactly that formation. The difference now is that entry-level jobs that develop such expertise were automated first. Which means the next generation of potential experts is not accumulating the kind of judgment that makes a human evaluator worth having in the loop.

History has examples of knowledge dying. Roman concrete. Gothic construction techniques. Mathematical traditions that took centuries to recover. But in every historical case, the cause was external: Plague, conquest, the collapse of the institutions that hosted the knowledge. What's different here is that no external force is required. Fields could atrophy not from catastrophe but from a thousand individually rational economic decisions, each one sensible in isolation. That's a new mechanism, and we don't have much practice recognizing it while it's happening.

When entire fields go quiet

At its logical limit, this isn’t just a pipeline problem. It’s a demand collapse for the expertise itself.

Consider advanced mathematics. It doesn’t atrophy because we stop training mathematicians. It atrophies because organizations stop needing mathematicians for their day-to-day work, the economic incentive to become one disappears, the population of people who can do frontier mathematical reasoning shrinks, and the field’s capacity to generate novel insight quietly collapses. The same logic applies to coding. Our question is not “will AI write code” but “if AI writes all production code, who develops the deep architectural intuition that produces genuinely novel systems design?” There is a critical difference between a field being automated and a field being understood.
We can automate a huge amount of structural engineering today, but the abstract knowledge of why certain approaches work lives in the heads of people who spent years doing it wrong first. If you eliminate the practice, you don’t just lose the practitioners. You lose the capacity to know what you’ve lost.

Advanced mathematics, theoretical computer science, deep legal reasoning, complex systems architecture: When the last person who deeply understands a subfield of algebra retires and no one replaces them because the funding dried up and the career path disappeared, that knowledge isn’t likely to be rediscovered any time soon. It’s gone. And nobody notices because the models trained on their work still perform well on benchmarks for another decade. I think of this as a hollowing out: The surface capability remains (models can still produce outputs that look expert) while the underlying human capacity to validate, extend, or correct that expertise quietly disappears.

Why rubrics don't fully substitute

The current approach is rubric-based evaluation. Constitutional AI, reinforcement learning from AI feedback (RLAIF), and structured criteria that let models score models are serious techniques that meaningfully reduce dependence on human evaluators. I'm not dismissing them.

Their limitation is this: A rubric can only capture what the person who wrote it knew to measure. Optimize hard against it and you get a model that's very good at satisfying the rubric. That's not the same thing as a model that's actually right.

Rubrics scale the explicit, articulable part of judgment. The deeper part, the instinct, the felt sense that something is off, doesn't fit in a rubric. You can't write it down because you need to experience it first before you know what to write.

What this means in practice

This isn’t an argument for slowing development. The capability gains are real. And it’s possible that researchers will find ways to close the evaluation loop without human judgment.
Maybe synthetic data pipelines get good enough. Maybe models develop reliable self-correction mechanisms we can’t yet imagine.

But we don’t have those today. And in the meantime, we’re dismantling the human infrastructure that currently fills the gap, not as a deliberate decision but as a byproduct of a thousand rational ones. The responsible version of this transition isn’t to assume the problem will solve itself. It’s to treat the evaluation gap as an open research problem with the same urgency we bring to capability gains.

The thing AI most needs from humans is the thing we’re least focused on preserving. Whether that’s permanently true or temporarily true, the cost of ignoring it is the same.

Ahmad Al-Dahle is CTO of Airbnb.

