AnyAi.fyi - Discover ANY AI to make more online for less.

venturebeat

Booking.com’s agent strategy: Disciplined, modular and already delivering 2× accuracy

When many enterprises weren’t even thinking about agentic behaviors or infrastructures, Booking.com had already “stumbled” into them with its homegrown conversational recommendation system.

This early experimentation has allowed the company to take a step back and avoid getting swept up in the frantic AI agent hype. Instead, it is taking a disciplined, layered, modular approach to model development: small, travel-specific models for cheap, fast inference; larger large language models (LLMs) for reasoning and understanding; and domain-tuned evaluations built in-house when precision is critical.

With this hybrid strategy — combined with selective collaboration with OpenAI — Booking.com has seen accuracy double across key retrieval, ranking and customer-interaction tasks.

As Pranav Pathak, Booking.com’s AI product development lead, posed to VentureBeat in a new podcast: “Do you build it very, very specialized and bespoke and then have an army of a hundred agents? Or do you keep it general enough and have five agents that are good at generalized tasks, but then you have to orchestrate a lot around them? That's a balance that I think we're still trying to figure out, as is the rest of the industry.”

Check out the new Beyond the Pilot podcast here, and continue reading for highlights. Moving from guessing to deep personalization without being ‘creepy’Recommendation systems are core to Booking.com’s customer-facing platforms; however, traditional recommendation tools have been less about recommendation and more about guessing, Pathak conceded. So, from the start, he and his team vowed to avoid generic tools: As he put it, the price and recommendation should be based on customer context.

Booking.com’s initial pre-gen AI tooling for intent and topic detection was a small language model, what Pathak described as “the scale and size of BERT.” The model ingested the customer’s inputs around their problem to determine whether it could be solved through self-service or bumped to a human agent.

“We started with an architecture of ‘you have to call a tool if this is the intent you detect and this is how you've parsed the structure,” Pathak explained. “That was very, very similar to the first few agentic architectures that came out in terms of reason and defining a tool call.”

His team has since built out that architecture to include an LLM orchestrator that classifies queries, triggers retrieval-augmented generation (RAG) and calls APIs or smaller, specialized language models. “We've been able to scale that system quite well because it was so close in architecture that, with a few tweaks, we now have a full agentic stack,” said Pathak.

As a result, Booking.com is seeing a 2X increase in topic detection, which in turn is freeing up human agents’ bandwidth by 1.5 to 1.7X. More topics, even complicated ones previously identified as ‘other’ and requiring escalation, are being automated.

Ultimately, this supports more self-service, freeing human agents to focus on customers with uniquely-specific problems that the platform doesn’t have a dedicated tool flow for — say, a family that is unable to access its hotel room at 2 a.m. when the front desk is closed.

That not only “really starts to compound,” but has a direct, long-term impact on customer retention, Pathak noted. “One of the things we've seen is, the better we are at customer service, the more loyal our customers are.”

Another recent rollout is personalized filtering. Booking.com has between 200 and 250 search filters on its website — an unrealistic amount for any human to sift through, Pathak pointed out. So, his team introduced a free text box that users can type into to immediately receive tailored filters.

“That becomes such an important cue for personalization in terms of what you're looking for in your own words rather than a clickstream,” said Pathak.

In turn, it cues Booking.com into what customers actually want. For instance, hot tubs — when filter personalization first rolled out, jacuzzi’s were one of the most popular requests. That wasn’t even a consideration previously; there wasn’t even a filter. Now that filter is live.

“I had no idea,” Pathak noted. “I had never searched for a hot tub in my room honestly.”

When it comes to personalization, though, there is a fine line; memory remains complicated, Pathak emphasized. While it’s important to have long-term memories and evolving threads with customers — retaining information like their typical budgets, preferred hotel star ratings or whether they need disability access — it must be on their terms and protective of their privacy.

Booking.com is extremely mindful with memory, seeking consent so as to not be “creepy” when collecting customer information.

“Managing memory is much harder than actually building memory,” said Pathak. “The tech is out there, we have the technical chops to build it. We want to make sure we don't launch a memory object that doesn't respect customer consent, that doesn't feel very natural.”Finding a balance of build versus buy
As agents mature, Booking.com is navigating a central question facing the entire industry: How narrow should agents become?

Instead of committing to either a swarm of highly specialized agents or a few generalized ones, the company aims for reversible decisions and avoids “one-way doors” that lock its architecture into long-term, costly paths. Pathak’s strategy is: Generalize where possible, specialize where necessary and keep agent design flexible to help ensure resiliency.

Pathak and his team are “very mindful” of use cases, evaluating where to build more generalized, reusable agents or more task-specific ones. They strive to use the smallest model possible, with the highest level of accuracy and output quality, for each use case. Whatever can be generalized is.

Latency is another important consideration. When factual accuracy and avoiding hallucinations is paramount, his team will use a larger, much slower model; but with search and recommendations, user expectations set speed. (Pathak noted: “No one’s patient.”)

“We would, for example, never use something as heavy as GPT-5 for just topic detection or for entity extraction,” he said.

Booking.com takes a similarly elastic tack when it comes to monitoring and evaluations: If it's general-purpose monitoring that someone else is better at building and has horizontal capability, they’ll buy it. But if it’s instances where brand guidelines must be enforced, they’ll build their own evals.

Ultimately, Booking.com has leaned into being “super anticipatory,” agile and flexible. “At this point with everything that's happening with AI, we are a little bit averse to walking through one way doors,” said Pathak. “We want as many of our decisions to be reversible as possible. We don't want to get locked into a decision that we cannot reverse two years from now.”What other builders can learn from Booking.com’s AI journeyBooking.com’s AI journey can serve as an important blueprint for other enterprises.

Looking back, Pathak acknowledged that they started out with a “pretty complicated” tech stack. They’re now in a good place with that, “but we probably could have started something much simpler and seen how customers interacted with it.”

Given that, he offered this valuable advice: If you’re just starting out with LLMs or agents, out-of-the-box APIs will do just fine. “There's enough customization with APIs that you can already get a lot of leverage before you decide you want to go do more.”

On the other hand, if a use case requires customization not available through a standard API call, that makes a case for in-house tools.

Still, he emphasized: Don't start with the complicated stuff. Tackle the “simplest, most painful problem you can find and the simplest, most obvious solution to that.”

Identify the product market fit, then investigate the ecosystems, he advised — but don’t just rip out old infrastructures because a new use case demands something specific (like moving an entire cloud strategy from AWS to Azure just to use the OpenAI endpoint).

Ultimately: “Don't lock yourself in too early,” Pathak noted. “Don't make decisions that are one-way doors until you are very confident that that's the solution that you want to go with.”

Discover Copy

Rating

Innovation

Pricing

Technology

Usability

Match Score: 76.61