Patrick O'Shaughnessy

Don't let AI agents push your buttons - use webMCP instead!

Browser agents are coming. In this talk, Khushal Sagar, Staff Software Engineer for Chrome, covers a new way that web authors can integrate with these agents: webMCP. WebMCP is a new proposal in the WebML community group that gives web authors the ability to define the capabilities of their site as tools that a browser agent can use. With webMCP you can create faster, more reliable, and more helpful experiences for users interacting with your site using an AI agent.

Published Nov 17, 2025
Uploaded Jun 13, 2026
File type
YouTube

Full transcript


AI-generated transcript with timestamped sections.

0:00-1:34

[00:00] [music] [00:10] Thank you. [00:12] Good afternoon. I'm Khushal Sagar. I'm a software engineer on the Chrome team, and I've been working on browsers for over a decade. [00:19] Having worked on multiple facets of the browser, the best moments of my career have been working on web standards. [00:26] It's an amazing feeling to engage with a community of web developers, [00:30] browser engineers, [00:31] researchers across organizations, [00:34] all you folks in the room today, [00:36] working together with a shared mission to make the web better. [00:39] I'm here today with the same excitement to talk to you about an idea [00:43] incubating at the WebML community group. [00:46] I'm confident it will be a critical piece [00:48] in the web's evolution [00:50] with AI. [00:53] The web platform has been the world's gateway to information and capabilities. [00:58] And now we're seeing a new wave [00:59] for users to derive even more value from the web. [01:02] Agents, which act like a user's personal assistant, [01:06] are increasingly gaining traction [01:08] in how users accomplish what they need. These agents can be built directly into the browser. [01:13] This means they implicitly understand what the user is doing, [01:17] they can take in context across their tabs, [01:19] and take over those tedious parts of the user journey. [01:23] The result is fewer steps [01:24] for users to get what they want on the web. [01:28] But this evolution has been happening on top of a web that you all built for a human user.

1:34-3:04

[01:34] As you can see here, [01:36] these agents are interacting with websites [01:38] by parsing pages, [01:40] clicking on UI elements, [01:41] waiting on animations, [01:43] a series of steps commonly called UI actuation. [01:46] This carefully crafted user experience was never intended for an agent, [01:51] and it's only getting in the way of accomplishing [01:54] what the user actually wants. [01:57] And this problem of connecting agents to external systems isn't unique to the web either. [02:03] The agent ecosystem already saw a proliferation [02:07] of bespoke integrations [02:08] between each agent provider and different services. [02:12] The desire for a shared standard became clear, [02:15] and Model Context Protocol, or MCP, quickly became that standard. [02:20] Its goal is to be the USB-C port [02:22] of AI applications. [02:24] Some of the UI actuation you saw earlier [02:27] could be replaced by these backend MCP integrations, [02:30] where an agent directly talks to a server using MCP. [02:36] But the human use of the web isn't going away. [02:39] Users will continue to come to the web for rich UI experiences, [02:43] for entertainment, education, shopping, and every other endeavor they enjoy, [02:48] now with a companion agent helping them on that journey. [02:52] The future of the web is a shared interface [02:55] which provides visually rich, [02:57] cooperative interactions [02:58] between the user, the web page, and the agent. [03:02] And this cooperative model is essential

3:04-4:38

[03:04] for the success of all entities involved. [03:07] Agents don't have to take away the connection between the site and the user. [03:12] Instead, they can enhance that connection [03:14] by helping the site be even more useful to the user. [03:18] And the result is a web that works better for our users and builds on top of the UX innovation [03:26] that has happened across a wide range of sites. [03:28] And that's the future we want to build with you. [03:32] So what does this cooperative interaction actually entail? [03:36] It's the addition of agent-specific paths to websites [03:39] that expand the ways in which agents can interact with sites [03:43] beyond the UI designed for humans. [03:45] There are three pillars to these agent-specific paths. [03:48] The first is context: [03:50] the data to understand what the user is currently doing, [03:53] and also long-term memory for what they have done. [03:57] Browsers can automatically provide application state visible to them [04:02] as context to an agent. [04:03] In other words, the information which is in the DOM. [04:06] But that information is usually intentionally limited. [04:10] As an example, [04:11] if you're watching a lecture series and you're on chapter two, [04:14] the browser can't see the information for other chapters in the series, [04:18] but it could be relevant information to answer a user's question [04:21] and also help them jump [04:23] to the relevant section in that series. [04:26] Imperative rendering techniques like canvas elements [04:29] further limit the information that the browser can see in the DOM. [04:33] The second is capabilities, the ability to take actions on the user's behalf

4:38-6:15

[04:38] instead of just answering questions for them, given the context. [04:42] Capabilities are what move an agent from helping the user accomplish tasks [04:47] to doing tasks for them. [04:49] Just as we've been optimizing web UI for a human, [04:53] exposing actions to an agent [04:55] helps users navigate your site faster. [04:57] And the last is coordination. Authors can help optimize how the control flows between the user and the agent. [05:05] The agent can drive the interaction until human input is needed. [05:09] For example, you wanted to buy whole milk, but only 2% is available, [05:12] and now you need to make that choice. [05:17] WebMCP is an MCP-like API, which has been designed explicitly [05:22] with these three forms of cooperation in mind. [05:25] Here's roughly how it works. [05:27] The user, [05:28] or the agent in response to a user query, [05:31] loads the web page in their browser. [05:33] The loaded page declares its agent-specific functionality as tools. [05:38] You can roughly think of them as well-documented APIs. [05:41] This set of tools is then sent to the browser's agent. [05:45] The agent chooses which tool to invoke [05:47] based on the user query. [05:49] That request is routed back to the page, where the corresponding function is executed [05:53] on the page itself. [05:55] Now during this execution, the website can do multiple things. [05:59] It can ask for user input if necessary. [06:02] It can present any relevant information in its UI. [06:05] It has access to all the local client state: [06:08] what exactly the user is looking at, whether they have selected any text. It has access to cookies for authentication, authorization,

6:15-7:50

[06:15] probably the user's already logged in. [06:17] And it can interact with its backend server if necessary. [06:22] Finally, the result of that execution is sent back to the browser's agent, where it can plan its next action. [06:30] And here's an example of this API in practice. [06:33] Say the user goes to their favorite clothing brand. The first thing they do is ask the agent to help them narrow down the options. [06:40] They make a query like, [06:42] "Show me dresses in my size suitable for a cocktail-attire wedding. [06:45] I'm attaching an image of one I like." [06:47] The code on the top shows how easy it is [06:50] for the site to register tools for this [06:52] agent. [06:53] Each tool has a unique name, [06:55] search products in this case, [06:57] a description explaining the purpose of this API and when and how the agent should use it, [07:03] and a JSON schema [07:05] explaining the parameters the agent must pass [07:08] when it invokes this API. [07:10] The agent sees the search products tool on this site. [07:13] And instead of the actuation you saw earlier, it outputs the code to execute this tool. [07:19] The browser will translate that request and run the corresponding function on the site. [07:24] Now this approach has several advantages over actuation. [07:27] First, [07:28] the site can show the minimal UI necessary [07:31] so users understand what actions an agent is taking on their site. [07:35] The primary focus of the site UI can then be on other information which could be relevant to the user. [07:41] You could show discount offers or related products for that search. [07:45] Second, the agent can get the result of the tool in one step,
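
The slide code referred to above is not captured in this transcript. As a rough sketch of the flow just described (the page declares a tool with a name, a description, and a JSON schema; the agent sees only the declaration; the invocation is routed back to a function on the page), something like the following could apply. `PageTool`, `ToolRegistry`, and the catalog data are hypothetical stand-ins invented for illustration, not the WebMCP API surface.

```typescript
// Hypothetical shapes for illustration only; not the actual WebMCP API.
type JsonSchema = {
  type: "object";
  properties: Record<string, { type: string; description?: string }>;
  required?: string[];
};

interface PageTool {
  name: string;            // unique tool name
  description: string;     // tells the agent when and how to use the tool
  inputSchema: JsonSchema; // parameters the agent must pass
  execute: (args: Record<string, unknown>) => unknown; // runs on the page
}

// Stand-in for the browser's per-page list of declared tools.
class ToolRegistry {
  private tools: Record<string, PageTool> = {};

  register(tool: PageTool): void {
    if (this.tools[tool.name]) {
      throw new Error("Tool already registered: " + tool.name);
    }
    this.tools[tool.name] = tool;
  }

  // What the agent gets to see: declarations only, no function bodies.
  list(): Array<Omit<PageTool, "execute">> {
    return Object.keys(this.tools).map((key) => {
      const { name, description, inputSchema } = this.tools[key];
      return { name, description, inputSchema };
    });
  }

  // Route an agent request back to the page's function.
  invoke(name: string, args: Record<string, unknown>): unknown {
    const tool = this.tools[name];
    if (!tool) throw new Error("Unknown tool: " + name);
    for (const key of tool.inputSchema.required || []) {
      if (!(key in args)) throw new Error("Missing required argument: " + key);
    }
    return tool.execute(args);
  }
}

// Made-up catalog standing in for the site's client state or backend.
const catalog = [
  { id: "d1", title: "Satin midi dress", category: "dresses", size: "M" },
  { id: "d2", title: "Linen sundress", category: "dresses", size: "S" },
  { id: "s1", title: "Wool blazer", category: "jackets", size: "M" },
];

const registry = new ToolRegistry();
registry.register({
  name: "searchProducts",
  description:
    "Search the store catalog by category and size. Use when the user asks to find or narrow down products.",
  inputSchema: {
    type: "object",
    properties: {
      category: { type: "string", description: "Product category, e.g. dresses" },
      size: { type: "string", description: "Clothing size, e.g. M" },
    },
    required: ["category"],
  },
  execute: (args) =>
    catalog.filter(
      (p) =>
        p.category === args.category &&
        (args.size === undefined || p.size === args.size),
    ),
});

// The agent picks the tool from the declarations and invokes it in one step.
const searchResult = registry.invoke("searchProducts", { category: "dresses", size: "M" });
console.log(JSON.stringify(searchResult));
```

The point of the sketch is the split between what is declared to the agent (`list`) and what runs on the page (`execute`); a real page function could also update the UI or call the site's backend before returning.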

7:50-9:21

[07:50] while actuation for this would likely have taken multiple steps, [07:54] either because the products are being added to the DOM lazily [07:57] as the user scrolls, [07:59] or they're split across multiple pages [08:01] that the agent has to click through [08:03] like a user would have. [08:05] So using the tool [08:06] makes the agent faster and also provides a better UX. [08:10] The JSON below is the result of this execution, [08:14] which is passed back to the agent, [08:15] where it will apply further filtering [08:17] based on the image the user provided. [08:20] But the journey doesn't end here. [08:23] The agent has to present this filtered list of products to the user. [08:27] And without WebMCP, its only option is to present it in its own UI. [08:32] But the site has registered another tool [08:34] called show products. [08:36] The agent realizes that it can use this tool [08:39] to show the filtered list of products [08:41] within the site's UI itself. [08:43] So the agent helps the user get to the information faster, [08:47] and the site continues doing what it does best: [08:50] create rich branded experiences [08:53] that their users love. [08:56] Let's continue with this journey a bit more. The user now asks, "Which of these are available for delivery to my address?" [09:04] The agent isn't able to see a tool for this query, [09:07] but there's an option in the web UI to enter the delivery zip code. [09:11] So it uses that instead. [09:13] The API has been designed to be a progressive enhancement. [09:16] The agent will always prefer to use the agent-specific tools [09:20] when available,
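
The progressive-enhancement behavior described at 9:04 to 9:20 can be pictured as a simple preference rule on the agent side. This is a minimal sketch under stated assumptions: `Intent`, `Step`, `planStep`, and the `checkDelivery` tool name are hypothetical, and a real agent would choose tools from their descriptions rather than from a hard-coded preferred name.

```typescript
// Hypothetical planning types; a real agent's planner is far richer.
type Step =
  | { kind: "tool"; tool: string; args: Record<string, unknown> }
  | { kind: "ui"; actions: string[] };

interface Intent {
  preferredTool: string;          // tool that would serve this request
  args: Record<string, unknown>;  // arguments for that tool
  uiFallback: string[];           // UI actuation steps if no tool exists
}

// Prefer an agent-specific tool when the page declares one;
// otherwise fall back to driving the web UI like a user would.
function planStep(intent: Intent, registeredTools: string[]): Step {
  if (registeredTools.indexOf(intent.preferredTool) !== -1) {
    return { kind: "tool", tool: intent.preferredTool, args: intent.args };
  }
  return { kind: "ui", actions: intent.uiFallback };
}

// Tools the example site is assumed to have registered.
const registeredTools = ["searchProducts", "showProducts"];

// Showing the filtered list: a matching tool exists, so it is used.
const showFiltered = planStep(
  {
    preferredTool: "showProducts",
    args: { productIds: ["d1"] },
    uiFallback: ["scroll the product grid", "read each product card"],
  },
  registeredTools,
);

// Delivery check: no tool is registered, so the agent uses the zip code field.
const checkDelivery = planStep(
  {
    preferredTool: "checkDelivery",
    args: { zip: "00000" },
    uiFallback: ["focus the delivery zip code field", "type the zip code", "submit"],
  },
  registeredTools,
);

console.log(showFiltered.kind, checkDelivery.kind);
```

Because the fallback path always exists, a site can add tools one user journey at a time without breaking agents on the journeys it has not covered yet.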

9:21-10:53

[09:21] but it can fall back to using web UI like a user. [09:24] This gives authors an incremental way to add agent-specific paths to their website [09:30] based on the user journeys that they think will help a user [09:33] when navigating their site using an agent. [09:37] Now, coming to the MCP part of WebMCP. [09:42] We're intentionally aligning the syntax of WebMCP with MCP's base primitives. [09:47] This is a simple example of what a tool declared in MCP looks like [09:51] as compared to WebMCP. [09:53] This design ensures that agentic capabilities on the web [09:57] can be used by any MCP-compatible client with minimal translation layers, [10:03] and makes it possible for authors to reuse code between their MCP servers [10:08] and WebMCP in their website. [10:12] There's a bunch of brilliant folks across Google, Microsoft, and open source contributors [10:17] that have been actively contributing to this project. [10:19] I want to take a moment to thank them. [10:21] And I'm glad that I was able to present their work to you today. [10:25] And before I go, this proposal is in an early incubation phase. [10:30] This is where feedback from web developers, agent providers, [10:33] and browser vendors is critical. [10:35] You can find the full explainer on our WebML community group repo. [10:38] And we'd love your feedback. Please file issues. [10:42] Secondly, we're actively working on a prototype for this in Chrome. [10:45] You can follow the Chrome Status entry for WebMCP [10:48] to know when it's ready for a dev trial. [10:50] I'm looking forward to the amazing experiences

10:53-10:57

[10:53] you all will build. [10:54] Thank you.
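
The MCP alignment discussed at 9:42 to 10:08 suggests that one tool declaration and one handler could serve both an MCP server and a web page. The sketch below assumes that idea: the `name`, `description`, and `inputSchema` fields and the text-content result shape follow MCP's tool primitives, while `toPageTool`, `toMcpResult`, and the sample data are hypothetical helpers, and the page-side shape is an assumption rather than the WebMCP API.

```typescript
// Declaration fields mirror MCP's tool primitive: name, description, inputSchema.
interface ToolDeclaration {
  name: string;
  description: string;
  inputSchema: {
    type: "object";
    properties: Record<string, { type: string }>;
    required?: string[];
  };
}

type Handler = (args: Record<string, unknown>) => unknown;

// One declaration, written once.
const searchDeclaration: ToolDeclaration = {
  name: "searchProducts",
  description: "Search the store catalog by category.",
  inputSchema: {
    type: "object",
    properties: { category: { type: "string" } },
    required: ["category"],
  },
};

// One handler, written once. The data is made up for illustration.
const sampleProducts = [
  { id: "d1", category: "dresses" },
  { id: "d2", category: "dresses" },
  { id: "s1", category: "jackets" },
];
const searchHandler: Handler = (args) =>
  sampleProducts.filter((p) => p.category === args.category).map((p) => p.id);

// Hypothetical page-side shape: the declaration plus the function to run.
function toPageTool(decl: ToolDeclaration, handler: Handler) {
  return { ...decl, execute: handler };
}

// MCP tool results carry a list of content items; text content is the simplest.
function toMcpResult(value: unknown): { content: Array<{ type: "text"; text: string }> } {
  return { content: [{ type: "text", text: JSON.stringify(value) }] };
}

const pageTool = toPageTool(searchDeclaration, searchHandler);
const mcpResult = toMcpResult(searchHandler({ category: "dresses" }));
console.log(pageTool.name, mcpResult.content[0].text);
```

Keeping the declaration and handler free of transport details is what would make this kind of reuse cheap; only the thin adapters differ between the server and the page.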
