
Playwright MCP Gets browser_drop, Making Drag-and-Drop a First-Class Tool

Ikesan

A post appeared on DEV Community: "Native Drag-and-Drop Automation Arrives in Playwright MCP."
Playwright MCP v0.0.71 added browser_drop, letting MCP clients run drag-and-drop operations directly.

The v0.0.71 release on GitHub lists browser_drop under New Tools.
The release notes describe the change as exposing Playwright’s Locator.drop as an MCP tool.
The same release also adds response body retrieval for browser_network_requests and plain expression support for browser_evaluate.

browser_drop Brings Drag Operations into MCP

Getting an AI agent to perform drag operations used to be a pain.
Clicks, text input, and navigation all had clean MCP tool calls.
But drag-and-drop usually ended up as a workaround—faking events through JavaScript evaluate or chaining mouse.move calls.

browser_drop replaces those workarounds with Playwright’s native operation.

{
  "source": "text=report.pdf",
  "target": "[data-testid='upload-zone']"
}

You pass the drag source and drop target.
The actual operation rides on Playwright’s locator-based drag handling, so auto-wait and visibility checks come for free.
This is more resilient to browser differences and UI framework quirks than hand-rolled DOM events.

The places where this matters: file upload drop zones, sortable grids, Kanban boards, rich text editors, node editors.
These are all operations that feel natural to a human but consistently trip up test automation.

One Fewer Failure Pattern for AI Test Generation

I wrote about the difference between Playwright MCP and natural-language E2E testing in an earlier article on Shortest.
Shortest pins down what to test in natural language and leaves the how to the AI.
Playwright MCP has the agent inspect browser state on the fly and call tools accordingly.

That difference gets amplified with drag operations.
When an agent starts assembling dragstart / drop events through evaluate, things break on DataTransfer handling, pointer events, frame boundaries, and scroll position.
Worse, when it fails, it’s hard to tell whether the test target’s UI is broken or the agent’s fabricated event is wrong.

With browser_drop, at least the drag operation itself is a named tool.
The agent can focus on selector choice and pre/post verification.
This isn’t a flashy new feature—it’s a fix that shrinks the space where AI starts writing unnecessary hand-rolled code.

The same v0.0.71 release also lets browser_network_requests return response bodies, which quietly helps too.
You can now drop-to-upload, inspect the API response, and verify the UI with browser_evaluate—all within MCP calls.

For CLI Users, Playwright’s Own Drag API Often Suffices

The obvious question: if it’s in MCP now, does anything change for CLI-based workflows?

For standard Playwright tests and CLI-based workflows, this update isn’t necessarily something you’ve been waiting for.
Playwright already has locator.dragTo().
If you can write test code, this is all you need:

const source = page.getByText("report.pdf");
const target = page.getByTestId("upload-zone");

await source.dragTo(target);

For E2E tests in CI, reproducible tests committed to a repo, or operation sequences you want reviewable—Playwright Test code is more practical than MCP.

The official README itself notes that for coding agents, CLI+SKILLS may fit better than MCP.
The reason is token efficiency.
MCP loads tool definitions and the accessibility tree into the model’s context.
With CLI, the agent just runs the commands it needs and reads results from files or stdout.

This mirrors the trade-off I covered in "From CLI to AI, the Way Humans Talk to Software Is Changing."
CLI is lightweight; MCP is strong on discoverability and state management.
Drag-and-drop support doesn’t shift that boundary—it’s better read as “a missing operation on the MCP side got filled in.”

Off-Screen Elements Will Fail

The most practical note from the original article: browser_drop assumes both elements are visible in the viewport.
This is natural Playwright behavior, but it’s a classic source of CI-only failures.

On a wide local display, the drop zone is visible.
In CI’s narrow viewport, it’s scrolled out of view.
That difference makes browser_drop fail.
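
To make the failure mode concrete, here is a minimal sketch of the geometry involved. The box coordinates and the 1920×1200 local display are hypothetical; 1280×720 is used for CI because it is Playwright Test's default viewport size. It illustrates the check only and is not a description of how Playwright decides visibility internally.

```typescript
// Rectangle in CSS pixels, same shape as Playwright's boundingBox() result.
interface Box { x: number; y: number; width: number; height: number; }
interface Viewport { width: number; height: number; }

// True only when the whole box lies inside the viewport.
function isFullyInViewport(box: Box, viewport: Viewport): boolean {
  return (
    box.x >= 0 &&
    box.y >= 0 &&
    box.x + box.width <= viewport.width &&
    box.y + box.height <= viewport.height
  );
}

// Hypothetical layout: a drop zone that starts 900px down the page.
const dropZone: Box = { x: 40, y: 900, width: 400, height: 200 };

const localDisplay: Viewport = { width: 1920, height: 1200 };
const ciViewport: Viewport = { width: 1280, height: 720 };

console.log(isFullyInViewport(dropZone, localDisplay)); // true
console.log(isFullyInViewport(dropZone, ciViewport));   // false
```

The same page and the same selector give two different answers, which is why the failure only shows up in CI.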

The fix is simple—scroll first:

{
  "expression": "document.querySelector('[data-testid=\"upload-zone\"]')?.scrollIntoView()"
}

Then call browser_drop.
Unless you bake this into tool definitions or prompt-side boilerplate, you’ll still get flaky tests even with native operations.
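
One way to bake the scroll step in on the client side is to generate both calls together. The sketch below rests on assumptions: buildDropPlan and ToolCall are hypothetical names, and the argument keys (expression, source, target) are copied from the examples above rather than from the tool schemas, so check them against the tool definitions your MCP client actually receives.

```typescript
// Shape of an MCP tools/call request's params: a tool name plus its arguments.
interface ToolCall {
  name: string;
  arguments: Record<string, string>;
}

// Build a two-step plan: scroll the drop target into view, then drop.
// `targetCss` must be a plain CSS selector because it is passed to querySelector.
function buildDropPlan(source: string, targetCss: string): ToolCall[] {
  // JSON.stringify yields a quoted, escaped JS string literal for the selector.
  const selectorLiteral = JSON.stringify(targetCss);
  return [
    {
      name: "browser_evaluate",
      arguments: {
        expression: `document.querySelector(${selectorLiteral})?.scrollIntoView({ block: "center" })`,
      },
    },
    {
      name: "browser_drop",
      arguments: { source, target: targetCss },
    },
  ];
}

const plan = buildDropPlan("text=report.pdf", "[data-testid='upload-zone']");
console.log(JSON.stringify(plan, null, 2));
```

Only the target is scrolled here, because a Playwright selector such as text=report.pdf is not valid input for querySelector.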

Keep an Eye on Permission Boundaries When Running Browser MCP

Playwright MCP is useful, but casually keeping a browser-controlling MCP connected at all times is a different story.
I previously covered the DNS rebinding vulnerability in Playwright MCP v0.0.39 and below in a security scan of 50 open-source MCP servers.
That was fixed in v0.0.40, but browser-controlling MCPs touch external web pages, local servers, and authenticated sessions—failures have a wide blast radius.

The v0.0.71 README describes the default persistent profile, isolated mode, and connecting to an existing browser via an extension.
Using an authenticated session is convenient, but you should decide upfront which URLs the agent can access and where authentication boundaries lie.

My own approach: MCP for exploration and one-off UI checks.
Tests that need to persist go into Playwright Test code.
Anything running at scale goes through CLI or a standard test runner.
browser_drop doesn’t change this calculus—it makes one weak spot in MCP’s toolset ordinary.

Do Intermediate Events Fire During Drag?

Drag-and-drop has intermediate steps.
The HTML Drag and Drop API fires a sequence: dragstart → drag → dragenter → dragover → dragleave → drop → dragend.
When the cursor moves within a drop zone, dragover fires repeatedly; when it leaves, dragleave fires.

Playwright’s dragTo() and browser_drop execute real pointer operations (mousedown → mousemove → mouseup).
The browser receives these pointer actions and fires native Drag and Drop events in response.
Intermediate events fire normally—unlike hand-rolled DOM events via evaluate, you get the browser’s standard event sequence.

So if a drop zone adds a CSS class on dragenter for highlighting, that class will be applied during browser_drop execution.
The dragleave handler that removes it will also run if the cursor passes outside the zone.
From the app’s event listeners’ perspective, the event sequence is identical to a human dragging with a mouse.
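
As a sketch of the kind of handler this describes, the following models a drop zone that toggles a highlight class. The names createHighlightHandlers, ClassListLike, and drag-over are hypothetical, and a Set stands in for the element's classList so the example runs outside a browser.

```typescript
// Minimal stand-in for a DOM element's classList.
interface ClassListLike {
  add(className: string): void;
  remove(className: string): void;
}

// Returns the handlers a drop zone would register for highlight toggling.
function createHighlightHandlers(classList: ClassListLike, className = "drag-over") {
  return {
    dragenter: (): void => classList.add(className),
    dragleave: (): void => classList.remove(className),
    drop: (): void => classList.remove(className),
  };
}

// Back the fake classList with a Set so the state can be inspected.
const classes = new Set<string>();
const handlers = createHighlightHandlers({
  add: (className) => { classes.add(className); },
  remove: (className) => { classes.delete(className); },
});

// Replay the events a drag that ends on the zone would deliver.
handlers.dragenter();
const highlightedMidDrag = classes.has("drag-over");
handlers.drop();
const highlightedAfterDrop = classes.has("drag-over");

console.log({ highlightedMidDrag, highlightedAfterDrop });
// { highlightedMidDrag: true, highlightedAfterDrop: false }
```

In a real page the same handlers would be registered with addEventListener on the drop zone. Note that the class is already gone once the drop handler has run.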

Can You Verify CSS Changes via Screenshot During Drag?

Here’s the catch.
browser_drop is a single MCP tool invocation that runs from drag start to drop completion in one shot.
There’s no way to pause mid-drag and take a screenshot.

There are real cases where you’d want to verify the highlight state while the cursor is over a drop zone.
File upload UIs often show a “drop here” indicator when you drag into the zone—that’s part of the UX.
But taking a screenshot after calling browser_drop only captures the post-drop state.

To verify CSS changes during drag, you need to break the operation apart.
Use browser_evaluate to dispatch pointerdown on the drag source, then pointermove to coordinates over the drop zone.
Call browser_take_screenshot at that point to capture the highlighted state, then dispatch pointerup to finish the drop.

But this brings back exactly the complexity browser_drop was meant to eliminate.
DataTransfer object assembly, coordinate calculation, frame boundary handling—all back on the table.

You have to separate mid-drag verification from drop-result verification.
Whether the UI is correct after the drop can be checked with browser_drop followed by browser_take_screenshot.
If you need to verify the highlight during drag, write a Playwright Test with page.mouse.move() and take screenshots mid-operation:

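const source = page.getByText("report.pdf");
const target = page.getByTestId("upload-zone");

const sourceBox = await source.boundingBox();
const targetBox = await target.boundingBox();
// boundingBox() returns null when the element is not visible
if (!sourceBox || !targetBox) throw new Error("drag source or drop target is not visible");

const targetX = targetBox.x + targetBox.width / 2;
const targetY = targetBox.y + targetBox.height / 2;

await page.mouse.move(sourceBox.x + sourceBox.width / 2, sourceBox.y + sourceBox.height / 2);
await page.mouse.down();
// Playwright's docs note that dragover needs at least two mouse moves to fire in all browsers
await page.mouse.move(targetX, targetY);
await page.mouse.move(targetX, targetY);

// Capture the mid-drag state
await page.screenshot({ path: "drag-over-state.png" });

await page.mouse.up();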
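
If you want more than one intermediate position, the coordinate arithmetic can be factored into a small helper. This is a sketch with hypothetical names (centerOf, dragPath), and the boxes stand in for boundingBox() results. page.mouse.move() also accepts a steps option that interpolates for you, but explicit waypoints leave room to take a screenshot between moves.

```typescript
interface Box { x: number; y: number; width: number; height: number; }
interface Point { x: number; y: number; }

// Center of a bounding box.
function centerOf(box: Box): Point {
  return { x: box.x + box.width / 2, y: box.y + box.height / 2 };
}

// Waypoints from the source center to the target center, both ends included.
// `steps` is the number of segments, so the result has steps + 1 points.
function dragPath(source: Box, target: Box, steps: number): Point[] {
  if (!Number.isInteger(steps) || steps < 1) {
    throw new RangeError("steps must be a positive integer");
  }
  const from = centerOf(source);
  const to = centerOf(target);
  const points: Point[] = [];
  for (let i = 0; i <= steps; i++) {
    const t = i / steps;
    points.push({ x: from.x + (to.x - from.x) * t, y: from.y + (to.y - from.y) * t });
  }
  return points;
}

// Hypothetical boxes for the drag source and the drop zone.
const waypoints = dragPath(
  { x: 0, y: 0, width: 100, height: 20 },
  { x: 300, y: 200, width: 200, height: 100 },
  4,
);
console.log(waypoints);
```

Each waypoint would be passed to page.mouse.move(point.x, point.y) between page.mouse.down() and page.mouse.up(), with a screenshot taken at whichever position you want to inspect.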

browser_drop on its own can't produce mid-operation screenshots because a single call leaves no point at which to pause between steps.
browser_drop is for cases where you only need to verify the result of the drag. If you need to inspect the visual state during the drag itself, drop down to Playwright Test.

References