You know that moment when you’re building something with an AI coding assistant and you think, “I really wish it could just *look* at the webpage”?
That’s the itch. And after months of scratching it with hacky workarounds — screenshots piped through base64, Chrome DevTools protocols that break every third Tuesday, headless browsers that get flagged by every anti-bot system on the internet — I finally built something that actually works.
I put my browser in a Docker container, gave it an MCP server, and connected it to Claude Code. Now my AI assistant can navigate pages, click buttons, fill forms, take screenshots, and inspect the DOM — all through a clean SSE connection. No browser windows stealing focus on my desktop. No “works on my machine” debugging. Just a containerized browser that sits there quietly, doing what it’s told.
Here’s how.
The Problem
If you’re using Claude Code (Anthropic’s CLI tool), you’ve probably discovered the Model Context Protocol (MCP). It’s the protocol that lets Claude talk to external tools — databases, APIs, file systems, and yes, browsers.
The official Playwright MCP server works great… until it doesn’t:
1. It hijacks your desktop. The browser window pops up, steals focus, and suddenly you’re watching your AI click around while you’re trying to work.
2. Anti-bot detection. Sites like LinkedIn, Amazon, and most modern web apps can detect HeadlessChrome in the user agent and block you outright.
3. State management is painful. Cookies disappear between sessions. Auth flows break. You’re constantly re-logging in.
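4. Localhost is out of reach. Once the browser lives in a container (the obvious fix for the first three), it can’t reach localhost:3000 on your host machine, because localhost inside the container is the container itself. If you’re running multiple dev servers (and who isn’t?), that hurts.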
The Solution: Docker + Playwright + MCP + SSE
The architecture is simple:
┌─────────────────────────────────────────────┐
│  Docker Container                           │
│                                             │
│  Xvfb (virtual display)                     │
│    └─ Chromium (headed, not headless)       │
│        └─ Playwright MCP Server             │
│            └─ SSE endpoint (:3000)          │
│                                             │
│  socat port forwarding                      │
│    localhost:8888 → host:8888               │
│    localhost:5173 → host:5173               │
│    … (configurable)                         │
└──────────────┬──────────────────────────────┘
               │ SSE (port 9106)
               ▼
┌─────────────────────────────────────────────┐
│  Host Machine                               │
│                                             │
│  Claude Code                                │
│    └─ .mcp.json:                            │
│       "playwright": {                       │
│         "type": "sse",                      │
│         "url": "http://localhost:9106/sse"  │
│       }                                     │
└─────────────────────────────────────────────┘
Let’s break it down.
Step 1: The Docker Container
The container runs three things:
- Xvfb — A virtual X11 display. This lets Chromium run in “headed” mode (not headless) inside a container with no physical monitor. Why headed? Because `HeadlessChrome` gets fingerprinted and blocked. Headed mode with a virtual display gives you the stealth of a real browser with the convenience of a container.
- Chromium via Playwright — The actual browser. Playwright’s `@playwright/mcp` package wraps it with MCP tool definitions: `browser_navigate`, `browser_click`, `browser_snapshot`, `browser_fill_form`, `browser_take_screenshot`, and about 20 more.
- socat port forwarding — This is the secret sauce for local development. Your dev servers run on the host machine (localhost:8888, localhost:5173, etc.), but the container has its own network namespace. `socat` transparently forwards ports from the container’s localhost to the host, so the browser can reach your dev servers as if they were local.
Here’s a simplified docker-compose.yml:
services:
  browser:
    build:
      context: .
      dockerfile: Dockerfile.playwright
    ports:
      - "9106:3000"   # MCP SSE endpoint
    environment:
      - FORWARD_PORTS=8888,5173,5174,8000
    extra_hosts:
      - "host.docker.internal:host-gateway"
    volumes:
      - ./storageState.json:/app/storageState.json
The Dockerfile.playwright starts from the official Playwright image, installs Xvfb, socat, and the MCP server, and sets up the entrypoint:
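FROM mcr.microsoft.com/playwright:v1.52.0-noble
RUN apt-get update && apt-get install -y xvfb socat \
    && rm -rf /var/lib/apt/lists/*
RUN npm install -g @anthropic-ai/claude-code @playwright/mcp
COPY entrypoint.sh /app/entrypoint.sh
# Make sure the script is executable regardless of host file permissions
RUN chmod +x /app/entrypoint.sh
ENTRYPOINT ["/app/entrypoint.sh"]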
The entrypoint script:
#!/bin/bash
# Clean stale X lock files left behind by a crashed container
rm -f /tmp/.X99-lock /tmp/.X11-unix/X99

# Start virtual display
Xvfb :99 -screen 0 1920x1080x24 &
export DISPLAY=:99

# Forward ports from container → host
IFS=',' read -ra PORTS <<< "$FORWARD_PORTS"
for port in "${PORTS[@]}"; do
  socat TCP-LISTEN:${port},fork,reuseaddr \
    TCP:host.docker.internal:${port} &
done

# Launch MCP server.
# Recent @playwright/mcp versions bind to localhost by default; if the
# published port is unreachable from the host, try adding: --host 0.0.0.0
npx @playwright/mcp --port 3000
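The forwarding loop above trusts FORWARD_PORTS completely. A slightly more defensive variant might validate each entry before handing it to socat. What follows is a minimal sketch of that idea, not part of the original entrypoint: `valid_forward_ports` and `MCP_PORT` are hypothetical names, and skipping port 3000 assumes the MCP server listens there, as it does in this setup.

```bash
#!/usr/bin/env bash
# Sketch: filter FORWARD_PORTS down to usable port numbers.
# valid_forward_ports and MCP_PORT are illustrative names (assumptions).

MCP_PORT="${MCP_PORT:-3000}"

valid_forward_ports() {
  # $1: comma-separated list, e.g. "8888,5173"
  # Prints one valid port per line; warnings go to stderr.
  local old_ifs="$IFS" port
  IFS=','
  set -f   # disable globbing while the list is split on commas
  for port in $1; do
    case "$port" in
      ''|*[!0-9]*|??????*)
        echo "skipping invalid port: '$port'" >&2
        continue
        ;;
    esac
    if [ "$port" -lt 1 ] || [ "$port" -gt 65535 ]; then
      echo "skipping out-of-range port: $port" >&2
      continue
    fi
    if [ "$port" -eq "$MCP_PORT" ]; then
      echo "skipping $port: reserved for the MCP server" >&2
      continue
    fi
    echo "$port"
  done
  set +f
  IFS="$old_ifs"
}

# Usage inside the entrypoint:
#   for port in $(valid_forward_ports "$FORWARD_PORTS"); do
#     socat TCP-LISTEN:${port},fork,reuseaddr TCP:host.docker.internal:${port} &
#   done
```

Printing the ports rather than launching socat directly keeps the function easy to dry-run from a normal shell before baking it into the image.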
Step 2: Connect to Claude Code
In your project’s .mcp.json:
{
  "mcpServers": {
    "playwright": {
      "type": "sse",
      "url": "http://localhost:9106/sse"
    }
  }
}
That’s it. Restart Claude Code and you’ll see ~30 new tools available: browser_navigate, browser_click, browser_snapshot, browser_take_screenshot, browser_fill_form, browser_evaluate, and more.
Step 3: Persist Authentication
The killer feature for development: storageState.json. This file stores cookies and local storage, so you don’t have to re-authenticate every time the container restarts.
Export it after logging in:
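// Playwright (Node-side) call: it needs a Playwright `page` object.
// browser_evaluate runs JavaScript inside the web page, where `page` does not exist,
// so run this from a context that has the Playwright API available.
await page.context().storageState({ path: '/app/storageState.json' });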
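Mount it as a volume in docker-compose and point the MCP server at it, and the browser starts pre-authenticated on every boot. The entrypoint shown earlier does not pass the file to the server; recent @playwright/mcp versions list a `--storage-state` option for this, so check `npx @playwright/mcp --help` for your version.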
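One thing to watch with the bind mount: if ./storageState.json does not exist on the host when the container starts, Docker typically creates a directory with that name instead of a file. A small helper run before `docker compose up` can seed an empty state file. This is a sketch under that assumption; `seed_storage_state` is a hypothetical name, and the empty document uses the `cookies` and `origins` keys of Playwright's storage state format.

```bash
#!/usr/bin/env bash
# Sketch: make sure the storage state file exists before it is bind-mounted.
# seed_storage_state is an illustrative helper, not part of the original setup.

seed_storage_state() {
  # $1: path to the state file (defaults to ./storageState.json)
  local path="${1:-./storageState.json}"

  # A previous run without the file may have left a directory behind.
  if [ -d "$path" ]; then
    echo "error: $path is a directory; remove it and retry" >&2
    return 1
  fi

  # Only write when the file is missing or empty, so saved logins survive.
  if [ ! -s "$path" ]; then
    printf '%s\n' '{"cookies": [], "origins": []}' > "$path"
  fi
}

# Usage:
#   seed_storage_state ./storageState.json && docker compose up -d
```

Because the helper never overwrites a non-empty file, it is safe to call on every start.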
What You Can Actually Do With This
Here’s what my setup handles daily:
Visual QA — “Take a screenshot of the homepage and tell me if the new CSS changes look right.” Claude navigates, screenshots, and analyzes the page — all without opening a browser on my desktop.
Form testing — “Fill out the contact form with test data and submit it.” The browser fills fields, clicks submit, and reports what happened.
Scraping for development — “Go to my Goodreads author page and pull all the reader reviews.” Today I used this to scrape 8 Amazon reviews and 1 Goodreads editorial review for my book series, then integrated them directly into my website.
Multi-site management — I run 7 different browser containers, one per project, each with its own port forwarding and auth state. My WordPress site gets port 9106, my game project gets 9101, my job search tool gets 9102. Each one is isolated, persistent, and always ready.
Production verification — After deploying changes, Claude navigates the live site, takes screenshots, and confirms everything rendered correctly.
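With one container per project, the only thing that changes in each project's .mcp.json is the published port. A tiny generator keeps those files consistent. This is a sketch, assuming every container exposes the same `/sse` path as the config shown in Step 2; `mcp_config` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Sketch: print a .mcp.json for a browser container published on a given host port.
# mcp_config is an illustrative name; the JSON shape mirrors the config in Step 2.

mcp_config() {
  # $1: host port the container's MCP endpoint is published on (e.g. 9101)
  local port="$1"
  cat <<EOF
{
  "mcpServers": {
    "playwright": {
      "type": "sse",
      "url": "http://localhost:${port}/sse"
    }
  }
}
EOF
}

# Usage, from a project directory:
#   mcp_config 9101 > .mcp.json
```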
The Anti-Detection Angle
This is where it gets interesting. Most anti-bot systems check three things:
- User agent: `HeadlessChrome` is an instant flag. Running in headed mode under Xvfb removes that marker.
- Browser fingerprint: WebGL, canvas, fonts, plugins. The container can inject custom fingerprint overrides at startup.
- Behavioral patterns: instant navigation, no mouse movement, perfect timing. Playwright’s auto-waiting and optional delays (such as `slowMo` or a per-keystroke `delay`) soften this, though they don’t make automation indistinguishable from a person.
For legitimate use cases — testing your own sites, scraping your own data, automating your own workflows — this is the right approach. You get a real browser that behaves like a real browser, because it *is* a real browser. It’s just running on a virtual display instead of your monitor.
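To make the user agent check concrete: at its simplest it is a substring test on the `User-Agent` header. The sketch below shows that idea only. `looks_headless` is a hypothetical name, the sample strings in the comments are illustrative rather than captured from a real session, and real detection systems combine many more signals.

```bash
#!/usr/bin/env bash
# Sketch: the crudest form of headless detection, a substring match.
# looks_headless is an illustrative helper, not any vendor's actual logic.

looks_headless() {
  # $1: a User-Agent string; returns 0 (true) if it carries the headless marker
  case "$1" in
    *HeadlessChrome*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (sample strings are made up for illustration):
#   looks_headless "Mozilla/5.0 ... HeadlessChrome/136.0.0.0 Safari/537.36" && echo "flagged"
#   looks_headless "Mozilla/5.0 ... Chrome/136.0.0.0 Safari/537.36" || echo "passes this check"
```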
Gotchas I Learned the Hard Way
1. Use `localhost`, not `host.docker.internal` inside the container for forwarded ports. The socat forwarding maps container localhost → host, so the browser should navigate to localhost:8888, not host.docker.internal:8888.
2. SSE connections only work in fresh Claude Code sessions. If you start Claude Code before the container is running, the MCP connection fails silently. Start the container first, then open Claude Code.
3. Screenshots are returned inline AND saved to the container. The MCP response includes the image, but if you need the file: `docker cp <container>:/app/screenshot.png ./screenshot.png`
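4. Port 3000 belongs to the MCP server in this setup (it is the value the entrypoint passes to `--port`). Don’t run anything else on that port inside the container, and keep it out of FORWARD_PORTS. Map it to whatever external port you want in docker-compose.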
5. Clean X lock files on startup. If the container crashes, stale lock files at /tmp/.X99-lock will prevent Xvfb from starting on the next boot. The entrypoint script should always clean these.
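For the start-order gotcha, a small wait helper can hold off launching Claude Code until the published port answers. This is a minimal sketch: `wait_for_port` is a hypothetical name, it relies on bash's `/dev/tcp` feature, and a successful TCP connect is only a rough signal, since Docker's port proxy may accept connections slightly before the server inside the container is ready.

```bash
#!/usr/bin/env bash
# Sketch: poll a TCP port until it accepts a connection or attempts run out.
# wait_for_port is an illustrative helper; /dev/tcp is a bash-only feature.

wait_for_port() {
  # $1: host, $2: port, $3: max attempts, one second apart (default 30)
  local host="$1" port="$2" attempts="${3:-30}" i=1
  while [ "$i" -le "$attempts" ]; do
    # The subshell opens the connection and closes it again when it exits.
    if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
      return 0
    fi
    [ "$i" -lt "$attempts" ] && sleep 1
    i=$((i + 1))
  done
  return 1
}

# Usage:
#   docker compose up -d && wait_for_port localhost 9106 && claude
```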
The Bigger Picture
MCP is changing how AI assistants interact with the world. But most MCP servers are simple — they wrap an API or read a file. A browser MCP server is different. It gives your AI agent *eyes*. It can see what users see, interact with what users interact with, and verify what users would verify.
Running it in Docker is the responsible way to do this. The browser is sandboxed. The ports are explicit. The auth state is controlled. And your desktop stays clean.
I’ve been running this setup for about two weeks across 7 projects, and I can’t imagine going back. The AI assistant that can only read code is helpful. The AI assistant that can also see the result? That’s a different tool entirely.


