Lumbox Docs

Browser Automation

Control a real browser for signups, logins, and web interactions.

Lumbox includes a full browser automation system powered by Steel Browser (open-source, Apache-2.0). Your AI agents can navigate websites, fill forms, click buttons, and complete signup/login flows — all integrated with email tools for end-to-end automation.

How It Works

AI Agent (Claude, GPT, etc.)
    ↓ MCP tools
Lumbox MCP Server
    ↓ Steel API
Steel Browser (headless Chrome)
    ↓ renders
Real websites

Steel Browser runs as a Docker container alongside the Lumbox API. It provides:

  • Headless Chrome with anti-detection fingerprinting
  • CAPTCHA auto-solving (reCAPTCHA v2/v3, Cloudflare Turnstile, AWS WAF)
  • Credential auto-injection with field blurring
  • Session isolation for parallel tasks
  • Accessibility tree snapshots for AI-friendly page understanding

Setup

# docker-compose.yml
services:
  steel:
    image: ghcr.io/steel-dev/steel-browser:latest
    restart: always
    ports:
      - "3000:3000"   # Steel API
      - "9223:9223"   # Chrome DevTools Protocol
    volumes:
      - ./data/steel-cache:/app/.cache
    shm_size: "2gb"
    cap_add:
      - SYS_ADMIN
    deploy:
      resources:
        limits:
          memory: 2G

Start the container:

docker compose up -d

Podman (Alternative)

podman run -d --name steel \
  -p 3000:3000 -p 9223:9223 \
  --shm-size=2g --cap-add SYS_ADMIN \
  -e STEEL_PUBLIC_URL=https://browser.yourdomain.com \
  -v ./data/steel-cache:/app/.cache \
  ghcr.io/steel-dev/steel-browser:latest

Environment Variables

Set these in your MCP server environment:

Variable        Default      Description
STEEL_API_URL   (required)   Steel server URL (e.g. http://localhost:3000)
STEEL_API_KEY   (optional)   Steel API key if authentication is enabled
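
As a minimal sketch of how a client might read these two variables, the Python below fails fast when STEEL_API_URL is missing and treats an unset STEEL_API_KEY as "authentication disabled". The SteelConfig class and load_steel_config function are hypothetical names used for illustration; they are not part of Lumbox or Steel.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class SteelConfig:
    """Hypothetical container for the two documented settings."""
    api_url: str
    api_key: Optional[str] = None


def load_steel_config(env: Mapping[str, str] = os.environ) -> SteelConfig:
    """Read STEEL_API_URL (required) and STEEL_API_KEY (optional)."""
    api_url = env.get("STEEL_API_URL", "").strip()
    if not api_url:
        raise ValueError("STEEL_API_URL is required, e.g. http://localhost:3000")
    # An unset or empty key is treated as "authentication not enabled".
    api_key = env.get("STEEL_API_KEY", "").strip() or None
    # Drop a trailing slash so paths can be appended consistently.
    return SteelConfig(api_url=api_url.rstrip("/"), api_key=api_key)
```

Passing a plain dict instead of os.environ makes the function easy to exercise in tests.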

Requirements

  • Docker or Podman
  • At least 4GB RAM (2GB for Steel)
  • 10GB free disk space

Browser Tools

Tool                 Description
browser_navigate     Navigate to a URL, returns accessibility tree snapshot
browser_snapshot     Get current page state with element refs (@e1, @e2...)
browser_screenshot   Take a screenshot (saved to temp file)

Interaction

Tool             Description
browser_click    Click an element by ref
browser_type     Type text into an input by ref
browser_select   Select a dropdown option
browser_eval     Run JavaScript on the page

Session Management

Tool                 Description
browser_save_state   Save cookies + localStorage to file
browser_load_state   Restore a saved session
browser_close        Close a browser session

Understanding Snapshots

When you navigate or interact with a page, Steel returns an accessibility tree snapshot — a text representation of the page with element references:

@e1 heading "Sign Up"
@e2 textbox "Email address"
@e3 textbox "Password"
@e4 button "Create account"
@e5 link "Already have an account? Sign in"

Use these refs with browser_click, browser_type, and the other interaction tools:

browser_type(ref="@e2", text="bot@example.com")
browser_type(ref="@e3", text="password123")
browser_click(ref="@e4")
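
If you want to pick refs programmatically rather than by eye, a small parser can help. The sketch below assumes every snapshot line has exactly the shape shown above (ref, role, quoted name); real snapshots may carry indentation or extra attributes, so treat the regular expression as a starting point. parse_snapshot and find_ref are hypothetical helpers, not Lumbox tools.

```python
import re
from typing import List, NamedTuple, Optional


class Element(NamedTuple):
    ref: str   # e.g. "@e2"
    role: str  # e.g. "textbox"
    name: str  # e.g. "Email address"


# Matches lines such as: @e2 textbox "Email address"
_LINE = re.compile(r'^(@e\d+)\s+(\w+)\s+"(.*)"\s*$')


def parse_snapshot(snapshot: str) -> List[Element]:
    """Turn snapshot text into a list of elements, skipping unrecognized lines."""
    elements = []
    for line in snapshot.splitlines():
        match = _LINE.match(line.strip())
        if match:
            elements.append(Element(*match.groups()))
    return elements


def find_ref(elements: List[Element], role: str, name_contains: str) -> Optional[str]:
    """Return the first ref whose role matches and whose name contains the text."""
    needle = name_contains.lower()
    for element in elements:
        if element.role == role and needle in element.name.lower():
            return element.ref
    return None
```

With the snapshot above, find_ref(elements, "textbox", "email") yields "@e2", which can then be passed to browser_type.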

CAPTCHA Solving

All Steel sessions are created with CAPTCHA auto-solving enabled. Supported types:

CAPTCHA                Status
reCAPTCHA v2 / v3      Supported
Cloudflare Turnstile   Supported
ImageToText            Supported
Amazon AWS WAF         Supported
GeeTest v3/v4          Coming soon

CAPTCHA handling follows these steps:

  1. Steel monitors the page for CAPTCHA elements
  2. Identifies the CAPTCHA type
  3. Auto-solves using ML models and browser automation
  4. Continues the session transparently

No configuration needed — it happens automatically.

Skill Tools (Compound Actions)

These high-level tools chain browser + email operations into single calls:

skill_signup_with_email

Complete a signup flow in one call:

  1. Navigate to signup page
  2. Find and fill email field with Lumbox inbox address
  3. Click submit
  4. Wait for verification email
  5. Return OTP code and/or verification links

skill_signup_with_email(
  url="https://example.com/signup",
  inbox_id="inb_abc123",
  wait_timeout=60
)
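
The wait_timeout argument bounds how long step 4 waits for the email. How Lumbox waits internally is not documented here; the sketch below shows one common approach, polling with a deadline, as an assumption for illustration. wait_for and its parameters are hypothetical names.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_for(
    fetch: Callable[[], Optional[T]],
    timeout: float = 60.0,
    interval: float = 2.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch() until it returns a non-None value or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        result = fetch()
        if result is not None:
            return result
        remaining = deadline - clock()
        if remaining <= 0:
            raise TimeoutError(f"nothing arrived within {timeout} seconds")
        # Never sleep past the deadline.
        sleep(min(interval, remaining))
```

Here fetch would be a function that checks the inbox and returns the message (or None if nothing has arrived yet). The clock and sleep parameters exist only so the loop can be tested without real delays.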

skill_login_with_email

Complete a magic-link login flow:

  1. Navigate to login page
  2. Enter inbox email address
  3. Submit form
  4. Wait for magic link or OTP email
  5. Return the link/code

skill_verify_email

Wait for a verification email and click the link:

  1. Wait for verification email to arrive
  2. Extract the verification link
  3. Navigate to it in the browser
  4. Return the result
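
Step 2 (and the OTP return in the other skills) comes down to pulling a link or code out of an email body. The sketch below illustrates the idea with regular expressions; it is not Lumbox's extraction logic. The keyword list and the six-digit OTP length are assumptions, and all function names are hypothetical.

```python
import re
from typing import List, Optional

_URL = re.compile(r"https?://[^\s\"'<>]+")
# Assumes six-digit codes that are not part of a longer number.
_OTP = re.compile(r"(?<!\d)\d{6}(?!\d)")
# Assumed keywords that suggest a verification or magic link.
_HINTS = ("verify", "confirm", "activate", "magic", "token")


def extract_links(body: str) -> List[str]:
    """Find URLs and strip punctuation that often trails them in prose."""
    return [url.rstrip(".,;:!?)") for url in _URL.findall(body)]


def pick_verification_link(body: str) -> Optional[str]:
    """Prefer a link containing a hint keyword; otherwise fall back to the first link."""
    links = extract_links(body)
    for link in links:
        if any(hint in link.lower() for hint in _HINTS):
            return link
    return links[0] if links else None


def extract_otp(body: str) -> Optional[str]:
    """Return the first standalone six-digit code, if any."""
    match = _OTP.search(body)
    return match.group(0) if match else None
```

Real emails are often HTML, so a production version would likely parse anchor tags rather than scan raw text.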

login_with_credential

The most secure way to log in. It uses Steel's credential injection:

  1. Create browser session with credential auto-injection
  2. Navigate to login page
  3. Steel fills username + password at browser level
  4. Fields blurred (invisible to screenshots)
  5. Form auto-submitted

See Credential Vault for details.

Session Isolation

Pass a different session_id for each parallel task:

browser_navigate(url="https://github.com", session_id="github-signup")
browser_navigate(url="https://twitter.com", session_id="twitter-signup")

Each session gets its own Chrome tab with isolated cookies and state.
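
A client that drives several sessions at once might look like the sketch below. It keys tasks by session_id in a dict, so two tasks cannot share a session by accident. fake_browser_navigate is a hypothetical stand-in for the real MCP tool call, used so the example runs on its own.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def run_in_parallel(navigate: Callable[..., str], tasks: Dict[str, str]) -> Dict[str, str]:
    """Run one navigation per session_id concurrently and collect the results.

    tasks maps session_id -> URL; dict keys are unique, so every task
    gets its own session.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(tasks))) as pool:
        futures = {
            session_id: pool.submit(navigate, url=url, session_id=session_id)
            for session_id, url in tasks.items()
        }
        return {session_id: future.result() for session_id, future in futures.items()}


def fake_browser_navigate(url: str, session_id: str) -> str:
    # Hypothetical stand-in: a real client would invoke the browser_navigate tool.
    return f"[{session_id}] snapshot of {url}"


results = run_in_parallel(
    fake_browser_navigate,
    {"github-signup": "https://github.com", "twitter-signup": "https://twitter.com"},
)
```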

Example: Full Signup Flow

Here's what an AI agent does to sign up for a service:

1. Create inbox:        create_inbox(name="github-bot")
2. Store credentials:   store_credential(service="github.com", identifier="bot@lumbox.co", value="S3cur3P@ss!")
3. Navigate to signup:  browser_navigate(url="https://github.com/signup")
4. Fill email:          browser_type(ref="@e2", text="bot@lumbox.co")
5. Fill password:       use_credential_in_browser(service="github.com", identifier="bot@lumbox.co", ref="@e3")
6. Submit:              browser_click(ref="@e4")
7. Wait for OTP:        get_otp(inbox_id="inb_abc")  → "847291"
8. Enter OTP:           browser_type(ref="@e5", text="847291", press_enter=true)

Or use the skill tool to handle navigation, email entry, submission, and the wait for the OTP (steps 3, 4, 6, and 7) in one call:

skill_signup_with_email(url="https://github.com/signup", inbox_id="inb_abc")
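
For readers who script the manual flow, steps 3 to 8 can be sketched as a sequence of tool calls. The tool names and arguments come from the example above; call_tool is a hypothetical callable standing in for whatever your MCP client uses to invoke a tool, and the refs are passed in because real values come from the latest snapshot.

```python
from typing import Any, Callable, Dict

# Hypothetical signature: call_tool(tool_name, **arguments) -> tool result
ToolCaller = Callable[..., Any]


def signup_flow(
    call_tool: ToolCaller,
    *,
    signup_url: str,
    email: str,
    service: str,
    inbox_id: str,
    refs: Dict[str, str],
) -> str:
    """Run steps 3-8 of the example and return the OTP that was entered."""
    call_tool("browser_navigate", url=signup_url)
    call_tool("browser_type", ref=refs["email"], text=email)
    # The password is injected from the vault, so it never passes through this code.
    call_tool("use_credential_in_browser", service=service,
              identifier=email, ref=refs["password"])
    call_tool("browser_click", ref=refs["submit"])
    otp = call_tool("get_otp", inbox_id=inbox_id)
    call_tool("browser_type", ref=refs["otp"], text=otp, press_enter=True)
    return otp
```

Keeping call_tool as a parameter lets you swap in a recording fake for tests and a real client in production.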

Example: Secure Re-Login

Later, when the agent needs to log back in:

login_with_credential(
  login_url="https://github.com/login",
  service="github.com"
)

Steel auto-fills the stored credentials, blurs the fields, and submits. The password never appears in the conversation.

Production Deployment

Reverse Proxy (Caddy)

browser.yourdomain.com {
    reverse_proxy localhost:3000
}

Security

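  • Never expose port 9223 (CDP) to the internet; it gives full browser control. The Setup examples publish it on all host interfaces, so bind it to localhost (for example "127.0.0.1:9223:9223") or block it at the firewall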
  • Use STEEL_API_KEY for authentication in production
  • Keep Steel behind a reverse proxy with HTTPS
  • Set STEEL_PUBLIC_URL for correct URL resolution in screenshots
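
One way to check the first point is to try a TCP connection to the CDP port from a machine outside your network. The sketch below uses only the Python standard library; is_port_open is a hypothetical helper name.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

Run from an external host, is_port_open("browser.yourdomain.com", 9223) should return False. A True result means the DevTools port is reachable and needs to be locked down.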

Resource Limits

Steel uses roughly 1-2 GB of RAM per active session. For production, set a memory limit sized for the number of concurrent sessions you expect:

deploy:
  resources:
    limits:
      memory: 2G

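Monitor with docker stats steel. With Docker Compose, the container name is prefixed with the project name unless you set container_name: steel, so check docker ps for the actual name.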
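
To size the limit, the per-session figure above can be turned into simple arithmetic. This sketch assumes memory scales linearly with sessions and ignores any fixed baseline for the container itself; both function names are hypothetical.

```python
from typing import Tuple


def estimate_memory_gb(
    sessions: int,
    low_per_session: float = 1.0,
    high_per_session: float = 2.0,
) -> Tuple[float, float]:
    """Return the (low, high) RAM estimate in GB for a number of active sessions."""
    if sessions < 0:
        raise ValueError("sessions must be non-negative")
    return sessions * low_per_session, sessions * high_per_session


def max_sessions(limit_gb: float, per_session_gb: float = 2.0) -> int:
    """Sessions that fit under a memory limit, using the conservative 2 GB figure."""
    return int(limit_gb // per_session_gb)
```

Under these assumptions, the 2G limit shown above leaves room for one session at the conservative estimate, and four parallel sessions would need somewhere between 4 and 8 GB.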