Sandboxing AI Agents: Running OpenClaw in Docker So It Doesn't Nuke Your Codebase

By Asif Foysal Meem

Let me paint you a picture: you've got an AI agent that can execute shell commands, browse the web, and modify files on your machine. Exciting, right? Also terrifying.

That's the reality of tools like OpenClaw — an AI-powered terminal automation agent that can interact with your codebase, run commands, and even chat with you over Telegram. It's incredibly powerful, but handing an AI agent unrestricted access to your filesystem feels like giving your car keys to a raccoon.

Hot take 🔥 — If you wouldn't chmod 777 / on your production server, maybe don't give an AI agent root access to your dev machine either.

I could've gone the route a lot of people are taking — drop ~$1000 AUD on a Mac Mini dedicated to running AI agents. But I'd rather spend that money on literally anything else. Docker gives us the same isolation for free, and it's portable.

So I decided to sandbox it. Here's how I set up OpenClaw inside a Docker container with proper security constraints, and all the delightful issues I ran into along the way.

Why Sandbox an AI Agent?

OpenClaw is designed to be an autonomous coding assistant. It can:

  • Execute arbitrary shell commands
  • Read and write files in your workspace
  • Browse the web
  • Communicate via Telegram

That's a lot of surface area. Even with the best intentions, an AI agent with unrestricted filesystem access could accidentally delete files, overwrite configs, or — in the worst case — exfiltrate sensitive data. Sandboxing it inside Docker gives us isolation while still letting it do useful work.

The goal was simple:

  • Mount only my project repo with read/write access
  • Run as a non-root user
  • Drop all Linux capabilities
  • Keep high-risk tools on an explicit allowlist
  • Connect to my local Ollama instance for model inference

The Docker Setup

The Dockerfile

The first decision was the base image. OpenClaw is a Node.js application, so node:22-slim was the natural choice. I also needed Python tooling (since my project is a FastAPI app) and uv for dependency management:

FROM node:22-slim

RUN apt-get update && apt-get install -y --no-install-recommends \
    python3 python3-pip python3-venv python3-dev \
    git curl ca-certificates build-essential cmake \
    && rm -rf /var/lib/apt/lists/*

# uv for Python dependency management
COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv

# pnpm for the Next.js frontend
RUN corepack enable && corepack prepare pnpm@latest --activate

# Non-root user
USER node
ENV PATH="/home/node/.npm-global/bin:/home/node/.local/bin:${PATH}"

# Pre-seed the OpenClaw security config
COPY --chown=node:node openclaw.json /home/node/.openclaw/openclaw.json

WORKDIR /workspace
CMD ["bash"]

Notice what's not in there? OpenClaw itself. That was lesson number one.

The Security Config

OpenClaw uses a JSON config file at ~/.openclaw/openclaw.json. Here's the locked-down version:

{
  "agents": {
    "defaults": {
      "sandbox": {
        "mode": "all",
        "scope": "session",
        "workspaceAccess": "rw"
      },
      "tools": {
        "allow": ["exec", "browser", "web_fetch"],
        "deny": []
      }
    }
  },
  "gateway": {
    "auth": { "mode": "token" },
    "bind": "loopback",
    "port": 18789
  }
}

Key decisions:

  • "mode": "all" — Every tool runs in sandbox mode. No exceptions.
  • scope: "session" — Sandbox state resets between sessions. No persistent side effects.
  • workspaceAccess: "rw" — Only the mounted workspace is writable.
  • Explicit tool allowlist — Only exec, browser, and web_fetch. Everything else is denied by default.
  • Token auth — Gateway requires authentication. No open access.
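A config like this gets baked into the image, so a typo in a key name can silently drop a restriction. Before building, it's worth sanity-checking the file against the baseline above. Here's a minimal sketch — the file path and check script are my own, not part of OpenClaw:

```python
import json

# Hypothetical path: wherever your pre-seeded openclaw.json sits in the build context.
CONFIG_PATH = "openclaw.json"

def check_config(cfg: dict) -> None:
    """Assert the security-relevant settings match the locked-down baseline."""
    defaults = cfg["agents"]["defaults"]
    assert defaults["sandbox"]["mode"] == "all", "every tool must run sandboxed"
    assert defaults["sandbox"]["scope"] == "session", "sandbox state must reset per session"
    assert defaults["sandbox"]["workspaceAccess"] == "rw", "only the workspace is writable"
    assert set(defaults["tools"]["allow"]) == {"exec", "browser", "web_fetch"}
    assert cfg["gateway"]["auth"]["mode"] == "token", "gateway must require auth"
    print("config OK")

if __name__ == "__main__":
    with open(CONFIG_PATH) as f:
        check_config(json.load(f))
```

Run it in the build context before docker compose build; a missing or misspelled key fails loudly instead of shipping a weaker sandbox.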

The Compose File

services:
  openclaw:
    build: docker/openclaw
    user: '1000:1000'
    cap_drop: [ALL]
    ports: ['18789:18789']
    volumes:
      - .:/workspace:rw
      - openclaw-home:/home/node
    extra_hosts: ['host.docker.internal:host-gateway']
    environment: [OLLAMA_HOST=http://host.docker.internal:11434]
    working_dir: /workspace
    stdin_open: true
    tty: true

volumes:
  openclaw-home:

A few things worth calling out:

  • cap_drop: [ALL] — Drops every Linux capability. The container can't mount filesystems, change ownership, or do anything privileged.
  • user: "1000:1000" — Runs as the node user, not root.
  • Bind mount for the workspace — .:/workspace:rw means the container sees your project directory directly. Changes are bidirectional and immediate.
  • Named volume for home — openclaw-home persists OpenClaw's state (config, sessions, installed skills) across container restarts without exposing the host filesystem.
  • host.docker.internal — Bridges the container to host services like Ollama.
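One nice consequence of setting OLLAMA_HOST is that the model endpoint is just an environment variable, so you can probe the host bridge from inside the container before blaming OpenClaw for a failed connection. This is my own helper, not part of OpenClaw; it assumes Ollama's /api/version endpoint:

```python
import os
import urllib.request

def ollama_base() -> str:
    # Falls back to the value set in the compose file's environment block.
    return os.environ.get("OLLAMA_HOST", "http://host.docker.internal:11434")

def ollama_reachable(timeout: float = 2.0) -> bool:
    """True if the host's Ollama answers its version endpoint through the bridge."""
    try:
        with urllib.request.urlopen(ollama_base() + "/api/version", timeout=timeout):
            return True
    except OSError:
        return False
```

If this returns False inside the container but curl works on the host, the problem is the host-gateway bridge (or Ollama binding only to 127.0.0.1), not the agent.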

The Install Saga

Attempt 1: Install During Build (Failed)

My first instinct was to install OpenClaw in the Dockerfile:

RUN curl -fsSL https://openclaw.ai/install.sh | bash

This blew up immediately. The install script runs an interactive onboarding flow that tries to read from /dev/tty — which doesn't exist during docker build. There's no TTY, no stdin, nothing. The script just hangs and fails.

error: /dev/tty: No such device or address
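The failure mode is easy to reproduce: interactive installers check for a controlling terminal before launching their wizard, and under docker build there is none. A sketch of that check — my own illustration, not OpenClaw's actual install script:

```python
import os

def has_controlling_tty() -> bool:
    """Return True if /dev/tty can be opened — the kind of check that fails under `docker build`."""
    try:
        fd = os.open("/dev/tty", os.O_RDWR)
    except OSError:
        # "No such device or address" (ENXIO) is exactly the error the install hit.
        return False
    os.close(fd)
    return True
```

In an interactive shell this returns True; inside a RUN step there's no controlling terminal to open, which is why the install had to move to runtime.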

Attempt 2: Suppress the Onboarding (Wrong Approach)

I tried appending || true to ignore the failure:

RUN curl -fsSL https://openclaw.ai/install.sh | bash || true

This technically "worked" — the build completed — but OpenClaw was left in a half-configured state. The binary was there but onboarding never ran, so nothing was actually set up.

Attempt 3: Let the User Install Interactively (The Right Way)

The solution was embarrassingly obvious: don't install OpenClaw during build. Instead, build the image with all the prerequisites and let the user install interactively after starting the container:

docker compose -f docker-compose.openclaw.yml run --rm openclaw bash
# Inside the container:
curl -fsSL https://openclaw.ai/install.sh | bash

This way the onboarding flow gets a real TTY, can ask questions, do the GitHub Copilot device auth flow, set up Telegram, configure skills — the whole nine yards.

The openclaw-home named volume means this only needs to happen once. After the initial setup, everything persists across container restarts. Kill the container, start a new one, OpenClaw is still there with all your config intact.

Post-Install Issues

The Gateway Won't Start

After installation, I tried starting the gateway:

openclaw gateway start
Gateway service check failed: Error: systemctl --user unavailable: spawn systemctl ENOENT

OpenClaw tries to register itself as a systemd user service. Docker containers don't have systemd (nor should they). The fix: run the gateway directly in the foreground:

openclaw gateway run

This works perfectly and logs straight to stdout — which is actually more useful in a container context anyway.

Config Overwrite

Something to watch out for: the OpenClaw installer overwrites ~/.openclaw/openclaw.json during onboarding with its own defaults. If you pre-seeded a security config (like we did in the Dockerfile), it'll be replaced.

The installer does create a backup at openclaw.json.bak, so you can restore it if needed. But honestly, the onboarding configures most of the same settings interactively anyway — sandbox mode, tool allowlists, gateway auth. Just make sure the final config matches your security requirements before you start using it for real work.
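For belt-and-braces, you can re-assert the security keys after onboarding by merging them from the backup into the live config, without clobbering anything the installer set up. A hedged sketch — the paths follow this setup, and the merge logic is mine, not an OpenClaw feature:

```python
import json

# Paths as used in this setup; adjust if your home volume differs.
LIVE = "/home/node/.openclaw/openclaw.json"
BACKUP = "/home/node/.openclaw/openclaw.json.bak"

def restore_security_keys(live: dict, backup: dict) -> dict:
    """Copy the sandbox and tool-allowlist settings from the backup over the live
    config, leaving everything else the installer wrote untouched."""
    live.setdefault("agents", {}).setdefault("defaults", {})
    backup_defaults = backup.get("agents", {}).get("defaults", {})
    for key in ("sandbox", "tools"):
        if key in backup_defaults:
            live["agents"]["defaults"][key] = backup_defaults[key]
    return live

if __name__ == "__main__":
    with open(LIVE) as f:
        live = json.load(f)
    with open(BACKUP) as f:
        backup = json.load(f)
    with open(LIVE, "w") as f:
        json.dump(restore_security_keys(live, backup), f, indent=2)
```

Run it once after onboarding finishes, then eyeball the final file to confirm it matches your security requirements.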

Copilot Token Exchange 403

After getting everything running, sending a message to the Telegram bot returned:

Embedded agent failed before reply: Copilot token exchange failed: HTTP 403

During onboarding, I connected OpenClaw to my GitHub Copilot subscription via the device code auth flow. The default model was set to github-copilot/gpt-5.2-codex. But the token exchange was failing inside the container.

Worth noting — you don't need a Copilot subscription for this. OpenClaw supports free models too, including locally-hosted ones via Ollama (which is what the OLLAMA_HOST environment variable in our compose file is for). If you're running Ollama on your host machine, you can point OpenClaw at it and skip the Copilot auth entirely.

To switch models, run:

openclaw configure --section model

For my setup, re-running the configure flow and re-authenticating with GitHub resolved the 403. But if you'd rather avoid external API dependencies altogether, Ollama is the cleaner path — no tokens to manage, everything stays local.

Telegram Pairing

Before you can use OpenClaw via Telegram, you need a bot. Head to @BotFather on Telegram and run /newbot. Give it a name, grab the API token, and paste it into OpenClaw during onboarding (or later via openclaw configure --section telegram).

Once the bot is running, the first message you send it will return something like:

OpenClaw: access not configured.
Your Telegram user id: XXXXXXXXXX
Pairing code: XXXXXXXX

This is expected — OpenClaw uses a pairing flow for security. But since the gateway was running in the foreground in the only shell, approving the pairing required a second terminal:

docker compose -f docker-compose.openclaw.yml exec openclaw \
  openclaw pairing approve telegram <PAIRING_CODE>

Alternatively, you can background the gateway with openclaw gateway run & and approve from the same shell. The second terminal approach is cleaner.

The Bind Mount: Changes Are Real

One thing that's worth emphasizing: this setup uses a bind mount, not a copy. When OpenClaw edits a file inside /workspace, it's editing your actual codebase on the host machine. There's no copy, no sync delay, no intermediate layer.

volumes:
  - .:/workspace:rw

This is intentional — the whole point is to let OpenClaw work on your real code. But it also means there's no undo button. If the AI agent decides to rm -rf src/, that's happening to your real files. The sandbox mode and capability dropping mitigate this, but it's worth keeping in mind. Git is your safety net here.
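Since git is the safety net, it pays to confirm the working tree is clean before handing the agent a session — then every change the agent makes shows up in git status and can be reverted in one step. A small helper (my own, assuming the workspace is a git repo):

```python
import subprocess

def workspace_is_clean(path: str = "/workspace") -> bool:
    """True if `git status --porcelain` reports no staged, unstaged, or untracked changes."""
    result = subprocess.run(
        ["git", "-C", path, "status", "--porcelain"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip() == ""
```

Check it before starting the gateway; if it returns False, commit or stash first so rolling back the agent's edits stays a one-command operation.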

Key Takeaways

  1. Don't install interactive tools during docker build — If an installer needs a TTY, let the user run it after container start. Use named volumes to persist the installation.

  2. Systemd doesn't exist in containers — And that's fine. Run services in the foreground. It's actually better for logging.

  3. Pre-seeded configs will get overwritten — Always have a restore strategy. The backup file pattern (*.bak) is your friend.

  4. cap_drop: [ALL] is your baseline — Start with zero capabilities and add back only what's needed. For an AI coding agent, you need surprisingly little.

  5. host.docker.internal bridges the gap — When your AI agent needs to talk to services on the host (like Ollama), this is the clean way to do it. No --network host required.

  6. Bind mounts are bidirectional — Changes inside the container are immediately visible on the host and vice versa. This is a feature, not a bug, but treat it with respect.

What's Next

The container is running, the Telegram bot is connected, and the workspace is mounted. The remaining piece is getting the Copilot model auth stable — or just switching to a locally-hosted model via Ollama, which honestly might be the better play for a sandboxed environment anyway. No external API calls, no tokens to manage, everything stays local.

If you're running AI agents on your machine, seriously consider sandboxing them. The 30 minutes of Docker setup is nothing compared to the peace of mind of knowing your agent can't accidentally rm -rf your home directory.

Stay safe out there!