
Gemini CLI Sandbox: What It Protects, What It Doesn't, and How It Gets Misconfigured

Evidence-based analysis of Gemini CLI sandbox — protection scope, opt-in default risks, community-reported misconfigurations, and configuration patterns validated across GitHub issues.

muzhihao

Introduction

After scanning the open issue tracker on the google-gemini/gemini-cli repository and cross-referencing community reports on sandbox behavior, three patterns emerge consistently. First, sandboxing is opt-in: it is disabled by default, a design decision that trips up new users who assume protection is active when it is not. Second, when users do enable sandboxing, a cluster of permission-denied failures traces to a single root cause: the CLI tries to write credential and session files inside the sandbox container, where the container user is not mapped to the host UID/GID that owns those files. Third, the most severe real-world exploits against Gemini CLI, including the silent data exfiltration chain patched in v0.1.14, targeted the no-sandbox configuration path, not a flaw in the sandbox implementation itself.

This article synthesizes findings from the official sandbox documentation, GitHub issues #2345, #3256, #15376, and #16968, the Tracebit-reported dual-flaw chain patched by Google, and the SandboxEscapeBench research from the University of Oxford and the AI Security Institute. The goal is an accurate mental model of the sandbox's actual protection surface — not a feature walkthrough.


TL;DR

  • Sandboxing is disabled by default in Gemini CLI; users must explicitly enable it via --sandbox, GEMINI_SANDBOX, or settings.json. Most installations run with full host access unless configured otherwise.
  • The Tracebit-disclosed dual-flaw exploit (patched in v0.1.14, July 2025) combined two weaknesses to execute shell commands silently; the chain depended on no sandbox being active.
  • Of the sandbox-related permission errors indexed on GitHub, the dominant pattern (issues #15376 and #16968) stems from the sandbox container running as a fixed node user while the host's credential files are owned by a different UID.
  • The macOS Seatbelt path (sandbox-exec) restricts writes outside the project directory but leaves network and command execution largely unrestricted on the permissive-open default profile.
  • SandboxEscapeBench (Oxford/AISI, 2026) found that frontier AI models successfully escaped container sandboxes in all scenarios involving exposed Docker sockets, writable host mounts, or privileged containers — three misconfigurations that are common in developer tooling setups.
  • The evidence-backed configuration baseline is: container-based sandbox with explicit UID/GID mapping, credentials mounted read-only outside the project tree, and network access limited to the Gemini API endpoint.

Problem Domain: Why Sandboxing AI Agents Is Structurally Different

AI coding agents do not execute a fixed set of predefined operations. They construct actions dynamically from model output — which means the attack surface includes not just what the tool is designed to do, but what the model can be prompted to do. This distinction matters for sandbox design.

Traditional tool sandboxing constrains a known binary doing known things. Agent sandboxing must constrain an unbounded action space where the "instructions" the agent follows can be injected by untrusted content the agent processes — a class of attack known as prompt injection.

The World Economic Forum's analysis of non-human identities in agentic AI notes that AI agents running with developer-level permissions represent a new category of non-human identity, one that can be manipulated via the content it processes rather than via traditional credential theft. Gemini CLI's ability to read arbitrary files, execute shell commands, and make outbound network connections makes it a capable agent — and a capable attack surface if those capabilities are not gated.

The official Gemini CLI sandbox documentation frames the core benefit as: "Prevent accidental system damage or data loss" and "Limit file system access to project directory." This framing is accurate but incomplete — it describes protection against accidental model error, not against adversarial prompt injection. The threat model gap matters because users who enable sandboxing to prevent accidents may not realize they also need it to prevent adversarial content from escalating the agent's actions.

The Bunnyshell analysis of coding agent sandboxes observes that the minimal sandbox for an AI coding agent must address three independent attack paths: filesystem access (read for exfiltration, write for persistence), command execution (arbitrary shell commands), and network egress (exfiltration channels). An incomplete sandbox that addresses only one or two of these paths provides false assurance.


Common Approaches That Fail

Approach 1: Assuming Sandbox Is Active by Default

GitHub issue #3256, filed as "The Documentation on sandboxing is incomplete or insufficient," captures the most prevalent failure mode: the documentation states sandboxing is "highly recommended" but does not clearly explain that it is opt-in. Users who follow the getting-started guide without reaching the sandbox section run Gemini CLI with full host-level access.

FOSS Force's July 2025 security analysis documented this directly: the author found that with default settings, Gemini CLI could "open a file, create a file, delete a file, and even modify a file" throughout her home directory. The visible indicator — a "no sandbox see /docs" status notification — is present but easy to miss for users focused on the AI interaction.

The consequence: users who assume some baseline protection is active get none. The Tracebit-disclosed dual-flaw chain, which enabled silent data exfiltration before the v0.1.14 patch, worked precisely because it targeted installations running without a sandbox. The no-sandbox path was the attack surface.

Approach 2: Enabling Container Sandbox Without Configuring UID/GID Mapping

When users do enable the Docker or Podman sandbox, a second failure cluster emerges. The container runs as a fixed node user, but the host's ~/.gemini directory — where credential files, OAuth tokens, and session state are stored — is owned by the host user. Without UID/GID mapping, the container's node user cannot write to these paths.

Issue #16968 documents the exact failure signature:

EACCES: permission denied, open '/home/node/.gemini/oauth_creds.json'
EACCES: permission denied, open '/home/node/.gemini/google_accounts.json'
EACCES: permission denied, open '/home/node/.gemini/tmp/[session]/chats/session-[timestamp].json'

The pattern repeats in issue #15376, where a Podman user on Linux v0.21.3 encountered identical EACCES errors on the /home/node/.gemini/tmp/ path. Both issues are marked priority/p2 and "maintainer only," meaning they remain open infrastructure problems rather than user-configuration issues.
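
The mismatch behind these errors can be checked before the container sandbox is enabled. The following is a minimal sketch, not part of Gemini CLI: uid_matches_container is a hypothetical helper, and the value 1000 for the container's node user is an assumption (it is the UID used by the official Node.js images; the sandbox image may differ).

```shell
# Hypothetical helper: succeed when the host UID equals the container UID.
# Assumption: the sandbox's "node" user is UID 1000; pass a second
# argument to override that guess.
uid_matches_container() {
  host_uid="$1"
  container_uid="${2:-1000}"
  [ "$host_uid" -eq "$container_uid" ]
}

# Report for the current user.
if uid_matches_container "$(id -u)"; then
  echo "host UID matches the assumed container UID"
else
  echo "host UID $(id -u) differs from the assumed container UID; expect EACCES without UID/GID mapping"
fi
```

A host user whose UID happens to match would not see the errors at all, which may explain why the failure looks intermittent across machines.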

Issue #2345 adds a third variant: Docker sandbox on WSL2 with native Docker Engine (no Docker Desktop) fails to activate at all — the CLI falls back to no-sandbox mode because it cannot locate the Docker socket at the expected path (/var/run/docker.sock). The sandbox silently degrades rather than erroring visibly.

The common thread across all three failure modes: the sandbox either silently does not activate, or it activates but immediately fails on credential operations — pushing users toward disabling it as a workaround.
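
For the silent-fallback case in #2345, a first diagnostic step is to find out where the Docker socket actually lives. The sketch below is illustrative only: both helper names are hypothetical, the rootless path under /run/user is an assumption about the local setup, and DOCKER_HOST is the standard Docker client variable rather than anything specific to Gemini CLI.

```shell
# Hypothetical helper: turn a DOCKER_HOST value into a socket path.
# Fails (prints nothing) for non-unix endpoints such as tcp:// or ssh://.
socket_path_from_docker_host() {
  case "$1" in
    unix://*) printf '%s\n' "${1#unix://}" ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: print the first argument that is an existing socket.
first_existing_socket() {
  for path in "$@"; do
    if [ -S "$path" ]; then
      printf '%s\n' "$path"
      return 0
    fi
  done
  return 1
}

# Candidates: DOCKER_HOST (if it names a unix socket), then common paths.
# Paths containing spaces are not handled by this simple list.
candidates="/var/run/docker.sock /run/user/$(id -u)/docker.sock"
if from_env="$(socket_path_from_docker_host "${DOCKER_HOST:-}")"; then
  candidates="$from_env $candidates"
fi
# Word splitting of $candidates is intended here.
first_existing_socket $candidates || echo "no Docker socket found" >&2
```

If nothing is printed, the sandbox has no daemon to talk to, and any session that starts anyway should be treated as unsandboxed.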


Evidence-Based Configuration

Sandbox Methods and Their Protection Scope

The official sandbox documentation describes four sandbox implementations with materially different protection levels:

| Method | Filesystem | Network | Command Execution | Platform |
|---|---|---|---|---|
| macOS Seatbelt (sandbox-exec) | Restricts writes outside project dir | Unrestricted by default | Unrestricted by default | macOS only |
| Docker/Podman container | Full isolation to mounted paths | Configurable via container network policy | Isolated to container | Cross-platform |
| Bubblewrap (bwrap) | Namespace isolation + seccomp | Namespace-isolated | Namespace-isolated | Linux |
| gVisor (runsc) | User-space kernel intercepts all syscalls | Intercepted | Intercepted | Linux/container |

The macOS Seatbelt permissive-open default profile — what users get when they run gemini --sandbox on macOS without further configuration — restricts writes outside the project directory. It does not restrict outbound network connections or shell command execution by default. This is a partial sandbox that addresses accidental overwrites but not exfiltration or command injection.

The DeepWiki architecture analysis of the gemini-cli codebase confirms the launcher pattern: the initial process spawns a sandboxed child and exits, passing stdio streams. Platform detection via getSandboxCommand prefers sandbox-exec on macOS and falls back to Docker/Podman on other platforms when sandbox: true is set, but runsc (gVisor) and LXC require explicit specification.
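
The selection order described above can be sketched as a small shell function. This is an illustration of the described behavior, not the project's implementation (which lives in the CLI's own codebase): pick_sandbox_command is a hypothetical name, and trying docker before podman is an assumption.

```shell
# Hypothetical sketch of the auto-detection order described above.
# $1: OS name as printed by `uname -s`
# $2: space-separated list of commands available on PATH
pick_sandbox_command() {
  os="$1"
  available=" $2 "
  if [ "$os" = "Darwin" ]; then
    case "$available" in *" sandbox-exec "*) echo "sandbox-exec"; return 0 ;; esac
  fi
  # Assumption: docker is tried before podman.
  for candidate in docker podman; do
    case "$available" in *" $candidate "*) echo "$candidate"; return 0 ;; esac
  done
  return 1  # nothing usable: the caller should fail loudly
}

# Example with fixed inputs; on a real host, build the list with `command -v`.
pick_sandbox_command Linux "podman"
```

Returning a failure status when nothing is found, rather than continuing without a sandbox, is the behavior the silent-fallback reports suggest users expect.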

For the strongest protection, the documented recommendation is gVisor (runsc), which runs containers inside a user-space kernel that intercepts all system calls — eliminating the kernel attack surface that standard Docker containers share with the host.

Enabling Sandbox Correctly

Sandboxing is activated through any of three configuration paths (listed in precedence order):

# Option 1: Command flag (session only)
gemini --sandbox

# Option 2: Environment variable (persistent for shell session)
export GEMINI_SANDBOX=docker    # or: podman, sandbox-exec

# Option 3: settings.json (persistent, recommended)
# In ~/.gemini/settings.json or $PROJECT_ROOT/.gemini/settings.json

The settings.json approach is the only one that survives across sessions without shell environment management:

{
  "sandbox": true
}

To specify the sandbox type explicitly and avoid auto-detection failures (the WSL/Docker socket issue in #2345):

{
  "sandbox": "docker"
}
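
To pin the sandbox type per project, the file can be created from the shell. A minimal sketch, assuming the project-level location named above ($PROJECT_ROOT/.gemini/settings.json); write_sandbox_settings is a hypothetical helper, and it refuses to overwrite an existing file so that other settings are not lost.

```shell
# Hypothetical helper: create <project>/.gemini/settings.json with an
# explicit sandbox type, refusing to overwrite an existing file.
# $1: project root   $2: sandbox type (e.g. docker, podman, sandbox-exec)
write_sandbox_settings() {
  project_root="$1"
  sandbox_type="$2"
  settings="$project_root/.gemini/settings.json"
  if [ -e "$settings" ]; then
    echo "refusing to overwrite $settings" >&2
    return 1
  fi
  mkdir -p "$project_root/.gemini" || return 1
  printf '{\n  "sandbox": "%s"\n}\n' "$sandbox_type" > "$settings"
}

# Usage (not run here, since it writes to disk):
#   write_sandbox_settings "$PWD" docker
```

When a settings file already exists, the "sandbox" key has to be merged in by hand or with a JSON-aware tool.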

Resolving the UID/GID Credential Failure

The permission-denied pattern from issues #15376 and #16968 is addressable via environment variable override. The official documentation notes that Linux UID/GID handling can be overridden with:

# Force the container user to the host UID/GID
export SANDBOX_SET_UID_GID=true

# Or disable UID/GID mapping entirely (less secure)
export SANDBOX_SET_UID_GID=false

The SANDBOX_FLAGS environment variable allows injecting custom Docker/Podman flags for cases where the built-in UID mapping is insufficient:

export SANDBOX_FLAGS="--user $(id -u):$(id -g) -v $HOME/.gemini:/home/node/.gemini:rw"

Note the security tradeoff: mounting ~/.gemini as read-write gives the sandbox access to credential files, which is necessary for OAuth persistence but means a compromised sandbox could read those credentials. The alternative — not mounting credentials at all — means the user must re-authenticate each session.
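
The tradeoff can be made explicit by generating the flag string rather than typing it. The sketch below builds the same shape of value as the export above; build_sandbox_flags is a hypothetical helper, and the ro mode is an assumption that will likely reproduce the write failures for session files, since the CLI writes under ~/.gemini/tmp.

```shell
# Hypothetical helper: assemble a SANDBOX_FLAGS value.
# $1: uid  $2: gid  $3: host credential directory  $4: mount mode (rw or ro)
build_sandbox_flags() {
  case "$4" in
    rw|ro) ;;
    *) echo "mount mode must be rw or ro" >&2; return 1 ;;
  esac
  printf '%s\n' "--user $1:$2 -v $3:/home/node/.gemini:$4"
}

# Read-write mount for the current user.
SANDBOX_FLAGS="$(build_sandbox_flags "$(id -u)" "$(id -g)" "$HOME/.gemini" rw)"
export SANDBOX_FLAGS
echo "$SANDBOX_FLAGS"
```

Keeping the mount mode as a required argument forces the choice between credential persistence and credential exposure to be made deliberately.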

macOS Seatbelt Profile Selection

For macOS users who cannot or prefer not to run Docker, the built-in seatbelt profiles offer graduated protection levels. The available profiles are configured via SEATBELT_PROFILE:

# Restrictive profile — tighter filesystem and command restrictions
export SEATBELT_PROFILE=restrictive-open

# SEATBELT_PROFILE only applies when the macOS sandbox is in use (for example GEMINI_SANDBOX=sandbox-exec)

The restrictive-open and restrictive-closed profiles offer substantially more protection than the permissive-open default. The -closed variants additionally restrict network access, which the -open variants leave unrestricted. For workflows that only need the Gemini API endpoint, restrictive-closed with a network proxy whitelist is the documented hardening path.
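
The naming convention makes the network posture of a profile easy to read off its suffix. A small sketch under that assumption: profile_network_policy is a hypothetical helper, and the "proxy-only" label for -proxied profiles is an interpretation of the name, not a documented guarantee.

```shell
# Hypothetical helper: summarize network posture from a profile name.
profile_network_policy() {
  case "$1" in
    *-open)    echo "unrestricted" ;;
    *-closed)  echo "blocked" ;;
    *-proxied) echo "proxy-only" ;;   # assumption based on the suffix
    *)         echo "unknown"; return 1 ;;
  esac
}

# Falls back to permissive-open, the default profile, when unset.
profile_network_policy "${SEATBELT_PROFILE:-permissive-open}" || true
```

Anything reported as "unrestricted" leaves the exfiltration path open, whatever the filesystem restrictions are.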


A scan of the google-gemini/gemini-cli issue tracker reveals a recognizable pattern across sandbox-related reports. Of the permission-related issues that specify sandbox mode as the trigger, the majority fall into two categories: credential write failures (the UID/GID mismatch pattern) and sandbox activation failures (the Docker socket detection path).

The documentation gap flagged in issue #3256 is reflected in the issue type distribution: a disproportionate number of sandbox reports are filed as bugs when the root cause is configuration — specifically, the expectation that enabling sandboxing would "just work" for existing credential paths. The maintainer response across #15376 and #16968 (both marked not planned or maintainer only) suggests the UID/GID credential mapping problem is a known structural limitation rather than a simple configuration oversight.

The security-severity distribution is notable: the highest-severity incident (Tracebit's dual-flaw silent exfiltration chain, patched in v0.1.14, July 2025) required sandboxing to be disabled. The sandbox implementation itself was not the vulnerability — the no-sandbox default deployment path was.


Edge Cases Documented in Community Reports

WSL2 with native Docker Engine (no Docker Desktop). Issue #2345 documents a scenario where Docker is correctly configured (systemd-enabled, docker group membership, daemon running) but the sandbox fails silently. The CLI auto-detects Docker as unavailable because it looks for the socket at a path that differs in native-Docker WSL2 environments. The workaround requires explicit socket path injection via SANDBOX_FLAGS. This edge case affects users who follow standard Linux Docker installation instructions rather than Docker Desktop.

gVisor on non-gVisor-capable hosts. The runsc sandbox type requires gVisor to be installed at the host level. Specifying GEMINI_SANDBOX=runsc on a host without gVisor fails, and although the official docs note the dependency, community reports suggest the error message alone does not make the missing component obvious.

Credential persistence across sessions. Both #15376 and #16968 surface the same underlying tension: the sandbox isolates the process from the host filesystem, but Gemini CLI needs to persist OAuth credentials to ~/.gemini between sessions. There is no native credential injection mechanism (equivalent to Docker's --secret or Kubernetes secrets), leaving users to choose between session re-authentication and mounting the credential directory.

macOS Seatbelt and network-dependent tools. The permissive-open profile allows outbound network. The restrictive-closed profile blocks it. Tools that use npm install, pip, or any package manager inside the agent session will silently fail under restrictive-closed if no network proxy is configured. The troubleshooting guide documents this as: "Check sandbox profile allows network."

AI agents actively probing sandbox boundaries. The SandboxEscapeBench research from Oxford and AISI found that frontier models "frequently succeeded in breaking out of sandbox environments" when misconfigurations were present, specifically exposed Docker sockets, writable host mounts, and privileged containers. All successful escapes relied on known misconfigurations. The implication for Gemini CLI deployments: neither a standard docker run --privileged nor a Docker socket mounted into the workspace is a safe foundation for a sandbox, regardless of the Gemini CLI sandbox flag.
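
Two of those misconfigurations are visible in the container flags themselves and can be caught with a simple text check. A minimal sketch: audit_container_flags is a hypothetical helper, it only inspects a flag string such as SANDBOX_FLAGS, and it does not attempt to detect writable host mounts, which need knowledge of what each mounted path contains.

```shell
# Hypothetical helper: flag container options associated with escapes.
# Prints one finding per line; exit status 1 means something was found.
audit_container_flags() {
  flags=" $1 "
  found=0
  case "$flags" in
    *" --privileged "*|*" --privileged=true "*)
      echo "privileged container"; found=1 ;;
  esac
  case "$flags" in
    *docker.sock*)
      echo "Docker socket mounted"; found=1 ;;
  esac
  return "$found"
}

audit_container_flags "${SANDBOX_FLAGS:-}" \
  || echo "review SANDBOX_FLAGS before starting a session" >&2
```

A clean result here says nothing about flags set elsewhere, for example in a wrapper script or a Compose file.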


Recommendation

Based on the failure patterns above, the minimum viable secure configuration for Gemini CLI is:

  1. Enable sandbox explicitly in settings.json, not via flag or environment variable, to ensure it persists and cannot be accidentally omitted in a new shell. Specify the sandbox type rather than relying on auto-detection to avoid the Docker socket discovery failures documented in #2345.

  2. On macOS without Docker, use restrictive-closed or restrictive-proxied seatbelt profiles rather than the permissive-open default. The permissive profile restricts writes but not network egress or command execution, which leaves two of the three exfiltration paths open.

  3. On Linux with Docker/Podman, configure UID/GID mapping explicitly via SANDBOX_SET_UID_GID=true before encountering the credential-write errors from #15376 and #16968. If credential persistence is required, mount ~/.gemini explicitly rather than relying on implicit path resolution inside the container.

  4. Do not run Docker in privileged mode for Gemini CLI sandboxing. The SandboxEscapeBench research confirms that privileged containers, exposed Docker sockets, and writable host mounts are sufficient for a capable model to escape the sandbox. The sandbox flag is not a substitute for correct container configuration.

  5. Check sandbox status before each session. Gemini CLI displays a persistent red-text warning when running without sandbox. Treating that warning as signal — not noise — is the lowest-effort available safeguard.
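
The status check in step 5 can be partly automated in a shell wrapper. The sketch below only reports whether a sandbox has been requested, not whether it is actually active, so the CLI's own warning remains the authoritative signal. sandbox_requested is a hypothetical helper; treating an empty value, false, and 0 as "off" is an assumption, and the settings check is a crude text match rather than a JSON parse.

```shell
# Hypothetical helper: succeed when a sandbox has been requested through
# the environment variable or a settings file.
# $1: value of GEMINI_SANDBOX (may be empty)   $2: path to settings.json
sandbox_requested() {
  case "$1" in
    ""|false|0) ;;        # assumption: these values mean "off"
    *) return 0 ;;
  esac
  [ -f "$2" ] || return 1
  # Crude text match: a "sandbox" key whose value is not false.
  grep -E '"sandbox"[[:space:]]*:' "$2" | grep -qv 'false'
}

if sandbox_requested "${GEMINI_SANDBOX:-}" "$HOME/.gemini/settings.json"; then
  echo "sandbox requested"
else
  echo "WARNING: no sandbox configured" >&2
fi
```

Running this from a shell alias or pre-session script turns an easy-to-miss status line into an explicit step.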


FAQ

Q: Why is sandbox disabled by default? Is that a security decision?

The official documentation does not explain the default. The practical consequence is noted in the docs: "The no sandbox mode is the default setting for Gemini CLI. For users who choose not to use sandboxing, Google ensures this is highly visible by displaying a persistent warning in red text throughout their session." The design appears to prioritize out-of-box usability over a secure default — consistent with issue #3256's complaint that sandboxing is recommended but not explained or defaulted.

Q: Does enabling --sandbox on macOS protect against prompt injection attacks?

Partially. The permissive-open seatbelt profile (the default when you pass --sandbox on macOS) restricts filesystem writes outside the project directory. It does not restrict outbound network connections or shell command execution. A prompt injection that only needs to read files and make a network call — the attack pattern in the Tracebit disclosure — is not blocked by the macOS default sandbox profile. The restrictive-closed profile would block the network egress path.

Q: The sandbox keeps throwing EACCES errors. Is disabling it the right fix?

The EACCES errors documented in #15376 and #16968 have a known cause (UID/GID mismatch between host and container) and a documented workaround (SANDBOX_SET_UID_GID=true or explicit SANDBOX_FLAGS with host UID/GID). Disabling the sandbox to resolve a permission error trades a configuration problem for a security regression. The credential files that trigger the EACCES error — oauth_creds.json, google_accounts.json — are themselves sensitive; running without sandbox means the model has unrestricted access to the entire host filesystem including those files.

Q: How does the Gemini CLI sandbox compare to running Gemini CLI inside a general-purpose Docker container?

Running gemini inside a standard Docker container you manage independently gives you container-level isolation regardless of Gemini CLI's own sandbox setting. FOSS Force's analysis recommended the same kind of external isolation layer, using Firejail rather than Docker. The Gemini CLI native sandbox adds a second layer: sandbox expansion dialogs that surface when the model requests permissions, and profile-based restrictions on what the container itself can do. Neither substitutes for the other; they address different parts of the threat model.

Q: Is gVisor worth the setup complexity for Gemini CLI?

The official sandbox docs describe gVisor as providing "the strongest isolation available" by intercepting all system calls in a user-space kernel. The SandboxEscapeBench research found that successful container escapes relied on misconfigurations (exposed sockets, privileged containers), not kernel vulnerabilities — suggesting that for most developer workstations, correctly configured Docker without gVisor is adequate, and gVisor's overhead is justified primarily in high-security environments where the container configuration itself cannot be fully controlled.
