
Building Custom Tools for Gemini CLI: A Developer's Cookbook

Learn how to build, test, and publish custom tools for Gemini CLI. Three complete tool examples with source code, testing patterns, and distribution workflow.

Zhihao Mu
Updated: April 12, 2026 · 25 min read

Introduction

The default Gemini CLI installation is already capable: it can read and write files, run shell commands, search the web, and reason across long context windows. But real-world engineering workflows are never that generic. You have internal APIs with custom authentication schemes, proprietary database schemas, bespoke file formats, and CI/CD pipelines that no off-the-shelf tool knows about.

That gap is exactly where custom tools live. Gemini CLI's tool system lets you extend the model's capabilities with first-class, schema-validated, testable functions that behave identically to the built-in ones. The model discovers your tools at startup, understands their inputs and outputs from the JSON Schema you provide, and calls them with the same confidence it would call a native capability.

This cookbook is organised around three complete, runnable recipes: a file-content analyzer, a database query tool, and a REST API integration with authentication. Each recipe progresses from a minimal skeleton to production-grade code covering error handling, testing, and security hardening. The final sections cover the full lifecycle — unit testing, end-to-end verification, packaging, and publishing to npm so other teams can consume your tools.

If you have built a Gemini CLI MCP server before, note that custom tools covered here are lighter-weight integrations that run in-process rather than as a separate server process. Use tools when you want a single self-contained function; use MCP servers when you need stateful sessions or a catalogue of many related capabilities.


TL;DR

  • Custom tools are TypeScript functions registered with a name, description, and a JSON Schema input definition.
  • The model reads the description and schema to decide when and how to call your tool — write them carefully.
  • Three complete recipes: file analyzer, database query, REST API integration.
  • Test tools in isolation with a thin harness; mock the external dependency, not the tool itself.
  • Publish to npm with gemini-cli-tool- prefix convention so users can install with a single command.

Understanding the Tool API

Tool Registration Interface

Every custom tool is a plain object that satisfies the GeminiTool interface exported from @google/gemini-cli-core:

import type { GeminiTool, ToolContext, ToolResult } from "@google/gemini-cli-core";

const myTool: GeminiTool = {
  // The name the model uses to invoke the tool. Use snake_case.
  name: "my_tool",

  // The description the model reads to decide whether to call this tool.
  // Be specific: mention the format of the output, not just what it does.
  description:
    "Reads a text file and returns a summary of line count, word count, " +
    "and the first 200 characters. Useful for inspecting unknown files before processing.",

  // JSON Schema for the tool's input. The model constructs arguments
  // that must pass validation before execute() is called.
  inputSchema: {
    type: "object",
    properties: {
      filePath: {
        type: "string",
        description: "Absolute path to the file to analyse.",
      },
    },
    required: ["filePath"],
  },

  // The implementation. context provides access to the working directory,
  // the active session, and a logger instance.
  async execute(
    input: { filePath: string },
    context: ToolContext
  ): Promise<ToolResult> {
    // ... implementation
    return { content: "..." };
  },
};

Input and Output Formats

Input — The model generates a JSON object that is validated against inputSchema before execute() is called. Validation failures are surfaced back to the model as a structured error, allowing it to retry with corrected arguments. Always define tight schemas: required fields, type constraints, enum values where applicable, and description on every property. The model uses property descriptions to populate values correctly.
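For illustration, a tightly constrained schema might look like this (the property names, enum values, and limits here are invented for the example):

```typescript
// Illustrative schema sketch: every property carries a description, the
// string value is narrowed with an enum, and stray arguments are rejected.
const exportSchema = {
  type: "object",
  properties: {
    format: {
      type: "string",
      enum: ["csv", "json", "parquet"],
      description: "Output format for the export.",
    },
    maxRows: {
      type: "integer",
      minimum: 1,
      maximum: 10_000,
      default: 1_000,
      description: "Upper bound on exported rows.",
    },
  },
  required: ["format"],
  additionalProperties: false,
};
```

The enum constraint in particular pays off: the model picks from a fixed list instead of guessing a free-form string.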

Output — execute() returns a ToolResult, which is a union type:

// Plain text result — most common
type TextResult = { content: string };

// Structured result — returned as JSON, model can reason over fields
type StructuredResult = { content: string; data: Record<string, unknown> };

// Error result — signals a recoverable failure; model may retry or ask user
type ErrorResult = { error: string; recoverable: boolean };

Return ErrorResult with recoverable: true for transient failures (network timeout, locked file). Return recoverable: false for permanent failures (file not found, invalid credentials). The model uses this flag to decide whether to retry automatically or surface the error to the user.
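A small classifier makes this decision explicit. The classifyError helper below is a sketch, not part of the Gemini CLI API, and the patterns are merely examples of transient failures you might match on:

```typescript
// Hypothetical helper: map a caught exception to an ErrorResult, flagging
// known-transient failures as recoverable so the model may retry them.
type ErrorResult = { error: string; recoverable: boolean };

const TRANSIENT_PATTERNS = [/ETIMEDOUT/, /ECONNRESET/, /timed out/i, /locked/i];

function classifyError(err: Error): ErrorResult {
  const recoverable = TRANSIENT_PATTERNS.some((p) => p.test(err.message));
  return { error: err.message, recoverable };
}
```

A "file not found" message matches none of the patterns and comes back with recoverable: false, while a connection timeout is flagged for retry.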

Tool Lifecycle

Gemini CLI startup
       │
       ▼
  loadTools()          ← Your tool file is imported here
       │
       ▼
  Schema validation    ← inputSchema is compiled to a validator
       │
       ▼
  Tool registration    ← Tool is added to the model's system prompt
       │
       ▼
  User message
       │
       ▼
  Model decides to call your tool
       │
       ▼
  Argument validation  ← JSON Schema validation runs
       │
       ▼
  execute(input, ctx)  ← Your implementation runs
       │
       ▼
  ToolResult returned to model
       │
       ▼
  Model continues reasoning

Tools are loaded once per session. There is no hot-reload; restart Gemini CLI after modifying a tool.


Recipe 1: File Analyzer Tool

This tool reads a file, extracts structural metadata, and returns a human-readable summary. It demonstrates input validation, binary file detection, and graceful truncation for large files.

The Implementation

// tools/file-analyzer.ts
import * as fs from "fs";
import * as path from "path";
import type { GeminiTool, ToolContext, ToolResult } from "@google/gemini-cli-core";

/** Maximum bytes to read for content preview. */
const MAX_PREVIEW_BYTES = 4_096;

/** Heuristic: if more than 10% of sampled bytes are non-printable, treat as binary. */
function isBinary(buffer: Buffer): boolean {
  const sampleSize = Math.min(buffer.length, 512);
  let nonPrintable = 0;
  for (let i = 0; i < sampleSize; i++) {
    const byte = buffer[i];
    if (byte < 9 || (byte > 13 && byte < 32) || byte === 127) {
      nonPrintable++;
    }
  }
  return nonPrintable / sampleSize > 0.1;
}

/** Count lines in a sample string. A trailing newline terminates the last line rather than starting a new one. */
function countLines(content: string): number {
  if (content.length === 0) return 0;
  let count = 0;
  for (let i = 0; i < content.length; i++) {
    if (content[i] === "\n") count++;
  }
  // Content after the final newline forms one more line.
  if (content[content.length - 1] !== "\n") count++;
  return count;
}

export const fileAnalyzerTool: GeminiTool = {
  name: "analyze_file",
  description:
    "Reads a file and returns its size, line count, word count, detected language " +
    "(from extension), and a content preview of up to 4 KB. " +
    "Returns an error for binary files. " +
    "Use this before editing an unfamiliar file to understand its structure.",

  inputSchema: {
    type: "object",
    properties: {
      filePath: {
        type: "string",
        description:
          "Absolute or workspace-relative path to the file to analyse.",
      },
      includePreview: {
        type: "boolean",
        description:
          "Whether to include a content preview in the output. Defaults to true.",
        default: true,
      },
    },
    required: ["filePath"],
    additionalProperties: false,
  },

  async execute(
    input: { filePath: string; includePreview?: boolean },
    context: ToolContext
  ): Promise<ToolResult> {
    const { filePath, includePreview = true } = input;

    // Resolve relative paths against the workspace root.
    const resolved = path.isAbsolute(filePath)
      ? filePath
      : path.resolve(context.workspaceRoot, filePath);

    // --- Existence check ---
    if (!fs.existsSync(resolved)) {
      return {
        error: `File not found: ${resolved}`,
        recoverable: false,
      };
    }

    const stat = fs.statSync(resolved);

    if (stat.isDirectory()) {
      return {
        error: `Path is a directory, not a file: ${resolved}`,
        recoverable: false,
      };
    }

    // Read only what we need for the preview + binary detection.
    const fd = fs.openSync(resolved, "r");
    const sampleBuffer = Buffer.alloc(Math.min(stat.size, MAX_PREVIEW_BYTES));
    const bytesRead = fs.readSync(fd, sampleBuffer, 0, sampleBuffer.length, 0);
    fs.closeSync(fd);

    const sample = sampleBuffer.slice(0, bytesRead);

    if (isBinary(sample)) {
      return {
        error:
          `File appears to be binary (${stat.size} bytes). ` +
          "Use a dedicated binary analysis tool instead.",
        recoverable: false,
      };
    }

    const previewText = sample.toString("utf8");

    // For small files we can count lines precisely from the preview.
    // For large files we estimate based on average line length in the sample.
    let lineCount: number;
    if (stat.size <= MAX_PREVIEW_BYTES) {
      lineCount = countLines(previewText);
    } else {
      const linesInSample = countLines(previewText);
      const avgLineLength = MAX_PREVIEW_BYTES / Math.max(linesInSample, 1);
      lineCount = Math.round(stat.size / avgLineLength);
    }

    const wordCount = previewText
      .split(/\s+/)
      .filter(Boolean).length;

    const ext = path.extname(resolved).toLowerCase();
    const languageMap: Record<string, string> = {
      ".ts": "TypeScript",
      ".tsx": "TypeScript (JSX)",
      ".js": "JavaScript",
      ".jsx": "JavaScript (JSX)",
      ".py": "Python",
      ".go": "Go",
      ".rs": "Rust",
      ".java": "Java",
      ".cs": "C#",
      ".cpp": "C++",
      ".c": "C",
      ".md": "Markdown",
      ".mdx": "MDX",
      ".json": "JSON",
      ".yaml": "YAML",
      ".yml": "YAML",
      ".toml": "TOML",
      ".sql": "SQL",
      ".sh": "Shell",
      ".html": "HTML",
      ".css": "CSS",
    };
    const language = languageMap[ext] ?? `Unknown (${ext || "no extension"})`;

    const truncated = stat.size > MAX_PREVIEW_BYTES;

    let content =
      `File: ${resolved}\n` +
      `Size: ${stat.size.toLocaleString()} bytes\n` +
      `Lines: ${lineCount.toLocaleString()}${truncated ? " (estimated)" : ""}\n` +
      `Words: ${wordCount.toLocaleString()}${truncated ? " (preview only)" : ""}\n` +
      `Language: ${language}\n` +
      `Last modified: ${stat.mtime.toISOString()}\n`;

    if (includePreview) {
      content +=
        `\n--- Preview (first ${bytesRead} bytes) ---\n` +
        previewText +
        (truncated ? "\n... (truncated)" : "");
    }

    return { content };
  },
};

Registering the Tool

Add the tool to your Gemini CLI configuration file (~/.gemini/config.ts or the project-local .gemini/config.ts):

// .gemini/config.ts
import { fileAnalyzerTool } from "./tools/file-analyzer";

export default {
  tools: [fileAnalyzerTool],
};

Trying It Out

> Analyse the file src/components/Button.tsx

╔ Tool call: analyze_file
║ filePath: "src/components/Button.tsx"
╚ includePreview: true

File: /workspace/src/components/Button.tsx
Size: 2,847 bytes
Lines: 94
Words: 312
Language: TypeScript (JSX)
Last modified: 2026-03-20T14:32:01.000Z

--- Preview (first 2847 bytes) ---
import React from "react";
...

Recipe 2: Database Query Tool

A read-only query tool for PostgreSQL. The design enforces security at multiple layers: only SELECT statements are allowed, query results are paginated, and connection credentials are read from environment variables, never from tool input.

The Implementation

// tools/db-query.ts
import { Client } from "pg";
import type { GeminiTool, ToolContext, ToolResult } from "@google/gemini-cli-core";

/** Hard limit on rows returned to avoid flooding context window. */
const MAX_ROWS = 200;

/**
 * Validates that a SQL string is a read-only SELECT statement.
 * This is a defence-in-depth check; the database user should also
 * have read-only privileges.
 */
function assertSelectOnly(sql: string): void {
  const normalised = sql.trim().toUpperCase();

  // Strip leading comments.
  const stripped = normalised.replace(/^(\/\*.*?\*\/|--[^\n]*\n)*/s, "").trim();

  if (!stripped.startsWith("SELECT") && !stripped.startsWith("WITH")) {
    throw new Error(
      "Only SELECT (and WITH ... SELECT) statements are permitted. " +
        `Got: ${stripped.slice(0, 40)}`
    );
  }

  // Block dangerous keywords that could appear in CTEs or subqueries.
  const forbidden = [
    /\bINSERT\b/,
    /\bUPDATE\b/,
    /\bDELETE\b/,
    /\bDROP\b/,
    /\bTRUNCATE\b/,
    /\bALTER\b/,
    /\bCREATE\b/,
    /\bGRANT\b/,
    /\bREVOKE\b/,
    /\bEXECUTE\b/,
    /\bCALL\b/,
  ];

  for (const pattern of forbidden) {
    if (pattern.test(normalised)) {
      throw new Error(
        `Query contains a forbidden keyword matching ${pattern}. ` +
          "Only read-only queries are allowed."
      );
    }
  }
}

/** Formats a query result as a Markdown table for readability in the model context. */
function toMarkdownTable(
  columns: string[],
  rows: Record<string, unknown>[]
): string {
  if (rows.length === 0) return "_No rows returned._";

  const header = `| ${columns.join(" | ")} |`;
  const separator = `| ${columns.map(() => "---").join(" | ")} |`;
  const body = rows
    .map(
      (row) =>
        `| ${columns
          .map((col) => String(row[col] ?? "NULL").replace(/\|/g, "\\|"))
          .join(" | ")} |`
    )
    .join("\n");

  return [header, separator, body].join("\n");
}

export const dbQueryTool: GeminiTool = {
  name: "query_database",
  description:
    "Executes a read-only SQL SELECT query against the project database and returns " +
    "results as a Markdown table (max 200 rows). " +
    "Credentials are loaded from environment variables — never pass them as arguments. " +
    "Use this to explore data, verify migrations, or investigate production issues.",

  inputSchema: {
    type: "object",
    properties: {
      sql: {
        type: "string",
        description:
          "A valid PostgreSQL SELECT statement. Only read operations are permitted.",
      },
      limit: {
        type: "integer",
        description: "Maximum number of rows to return. Defaults to 50, max 200.",
        minimum: 1,
        maximum: MAX_ROWS,
        default: 50,
      },
    },
    required: ["sql"],
    additionalProperties: false,
  },

  async execute(
    input: { sql: string; limit?: number },
    context: ToolContext
  ): Promise<ToolResult> {
    const { sql, limit = 50 } = input;
    const effectiveLimit = Math.min(limit, MAX_ROWS);

    // --- Security gate ---
    try {
      assertSelectOnly(sql);
    } catch (err) {
      return {
        error: (err as Error).message,
        recoverable: false,
      };
    }

    // --- Connection ---
    const connectionString = process.env.DATABASE_URL;
    if (!connectionString) {
      return {
        error:
          "DATABASE_URL environment variable is not set. " +
          "Set it before starting Gemini CLI.",
        recoverable: false,
      };
    }

    const client = new Client({ connectionString });

    try {
      await client.connect();

      // Enforce row limit by wrapping the user query. Strip any trailing
      // semicolons first so the statement is valid as a subquery.
      const limitedSql = `SELECT * FROM (${sql.trim().replace(/;+\s*$/, "")}) AS _q LIMIT ${effectiveLimit + 1}`;

      const result = await client.query(limitedSql);
      const columns = result.fields.map((f) => f.name);
      const allRows = result.rows as Record<string, unknown>[];

      const truncated = allRows.length > effectiveLimit;
      const rows = allRows.slice(0, effectiveLimit);

      const table = toMarkdownTable(columns, rows);
      const rowsLabel = `${rows.length}${truncated ? "+" : ""} row(s)`;

      return {
        content:
          `Query returned ${rowsLabel}:\n\n` +
          table +
          (truncated
            ? `\n\n_Results truncated at ${effectiveLimit} rows. Refine your query or increase the limit._`
            : ""),
      };
    } catch (err) {
      const message = (err as Error).message;

      // Postgres syntax errors are recoverable — the model can fix the SQL.
      const isSyntaxError =
        message.includes("syntax error") ||
        (message.includes("column") && message.includes("does not exist"));

      return {
        error: `Database error: ${message}`,
        recoverable: isSyntaxError,
      };
    } finally {
      await client.end().catch(() => {
        // Ignore disconnect errors.
      });
    }
  },
};

Security Checklist

Before deploying this tool in a team environment, verify each of the following:

  • Database user is read-only. Create a dedicated role with GRANT SELECT ON ALL TABLES IN SCHEMA public TO gemini_readonly;. Never use a superuser or application write credential.
  • DATABASE_URL is not logged. The context.logger instance redacts strings that match secret patterns, but always audit your log pipeline.
  • Firewall rules. If the database is in a private network, ensure Gemini CLI runs inside the same VPC or through a secure tunnel.
  • Query timeout. Add statement_timeout to the connection string: postgresql://...?options=-c statement_timeout=10000 to cap queries at 10 seconds.
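The statement-timeout item can be applied programmatically when you assemble the connection string. A sketch (the withStatementTimeout helper is invented for this example; the options parameter itself is standard libpq connection-URI syntax):

```typescript
// Hypothetical helper: append a libpq statement_timeout option to a
// connection string, percent-encoding the space in "-c statement_timeout=...".
function withStatementTimeout(connectionString: string, ms: number): string {
  const separator = connectionString.includes("?") ? "&" : "?";
  const options = encodeURIComponent(`-c statement_timeout=${ms}`);
  return `${connectionString}${separator}options=${options}`;
}
```

Usage: withStatementTimeout(process.env.DATABASE_URL!, 10_000) caps every query at 10 seconds.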

Recipe 3: API Integration Tool

This tool calls a paginated REST API, handles bearer-token authentication with automatic refresh, and normalises the response into a consistent format the model can reason over.

The Implementation

// tools/api-integration.ts
import type { GeminiTool, ToolContext, ToolResult } from "@google/gemini-cli-core";

interface TokenStore {
  accessToken: string;
  expiresAt: number; // Unix timestamp (ms)
}

/** Module-level token cache so refresh happens at most once per session. */
let tokenStore: TokenStore | null = null;

/**
 * Obtains or refreshes a bearer token using client credentials flow.
 * Tokens are cached in memory for the lifetime of the Gemini CLI session.
 */
async function getBearerToken(): Promise<string> {
  const now = Date.now();

  if (tokenStore && tokenStore.expiresAt > now + 30_000) {
    return tokenStore.accessToken;
  }

  const clientId = process.env.API_CLIENT_ID;
  const clientSecret = process.env.API_CLIENT_SECRET;
  const tokenEndpoint = process.env.API_TOKEN_ENDPOINT;

  if (!clientId || !clientSecret || !tokenEndpoint) {
    throw new Error(
      "Missing API credentials. Set API_CLIENT_ID, API_CLIENT_SECRET, " +
        "and API_TOKEN_ENDPOINT environment variables."
    );
  }

  const response = await fetch(tokenEndpoint, {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: new URLSearchParams({
      grant_type: "client_credentials",
      client_id: clientId,
      client_secret: clientSecret,
    }),
  });

  if (!response.ok) {
    throw new Error(
      `Token endpoint returned ${response.status}: ${await response.text()}`
    );
  }

  const data = (await response.json()) as {
    access_token: string;
    expires_in: number;
  };

  tokenStore = {
    accessToken: data.access_token,
    expiresAt: now + data.expires_in * 1_000,
  };

  return tokenStore.accessToken;
}

/**
 * Performs a GET request with bearer auth, following cursor-based pagination
 * up to `maxPages`. Returns all accumulated items.
 */
async function fetchAllPages<T>(
  baseUrl: string,
  params: Record<string, string>,
  maxPages: number
): Promise<T[]> {
  const token = await getBearerToken();
  const accumulated: T[] = [];
  let cursor: string | null = null;
  let page = 0;

  while (page < maxPages) {
    const url = new URL(baseUrl);
    for (const [key, value] of Object.entries(params)) {
      url.searchParams.set(key, value);
    }
    if (cursor) url.searchParams.set("cursor", cursor);

    const response = await fetch(url.toString(), {
      headers: {
        Authorization: `Bearer ${token}`,
        Accept: "application/json",
      },
      signal: AbortSignal.timeout(15_000),
    });

    if (response.status === 401) {
      // Token may have been revoked externally; clear cache and throw so the
      // caller can surface a clear error rather than a cryptic auth failure.
      tokenStore = null;
      throw new Error("API returned 401 Unauthorized. Token has been cleared; retry to re-authenticate.");
    }

    if (!response.ok) {
      throw new Error(`API error ${response.status}: ${await response.text()}`);
    }

    const body = (await response.json()) as {
      items: T[];
      next_cursor?: string;
    };

    accumulated.push(...body.items);
    cursor = body.next_cursor ?? null;
    page++;

    if (!cursor) break;
  }

  return accumulated;
}

export const apiIntegrationTool: GeminiTool = {
  name: "fetch_api_resources",
  description:
    "Fetches resources from the configured REST API. " +
    "Handles authentication automatically using client credentials stored in environment variables. " +
    "Supports filtering by resource type, date range, and free-text search. " +
    "Returns up to 500 items across paginated responses.",

  inputSchema: {
    type: "object",
    properties: {
      resourceType: {
        type: "string",
        enum: ["users", "orders", "products", "events"],
        description: "The type of resource to retrieve.",
      },
      search: {
        type: "string",
        description: "Optional free-text search filter applied server-side.",
      },
      since: {
        type: "string",
        format: "date",
        description:
          "ISO 8601 date (YYYY-MM-DD). Only return resources created on or after this date.",
      },
      maxItems: {
        type: "integer",
        description: "Maximum items to return across all pages. Defaults to 100, max 500.",
        minimum: 1,
        maximum: 500,
        default: 100,
      },
    },
    required: ["resourceType"],
    additionalProperties: false,
  },

  async execute(
    input: {
      resourceType: "users" | "orders" | "products" | "events";
      search?: string;
      since?: string;
      maxItems?: number;
    },
    context: ToolContext
  ): Promise<ToolResult> {
    const { resourceType, search, since, maxItems = 100 } = input;

    const apiBaseUrl = process.env.API_BASE_URL;
    if (!apiBaseUrl) {
      return {
        error: "API_BASE_URL environment variable is not set.",
        recoverable: false,
      };
    }

    const params: Record<string, string> = {
      page_size: "50",
    };
    if (search) params.q = search;
    if (since) params.created_since = since;

    const maxPages = Math.ceil(maxItems / 50);

    try {
      const items = await fetchAllPages<Record<string, unknown>>(
        `${apiBaseUrl}/v1/${resourceType}`,
        params,
        maxPages
      );

      const trimmed = items.slice(0, maxItems);

      if (trimmed.length === 0) {
        return {
          content: `No ${resourceType} found matching the given filters.`,
        };
      }

      // Summarise the result set rather than dumping raw JSON.
      const summary =
        `Found ${trimmed.length} ${resourceType}` +
        (items.length > maxItems ? ` (showing first ${maxItems} of ${items.length}+)` : "") +
        ":\n\n" +
        "```json\n" +
        JSON.stringify(trimmed, null, 2) +
        "\n```";

      return { content: summary };
    } catch (err) {
      const message = (err as Error).message;
      const isTransient =
        message.includes("timeout") ||
        message.includes("ECONNRESET") ||
        message.includes("ENOTFOUND");

      return {
        error: message,
        recoverable: isTransient,
      };
    }
  },
};

Environment Variables Reference

| Variable | Required | Description |
| --- | --- | --- |
| API_BASE_URL | Yes | Base URL, e.g. https://api.example.com |
| API_CLIENT_ID | Yes | OAuth 2.0 client ID |
| API_CLIENT_SECRET | Yes | OAuth 2.0 client secret |
| API_TOKEN_ENDPOINT | Yes | Token URL, e.g. https://auth.example.com/oauth/token |

Store these in your project's .env file and load them with dotenv before starting Gemini CLI, or use a secrets manager integration.
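If you go the .env route, one option (assuming the dotenv package is installed as a dependency) is to load it at the top of the config file itself, so the variables exist before any tool reads them:

```typescript
// .gemini/config.ts — sketch: populate process.env from .env before any
// tool reads its credentials. Assumes the dotenv package is installed.
import * as dotenv from "dotenv";

dotenv.config(); // no-op if .env is absent

export default {
  tools: [],
};
```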


Testing Your Tools

Unit Testing Patterns

Each tool's execute function is a plain async function that accepts two arguments. You do not need to start a real Gemini CLI process to test it — create a minimal ToolContext mock and call execute directly.

// tools/__tests__/file-analyzer.test.ts
import * as fs from "fs";
import * as path from "path";
import * as os from "os";
import { fileAnalyzerTool } from "../file-analyzer";
import type { ToolContext } from "@google/gemini-cli-core";

/** Minimal ToolContext sufficient for the file analyzer. */
function makeContext(workspaceRoot: string): ToolContext {
  return {
    workspaceRoot,
    logger: {
      debug: jest.fn(),
      info: jest.fn(),
      warn: jest.fn(),
      error: jest.fn(),
    },
  } as unknown as ToolContext;
}

describe("fileAnalyzerTool", () => {
  let tmpDir: string;

  beforeEach(() => {
    tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), "tool-test-"));
  });

  afterEach(() => {
    fs.rmSync(tmpDir, { recursive: true, force: true });
  });

  it("returns metadata for a valid text file", async () => {
    const filePath = path.join(tmpDir, "sample.ts");
    fs.writeFileSync(filePath, "const x = 1;\nconst y = 2;\n");

    const result = await fileAnalyzerTool.execute(
      { filePath, includePreview: false },
      makeContext(tmpDir)
    );

    expect("error" in result).toBe(false);
    expect(result.content).toContain("Lines: 2");
    expect(result.content).toContain("Language: TypeScript");
  });

  it("returns a recoverable=false error for missing files", async () => {
    const result = await fileAnalyzerTool.execute(
      { filePath: "/does/not/exist.txt" },
      makeContext(tmpDir)
    );

    expect(result).toMatchObject({ error: expect.stringContaining("not found"), recoverable: false });
  });

  it("returns an error for binary files", async () => {
    const filePath = path.join(tmpDir, "image.png");
    // Write a minimal PNG header (binary content).
    fs.writeFileSync(filePath, Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]));

    const result = await fileAnalyzerTool.execute(
      { filePath },
      makeContext(tmpDir)
    );

    expect(result).toMatchObject({ error: expect.stringContaining("binary"), recoverable: false });
  });

  it("resolves relative paths against workspaceRoot", async () => {
    fs.writeFileSync(path.join(tmpDir, "relative.txt"), "hello world\n");

    const result = await fileAnalyzerTool.execute(
      { filePath: "relative.txt" },
      makeContext(tmpDir)
    );

    expect("error" in result).toBe(false);
  });
});

Mocking External Dependencies

For tools that call external services (the database and API recipes), mock at the module level so your tests run offline and deterministically.

// tools/__tests__/db-query.test.ts
import { dbQueryTool } from "../db-query";
import type { ToolContext } from "@google/gemini-cli-core";

// Mock the 'pg' module before importing the tool.
jest.mock("pg", () => {
  const mockClient = {
    connect: jest.fn().mockResolvedValue(undefined),
    query: jest.fn(),
    end: jest.fn().mockResolvedValue(undefined),
  };
  return { Client: jest.fn(() => mockClient) };
});

import { Client } from "pg";

const mockContext = {
  workspaceRoot: "/tmp",
  logger: { debug: jest.fn(), info: jest.fn(), warn: jest.fn(), error: jest.fn() },
} as unknown as ToolContext;

describe("dbQueryTool", () => {
  beforeEach(() => {
    process.env.DATABASE_URL = "postgresql://localhost/test";
    jest.clearAllMocks();
  });

  it("rejects non-SELECT statements", async () => {
    const result = await dbQueryTool.execute(
      { sql: "DELETE FROM users" },
      mockContext
    );
    expect(result).toMatchObject({ recoverable: false, error: expect.stringContaining("forbidden") });
  });

  it("formats results as a Markdown table", async () => {
    const mockClient = new (Client as jest.MockedClass<typeof Client>)();
    (mockClient.query as jest.Mock).mockResolvedValue({
      fields: [{ name: "id" }, { name: "email" }],
      rows: [{ id: 1, email: "alice@example.com" }],
    });

    const result = await dbQueryTool.execute(
      { sql: "SELECT id, email FROM users LIMIT 1" },
      mockContext
    );

    expect(result.content).toContain("| id | email |");
    expect(result.content).toContain("alice@example.com");
  });

  it("returns recoverable error for syntax errors", async () => {
    const mockClient = new (Client as jest.MockedClass<typeof Client>)();
    (mockClient.query as jest.Mock).mockRejectedValue(
      new Error("syntax error at or near SELECT")
    );

    const result = await dbQueryTool.execute(
      { sql: "SELECT * FORM users" },
      mockContext
    );

    expect(result).toMatchObject({ recoverable: true });
  });
});

End-to-End Testing

For a full integration test, start a real Gemini CLI session with --non-interactive and pipe a prompt that exercises your tool:

#!/usr/bin/env bash
# e2e/test-file-analyzer.sh

set -euo pipefail

RESULT=$(echo "Analyse the file package.json and tell me how many lines it has." \
  | gemini --non-interactive --config .gemini/config.ts 2>&1)

if echo "$RESULT" | grep -q "Lines:"; then
  echo "PASS: file analyzer returned line count"
else
  echo "FAIL: expected 'Lines:' in output"
  echo "$RESULT"
  exit 1
fi

Run this as part of your CI pipeline after the unit tests pass. Keep end-to-end tests lean — one happy path and one error path per tool is sufficient.


Error Handling Patterns

The Three-Layer Model

Well-designed tool error handling has three layers, each responsible for a different failure class:

Layer 1 — Input validation (schema + business rules)
  │  Failure: return { error: "...", recoverable: false }
  │  The model will not retry invalid input automatically.
  ▼
Layer 2 — External dependency (network, disk, database)
  │  Transient failure: return { error: "...", recoverable: true }
  │  The model may retry once before surfacing to the user.
  │  Permanent failure: return { error: "...", recoverable: false }
  ▼
Layer 3 — Unexpected exceptions (bugs, invariant violations)
  │  Wrap in a top-level try/catch; return { error: "Unexpected: ...", recoverable: false }
  │  Log the full stack trace with context.logger.error()
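Layer 3 can be factored into a single wrapper applied to every execute implementation. The withErrorBoundary helper below is a sketch of that idea, not a Gemini CLI API:

```typescript
// Sketch: convert unexpected exceptions into non-recoverable ErrorResults
// so a bug in a tool never crashes the session.
type ToolResult = { content: string } | { error: string; recoverable: boolean };

function withErrorBoundary<I>(
  fn: (input: I) => Promise<ToolResult>
): (input: I) => Promise<ToolResult> {
  return async (input: I) => {
    try {
      return await fn(input);
    } catch (err) {
      // Layers 1 and 2 return structured errors; anything thrown past them
      // is a bug, so it is reported as permanent.
      return {
        error: `Unexpected: ${(err as Error).message}`,
        recoverable: false,
      };
    }
  };
}
```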

Fallback Strategy Pattern

For tools that have a primary and a fallback data source, implement explicit fallback chaining rather than silent degradation:

async execute(input, context): Promise<ToolResult> {
  // Attempt primary source.
  try {
    const data = await fetchFromPrimarySource(input);
    return formatResult(data, "primary");
  } catch (primaryErr) {
    context.logger.warn(
      `Primary source failed: ${(primaryErr as Error).message}. Attempting fallback.`
    );
  }

  // Attempt fallback source.
  try {
    const data = await fetchFromFallbackSource(input);
    return formatResult(data, "fallback (note: data may be stale)");
  } catch (fallbackErr) {
    return {
      error:
        `Both primary and fallback sources failed. ` +
        `Fallback error: ${(fallbackErr as Error).message}`,
      recoverable: false,
    };
  }
}

Always tell the model when you fell back to a secondary source. The model will include this caveat in its response to the user.

Timeout Wrapping

Wrap any I/O operation with an explicit timeout so a hung connection does not stall the entire conversation:

async function withTimeout<T>(promise: Promise<T>, ms: number, label: string): Promise<T> {
  const timeout = new Promise<never>((_, reject) =>
    setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms)
  );
  return Promise.race([promise, timeout]);
}

// Usage in execute():
const data = await withTimeout(fetchData(input), 10_000, "fetchData");

Publishing and Distribution

Package Structure

Organise your tool package so it can export multiple tools from a single entry point:

gemini-cli-tool-mycompany/
├── package.json
├── tsconfig.json
├── src/
│   ├── index.ts          ← re-exports all tools
│   ├── file-analyzer.ts
│   ├── db-query.ts
│   └── api-integration.ts
├── dist/                 ← compiled output (gitignored)
└── tests/
    ├── file-analyzer.test.ts
    ├── db-query.test.ts
    └── api-integration.test.ts

src/index.ts:

export { fileAnalyzerTool } from "./file-analyzer";
export { dbQueryTool } from "./db-query";
export { apiIntegrationTool } from "./api-integration";

package.json

{
  "name": "gemini-cli-tool-mycompany",
  "version": "1.0.0",
  "description": "Custom Gemini CLI tools for MyCompany engineering workflows",
  "main": "dist/index.js",
  "types": "dist/index.d.ts",
  "keywords": ["gemini-cli", "gemini-cli-tool"],
  "peerDependencies": {
    "@google/gemini-cli-core": ">=0.1.0"
  },
  "scripts": {
    "build": "tsc",
    "test": "jest",
    "prepublishOnly": "npm test && npm run build"
  }
}

Using the gemini-cli-tool- prefix and the gemini-cli-tool keyword makes your package discoverable via npm search gemini-cli-tool.
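The layout above assumes `tsc` compiles `src/` into `dist/` with declaration files. A minimal `tsconfig.json` consistent with that (assumed settings — adjust `target` and `module` to your toolchain):

```json
{
  "compilerOptions": {
    "target": "ES2020",
    "module": "commonjs",
    "declaration": true,
    "outDir": "dist",
    "strict": true,
    "esModuleInterop": true
  },
  "include": ["src"]
}
```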

Version Management

Follow semantic versioning strictly. Tool input schema changes deserve special attention:

  • Patch (1.0.x): Bug fixes, documentation improvements, non-breaking error message changes.
  • Minor (1.x.0): New optional input properties, new tools added, non-breaking output changes.
  • Major (x.0.0): Adding or removing required input properties, renaming a tool, or changing the shape of the output in a way that breaks existing prompts.

Add a CHANGELOG.md and update it with every release. Users who pin to a version in their package.json will thank you.

Installing a Published Tool

# Install the package.
npm install gemini-cli-tool-mycompany

Then register the tools in .gemini/config.ts:

// .gemini/config.ts
import { fileAnalyzerTool, dbQueryTool, apiIntegrationTool }
  from "gemini-cli-tool-mycompany";

export default {
  tools: [fileAnalyzerTool, dbQueryTool, apiIntegrationTool],
};

FAQ

Q: Can a tool call another tool?

Not directly — Gemini CLI does not expose a programmatic tool-calling API inside execute(). However, you can import and call the other tool's execute function directly as a plain TypeScript function. There is no magic in the tool wrapper itself; execute is just an async function.
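To make that concrete, here is a hedged sketch with two hypothetical tools, where one simply calls the other's `execute` as an ordinary function:

```typescript
// Hypothetical tool shapes -- the point is that execute() is a plain async function.
type ToolResult = { output?: string; error?: string };

const wordCountTool = {
  name: "word_count",
  async execute(input: { text: string }): Promise<ToolResult> {
    return { output: String(input.text.trim().split(/\s+/).length) };
  },
};

const summarizeTool = {
  name: "summarize_text",
  async execute(input: { text: string }): Promise<ToolResult> {
    // Reuse the other tool by calling its execute() directly.
    const count = await wordCountTool.execute({ text: input.text });
    return { output: `Document has ${count.output} words.` };
  },
};
```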

Q: Can tools maintain state between calls?

Yes, via module-level variables. The tokenStore in Recipe 3 is an example. Module state persists for the lifetime of the Gemini CLI session. Use this sparingly: complex state is hard to reason about and debug. For anything beyond simple caching, write state to a file or database instead.
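The "simple caching" case mentioned above might look like this — a module-level TTL cache (names and TTL are illustrative, not part of any Gemini CLI API):

```typescript
// Module-level state persists for the lifetime of a Gemini CLI session.
const cache = new Map<string, { value: string; expiresAt: number }>();

async function cachedLookup(
  key: string,
  loader: () => Promise<string>,
  ttlMs = 60_000
): Promise<string> {
  const hit = cache.get(key);
  if (hit && hit.expiresAt > Date.now()) return hit.value; // fresh hit
  const value = await loader(); // miss or expired: reload
  cache.set(key, { value, expiresAt: Date.now() + ttlMs });
  return value;
}
```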

Q: My tool is called with the wrong arguments. How do I debug this?

First, check your description and the description fields on each input property. The model relies almost entirely on these strings to understand how to populate arguments. Second, add a console.error or context.logger.debug at the top of execute() to log the raw input object — you may find the model is passing correct values but your validation is rejecting them incorrectly. Third, test your inputSchema with an online JSON Schema validator against the argument object you expect.

Q: How do I handle tools that take a long time to run?

For operations exceeding 30 seconds, consider returning a job ID immediately and providing a second "poll status" tool. Gemini CLI will surface the job ID to the user, who can ask the model to check status in a follow-up message. This pattern avoids HTTP timeouts and keeps the conversation responsive.
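A minimal in-memory sketch of the job-ID pattern (names are hypothetical, and a real implementation would persist jobs somewhere more durable than a `Map`):

```typescript
// One tool starts the work and returns a job ID immediately;
// a second "poll status" tool reports progress in a later turn.
type Job = { status: "running" | "done"; result?: string };
const jobs = new Map<string, Job>();

function startJob(work: () => Promise<string>): string {
  const id = `job-${jobs.size + 1}`;
  jobs.set(id, { status: "running" });
  // Fire and forget: the result is recorded when the work completes.
  work().then((result) => jobs.set(id, { status: "done", result }));
  return id; // surfaced to the user right away
}

function pollJob(id: string): Job | undefined {
  return jobs.get(id);
}
```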

Q: Can I restrict which tools are available per project?

Yes. Each project has its own .gemini/config.ts. Export only the tools relevant to that project. You can also implement conditional registration in a shared config helper: check an environment variable or the presence of a config file and skip registering tools that are not applicable.
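A sketch of that conditional-registration helper — the tool names and gating conditions here are illustrative:

```typescript
import { existsSync } from "node:fs";

// Hypothetical minimal tool shape for this example.
type Tool = { name: string };

// Keep only the tools whose prerequisites are satisfied in this project.
function selectTools(candidates: { tool: Tool; enabled: () => boolean }[]): Tool[] {
  return candidates.filter((c) => c.enabled()).map((c) => c.tool);
}

// In .gemini/config.ts you might gate on env vars or config files:
const tools = selectTools([
  { tool: { name: "db_query" }, enabled: () => Boolean(process.env.DATABASE_URL) },
  { tool: { name: "file_analyzer" }, enabled: () => existsSync("package.json") },
]);
```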


Conclusion

Custom tools are the most direct path to making Gemini CLI useful in the specific context of your team's infrastructure. The three recipes in this cookbook cover the most common integration patterns — local file system access, structured data queries, and authenticated API calls — and each one demonstrates the same underlying discipline: tight input schemas, explicit error handling with recoverability signals, and clean separation between business logic and the tool wrapper.

A few principles to carry forward:

  • Write descriptions as if you are explaining the tool to a smart junior engineer who has never seen your codebase. The model's ability to call your tool correctly is entirely dependent on the quality of those strings.
  • Always validate at the boundary. Do not trust that the model will pass perfectly formed input, and do not trust that external services will return well-formed responses.
  • Keep tools small and composable. A tool that does one thing well is easier to test, easier to describe accurately, and easier for the model to use correctly than a multi-mode Swiss Army knife.

The publishing workflow described in the final section means you can share tools across repositories and teams with the same friction as any other npm package. A well-maintained gemini-cli-tool- package with good documentation and a solid test suite is a genuine force multiplier for every engineer on your team who uses Gemini CLI.

Zhihao Mu

· Full-stack Developer

Developer and technical writer passionate about AI-powered development tools. Building geminicli.one to help developers unlock the full potential of Gemini CLI.
