Stagehand Tools

Browser-based web page fetching and content extraction tool for the Fractal Synapse agent system, powered by Stagehand.

Overview

Stagehand Tools uses a real headless Chromium browser to fetch and extract content from web pages. Unlike HTTP-based tools, it fully renders JavaScript and handles dynamic content. Stagehand's AI-powered extraction understands page structure and returns structured content, so you do not need to write CSS selectors.

Features

  • Real browser rendering - Executes JavaScript, handles SPAs, dynamic content, and cookie walls
  • AI-powered extraction - Uses OpenAI GPT-4o to intelligently extract structured content from any page
  • Multiple extraction modes - Text extraction, CSS selector targeting, or structured data
  • Singleton browser session - One shared Stagehand instance per process for efficiency
  • Robust error handling - Returns a structured error object for every failure scenario instead of throwing
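
The singleton session listed above can be sketched roughly as follows. This is a minimal illustration of the lazy, shared-instance pattern, not the project's actual StagehandManager: `BrowserSession`, `launchSession`, and `SessionManager` are hypothetical names, and the fake launcher stands in for real Stagehand initialization.

```typescript
// Hypothetical stand-in for a browser session; the real project wraps Stagehand here.
interface BrowserSession {
  id: number;
  close(): Promise<void>;
}

// Counts how many times a session was actually launched (for illustration only).
let launches = 0;

// Hypothetical launcher used only in this sketch.
async function launchSession(): Promise<BrowserSession> {
  launches += 1;
  const id = launches;
  return { id, close: async () => {} };
}

class SessionManager {
  // Cache the promise rather than the resolved session, so callers that
  // arrive while startup is still in progress share the same launch.
  private static pending: Promise<BrowserSession> | null = null;

  static getSession(): Promise<BrowserSession> {
    if (SessionManager.pending === null) {
      SessionManager.pending = launchSession().catch((err) => {
        // Clear the cache on failure so a later call can retry.
        SessionManager.pending = null;
        throw err;
      });
    }
    return SessionManager.pending;
  }

  static async shutdown(): Promise<void> {
    const current = SessionManager.pending;
    SessionManager.pending = null;
    if (current !== null) {
      const session = await current;
      await session.close();
    }
  }
}
```

Caching the promise is one reasonable design choice here: two tool calls issued back to back would otherwise each start their own browser before the first launch completes.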

Requirements

  • OPENAI_API_KEY environment variable
  • Chromium/Chrome browser (installed automatically by Playwright)

Installation

npm install
npm run build

Usage

Extraction Modes

Text Mode (default, extractText: true)

Returns the main readable content of the page:

{
  "url": "https://example.com",
  "timestamp": "2026-03-20T12:16:43.561Z",
  "title": "Example Domain",
  "content": "This domain is for use in documentation examples...",
  "summary": "Brief summary of the page content"
}

Selector Mode (when selector is provided)

Extracts content from a specific element:

{
  "url": "https://example.com",
  "timestamp": "2026-03-20T12:16:47.309Z",
  "title": "Extracted Content",
  "content": "Example Domain"
}

Structured Mode (extractText: false)

Returns links, images, and content as structured data:

{
  "url": "https://example.com",
  "timestamp": "2026-03-20T12:16:51.127Z",
  "title": "Example Domain",
  "content": "Example Domain\n\nThis domain is for use in documentation examples...",
  "links": ["https://www.iana.org/domains/reserved"],
  "images": []
}

Parameters

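| Parameter   | Type    | Required | Default | Description                             |
|-------------|---------|----------|---------|-----------------------------------------|
| url         | string  | yes      | (none)  | The URL to fetch content from           |
| selector    | string  | no       | (none)  | CSS selector to target specific content |
| extractText | boolean | no       | true    | Text mode vs. structured data mode      |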
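
As a rough sketch of how these parameters might map onto the three extraction modes: the `WebFetchParams` interface mirrors the table, while `resolveMode` is a hypothetical helper. That `selector` takes precedence over `extractText` is an assumption inferred from "when selector is provided", not something the source states outright.

```typescript
// Input shape taken from the Parameters table.
interface WebFetchParams {
  url: string;
  selector?: string;
  extractText?: boolean; // defaults to true when omitted
}

type ExtractionMode = "text" | "selector" | "structured";

// Hypothetical helper: the precedence of `selector` over `extractText`
// is an assumption for this sketch.
function resolveMode(params: WebFetchParams): ExtractionMode {
  const selector = params.selector?.trim();
  if (selector) return "selector";
  return params.extractText === false ? "structured" : "text";
}

console.log(resolveMode({ url: "https://example.com" })); // prints "text"
console.log(resolveMode({ url: "https://example.com", selector: "h1" })); // prints "selector"
console.log(resolveMode({ url: "https://example.com", extractText: false })); // prints "structured"
```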

Error Handling

The tool never throws. All failure scenarios return a structured error object:

{
  "error": true,
  "message": "Unable to resolve domain",
  "details": "Cannot resolve https://example.invalid. The website may not exist or be temporarily unavailable.",
  "timestamp": "2026-03-20T12:00:00.000Z",
  "toolName": "web-fetch",
  "url": "https://example.invalid"
}

| Scenario               | message                                |
|------------------------|----------------------------------------|
| Malformed URL          | "Invalid URL format"                   |
| DNS failure            | "Unable to resolve domain"             |
| Connection refused     | "Connection refused"                   |
| Navigation timeout     | "Timeout while loading page"           |
| Other navigation error | "Failed to navigate to page"           |
| AI extraction failure  | "Failed to extract content from page"  |
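
The never-throw contract could be implemented along these lines. This is a sketch under stated assumptions, not the tool's actual code: `makeError`, `validateUrl`, `classifyNavigationError`, and `isToolError` are hypothetical helpers, and the substrings matched in `classifyNavigationError` are assumed from Chromium's network error names rather than taken from the source.

```typescript
// Shape of the error object documented in this section.
interface ToolError {
  error: true;
  message: string;
  details: string;
  timestamp: string;
  toolName: string;
  url: string;
}

// Hypothetical helper that fills in the fields shared by every error.
function makeError(message: string, details: string, url: string): ToolError {
  return {
    error: true,
    message,
    details,
    timestamp: new Date().toISOString(),
    toolName: "web-fetch",
    url,
  };
}

// Returns an error object for a malformed URL, or null when the URL parses.
function validateUrl(url: string): ToolError | null {
  try {
    new URL(url);
    return null;
  } catch {
    return makeError("Invalid URL format", `Could not parse "${url}" as a URL.`, url);
  }
}

// Maps a raw navigation failure to one of the documented messages.
// The matched substrings are assumptions for this sketch.
function classifyNavigationError(raw: string): string {
  if (raw.includes("ERR_NAME_NOT_RESOLVED")) return "Unable to resolve domain";
  if (raw.includes("ERR_CONNECTION_REFUSED")) return "Connection refused";
  if (/timeout|timed_out/i.test(raw)) return "Timeout while loading page";
  return "Failed to navigate to page";
}

// Type guard that lets callers branch on failure without try/catch.
function isToolError(value: unknown): value is ToolError {
  return (
    typeof value === "object" &&
    value !== null &&
    (value as { error?: unknown }).error === true
  );
}
```

A caller would check `isToolError(result)` before reading `content`, since a failed fetch comes back as a normal return value rather than an exception.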

Testing

Unit Tests (mocked browser)

Fast, no API key or browser required:

npm run test:unit

Covers: StagehandManager singleton behavior, URL validation, all navigation error cases, extraction modes, page cleanup, ToolDefinition wiring.

Integration Tests (real browser)

Launches a real headless Chromium browser and makes live network requests. Requires OPENAI_API_KEY:

OPENAI_API_KEY=your-key npm run test:integration

Covers: real Stagehand initialization, fetching example.com in all three extraction modes, unreachable domain error handling.

All Tests

npm run test:run

Integration with Fractal Synapse

  1. Add to package.json dependencies:

    {
      "dependencies": {
        "stagehand-tools": "file:../../packages/tools/stagehand-tools"
      }
    }
    
  2. Import and register:

    import { webFetchToolDefinition } from 'stagehand-tools';
    
    toolRegistry.registerTool('web-fetch', webFetchToolDefinition);
    
  3. Add to AgentDefinition:

    const agentDefinition = new AgentDefinition(
      'My Agent',
      'Description',
      'System prompt',
      ['web-fetch'],
      'openai-gpt-4o'
    );
    

Comparison with Web Fetch HTTP Tool

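| Feature                     | Stagehand (this)   | Web Fetch HTTP |
|-----------------------------|--------------------|----------------|
| JavaScript rendering        | Yes                | No             |
| Dynamic content / SPAs      | Yes                | No             |
| Speed                       | 🐌 Slower          | Fast           |
| Resource usage              | 🔋 Heavy (browser) | 💡 Lightweight |
| Setup                       | Needs browser      | Simple         |
| Reliability on static sites | High               | High           |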

Use Stagehand for JavaScript-heavy sites, SPAs, or pages that require interaction. Use web-fetch-http-tool for static sites where speed and simplicity matter.

License

ISC

Author

James Peret