Browser Control

Overview

Browser control endpoints allow you to directly manipulate the browser environment independently of agent tasks. Use these endpoints to:

Navigate to specific URLs before or during agent execution
Capture screenshots for debugging, monitoring, or verification

When to Use Browser Control

Navigate Endpoint

Use the navigate endpoint when you need:

Pre-positioning the browser before sending a task
Direct control over navigation (bypassing agent decision-making)
Immediate navigation without waiting for agent processing
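As a rough illustration of pre-positioning, the sketch below builds (but does not send) a navigate request using only the Python standard library. The base URL, the path layout, the `url` field name, and the bearer-token header are all assumptions made for illustration; the Navigate Browser reference is the authority on the real request shape.

```python
import json
import urllib.request

# Hypothetical placeholder: not the real API host.
BASE_URL = "https://api.example.com"


def build_navigate_request(session_id: str, url: str, api_key: str) -> urllib.request.Request:
    """Build a POST request asking the session's browser to load `url`.

    The path and JSON body are assumed shapes, not the documented ones.
    """
    body = json.dumps({"url": url}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/sessions/{session_id}/navigate",  # assumed path
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
    )


navigate_request = build_navigate_request(
    "sess_123", "https://example.com/login", "YOUR_API_KEY"
)
# Passing navigate_request to urllib.request.urlopen() would send it.
```

Building the request separately from sending it keeps the sketch easy to inspect and to adapt once the real path and field names are known.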

Screenshot Endpoint

Use the screenshot endpoint when you need:

Visual debugging of agent behavior
Progress monitoring during long-running tasks
Verification of page state after operations
Documentation of agent actions
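For the documentation and verification cases above, one possible approach is to save each capture under a numbered filename so the sequence of agent actions can be reviewed later. This sketch assumes the screenshot endpoint returns raw PNG bytes and that it lives at the path shown; both are assumptions, and the response may in fact be encoded differently (for example, base64 inside JSON).

```python
import urllib.request
from pathlib import Path

# Hypothetical placeholder: not the real API host.
BASE_URL = "https://api.example.com"

# Every valid PNG file starts with this 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def build_screenshot_request(session_id: str, api_key: str) -> urllib.request.Request:
    """Build a GET request for the current browser view (assumed path and auth)."""
    return urllib.request.Request(
        f"{BASE_URL}/sessions/{session_id}/screenshot",
        method="GET",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def save_screenshot(image_bytes: bytes, directory: Path, step: int) -> Path:
    """Write one capture to `directory` as step_NNN.png and return its path."""
    if not image_bytes.startswith(PNG_SIGNATURE):
        raise ValueError("response does not look like a PNG image")
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"step_{step:03d}.png"
    path.write_bytes(image_bytes)
    return path
```

The signature check is a cheap guard against saving an error payload as if it were an image.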

Comparison with Agent Messages

Method	When to Use
Navigate API	Direct control, immediate navigation
Message with start_url	Let agent handle navigation naturally
Message with URL in text	Agent decides how to interpret URL
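To make the three options in the table concrete, the sketch below shows how their request bodies might differ. Only `start_url` comes from this page; the `url` and `message` field names are hypothetical and may not match the real API.

```python
def navigate_payload(url: str) -> dict:
    """Navigate API: the caller picks the page, and no agent reasoning is involved."""
    return {"url": url}  # assumed field name


def message_with_start_url(text: str, start_url: str) -> dict:
    """Message with start_url: the agent handles navigation as part of the task."""
    return {"message": text, "start_url": start_url}  # "message" is assumed


def message_with_inline_url(text: str, url: str) -> dict:
    """Message with URL in text: the agent decides how to interpret the URL."""
    return {"message": f"{text} {url}"}  # "message" is assumed
```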

Best Practices

Use navigate before sending a message to ensure the agent starts on the right page.
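The ordering in this tip can be sketched as a small helper. `BrowserSession` and its two methods are a hypothetical client interface invented for this example, not part of any published SDK.

```python
from typing import Protocol


class BrowserSession(Protocol):
    """Hypothetical client interface; the method names are illustrative only."""

    def navigate(self, url: str) -> None: ...

    def send_message(self, text: str) -> None: ...


def run_task_from_page(session: BrowserSession, url: str, task: str) -> None:
    # Position the browser first so the agent begins on the intended page.
    session.navigate(url)
    # Only then hand the task to the agent.
    session.send_message(task)
```

Any object with matching `navigate` and `send_message` methods satisfies the protocol, so a thin wrapper around the real HTTP calls could be passed in.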

Screenshots are useful for debugging and verification, but avoid excessive calls as they can be resource-intensive.
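One way to avoid excessive calls is a client-side throttle that lets a capture through only if enough time has passed since the last one. This is a minimal sketch; the class name and the interval are illustrative, and this page does not state any rate limit.

```python
import time
from typing import Callable, Optional


class ScreenshotThrottle:
    """Allow at most one screenshot per `min_interval_s` seconds."""

    def __init__(self, min_interval_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._min_interval_s = min_interval_s
        self._clock = clock  # injectable so the throttle can be tested without sleeping
        self._last: Optional[float] = None

    def allow(self) -> bool:
        """Return True and record the time if a capture is permitted now."""
        now = self._clock()
        if self._last is None or now - self._last >= self._min_interval_s:
            self._last = now
            return True
        return False
```

A monitoring loop would call `allow()` before each screenshot request and skip the capture when it returns `False`.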

The navigate endpoint changes browser state immediately. Use it carefully to avoid disrupting agent tasks that are in progress.

For most use cases, providing a start_url in your message is more flexible than using the navigate endpoint directly.

Related Endpoints

Navigate Browser

Navigate to a specific URL

Get Screenshot

Capture browser screenshots
