OpenTelemetry for Agentic Browsers: Tracing Auto‑Agent AI Browser Pipelines with User‑Agent/Client‑Hints Context and “What Is My Browser Agent” Probes
TL;DR: Treat your agentic browser as a multi-layer distributed system. Use OpenTelemetry to trace the orchestrator, CDP/BiDi commands, DOM interactions, page network traffic, and LLM calls; enrich spans with user-agent and Client Hints; run “what is my browser agent” probes as synthetic health checks; emit security-risk signals; and ship SLOs and drift alerts with cost-aware sampling. This article lays out the semantic model, code examples, and operational patterns you can adopt today.
Why observability for agentic browsers is fundamentally different
Agentic browsers are not just automated UI tests. They are autonomous workflows that combine:
- An orchestrator that decomposes goals into steps (often LLM-driven)
- A browser automation stack (CDP or WebDriver BiDi; e.g., Playwright, Puppeteer, Selenium)
- In-page execution (DOM events, script injection, fetch/XHR)
- Out-of-band calls (vector DBs, tools, model inference APIs)
This topology behaves like a distributed system inside one machine. Failures are multi-causal: anti-bot challenges, DOM instability, CORS errors, UA-CH drift, model hallucinations, and more. Without end-to-end tracing, you’ll spend hours correlating console logs and screenshots instead of answering basic questions:
- Which CDP command caused the page to navigate to a blocked domain?
- Why do success rates drop only on pages with certain Client Hints?
- Are we paying to trace every keystroke when costs spike, or are we sampling intelligently?
- Did our stealth profile drift (navigator.webdriver, UA-CH) and trigger bot defenses?
OpenTelemetry (OTel) gives you the language to unify all of that.
Architectural overview: spans across orchestrator, CDP/BiDi, DOM, and network
We’ll instrument five layers, each producing spans and metrics:
- Orchestrator layer
  - Top-level “run” span per agent task; span links for subgoals
  - LLM/tool spans with cost attributes
  - Context propagation into browser sessions
- CDP/BiDi layer
  - A span per command (e.g., Page.navigate, Runtime.callFunctionOn, Input.dispatchKeyEvent)
  - Attributes for command name, params summary, timing, errors
- DOM layer (in-page)
  - Spans for element queries, clicks, keystrokes, scrolls
  - Spans for long tasks, layout shifts, console errors
- Network layer
  - Spans for fetch/XHR and resource loads, correlated with CDP Network events
  - Sanitized headers and major response metadata
- Context enrichment and probes
  - UA string, Client Hints (UA-CH), navigator.webdriver
  - Periodic “what is my browser agent” synthetic probes
All spans share a run_id and browser_session_id. The orchestrator injects a W3C traceparent into the browser page so that in-page spans join the same trace.
Semantic model and attributes: what to name things
OpenTelemetry defines semantic conventions for HTTP, RPC, and database calls, and its GenAI conventions (the gen_ai.* namespace) are still evolving. CDP/BiDi and in-page DOM semantics are not standardized. Use conservative, vendor-neutral attribute names with clear namespaces to avoid collisions.
Recommended resource attributes (on every process):
- service.name: "agent-browser-orchestrator" (or "agent-browser-worker", "agent-page")
- service.version, service.instance.id
- deployment.environment: "prod" | "staging" | "dev"
- browser.brand, browser.version (when in-page or CDP metadata available)
- os.type, os.version (host machine if relevant)
Common span attributes:
- agent.run_id: stable UUID for the user task
- agent.step_id: per-subgoal
- browser.session_id: map to context/page
- browser.page_url, browser.frame_id
- user_agent.original: full UA string used by the browser
- ua_ch.brands, ua_ch.platform, ua_ch.mobile, ua_ch.arch, ua_ch.model (stringified JSON if needed)
- automation.webdriver: boolean from navigator.webdriver
- antibot.challenge: one of "recaptcha", "turnstile", "cf_interstitial", "akamai_bm", "none"
CDP/BiDi spans:
- rpc.system: "cdp" or "bidi"
- rpc.service: CDP domain (e.g., "Page", "Network", "Runtime")
- rpc.method: command name (e.g., "Page.navigate")
- cdp.params.preview: shallow preview of arguments with redaction
- cdp.result.preview: shallow preview of result (redacted)
- cdp.seq: command sequence number
DOM spans (in-page):
- browser.dom.action: "click" | "type" | "select" | "scroll" | "query" | "wait"
- browser.dom.selector: CSS/XPath used
- browser.dom.element.role | .name (ARIA), .text_preview (redacted)
Network spans:
- http.url, http.method, http.status_code (legacy OTel HTTP names; the current stable conventions use url.full, http.request.method, http.response.status_code)
- server.address, server.port
- net.peer.name, net.peer.ip (avoid PII; bound to infrastructure)
- http.request.header.user_agent (if present)
- http.response.header.cf_ray, http.response.header.cf_cache_status, http.response.header.x_powered_by (risk signals)
LLM/tool spans (draft/unofficial):
- ai.system: "openai" | "anthropic" | "local_gguf" | "tool"
- ai.model.name, ai.model.provider
- ai.request.tokens, ai.response.tokens, ai.total_cost_usd
- ai.prompt.redacted (never raw content without a data policy!)
Security events (as span events or logs):
- security.csp.violation: { directive, blocked_uri }
- security.mixed_content: { type, url }
- security.download.blocked: { url }
- security.cookies.third_party: count
These attributes are suggestions; adjust names to match your org’s conventions and data governance.
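One way to keep these names consistent across layers is to build them in a single helper. The sketch below assumes the attribute names suggested above; `RunContext` and `commonSpanAttributes` are hypothetical names introduced for illustration, not part of any OTel API.

```typescript
// Hypothetical per-run context; field names are illustrative only.
interface RunContext {
  runId: string;
  sessionId: string;
  stepId?: string;
  pageUrl?: string;
  userAgent?: string;
  uaCh?: { brands?: { brand: string; version: string }[]; platform?: string; mobile?: boolean };
  webdriver?: boolean;
}

type AttrValue = string | number | boolean;

// Flatten the context into namespaced span attributes, skipping unset fields
// so that spans never carry empty placeholder values.
function commonSpanAttributes(ctx: RunContext): Record<string, AttrValue> {
  const attrs: Record<string, AttrValue> = {
    'agent.run_id': ctx.runId,
    'browser.session_id': ctx.sessionId,
  };
  if (ctx.stepId) attrs['agent.step_id'] = ctx.stepId;
  if (ctx.pageUrl) attrs['browser.page_url'] = ctx.pageUrl;
  if (ctx.userAgent) attrs['user_agent.original'] = ctx.userAgent;
  // Span attributes cannot hold arrays of objects, so brands are stringified.
  if (ctx.uaCh?.brands) attrs['ua_ch.brands'] = JSON.stringify(ctx.uaCh.brands);
  if (ctx.uaCh?.platform) attrs['ua_ch.platform'] = ctx.uaCh.platform;
  if (ctx.uaCh?.mobile !== undefined) attrs['ua_ch.mobile'] = ctx.uaCh.mobile;
  if (ctx.webdriver !== undefined) attrs['automation.webdriver'] = ctx.webdriver;
  return attrs;
}
```

The result can be passed directly to `span.setAttributes(...)` in every layer, which keeps renames in one place.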
Context propagation: get the traceparent into the page
If you don’t propagate context into the browser page, you’ll end up with multiple traces per run. Two pragmatic options:
- CDP injection: Write the current traceparent into a window global via Runtime.evaluate
- HTTP injection: Add a traceparent query param to the initial navigation URL (good for your own pages)
Example (Node.js orchestrator with Playwright + OTel JS):
```ts
import { randomUUID } from 'node:crypto';
import { chromium } from 'playwright';
import { context, propagation, trace, SpanStatusCode } from '@opentelemetry/api';
import { NodeTracerProvider } from '@opentelemetry/sdk-trace-node';
import { SimpleSpanProcessor } from '@opentelemetry/sdk-trace-base';
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';

// Basic tracer setup. With no `url` option, the exporter reads
// OTEL_EXPORTER_OTLP_ENDPOINT itself and appends /v1/traces.
// (addSpanProcessor is the SDK 1.x API; SDK 2.x takes `spanProcessors` in the constructor.)
const provider = new NodeTracerProvider();
provider.addSpanProcessor(new SimpleSpanProcessor(new OTLPTraceExporter()));
provider.register(); // also installs the W3C trace-context propagator

const tracer = trace.getTracer('agent.orchestrator');

async function runAgentTask(task: string) {
  await tracer.startActiveSpan('agent.run', async (span) => {
    const runId = randomUUID();
    span.setAttributes({
      'agent.run_id': runId,
      'agent.task': task,
      'deployment.environment': process.env.ENV || 'dev',
    });

    const browser = await chromium.launch({
      headless: true,
      args: ['--disable-blink-features=AutomationControlled'],
    });
    try {
      const contextPW = await browser.newContext();
      const page = await contextPW.newPage();

      // Serialize the active span context as a W3C traceparent value
      const carrier: Record<string, string> = {};
      propagation.inject(context.active(), carrier);
      const tpHeader = carrier.traceparent ?? '';

      // Option 1: init script, runs before page scripts on every navigation
      await page.addInitScript((tp) => {
        (window as any).__TRACEPARENT__ = tp;
      }, tpHeader);

      // Option 2: set it on the current document via CDP Runtime.evaluate
      const cdp = await contextPW.newCDPSession(page);
      await cdp.send('Runtime.evaluate', {
        expression: `window.__TRACEPARENT__ = ${JSON.stringify(tpHeader)}`,
      });

      // Option 3 (pages you own): pass it on the navigation URL
      await page.goto('https://example.com/?traceparent=' + encodeURIComponent(tpHeader));

      // ... perform actions ...

      span.setStatus({ code: SpanStatusCode.OK });
    } catch (err) {
      span.recordException(err as Error);
      span.setStatus({ code: SpanStatusCode.ERROR });
      throw err;
    } finally {
      await browser.close();
      span.end();
    }
  });
}
```
Note: For production, use a stable way to propagate the active trace context. You can also expose it via a cookie or localStorage key if you control the site.
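Whichever channel carries the value, the in-page side should validate it before joining the trace. The following is a minimal sketch of parsing and formatting a version-00 W3C traceparent (`version-traceid-parentid-flags`, lowercase hex); `parseTraceparent` and `formatTraceparent` are hypothetical helper names, and the parser deliberately rejects anything that is not exactly four fields.

```typescript
interface TraceParent {
  version: string;
  traceId: string;
  spanId: string;
  sampled: boolean;
}

// 2 hex (version) - 32 hex (trace ID) - 16 hex (parent span ID) - 2 hex (flags)
const TRACEPARENT_RE = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

function parseTraceparent(header: string): TraceParent | null {
  const m = TRACEPARENT_RE.exec(header.trim());
  if (!m) return null;
  const version = m[1];
  const traceId = m[2];
  const spanId = m[3];
  const flags = m[4];
  // Version "ff" and all-zero IDs are invalid per the W3C Trace Context spec.
  if (version === 'ff' || /^0+$/.test(traceId) || /^0+$/.test(spanId)) return null;
  // Bit 0 of the flags byte is the "sampled" flag.
  return { version, traceId, spanId, sampled: (parseInt(flags, 16) & 0x01) === 1 };
}

function formatTraceparent(traceId: string, spanId: string, sampled: boolean): string {
  return `00-${traceId}-${spanId}-${sampled ? '01' : '00'}`;
}
```

An in-page shim could call `parseTraceparent(window.__TRACEPARENT__ || '')` and fall back to starting a fresh trace when the result is `null`.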
Instrumenting CDP/BiDi commands
Wrap the low-level CDP/BiDi send channel so each command is a child span. Here’s a Playwright CDP example in Node.js:
```ts
import { SpanStatusCode } from '@opentelemetry/api';
import type { Tracer } from '@opentelemetry/api';
import type { CDPSession } from 'playwright';

function instrumentCDPSession(
  cdp: CDPSession,
  tracer: Tracer,
  sessionAttrs: { runId: string; sessionId: string },
): CDPSession {
  const originalSend = cdp.send.bind(cdp) as (method: string, params?: unknown) => Promise<unknown>;
  let seq = 0;

  // Replace send() so every command becomes a child span of the active span
  (cdp as any).send = (method: string, params?: unknown) => {
    seq += 1;
    return tracer.startActiveSpan(`cdp.${method}`, async (span) => {
      span.setAttributes({
        'rpc.system': 'cdp',
        'rpc.service': method.split('.')[0],
        'rpc.method': method,
        'cdp.seq': seq,
        'agent.run_id': sessionAttrs.runId,
        'browser.session_id': sessionAttrs.sessionId,
      });

      // Shallow preview; redact before truncating so a cut-off value cannot slip through
      const preview = (JSON.stringify(params) ?? '')
        .replace(/("authorization":")[^"]*"/gi, '$1<redacted>"')
        .slice(0, 512);
      span.setAttribute('cdp.params.preview', preview);

      try {
        const result = await originalSend(method, params);
        span.setAttribute('cdp.result.preview', (JSON.stringify(result) ?? '').slice(0, 512));
        return result;
      } catch (err) {
        span.recordException(err as Error);
        span.setStatus({ code: SpanStatusCode.ERROR, message: 'CDP command failed' });
        throw err;
      } finally {
        span.end();
      }
    });
  };
  return cdp;
}
```
For BiDi (WebDriver BiDi), the same pattern applies: wrap send/receive methods and tag rpc.system = "bidi".
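A single regex over the serialized params only catches one header name. A somewhat sturdier approach is to redact by key before serializing. The sketch below is one way to do that; the key pattern, depth limit, and size limits are assumptions to tune, and `redact` and `previewParams` are hypothetical helpers.

```typescript
// Keys whose values should never reach a span (assumed list; extend per policy)
const SENSITIVE_KEY = /authorization|cookie|password|passwd|secret|token|api[-_]?key/i;

// Recursively copy a value, masking sensitive keys and bounding depth and size
function redact(value: unknown, depth = 0): unknown {
  if (depth > 3) return '<truncated>';
  if (Array.isArray(value)) return value.slice(0, 10).map((v) => redact(v, depth + 1));
  if (value !== null && typeof value === 'object') {
    const out: Record<string, unknown> = {};
    for (const [k, v] of Object.entries(value as Record<string, unknown>)) {
      out[k] = SENSITIVE_KEY.test(k) ? '<redacted>' : redact(v, depth + 1);
    }
    return out;
  }
  if (typeof value === 'string' && value.length > 128) return value.slice(0, 128) + '...';
  return value;
}

// Serialize the redacted copy and cap the preview length
function previewParams(params: unknown, maxLen = 512): string {
  const json = JSON.stringify(redact(params)) ?? '';
  return json.length > maxLen ? json.slice(0, maxLen) : json;
}
```

Used in a send wrapper, this would replace the inline preview: `span.setAttribute('cdp.params.preview', previewParams(params))`.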
In-page instrumentation: DOM actions, performance, and Client Hints
Install the OpenTelemetry Web SDK in the controlled page when you can, or inject a light shim otherwise. Capture:
- DOM interactions the agent performs
- Long tasks and layout shifts (PerformanceObserver)
- fetch/XHR via OTel web instrumentation
- UA/Client Hints via navigator.userAgent and navigator.userAgentData
Example injection script (runs in the page):
```js
(async () => {
  // Minimal signal bus; replace with the full OTel Web SDK if you own the app
  const tp = window.__TRACEPARENT__ || '';

  // Look the queue up on every push: the orchestrator replaces the array when it drains it
  const pushEvent = (ev) => {
    (window.__AGENT_EVENTS__ = window.__AGENT_EVENTS__ || []).push(ev);
  };

  // navigator.userAgentData is Chromium-only and requires a secure context
  const ua = navigator.userAgent;
  let uaCH = {};
  if (navigator.userAgentData) {
    try {
      const brands = navigator.userAgentData.brands || [];
      const high = await navigator.userAgentData.getHighEntropyValues([
        'platform', 'platformVersion', 'architecture', 'model', 'uaFullVersion',
      ]);
      uaCH = { brands, ...high, mobile: navigator.userAgentData.mobile };
    } catch {}
  }

  // Expose to later CDP reads as well
  window.__AGENT_CONTEXT__ = { tp, ua, uaCH, webdriver: navigator.webdriver === true };

  // Track DOM clicks (agent actions often dispatch synthetic events)
  document.addEventListener('click', (e) => {
    const t = e.target;
    pushEvent({
      type: 'dom.click',
      ts: Date.now(),
      selector: t?.closest ? cssPath(t) : '',
      ariaRole: t?.getAttribute?.('role') || '',
      textPreview: (t?.innerText || '').slice(0, 80),
    });
  }, { capture: true });

  // Performance: long tasks. There is no window.LongTask global, so feature-detect
  // through PerformanceObserver.supportedEntryTypes instead.
  if ('PerformanceObserver' in window &&
      PerformanceObserver.supportedEntryTypes?.includes('longtask')) {
    const po = new PerformanceObserver((list) => {
      for (const e of list.getEntries()) {
        pushEvent({ type: 'perf.longtask', ts: Date.now(), duration: e.duration });
      }
    });
    try { po.observe({ entryTypes: ['longtask'] }); } catch {}
  }

  // Build a short CSS path (at most 5 levels, stopping at the first element with an id)
  function cssPath(el) {
    if (!(el instanceof Element)) return '';
    const path = [];
    while (el && path.length < 5) {
      let selector = el.nodeName.toLowerCase();
      if (el.id) {
        path.unshift(selector + '#' + el.id);
        break;
      }
      let sib = el;
      let ix = 1;
      while (sib.previousElementSibling) { sib = sib.previousElementSibling; ix++; }
      path.unshift(`${selector}:nth-child(${ix})`);
      el = el.parentElement;
    }
    return path.join('>');
  }
})();
```
From the orchestrator, periodically pull window.__AGENT_CONTEXT__ and window.__AGENT_EVENTS__ via CDP Runtime.evaluate and emit them as span events, so that the trace shows DOM actions interleaved with CDP commands and network activity.
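The drained queue needs a small translation step before it can be attached to a span, because span attributes only accept primitive values. Here is a minimal sketch of that step, assuming the event shape pushed by the in-page shim (`type`, `ts`, plus arbitrary fields); `toSpanEvents` is a hypothetical helper.

```typescript
interface PageEvent {
  type: string;
  ts: number; // epoch milliseconds from Date.now() in the page
  [key: string]: unknown;
}

interface SpanEventInput {
  name: string;
  attributes: Record<string, string | number | boolean>;
  timeMs: number;
}

// Convert drained in-page events into (name, attributes, time) triples,
// ordered by page timestamp and with non-primitive fields stringified.
function toSpanEvents(events: PageEvent[]): SpanEventInput[] {
  return events
    .filter((ev) => typeof ev.type === 'string' && typeof ev.ts === 'number')
    .sort((a, b) => a.ts - b.ts)
    .map((ev) => {
      const attributes: Record<string, string | number | boolean> = {};
      for (const [k, v] of Object.entries(ev)) {
        if (k === 'type' || k === 'ts') continue;
        if (typeof v === 'string' || typeof v === 'number' || typeof v === 'boolean') {
          attributes[k] = v;
        } else if (v != null) {
          attributes[k] = JSON.stringify(v);
        }
      }
      return { name: ev.type, attributes, timeMs: ev.ts };
    });
}
```

Each entry can then be emitted with `span.addEvent(e.name, e.attributes, e.timeMs)`, so the event keeps the page-side timestamp rather than the time it was drained.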
Network instrumentation: fetch/XHR and CDP Network domain
You want both perspectives:
- In-page fetch/XHR instrumentation for app-level endpoints (uses OTel web or your hook on fetch and XMLHttpRequest)
- CDP Network events for all resources (document, stylesheet, images, third-party scripts)
Example: CDP Network events to spans:
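```ts
import { context, SpanKind, SpanStatusCode } from '@opentelemetry/api';
import type { Span, Tracer } from '@opentelemetry/api';
import type { CDPSession } from 'playwright';

function instrumentNetwork(
  cdp: CDPSession,
  tracer: Tracer,
  sessionAttrs: { runId: string; sessionId: string },
) {
  const requests = new Map<string, Span>();
  // CDP events fire outside any active span, so capture the caller's context
  // now and use it as the parent; otherwise each request starts a new trace.
  const parentCtx = context.active();

  cdp.on('Network.requestWillBeSent', (e) => {
    // A redirect reuses the requestId; close the span for the previous hop
    requests.get(e.requestId)?.end();
    const span = tracer.startSpan('http.client', {
      kind: SpanKind.CLIENT,
      attributes: {
        'http.url': e.request.url,
        'http.method': e.request.method,
        'rpc.system': 'cdp',
        'agent.run_id': sessionAttrs.runId,
        'browser.session_id': sessionAttrs.sessionId,
      },
    }, parentCtx);
    requests.set(e.requestId, span);
  });

  cdp.on('Network.responseReceived', (e) => {
    const span = requests.get(e.requestId);
    if (!span) return;
    span.setAttributes({
      'http.status_code': e.response.status,
      'server.address': e.response.remoteIPAddress || '',
      'server.port': e.response.remotePort || 0,
      'http.response.header.cf_ray': e.response.headers['cf-ray'] || '',
      'http.response.header.x_powered_by': e.response.headers['x-powered-by'] || '',
    });
    if (e.response.status >= 400) {
      span.setStatus({ code: SpanStatusCode.ERROR, message: `HTTP ${e.response.status}` });
    }
  });

  cdp.on('Network.loadingFinished', (e) => {
    const span = requests.get(e.requestId);
    if (span) { span.end(); requests.delete(e.requestId); }
  });

  cdp.on('Network.loadingFailed', (e) => {
    const span = requests.get(e.requestId);
    if (!span) return;
    span.recordException({ name: 'NetworkError', message: e.errorText || 'loadingFailed' });
    span.setStatus({ code: SpanStatusCode.ERROR });
    span.end();
    requests.delete(e.requestId);
  });

  // Enable after the listeners are attached; surface failures instead of dropping them
  cdp.send('Network.enable').catch((err) => console.warn('Network.enable failed', err));
}
```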
Always redact sensitive headers. Avoid recording cookies or Authorization values. If you need to diagnose authentication, record the presence (boolean) rather than values.
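The presence-flag rule can be enforced in one place before any header reaches a span. Below is a minimal sketch using an explicit allowlist; the two header sets are assumptions to adapt, `headerAttributes` is a hypothetical helper, and the underscore-normalized attribute keys follow the naming used elsewhere in this article.

```typescript
// Headers whose values must never be recorded (presence flag only)
const DROP_VALUE = new Set(['authorization', 'proxy-authorization', 'cookie', 'set-cookie']);
// Headers that are safe and useful to record verbatim (assumed allowlist)
const KEEP_VALUE = new Set(['user-agent', 'content-type', 'sec-ch-ua', 'sec-ch-ua-platform', 'sec-ch-ua-mobile']);

function headerAttributes(
  kind: 'request' | 'response',
  headers: Record<string, string>,
): Record<string, string | boolean> {
  const attrs: Record<string, string | boolean> = {};
  for (const [name, value] of Object.entries(headers)) {
    const lower = name.toLowerCase();
    const key = `http.${kind}.header.${lower.replace(/-/g, '_')}`;
    if (DROP_VALUE.has(lower)) {
      attrs[`${key}.present`] = true; // record that it was sent, never what it was
    } else if (KEEP_VALUE.has(lower)) {
      attrs[key] = value.slice(0, 256);
    }
    // Everything else is dropped by default.
  }
  return attrs;
}
```

Defaulting to "drop" means a newly introduced custom header cannot leak until someone deliberately adds it to the allowlist.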
Enrich spans with User-Agent and Client Hints
Chromium has reduced the information carried in the User-Agent (UA) string and moved the detail into User-Agent Client Hints (UA-CH). In Chromium, you can also override the UA string and UA-CH metadata via CDP to emulate profiles (Emulation.setUserAgentOverride). Enrich the top-level run span with both the UA string and the UA-CH values resolved at runtime.
Pull from the page:
```ts
const uaContext = await cdp.send('Runtime.evaluate', {
  expression: 'JSON.stringify(window.__AGENT_CONTEXT__ || {})',
  returnByValue: true,
});
const { ua, uaCH, webdriver } = JSON.parse(uaContext.result.value || '{}');

runSpan.setAttributes({
  'user_agent.original': ua || '',
  'ua_ch.brands': JSON.stringify(uaCH?.brands || []),
  'ua_ch.platform': uaCH?.platform || '',
  'ua_ch.mobile': String(uaCH?.mobile ?? ''),
  'automation.webdriver': webdriver === true,
});
```
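From CDP Network, you can also observe which UA and Client Hints headers were actually sent to a given site. Low-entropy hints (Sec-CH-UA, Sec-CH-UA-Mobile, Sec-CH-UA-Platform) are sent by default over HTTPS; high-entropy hints are sent only after the server opts in via Accept-CH, and reach third-party origins only when Permissions-Policy delegates them. Use this to detect drift between your desired profile and the effective headers.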
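Comparing the effective Sec-CH-UA header against the intended profile requires parsing the header first. The sketch below handles the common `"Brand";v="version"` list form and ignores GREASE entries such as "Not A(Brand" when diffing; the GREASE pattern is a heuristic assumption, and `parseSecChUa`, `isGrease`, and `brandsDrift` are hypothetical helpers.

```typescript
interface Brand {
  brand: string;
  version: string;
}

// Parse a header such as: "Chromium";v="121", "Not A(Brand";v="99"
function parseSecChUa(header: string): Brand[] {
  const brands: Brand[] = [];
  const re = /"((?:[^"\\]|\\.)*)"\s*;\s*v="([^"]*)"/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(header)) !== null) {
    // Unescape backslash-escaped characters inside the quoted brand
    brands.push({ brand: m[1].replace(/\\(.)/g, '$1'), version: m[2] });
  }
  return brands;
}

// Heuristic: GREASE brands are variations of "Not A Brand" with odd punctuation
function isGrease(b: Brand): boolean {
  return /^not[^a-z0-9]*a[^a-z0-9]*brand$/i.test(b.brand.trim());
}

// Order-insensitive diff of real (non-GREASE) brands
function brandsDrift(expected: Brand[], actual: Brand[]): string[] {
  const keys = (list: Brand[]) => list.filter((b) => !isGrease(b)).map((b) => `${b.brand}/${b.version}`);
  const exp = keys(expected);
  const act = keys(actual);
  const missing = exp.filter((k) => act.indexOf(k) === -1).map((k) => `missing:${k}`);
  const unexpected = act.filter((k) => exp.indexOf(k) === -1).map((k) => `unexpected:${k}`);
  return missing.concat(unexpected);
}
```

A non-empty result from `brandsDrift` is a natural payload for a span event on the run span.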
“What is my browser agent” probes: synthetic health checks
Run a lightweight, scheduled probe that launches your agent browser with the same settings used in production, visits a small suite of detectors, and validates invariants:
- UA matches expected pattern
- Client Hints present (brands, platform, full version) when policies allow
- navigator.webdriver equals expected (true for transparent automation, false if you rely on stealth profiling and have a data/privacy model to justify it)
- Headless vs. headed status
- No unexpected challenge pages (e.g., Cloudflare interstitial)
Targets you can use:
- https://httpbin.org/headers (echoes request headers, including Sec-CH-UA* and User-Agent)
- https://httpbin.org/user-agent (simple UA echo)
- A self-hosted echo endpoint that returns headers and navigator info via a small JS snippet
Example probe code with OTel:
```ts
import { SpanStatusCode } from '@opentelemetry/api';
import type { Tracer } from '@opentelemetry/api';
import type { Browser } from 'playwright';

async function runBrowserProbe(tracer: Tracer, browserFactory: () => Promise<Browser>) {
  return tracer.startActiveSpan('probe.browser_agent', async (span) => {
    const browser = await browserFactory();
    try {
      const contextPW = await browser.newContext();
      const page = await contextPW.newPage();

      await page.goto('https://httpbin.org/headers');
      const body = await page.textContent('pre');
      const headers = JSON.parse(body || '{}').headers || {};
      // The echo service may change header casing, so check both forms
      const get = (name: string) => headers[name] || headers[name.toLowerCase()] || '';
      const ua = get('User-Agent');
      const ch = {
        ua: get('Sec-Ch-Ua'),
        platform: get('Sec-Ch-Ua-Platform'),
        mobile: get('Sec-Ch-Ua-Mobile'),
      };

      // Collect navigator data too
      const nav = await page.evaluate(() => ({
        webdriver: navigator.webdriver,
        hasUAData: !!(navigator as any).userAgentData,
        brands: (navigator as any).userAgentData?.brands || [],
      }));

      span.setAttributes({
        'probe.ua': ua,
        'probe.ua_ch': JSON.stringify(ch),
        'probe.navigator.webdriver': String(nav.webdriver),
        'probe.ua_data.brands': JSON.stringify(nav.brands),
      });

      // Assertions
      const errors: string[] = [];
      if (!/Chrome|Firefox|Safari|Edg/.test(ua)) errors.push('UA missing major brand');
      if (!ch.ua) errors.push('UA-CH Sec-CH-UA header missing');
      if (nav.webdriver !== true) errors.push('navigator.webdriver expected true in this profile');

      if (errors.length) {
        for (const err of errors) span.addEvent('probe.failure', { message: err });
        span.setStatus({ code: SpanStatusCode.ERROR, message: errors.join('; ') });
      }
    } catch (err) {
      span.recordException(err as Error);
      span.setStatus({ code: SpanStatusCode.ERROR, message: 'probe crashed' });
      throw err;
    } finally {
      // Always release the browser and close the span, even when the probe fails
      await browser.close();
      span.end();
    }
  });
}
```
Run this on a cron and alert on failures. Keep it cheap: do not load heavy pages or run long flows. The goal is early detection of environmental drift (browser version changes, CH behavior changes, launch flags broken).
Surface security‑risk signals
Agentic browsers can unintentionally look like scrapers or bots. Emit signals to detect and mitigate risk while preserving user intent.
Signals to capture (as attributes or span events):
- Anti-bot challenges
  - Detect DOM patterns: presence of reCAPTCHA/Turnstile widgets
  - HTTP responses with interstitial HTML or challenge headers (e.g., cf-mitigated: challenge; cf-ray alone only shows that the response came through Cloudflare)
- CSP violations
  - Listen to SecurityPolicyViolationEvent in-page; attach directive and blocked URL
- Mixed-content and insecure requests
  - From CDP Security domain: securityStateChanged events
- Cookie and storage access patterns
  - Count third-party cookies set; detect SameSite=None without Secure on HTTPS
- Download attempts
  - Page.downloadWillBegin / Browser.downloadWillBegin events; block or log
- Suspicious third-party scripts
  - Tag resource loads from non-allowlisted hosts
- Navigation to sensitive domains
  - Maintain a policy list and emit violations
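The anti-bot signal can be reduced to a single value for the antibot.challenge attribute. The following is a heuristic sketch only: the markers it checks (widget class names, script hosts, the "Just a moment" title, the `_abck` cookie) are assumptions based on commonly observed behavior and will need tuning against the sites you actually visit. `PageSnapshot` and `classifyChallenge` are hypothetical names.

```typescript
type Challenge = 'recaptcha' | 'turnstile' | 'cf_interstitial' | 'akamai_bm' | 'none';

interface PageSnapshot {
  status: number;
  headers: Record<string, string>; // header names lowercased by the caller
  html: string;                    // the first few KB of the document is enough
  cookieNames?: string[];
}

function classifyChallenge(p: PageSnapshot): Challenge {
  const html = p.html.toLowerCase();
  // Full-page Cloudflare challenge: explicit header, or blocked status plus interstitial title
  if (p.headers['cf-mitigated'] === 'challenge' ||
      (p.status === 403 && html.includes('<title>just a moment'))) {
    return 'cf_interstitial';
  }
  // Embedded widgets
  if (html.includes('cf-turnstile') || html.includes('challenges.cloudflare.com/turnstile')) {
    return 'turnstile';
  }
  if (html.includes('g-recaptcha') || html.includes('google.com/recaptcha/')) {
    return 'recaptcha';
  }
  // Assumed Akamai Bot Manager signal: its cookie is present and the request was denied
  if (p.status === 403 && (p.cookieNames ?? []).includes('_abck')) {
    return 'akamai_bm';
  }
  return 'none';
}
```

Setting the result on the navigation span makes the challenge-rate SLI a simple count over one attribute.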
CDP examples:
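```ts
import { context } from '@opentelemetry/api';
import type { Tracer } from '@opentelemetry/api';
import type { CDPSession } from 'playwright';

function instrumentSecurity(
  cdp: CDPSession,
  tracer: Tracer,
  session: { runId: string; sessionId: string },
) {
  // Capture the caller's context so event-driven spans join the run's trace
  const parentCtx = context.active();

  cdp.on('Security.securityStateChanged', (e) => {
    const span = tracer.startSpan('security.state', {}, parentCtx);
    span.setAttributes({
      'security.state': e.securityState,
      'browser.session_id': session.sessionId,
      'agent.run_id': session.runId,
    });
    for (const iss of e.explanations || []) {
      span.addEvent('security.explanation', {
        securityState: iss.securityState,
        description: iss.description,
      });
    }
    span.end();
  });

  cdp.send('Security.enable').catch((err) => console.warn('Security.enable failed', err));
}
```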
In-page CSP violations:
```js
window.addEventListener('securitypolicyviolation', (e) => {
  window.__AGENT_EVENTS__ = window.__AGENT_EVENTS__ || [];
  window.__AGENT_EVENTS__.push({
    type: 'security.csp.violation',
    directive: e.violatedDirective,
    blocked: e.blockedURI,
    ts: Date.now(),
  });
});
```
These signals enable policy enforcement and post-incident forensics, and they’re invaluable for SLOs tied to safe operation.
SLOs for agentic browser pipelines
Define SLOs rooted in business outcomes but diagnosable through telemetry.
Recommended SLIs:
- Task success rate
  - Numerator: runs with final state success
  - Denominator: all initiated runs
  - Tag by domain, site class, device profile
- End-to-end latency (p99/p95)
  - From agent.run span duration
- Challenge rate
  - Fraction of runs encountering an anti-bot challenge
- DOM stability
  - Max Cumulative Layout Shift (CLS)-like metric or count of long tasks > 200ms
- HTTP failure rate
  - 4xx/5xx per run, weighted by criticality
- UA/CH drift rate
  - Fraction of runs where UA/CH differ from configured profile
- Cost per successful run
  - Sum ai.total_cost_usd over LLM/tool spans per run
Alerting ideas:
- Success rate below target SLO for 30 minutes
- p95 run duration above threshold
- Challenge rate spikes and correlates with UA/CH drift
- Cost per success increases by > X% week-over-week
Emit SLI metrics via OTel Metrics or derive in your backend using trace-to-metrics. Use exemplars to link metrics to hot traces for debugging.
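If you derive SLIs yourself rather than in the backend, the arithmetic is small. Here is a minimal sketch over per-run summaries; `RunRecord` and `computeSlis` are hypothetical, the percentile uses the nearest-rank method, and cost per success divides total spend (including failed runs) by the number of successful runs, which is one reasonable definition among several.

```typescript
interface RunRecord {
  success: boolean;
  durationMs: number;
  challenged: boolean;
  costUsd: number;
}

// Nearest-rank percentile computed on a sorted copy
function percentile(values: number[], p: number): number {
  if (values.length === 0) return NaN;
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

function computeSlis(runs: RunRecord[]) {
  const total = runs.length;
  const successes = runs.filter((r) => r.success).length;
  const challenged = runs.filter((r) => r.challenged).length;
  const totalCost = runs.reduce((sum, r) => sum + r.costUsd, 0);
  return {
    successRate: total ? successes / total : NaN,
    p95LatencyMs: percentile(runs.map((r) => r.durationMs), 95),
    challengeRate: total ? challenged / total : NaN,
    // Failed runs still cost money, so they count toward the numerator
    costPerSuccessUsd: successes ? totalCost / successes : Infinity,
  };
}
```

In practice you would group `runs` by domain or UA profile before calling `computeSlis`, matching the tags suggested above.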
Cost‑aware sampling: spend tracing budget where it matters
Agentic workflows can be verbose. Tracing every CDP call in production can be cost-prohibitive. Implement cost-aware sampling with two levers:
- Head sampling in SDK: lower baseline rate for successful, cheap runs
- Tail sampling in the OTel Collector: keep slow/error/expensive traces at higher rates
Tag spans with cost signals:
- ai.request.tokens, ai.response.tokens
- agent.cdp.commands: count of CDP commands in a run
- agent.heuristic.risk: high/medium/low
Then configure the Collector for tail sampling policies.
Example OTel Collector config (YAML snippet):
```yaml
processors:
  attributes/strip_sensitive:
    actions:
      - key: http.request.header.authorization
        action: delete
      - key: http.request.header.cookie
        action: delete
  tail_sampling:
    decision_wait: 10s
    num_traces: 50000
    policies:
      - name: errors
        type: status_code
        status_code:
          status_codes: [ERROR]
      - name: slow_traces
        type: latency
        latency:
          threshold_ms: 5000
      # numeric_attribute only accepts integer bounds, so the fractional
      # USD cost is matched with an OTTL condition instead
      - name: expensive_ai
        type: ottl_condition
        ottl_condition:
          error_mode: ignore
          span:
            - 'attributes["ai.total_cost_usd"] >= 0.20'
      - name: default_prob
        type: probabilistic
        probabilistic:
          sampling_percentage: 5

service:
  pipelines:
    traces:
      # receivers and exporters omitted for brevity
      processors: [attributes/strip_sensitive, tail_sampling]
```
Interpretation:
- Always keep error traces, slow traces, and traces with AI cost >= $0.20
- Otherwise keep 5% baseline
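In the SDK, the head-sampling decision is made when the root span starts and cannot be revised afterwards. To keep more traces when the agent detects risk mid-run (e.g., a challenge is encountered), set an attribute such as agent.heuristic.risk on the root span and add a Collector tail-sampling policy that matches it. This only helps for traces the head sampler already recorded, so keep the head rate high enough for risky workloads.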
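To reason about what the policies will keep before deploying them, it can help to mirror them in a few lines of code. The sketch below is a simplified model, not the Collector's implementation: the thresholds are the example values from this section, the probabilistic step derives a fraction from the trace ID instead of the Collector's salted hash, and `TraceSummary` and `keepReason` are hypothetical names.

```typescript
interface TraceSummary {
  traceId: string; // 32 lowercase hex characters
  hasError: boolean;
  durationMs: number;
  aiCostUsd: number;
}

// Deterministic value in [0, 1) from the last 32 bits of the trace ID,
// so repeated evaluations of the same trace agree
function traceFraction(traceId: string): number {
  return parseInt(traceId.slice(-8), 16) / 0x100000000;
}

// Returns the name of the first matching policy, or null if the trace is dropped
function keepReason(t: TraceSummary, baselinePct = 5): string | null {
  if (t.hasError) return 'errors';
  if (t.durationMs > 5000) return 'slow_traces';
  if (t.aiCostUsd >= 0.2) return 'expensive_ai';
  if (traceFraction(t.traceId) < baselinePct / 100) return 'default_prob';
  return null;
}
```

Running this over a day of run summaries gives a rough estimate of retained trace volume for a given baseline percentage.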
Data governance: scrub PII and secrets
Automated browsing is prone to collecting sensitive data. Guardrails:
- Never record full page content into spans by default
- Redact headers: Authorization, Cookie, Set-Cookie; only record presence flags
- Mask form inputs and text previews, restrict to metadata or hashed tokens
- Store screenshots only on explicit debug flags; link by storage key, not inline
- Respect site robots/policies and internal allowlists
- Separate production and staging exporters and storage
OTel processors (attributes, transform) can enforce redaction centrally in the Collector.
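Central redaction works best as a second line of defense; masking at the source keeps sensitive text out of the pipeline entirely. Below is a minimal sketch for text previews such as browser.dom.element.text_preview. The two patterns (email addresses and long digit runs) are assumptions and far from a complete PII detector; `maskPreview` is a hypothetical helper.

```typescript
const EMAIL = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
// Card- or phone-like runs: at least 8 digits, optionally separated by spaces or dashes
const LONG_DIGITS = /\d[\d -]{6,}\d/g;

function maskPreview(text: string, maxLen = 80): string {
  const masked = text
    .replace(EMAIL, '<email>')
    .replace(LONG_DIGITS, '<digits>')
    .replace(/\s+/g, ' ') // collapse whitespace so previews stay compact
    .trim();
  return masked.length > maxLen ? masked.slice(0, maxLen) : masked;
}
```

Masking runs before truncation so that a cut-off email or number cannot survive as a partial value.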
End‑to‑end example: wiring orchestrator, CDP, DOM and UA‑CH enrichment
Putting the pieces together with Playwright and OTel JS (TypeScript). This is a simplified but cohesive sketch.
```ts
import { randomUUID } from 'node:crypto';
import { chromium } from 'playwright';
import { NodeTracerProvider } from '@opentelemetry/sdk-trace-node';
import { BatchSpanProcessor } from '@opentelemetry/sdk-trace-base';
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';
import { trace, SpanStatusCode } from '@opentelemetry/api';
// instrumentCDPSession, instrumentNetwork, and instrumentSecurity are the helpers shown earlier

// The exporter reads OTEL_EXPORTER_OTLP_ENDPOINT on its own when no url is given
const provider = new NodeTracerProvider();
provider.addSpanProcessor(new BatchSpanProcessor(new OTLPTraceExporter()));
provider.register();
const tracer = trace.getTracer('agent.browser');

async function run(url: string) {
  await tracer.startActiveSpan('agent.run', async (runSpan) => {
    const runId = randomUUID();
    // Generate our own session ID rather than reading Playwright's private _guid
    const session = { runId, sessionId: randomUUID() };
    runSpan.setAttributes({
      'agent.run_id': runId,
      'browser.session_id': session.sessionId,
      'deployment.environment': process.env.ENV || 'dev',
    });

    const browser = await chromium.launch({ headless: true });
    try {
      const contextPW = await browser.newContext();
      const page = await contextPW.newPage();
      const cdp = await contextPW.newCDPSession(page);

      // Instrument
      instrumentCDPSession(cdp, tracer, session);
      instrumentNetwork(cdp, tracer, session);
      instrumentSecurity(cdp, tracer, session);

      // Inject DOM & UA-CH collector
      await page.addInitScript({ path: './inject-dom-ua.js' });

      // Navigate with a child span
      await tracer.startActiveSpan('browser.navigate', async (navSpan) => {
        navSpan.setAttributes({ 'browser.page_url': url });
        try {
          await page.goto(url, { waitUntil: 'domcontentloaded' });
          navSpan.setStatus({ code: SpanStatusCode.OK });
        } catch (e) {
          navSpan.recordException(e as Error);
          navSpan.setStatus({ code: SpanStatusCode.ERROR });
        } finally {
          navSpan.end();
        }
      });

      // Pull UA/CH and navigator info
      const uaContext = await cdp.send('Runtime.evaluate', {
        expression: 'JSON.stringify(window.__AGENT_CONTEXT__ || {})',
        returnByValue: true,
      });
      const { ua, uaCH, webdriver } = JSON.parse(uaContext.result.value || '{}');
      runSpan.setAttributes({
        'user_agent.original': ua || '',
        'ua_ch.brands': JSON.stringify(uaCH?.brands || []),
        'ua_ch.platform': uaCH?.platform || '',
        'ua_ch.mobile': String(uaCH?.mobile ?? ''),
        'automation.webdriver': webdriver === true,
      });

      // Perform an action: click the first link
      await tracer.startActiveSpan('browser.dom.click', async (span) => {
        span.setAttributes({ 'browser.dom.selector': 'a' });
        try {
          await page.click('a');
          span.setStatus({ code: SpanStatusCode.OK });
        } catch (e) {
          span.recordException(e as Error);
          span.setStatus({ code: SpanStatusCode.ERROR });
        } finally {
          span.end();
        }
      });

      // Drain in-page events and attach them to the trace with their page timestamps
      const events = await page.evaluate(() => {
        const w = window as any;
        const evts = w.__AGENT_EVENTS__ || [];
        w.__AGENT_EVENTS__ = [];
        return evts;
      });
      for (const { type, ts, ...attrs } of events) runSpan.addEvent(type, attrs, ts);
    } finally {
      await browser.close();
      runSpan.end();
    }
  });
}

// Flush the batch processor before the process exits
run('https://example.com')
  .catch(console.error)
  .finally(() => provider.shutdown());
```
This example:
- Starts an agent.run span
- Instruments CDP commands, network, and security
- Injects a small DOM/UA collector
- Enriches the run with UA/CH and webdriver signals
- Captures a simple DOM action and flushes in-page events
In production, add LLM/tool spans, error handling, and more robust context propagation.
Handling Client Hints correctly (and overriding when needed)
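Low-entropy Client Hints (Sec-CH-UA, Sec-CH-UA-Mobile, Sec-CH-UA-Platform) are sent by default on secure connections. High-entropy hints are sent only if the server opts in via Accept-CH, and third-party origins receive hints only when a relevant Permissions-Policy delegates them (e.g., "ch-ua", "ch-ua-platform"). If you need deterministic UA-CH for external sites (e.g., to maintain a stable profile), pure header injection won’t work for all hints; the browser enforces policy.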
However, for Chrome/CDP you can emulate UA and UA-CH metadata via Emulation.setUserAgentOverride:
```ts
await cdp.send('Emulation.setUserAgentOverride', {
  userAgent:
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36',
  acceptLanguage: 'en-US,en;q=0.9',
  platform: 'Windows',
  userAgentMetadata: {
    brands: [
      { brand: 'Chromium', version: '121' },
      { brand: 'Not A(Brand', version: '99' },
      { brand: 'Google Chrome', version: '121' },
    ],
    fullVersion: '121.0.6167.85',
    platform: 'Windows',
    platformVersion: '15.0.0',
    architecture: 'x86',
    model: '',
    mobile: false,
  },
});
```
Use this sparingly and consistently; your probe should assert that the effective UA/CH match your intended profile, to catch drift when browsers upgrade.
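An override is only useful if the UA string and the UA-CH metadata tell the same story. The sketch below checks a few obvious invariants before the override is applied; the platform-to-token map and the rules themselves are assumptions covering common Chrome profiles only, and `uaOverrideIssues` is a hypothetical helper.

```typescript
interface UaMetadata {
  brands: { brand: string; version: string }[];
  fullVersion?: string;
  platform: string;
  mobile: boolean;
}

// Substrings expected in the UA string for each UA-CH platform value (assumed map)
const PLATFORM_TOKENS: Record<string, string> = {
  Windows: 'Windows NT',
  macOS: 'Macintosh',
  Linux: 'Linux',
  Android: 'Android',
};

function uaOverrideIssues(userAgent: string, meta: UaMetadata): string[] {
  const issues: string[] = [];
  const m = /Chrome\/(\d+)\./.exec(userAgent);
  const uaMajor = m ? m[1] : '';
  if (!uaMajor) issues.push('UA string has no Chrome/<major> token');

  const chrome = meta.brands.find((b) => b.brand === 'Google Chrome' || b.brand === 'Chromium');
  if (!chrome) {
    issues.push('brands has no Chromium or Google Chrome entry');
  } else if (uaMajor && chrome.version !== uaMajor) {
    issues.push(`brand version ${chrome.version} != UA major ${uaMajor}`);
  }
  if (uaMajor && meta.fullVersion && meta.fullVersion.split('.')[0] !== uaMajor) {
    issues.push(`fullVersion ${meta.fullVersion} != UA major ${uaMajor}`);
  }

  const token = PLATFORM_TOKENS[meta.platform];
  if (token && userAgent.indexOf(token) === -1) {
    issues.push(`platform ${meta.platform} not reflected in UA string`);
  }
  if (meta.mobile !== /\bMobile\b/.test(userAgent)) {
    issues.push('mobile flag disagrees with UA string');
  }
  return issues;
}
```

Calling this in the probe as well as at launch catches the case where a browser upgrade changes the real UA string while the override metadata stays pinned.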
Correlating LLM/tool calls with browsing steps
Agentic pipelines often interleave browsing with LLM reasoning. Instrument those calls as first-class spans so you can answer: "Which prompt caused the agent to click the ad banner?"
Sketch (Node.js):
```ts
import { SpanStatusCode } from '@opentelemetry/api';
import type { Tracer } from '@opentelemetry/api';

// invokeModel and estimateCost are placeholders for your own model SDK and pricing logic
async function callModel(tracer: Tracer, provider: string, model: string, prompt: string, runId: string) {
  return tracer.startActiveSpan('ai.invoke', async (span) => {
    span.setAttributes({
      'ai.system': provider,
      'ai.model.name': model,
      'agent.run_id': runId,
    });
    const t0 = performance.now();
    try {
      const res = await invokeModel(provider, model, prompt); // your SDK
      const t1 = performance.now();
      span.setAttributes({
        'ai.request.tokens': res.usage?.prompt_tokens || 0,
        'ai.response.tokens': res.usage?.completion_tokens || 0,
        'ai.total_cost_usd': estimateCost(provider, model, res.usage),
        'ai.latency_ms': t1 - t0,
      });
      span.setStatus({ code: SpanStatusCode.OK });
      return res;
    } catch (e) {
      span.recordException(e as Error);
      span.setStatus({ code: SpanStatusCode.ERROR });
      throw e;
    } finally {
      span.end();
    }
  });
}
```
With these spans, set tail sampling policies to retain expensive or error-prone AI calls, and annotate run spans with ai.total_cost_usd aggregated across children.
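The cost attribute and its per-run aggregate can be computed with a small price table. This sketch assumes token-based pricing per million tokens; every price and model name in `PRICES` is a made-up placeholder, and `estimateCost` and `totalRunCost` are hypothetical helpers that you would replace with your provider's real rates.

```typescript
interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
}

// PLACEHOLDER prices in USD per 1M tokens, keyed by "provider/model". Not real rates.
const PRICES: Record<string, { input: number; output: number }> = {
  'example/small': { input: 0.5, output: 1.5 },
  'example/large': { input: 5, output: 15 },
};

function estimateCost(provider: string, model: string, usage?: Usage): number {
  const price = PRICES[`${provider}/${model}`];
  if (!price || !usage) return 0; // unknown model or missing usage: report zero, not a guess
  return (usage.prompt_tokens * price.input + usage.completion_tokens * price.output) / 1e6;
}

// Sum child-call costs for the run span, rounded to micro-dollars to avoid float noise
function totalRunCost(calls: { provider: string; model: string; usage?: Usage }[]): number {
  const sum = calls.reduce((s, c) => s + estimateCost(c.provider, c.model, c.usage), 0);
  return Math.round(sum * 1e6) / 1e6;
}
```

Setting `ai.total_cost_usd` on the run span from `totalRunCost(...)` lets a single tail-sampling condition on the root span retain expensive runs.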
Drift detection: when your browser’s identity changes
Drift sources:
- Browser updates changing UA-CH brand versions
- Launch flags changing headless/headed or automation signals
- Plugin/extension differences
- Timezone/locale changes
- Cookie jar or storage state differences
Detect drift by:
- Persisting a golden profile (UA string hash, UA-CH JSON canonicalized) per env
- On each run, compute hashes and compare; emit span events on mismatch
- Maintain SLI: ua_ch_drift_rate
- Probe daily with the “what is my browser agent” job
Example drift check:
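```ts
type Brand = { brand: string; version: string };

function canonicalUAProfile(ua: string, uaCH: any): string {
  return JSON.stringify({
    ua,
    // Sort brands: Chromium may reorder the list, which would otherwise look like drift
    brands: (uaCH.brands || [])
      .map((b: Brand) => ({ brand: b.brand, version: b.version }))
      .sort((a: Brand, b: Brand) => a.brand.localeCompare(b.brand)),
    platform: uaCH.platform,
    mobile: !!uaCH.mobile,
    arch: uaCH.architecture || '',
  });
}

// The golden profile must be produced by the same function so the two strings are comparable
const current = canonicalUAProfile(ua, uaCH);
if (current !== process.env.GOLDEN_UA_PROFILE) {
  runSpan.addEvent('ua_ch.drift', {
    current,
    golden: process.env.GOLDEN_UA_PROFILE || '',
  });
}
```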
Alert when drift exceeds a threshold (e.g., after a browser auto-update) and roll out updates intentionally.
Storage and backend: where to send signals
- Traces: OTLP to your OTel Collector, then to a backend like Tempo, Jaeger, Honeycomb, Lightstep, etc.
- Metrics: OTLP metrics to Collector -> Prometheus/OTLP exporter -> your TSDB
- Logs/events: Prefer span events for correlated context; otherwise OTLP logs
Keep all three signals cross-linked via trace_id and run_id. Expose dashboards that pivot by domain, UA profile, and environment.
Operational tips and gotchas
- Headless does not equal undetectable. Decide if you embrace transparency (navigator.webdriver = true) or invest in stealth—and document the ethics and policies.
- CDP rate limiting: Flooding commands can trigger page instability. Trace command durations; back off on slow or error commands.
- Screenshots and recordings are expensive. Link them by object storage keys and fetch on demand only for sampled traces.
- Use span links when fan-out happens (e.g., parallel fetches or multi-tab flows) rather than forcing a strict parent/child tree.
- Timeouts: Treat timeouts as first-class errors and emit consistent error types (agent.timeout.navigation, agent.timeout.selector).
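Consistent timeout types are easier to enforce with one classifier applied wherever errors are recorded. The sketch below assumes Playwright-style errors, where a timeout has `name === 'TimeoutError'` and a message that starts with the API call (for example `page.goto: Timeout 30000ms exceeded.`); the method lists and the extra `agent.timeout.other` and `agent.error` buckets are assumptions, and `classifyAgentError` is a hypothetical helper.

```typescript
type AgentErrorType =
  | 'agent.timeout.navigation'
  | 'agent.timeout.selector'
  | 'agent.timeout.other'
  | 'agent.error';

function classifyAgentError(err: unknown): AgentErrorType {
  const e = err as { name?: string; message?: string } | null;
  const name = e?.name ?? '';
  const message = e?.message ?? '';
  const isTimeout = name === 'TimeoutError' || /timeout .*exceeded/i.test(message);
  if (!isTimeout) return 'agent.error';
  // Navigation-style calls
  if (/\b(goto|reload|goBack|goForward|waitForNavigation|waitForURL)\b/.test(message)) {
    return 'agent.timeout.navigation';
  }
  // Element-targeting calls
  if (/\b(click|fill|type|hover|waitForSelector|locator)\b/.test(message)) {
    return 'agent.timeout.selector';
  }
  return 'agent.timeout.other';
}
```

The result can be recorded as an attribute next to `span.recordException(err)`, for example under a key such as `error.type`, so dashboards can group failures without parsing messages.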
A minimal Python example (Selenium BiDi)
If you prefer Python, the same pattern works with Selenium. This minimal example uses classic WebDriver commands; BiDi event handling is not shown:
```python
from selenium import webdriver
from selenium.webdriver.common.by import By
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

provider = TracerProvider()
provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint='http://localhost:4318/v1/traces'))
)
trace.set_tracer_provider(provider)
tracer = trace.get_tracer('agent.selenium')

with tracer.start_as_current_span('agent.run') as run_span:
    options = webdriver.ChromeOptions()
    options.add_argument('--headless=new')
    driver = webdriver.Chrome(options=options)
    try:
        with tracer.start_as_current_span('browser.navigate') as s:
            driver.get('https://httpbin.org/headers')

        ua = driver.execute_script('return navigator.userAgent')
        webdriver_flag = driver.execute_script('return navigator.webdriver === true')
        run_span.set_attribute('user_agent.original', ua)
        run_span.set_attribute('automation.webdriver', webdriver_flag)

        # Click example (if page had it)
        # driver.find_element(By.CSS_SELECTOR, 'a').click()
    finally:
        driver.quit()
```
You can integrate BiDi events similarly and emit rpc.system = "bidi" for command spans.
Checklist: production readiness
- Instrument
  - Orchestrator, CDP/BiDi, DOM, network, LLM/tools
  - Context propagation into page (traceparent)
- Enrich
  - UA, UA-CH, navigator.webdriver
  - Anti-bot and security signals
- Probe
  - Scheduled “what is my browser agent” job
  - Drift check against golden profile
- Govern
  - PII redaction in Collector
  - Separate environments and exporters
- Operate
  - SLOs: success, latency, challenge rate, cost per success
  - Dashboards by domain and UA profile
  - Cost-aware tail sampling policies
Conclusion
Agentic browsers deserve first-class observability. When you instrument the orchestrator, CDP/BiDi commands, DOM, and network with OpenTelemetry—and enrich traces with UA and Client Hints—you move from guesswork to causal diagnosis. Add “what is my browser agent” probes to catch drift before it becomes an outage. Emit security-risk signals to keep your automation well-behaved. Finally, apply cost-aware sampling so that you pay for the traces you actually need. With these patterns, your auto-agent browser pipelines become debuggable, governable, and production-grade.