agents.txt for Agentic Browsers: A Robots.txt‑Style Policy Using Client Hints and Tokens to Reduce Agent Browser Risk
Agentic browsers—automated, tool-using user agents that can navigate, extract data, fill forms, and transact—are crossing the chasm from research to deployment. They blend headless browsing, language models, and toolkits capable of acting on web UIs. This brings value for end users and developers (automation, accessibility, testing, QA, data intake), but also raises obvious risks: scraping, abuse, fraudulent transactions, privacy violations, and operational load.
The status quo is brittle. Sites rely on:
- Unreliable User-Agent strings and UA switchers.
- CAPTCHAs and anti-bot mitigations that harm accessibility and legitimate automation.
- Ad-hoc rate limits and manual allowlists.
We can do better. This article proposes agents.txt, a robots.txt‑style policy and handshake that lets origins publish capability, rate-limit, and personally identifiable information (PII) rules. Agentic browsers discover and honor those rules using well-known URLs, HTTP headers, and User-Agent Client Hints; they present sender-constrained tokens when required and adapt tools accordingly. The aim is to reduce risk compared to guesswork automation while remaining privacy-aware and deployable with today’s stacks.
I’ll propose a concrete, incremental design that:
- Uses an RFC 8615 well-known URI and a simple line-based format, plus an optional JSON form.
- Leverages HTTP Client Hints (RFC 8942) and UA-CH for controlled agent identification, without reintroducing passive fingerprinting.
- Provides a token challenge-response flow using OAuth 2.0 Bearer (RFC 6750) or DPoP (RFC 9449) for sender-constrained tokens.
- Defines a capability taxonomy (read, click, type, download, form.submit, purchase, etc.), rate limiting, and PII/consent rules.
- Adds compatibility with robots.txt and on-page meta overrides.
It’s opinionated and designed to fit real systems, CDNs, and app servers, while acknowledging standardization paths (IETF HTTPBIS, WICG for UA-CH).
Why we need agents.txt now
- Agentic traffic is growing: AI copilots, RPA-like assistants, research bots, and QA/test agents increasingly browse real sites.
- UA spoofing is widespread: without a policy handshake, good and bad automation looks the same. Sites block aggressively, hurting accessibility and developer tooling.
- Robots.txt isn’t enough: it’s optimized for indexing crawlers, not interactive agents that fill forms or trigger side effects.
- Privacy and compliance pressure: unconsented PII extraction and retention invites regulatory risk (GDPR, CCPA).
A declarative policy plus a discovery and token handshake gives sites a standardized way to say: “Here’s what your agent may do here, at what rate, with what privacy constraints. If you need elevated capabilities, here’s how to authenticate.”
Design goals
- Safety: Reduce security and privacy risk by making agent behavior explicit, auditable, and adaptive.
- Deployability: Work with existing infrastructure (CDN caches, app servers) and be incrementally adoptable.
- Privacy: Avoid passively identifying users; use opt-in Client Hints and sender-constrained tokens.
- Clarity: Simple text format like robots.txt, with an optional JSON variant for richer policy.
- Extensibility: Versioned schema; future headers/fields can be added without breakage.
A quick recap: relevant web building blocks
- robots.txt: A line-based crawler policy, standardized as the Robots Exclusion Protocol (RFC 9309). It uses path Allow/Disallow with wildcards, plus extensions like Crawl-delay and Sitemap. It’s not designed for interactive agent capabilities.
- HTTP Client Hints (RFC 8942): Opt-in headers that servers can request from clients (e.g., UA hints). The UA-CH initiative (WICG) defines Sec-CH-UA, Sec-CH-UA-Platform, etc. They’re privacy-aware and controlled by server opt-in and Permissions-Policy.
- Well-Known URIs (RFC 8615): A standardized way to discover config/policy at /.well-known/<name>.
- OAuth 2.0 (RFC 6749) and Bearer Tokens (RFC 6750): Authentication/authorization.
- DPoP (RFC 9449): Demonstration of Proof-of-Possession for sender-constrained tokens; mitigates token theft.
- Structured Headers (RFC 8941): A way to define well-typed, parseable HTTP header fields.
We’ll use these rather than inventing everything from scratch.
The agents.txt policy
Location and media types
- Primary: /.well-known/agents.txt
- Optional JSON: /.well-known/agents.policy.json
- Proposed media types:
- text/agents-policy (line-based, robots-like)
- application/agents-policy+json (JSON variant)
Servers may advertise the policy with a header:
Agent-Policy: url="/.well-known/agents.txt"; version=1; ch="Sec-CH-UA, Sec-CH-UA-Platform"; token="dpop"; required=?0
Agent-Policy is a Structured Header. Fields:
- url: absolute or origin-relative policy location.
- version: policy schema version integer.
- ch: comma-separated list of Client Hints used by the policy matching rules.
- token: preferred token type (none, bearer, dpop, mtls, privacypass).
- required: Structured Headers boolean (?1/?0) indicating whether tokens are required for elevated capabilities.
If not present, agents can still attempt discovery at the well-known path.
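As an illustration, a client can reduce this header to a plain dictionary with a few lines of parsing. This is a hand-rolled sketch (parse_agent_policy is a hypothetical helper name), not a full RFC 8941 Structured Headers parser:

```python
def parse_agent_policy(value):
    """Parse an Agent-Policy header value into a dict.
    Sketch only: a real client should use an RFC 8941 parser."""
    fields = {}
    for part in value.split(';'):
        part = part.strip()
        if '=' not in part:
            continue
        key, raw = part.split('=', 1)
        raw = raw.strip()
        if raw == '?1':          # Structured Headers boolean true
            val = True
        elif raw == '?0':        # Structured Headers boolean false
            val = False
        elif raw.isdigit():
            val = int(raw)
        else:
            val = raw.strip('"')
        fields[key.strip()] = val
    return fields

policy = parse_agent_policy(
    'url="/.well-known/agents.txt"; version=1; token="dpop"; required=?0')
```

An agent would consult fields like token and required before deciding whether to fetch a token up front.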
Agent selector and semantics
Policies are scoped by agent selectors. An “agent” is identified by brand/vendor and optional product/model version. To avoid reviving fingerprinting, selectors are matched against explicit agent declarations (opt-in), not passively scraped UA strings.
We define a normalized agent tag format:
- vendor/product["/"major["."minor["."patch]]]
- Wildcards allowed: vendor/*, */*, *
- Example tags: openai/browser, openai/browser/1, anthropic/sonar, acme/agent/2025.1, agentic/*, */*
Agents SHOULD declare their tag via an opt-in Client Hint Sec-CH-Agent-AI (proposed) or, for early adoption, via a dedicated Agent-Name header. Servers must first opt in to receive Client Hints using Accept-CH and Permissions-Policy.
Example opt-in from server:
Accept-CH: Sec-CH-Agent-AI
Permissions-Policy: ch-ua=(), ch-ua-platform=(), ch-agent-ai=(self)
Vary: Sec-CH-Agent-AI
Example request by an agent:
GET / HTTP/1.1
Host: example.com
Sec-CH-Agent-AI: "openai/browser/1"
User-Agent: Mozilla/5.0 (compatible; Agentic)
The new CH would need registration, but deployments can start with the non-CH header Agent-Name while preserving Vary and transparency.
Capabilities taxonomy
To keep the spec grounded, we propose a minimal capability set agents can negotiate. Capabilities reflect actions with side effects and access intensity.
- nav.read: Navigate and read content (GET, HEAD)
- dom.read: Read DOM structure
- dom.screenshot: Capture screenshots
- dom.click: Click actionable elements
- dom.type: Type into inputs and textareas
- form.submit: Submit forms
- download: Download files
- upload: Upload files
- login: Perform authentication flows
- account.modify: Change account settings (profile, password)
- purchase: Initiate purchases or payments
- api.call: Call documented JSON/GraphQL APIs on the same origin
- websocket: Open WebSocket connections
- script.inject: Inject custom script (often forbidden)
Sites can allow or deny capabilities per agent selector and path scope.
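To make the evaluation concrete, here is one possible client-side check, assuming capabilities and path rules have already been merged from the matching blocks. The helper names (capability_allowed, path_allowed) are illustrative, and the tie-breaking rule is a conservative choice, not something the format mandates:

```python
def capability_allowed(cap, caps_allow, caps_deny):
    """Default-deny: a capability must be explicitly allowed and not denied."""
    return cap in caps_allow and cap not in caps_deny

def path_allowed(path, allow, disallow):
    """Longest-prefix match in the spirit of robots.txt; the most specific
    rule wins, and ties go to Disallow (a deliberately conservative choice)."""
    best, verdict = -1, False
    for prefix in allow:
        if path.startswith(prefix) and len(prefix) > best:
            best, verdict = len(prefix), True
    for prefix in disallow:
        if path.startswith(prefix) and len(prefix) >= best:
            best, verdict = len(prefix), False
    return verdict
```

An agent would run both checks before every action: first that the path is in scope, then that the specific capability (e.g. form.submit) is granted.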
Rate limits and backoff
We define:
- rps: requests per second average
- burst: maximum short-term burst in a 1-second bucket
- concurrency: simultaneous in-flight requests per origin
- daily: total requests per calendar day
- crawl-delay: minimum seconds between page fetches within a path scope
- backoff: behavior on 429/503 (exponential with jitter)
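A client-side limiter for the rps and burst directives can be sketched as a standard token bucket. TokenBucket is an illustrative helper, not part of the proposal:

```python
import time

class TokenBucket:
    """Tokens refill at `rps` per second up to `burst`; each request
    spends one token, sleeping when the bucket is empty."""
    def __init__(self, rps, burst):
        self.rps = rps
        self.burst = burst
        self.tokens = float(burst)
        self.last = time.monotonic()

    def acquire(self):
        now = time.monotonic()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rps)
        self.last = now
        if self.tokens < 1:
            # Sleep just long enough to earn one token.
            time.sleep((1 - self.tokens) / self.rps)
            self.tokens = 1.0
            self.last = time.monotonic()
        self.tokens -= 1
```

Concurrency and daily caps would layer on top (a semaphore and a per-origin counter, respectively).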
PII policy and consent
Agents need explicit rules for collection, processing, retention, and storage of PII.
- pii.collect: none | minimal | consented | allowed
- pii.detect: list of detectors agents must run (email, phone, ssn, credit_card)
- pii.redact: fields to redact in logs and downstream storage
- pii.store: retention limit (e.g., 0d, 7d, 30d)
- consent.ui: actions requiring a user-confirmation UI (e.g., purchase, account.modify)
- provenance.tag: require agents to attach provenance metadata in requests they trigger (e.g., via headers)
This is not legal advice; the policy is a technical tool to implement compliance controls.
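As a sketch of how an agent might honor pii.detect and pii.redact, the detectors below are deliberately simplistic regexes; real deployments should use vetted PII-detection libraries:

```python
import re

# Toy detectors for the pii.detect list; NOT production-grade.
DETECTORS = {
    'email': re.compile(r'[\w.+-]+@[\w-]+\.[\w.]+'),
    'phone': re.compile(r'\+?\d[\d\s().-]{7,}\d'),
    'credit_card': re.compile(r'\b(?:\d[ -]?){13,16}\b'),
}

def redact(text, fields):
    """Replace each detected PII span with a labeled placeholder,
    per the policy's pii.redact directive."""
    for name in fields:
        pattern = DETECTORS.get(name)
        if pattern:
            text = pattern.sub(f'[{name} redacted]', text)
    return text
```

An agent would run redact over anything it logs or persists, and drop data entirely once the pii.store retention window elapses.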
Authentication and tokens
To gate elevated capabilities (form.submit, purchase, login), sites can require tokens. Supported token types:
- none (default for public read)
- bearer (RFC 6750)
- dpop (RFC 9449) sender-constrained token
- mtls (mutual TLS) – enterprise-only
- privacypass (for frictionless human verification)
Servers advertise the challenge via WWW-Authenticate and/or the agents.txt Auth field. Example challenge:
HTTP/1.1 401 Unauthorized
WWW-Authenticate: Agent realm="example", as="dpop", token_endpoint="https://auth.example.com/token", jwks_uri="https://auth.example.com/jwks.json", scope="dom.read dom.click form.submit"
Agents fetch the token, then present:
Agent-Token: DPoP eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCJ9... (JWS)
DPoP: eyJ0eXAiOiJKV1QiLCJhbGciOiJFUzI1NiIsImtyp... (proof)
Use DPoP for sender-constrained tokens to limit replay and token theft risk. Where privacy matters, pair with minimal CH exposure.
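On the client side, the Agent challenge above can be picked apart with a small parser before requesting a token. parse_agent_challenge is a hypothetical helper; production code should use a proper auth-param parser:

```python
import re

def parse_agent_challenge(header):
    """Extract key="value" auth params from the proposed `Agent`
    scheme in WWW-Authenticate. Sketch only: ignores escaping and
    multiple challenges in one header."""
    scheme, _, params = header.partition(' ')
    if scheme != 'Agent':
        return None
    return dict(re.findall(r'(\w+)="([^"]*)"', params))
```

The agent would then POST to token_endpoint, verify the issuer's keys against jwks_uri, and request only the scopes it actually needs.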
Reporting and audit
To aid debugging and enforcement:
- report.to: endpoint for violation reports (e.g., via Reporting API)
- policy.id: a hash of the policy to include in agent requests (Agent-Compliance header)
Agents include an ack header to attest they read and will comply:
Agent-Compliance: version=1; policy-hash="sha256:..."; as="openai/browser/1"
Servers can sample and verify behavior; misbehavior can be blocked with 403 and added to enforcement rules.
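Producing the acknowledgment is mechanical: hash the exact policy bytes and embed the digest. compliance_header is an illustrative helper:

```python
import hashlib

def compliance_header(policy_text, agent_tag):
    """Build an Agent-Compliance header value from the fetched policy
    body; the server can recompute the hash to detect stale policies."""
    digest = 'sha256:' + hashlib.sha256(policy_text.encode()).hexdigest()
    return f'version=1; policy-hash="{digest}"; as="{agent_tag}"'
```

Hashing the raw bytes (not a parsed form) matters: any normalization on the client would break the server's comparison.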
The agents.txt line format (text/agents-policy)
A simple, robots-like, line-based format. Lines are UTF-8, comments begin with #, directives are Key: value.
Core directives:
- Version: integer policy version
- Contact: admin email or URL
- Policy-ID: opaque server-generated string or hash
- Agent: selector (wildcards allowed)
- Allow, Disallow: path patterns
- Capabilities-Allow, Capabilities-Deny: comma-separated capabilities
- Rate: rps=, burst=, concurrency=, daily=, crawl-delay=
- Backoff: base=, max=, jitter=
- PII: collect=, detect=, redact=, store=
- Consent-UI: actions
- Auth: type=, issuer=, token_endpoint=, jwks_uri=, scope=
- Report-To: URL
- Expires: duration or RFC 7231 date
Blocks begin with Agent: and apply until the next Agent: or EOF. Multiple Agent lines can be grouped.
Example:
# /.well-known/agents.txt
Version: 1
Contact: security@example.com
Policy-ID: 2b4e2f80
# Default policy for all agentic browsers
Agent: agentic/*
Allow: /
Disallow: /admin
Disallow: /private
Capabilities-Allow: nav.read, dom.read, dom.screenshot
Capabilities-Deny: script.inject
Rate: rps=1; burst=5; concurrency=2; daily=5000; crawl-delay=1
Backoff: base=2s; max=60s; jitter=full
PII: collect=minimal; detect=email,phone,credit_card; redact=credit_card,ssn; store=7d
Consent-UI: purchase, account.modify
Report-To: https://example.com/.well-known/agent-report
Expires: 2026-01-01T00:00:00Z
# Elevated capabilities for a named agent with tokens
Agent: openai/browser/1
Allow: /
Disallow: /private
Capabilities-Allow: nav.read, dom.read, dom.screenshot, dom.click, dom.type, form.submit, api.call
Rate: rps=2; burst=10; concurrency=4; daily=20000
Auth: type=dpop; issuer=https://auth.example.com; token_endpoint=https://auth.example.com/token; jwks_uri=https://auth.example.com/jwks.json; scope=dom.read dom.click form.submit
PII: collect=consented; detect=email,phone,credit_card; redact=credit_card,ssn; store=0d
Consent-UI: purchase
# Strict block for unknown high-risk agents
Agent: */*
Disallow: /checkout
Capabilities-Deny: purchase, account.modify, upload
JSON variant (application/agents-policy+json)
For richer tooling, a JSON variant mirrors the fields and adds typing.
{
"version": 1,
"contact": "security@example.com",
"policy_id": "2b4e2f80",
"policies": [
{
"agent": "agentic/*",
"paths": { "allow": ["/"], "disallow": ["/admin", "/private"] },
"capabilities": { "allow": ["nav.read", "dom.read", "dom.screenshot"], "deny": ["script.inject"] },
"rate": { "rps": 1, "burst": 5, "concurrency": 2, "daily": 5000, "crawl_delay": 1 },
"backoff": { "base": "2s", "max": "60s", "jitter": "full" },
"pii": { "collect": "minimal", "detect": ["email", "phone", "credit_card"], "redact": ["credit_card", "ssn"], "store": "7d" },
"consent_ui": ["purchase", "account.modify"],
"report_to": "https://example.com/.well-known/agent-report",
"expires": "2026-01-01T00:00:00Z"
},
{
"agent": "openai/browser/1",
"paths": { "allow": ["/"], "disallow": ["/private"] },
"capabilities": { "allow": ["nav.read", "dom.read", "dom.screenshot", "dom.click", "dom.type", "form.submit", "api.call"], "deny": [] },
"rate": { "rps": 2, "burst": 10, "concurrency": 4, "daily": 20000 },
"auth": { "type": "dpop", "issuer": "https://auth.example.com", "token_endpoint": "https://auth.example.com/token", "jwks_uri": "https://auth.example.com/jwks.json", "scope": ["dom.read", "dom.click", "form.submit"] },
"pii": { "collect": "consented", "detect": ["email", "phone", "credit_card"], "redact": ["credit_card", "ssn"], "store": "0d" },
"consent_ui": ["purchase"]
},
{
"agent": "*/*",
"paths": { "disallow": ["/checkout"] },
"capabilities": { "deny": ["purchase", "account.modify", "upload"] }
}
]
}
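Resolving the JSON variant for a given agent tag can be as simple as filtering and ordering the policies array. resolve_json_policy is an illustrative helper; note that fnmatch wildcards match across "/" separators, which happens to suit selectors like */*:

```python
import fnmatch

def resolve_json_policy(doc, agent_tag):
    """Return all policy entries whose `agent` selector matches the tag,
    most general first, so later (more specific) entries override when
    the caller merges them in order."""
    matches = [p for p in doc['policies']
               if fnmatch.fnmatchcase(agent_tag, p['agent'])]
    matches.sort(key=lambda p: p['agent'].count('*'), reverse=True)
    return matches
```

The caller folds the returned entries into one effective policy, exactly as with the text format's block-merging rule.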
Discovery and handshake flow
End-to-end handshake for an agentic browser visiting https://example.com/:
- Discovery:
- Agent makes initial request with minimal CH. Server advertises policy.
- Response contains Agent-Policy header and Accept-CH for Sec-CH-Agent-AI.
HTTP/1.1 200 OK
Agent-Policy: url="/.well-known/agents.txt"; version=1; ch="Sec-CH-Agent-AI"; token="dpop"; required=?1
Accept-CH: Sec-CH-Agent-AI
Permissions-Policy: ch-agent-ai=(self)
Vary: Sec-CH-Agent-AI
- Fetch policy:
- Agent GET /.well-known/agents.txt, including Sec-CH-Agent-AI if available.
GET /.well-known/agents.txt HTTP/1.1
Sec-CH-Agent-AI: "openai/browser/1"
- Evaluate rules:
- Agent resolves the most specific matching Agent block(s) and merges directives. Later blocks for the same agent override earlier ones.
- Token challenge (if needed):
- If the agent intends to use elevated capabilities requiring tokens, it obtains a DPoP token from token_endpoint and caches it per-origin.
- Operate with compliance:
- Agent includes Agent-Compliance with policy hash and Agent-Token for protected actions. It enforces rate limits locally and adapts to 429/401/403 responses.
- Reporting and backoff:
- On 429, the agent applies exponential backoff with jitter; on violations, it may send reports to report.to.
Interaction with robots.txt and meta tags
- robots.txt remains authoritative for indexing; agents.txt is orthogonal for interactive agents.
- On-page overrides: page-level control via meta tag and header.
HTML meta example:
<meta name="agents" content="capabilities-deny=dom.screenshot,upload; pii.collect=none" />
HTTP header override:
Agent-Page-Policy: capabilities-deny="dom.screenshot upload"; pii.collect="none"; rate.rps=0.5
Page-level rules are more specific than site-wide agents.txt for that resource.
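Applying such an override on the client could look like the sketch below, which handles only capabilities-deny. apply_page_overrides is a hypothetical helper; the other fields (pii.collect, rate.rps) would merge analogously:

```python
def apply_page_overrides(caps, header_value):
    """Subtract page-level capabilities-deny entries from an
    already-resolved site-wide capability set."""
    for part in header_value.split(';'):
        key, _, raw = part.strip().partition('=')
        if key == 'capabilities-deny':
            caps = caps - set(raw.strip('"').split())
    return caps
```

Because page rules are more specific, the agent applies them after resolving the site-wide agents.txt policy for that resource.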
Security and privacy considerations
Threats considered:
- Unauthorized automation: bots performing sensitive actions without consent.
- Token theft/replay: intercepted bearer tokens used across networks.
- PII leakage: agents logging or exfiltrating private data.
- Fingerprinting and tracking: passive identification of users/agents without consent.
- Cache poisoning and policy confusion: intermediaries serving stale/mismatched policies.
Mitigations:
- Opt-in Client Hints: Only send Sec-CH-Agent-AI when requested by the origin; apply Vary to ensure caches separate responses.
- Sender-constrained tokens: Prefer DPoP or mTLS for elevated capabilities; rotate keys; minimize token scope and lifetime.
- Strict TLS: Require HTTPS for policy discovery and token endpoints.
- Least privilege: Capabilities default-deny; only allow what’s necessary.
- Local enforcement: Agents must implement rate limiting and PII filters client-side to avoid burdening servers.
- PII handling: Respect pii.redact and pii.store; don’t persist disallowed data. Align with GDPR/CCPA principles (data minimization, purpose limitation).
- Cache control: Serve /.well-known/agents.txt with sensible Cache-Control; include ETag for change detection.
Privacy note: Avoid sending granular model/version unless required; use coarse agent tags or token-based attestation. Servers should not request broader CH than necessary.
Implementation examples
NGINX: serve agents.txt and advertise policy
server {
listen 443 ssl;
server_name example.com;
location = /.well-known/agents.txt {
types { }
default_type text/agents-policy;
add_header Cache-Control "max-age=3600, must-revalidate";
try_files $uri =404;
}
location / {
add_header Agent-Policy 'url="/.well-known/agents.txt"; version=1; ch="Sec-CH-Agent-AI"; token="dpop"; required=?1' always;
add_header Accept-CH "Sec-CH-Agent-AI" always;
add_header Permissions-Policy "ch-agent-ai=(self)" always;
add_header Vary "Sec-CH-Agent-AI" always;
try_files $uri /index.html;
}
}
Express.js: read agents.txt and enforce page-level overrides
import express from 'express';
import crypto from 'crypto';
const app = express();
const policyText = `Version: 1\nPolicy-ID: 2b4e2f80\nAgent: agentic/*\nAllow: /\nDisallow: /admin\nCapabilities-Allow: nav.read, dom.read\nRate: rps=1; burst=5; concurrency=2; daily=5000\n`;
const policyHash = 'sha256:' + crypto.createHash('sha256').update(policyText).digest('hex');
app.get('/.well-known/agents.txt', (req, res) => {
res.set('Content-Type', 'text/agents-policy');
res.set('Cache-Control', 'max-age=3600, must-revalidate');
res.send(policyText);
});
app.use((req, res, next) => {
res.set('Agent-Policy', 'url="/.well-known/agents.txt"; version=1; ch="Sec-CH-Agent-AI"; token="dpop"; required=?1');
res.set('Accept-CH', 'Sec-CH-Agent-AI');
res.set('Permissions-Policy', 'ch-agent-ai=(self)');
res.set('Vary', 'Sec-CH-Agent-AI');
next();
});
// Example protected route
app.post('/purchase', (req, res) => {
const token = req.get('Agent-Token') || '';
const compliance = req.get('Agent-Compliance') || '';
if (!token || !compliance.includes(policyHash)) {
res.status(401).set('WWW-Authenticate', 'Agent realm="example", as="dpop", token_endpoint="https://auth.example.com/token", scope="purchase"').send('Token required');
return;
}
// TODO: verify DPoP proof and token claims
res.send('ok');
});
app.listen(3000);
Python: a minimal agent honoring agents.txt
import re
import time
import hashlib
import requests
AGENT_TAG = 'openai/browser/1'
class AgentsPolicy:
    def __init__(self, text):
        self.text = text
        self.policy_id = None
        self.blocks = []
        self._parse(text)

    def _parse(self, text):
        current = None
        for line in text.splitlines():
            line = line.strip()
            if not line or line.startswith('#'):
                continue
            if ':' not in line:
                continue
            k, v = [s.strip() for s in line.split(':', 1)]
            key = k.lower()
            if key == 'policy-id':
                self.policy_id = v
            elif key == 'agent':
                if current:
                    self.blocks.append(current)
                current = {'agent': v, 'allow': [], 'disallow': [],
                           'caps_allow': set(), 'caps_deny': set(), 'rate': {}}
            elif current is None:
                continue  # ignore per-agent directives outside an Agent block
            elif key == 'allow':
                current['allow'].append(v)
            elif key == 'disallow':
                current['disallow'].append(v)
            elif key == 'capabilities-allow':
                current['caps_allow'].update(x.strip() for x in v.split(','))
            elif key == 'capabilities-deny':
                current['caps_deny'].update(x.strip() for x in v.split(','))
            elif key == 'rate':
                # naive parser: rps=1; burst=5; concurrency=2
                for p in (p.strip() for p in v.split(';')):
                    if '=' in p:
                        rk, rv = p.split('=', 1)
                        try:
                            current['rate'][rk.strip()] = float(rv)
                        except ValueError:
                            current['rate'][rk.strip()] = rv
        if current:
            self.blocks.append(current)

    def _match(self, selector, tag):
        # wildcard match on vendor/product[/ver]; escape regex metacharacters
        sel = re.escape(selector).replace(r'\*', '.*')
        return re.fullmatch(sel, tag) is not None

    def resolve(self, tag):
        matches = [b for b in self.blocks if self._match(b['agent'], tag)]
        # merge general blocks first so more specific ones override
        matches.sort(key=lambda b: b['agent'].count('*'), reverse=True)
        merged = {'allow': [], 'disallow': [], 'caps': set(),
                  'rate': {'rps': 1, 'burst': 5, 'concurrency': 2}}
        for b in matches:
            merged['allow'] += b['allow']
            merged['disallow'] += b['disallow']
            merged['caps'] |= b['caps_allow']
            merged['caps'] -= b['caps_deny']
            merged['rate'].update(b['rate'])
        return merged
session = requests.Session()
session.headers['Sec-CH-Agent-AI'] = f'"{AGENT_TAG}"'
origin = 'https://example.com'
pol = session.get(origin + '/.well-known/agents.txt')
pol.raise_for_status()
policy = AgentsPolicy(pol.text)
conf = policy.resolve(AGENT_TAG)
policy_hash = 'sha256:' + hashlib.sha256(pol.text.encode()).hexdigest()
session.headers['Agent-Compliance'] = f'version=1; policy-hash="{policy_hash}"; as="{AGENT_TAG}"'
# obey rate
min_interval = 1.0 / float(conf['rate'].get('rps', 1) or 1)
last = 0.0
def fetch(path):
    global last
    delta = time.time() - last
    if delta < min_interval:
        time.sleep(min_interval - delta)
    r = session.get(origin + path)
    last = time.time()
    if r.status_code == 429:
        time.sleep(2.0)  # a real agent should use exponential backoff with jitter
    return r
print(fetch('/').status_code)
Cloudflare Workers: handle CH and cache safely
export default {
async fetch(request, env, ctx) {
const url = new URL(request.url);
const headers = new Headers({
'Agent-Policy': 'url="/.well-known/agents.txt"; version=1; ch="Sec-CH-Agent-AI"; token="dpop"; required=?1',
'Accept-CH': 'Sec-CH-Agent-AI',
'Permissions-Policy': 'ch-agent-ai=(self)',
'Vary': 'Sec-CH-Agent-AI'
});
if (url.pathname === '/.well-known/agents.txt') {
const body = `Version: 1\nContact: security@example.com\nAgent: agentic/*\nAllow: /\n`;
headers.set('Content-Type', 'text/agents-policy');
headers.set('Cache-Control', 'max-age=3600');
return new Response(body, { headers });
}
// proxy to origin or serve content
const resp = await fetch(request);
const out = new Response(resp.body, resp);
headers.forEach((v, k) => out.headers.set(k, v));
return out;
}
};
Behavior on errors and edge cases
- Missing policy: If /.well-known/agents.txt is 404 and no Agent-Policy header is present, agents fall back to conservative defaults: read-only, 0.5 rps, no form.submit, obey robots.txt Disallow for safety.
- Conflicting blocks: Most-specific selector wins; later blocks for a selector override earlier ones.
- 401 without WWW-Authenticate: Treat as non-agent-aware; do not retry with tokens unless documented.
- 403: Agent should reduce capabilities; if repeated, cease elevated actions.
- 429: Honor backoff; log; optionally send a report to report.to.
- Multi-tenant/CDN: Use Vary on Sec-CH-Agent-AI and Cache-Key customization to avoid policy mixing.
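The backoff behavior referenced here is the usual full-jitter exponential strategy, sketched below (base and cap mirror the Backoff: base=2s; max=60s example; backoff_delay is an illustrative helper):

```python
import random

def backoff_delay(attempt, base=2.0, cap=60.0):
    """Full-jitter exponential backoff for 429/503 responses: sleep a
    uniform random duration up to min(cap, base * 2**attempt)."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))
```

Full jitter avoids thundering-herd retries when many agents back off from the same origin at once.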
Migration and compatibility
- Non-agent browsers: No change; agents.txt is ignored by normal browsers.
- Headless automation: Tools can adopt the handshake without breaking sites.
- robots.txt: If both are present, agents.txt governs interactive capability; robots.txt continues to govern indexing. Agents SHOULD respect Disallow in robots.txt as an additional safety layer for read-only fetches.
- Sitemaps: Optional link in agents.txt for agents that need discovery: Sitemap: https://example.com/sitemap.xml
Standardization path
- Register /.well-known/agents.txt under RFC 8615.
- Define text/agents-policy and application/agents-policy+json media types with IANA.
- Specify Agent-Policy and Agent-Page-Policy as Structured Headers (RFC 8941), with clear ABNF and error handling.
- Propose Sec-CH-Agent-AI as a UA Client Hint through WICG/WHATWG, building on RFC 8942. Gate with Permissions-Policy to protect privacy.
- Encourage DPoP (RFC 9449) support in agent SDKs; reference OAuth 2.0 best practices for short-lived tokens and scope minimization.
This is deliberately incremental: everything works today with non-standard headers, and improves as pieces are standardized.
Why this beats UA switchers and ad-hoc bot walls
- Explicit capabilities reduce guesswork. Sites can tailor rules to an agent’s needs and risk profile.
- Frictionless for good actors. Agents can present tokens and adapt tools instead of brute-forcing with random delays and UA strings.
- Lower operational risk. Rate limits and backoff conventions reduce 500s and keep infrastructure stable.
- Better privacy posture. PII rules and provenance headers help organizations meet regulatory expectations.
- Upgrade path. As CH and token mechanisms mature, capabilities and attestations can be formalized without breaking early adopters.
Open questions and future work
- Attestation: Standard ways to attest agent runtime provenance (e.g., TEE attestation, code signing) without violating user privacy.
- Provenance headers: Align with emerging web provenance specs to label agent-initiated actions.
- Legal semantics: Clarify where agents.txt sits relative to terms-of-service and robots.txt—avoid creating false expectations.
- Tool taxonomy: Find the right granularity; too coarse is meaningless, too fine is brittle.
- Fine-grained per-path capability rules: Do we need directory-scoped capability lists beyond Allow/Disallow paths?
- Human-in-the-loop: Standardize how agents signal consent UI flows and capture user approval on sensitive actions.
Conclusion
Agentic browsers are inevitable. Pretending they’re just human browsers with different UA strings leads to brittle defenses, indiscriminate blocks, and poor outcomes for both websites and legitimate automation. A simple, well-known agents.txt policy—paired with opt-in Client Hints and sender-constrained tokens—lets sites publish what they allow, at what rates, under which privacy rules. Agents can then adapt tools and behavior rather than guessing.
The proposal here is intentionally pragmatic: it uses well-known URLs, a robots-like text format, optional JSON, structured headers, and existing token mechanisms. It’s deployable on today’s stacks, incrementally adoptable, and respectful of privacy. Standardization across IETF and WICG would refine the details, but implementers don’t need to wait to realize the core benefits.
If you run a site: publish an agents.txt with conservative defaults and a contact address. If you build agentic browsers: read the file, honor rate limits and PII rules, and present tokens for elevated capabilities. With that handshake, we can replace guesswork UA switching with a safer, more transparent agent ecosystem.