PII Anonymization

Strip personally identifiable information from page content before it reaches the LLM. Detection runs entirely on your machine, using regex patterns and an optional local ONNX NER model, so your LLM provider never sees real PII.

Enable anonymization

Pass anonymize=true when opening a session. All get_content and evaluate results then have PII stripped before they reach the agent.

open with anonymization
open_session(profile="personal", anonymize=true)

Note: Screenshots are blocked in anonymization mode to prevent PII leakage through images.

What gets detected

Default (regex-based, always available)

  • EMAIL addresses
  • PHONE numbers
  • CREDIT_CARD numbers
  • IBAN numbers
  • SSN (Social Security Numbers)
  • IP addresses
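
To make the regex-based tier concrete, here is a minimal Python sketch of pattern-based detection for three of the entity types above. The patterns and the detect helper are illustrative assumptions, not Pagerunner's actual detectors, which are likely stricter (for example, checksum validation for card numbers and IBANs).

```python
import re

# Illustrative patterns only; Pagerunner's real detectors are not shown in this document.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "IP": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}

def detect(text):
    """Return (entity_type, matched_text) pairs ordered by position in text."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), label, match.group()))
    return [(label, value) for _, label, value in sorted(hits)]
```

Calling detect on a page's text yields the spans that a tokenize or redact step would then replace.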

With NER model (optional, ~65MB download)

  • PERSON names
  • ORG (organization) names

The NER model runs locally via ONNX Runtime — no network calls, no cloud processing.

Anonymization modes

Tokenize (default)

PII is replaced with tokens like [EMAIL:a3f9b2]. The token maps back to the real value in a local vault. When you pass a token to fill or type_text, pagerunner de-tokenizes it before writing to the DOM.

tokenize mode
// Agent sees:
"Contact [EMAIL:a3f9b2] for details"
// Agent can use the token in fill/type_text:
fill(session_id, target_id, "#email", "[EMAIL:a3f9b2]")
// Pagerunner writes the real email to the DOM
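
The round trip can be sketched in Python as follows. This is a simplified model under stated assumptions: the Vault class, the in-memory dictionary, and the hash-derived token id are all hypothetical, and the document does not describe how Pagerunner generates token ids or stores its vault.

```python
import hashlib
import re

EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
TOKEN = re.compile(r"\[EMAIL:[0-9a-f]{6}\]")

class Vault:
    """In-memory token vault; a hypothetical stand-in for the local vault."""

    def __init__(self):
        self._values = {}

    def tokenize(self, text):
        """Replace each email with a token and remember the real value."""
        def swap(match):
            value = match.group()
            # Assumption: the id is a hash prefix, so equal values share a token.
            digest = hashlib.sha256(value.encode()).hexdigest()[:6]
            token = "[EMAIL:%s]" % digest
            self._values[token] = value
            return token
        return EMAIL.sub(swap, text)

    def detokenize(self, text):
        """Swap known tokens back to real values; unknown tokens stay as-is."""
        return TOKEN.sub(lambda m: self._values.get(m.group(), m.group()), text)
```

In this model the agent only ever handles the output of tokenize, and detokenize runs locally just before a value is written to the page.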

Redact

One-way replacement with [EMAIL], [PHONE], etc. No vault, no de-tokenization. Use this when you want permanent removal.

redact mode
open_session(
  profile="personal",
  anonymize=true,
  anonymization_mode="redact"
)

Limit to specific entity types

Only anonymize certain types of PII:

selective anonymization
open_session(
  profile="personal",
  anonymize=true,
  anonymization_entities=["EMAIL", "PHONE"],
  anonymization_mode="redact"
)
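
Conceptually, redact mode combined with an entity filter behaves like the Python sketch below. The redact function and its patterns are assumptions made for illustration; only the [EMAIL]-style placeholders and the idea of an entity list come from this page.

```python
import re

# Illustrative patterns only; not Pagerunner's actual detectors.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text, entities=None):
    """Replace matches with [LABEL]. entities=None means every known type."""
    if entities is None:
        active = PATTERNS
    else:
        active = {name: PATTERNS[name] for name in entities if name in PATTERNS}
    for label, pattern in active.items():
        # One-way replacement: nothing is stored, so the value cannot be restored.
        text = pattern.sub("[%s]" % label, text)
    return text
```

With entities=["EMAIL"], the SSN in the input passes through unchanged, which mirrors limiting anonymization to the listed types.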

NER model setup

For PERSON and ORG name detection, you need the NER build and the model file:

terminal
# Build with NER support
cargo build --release --features ner
# Download the model (~65MB, one-time)
pagerunner download-model

Homebrew builds include NER support by default. To disable NER globally:

~/.pagerunner/config.toml
[ner]
enabled = false

Custom patterns

Define anonymization profiles in config.toml with custom regex patterns, then reference them when opening a session:

custom profile
open_session(
  profile="personal",
  anonymization_profile="jira-work"
)
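
The config.toml schema for profiles is not shown on this page, so the following Python sketch only models the idea: a named profile maps custom labels to regex patterns. The contents of the "jira-work" profile, the JIRA_KEY label, and the apply_profile helper are all hypothetical.

```python
import re

# Hypothetical profile contents; the real config.toml format may differ.
PROFILES = {
    "jira-work": {
        "JIRA_KEY": r"\b[A-Z][A-Z0-9]+-\d+\b",
    },
}

def apply_profile(text, profile_name):
    """Redact every match of the named profile's custom patterns."""
    for label, pattern in PROFILES[profile_name].items():
        text = re.sub(pattern, "[%s]" % label, text)
    return text
```

A profile like this would let a team hide internal identifiers, such as ticket keys, that the built-in detectors do not cover.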

Why this matters

When an AI agent reads your email, bank statements, or CRM, the raw text is sent to the LLM provider. Without anonymization, your personal data, customer data, and financial information flow to a third party.

Pagerunner's anonymization layer intercepts this data before it leaves your machine. The LLM sees tokens or redacted placeholders. The real data stays local. This is the only browser automation tool with built-in PII protection.
