SafeDOM.ai – Privacy-first DOM-to-AI context builder with automatic PII redaction

Build prompt context from real UI state while minimizing the risk of leaking sensitive data. Annotate the DOM with data-ai attributes, automatically redact common PII, and reinject values safely on the backend when needed.

Demo

Live (GitHub Pages): https://jennifer-ha.github.io/SafeDOM.ai/
Local: cd examples/demo-site && npm install && npm run dev

What it does

data-ai annotations decide what to include, exclude, or redact.
Produces structured fields, combined rawText, and a detailed redactions list.
Redacts common PII (email, phone, IBAN, credit card, SSN) before sending data to AI providers.
Backend helper reinjects placeholders after the model responds, keeping sensitive data away from the model.

Why it matters

Users routinely paste emails, tickets, and documents into support tools.
AI prompts often collect PII by accident; this helps enforce privacy-by-design defaults.
Heuristic redaction provides a lightweight data minimisation layer for EU/US style privacy expectations.

Quickstart

Install (workspace root):

npm install

Annotate the DOM:

<div id="ticket-root">
  <h2 data-ai="include" data-ai-label="subject">Login blocked</h2>
  <p data-ai="redact:email phone" data-ai-label="customer">
    Contact: alice@example.com, phone +1 212-555-7890
  </p>
  <p data-ai="exclude">Internal notes never leave the browser.</p>
</div>

Build context in the browser:

import { buildAiContext } from "safedom-ai";

const ctx = buildAiContext("#ticket-root", { labeledOnly: true, region: "eu" });

// ctx.fields.subject -> "Login blocked"
// ctx.rawText -> includes placeholders like "__EMAIL_1__"
// ctx.redactions -> [{ placeholder: "__EMAIL_1__", original: "alice@example.com", type: "email" }, ...]

Send to a provider (example with OpenAI; ensure you comply with their data handling terms):

const response = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: ctx.rawText }]
});

Reinject placeholders on the backend (Node):

import { reinjectPlaceholders } from "@safedom/ai-node";
import type { Redaction } from "safedom-ai";

function finalizeAnswer(modelText: string, redactions: Redaction[]) {
  // Deterministic, regex-free replacement to avoid accidental pattern injection.
  return reinjectPlaceholders(modelText, redactions);
}

API Reference

`buildAiContext(root, options?): AiContext`

Builds a privacy-minded context from a DOM subtree.

root: Element or CSS selector. Throws if not found.
options.labeledOnly (default true): When true, only data-ai annotated elements are processed. When false, unlabeled text within the root is collected as a fallback with redaction applied (excluding subtrees already annotated or excluded).
options.redactionRules: Override default redaction rules. Provide an array of { type, pattern, placeholderPrefix }. Patterns must include the global /g flag.
options.region: "eu" | "us" | "global" hint for host apps to select policy; not used for any network or geolocation.

Returns:

fields: Key-value map keyed by data-ai-label; multiple occurrences concatenate with newlines.
rawText: Combined string of included/redacted text separated by double newlines for easy prompt construction.
redactions: Array of { placeholder, original, type }.

`data-ai` directives

data-ai="include": Include text content as-is.
data-ai="exclude": Exclude element and children.
data-ai="redact:email phone iban creditcard ssn": Apply selected redaction rules. Unknown types fall back to the available ruleset.
data-ai-label="subject": Optional logical label; concatenates across multiple elements.
data-ai-sensitivity (optional/future): high|medium|low hint for downstream tooling; not used by core logic yet.

`defaultRedactionRules`

Heuristic regexes for:

Email: __EMAIL_n__
Phone (generic E.164-ish): __PHONE_n__
IBAN (generic ISO 13616 shape): __IBAN_n__
Credit card (13-19 digits, Luhn-validated): __CARD_n__
US SSN: __SSN_n__

Rules run in deterministic order. Placeholder numbering increments across all rules within a call.

`createRedactionRules(options)`

Build a ruleset with optional country-specific patterns. Country-specific rules come first, then core rules:

import { createRedactionRules, buildAiContext } from "safedom-ai";

const rules = createRedactionRules({
  countries: ["nl", "de", "gb", "fr", "us"], // ISO-like country codes
  includeGenericPhone: true, // keep fallback E.164-ish phone matching
  includeGenericIban: true, // keep generic IBAN shape matcher
  extraRules: [] // add your own RedactionRule objects
});

const ctx = buildAiContext("#ticket-root", { redactionRules: rules });

Included country profiles (IBAN + phone where relevant): nl, de, gb, fr, us. Extend by adding entries to countryRedactionRules or by passing extraRules.

`applyRedactions(input, rules, startIndex?)`

Utility that replaces matches with placeholders and returns { text, redactions }. Exported for advanced use-cases.

`validateRedactionRules(rules)`

Ensures rules include the global /g flag. For untrusted patterns, prefer an allowlist of vetted regexes to avoid ReDoS risk.

`findUnknownPlaceholders(text, knownPlaceholders)`

Helper to detect placeholder-shaped tokens in text that do not appear in the current redactions list. Useful for UI warnings when users type placeholder-like strings manually.

`@safedom/ai-node – reinjectPlaceholders(text, redactions)`

Backend-only helper to replace placeholders with originals. Uses split/join to avoid regex injection and MUST NOT be used in the browser with real PII.

Privacy & security notes

No network calls or telemetry; the library only reads the DOM.
Uses textContent (not innerHTML) to avoid HTML injection paths.
Redaction is heuristic pseudonymisation. It does not guarantee full anonymisation or legal compliance.
Defaults follow privacy-by-default: labeledOnly is true, and common PII redaction rules are enabled.
EU/US friendly: encourages data minimisation and keeping sensitive data off third-party AI providers. You must still assess lawfulness, consent, and processor agreements.
Avoid logging redactions; they contain sensitive originals.

Limitations

Regex-based detection can miss edge cases or produce false positives.
Does not manage consent, audit logs, DSAR/subject-access, or data retention policies.
Not a substitute for a full privacy/compliance program. Consult legal/privacy experts.

Example scenario: support dashboard

<section id="ticket-root">
  <h3 data-ai="include" data-ai-label="subject">Cannot reset password</h3>
  <p data-ai="redact:email phone" data-ai-label="customer">
    Customer: j.doe@example.com, phone +31 6 1234 5678
  </p>
  <p data-ai="include" data-ai-label="summary">
    User reports password reset emails are not arriving.
  </p>
  <div data-ai="exclude">Admin-only debug info</div>
</section>

import { buildAiContext } from "safedom-ai";

const ctx = buildAiContext("#ticket-root", { labeledOnly: true, region: "eu" });

const prompt = `You are a support assistant.\n\nTicket:\n${ctx.rawText}`;
// Send prompt to your chosen model...

On the backend:

import { reinjectPlaceholders } from "@safedom/ai-node";

const finalAnswer = reinjectPlaceholders(modelResponse, ctx.redactions);

Development

Prerequisite: Node.js 18+.
Install dependencies: npm install
Lint: npm run lint
Test: npm test
Build: npm run build

Contributing

Contributions welcome! See CONTRIBUTING.md for guidelines. When proposing new redaction rules, include:

Rationale and threat/privacy considerations.
Tests demonstrating realistic matches and avoiding obvious false positives.
Documentation updates if the public API or defaults change.

License

MIT. See LICENSE.

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
.github/workflows		.github/workflows
examples		examples
node		node
src		src
test		test
.editorconfig		.editorconfig
.eslintrc.cjs		.eslintrc.cjs
.gitignore		.gitignore
.npmignore		.npmignore
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
SECURITY.md		SECURITY.md
package-lock.json		package-lock.json
package.json		package.json
tsconfig.json		tsconfig.json
tsconfig.tsbuildinfo		tsconfig.tsbuildinfo
vitest.config.ts		vitest.config.ts

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

SafeDOM.ai – Privacy-first DOM-to-AI context builder with automatic PII redaction

Demo

What it does

Why it matters

Quickstart

API Reference

`buildAiContext(root, options?): AiContext`

`data-ai` directives

`defaultRedactionRules`

`createRedactionRules(options)`

`applyRedactions(input, rules, startIndex?)`

`validateRedactionRules(rules)`

`findUnknownPlaceholders(text, knownPlaceholders)`

`@safedom/ai-node – reinjectPlaceholders(text, redactions)`

Privacy & security notes

Limitations

Example scenario: support dashboard

Development

Contributing

License

About

Uh oh!

Releases

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

SafeDOM.ai – Privacy-first DOM-to-AI context builder with automatic PII redaction

Demo

What it does

Why it matters

Quickstart

API Reference

buildAiContext(root, options?): AiContext

data-ai directives

defaultRedactionRules

createRedactionRules(options)

applyRedactions(input, rules, startIndex?)

validateRedactionRules(rules)

findUnknownPlaceholders(text, knownPlaceholders)

@safedom/ai-node – reinjectPlaceholders(text, redactions)

Privacy & security notes

Limitations

Example scenario: support dashboard

Development

Contributing

License

About

Resources

License

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

`buildAiContext(root, options?): AiContext`

`data-ai` directives

`defaultRedactionRules`

`createRedactionRules(options)`

`applyRedactions(input, rules, startIndex?)`

`validateRedactionRules(rules)`

`findUnknownPlaceholders(text, knownPlaceholders)`

`@safedom/ai-node – reinjectPlaceholders(text, redactions)`

Packages