Build prompt context from real UI state while minimizing the risk of leaking sensitive data. Annotate the DOM with data-ai attributes, automatically redact common PII, and reinject values safely on the backend when needed.
- Live (GitHub Pages): https://jennifer-ha.github.io/SafeDOM.ai/
- Local:
cd examples/demo-site && npm install && npm run dev
- data-ai annotations decide what to include, exclude, or redact.
- Produces structured fields, combined rawText, and a detailed redactions list.
- Redacts common PII (email, phone, IBAN, credit card, SSN) before sending data to AI providers.
- Backend helper reinjects placeholders after the model responds, keeping sensitive data away from the model.
- Users routinely paste emails, tickets, and documents into support tools.
- AI prompts often collect PII by accident; this helps enforce privacy-by-design defaults.
- Heuristic redaction provides a lightweight data minimisation layer for EU/US-style privacy expectations.
- Install (workspace root):
npm install
- Annotate the DOM:
<div id="ticket-root">
<h2 data-ai="include" data-ai-label="subject">Login blocked</h2>
<p data-ai="redact:email phone" data-ai-label="customer">
Contact: alice@example.com, phone +1 212-555-7890
</p>
<p data-ai="exclude">Internal notes never leave the browser.</p>
</div>
- Build context in the browser:
import { buildAiContext } from "safedom-ai";
const ctx = buildAiContext("#ticket-root", { labeledOnly: true, region: "eu" });
// ctx.fields.subject -> "Login blocked"
// ctx.rawText -> includes placeholders like "__EMAIL_1__"
// ctx.redactions -> [{ placeholder: "__EMAIL_1__", original: "alice@example.com", type: "email" }, ...]
- Send to a provider (example with OpenAI; ensure you comply with their data handling terms):
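import OpenAI from "openai";
const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment by default
const response = await openai.chat.completions.create({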
model: "gpt-4o",
messages: [{ role: "user", content: ctx.rawText }]
});
- Reinject placeholders on the backend (Node):
import { reinjectPlaceholders } from "@safedom/ai-node";
import type { Redaction } from "safedom-ai";
function finalizeAnswer(modelText: string, redactions: Redaction[]) {
// Deterministic, regex-free replacement to avoid accidental pattern injection.
return reinjectPlaceholders(modelText, redactions);
}
buildAiContext(root, options) builds a privacy-minded context from a DOM subtree.
- root: Element or CSS selector. Throws if not found.
- options.labeledOnly (default true): When true, only data-ai annotated elements are processed. When false, unlabeled text within the root is collected as a fallback with redaction applied (excluding subtrees already annotated or excluded).
- options.redactionRules: Override default redaction rules. Provide an array of { type, pattern, placeholderPrefix }. Patterns must include the global /g flag.
- options.region: "eu" | "us" | "global" hint for host apps to select policy; not used for any network calls or geolocation.
Returns:
- fields: Key-value map keyed by data-ai-label; multiple occurrences concatenate with newlines.
- rawText: Combined string of included/redacted text separated by double newlines for easy prompt construction.
- redactions: Array of { placeholder, original, type }.
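The joining rules described for fields and rawText can be illustrated with a small, DOM-free sketch. Everything here (Part, assembleSketch) is hypothetical and not part of the safedom-ai API; it also assumes that unlabeled text still contributes to rawText, which the description above does not state explicitly.

```typescript
// Hypothetical input: one entry per processed element, already redacted.
interface Part {
  label?: string; // value of data-ai-label, if present
  text: string;
}

interface AssembledSketch {
  fields: Record<string, string>;
  rawText: string;
}

// Sketch of the documented joining rules, not the library's implementation.
function assembleSketch(parts: Part[]): AssembledSketch {
  const fields: Record<string, string> = {};
  for (const part of parts) {
    if (part.label === undefined) continue;
    const seen = Object.prototype.hasOwnProperty.call(fields, part.label);
    // Repeated labels concatenate with a single newline.
    fields[part.label] = seen ? fields[part.label] + "\n" + part.text : part.text;
  }
  // All text parts are separated by a blank line (double newline).
  const rawText = parts.map((p) => p.text).join("\n\n");
  return { fields, rawText };
}
```

Joining with a blank line keeps each annotated element visually separate when the string is pasted into a prompt.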
- data-ai="include": Include text content as-is.
- data-ai="exclude": Exclude element and children.
- data-ai="redact:email phone iban creditcard ssn": Apply selected redaction rules. Unknown types fall back to the available ruleset.
- data-ai-label="subject": Optional logical label; concatenates across multiple elements.
- data-ai-sensitivity (optional/future): high|medium|low hint for downstream tooling; not used by core logic yet.
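To make the attribute grammar concrete, here is a minimal parsing sketch for the data-ai value. The names AiDirective and parseDataAiSketch are hypothetical, and the sketch deliberately does not model how unknown redaction types are resolved, since that behaviour is only summarised above.

```typescript
// Hypothetical result type for a parsed data-ai attribute value.
type AiDirective =
  | { mode: "include" }
  | { mode: "exclude" }
  | { mode: "redact"; types: string[] };

// Parses "include", "exclude", or "redact:<space-separated types>".
// Returns null for anything else; the real library may behave differently.
function parseDataAiSketch(value: string): AiDirective | null {
  const trimmed = value.trim();
  if (trimmed === "include") return { mode: "include" };
  if (trimmed === "exclude") return { mode: "exclude" };
  const prefix = "redact:";
  if (trimmed.slice(0, prefix.length) === prefix) {
    const types = trimmed
      .slice(prefix.length)
      .split(/\s+/)
      .filter((t) => t.length > 0);
    return { mode: "redact", types };
  }
  return null;
}
```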
Heuristic regexes for:
- Email: __EMAIL_n__
- Phone (generic E.164-ish): __PHONE_n__
- IBAN (generic ISO 13616 shape): __IBAN_n__
- Credit card (13-19 digits, Luhn-validated): __CARD_n__
- US SSN: __SSN_n__
Rules run in deterministic order. Placeholder numbering increments across all rules within a call.
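For readers unfamiliar with the Luhn validation mentioned for credit cards, the following is a minimal sketch of the standard checksum. The helper name passesLuhnSketch is hypothetical, and tolerating spaces and hyphens as separators is an assumption; the library's own check may differ.

```typescript
// Standard Luhn checksum over a 13-19 digit candidate.
// Spaces and hyphens are stripped first (an assumption for this sketch).
function passesLuhnSketch(candidate: string): boolean {
  const digits = candidate.replace(/[\s-]/g, "");
  if (!/^\d{13,19}$/.test(digits)) return false;

  let sum = 0;
  let doubleNext = false; // every second digit from the right is doubled
  for (let i = digits.length - 1; i >= 0; i--) {
    let d = digits.charCodeAt(i) - 48; // "0" is char code 48
    if (doubleNext) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
    doubleNext = !doubleNext;
  }
  return sum % 10 === 0;
}
```

Running the checksum after the regex match is what keeps arbitrary 13-19 digit strings, such as order numbers, from being redacted as cards in most cases.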
Build a ruleset with optional country-specific patterns. Country-specific rules come first, then core rules:
import { createRedactionRules, buildAiContext } from "safedom-ai";
const rules = createRedactionRules({
countries: ["nl", "de", "gb", "fr", "us"], // ISO-like country codes
includeGenericPhone: true, // keep fallback E.164-ish phone matching
includeGenericIban: true, // keep generic IBAN shape matcher
extraRules: [] // add your own RedactionRule objects
});
const ctx = buildAiContext("#ticket-root", { redactionRules: rules });
Included country profiles (IBAN + phone where relevant): nl, de, gb, fr, us. Extend by adding entries to countryRedactionRules or by passing extraRules.
Utility that replaces matches with placeholders and returns { text, redactions }. Exported for advanced use-cases.
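A minimal sketch of this kind of utility is shown below, under stated assumptions: the function name redactSketch and the two example patterns are hypothetical, placeholderPrefix is assumed to be a bare word such as EMAIL, and the counter is shared across rules as described for placeholder numbering.

```typescript
interface RedactionRule {
  type: string;
  pattern: RegExp; // must carry the global /g flag
  placeholderPrefix: string; // assumed to be a bare word such as "EMAIL"
}

interface Redaction {
  placeholder: string;
  original: string;
  type: string;
}

// Applies rules in array order; one counter is shared across all rules.
function redactSketch(
  input: string,
  rules: RedactionRule[]
): { text: string; redactions: Redaction[] } {
  const redactions: Redaction[] = [];
  let counter = 0;
  let text = input;
  for (const rule of rules) {
    text = text.replace(rule.pattern, (match) => {
      counter += 1;
      const placeholder = `__${rule.placeholderPrefix}_${counter}__`;
      redactions.push({ placeholder, original: match, type: rule.type });
      return placeholder;
    });
  }
  return { text, redactions };
}

// Illustrative patterns only; far simpler than production rules.
const exampleRules: RedactionRule[] = [
  { type: "email", pattern: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g, placeholderPrefix: "EMAIL" },
  { type: "phone", pattern: /\+\d[\d\s-]{7,}\d/g, placeholderPrefix: "PHONE" }
];

const exampleResult = redactSketch("Contact: alice@example.com, phone +1 212-555-7890", exampleRules);
// exampleResult.text -> "Contact: __EMAIL_1__, phone __PHONE_2__"
```

Because later rules run over text that already contains placeholders, rule patterns should be written so they cannot match placeholder tokens themselves.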
Ensures rules include the global /g flag. For untrusted patterns, prefer an allowlist of vetted regexes to avoid ReDoS risk.
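One way to guarantee the global flag is to rebuild the pattern when the flag is missing. This is a sketch with a hypothetical helper name, not the library's exported function.

```typescript
// Returns the pattern unchanged if it is already global,
// otherwise a copy with "g" appended to its existing flags.
function withGlobalFlagSketch(pattern: RegExp): RegExp {
  if (pattern.global) return pattern;
  return new RegExp(pattern.source, pattern.flags + "g");
}
```

Without /g, String.prototype.replace substitutes only the first match, so later occurrences of the same PII would pass through unredacted.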
Helper to detect placeholder-shaped tokens in text that do not appear in the current redactions list. Useful for UI warnings when users type placeholder-like strings manually.
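The detection idea can be sketched as a scan for placeholder-shaped tokens followed by a lookup against the known list. The function name and the token shape (two underscores, uppercase letters, an underscore, digits, two underscores) are assumptions inferred from the placeholders shown in this document.

```typescript
// Finds tokens shaped like __TYPE_n__ that are absent from the redactions list.
function findUnknownPlaceholdersSketch(
  text: string,
  redactions: Array<{ placeholder: string }>
): string[] {
  const known = redactions.map((r) => r.placeholder);
  const found = text.match(/__[A-Z]+_\d+__/g) || [];
  const unknown: string[] = [];
  for (const token of found) {
    // Report each unknown token once, in order of first appearance.
    if (known.indexOf(token) === -1 && unknown.indexOf(token) === -1) {
      unknown.push(token);
    }
  }
  return unknown;
}
```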
Backend-only helper to replace placeholders with originals. Uses split/join to avoid regex injection and MUST NOT be used in the browser with real PII.
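The split/join approach can be sketched as follows. The function name reinjectSketch is hypothetical; the point of the sketch is that neither the placeholder nor the original value is ever interpreted as a regex or as a replacement pattern.

```typescript
interface RedactionEntry {
  placeholder: string;
  original: string;
}

// Replaces every occurrence of each placeholder with its original value.
// split/join treats both strings literally, so "$&" or "." have no special meaning.
function reinjectSketch(modelText: string, redactions: RedactionEntry[]): string {
  let out = modelText;
  for (const { placeholder, original } of redactions) {
    if (placeholder.length === 0) continue; // splitting on "" would split every character
    out = out.split(placeholder).join(original);
  }
  return out;
}
```

By contrast, passing an original value as the string replacement argument of String.prototype.replace would expand sequences such as $& inside it, which is one of the injection paths this approach avoids.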
- No network calls or telemetry; the library only reads the DOM.
- Uses textContent (not innerHTML) to avoid HTML injection paths.
- Redaction is heuristic pseudonymisation. It does not guarantee full anonymisation or legal compliance.
- Defaults follow privacy-by-default: labeledOnly is true, and common PII redaction rules are enabled.
- EU/US friendly: encourages data minimisation and keeping sensitive data off third-party AI providers. You must still assess lawfulness, consent, and processor agreements.
- Avoid logging redactions; they contain sensitive originals.
- Regex-based detection can miss edge cases or produce false positives.
- Does not manage consent, audit logs, DSAR/subject-access, or data retention policies.
- Not a substitute for a full privacy/compliance program. Consult legal/privacy experts.
<section id="ticket-root">
<h3 data-ai="include" data-ai-label="subject">Cannot reset password</h3>
<p data-ai="redact:email phone" data-ai-label="customer">
Customer: j.doe@example.com, phone +31 6 1234 5678
</p>
<p data-ai="include" data-ai-label="summary">
User reports password reset emails are not arriving.
</p>
<div data-ai="exclude">Admin-only debug info</div>
</section>
import { buildAiContext } from "safedom-ai";
const ctx = buildAiContext("#ticket-root", { labeledOnly: true, region: "eu" });
const prompt = `You are a support assistant.\n\nTicket:\n${ctx.rawText}`;
// Send prompt to your chosen model...
On the backend:
import { reinjectPlaceholders } from "@safedom/ai-node";
const finalAnswer = reinjectPlaceholders(modelResponse, ctx.redactions);
- Prerequisite: Node.js 18+.
- Install dependencies: npm install
- Lint: npm run lint
- Test: npm test
- Build: npm run build
Contributions welcome! See CONTRIBUTING.md for guidelines. When proposing new redaction rules, include:
- Rationale and threat/privacy considerations.
- Tests demonstrating realistic matches and avoiding obvious false positives.
- Documentation updates if the public API or defaults change.
MIT. See LICENSE.