Skip to content

jschulte/rtf-toolkit

@jonahschulte/rtf-toolkit

Modern TypeScript RTF parser with track changes support

License: MIT TypeScript Tests

A production-grade RTF parsing library for JavaScript/TypeScript with comprehensive track changes support. Perfect for government contracts, legal documents, and any application requiring RTF document analysis.

✨ Features

  • Full RTF Parsing - Handles RTF 1.9.1 specification
  • Track Changes Support - Parse insertions, deletions, and author metadata
  • HTML Conversion - Clean, semantic HTML output
  • TypeScript First - Full type safety and IntelliSense
  • Zero Dependencies - Lightweight core library
  • 100% Tested - 94 comprehensive unit tests
  • Visual Track Changes - HTML rendering with color-coded changes

Installation

npm install @jonahschulte/rtf-toolkit

Quick Start

Basic RTF to HTML Conversion

import { parseRTF, toHTML } from '@jonahschulte/rtf-toolkit';

// Parse RTF
const rtf = '{\\rtf1\\b Bold\\b0 and \\i italic\\i0 text}';
const doc = parseRTF(rtf);

// Convert to HTML
const html = toHTML(doc);
console.log(html);
// Output: <div class="rtf-content">
//           <p><strong>Bold</strong> and <em>italic</em> text</p>
//         </div>

Track Changes (Revisions)

import { parseRTF, getTrackChanges, getTrackChangeMetadata } from '@jonahschulte/rtf-toolkit';

// Parse RTF with track changes
const rtf = `{\\rtf1
{\\*\\revtbl{Unknown;}{John Doe;}{Jane Smith;}}
Original text {\\revised\\revauth1 inserted by John} more text.
{\\deleted\\revauth2 removed by Jane} final text.}`;

const doc = parseRTF(rtf);

// Get all track changes
const changes = getTrackChanges(doc);
changes.forEach((change) => {
  console.log(`${change.type}: "${change.text}" by ${change.author}`);
});
// Output:
// insertion: "inserted by John" by John Doe
// deletion: "removed by Jane" by Jane Smith

// Get summary metadata
const metadata = getTrackChangeMetadata(doc);
console.log(`${metadata.totalChanges} changes by ${metadata.authors.length} authors`);
// Output: 2 changes by 2 authors

HTML with Track Changes Visualization

import { parseRTF, toHTML } from '@jonahschulte/rtf-toolkit';

const rtfWithChanges = `{\\rtf1
{\\*\\revtbl{Unknown;}{Editor;}}
Text with {\\revised\\revauth1 new content} here.}`;

const doc = parseRTF(rtfWithChanges);
const html = toHTML(doc);

console.log(html);
// Output includes:
// <span class="rtf-revision-inserted"
//       style="background-color: #d4edda; border-bottom: 2px solid #28a745;"
//       data-revision-type="insertion"
//       data-author="Editor">new content</span>

API Reference

Parsing

parseRTF(rtf: string): RTFDocument

Parse an RTF string into an Abstract Syntax Tree (AST).

const doc = parseRTF(rtfString);
console.log(doc.rtfVersion); // 1
console.log(doc.charset); // 'ansi'
console.log(doc.fontTable); // Array of fonts
console.log(doc.colorTable); // Array of colors
console.log(doc.content); // Array of paragraphs

Rendering

toHTML(doc: RTFDocument, options?: HTMLOptions): string

Convert RTF document AST to HTML.

const html = toHTML(doc, {
  includeWrapper: true, // Wrap in <div class="rtf-content">
});

HTMLOptions:

  • includeWrapper?: boolean - Wrap output in container div (default: true)
  • useClasses?: boolean - Use CSS classes instead of inline styles
  • classPrefix?: string - Custom CSS class prefix

Track Changes

getTrackChanges(doc: RTFDocument): TrackChange[]

Extract all track changes from the document.

const changes = getTrackChanges(doc);

changes.forEach((change) => {
  console.log(change.id); // Unique identifier
  console.log(change.type); // 'insertion' | 'deletion' | 'formatting'
  console.log(change.author); // Author name
  console.log(change.authorIndex); // Author index in revision table
  console.log(change.text); // Change content
  console.log(change.timestamp); // Date object
  console.log(change.position); // { paragraphIndex, characterOffset }
});

getTrackChangeMetadata(doc: RTFDocument): TrackChangeMetadata

Get summary statistics about track changes.

const metadata = getTrackChangeMetadata(doc);

console.log(metadata.totalChanges); // Total number of changes
console.log(metadata.insertions); // Number of insertions
console.log(metadata.deletions); // Number of deletions
console.log(metadata.authors); // Array of unique author names
console.log(metadata.hasRevisions); // Boolean flag

RTF Features Supported

Document Structure

  • ✅ RTF header (\rtf1, \ansi, \deff)
  • ✅ Font table (\fonttbl) with font families
  • ✅ Color table (\colortbl) with RGB values
  • ✅ Revision table (\revtbl) with author names

Character Formatting

  • ✅ Bold (\b)
  • ✅ Italic (\i)
  • ✅ Underline (\ul)
  • ✅ Font size (\fs)
  • ✅ Font family (\f)
  • ✅ Text color (\cf)
  • ✅ Background color (\cb)

Paragraph Formatting

  • ✅ Alignment (\qc, \qr, \ql, \qj)
  • ✅ Spacing before/after (\sb, \sa)
  • ✅ Indentation (\li, \ri, \fi)
  • ✅ Paragraph breaks (\par)

Track Changes

  • ✅ Revision table parsing
  • ✅ Insertions (\revised)
  • ✅ Deletions (\deleted)
  • ✅ Author tracking (\revauth)
  • ✅ Timestamps (\revdttm)
  • ✅ Visual HTML rendering

Special Characters

  • ✅ Hex escapes (\'XX)
  • ✅ Unicode characters (\u)
  • ✅ Control symbols (\~, \-, \_)
  • ✅ Literal escapes (\\, \{, \})

Use Cases

Government Contracts & Legal Documents

// Parse contract with redlines
const contract = parseRTF(governmentContractRTF);
const changes = getTrackChanges(contract);

// Review all changes
changes.forEach((change) => {
  console.log(`${change.author} ${change.type}d: "${change.text}"`);
});

// Generate HTML for review
const html = toHTML(contract);
// Insertions shown in green, deletions in red with strikethrough

Document Review Workflows

// Get summary for review dashboard
const metadata = getTrackChangeMetadata(doc);

console.log(`Pending Review: ${metadata.totalChanges} changes`);
console.log(`Contributors: ${metadata.authors.join(', ')}`);

Content Migration

// Extract clean text without markup
const doc = parseRTF(rtfString);
const html = toHTML(doc);

// Use in modern CMS or web application

Examples

See the examples/ directory for complete demonstrations:

  • basic-usage.ts - Basic parsing and HTML conversion
  • track-changes-demo.ts - Track changes extraction and visualization

Development

# Install dependencies
npm install

# Run tests
npm test

# Build
npm run build

# Run examples
npm run build && node examples/track-changes-demo.js

TypeScript Support

Full TypeScript definitions included:

import type {
  RTFDocument,
  ParagraphNode,
  TextNode,
  RevisionNode,
  TrackChange,
  TrackChangeMetadata,
} from '@jonahschulte/rtf-toolkit';

// Fully typed API with IntelliSense support

Browser Support

Works in all modern browsers and Node.js:

  • ✅ Chrome/Edge (latest)
  • ✅ Firefox (latest)
  • ✅ Safari (latest)
  • ✅ Node.js 18+

Performance

Optimized for real-world documents:

  • Typical documents (<100KB): <100ms parsing
  • Large documents (1MB+): <1s parsing
  • Efficient memory usage
  • Streaming capable (future)

License

MIT © 2025 Jonah Schulte

Credits

Built to solve real-world government contract NDA management needs.

Open-sourced to help the community deal with legacy RTF systems.

Acknowledgments

  • RTF 1.9.1 Specification by Microsoft
  • Inspired by the limitations of existing RTF libraries

Contributing

Issues and PRs welcome! See GitHub Issues.

Star this repo if you find it useful! ⭐

About

A production-grade RTF parsing library for JavaScript/TypeScript with comprehensive track changes support. Perfect for government contracts, legal documents, and any application requiring RTF document analysis.

Resources

License

Code of conduct

Contributing

Security policy

Stars

Watchers

Forks

Packages

 
 
 

Contributors