A robust solution for automatically deduplicating companies in HubSpot using Operations Hub workflows.
This script provides a complete solution for company deduplication in HubSpot, addressing the common problem of duplicate company records. It intelligently identifies duplicates using multiple properties, merges them following a consistent strategy, and tracks the process using a custom property.
- Bidirectional Merging: Merges the current company into an older primary, and pulls newer duplicates into the current company when it is the primary
- Multi-Property Matching: Uses name plus domain/LinkedIn URLs for accurate duplicate detection
- Self-Documenting: Tracks deduplication status using a custom property
- Primary Preservation: Always keeps the oldest company record (the one with the lowest record ID) as the primary
- Detailed Logging: Comprehensive logging for troubleshooting
- Error Handling: Robust error handling with HubSpot's built-in retry mechanism
First, create a custom property in HubSpot:
- Go to Settings → Properties → Create property
- Object type: Company
- Property name: `deduplication_status`
- Label: "Deduplication Status"
- Field type: Single-line text
- Group: Custom properties
- Create a new workflow in HubSpot
- Set it to trigger when any of the following happens:
  - A company is created, OR
  - Specific properties (such as name or domain) are updated, OR
  - A company is manually enrolled
- Add a "Custom Code" action to your workflow
- Select "Node.js 20.x" as the language
- Copy and paste the script code into the editor
- Add your API token as a secret named "ACCESSTOKEN"
- Save and activate your workflow
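
As a rough orientation for the custom code step, here is a minimal sketch of the entry point such an action exposes. It is not the deduplication script itself: it only shows how the `ACCESSTOKEN` secret and the enrolled company's ID are read. The `companyId` output field is a hypothetical name chosen for illustration.

```javascript
// Minimal custom code action skeleton (a sketch, not the original script).
async function main(event, callback) {
  // Secrets added to the action are exposed as environment variables.
  const token = process.env.ACCESSTOKEN;
  if (!token) {
    throw new Error('Missing ACCESSTOKEN secret');
  }

  // ID of the company record enrolled in the workflow.
  const companyId = String(event.object.objectId);

  // ... deduplication logic would go here, using `token` for API calls ...

  // Report back to the workflow; "companyId" is a hypothetical output field.
  callback({ outputFields: { companyId } });
}

exports.main = main;
```

Failing fast when the secret is missing makes a misconfigured action easier to spot in the workflow's execution log.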
For optimal deduplication:
- Sequential Processing: Add multiple instances of the script in sequence:
  - First script action
  - 5-10 second delay
  - Second script action (same code)
  - 5-10 second delay
  - Third script action (optional)
- Batch Processing: When processing large numbers of companies:
  - Process in batches of 100-200 companies at a time
  - Monitor workflow execution to avoid hitting API rate limits
- Status Check: The script first checks the company's `deduplication_status`:
  - If already `merged`, the company is skipped
  - If `primary` or no status, it checks for duplicates
- Duplicate Detection: Finds duplicates using a combination of:
  - Company name matching
  - Plus at least one additional identifier (domain, LinkedIn URL)
- Merge Strategy:
  - If the current company is the oldest (lowest ID), other duplicates are merged into it
  - If the current company is not the oldest, it's merged into the oldest company
- Status Tracking:
  - Companies that become the main record are marked as `primary`
  - Companies that are merged into others are marked as `merged`
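
The status check, merge strategy, and status tracking steps above can be sketched as a pure decision function. This is an illustrative reconstruction rather than the script's actual code: `planMerge`, `compareIds`, and the shape of the returned plan are assumptions, and the duplicates are assumed to have been detected already. No API calls are made here.

```javascript
// Status values come from the README; function and field names are illustrative.
const STATUS_PRIMARY = 'primary';
const STATUS_MERGED = 'merged';

// Compare record IDs numerically. BigInt avoids precision loss on long IDs.
function compareIds(a, b) {
  const x = BigInt(a.id);
  const y = BigInt(b.id);
  if (x < y) return -1;
  if (x > y) return 1;
  return 0;
}

// Decide what to do for `current`, given duplicates that were already detected.
function planMerge(current, duplicates) {
  // Status check: companies already marked as merged are skipped.
  if (current.deduplication_status === STATUS_MERGED) {
    return { action: 'skip', merges: [], statusUpdates: [] };
  }
  if (duplicates.length === 0) {
    return { action: 'none', merges: [], statusUpdates: [] };
  }

  // Merge strategy: the lowest ID in the group becomes the primary.
  const primary = [current, ...duplicates].sort(compareIds)[0];
  const currentIsPrimary = String(primary.id) === String(current.id);
  const toMerge = currentIsPrimary ? duplicates : [current];

  return {
    action: currentIsPrimary ? 'merge_into_current' : 'merge_into_oldest',
    merges: toMerge.map((c) => ({
      primaryId: String(primary.id),
      mergedId: String(c.id),
    })),
    // Status tracking: one primary, everything merged into it is "merged".
    statusUpdates: [
      { id: String(primary.id), status: STATUS_PRIMARY },
      ...toMerge.map((c) => ({ id: String(c.id), status: STATUS_MERGED })),
    ],
  };
}
```

When the current company is not the oldest, this sketch only merges the current company into the primary. Any remaining duplicates would be picked up when they are processed themselves, which is one reason running the action several times in sequence helps.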
The script uses these properties for matching:
- `name`: Primary company name (required)
- `domain`: Company website domain
- `linkedin_company_page`: LinkedIn company page URL
- `sales_navigator_url`: LinkedIn Sales Navigator URL
Additional properties like country, city, address, and phone are logged but not used for matching.
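
To make the matching rule concrete, the sketch below treats two companies as duplicates when their names match and at least one secondary property matches as well. The property names come from the list above. The helper names (`normalize`, `matchedProperties`, `isDuplicate`) and the trim-and-lowercase normalization are assumptions for illustration, and the actual script may compare values differently.

```javascript
const DEDUPE_PROPERTY = 'name';
const SECONDARY_PROPERTIES = [
  'domain',
  'linkedin_company_page',
  'sales_navigator_url',
];

// Assumed normalization: trim and lowercase; missing values become ''.
function normalize(value) {
  return typeof value === 'string' ? value.trim().toLowerCase() : '';
}

// Return the secondary properties on which two companies agree.
// An empty array means "not a duplicate" under this rule.
function matchedProperties(a, b) {
  const nameA = normalize(a[DEDUPE_PROPERTY]);
  if (nameA === '' || nameA !== normalize(b[DEDUPE_PROPERTY])) {
    return [];
  }
  return SECONDARY_PROPERTIES.filter((prop) => {
    const value = normalize(a[prop]);
    // Empty values never count as a match.
    return value !== '' && value === normalize(b[prop]);
  });
}

function isDuplicate(a, b) {
  return matchedProperties(a, b).length > 0;
}
```

Returning the list of matched properties, rather than a bare boolean, also gives the logging step something to report.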
You can customize the script by modifying these constants at the top:

    // Primary property to use for deduplication
    const DEDUPE_PROPERTY = 'name';

    // Additional properties to use for deduplication when available
    const SECONDARY_PROPERTIES = [
      'domain',
      'linkedin_company_page',
      'sales_navigator_url'
    ];

    // Properties to use for logging and debugging
    const LOGGING_PROPERTIES = [
      'country',
      'city',
      'phone',
      'address'
    ];

- API Rate Limits: If you're processing many companies and hitting rate limits, reduce batch sizes and add delays.
- Property Issues: Ensure the `deduplication_status` property is created correctly and has no validation rules.
- Merge Direction: Companies merge based on ID (lowest/oldest becomes primary), not creation date.
The script provides detailed logging that shows exactly what's happening:
- Which duplicate companies were found
- Which properties matched
- The merge direction
- Success/failure of each operation
Look for log messages with a ✓ symbol to confirm successful operations.
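
One way to produce this kind of log is a small wrapper around each operation. It is shown here as a hypothetical sketch, not the script's actual logging code: the `runStep` name, the message format, and the ✗ marker for failures are assumptions. The wrapper rethrows errors instead of swallowing them, so that a failed step surfaces as a failed action and the retry mechanism mentioned in the feature list can take over.

```javascript
// Run one operation, log its outcome, and rethrow on failure.
// `log` is injectable so the output can be captured; it defaults to console.log.
async function runStep(label, step, log = console.log) {
  try {
    const result = await step();
    log(`✓ ${label}`);
    return result;
  } catch (err) {
    log(`✗ ${label}: ${err.message}`);
    // Rethrow so the failure is visible to the caller and the workflow.
    throw err;
  }
}
```

A call might look like `await runStep('Merged 200 into 100', () => mergeCompanies('100', '200'))`, where `mergeCompanies` stands in for whatever function performs the merge.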
This script builds on solutions shared in the HubSpot Community with significant enhancements to handle bidirectional merging and status tracking.
MIT License