fix: link-check FP reduction round 2 — short labels, ALL_CAPS, version-adjacent by ohjonathan · Pull Request #77 · ohjonathan/Project-Ontos

ohjonathan · 2026-02-12T21:56:13Z

Summary

This PR implements Option A from the second external review response plan: add more pattern exclusions to _looks_like_doc_id() to further reduce link-check false positives. This is the hotfix approach agreed upon in the plan; Option B (require multi-segment snake_case) was explicitly deferred as a more fundamental redesign.

Result: broken references reduced from ~92 → 45 (51% further reduction; ~89% total from the original 408 baseline).
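The percentages above can be checked with a few lines of arithmetic. This is only a sanity check on the quoted figures; the helper name `reduction_pct` is made up for illustration, and the starting count of 92 is itself approximate.

```python
def reduction_pct(before: int, after: int) -> float:
    """Percentage reduction going from `before` to `after`."""
    return 100.0 * (before - after) / before


# Round 2 alone: ~92 -> 45 broken references.
print(round(reduction_pct(92, 45)))   # 51
# Cumulative: original baseline of 408 -> 45.
print(round(reduction_pct(408, 45)))  # 89
```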

Plan Reference

This PR executes the "Implementation Plan: Further FP Reduction (Option A)" section of the second external review response plan. The plan was developed after the original reviewer re-evaluated the codebase post-PR #76 and verified 7 of 9 items as fixed.

Context from the plan

The remaining ~92 broken references decomposed into 25 unique values across these categories:

| Category | Examples | Unique count |
|---|---|---|
| Track/finding labels | A1, A3, B2, X-H1, NB-1, M-2 | 14 |
| Short alphanumeric (curation levels) | L0, L1, L2, L3 | 4 |
| Snake_case prose tokens | logs_dir, warn_legacy, snake_case | 3 |
| Version-adjacent | v3.2.1b, v2.x | 2 |
| ALL_CAPS config | AUTO_CONSOLIDATE | 1 |
| Other | 1-safe | 1 |

This PR addresses the first, second, fourth, and fifth categories. The snake_case prose tokens (e.g. logs_dir, warn_legacy) are intentionally not filtered — they genuinely look like doc IDs and are the hardest to distinguish without the multi-segment redesign (Option B).

Reviewer corrections from the plan (not in scope for this PR, but for context)

The plan also documented two factual corrections to the reviewer's claims:

"Dual CHANGELOGs — no cross-reference" — Incorrect. CHANGELOG.md line 5 already contains a cross-reference to Ontos_CHANGELOG.md, added in commit 6305302.
"Legacy scripts in CI — still runs 152 tests" — Incorrect. pyproject.toml line 58 sets testpaths = ["tests"]. Legacy tests only run when explicitly invoked. The FutureWarning is a Python import-time side effect, not test execution.

Changes

`ontos/core/body_refs.py`

Three new regex constants, one extended existing constant (_VERSION_RE), and the corresponding filter checks in _looks_like_doc_id():

_SHORT_LABEL_RE — ^[A-Z]{1,2}-?[A-Z]?\d{1,2}$
- Rejects short alphanumeric labels: A1, B2, X-H1, NB-1, L0, M-2
- This was the single largest remaining FP category (~18 unique values, ~60+ occurrences)
_ALL_CAPS_RE — ^[A-Z][A-Z_]+[A-Z]$
- Rejects SCREAMING_SNAKE_CASE config constants: AUTO_CONSOLIDATE
- Matches tokens that are entirely uppercase letters + underscores, minimum 3 chars
_VERSION_WILDCARD_RE — ^v?\d+(\.\d+)*\.x$
- Rejects version wildcards: v2.x, v3.2.x
Updated _VERSION_RE — ^v?\d+(\.\d+)+[a-z]?$ (was ^v?\d+(\.\d+)+$)
- Now also catches trailing pre-release letters: v3.2.1b, 3.2.1a

All four checks are inserted after the file-extension filter and before the if "_" in token or "." in token catch-all, ensuring they intercept tokens that would otherwise be misclassified as doc IDs.
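For illustration, the four patterns can be exercised on their own. This is a minimal sketch, not the code in `ontos/core/body_refs.py`: the regex strings are copied from this description, but the helper name `rejected_by_new_filters` is hypothetical, and the real `_looks_like_doc_id()` applies further checks (file extensions, known field names, the `_`/`.` catch-all) that are omitted here.

```python
import re

# Patterns copied from the PR description.
_SHORT_LABEL_RE = re.compile(r"^[A-Z]{1,2}-?[A-Z]?\d{1,2}$")
_ALL_CAPS_RE = re.compile(r"^[A-Z][A-Z_]+[A-Z]$")
_VERSION_WILDCARD_RE = re.compile(r"^v?\d+(\.\d+)*\.x$")
_VERSION_RE = re.compile(r"^v?\d+(\.\d+)+[a-z]?$")

_FILTERS = (_SHORT_LABEL_RE, _ALL_CAPS_RE, _VERSION_WILDCARD_RE, _VERSION_RE)


def rejected_by_new_filters(token: str) -> bool:
    """Hypothetical helper: True if any of the four patterns matches the token."""
    return any(pattern.match(token) for pattern in _FILTERS)


if __name__ == "__main__":
    samples = ["A1", "X-H1", "NB-1", "AUTO_CONSOLIDATE", "v2.x", "v3.2.1b", "logs_dir", "1-safe"]
    for token in samples:
        verdict = "rejected" if rejected_by_new_filters(token) else "kept"
        print(f"{token}: {verdict}")
```

In this sketch, `logs_dir` and `1-safe` are kept, which matches the categories this PR leaves unaddressed.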

`tests/core/test_body_refs.py`

24 new test cases added across existing test classes:

TestLooksLikeDocIdFilters (unit tests on _looks_like_doc_id):

test_short_labels_rejected — 14 parametrized cases (A1, A3, B2, L0–L3, M-2, B-2, NB-1–NB-3, X-H1, X-H2)
test_all_caps_constants_rejected — 3 parametrized cases (AUTO_CONSOLIDATE, MY_CONFIG, SOME_SETTING)
test_version_adjacent_rejected — 4 parametrized cases (v3.2.1b, v2.x, v3.2.x, 3.2.1a)

TestFalsePositiveScanning (integration tests via full scan_body_references):

test_short_labels_not_in_scan — verifies A1, NB-1 not in scan output
test_all_caps_constants_not_in_scan — verifies AUTO_CONSOLIDATE not in scan output
test_version_wildcards_not_in_scan — verifies v2.x, v3.2.1b not in scan output

`tests/commands/test_link_check.py`

2 new integration tests documenting the precision/recall tradeoff:

test_link_check_broken_ref_matching_filtered_pattern_not_detected_in_generic_scan — Documents the known gap: a broken bare-token reference whose ID matches a filtered pattern (e.g., A2) is NOT detected by the generic scan. This is the accepted tradeoff for eliminating ~60+ FPs from short labels.
test_link_check_short_label_doc_id_detected_when_exists — Confirms the known-ID scan (Pass 1) correctly detects short-label doc IDs when the referenced document exists.

Safety and known tradeoffs

All new filters only affect Pass 2 (generic unknown-ID scan, _iter_generic_id_candidates). The known-ID scan (Pass 1, _iter_known_id_candidates) always finds existing doc IDs by exact match regardless of naming pattern, so references to existing documents are never missed.

However, broken references to non-existent documents whose IDs match filtered patterns (e.g., a typo A2 when A1 exists) will not be detected by the generic scan. This is an intentional precision/recall tradeoff — the same class of gap exists for all pre-existing filters (_VERSION_RE, _BARE_NUMBER_RE, _FILE_EXTENSION_RE, _KNOWN_FIELD_NAMES) and is inherent to heuristic-based generic scanning. The gap is narrow: it only affects documents with IDs matching short labels / ALL_CAPS / version wildcards that (a) don't exist and (b) are referenced only as bare tokens in body text (not in frontmatter depends_on, which is always validated against the full ID set).

The existing TestKnownIdsBypassFilters test class validates the Pass 1 safety invariant, and the new test_link_check_broken_ref_matching_filtered_pattern_not_detected_in_generic_scan test explicitly documents the accepted tradeoff.
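The two-pass behavior and the resulting gap can be modeled in a few lines. This is a simplified sketch under stated assumptions, not the logic in `ontos/core/link_diagnostics.py`: the function `scan_body`, the tokenizer, and the rule "an unknown token containing a digit or underscore is a candidate" are all hypothetical stand-ins, and only the short-label filter from this PR is included.

```python
import re

# Short-label pattern copied from the PR description.
_SHORT_LABEL_RE = re.compile(r"^[A-Z]{1,2}-?[A-Z]?\d{1,2}$")
# Assumed tokenizer for this sketch; the real scanner may tokenize differently.
_TOKEN_RE = re.compile(r"[A-Za-z0-9_.-]+")


def scan_body(body: str, known_ids: set[str], apply_filter: bool = True) -> tuple[set[str], set[str]]:
    """Return (resolved, broken) bare-token references found in `body`."""
    tokens = {t.strip(".") for t in _TOKEN_RE.findall(body)} - {""}

    # Pass 1: exact match against known IDs; no pattern filters are applied.
    resolved = tokens & known_ids

    # Pass 2: unknown tokens are reported only if they survive the filters.
    broken = set()
    for token in tokens - known_ids:
        if apply_filter and _SHORT_LABEL_RE.match(token):
            continue  # filtered shape: never reported, even if the target is missing
        if any(ch.isdigit() or ch == "_" for ch in token):  # assumed candidate rule
            broken.add(token)
    return resolved, broken


if __name__ == "__main__":
    body = "Roadmap references A1 and A2, plus missing_doc."
    print(scan_body(body, known_ids={"A1"}))
    print(scan_body(body, known_ids={"A1"}, apply_filter=False))
```

With the filter on, `A1` is still resolved by pass 1, while the missing `A2` is silently dropped by pass 2. With the filter off, `A2` is reported as broken, at the cost of also flagging every short label in prose.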

Verification

$ python3 -m pytest tests/core/test_body_refs.py -v    # 88 passed
$ python3 -m pytest tests/commands/test_link_check.py -v  # 14 passed
$ python3 -m pytest tests/ --tb=short                  # 918 passed, 2 skipped
$ ontos link-check --json | python3 -c "..."           # broken_references: 45

Test plan

All 88 body_refs tests pass (including 24 new)
All 14 link-check integration tests pass (including 2 new)
Full suite: 918 passed, 2 skipped, 0 failures
Broken reference count verified: 92 → 45
Known tradeoff documented with explicit test case
Review board: verify filter patterns don't reject legitimate doc ID naming conventions
Review board: verify two-pass safety claim by checking TestKnownIdsBypassFilters

🤖 Generated with Claude Code

…n-adjacent

Reduce link-check false positives from ~92 → 45 by adding three new pattern exclusions to _looks_like_doc_id() in the generic body scan:

1. Short label pattern (^[A-Z]{1,2}-?[A-Z]?\d{1,2}$)
   Rejects track/finding labels: A1, B2, X-H1, NB-1, L0, M-2, etc. ~18 unique values, the largest remaining FP category.
2. ALL_CAPS token pattern (^[A-Z][A-Z_]+[A-Z]$)
   Rejects SCREAMING_SNAKE_CASE config constants: AUTO_CONSOLIDATE.
3. Version-adjacent patterns:
   - Extended _VERSION_RE to catch trailing letters (v3.2.1b)
   - Added _VERSION_WILDCARD_RE for .x wildcards (v2.x, v3.2.x)

Safety: All filters only affect Pass 2 (generic unknown-ID scan). Pass 1 (known-ID scan) always finds existing doc IDs regardless of naming pattern, so no false negatives are introduced.

24 new tests added (14 short label, 3 ALL_CAPS, 4 version-adjacent, 3 integration scan tests). Full suite: 916 passed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

ohjonathan · 2026-02-13T00:15:18Z

Adversarial review (single comprehensive pass)

Findings

High: PR #77 expands a real false-negative blind spot for broken bare-token references.

Evidence:

Link-check runs a two-pass body scan: known IDs exact-match pass (ontos/core/link_diagnostics.py:327) + generic unknown pass (ontos/core/link_diagnostics.py:336).
PR #77 adds further generic rejections in _looks_like_doc_id() (ontos/core/body_refs.py:676, ontos/core/body_refs.py:680, ontos/core/body_refs.py:683).
This means missing refs that fit those shapes can be silently dropped in generic mode.

Concrete repro I ran:

Temp repo with one doc: id: A1
Body content: Roadmap references A2.
ontos link-check --json result: broken_references: 0, exit_code: 0

Impact:

Broken bare references like A2, AUTO_CONSOLIDATE, v2.x can be missed if they don’t already exist in the graph.

Nuance:

The PR claim that “known IDs are safe” is true for existing IDs (pass 1).
The gap is unknown/missing IDs (the actual broken-reference class), which rely on pass 2 heuristics.

Medium: Test coverage does not protect the above failure mode.

Evidence:

Existing tests validate generic filtering (tests/core/test_body_refs.py:278) and known-id bypass (tests/core/test_body_refs.py:264).
There is no command-level test showing link-check still reports broken bare refs for the newly filtered token classes.

Impact:

Regression can pass all current tests while weakening broken-reference detection recall.

Open questions

Is the intended product behavior now “these token classes should never be treated as bare ID candidates,” even if true broken refs are missed?
If yes, should this be an explicit mode/config tradeoff (precision vs recall) rather than a silent heuristic tightening?

Validation run

pytest -q tests/core/test_body_refs.py -> 88 passed
pytest -q tests/commands/test_link_check.py tests/commands/test_rename.py -> 37 passed
pytest -q -> 916 passed, 2 skipped
ontos link-check --json -> broken_references: 45 (matches PR claim)

Overall: this is a strong incremental FP reduction, but I would treat Finding #1 as blocking unless reduced recall for broken bare-token references is an explicit, accepted tradeoff.

Two new tests for PR #77 adversarial review response:
- Known gap: broken ref matching filtered pattern not detected in generic scan
- Happy path: short-label doc ID detected by known-ID scan when it exists

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>


ohjonathan and others added 2 commits February 12, 2026 19:23

docs: archive Ontos session logs and context map

47ff9ac

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

ohjonathan merged commit 81481b1 into main Feb 13, 2026

ohjonathan deleted the fix/link-check-fp-reduction-round2 branch February 13, 2026 00:50

ohjonathan merged 3 commits into main from fix/link-check-fp-reduction-round2
