fix: link-check FP reduction round 2 — short labels, ALL_CAPS, version-adjacent#77
Merged
ohjonathan merged 3 commits intomainfrom Feb 13, 2026
Merged
Conversation
…n-adjacent
Reduce link-check false positives from ~92 → 45 by adding three new
pattern exclusions to _looks_like_doc_id() in the generic body scan:
1. Short label pattern (^[A-Z]{1,2}-?[A-Z]?\d{1,2}$)
Rejects track/finding labels: A1, B2, X-H1, NB-1, L0, M-2, etc.
~18 unique values, the largest remaining FP category.
2. ALL_CAPS token pattern (^[A-Z][A-Z_]+[A-Z]$)
Rejects SCREAMING_SNAKE_CASE config constants: AUTO_CONSOLIDATE.
3. Version-adjacent patterns:
- Extended _VERSION_RE to catch trailing letters (v3.2.1b)
- Added _VERSION_WILDCARD_RE for .x wildcards (v2.x, v3.2.x)
Safety: All filters only affect Pass 2 (generic unknown-ID scan).
Pass 1 (known-ID scan) always finds existing doc IDs regardless of
naming pattern, so no false negatives are introduced.
24 new tests added (14 short label, 3 ALL_CAPS, 4 version-adjacent,
3 integration scan tests). Full suite: 916 passed.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Owner
Author
|
Adversarial review (single comprehensive pass) Findings
Evidence:
Concrete repro I ran:
Impact:
Nuance:
Evidence:
Impact:
Open questions
Validation run
Overall: this is a strong incremental FP reduction, but I would treat Finding #1 as blocking unless reduced recall for broken bare-token references is an explicit, accepted tradeoff. |
Two new tests for PR #77 adversarial review response: - Known gap: broken ref matching filtered pattern not detected in generic scan - Happy path: short-label doc ID detected by known-ID scan when it exists Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This PR implements Option A from the second external review response plan: add more pattern exclusions to
_looks_like_doc_id()to further reduce link-check false positives. This is the hotfix approach agreed upon in the plan; Option B (require multi-segment snake_case) was explicitly deferred as a more fundamental redesign.Result: broken references reduced from ~92 → 45 (51% further reduction; ~89% total from the original 408 baseline).
Plan Reference
This PR executes the "Implementation Plan: Further FP Reduction (Option A)" section of the second external review response plan. The plan was developed after the original reviewer re-evaluated the codebase post-PR #76 and verified 7 of 9 items as fixed.
Context from the plan
The remaining ~92 broken references decomposed into 25 unique values across these categories:
This PR addresses the first, second, fourth, and fifth categories. The snake_case prose tokens (e.g.
logs_dir,warn_legacy) are intentionally not filtered — they genuinely look like doc IDs and are the hardest to distinguish without the multi-segment redesign (Option B).Reviewer corrections from the plan (not in scope for this PR, but for context)
The plan also documented two factual corrections to the reviewer's claims:
CHANGELOG.mdline 5 already contains a cross-reference toOntos_CHANGELOG.md, added in commit6305302.pyproject.tomlline 58 setstestpaths = ["tests"]. Legacy tests only run when explicitly invoked. The FutureWarning is a Python import-time side effect, not test execution.Changes
ontos/core/body_refs.pyThree new regex constants and corresponding filter checks in
_looks_like_doc_id():_SHORT_LABEL_RE—^[A-Z]{1,2}-?[A-Z]?\d{1,2}$A1,B2,X-H1,NB-1,L0,M-2_ALL_CAPS_RE—^[A-Z][A-Z_]+[A-Z]$AUTO_CONSOLIDATE_VERSION_WILDCARD_RE—^v?\d+(\.\d+)*\.x$v2.x,v3.2.xUpdated
_VERSION_RE—^v?\d+(\.\d+)+[a-z]?$(was^v?\d+(\.\d+)+$)v3.2.1b,3.2.1aAll four checks are inserted after the file-extension filter and before the
if "_" in token or "." in tokencatch-all, ensuring they intercept tokens that would otherwise be misclassified as doc IDs.tests/core/test_body_refs.py24 new test cases added across existing test classes:
TestLooksLikeDocIdFilters(unit tests on_looks_like_doc_id):test_short_labels_rejected— 14 parametrized cases (A1, A3, B2, L0–L3, M-2, B-2, NB-1–NB-3, X-H1, X-H2)test_all_caps_constants_rejected— 3 parametrized cases (AUTO_CONSOLIDATE, MY_CONFIG, SOME_SETTING)test_version_adjacent_rejected— 4 parametrized cases (v3.2.1b, v2.x, v3.2.x, 3.2.1a)TestFalsePositiveScanning(integration tests via fullscan_body_references):test_short_labels_not_in_scan— verifies A1, NB-1 not in scan outputtest_all_caps_constants_not_in_scan— verifies AUTO_CONSOLIDATE not in scan outputtest_version_wildcards_not_in_scan— verifies v2.x, v3.2.1b not in scan outputtests/commands/test_link_check.py2 new integration tests documenting the precision/recall tradeoff:
test_link_check_broken_ref_matching_filtered_pattern_not_detected_in_generic_scan— Documents the known gap: a broken bare-token reference whose ID matches a filtered pattern (e.g.,A2) is NOT detected by the generic scan. This is the accepted tradeoff for eliminating ~60+ FPs from short labels.test_link_check_short_label_doc_id_detected_when_exists— Confirms the known-ID scan (Pass 1) correctly detects short-label doc IDs when the referenced document exists.Safety and known tradeoffs
All new filters only affect Pass 2 (generic unknown-ID scan,
_iter_generic_id_candidates). The known-ID scan (Pass 1,_iter_known_id_candidates) always finds existing doc IDs by exact match regardless of naming pattern, so references to existing documents are never missed.However, broken references to non-existent documents whose IDs match filtered patterns (e.g., a typo
A2whenA1exists) will not be detected by the generic scan. This is an intentional precision/recall tradeoff — the same class of gap exists for all pre-existing filters (_VERSION_RE,_BARE_NUMBER_RE,_FILE_EXTENSION_RE,_KNOWN_FIELD_NAMES) and is inherent to heuristic-based generic scanning. The gap is narrow: it only affects documents with IDs matching short labels / ALL_CAPS / version wildcards that (a) don't exist and (b) are referenced only as bare tokens in body text (not in frontmatterdepends_on, which is always validated against the full ID set).The existing
TestKnownIdsBypassFilterstest class validates the Pass 1 safety invariant, and the newtest_link_check_broken_ref_matching_filtered_pattern_not_detected_in_generic_scantest explicitly documents the accepted tradeoff.Verification
Test plan
TestKnownIdsBypassFilters🤖 Generated with Claude Code