Skip to content

feat: add MFA bypass and edge cases vulnerability skills with test suite#334

Open
fresh3nough wants to merge 2 commits intousestrix:mainfrom
fresh3nough:cody
Open

feat: add MFA bypass and edge cases vulnerability skills with test suite#334
fresh3nough wants to merge 2 commits intousestrix:mainfrom
fresh3nough:cody

Conversation

@fresh3nough
Copy link

@fresh3nough fresh3nough commented Mar 1, 2026

Closes #335

Summary

This PR adds two new vulnerability skill files and a comprehensive test suite to strengthen Strix's coverage in areas that were previously missing: MFA bypass techniques and edge case exploitation (caching races, partial failures, boundary conditions).

New Skills

1. MFA Bypass (vulnerabilities/mfa_bypass.md)

Covers testing for multi-factor authentication weaknesses including:

  • Session fixation / pre-MFA access -- exploiting the gap between password verification and MFA completion where pre-MFA sessions may already grant API access
  • Code validation flaws -- OTP code reuse, brute force without rate limiting, time window abuse, and race conditions on single-use enforcement
  • Fallback and recovery abuse -- backup code weaknesses, method downgrade (TOTP to SMS), recovery flow bypass, and remember-me token theft
  • Enrollment flow weaknesses -- adding/removing MFA methods without verifying the existing factor, TOTP secret exposure on re-enrollment
  • State machine abuse -- step skipping, cross-flow confusion, downgrade to single factor
  • Token and claim manipulation -- JWT amr/acr claim tampering, client-side MFA gates
  • CWE-308 classification with full attack surface, testing methodology, validation criteria, and chaining attack patterns

2. Edge Cases (vulnerabilities/edge_cases.md)

Covers testing for system boundary vulnerabilities including:

  • Cache poisoning races -- TOCTOU on CDN, cache key confusion, Vary header omission, web cache deception, parameter cloaking
  • Partial failure exploitation -- half-committed transactions, orphaned resources, retry amplification, compensation races, dead letter exploitation
  • Eventual consistency windows -- read-your-writes violations, cross-region stale reads, search index lag, distributed counter drift
  • Boundary condition abuse -- integer overflow/underflow, pagination edge cases, time boundary exploitation, encoding differentials, floating point precision
  • Graceful degradation weaknesses -- fallback path bypass, circuit breaker open state, feature flag defaults, cache stampede
  • Stale state and revocation gaps -- token revocation lag, permission cache staleness, DNS rebinding
  • CWE-362 classification with full reconnaissance, bypass techniques, and validation criteria

Test Suite

Added tests/skills/test_skills.py with 50 test cases across 9 test classes:

  • TestGetAvailableSkills -- skill discovery, category filtering, sorting
  • TestGetAllSkillNames -- aggregation across categories, new skill presence
  • TestValidateSkillNames -- valid/invalid/mixed input handling
  • TestLoadSkills -- loading, frontmatter stripping, multi-skill loading, nonexistent handling
  • TestGenerateSkillsDescription -- description generation and content
  • TestGetAllCategories -- internal category listing
  • TestMfaBypassSkillContent -- frontmatter validation, required sections, topic coverage (session fixation, OTP reuse, fallback abuse, enrollment)
  • TestEdgeCasesSkillContent -- frontmatter validation, required sections, topic coverage (cache poisoning, partial failures, consistency, boundaries, degradation)
  • TestAllSkillFilesStructure -- structural validation across ALL skill .md files (frontmatter, name/description fields, non-empty content)

All 50 tests pass. Ruff lint clean.

Files Changed

  • strix/skills/vulnerabilities/mfa_bypass.md (new)
  • strix/skills/vulnerabilities/edge_cases.md (new)
  • tests/skills/__init__.py (new)
  • tests/skills/test_skills.py (new)

Both skills follow the existing template style (frontmatter with name/description/category/tags/cwe, plus standardized sections: Attack Surface, High-Value Targets, Reconnaissance, Key Vulnerabilities, Bypass Techniques, Testing Methodology, Validation, False Positives, Impact, Pro Tips, Summary).

- Add mfa_bypass.md: MFA bypass testing covering session fixation, code
  reuse, fallback abuse, enrollment flow weaknesses (CWE-308)
- Add edge_cases.md: edge case testing covering caching races, partial
  failures, boundary conditions, eventual consistency exploitation (CWE-362)
- Add tests/skills/test_skills.py: 50 test cases covering skill loading,
  validation, frontmatter parsing, content quality, and structural checks
  across all skill files
@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 1, 2026

Greptile Summary

This PR adds two well-written vulnerability skill files (mfa_bypass.md and edge_cases.md) and a 50-test suite to cover them. The skill content and structure are solid; the main issues are in the test file.

  • No-op replace in test_mentions_new_skills (tests/skills/test_skills.py:597): skill.replace("_", "_") is a complete no-op, making the assertion identical to assert skill in desc. The test passes vacuously and does not verify any meaningful transformation or the hyphenated frontmatter name variant.
  • test_includes_internal_categories never tests internal categories (tests/skills/test_skills.py:608–611): Despite the name and docstring claiming to assert that _get_all_categories() returns scan_modes and coordination (the categories excluded from get_available_skills()), the only assertion is for "vulnerabilities", which is present in both functions. The key behavioural difference of _get_all_categories is completely untested.
  • Missing test_frontmatter_has_cwe in TestEdgeCasesSkillContent: The parallel TestMfaBypassSkillContent class tests for the cwe frontmatter field; TestEdgeCasesSkillContent lacks this check despite edge_cases.md also declaring cwe: 362.

Confidence Score: 4/5

  • Safe to merge; issues are confined to the test suite and do not affect runtime behaviour or the skill content itself.
  • The two new skill markdown files are well-formed and follow existing conventions exactly. The test file has two logic gaps (a no-op assertion and an under-specified internal-category test) that reduce coverage guarantees but do not cause test failures or runtime regressions. The overall change is additive and low-risk.
  • tests/skills/test_skills.py — contains a no-op assertion and a test that does not validate its stated contract.

Important Files Changed

Filename Overview
tests/skills/test_skills.py New 373-line test suite with 50 cases across 9 classes; contains two logic issues: a no-op replace("_", "_") in test_mentions_new_skills that makes the assertion trivially meaningless, and test_includes_internal_categories which never asserts the internal scan_modes/coordination categories it claims to cover. One minor inconsistency: TestEdgeCasesSkillContent is missing the test_frontmatter_has_cwe assertion present in the parallel TestMfaBypassSkillContent class.
strix/skills/vulnerabilities/mfa_bypass.md New skill file covering MFA bypass techniques; well-structured with valid frontmatter (name, description, category, tags, cwe: 308) and all required sections. Content is comprehensive and follows the existing skill template style.
strix/skills/vulnerabilities/edge_cases.md New skill file covering edge case exploitation (caching races, partial failures, boundary conditions); well-structured with valid frontmatter (name, description, category, tags, cwe: 362) and all required sections. Content is comprehensive and follows the existing skill template style.
tests/skills/init.py Trivial new package init file with a single comment; no issues.

Last reviewed commit: bd242ab

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

4 files reviewed, 3 comments

Edit Code Review Agent Settings | Greptile

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 1, 2026

Additional Comments (3)

tests/skills/test_skills.py
No-op replace makes assertion trivially meaningless

skill.replace("_", "_") replaces underscores with underscores — it is a complete no-op and produces the exact same string. The assertion is therefore identical to assert skill in desc, which happens to pass because generate_skills_description() already uses underscore-separated file stems (e.g. mfa_bypass, edge_cases).

The comment above says "The description includes a comma-separated list of all skill names" and the replace call suggests the author intended to test the hyphenated variant (e.g. mfa-bypass) or some other transformation. As written, the test adds no value beyond the base assert skill in desc check and silently hides the developer's original intent.

            assert skill in desc

If the intent was to verify the hyphenated name field from frontmatter appears in the description, note that generate_skills_description() uses file-stem names (underscores), not frontmatter names (hyphens), so the hyphenated form would never be found in the output. The test should simply assert the underscore form, as shown above.

Prompt To Fix With AI
This is a comment left during a code review.
Path: tests/skills/test_skills.py
Line: 596-597

Comment:
**No-op `replace` makes assertion trivially meaningless**

`skill.replace("_", "_")` replaces underscores with underscores — it is a complete no-op and produces the exact same string. The assertion is therefore identical to `assert skill in desc`, which happens to pass because `generate_skills_description()` already uses underscore-separated file stems (e.g. `mfa_bypass`, `edge_cases`).

The comment above says _"The description includes a comma-separated list of all skill names"_ and the replace call suggests the author intended to test the hyphenated variant (e.g. `mfa-bypass`) or some other transformation. As written, the test adds no value beyond the base `assert skill in desc` check and silently hides the developer's original intent.

```suggestion
            assert skill in desc
```

If the intent was to verify the hyphenated `name` field from frontmatter appears in the description, note that `generate_skills_description()` uses file-stem names (underscores), not frontmatter names (hyphens), so the hyphenated form would never be found in the output. The test should simply assert the underscore form, as shown above.

How can I resolve this? If you propose a fix, please make it concise.

tests/skills/test_skills.py
Test doesn't verify what it claims — internal categories never asserted

The test is named test_includes_internal_categories and the docstring explicitly states "Should include categories that get_available_skills excludes", referring to scan_modes and coordination. However, the only assertion is:

assert "vulnerabilities" in cats

vulnerabilities is present in both get_available_skills() and _get_all_categories(). This assertion would pass even if _get_all_categories() were functionally identical to get_available_skills(). The distinguishing behaviour of _get_all_categories — that it returns the internal scan_modes and coordination categories — is never exercised, so the test provides no coverage guarantee for the function's unique contract.

    def test_includes_internal_categories(self) -> None:
        cats = _get_all_categories()
        # Should include categories that get_available_skills excludes
        assert "vulnerabilities" in cats
        assert "scan_modes" in cats, "scan_modes should be returned by _get_all_categories"
        assert "coordination" in cats, "coordination should be returned by _get_all_categories"
Prompt To Fix With AI
This is a comment left during a code review.
Path: tests/skills/test_skills.py
Line: 608-611

Comment:
**Test doesn't verify what it claims — internal categories never asserted**

The test is named `test_includes_internal_categories` and the docstring explicitly states _"Should include categories that `get_available_skills` excludes"_, referring to `scan_modes` and `coordination`. However, the only assertion is:

```python
assert "vulnerabilities" in cats
```

`vulnerabilities` is present in **both** `get_available_skills()` and `_get_all_categories()`. This assertion would pass even if `_get_all_categories()` were functionally identical to `get_available_skills()`. The distinguishing behaviour of `_get_all_categories` — that it returns the internal `scan_modes` and `coordination` categories — is never exercised, so the test provides no coverage guarantee for the function's unique contract.

```suggestion
    def test_includes_internal_categories(self) -> None:
        cats = _get_all_categories()
        # Should include categories that get_available_skills excludes
        assert "vulnerabilities" in cats
        assert "scan_modes" in cats, "scan_modes should be returned by _get_all_categories"
        assert "coordination" in cats, "coordination should be returned by _get_all_categories"
```

How can I resolve this? If you propose a fix, please make it concise.

tests/skills/test_skills.py
TestEdgeCasesSkillContent missing CWE frontmatter assertion

TestMfaBypassSkillContent includes test_frontmatter_has_cwe to verify the cwe field is present. edge_cases.md also declares cwe: 362 in its frontmatter, but TestEdgeCasesSkillContent has no equivalent test. For consistency and to prevent regressions (e.g., accidental deletion of the cwe field), add the same assertion here:

def test_frontmatter_has_cwe(self, frontmatter: dict[str, str]) -> None:
    assert "cwe" in frontmatter
Prompt To Fix With AI
This is a comment left during a code review.
Path: tests/skills/test_skills.py
Line: 674-720

Comment:
**`TestEdgeCasesSkillContent` missing CWE frontmatter assertion**

`TestMfaBypassSkillContent` includes `test_frontmatter_has_cwe` to verify the `cwe` field is present. `edge_cases.md` also declares `cwe: 362` in its frontmatter, but `TestEdgeCasesSkillContent` has no equivalent test. For consistency and to prevent regressions (e.g., accidental deletion of the `cwe` field), add the same assertion here:

```python
def test_frontmatter_has_cwe(self, frontmatter: dict[str, str]) -> None:
    assert "cwe" in frontmatter
```

How can I resolve this? If you propose a fix, please make it concise.

- Remove no-op replace('_', '_') in test_mentions_new_skills
- Add scan_modes and coordination assertions in test_includes_internal_categories
- Add test_frontmatter_has_cwe to TestEdgeCasesSkillContent for parity

Signed-off-by: fresh3nough <251773092+fresh3nough@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Missing vulnerability skills: MFA bypass and edge cases (caching races, partial failures)

1 participant