fix: extract_boxed_answer returns full text when no \boxed{} found #1028

Open
chopratejas wants to merge 1 commit into PrimeIntellect-ai:main from chopratejas:fix/extract-boxed-fallback

Conversation

@chopratejas (Contributor) commented Mar 17, 2026

Problem

extract_boxed_answer() returns the entire input text when no \boxed{} tag is found. This text is then passed to math_verify, which extracts any number it can find — allowing a model to get correct-answer credit by mentioning the answer anywhere in its output without using \boxed{}.

Training impact

We ran rewardprobe simulate on the GSM8K MathRubric. The strategy scoreboard shows:

correct_lazy         ████████████████████ 1.00  ← just the answer, no reasoning
shortcut             ████████████████████ 1.00  ← skips computation  
perfect              █████████████░░░░░░░ 0.67  ← full reasoning + boxed answer

A model trained against this will learn to skip reasoning entirely — because correct_lazy (just outputting the number) scores higher than perfect (showing work in \boxed{}).

Fix

Add a strict parameter to extract_boxed_answer (default: False).

  • strict=False (default): returns full text on no match — backwards compatible, existing callers unaffected
  • strict=True: returns "" on no match — used by MathRubric to enforce \boxed{} format
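The patched helper could look like the following, a minimal sketch, assuming the `data_utils.py` signature described above (the real implementation may differ, e.g. in how it matches braces):

```python
def extract_boxed_answer(text: str, strict: bool = False) -> str:
    """Return the contents of the last \\boxed{...} in `text`.

    Hypothetical sketch of the patched helper. With no match, returns
    the full text when strict=False (legacy passthrough) or "" when
    strict=True (format enforcement for MathRubric).
    """
    start = text.rfind("\\boxed{")
    if start == -1:
        return "" if strict else text
    # Walk to the matching closing brace, tracking nesting so answers
    # like \boxed{\frac{1}{2}} are captured whole.
    depth = 0
    for i in range(start + 6, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return text[start + 7 : i]
    return "" if strict else text  # unbalanced braces: same fallback
```

The brace-depth loop matters because math answers routinely contain nested braces; a naive regex up to the first `}` would truncate `\frac{1}{2}`.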

Changes

  1. data_utils.py: Add strict parameter with safe default
  2. math_rubric.py: MathRubric uses strict=True via functools.partial
  3. test_math_rubric.py: Tests updated to use \boxed{} format (reflecting correct behavior)
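The wiring in step 2 can be sketched as follows; the function body here is an illustrative stub, and only the `functools.partial` binding reflects the PR:

```python
from functools import partial

def extract_boxed_answer(text: str, strict: bool = False) -> str:
    """Stub with the patched signature; the real body lives in data_utils.py."""
    if "\\boxed{" not in text:
        return "" if strict else text
    return text[text.rfind("\\boxed{") + 7 : text.rfind("}")]

# MathRubric binds strict=True once, so every parse it performs enforces
# the format, while all other callers keep the lenient default.
parse_boxed_strict = partial(extract_boxed_answer, strict=True)
```

Binding via `partial` keeps the default signature untouched, which is what makes the change backwards compatible for callers like `rlm_env.py`.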

Addresses Cursor bot feedback

  • "Sub-LLM responses silently dropped in RLM environment": Fixed. strict defaults to False, so rlm_env.py and all other callers get the same passthrough behavior as before.
  • "Change breaks existing valid-answer tests": Fixed. Tests now use \boxed{} completions, which is the correct format MathRubric should require.

Found using rewardprobe, a pre-training QA tool for reward functions.

Problem:
When a completion contains no \boxed{} tag, extract_boxed_answer returns
the entire input text. This is passed to math_verify, which matches any
number in the text — allowing a model to get correct-answer credit by
mentioning the answer anywhere without using \boxed{}.

During RL training, this means a model can skip the \boxed{} format
entirely and still score 1.0 by embedding the correct number in its
reasoning text. The strategy scoreboard from rewardprobe shows the
impact: "correct_lazy" (just outputting the answer) scores 1.0, while
"perfect" (full reasoning + boxed answer) scores only 0.67.

Fix:
Add a `strict` parameter to extract_boxed_answer (default: False).
When strict=True, returns "" on no match instead of the full text.
MathRubric now uses strict=True via functools.partial.

This is backwards compatible:
- extract_boxed_answer(text) still returns text (default strict=False)
- Only MathRubric's parser uses strict=True
- Other callers (rlm_env.py, etc.) are unaffected
- Tests updated to use \boxed{} format in completions

Found using rewardprobe (https://github.com/chopratejas/rewardprobe).
@mikasenghaas (Member) left a comment

i agree that strict behavior here is prob desirable but will have to verify that this doesn't create problems in other consumers of this function

@chopratejas force-pushed the fix/extract-boxed-fallback branch from 9fbc510 to 4d0b28d on March 18, 2026 at 23:14
@cursor (bot) left a comment


Cursor Bugbot has reviewed your changes and found 2 potential issues.


{"completion": "\\frac{1}{2}", "answer": "0.5"},
{"completion": "\\boxed{1}", "answer": "1"},
{"completion": "\\boxed{x + 1}", "answer": "1 + x"},
{"completion": "\\boxed{\\frac{1}{2}}", "answer": "0.5"},

Timeout test breaks with strict boxed extraction

High Severity

The test_timeout test sets completion = "1" * int(1e5) without wrapping it in \boxed{}. Now that MathRubric defaults to strict=True, extract_boxed_answer returns "" for this completion, causing correct_answer to immediately return 0.0 via the if response == "" early exit. The assertion at line 124–125 expects 1.0 when timeout_seconds == 10, but will always get 0.0, so this test will fail.
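Assuming the test shape the bot quotes, a minimal repair is to keep the 1e5-character payload but wrap it in `\boxed{}` so strict extraction still succeeds (the extractor below is an illustrative stand-in, not the repository's implementation):

```python
# Hypothetical fix for test_timeout: the stress completion keeps its
# 1e5 characters but is wrapped in \boxed{} so the strict parser
# returns the digits instead of "".
digits = "1" * int(1e5)
completion = "\\boxed{" + digits + "}"

def extract_boxed_answer(text: str, strict: bool = True) -> str:
    # Minimal strict extractor used only for this sanity check.
    start = text.rfind("\\boxed{")
    if start == -1:
        return "" if strict else text
    end = text.find("}", start)
    return text[start + 7 : end] if end != -1 else ("" if strict else text)
```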


{"completion": "\\frac{1}{2}", "answer": "0.5"},
{"completion": "\\boxed{1}", "answer": "1"},
{"completion": "\\boxed{x + 1}", "answer": "1 + x"},
{"completion": "\\boxed{\\frac{1}{2}}", "answer": "0.5"},

Invalid answer tests no longer test wrong math

Medium Severity

The test_score_invalid_answers completions ("1" and "\\frac{1}{3}") lack \boxed{} wrapping. With the new strict extraction in MathRubric, these return "" and score 0.0 due to format rejection, not because of wrong math. These tests pass for the wrong reason and no longer validate that incorrect math answers are scored as 0.0. Tests like \boxed{1} vs answer "2" are needed to actually test wrong-answer scoring.
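Concretely, the suggested cases could look like this (hypothetical data, mirroring the dict shape shown above): well-formed `\boxed{}` completions whose value disagrees with the reference answer, so the test exercises wrong-math scoring rather than format rejection.

```python
# Each completion is correctly formatted but mathematically wrong,
# so a 0.0 score can only come from answer comparison, not from the
# strict parser rejecting the format.
wrong_math_cases = [
    {"completion": "\\boxed{1}", "answer": "2"},
    {"completion": "\\boxed{\\frac{1}{3}}", "answer": "0.5"},
]
```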

