fix: use max_completion_tokens for gpt-4.1+, gpt-5.x, and o-series models#38

Open
YuHuang0525 wants to merge 1 commit into Pickle-Pixel:main from YuHuang0525:fix/openai-max-completion-tokens
Conversation

@YuHuang0525
Problem

Newer OpenAI models — gpt-4.1, gpt-5.x, o1, o3, o4 — reject the legacy max_tokens parameter with HTTP 400 and require max_completion_tokens instead. This means any user who sets LLM_MODEL=gpt-5.2 (or any other newer model) gets an immediate 400 error on every LLM call, breaking scoring, tailoring, and cover letter generation entirely.

Relevant code before this fix (llm.py _chat_compat()):

payload = {
    "model": self.model,
    "messages": messages,
    "temperature": temperature,
    "max_tokens": max_tokens,   # ← rejected by gpt-4.1+, gpt-5.x, o-series
}

Fix

Detect the model prefix at call time and send the correct parameter:

_new_param_models = ("gpt-4.1", "gpt-5", "o1", "o3", "o4")
if any(self.model.startswith(p) for p in _new_param_models):
    token_param = {"max_completion_tokens": max_tokens}
else:
    token_param = {"max_tokens": max_tokens}

  • All other providers (Gemini compat, Gemini native, local/Ollama) are unaffected — they continue using max_tokens as before.
  • The native Gemini path already uses maxOutputTokens and is untouched.
  • No behaviour change for existing gpt-4o, gpt-4o-mini, or local model users.
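Putting the detection together with the payload, the fix can be sketched as follows. This is a minimal, self-contained illustration of the approach described above; build_payload is a hypothetical helper name, not code from the project.

```python
# Prefixes of OpenAI models that require max_completion_tokens,
# per the PR description (gpt-4.1+, gpt-5.x, and the o-series).
NEW_PARAM_MODELS = ("gpt-4.1", "gpt-5", "o1", "o3", "o4")

def build_payload(model, messages, temperature, max_tokens):
    """Assemble a chat-completions payload with the correct token parameter.

    Newer models reject the legacy max_tokens parameter with HTTP 400
    and expect max_completion_tokens; older models keep max_tokens.
    """
    if any(model.startswith(p) for p in NEW_PARAM_MODELS):
        token_param = {"max_completion_tokens": max_tokens}
    else:
        token_param = {"max_tokens": max_tokens}
    return {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        **token_param,
    }
```

Note that prefix matching also covers dated or suffixed variants (e.g. a hypothetical "gpt-5.2-mini" or "o3-mini"), while "gpt-4o" does not match the "gpt-4.1" prefix and so keeps the legacy parameter.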

Testing

Verified manually with gpt-5.2 — the 400 error is resolved and completions return successfully after this change. No existing automated tests cover _chat_compat() directly.
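Although _chat_compat() itself talks to the live API, the parameter-selection rule is pure and could be covered by a small unit test. A sketch, with uses_max_completion_tokens as an illustrative name rather than existing project code:

```python
# Hypothetical unit test for the prefix rule extracted from the fix.
NEW_PARAM_MODELS = ("gpt-4.1", "gpt-5", "o1", "o3", "o4")

def uses_max_completion_tokens(model: str) -> bool:
    """True if the model requires max_completion_tokens instead of max_tokens."""
    return any(model.startswith(p) for p in NEW_PARAM_MODELS)

CASES = {
    "gpt-5.2": True,       # the model verified manually in this PR
    "gpt-4.1-mini": True,
    "o3-mini": True,
    "gpt-4o": False,       # legacy parameter still accepted
    "gpt-4o-mini": False,
}

for model, expected in CASES.items():
    assert uses_max_completion_tokens(model) is expected, model
```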

Commit message: fix: use max_completion_tokens for gpt-4.1+, gpt-5.x, and o-series models

Newer OpenAI models (gpt-4.1+, gpt-5.x, o1, o3, o4) reject the legacy
max_tokens parameter with HTTP 400 and require max_completion_tokens instead.

_chat_compat() now detects the model prefix at call time and sends the
correct parameter, while all other providers (Gemini, local) continue
using max_tokens unchanged.
