
docs(snowflake): add REST API pricing for 18 Snowflake Cortex models#25707

Open
sfc-gh-ridalemkaalel wants to merge 1 commit into BerriAI:main from sfc-gh-ridalemkaalel:add-snowflake-model-pricing

Conversation

@sfc-gh-ridalemkaalel

Summary

  • Adds pricing entries for 18 Snowflake Cortex models from Table 6(b) of the Snowflake Service Consumption Table — REST API with Prompt Caching, using Regional pricing throughout
  • 9 Claude models (AWS Regional): claude-3-7-sonnet, claude-4-opus, claude-4-sonnet, claude-haiku-4-5, claude-opus-4-5, claude-opus-4-6, claude-sonnet-4-5, claude-sonnet-4-5-long-context, claude-sonnet-4-6 — all include cache_creation_input_token_cost and cache_read_input_token_cost
  • 9 OpenAI models (Azure Regional): openai-gpt-4.1, openai-gpt-5, openai-gpt-5-mini, openai-gpt-5-nano, openai-gpt-5.1, openai-gpt-5.2, openai-gpt-5.4, openai-gpt-5.4-long-context, openai-o4-mini — all include cache_read_input_token_cost (no cache write cost per the table)
  • Both model_prices_and_context_window.json and litellm/model_prices_and_context_window_backup.json are updated in sync (backup brought in line with root, which is what ci_cd/check_files_match.py enforces)

Test plan

  • Added tests/test_litellm/llms/snowflake/test_snowflake_pricing.py with 6 tests:
    • test_snowflake_models_exist — all 18 models present in the JSON
    • test_snowflake_models_have_correct_provider — litellm_provider == "snowflake" for all
    • test_snowflake_models_have_positive_pricing — input_cost_per_token and output_cost_per_token are positive
    • test_snowflake_claude_models_have_prompt_caching_fields — Claude models have both cache write and read costs, and supports_prompt_caching: true
    • test_snowflake_openai_models_have_cache_read_but_no_cache_write — OpenAI models have cache read cost but no cache write cost
    • test_snowflake_models_have_context_window — all models define max_input_tokens, max_output_tokens, max_tokens
  • Both JSON files validated with python3 -c "import json; json.load(open(...))"
  • Test file passes black formatting check and ruff linting
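
A sketch of what the Claude prompt-caching check described above might look like. This is a simplified illustration, not the actual test file: the real test reads model_prices_and_context_window.json, and the dollar figures in the sample entry below are placeholders, not Snowflake's published rates.

```python
# Hypothetical, simplified version of
# test_snowflake_claude_models_have_prompt_caching_fields. The real test
# loads model_prices_and_context_window.json; a sample entry with
# placeholder prices stands in here.
sample_prices = {
    "snowflake/claude-sonnet-4-5": {
        "litellm_provider": "snowflake",
        "input_cost_per_token": 3.3e-06,            # placeholder value
        "output_cost_per_token": 1.65e-05,          # placeholder value
        "cache_creation_input_token_cost": 4.125e-06,
        "cache_read_input_token_cost": 3.3e-07,
        "supports_prompt_caching": True,
    }
}

def check_claude_prompt_caching(prices: dict) -> None:
    claude_models = [m for m in prices if m.startswith("snowflake/claude-")]
    assert claude_models, "expected at least one snowflake/claude-* entry"
    for model in claude_models:
        entry = prices[model]
        # Claude entries must carry both cache write and cache read costs...
        assert entry.get("cache_creation_input_token_cost", 0) > 0, model
        assert entry.get("cache_read_input_token_cost", 0) > 0, model
        # ...and advertise prompt caching support.
        assert entry.get("supports_prompt_caching") is True, model

check_claude_prompt_caching(sample_prices)
```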

Generated with Cortex Code

feat(snowflake): add REST API pricing for 18 Snowflake Cortex models

    Adds pricing entries for 18 Snowflake models from Table 6(b) of the
    Snowflake Service Consumption Table (REST API with Prompt Caching),
    using Regional pricing throughout.

    - 9 Claude models (AWS Regional): claude-3-7-sonnet, claude-4-opus,
      claude-4-sonnet, claude-haiku-4-5, claude-opus-4-5, claude-opus-4-6,
      claude-sonnet-4-5, claude-sonnet-4-5-long-context, claude-sonnet-4-6

    - 9 OpenAI models (Azure Regional): openai-gpt-4.1, openai-gpt-5,
      openai-gpt-5-mini, openai-gpt-5-nano, openai-gpt-5.1, openai-gpt-5.2,
      openai-gpt-5.4, openai-gpt-5.4-long-context, openai-o4-mini
@vercel

vercel bot commented Apr 14, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project | Status | Actions          | Updated (UTC)
litellm | Ready  | Preview, Comment | Apr 14, 2026 7:54pm


@codspeed-hq

codspeed-hq bot commented Apr 14, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing sfc-gh-ridalemkaalel:add-snowflake-model-pricing (34af0de) with main (0e43050)

Open in CodSpeed

@codecov

codecov bot commented Apr 14, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.


@greptile-apps

greptile-apps bot commented Apr 14, 2026

Greptile Summary

This PR adds pricing entries for 18 Snowflake Cortex REST API models (9 Claude via AWS Regional, 9 OpenAI via Azure Regional) to both model_prices_and_context_window.json and its backup, and adds a new test file that validates the entries. No existing entries are modified.

Confidence Score: 4/5

Safe to merge after addressing the missing supports_prompt_caching flag on OpenAI models

One P2 data inconsistency: the 9 OpenAI Snowflake models have cache_read_input_token_cost but omit supports_prompt_caching: true, causing litellm.supports_prompt_caching() to return False for them. This is a real API-correctness gap. A second P2 asks for confirmation that the identical pricing of gpt-5 and gpt-5.1 is intentional. Both JSON files are in sync and the test file is well-structured.

model_prices_and_context_window.json — the 9 OpenAI Snowflake model entries need supports_prompt_caching: true added

Important Files Changed

Filename Overview
model_prices_and_context_window.json Adds 18 new snowflake model entries; OpenAI models include cache_read_input_token_cost but omit supports_prompt_caching: true, inconsistent with the existing pattern for OpenAI models
litellm/model_prices_and_context_window_backup.json Kept in sync with root JSON; same 18 entries added identically
tests/test_litellm/llms/snowflake/test_snowflake_pricing.py New test file; reads local JSON (no network calls), covers existence, provider, pricing, caching fields, and context window for all 18 models
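
The sync requirement between the root JSON and litellm/model_prices_and_context_window_backup.json can be sketched as a simple parsed-equality check. This is an illustration of the idea, not the actual ci_cd/check_files_match.py:

```python
import json
from pathlib import Path

def files_match(root_path: str, backup_path: str) -> bool:
    """Return True if the two pricing JSON files parse to identical data.

    Illustrative stand-in for the kind of check ci_cd/check_files_match.py
    enforces; comparing parsed objects ignores whitespace-only differences.
    """
    root = json.loads(Path(root_path).read_text())
    backup = json.loads(Path(backup_path).read_text())
    return root == backup
```

Comparing parsed objects rather than raw bytes means a reformatted-but-identical backup still passes, which keeps the check focused on the data the library actually loads.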

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Snowflake Cortex REST API] --> B{Model Family}
    B --> C[Claude Models x9\nAWS Regional]
    B --> D[OpenAI Models x9\nAzure Regional]
    C --> E[input_cost_per_token\noutput_cost_per_token\ncache_creation_input_token_cost\ncache_read_input_token_cost\nsupports_prompt_caching: true]
    D --> F[input_cost_per_token\noutput_cost_per_token\ncache_read_input_token_cost\n⚠️ supports_prompt_caching: missing]
    E --> G[model_prices_and_context_window.json\n+ backup]
    F --> G

Reviews (1): Last reviewed commit: "feat(snowflake): add REST API pricing fo..."

Comment on lines +27942 to +27953
"snowflake/openai-gpt-4.1": {
"litellm_provider": "snowflake",
"max_input_tokens": 1047576,
"max_output_tokens": 32768,
"max_tokens": 32768,
"mode": "chat",
"input_cost_per_token": 2.2e-06,
"output_cost_per_token": 8.8e-06,
"cache_read_input_token_cost": 5.5e-07,
"supports_function_calling": true,
"supports_vision": true
},


P2 Missing supports_prompt_caching flag on OpenAI Snowflake models

All 9 snowflake/openai-* entries carry cache_read_input_token_cost but none set "supports_prompt_caching": true. Every direct OpenAI model with cache pricing (e.g. gpt-4o-mini) includes this flag, and litellm.supports_prompt_caching("snowflake/openai-gpt-5") will return False as a result. The same pattern applies to all 9 OpenAI Snowflake entries.

Suggested change (adds "supports_prompt_caching": true):

  "snowflake/openai-gpt-4.1": {
    "litellm_provider": "snowflake",
    "max_input_tokens": 1047576,
    "max_output_tokens": 32768,
    "max_tokens": 32768,
    "mode": "chat",
    "input_cost_per_token": 2.2e-06,
    "output_cost_per_token": 8.8e-06,
    "cache_read_input_token_cost": 5.5e-07,
+   "supports_prompt_caching": true,
    "supports_function_calling": true,
    "supports_vision": true
  },
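
The behavior the reviewer describes comes down to a flag lookup with a falsy default. A minimal illustration (not litellm's actual implementation) of why omitting the flag makes a capability check return False even when cache pricing is present:

```python
# Illustration only: litellm's real supports_prompt_caching() consults its
# model cost map, but the essential behavior is a .get() with a False default.
entry_without_flag = {
    "cache_read_input_token_cost": 1.4e-07,  # cache pricing is present...
    "supports_function_calling": True,        # ...and other flags are set
}
entry_with_flag = dict(entry_without_flag, supports_prompt_caching=True)

def supports_prompt_caching(entry: dict) -> bool:
    # Having a cache-read cost alone is not enough; the boolean flag decides.
    return entry.get("supports_prompt_caching", False) is True

print(supports_prompt_caching(entry_without_flag))  # False
print(supports_prompt_caching(entry_with_flag))     # True
```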

Comment on lines +27954 to +27966
"snowflake/openai-gpt-5": {
"litellm_provider": "snowflake",
"max_input_tokens": 272000,
"max_output_tokens": 128000,
"max_tokens": 128000,
"mode": "chat",
"input_cost_per_token": 1.38e-06,
"output_cost_per_token": 1.1e-05,
"cache_read_input_token_cost": 1.4e-07,
"supports_function_calling": true,
"supports_vision": true,
"supports_reasoning": true
},


P2 Identical pricing for openai-gpt-5 and openai-gpt-5.1

snowflake/openai-gpt-5 and snowflake/openai-gpt-5.1 share exactly the same input_cost_per_token ($1.38e-06), output_cost_per_token ($1.1e-05), and cache_read_input_token_cost ($1.4e-07). Is this intentional, i.e. are the two models genuinely the same price tier in Table 6(b), or is one of the entries a copy-paste error that should have different values?
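
One way to confirm or rule out the copy-paste hypothesis is to scan the new entries for identical pricing tuples. A hypothetical helper (the function name and grouping key are illustrative, not part of litellm):

```python
from collections import defaultdict

def find_identical_pricing(prices: dict, prefix: str = "snowflake/"):
    """Group model names under `prefix` by their (input, output, cache-read)
    cost tuple; any group with more than one member shares identical pricing."""
    groups = defaultdict(list)
    for model, entry in prices.items():
        if not model.startswith(prefix):
            continue
        key = (
            entry.get("input_cost_per_token"),
            entry.get("output_cost_per_token"),
            entry.get("cache_read_input_token_cost"),
        )
        groups[key].append(model)
    return [sorted(models) for models in groups.values() if len(models) > 1]
```

Run against the parsed model_prices_and_context_window.json, any returned group flags models to double-check against Table 6(b).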

@sfc-gh-ridalemkaalel changed the title from "feat(snowflake): add REST API pricing for 18 Snowflake Cortex models" to "docs(snowflake): add REST API pricing for 18 Snowflake Cortex models" on Apr 14, 2026