[sharktank] Llama 3.1 f16 HF import presets #2290
Conversation
"--preset=meta_llama3_1_8b_instruct_f16", | ||
] | ||
) | ||
assert irpa_path.exists() |
We should verify these files in some form instead of only checking that they exist, in case of a bad download or something similar. Maybe have an md5sum that we can compare against?
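For reference, a minimal sketch of what such a check could look like in the test, assuming a digest recorded alongside the preset (the `EXPECTED_IRPA_SHA256` constant is hypothetical, and SHA-256 is used here rather than MD5 purely as an example):

```python
import hashlib
from pathlib import Path

# Hypothetical digest recorded when the preset was added; not a real value.
EXPECTED_IRPA_SHA256 = "<recorded sha256 hex digest>"


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    # Stream the file in chunks so large IRPA files are not read into memory at once.
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


assert irpa_path.exists()
assert file_sha256(irpa_path) == EXPECTED_IRPA_SHA256
```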
It is possible that it fails silently in such a way. It would be pretty sad if the HF hub package failed there, since robust downloading is, I would assume, a major goal for it.
I actually have not decided yet what our attitude towards changing IRPA files should be. Should we force a strict manual hash update, meaning that we make the explicit assumption that the file does not change?
It may be a problem, for example, if we add another field to the model config, which would change the IRPA metadata and its hash.
This works for now. But as Ian pointed out, if there are more reliable ways to verify that the IRPA generation was complete/successful, that would be great.
What I think should ultimately happen is to run a model import job before running the CI test jobs.
Another option is to run a nightly job that imports from HF and then uploads to Azure, so that other runners can update their model cache. This has the problem that we may overwrite existing model files with faulty ones if a bug appears. In this scenario a more thorough model validation would be needed before uploading.
    )
)
parser.add_argument(
    "--output-irpa-file",
I would hesitate to add this flag as we currently cannot directly consume it. sharktank expects naming conventions from the GGUF format, not Hugging Face.
Where are we supposed to consume it in GGUF format? The converter would do on-the-fly conversion to our format, which is derived from GGUF. We save only in IRPA. We can't save in GGUF; we can only read it as a sharktank.types.Dataset.
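To illustrate the flow described here, a minimal sketch, assuming `Dataset.load` accepts both GGUF and IRPA paths and `Dataset.save` writes IRPA (the exact method signatures and the file paths are assumptions for illustration, not a confirmed API):

```python
from sharktank.types import Dataset

# Read side: a GGUF file (or an already-imported IRPA) is loaded as a Dataset.
# Assumption: Dataset.load dispatches on the file type.
dataset = Dataset.load("/models/llama3_1_8b_instruct_f16.gguf")

# Write side: the only serialization target is IRPA; there is no GGUF writer.
dataset.save("/models/llama3_1_8b_instruct_f16.irpa")
```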
Add registry for import presets and populate it with Llama 3.1 f16 models.
Add support for referencing a HF dataset during import. This decouples the specification of what needs to be downloaded from how to import the dataset after it is downloaded.
Expand the HF datasets to specify the model files not completely explicitly, but by using filters like `huggingface_hub.snapshot_download`.
Next steps are:
1. Make the CI use this new mechanism.
2. Add importation of models with more complicated transformations like quantization.
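As an illustration of the filter-based file selection mentioned above, a minimal sketch using `huggingface_hub.snapshot_download` with `allow_patterns` (the repo id and patterns are examples, not the exact values used by the presets):

```python
from huggingface_hub import snapshot_download

# Select files by pattern instead of listing every file explicitly.
# Repo id and patterns are illustrative only.
local_dir = snapshot_download(
    repo_id="meta-llama/Llama-3.1-8B-Instruct",
    allow_patterns=[
        "*.safetensors",
        "config.json",
        "tokenizer.json",
        "tokenizer_config.json",
    ],
)
print(local_dir)
```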