Unnecessary Download of Model Weights using Rclone during Performance Test (Existing Weights Already Available) #640

When running the MLPerf inference benchmark with **pre-downloaded model weights**, the `mlcr run-mlperf` script still attempts to download the model weights with **rclone** instead of using the provided local path. This makes the run fail unnecessarily when the remote storage is unavailable.

---

### Steps to Reproduce

1. **Set environment variables:**

```bash
export MLC_SCRIPT_EXTRA_CMD="--adr.python.name=mlperf"
export HF_TOKEN="huggingface-api-access-token"
export CHECKPOINT_PATH="/work/models/llama31_8b"
export DATASET_PATH="/work/dataset/cnndm"
```

2. **Download model weights & dataset (pre-downloaded successfully):**

```bash
mlcr get,ml-model,llama3,_meta-llama/Llama-3.1-8B-Instruct,_hf \
  --outdirname=${CHECKPOINT_PATH} --hf_token=${HF_TOKEN} -j --skip_system_deps

mlcr get,dataset,cnndm,_validation,_datacenter,_llama3,_mlc,_r2-downloader \
  --outdirname=$DATASET_PATH -j --skip_system_deps
```

3. **Run benchmark with local paths:**

```bash
mlcr run-mlperf,inference,_r5.1-dev,_performance-only \
    --model=llama3_1-8b \
    --use_model_from_host=${CHECKPOINT_PATH}/repo \
    --use_dataset_from_host=${DATASET_PATH}/llama3-1-8b-cnn-eval.uri/cnn_eval.json \
    --implementation=reference \
    --framework=vllm \
    --vllm_tp_size=8 \
    --category=datacenter \
    --scenario=Offline \
    --execution_mode=valid \
    --device=cuda \
    --threads=8 \
    --quiet \
    --precision=bfloat16 \
    --test_query_count=100 \
    --batch_size=16 \
    --results_dir=/work/result18 \
    --hf_token $HF_TOKEN \
    --skip_system_deps=True
```

---

### Expected Behavior

If `--use_model_from_host` is specified and the weights already exist locally, the benchmark should **skip the rclone download** and use the local path directly.

---

### Actual Behavior

The script **always triggers rclone sync**:

```log
Downloading: rclone sync 'mlc-llama3-1:inference/Llama-3.1-8b-Instruct' ...
2025/09/18 09:29:47 ERROR : Google drive root 'inference/Llama-3.1-8b-Instruct': directory not found
...
mlc.script_action.ScriptExecutionError: Script run execution failed. Error : MLC script failed (name = download-file, return code = 768)
```

This happens even though the model weights are already present in `${CHECKPOINT_PATH}/repo`.

---

### Environment

* **mlcflow version:** 1.1.1
* **Python version:** 3.12
* **Framework:** vLLM
* **GPU:** 8× H100
* **OS/Distro:** Ubuntu 24.04.1 LTS

---

### Suggested Fix

* Add a **skip-download mechanism** when `--use_model_from_host` is passed.
* Check local path existence before attempting `rclone sync`.
* Alternatively, provide a dedicated flag like `--disable_remote_download`.
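
The suggestions above could be combined roughly as follows. This is only a sketch of the intended control flow, not existing mlcflow code: the function name `resolve_model_path`, the `download` callback (standing in for whatever currently runs `rclone sync`), and the `disable_remote_download` parameter are all hypothetical names chosen for illustration.

```python
import os
from typing import Callable, Optional


def resolve_model_path(
    host_model_path: Optional[str],
    download: Callable[[], str],
    disable_remote_download: bool = False,
) -> str:
    """Return a usable model directory, downloading only when necessary.

    All names here are hypothetical; `download` stands in for the step
    that currently triggers `rclone sync` and returns the download path.
    """
    # Prefer the directory passed via --use_model_from_host when it
    # exists and actually contains files.
    if host_model_path:
        path = os.path.abspath(os.path.expanduser(host_model_path))
        if os.path.isdir(path) and os.listdir(path):
            return path

    # No usable local copy: respect an explicit opt-out of remote access
    # and fail with a clear message instead of an rclone error.
    if disable_remote_download:
        raise RuntimeError(
            "No usable local model directory and remote download is disabled: "
            f"{host_model_path!r}"
        )

    # Fall back to the remote download only as a last resort.
    return download()
```

With a guard like this, the run in step 3 would return `${CHECKPOINT_PATH}/repo` without ever invoking the download step, and a missing or empty directory would either fall back to the download or fail early, depending on the flag.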

