Description
Ran the script below (from the docs link). The Docker container built, but dependency installation failed: there is no installable nvidia-ammo package, yet it is listed in requirements.txt.
https://docs.mlcommons.org/inference/benchmarks/language/llama2-70b/#__tabbed_1_2
Script
mlcr run-mlperf,inference,_find-performance,_full,_r5.0-dev \
  --model=llama2-70b-99 \
  --implementation=nvidia \
  --framework=tensorrt \
  --category=datacenter \
  --scenario=Offline \
  --execution_mode=test \
  --device=cuda \
  --docker --quiet \
  --test_query_count=50 \
  --tp_size=4 \
  --nvidia_llama2_dataset_file_path=/home/nvidia/open_orca/open_orca_gpt4_tokenized_llama.sampled_24576.pkl --rerun
Error
Requirement already satisfied: torch<=2.2.0a in /home/mlcuser/.local/lib/python3.8/site-packages (from -r requirements.txt (line 16)) (2.1.0a0+git32f93b1)
Collecting nvidia-ammo~=0.7.0
Downloading nvidia-ammo-0.7.4.tar.gz (6.9 kB)
ERROR: Command errored out with exit status 1:
command: /usr/bin/python3 -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-o3jj2u4v/nvidia-ammo/setup.py'"'"'; file='"'"'/tmp/pip-install-o3jj2u4v/nvidia-ammo/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-install-o3jj2u4v/nvidia-ammo/pip-egg-info
cwd: /tmp/pip-install-o3jj2u4v/nvidia-ammo/
Complete output (5 lines):
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-install-o3jj2u4v/nvidia-ammo/setup.py", line 90, in
raise RuntimeError("Bad params")
RuntimeError: Bad params
----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
Traceback (most recent call last):
File "scripts/build_wheel.py", line 332, in
main(**vars(args))
File "scripts/build_wheel.py", line 68, in main
build_run(
File "/usr/lib/python3.8/subprocess.py", line 516, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '"/usr/bin/python3" -m pip install -r requirements-dev.txt --extra-index-url https://pypi.ngc.nvidia.com' returned non-zero exit status 1.
make[1]: *** [Makefile.build:307: build_trt_llm] Error 1
make[1]: Leaving directory '/home/mlcuser/MLC/repos/local/cache/get-git-repo_mlperf-inferenc_6b3f63c5/repo/closed/NVIDIA'
make: *** [/home/mlcuser/MLC/repos/local/cache/get-git-repo_mlperf-inferenc_6b3f63c5/repo/closed/NVIDIA/Makefile.build:181: build] Error 2
Traceback (most recent call last):
File "/home/mlcuser/.local/bin/mlcr", line 8, in
sys.exit(mlcr())
File "/home/mlcuser/.local/lib/python3.8/site-packages/mlc/main.py", line 87, in mlcr
main()
File "/home/mlcuser/.local/lib/python3.8/site-packages/mlc/main.py", line 274, in main
res = method(run_args)
File "/home/mlcuser/.local/lib/python3.8/site-packages/mlc/script_action.py", line 316, in run
return self.call_script_module_function("run", run_args)
File "/home/mlcuser/.local/lib/python3.8/site-packages/mlc/script_action.py", line 230, in call_script_module_function
result = automation_instance.run(run_args) # Pass args to the run method
File "/home/mlcuser/MLC/repos/mlcommons@mlperf-automations/automation/script/module.py", line 240, in run
r = self._run(i)
File "/home/mlcuser/MLC/repos/mlcommons@mlperf-automations/automation/script/module.py", line 1854, in _run
r = self._call_run_deps(prehook_deps, self.local_env_keys, local_env_keys_from_meta, env, state, const, const_state, add_deps_recursive,
File "/home/mlcuser/MLC/repos/mlcommons@mlperf-automations/automation/script/module.py", line 3362, in _call_run_deps
r = script._run_deps(deps, local_env_keys, env, state, const, const_state, add_deps_recursive, recursion_spaces,
File "/home/mlcuser/MLC/repos/mlcommons@mlperf-automations/automation/script/module.py", line 3554, in _run_deps
r = self.action_object.access(ii)
File "/home/mlcuser/.local/lib/python3.8/site-packages/mlc/action.py", line 56, in access
result = method(options)
File "/home/mlcuser/.local/lib/python3.8/site-packages/mlc/script_action.py", line 316, in run
return self.call_script_module_function("run", run_args)
File "/home/mlcuser/.local/lib/python3.8/site-packages/mlc/script_action.py", line 230, in call_script_module_function
result = automation_instance.run(run_args) # Pass args to the run method
File "/home/mlcuser/MLC/repos/mlcommons@mlperf-automations/automation/script/module.py", line 240, in run
r = self._run(i)
File "/home/mlcuser/MLC/repos/mlcommons@mlperf-automations/automation/script/module.py", line 1638, in _run
r = self._call_run_deps(deps, self.local_env_keys, local_env_keys_from_meta, env, state, const, const_state, add_deps_recursive,
File "/home/mlcuser/MLC/repos/mlcommons@mlperf-automations/automation/script/module.py", line 3362, in _call_run_deps
r = script._run_deps(deps, local_env_keys, env, state, const, const_state, add_deps_recursive, recursion_spaces,
File "/home/mlcuser/MLC/repos/mlcommons@mlperf-automations/automation/script/module.py", line 3554, in _run_deps
r = self.action_object.access(ii)
File "/home/mlcuser/.local/lib/python3.8/site-packages/mlc/action.py", line 56, in access
result = method(options)
File "/home/mlcuser/.local/lib/python3.8/site-packages/mlc/script_action.py", line 316, in run
return self.call_script_module_function("run", run_args)
File "/home/mlcuser/.local/lib/python3.8/site-packages/mlc/script_action.py", line 248, in call_script_module_function
raise ScriptExecutionError(f"Script {function_name} execution failed. Error : {error}")
mlc.script_action.ScriptExecutionError: Script run execution failed. Error : MLC script failed (name = build-mlperf-inference-server-nvidia, return code = 256)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Please file an issue at https://github.com/mlcommons/mlperf-automations/issues along with the full MLC command being run and the relevant
or full console log.
#######################################################
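For what it's worth, the RuntimeError("Bad params") comes from the nvidia-ammo sdist downloaded from the public PyPI, which looks like a placeholder; the actual wheels seem to live on NVIDIA's own index (pypi.nvidia.com, which requirements.txt already references as an extra index). A hedged sketch of the manual install I would try inside the container — the version pin matches the log above, but I have not confirmed this resolves the build:

```shell
# Sketch only: the pip command I would try inside the container, built
# as a string here so it can be inspected before running. nvidia-ammo~=0.7.0
# matches the pin in requirements.txt; pypi.nvidia.com is the extra index
# that same file already lists.
CMD='python3 -m pip install "nvidia-ammo~=0.7.0" --extra-index-url https://pypi.nvidia.com'
echo "$CMD"   # run this manually inside the container
```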
Action
Edited requirements.txt (which is created inside the container) to comment out nvidia-ammo (as #nvidia-ammo), but the script regenerates the file, so the build fails again the same way when the Offline script is re-run.
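For reference, the manual edit was equivalent to the following (demonstrated here on a scratch copy; the real file is the requirements.txt under build/TRTLLM shown below):

```shell
# Demonstrate the edit on a scratch copy of requirements.txt:
# prefix the nvidia-ammo line with '#' so pip skips it.
cat > /tmp/requirements.txt <<'EOF'
torch<=2.2.0a
nvidia-ammo~=0.7.0
transformers==4.36.1
EOF

# Same effect as the manual edit described above.
sed -i 's/^nvidia-ammo/#nvidia-ammo/' /tmp/requirements.txt

grep nvidia-ammo /tmp/requirements.txt   # now prints '#nvidia-ammo~=0.7.0'
```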
mlcuser@0a8c7a0385a8:~$ cd /home/mlcuser/MLC/repos/local/cache/get-git-repo_mlperf-inferenc_6b3f63c5/repo/closed/NVIDIA/build/TRTLLM
mlcuser@0a8c7a0385a8:/home/mlcuser/MLC/repos/local/cache/get-git-repo_mlperf-inferenc_6b3f63c5/repo/closed/NVIDIA/build/TRTLLM$ ls -al
total 108
drwxr-xr-x 14 mlcuser mlc 4096 Sep 10 18:03 .
drwxr-xr-x 5 mlcuser mlc 133 Sep 10 17:59 ..
-rw-r--r-- 1 mlcuser mlc 2356 Sep 10 18:02 .clang-format
-rw-r--r-- 1 mlcuser mlc 215 Sep 10 18:02 .dockerignore
drwxr-xr-x 10 mlcuser mlc 4096 Sep 10 18:03 .git
-rw-r--r-- 1 mlcuser mlc 40 Sep 10 18:02 .gitattributes
drwxr-xr-x 3 mlcuser mlc 36 Sep 10 18:02 .github
-rw-r--r-- 1 mlcuser mlc 313 Sep 10 18:02 .gitignore
-rw-r--r-- 1 mlcuser mlc 404 Sep 10 18:02 .gitmodules
-rw-r--r-- 1 mlcuser mlc 1494 Sep 10 18:02 .pre-commit-config.yaml
drwxr-xr-x 6 mlcuser mlc 80 Sep 10 18:02 3rdparty
-rw-r--r-- 1 mlcuser mlc 5646 Sep 10 18:02 CHANGELOG.md
-rw-r--r-- 1 mlcuser mlc 11358 Sep 10 18:00 LICENSE
-rw-r--r-- 1 mlcuser mlc 20362 Sep 10 18:02 README.md
drwxr-xr-x 4 mlcuser mlc 43 Sep 10 18:02 benchmarks
drwxr-xr-x 6 mlcuser mlc 113 Sep 10 18:02 cpp
drwxr-xr-x 3 mlcuser mlc 103 Sep 10 18:02 docker
drwxr-xr-x 3 mlcuser mlc 136 Sep 10 18:02 docs
drwxr-xr-x 33 mlcuser mlc 4096 Sep 10 18:02 examples
-rw-r--r-- 1 mlcuser mlc 261 Sep 10 18:02 requirements-dev-windows.txt
-rw-r--r-- 1 mlcuser mlc 211 Sep 10 18:02 requirements-dev.txt
-rw-r--r-- 1 mlcuser mlc 530 Sep 10 18:02 requirements-windows.txt
-rw-r--r-- 1 mlcuser mlc 447 Sep 10 18:03 requirements.txt
drwxr-xr-x 2 mlcuser mlc 59 Sep 10 18:02 scripts
-rw-r--r-- 1 mlcuser mlc 73 Sep 10 18:02 setup.cfg
-rw-r--r-- 1 mlcuser mlc 4067 Sep 10 18:02 setup.py
drwxr-xr-x 10 mlcuser mlc 4096 Sep 10 18:02 tensorrt_llm
drwxr-xr-x 12 mlcuser mlc 4096 Sep 10 18:02 tests
drwxr-xr-x 4 mlcuser mlc 125 Sep 10 18:02 windows
###############################################
mlcuser@0a8c7a0385a8:/MLC/repos/local/cache/get-git-repo_mlperf-inferenc_6b3f63c5/repo/closed/NVIDIA/build/TRTLLM$ cat requirements.txt
--extra-index-url https://download.pytorch.org/whl/cu121
--extra-index-url https://pypi.nvidia.com
accelerate==0.25.0
build
colored
cuda-python # Do not override the custom version of cuda-python installed in the NGC PyTorch image.
diffusers==0.15.0
lark
mpi4py
numpy
onnx>=1.12.0
polygraphy
psutil
pynvml>=11.5.0
sentencepiece>=0.1.99
torch<=2.2.0a
#nvidia-ammo~=0.7.0; platform_machine=="x86_64"
transformers==4.36.1
wheel
optimum
evaluate
janus
How do I get an interactive shell in the Docker container, so I can debug the build before the script overwrites my changes?
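In case it helps: the generic Docker approach I know of (not MLC-specific — I don't know whether mlcr exposes a dedicated flag for this) is to attach a shell to the already-running container with docker exec:

```shell
# Generic approach: find the container ID with `docker ps`, then attach
# an interactive shell. 0a8c7a0385a8 is the ID from the prompt above;
# substitute whatever `docker ps` shows on your machine.
CONTAINER_ID=0a8c7a0385a8
ATTACH_CMD="docker exec -it $CONTAINER_ID bash"
echo "$ATTACH_CMD"   # run this in a host terminal while the container is up
```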