Cannot resolve neuronx-cc>=2.0.0a0 when installing vllm using DLAMI and latest upstreaming-to-vllm #1212

@tom-carbontrail

Hardware Used

inf2.24xlarge

Training/Inference

Inference

Release Artifacts

upstreaming-to-vllm 2.25 (same in neuron-2.25 branch too)

Model Type

Not applicable; the installation did not get that far.

Description

When installing requirements/neuron.txt, pip cannot settle on a version for neuronx-cc>=2.0.0a0. It backtracks through every available release, downloading each one in turn, apparently to satisfy some internal requirement:

Collecting neuronx-cc>=2.0.0a0
  Using cached https://pip.repos.neuron.amazonaws.com/neuronx-cc/neuronx_cc-2.16.345.0%2B69131dd3-cp310-cp310-linux_x86_64.whl (580.8 MB)
  Using cached https://pip.repos.neuron.amazonaws.com/neuronx-cc/neuronx_cc-2.15.143.0%2Be39249ad-cp310-cp310-linux_x86_64.whl (466.4 MB)
  Using cached https://pip.repos.neuron.amazonaws.com/neuronx-cc/neuronx_cc-2.15.141.0%2Bd3cfc8ca-cp310-cp310-linux_x86_64.whl (466.5 MB)
  Using cached https://pip.repos.neuron.amazonaws.com/neuronx-cc/neuronx_cc-2.15.128.0%2B56dc5a86-cp310-cp310-linux_x86_64.whl (466.4 MB)
  Using cached https://pip.repos.neuron.amazonaws.com/neuronx-cc/neuronx_cc-2.14.227.0%2B2d4f85be-cp310-cp310-linux_x86_64.whl (392.3 MB)
  Using cached https://pip.repos.neuron.amazonaws.com/neuronx-cc/neuronx_cc-2.14.213.0%2B013d129b-cp310-cp310-linux_x86_64.whl (392.3 MB)
  Using cached https://pip.repos.neuron.amazonaws.com/neuronx-cc/neuronx_cc-2.13.72.0%2B78a426937-cp310-cp310-linux_x86_64.whl (350.5 MB)
Collecting islpy<=2023.1,>2021.1
  Downloading islpy-2023.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (11.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.3/11.3 MB 3.1 MB/s eta 0:00:00
Collecting networkx
  Downloading networkx-2.6.3-py3-none-any.whl (1.9 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.9/1.9 MB 135.1 MB/s eta 0:00:00
Collecting neuronx-cc>=2.0.0a0
  Using cached https://pip.repos.neuron.amazonaws.com/neuronx-cc/neuronx_cc-2.13.68.0%2B6dfecc895-cp310-cp310-linux_x86_64.whl (350.2 MB)
  Using cached https://pip.repos.neuron.amazonaws.com/neuronx-cc/neuronx_cc-2.13.66.0%2B6dfecc895-cp310-cp310-linux_x86_64.whl (350.2 MB)
  Using cached https://pip.repos.neuron.amazonaws.com/neuronx-cc/neuronx_cc-2.12.68.0%2B4480452af-cp310-cp310-linux_x86_64.whl (374.6 MB)
  Using cached https://pip.repos.neuron.amazonaws.com/neuronx-cc/neuronx_cc-2.12.54.0%2Bf631c2365-cp310-cp310-linux_x86_64.whl (374.6 MB)
  Downloading https://pip.repos.neuron.amazonaws.com/neuronx-cc/neuronx_cc-2.11.0.35%2B4f5279863-cp310-cp310-linux_x86_64.whl (369.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 369.2/369.2 MB 3.1 MB/s eta 0:00:00
  Downloading https://pip.repos.neuron.amazonaws.com/neuronx-cc/neuronx_cc-2.11.0.34%2Bc5231f848-cp310-cp310-linux_x86_64.whl (369.4 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 369.4/369.4 MB 3.2 MB/s eta 0:00:00
  Downloading https://pip.repos.neuron.amazonaws.com/neuronx-cc/neuronx_cc-2.10.0.35%2B3817a0c8c-cp310-cp310-linux_x86_64.whl (372.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 372.3/372.3 MB 3.1 MB/s eta 0:00:00
Collecting neuronx-hwm==2.10.0.5
  Downloading https://pip.repos.neuron.amazonaws.com/neuronx-hwm/neuronx_hwm-2.10.0.5%2B7b1976adf-cp310-cp310-linux_x86_64.whl (2.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.1/2.1 MB 21.6 MB/s eta 0:00:00
Collecting scipy
  Downloading scipy-1.7.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (39.9 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 39.9/39.9 MB 40.6 MB/s eta 0:00:00
Collecting neuronx-cc>=2.0.0a0
  Downloading https://pip.repos.neuron.amazonaws.com/neuronx-cc/neuronx_cc-2.10.0.34%2B6c8792c6f-cp310-cp310-linux_x86_64.whl (372.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 372.3/372.3 MB 2.8 MB/s eta 0:00:00
  Downloading https://pip.repos.neuron.amazonaws.com/neuronx-cc/neuronx_cc-2.9.0.40%2B07376825f-cp310-cp310-linux_x86_64.whl (371.9 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 371.9/371.9 MB 3.2 MB/s eta 0:00:00
Collecting neuronx-hwm==2.9.0.2
  Downloading https://pip.repos.neuron.amazonaws.com/neuronx-hwm/neuronx_hwm-2.9.0.2%2Bf79d59e7b-cp310-cp310-linux_x86_64.whl (2.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.1/2.1 MB 4.7 MB/s eta 0:00:00
Collecting neuronx-cc>=2.0.0a0
  Downloading https://pip.repos.neuron.amazonaws.com/neuronx-cc/neuronx_cc-2.9.0.16%2Bfa12ba55a-cp310-cp310-linux_x86_64.whl (371.9 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 371.9/371.9 MB 3.2 MB/s eta 0:00:00
Collecting neuronx-hwm==2.9.0.1
  Downloading https://pip.repos.neuron.amazonaws.com/neuronx-hwm/neuronx_hwm-2.9.0.1%2Bf79d59e7b-cp310-cp310-linux_x86_64.whl (2.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.1/2.1 MB 5.2 MB/s eta 0:00:00
Collecting neuronx-cc>=2.0.0a0
  Downloading https://pip.repos.neuron.amazonaws.com/neuronx-cc/neuronx_cc-2.8.0.25%2Ba3ad0f342-cp310-cp310-linux_x86_64.whl (362.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 362.6/362.6 MB 3.3 MB/s eta 0:00:00
Collecting neuronx-hwm==2.8.0.3
  Downloading https://pip.repos.neuron.amazonaws.com/neuronx-hwm/neuronx_hwm-2.8.0.3%2B2b7c6da39-cp310-cp310-linux_x86_64.whl (2.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.1/2.1 MB 21.2 MB/s eta 0:00:00
Collecting neuronx-cc>=2.0.0a0
  Downloading https://pip.repos.neuron.amazonaws.com/neuronx-cc/neuronx_cc-2.7.0.40%2Bf7c6cf2a3-cp310-cp310-linux_x86_64.whl (342.7 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 342.7/342.7 MB 3.5 MB/s eta 0:00:00
  Downloading https://pip.repos.neuron.amazonaws.com/neuronx-cc/neuronx_cc-2.6.0.19%2B3d819e565-cp310-cp310-linux_x86_64.whl (325.5 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 325.5/325.5 MB 3.7 MB/s eta 0:00:00
  Downloading https://pip.repos.neuron.amazonaws.com/neuronx-cc/neuronx_cc-2.5.0.28%2B1be23f232-cp310-cp310-linux_x86_64.whl (322.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 322.1/322.1 MB 3.8 MB/s eta 0:00:00
INFO: pip is looking at multiple versions of nvidia-nvtx-cu12 to determine which version is compatible with other requirements. This could take a while.
Collecting nvidia-nvtx-cu12==12.6.77
  Using cached nvidia_nvtx_cu12-12.6.77-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (89 kB)
INFO: pip is looking at multiple versions of nvidia-nvjitlink-cu12 to determine which version is compatible with other requirements. This could take a while.
Collecting nvidia-nvjitlink-cu12==12.6.85
  Using cached nvidia_nvjitlink_cu12-12.6.85-py3-none-manylinux2010_x86_64.manylinux_2_12_x86_64.whl (19.7 MB)
INFO: pip is looking at multiple versions of nvidia-nccl-cu12 to determine which version is compatible with other requirements. This could take a while.
Collecting nvidia-nccl-cu12==2.26.2

I cannot seem to get a working installation.
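
To make the backtracking in the log above easier to see, here is a minimal Python sketch that extracts the neuronx-cc versions pip fetched, in order. The function name `versions_tried` and the `SAMPLE` excerpt are only illustrative; the sketch assumes the log contains wheel URLs in the same form as the ones pasted above.

```python
import re
from urllib.parse import unquote

# Matches wheel file names such as neuronx_cc-2.16.345.0%2B69131dd3-cp310-...whl
# (the underscore keeps it from matching the "/neuronx-cc/" directory in the URL).
WHEEL_RE = re.compile(r"/neuronx_cc-([^-]+)-")


def versions_tried(log_text: str) -> list[str]:
    """Return the neuronx-cc versions pip fetched, in log order, without repeats."""
    seen: list[str] = []
    for match in WHEEL_RE.finditer(log_text):
        version = unquote(match.group(1))  # decode %2B back to "+"
        if version not in seen:
            seen.append(version)
    return seen


# Two lines copied from the log above, used as a small sample input.
SAMPLE = """\
Collecting neuronx-cc>=2.0.0a0
  Using cached https://pip.repos.neuron.amazonaws.com/neuronx-cc/neuronx_cc-2.16.345.0%2B69131dd3-cp310-cp310-linux_x86_64.whl (580.8 MB)
  Using cached https://pip.repos.neuron.amazonaws.com/neuronx-cc/neuronx_cc-2.15.143.0%2Be39249ad-cp310-cp310-linux_x86_64.whl (466.4 MB)
"""

if __name__ == "__main__":
    for v in versions_tried(SAMPLE):
        print(v)
```

Running the same function over the full pip output should give a descending list of versions, which is what resolver backtracking looks like.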

Steps to Reproduce

  1. Use the Amazon Linux 2 DLAMI
  2. Follow the instructions on https://awsdocs-neuron.readthedocs-hosted.com/en/latest/libraries/nxd-inference/developer_guides/vllm-user-guide.html#installing-the-aws-neuron-fork-of-vllm
  3. Observe the issue.
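
One thing that might narrow the reproduction is pinning neuronx-cc with a pip constraints file, so the resolver cannot walk back through older releases. The sketch below is untested against this setup. It assumes the DLAMI virtual environment already has neuronx-cc installed, and `write_constraints` and `constraints.txt` are hypothetical names chosen for illustration.

```python
from importlib.metadata import PackageNotFoundError, version
from pathlib import Path


def constraint_line(package: str, installed_version: str) -> str:
    """Format one exact-version pin in pip's constraints syntax."""
    return f"{package}=={installed_version}"


def write_constraints(path: str, packages: list[str]) -> list[str]:
    """Pin each package to the version installed in the current environment.

    Packages that are not installed are skipped rather than guessed at.
    """
    lines = []
    for name in packages:
        try:
            lines.append(constraint_line(name, version(name)))
        except PackageNotFoundError:
            continue  # nothing installed to pin against
    Path(path).write_text("".join(line + "\n" for line in lines))
    return lines


if __name__ == "__main__":
    # Assumption: neuronx-cc is already present in the active environment.
    print(write_constraints("constraints.txt", ["neuronx-cc"]))
```

The resulting file could then be passed to pip with its `-c` option, for example `pip install -c constraints.txt -r requirements/neuron.txt`. If the pinned version conflicts with another requirement, pip should then report the conflict directly instead of downloading older wheels.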

Attach logs if any

See above.

Labels: bug (Something isn't working)
