Commit dd25595

[TRTLLM-7964][infra] Set nixl to default cache transceiver backend (#7926)
Signed-off-by: Bo Deng <[email protected]>
1 parent e185173 commit dd25595

7 files changed: 23 additions and 15 deletions

cpp/tensorrt_llm/batch_manager/cacheTransceiver.cpp

Lines changed: 1 addition & 1 deletion
@@ -89,7 +89,7 @@ std::unique_ptr<BaseCacheTransceiver> CacheTransceiverFactory::createCacheTransc
         }
         else
         {
-            backendType = executor::CacheTransceiverConfig::BackendType::UCX;
+            backendType = executor::CacheTransceiverConfig::BackendType::NIXL;
         }
     }
     cacheTransceiverConfig.value().setBackendType(backendType);

docker/Dockerfile.multi

Lines changed: 11 additions & 9 deletions
@@ -91,6 +91,17 @@ RUN if [ -f /etc/redhat-release ]; then \
         grep "^Name:" METADATA; \
     fi
 
+# Install UCX first
+COPY docker/common/install_ucx.sh install_ucx.sh
+RUN bash ./install_ucx.sh && rm install_ucx.sh
+
+# Install NIXL
+COPY docker/common/install_nixl.sh install_nixl.sh
+RUN bash ./install_nixl.sh && rm install_nixl.sh
+
+# Install etcd
+COPY docker/common/install_etcd.sh install_etcd.sh
+RUN bash ./install_etcd.sh && rm install_etcd.sh
 
 FROM ${TRITON_IMAGE}:${TRITON_BASE_TAG} AS triton
 
@@ -111,15 +122,6 @@ COPY docker/common/install_triton.sh \
 
 RUN bash ./install_triton.sh && rm install_triton.sh
 
-# Install UCX first
-RUN bash ./install_ucx.sh && rm install_ucx.sh
-
-# Install NIXL
-RUN bash ./install_nixl.sh && rm install_nixl.sh
-
-# Install etcd
-RUN bash ./install_etcd.sh && rm install_etcd.sh
-
 FROM ${DEVEL_IMAGE} AS wheel
 WORKDIR /src/tensorrt_llm
 COPY benchmarks benchmarks

docs/source/features/disagg-serving.md

Lines changed: 1 addition & 1 deletion
@@ -106,7 +106,7 @@ cache_transceiver_config:
   max_tokens_in_buffer: <int>
 ```
 
-`backend` specifies the communication backend for transferring the kvCache, valid options include `DEFAULT`,`UCX`, `NIXL`, and `MPI`, the default backend is UCX.
+`backend` specifies the communication backend for transferring the kvCache, valid options include `DEFAULT`,`UCX`, `NIXL`, and `MPI`, the default backend is NIXL.
 
 `max_tokens_in_buffer` defines the buffer size for kvCache transfers, it is recommended to set this value greater than or equal to the maximum ISL (Input Sequence Length) of all requests for optimal performance.
 
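
Deployments that relied on the old default can pin the backend explicitly instead of leaving it at `DEFAULT`. The following is a minimal sketch of generating such a config section, sized according to the `max_tokens_in_buffer` recommendation in the documentation text above. The helper name `render_cache_transceiver_config` and its `input_lengths` parameter are hypothetical and not part of TensorRT-LLM; only the YAML keys and the four backend names come from the documentation.

```python
# Backend names listed in the documentation hunk above.
VALID_BACKENDS = ("DEFAULT", "UCX", "NIXL", "MPI")


def render_cache_transceiver_config(backend, input_lengths):
    """Return YAML text for a cache_transceiver_config section.

    Hypothetical helper for illustration: `input_lengths` holds the
    expected ISL (Input Sequence Length) of each request.
    """
    if backend not in VALID_BACKENDS:
        raise ValueError(f"unknown backend: {backend!r}")
    if not input_lengths:
        raise ValueError("need at least one input length")
    # Size the buffer to the longest expected input, following the
    # recommendation that it be >= the maximum ISL.
    max_tokens = max(input_lengths)
    return (
        "cache_transceiver_config:\n"
        f"  backend: {backend}\n"
        f"  max_tokens_in_buffer: {max_tokens}\n"
    )


print(render_cache_transceiver_config("UCX", [512, 2048, 1024]))
```

Writing `backend: UCX` explicitly keeps the pre-commit behavior regardless of what `DEFAULT` resolves to.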

docs/source/installation/linux.md

Lines changed: 3 additions & 0 deletions
@@ -22,6 +22,9 @@
 pip3 install torch==2.7.1 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
 
 sudo apt-get -y install libopenmpi-dev
+
+# Optional step: Only required for disagg-serving
+sudo apt-get -y install libzmq3-dev
 ```
 
 PyTorch CUDA 12.8 package is required for supporting NVIDIA Blackwell GPUs and SBSA platform. On prior GPUs or Linux x86_64 platform, this extra installation is not required.

examples/disaggregated/README.md

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ The `trtllm-serve` command supports the `extra-llm-config.yaml` parameter. In th
 
 ```yaml
 cache_transceiver_config:
-  # KV cache transmission backend. Valid options include `DEFAULT` (i.e., UCX), `UCX`, `NIXL`.
+  # KV cache transmission backend. Valid options include `DEFAULT` (i.e., NIXL), `UCX`, `NIXL`.
   backend: <str>
   # KV cache buffer size. Set it ≥ the maximum ISL (Input Sequence Length) for best performance.
   max_tokens_in_buffer: <int>

tensorrt_llm/_torch/pyexecutor/kv_cache_transceiver.py

Lines changed: 3 additions & 3 deletions
@@ -38,10 +38,10 @@ def create_kv_cache_transceiver(
 
     if cache_transceiver_config.backend == BackendTypeCpp.DEFAULT:
         # When cache_transceiver_config.backend is not set, fallback to env_vars settings
-        # UCX is the default backend
-        cache_transceiver_config.backend = BackendTypeCpp.UCX
+        # NIXL is the default backend
+        cache_transceiver_config.backend = BackendTypeCpp.NIXL
         # Ordered by priority
-        env_vars = [("TRTLLM_USE_NIXL_KVCACHE", BackendTypeCpp.NIXL),
+        env_vars = [("TRTLLM_USE_UCX_KVCACHE", BackendTypeCpp.UCX),
                     ("TRTLLM_USE_MPI_KVCACHE", BackendTypeCpp.MPI)]
         for env_var, be_type in env_vars:
             if getenv(env_var) == "1":
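
The net effect of this hunk can be sketched as a standalone function. This is an illustration only, not the project's code: the `Backend` enum is a hypothetical stand-in for `BackendTypeCpp`, `resolve_backend` and its `env` parameter are invented for the example, and the hunk is cut off before the loop body, so the "first matching variable wins" behavior is an assumption based on the "Ordered by priority" comment.

```python
import os
from enum import Enum


class Backend(Enum):
    """Hypothetical stand-in for BackendTypeCpp."""
    DEFAULT = "DEFAULT"
    UCX = "UCX"
    NIXL = "NIXL"
    MPI = "MPI"


# Ordered by priority, mirroring the env_vars list after this commit.
ENV_OVERRIDES = [
    ("TRTLLM_USE_UCX_KVCACHE", Backend.UCX),
    ("TRTLLM_USE_MPI_KVCACHE", Backend.MPI),
]


def resolve_backend(configured, env=None):
    """Return the backend to use for a configured value and environment."""
    env = os.environ if env is None else env
    # An explicitly configured backend is left untouched.
    if configured is not Backend.DEFAULT:
        return configured
    # Assumption: the first variable set to "1" wins (loop body not shown
    # in the hunk).
    for name, backend in ENV_OVERRIDES:
        if env.get(name) == "1":
            return backend
    # No override present: NIXL is the default after this commit.
    return Backend.NIXL


print(resolve_backend(Backend.DEFAULT, {}).value)
print(resolve_backend(Backend.DEFAULT, {"TRTLLM_USE_UCX_KVCACHE": "1"}).value)
```

Under these assumptions, an unset backend resolves to NIXL, and setting `TRTLLM_USE_UCX_KVCACHE=1` selects UCX. The former `TRTLLM_USE_NIXL_KVCACHE` entry no longer appears in the priority list.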

tests/integration/defs/disaggregated/test_disaggregated.py

Lines changed: 3 additions & 0 deletions
@@ -426,6 +426,9 @@ def run_disaggregated_test(example_dir,
             config_file
         ]
     else:
+        pytest.skip(
+            "https://nvbugs/5584607 Ray orchestrator is not supported with NIXL(DEFAULT) cache transceiver backend."
+        )
         with open(config_file, 'r') as f:
             config = yaml.safe_load(f)
 
