Commit bdf92b6

Authored and committed by zijiansh

[Refactor] Refactor to extract model forward logic to allow plug-in to overwrite

Signed-off-by: Zijian Shen <[email protected]>

1 parent b7d2390 commit bdf92b6

File tree

1 file changed (+2, -2)


vllm/v1/worker/gpu_model_runner.py

Lines changed: 2 additions & 2 deletions
@@ -1456,7 +1456,7 @@ def _forward(
         self,
         attn_metadata: dict[str, Any],
         num_input_tokens: int,
-        num_tokens_across_dp: int,
+        num_tokens_across_dp: Optional[Any],
         cudagraph_runtime_mode: CUDAGraphMode,
         batch_descriptor: BatchDescriptor,
         scheduler_output: "SchedulerOutput",
@@ -1465,7 +1465,7 @@ def _forward(
         intermediate_tensors: IntermediateTensors,
         inputs_embeds: list[torch.Tensor],
         model_kwargs: dict[str, Any],
-    ) -> Tuple[torch.Tensor, Optional[KVConnectorOutput]]:
+    ) -> tuple[torch.Tensor, Optional[KVConnectorOutput]]:
         with set_forward_context(
             attn_metadata,
             self.vllm_config,
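The two hunks loosen `_forward`'s signature (`num_tokens_across_dp` widens from `int` to `Optional[Any]`, and the deprecated `typing.Tuple` is replaced by the builtin `tuple`), so a plug-in can override the extracted forward hook without matching data-parallel plumbing it may not use. A minimal sketch of the override pattern, using hypothetical stand-in classes rather than vLLM's real `GPUModelRunner` API:

```python
from typing import Any, Optional


class BaseModelRunner:
    """Simplified stand-in (not vLLM's API) for a runner whose forward
    logic has been extracted into an overridable _forward hook."""

    def _forward(
        self,
        attn_metadata: dict[str, Any],
        num_input_tokens: int,
        num_tokens_across_dp: Optional[Any] = None,
    ) -> tuple[str, Optional[Any]]:
        # Default forward pass; returns (model_output, kv_connector_output).
        return ("base-forward", None)

    def execute_model(self, attn_metadata: dict[str, Any], num_input_tokens: int):
        # Surrounding setup (forward context, CUDA-graph handling, ...)
        # stays here; only the forward logic itself is delegated, so a
        # plug-in changes behavior by overriding a single method.
        return self._forward(attn_metadata, num_input_tokens)


class PluginModelRunner(BaseModelRunner):
    """A plug-in swaps in its own forward logic and reuses the rest."""

    def _forward(
        self,
        attn_metadata: dict[str, Any],
        num_input_tokens: int,
        num_tokens_across_dp: Optional[Any] = None,
    ) -> tuple[str, Optional[Any]]:
        return (f"plugin-forward:{num_input_tokens}", None)
```

Because the base class calls `self._forward(...)`, `PluginModelRunner().execute_model({}, 8)` dispatches to the plug-in's version while keeping the base runner's orchestration unchanged.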

0 commit comments