sharktank/sharktank/layers/configs/llm_configs.py (3 changes: 0 additions & 3 deletions)

@@ -633,9 +633,6 @@ def to_properties(self) -> "PropertyValueType":
res["activation_dtype"] = dtype_to_serialized_name(self.activation_dtype)
res["attention_dtype"] = dtype_to_serialized_name(self.attention_dtype)
res["fake_quant"] = self.fake_quant
res["tensor_parallelism_size"] = self.tensor_parallelism_size
Contributor:

tensor_parallelism_size is not being saved by parallelism_config

Contributor Author:

But it is dynamically computed from there.
LlamaModelConfig:

    @property
    def tensor_parallelism_size(self) -> int:
        return self.parallelism_config.tensor_parallelism_size

ParallelismConfig:

    @property
    def tensor_parallelism_size(self) -> int:
        """
        How many devices are involved for tensor parallel sharding.
        If greater than 1, the model will expect sharded model parameters and function
        arguments.
        """
        return len(self.devices_for_pipeline(0))
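
A toy sketch for illustration (not the sharktank class; the two-stage pipeline_to_device_map below is an assumed layout) of why the value does not need to be persisted separately: it falls out of whichever device map parallelism_config already carries.

    from dataclasses import dataclass


    @dataclass
    class ToyParallelismConfig:
        # pipeline_to_device_map[p] lists the device ordinals used by pipeline stage p.
        pipeline_to_device_map: list[list[int]]

        def devices_for_pipeline(self, pipeline: int) -> list[int]:
            return self.pipeline_to_device_map[pipeline]

        @property
        def tensor_parallelism_size(self) -> int:
            # Derived, never stored: the number of devices sharding each tensor.
            return len(self.devices_for_pipeline(0))


    config = ToyParallelismConfig(pipeline_to_device_map=[[0, 1], [2, 3]])
    assert config.tensor_parallelism_size == 2  # two devices per pipeline stage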

Actually, until now this init:

    return LlamaModelConfig(**kwargs)

was causing an overwrite of the method-like properties:

    @property
    def pipeline_parallelism_size(self) -> int:
        return self.parallelism_config.pipeline_size

    @property
    def tensor_parallelism_size(self) -> int:
        return self.parallelism_config.tensor_parallelism_size

This may cause a discrepancy with LlamaModelConfig.parallelism_config.

One more casualty of the power of Python's dynamism.

Contributor Author:

Never mind, it's not that.
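
A minimal sketch of why the feared overwrite does not happen in plain Python (this reading of "it's not that" is an assumption, not stated in the thread): a @property is a data descriptor, so it takes precedence over an instance attribute of the same name rather than being shadowed by it.

    class Config:
        def __init__(self, **kwargs):
            # Write the kwargs straight into the instance dict, as a permissive
            # config constructor might.
            self.__dict__.update(kwargs)

        @property
        def tensor_parallelism_size(self) -> int:
            return 4  # stand-in for the value derived from parallelism_config


    c = Config(tensor_parallelism_size=99)
    # Attribute lookup finds the class-level property (a data descriptor) before
    # the instance dict, so the derived value is returned, not the stored 99.
    assert c.tensor_parallelism_size == 4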

res["block_to_pipeline_map"] = self.block_to_pipeline_map
res["pipeline_to_device_map"] = self.pipeline_to_device_map
res["attention_kernel"] = self.attention_kernel
res["use_qk_norm"] = self.use_qk_norm
res["attention_chunk_size"] = self.attention_chunk_size