
Conversation

@sogartar
Contributor


We are already saving these in `parallelism_config`.
res["activation_dtype"] = dtype_to_serialized_name(self.activation_dtype)
res["attention_dtype"] = dtype_to_serialized_name(self.attention_dtype)
res["fake_quant"] = self.fake_quant
res["tensor_parallelism_size"] = self.tensor_parallelism_size
Contributor


`tensor_parallelism_size` is not being saved by `parallelism_config`.

Contributor Author


But it is dynamically computed from there.

`LlamaModelConfig`:

    @property
    def tensor_parallelism_size(self) -> int:
        return self.parallelism_config.tensor_parallelism_size

`ParallelismConfig`:

    @property
    def tensor_parallelism_size(self) -> int:
        """
        How many devices are involved for tensor parallel sharding.
        If greater than 1, the model will expect sharded model parameters and function
        arguments.
        """
        return len(self.devices_for_pipeline(0))
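
For reference, a minimal runnable sketch of this derived-property pattern (the dataclass layout and the `pipeline_to_devices` field are illustrative assumptions, not the actual definitions in the repository):

    from dataclasses import dataclass, field

    @dataclass
    class ParallelismConfig:
        # One entry per pipeline stage; each entry lists the devices used for
        # tensor-parallel sharding of that stage. (Hypothetical field name.)
        pipeline_to_devices: list[list[int]] = field(default_factory=lambda: [[0]])

        def devices_for_pipeline(self, pipeline: int) -> list[int]:
            return self.pipeline_to_devices[pipeline]

        @property
        def tensor_parallelism_size(self) -> int:
            return len(self.devices_for_pipeline(0))

    @dataclass
    class LlamaModelConfig:
        parallelism_config: ParallelismConfig = field(default_factory=ParallelismConfig)

        @property
        def tensor_parallelism_size(self) -> int:
            # Derived on every access, never stored on the instance.
            return self.parallelism_config.tensor_parallelism_size

    config = LlamaModelConfig(ParallelismConfig(pipeline_to_devices=[[0, 1, 2, 3]]))
    assert config.tensor_parallelism_size == 4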

Actually, until now this init

    return LlamaModelConfig(**kwargs)

was causing an overwrite of the method-like properties:

    @property
    def pipeline_parallelism_size(self) -> int:
        return self.parallelism_config.pipeline_size

    @property
    def tensor_parallelism_size(self) -> int:
        return self.parallelism_config.tensor_parallelism_size

This may cause a discrepancy with `LlamaModelConfig.parallelism_config`.

One more casualty of the power of Python's dynamism.
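
(A hedged sketch of that discrepancy concern, reusing the hypothetical `ParallelismConfig` above: if the saved properties carry both an explicit `tensor_parallelism_size` and a `parallelism_config`, the two can drift apart and the loader has to pick one. The keys and values here are made up.)

    saved = {
        "tensor_parallelism_size": 2,  # stored explicitly, to_properties-style
        "parallelism_config": {"pipeline_to_devices": [[0, 1, 2, 3]]},  # implies 4
    }

    # Rebuilding only from parallelism_config silently ignores the explicit value.
    restored = ParallelismConfig(**saved["parallelism_config"])
    assert restored.tensor_parallelism_size == 4
    assert restored.tensor_parallelism_size != saved["tensor_parallelism_size"]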

Contributor Author


Never mind, it's not that.

@sogartar marked this pull request as draft on October 2, 2025, 12:57.