feat: Add IndependentSpeculator implementation #46
base: main
Conversation
Force-pushed from 05874d4 to 7cfc8ae
Left a few quick comments. Overall I think there's more that needs to be considered and worked out here, and some of the model loading and config loading pathways will likely benefit from the utilities being added in #48.
Discussion points to work through for the config:
- The goal should be to augment the base pretrained model config so that the speculator_* properties are added while all other keys and functionality from the original config are kept exactly the same.
- I think we should either have from_pretrained_config override the parent from_pretrained method, or at least figure out how to support and combine the two. The ideal goal (sketched below): from_pretrained on the base SpeculatorModelConfig continues to work only for a previously saved speculator config, while going through the function on the independent config either loads the speculator config directly (if it is already a saved speculator config) or goes through the conversion and injection if not. All of this should still support loading from an HF stub, a local path, a pretrained config, or a dict.
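A rough sketch of how that dispatch could look, just to anchor the discussion. `IndependentSpeculatorConfig`, `SpeculatorModelConfig`, and `from_pretrained_config` come from this PR; the import path, the marker-key check, and the `from_pretrained_config` signature are assumptions, not the actual speculators API:

```python
from transformers import PretrainedConfig

from speculators import SpeculatorModelConfig  # import path is a guess


class IndependentSpeculatorConfig(SpeculatorModelConfig):
    @classmethod
    def from_pretrained(cls, name_or_path, **kwargs):
        # Accept an HF stub, a local path, a PretrainedConfig, or a plain dict.
        if isinstance(name_or_path, dict):
            config_dict = dict(name_or_path)
        elif isinstance(name_or_path, PretrainedConfig):
            config_dict = name_or_path.to_dict()
        else:
            config_dict, kwargs = cls.get_config_dict(name_or_path, **kwargs)

        if "speculators_model_type" in config_dict:
            # Already a saved speculator config: load it as the base class would.
            return cls.from_dict(config_dict, **kwargs)

        # Otherwise convert/inject: keep every original key and add only the
        # speculator_* extras on top. (from_pretrained_config may actually take a
        # PretrainedConfig rather than a dict -- signature assumed here.)
        return cls.from_pretrained_config(config_dict, **kwargs)
```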
Discussion points to work through for the model:
- We'll need to ensure that from_pretrained and save_pretrained both work with the config requirements laid out above, where we only inject extras, so that the model is still runnable outside of speculators as the ideal end state; essentially we're guaranteeing we've only added extras to it.
- The save_pretrained / state-dict conversion pathways will need to flatten the structure so that whatever attribute we use to store the model doesn't get injected into the namespace used for saving; essentially we need to ensure we save the state_dict exactly as it would be saved outside of speculators' save pathways (see the sketch after this list).
- For the above point, that would ideally mean we have an invisible wrapper / injector that ensures the properties and pathways from the original pretrained model are set on the IndependentSpeculator or forwarded to it.
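As a concrete illustration of the flattening concern (a sketch only; the `draft` attribute name is taken from this PR's `self.draft.forward`, everything else here is hypothetical):

```python
def flattened_state_dict(speculator) -> dict:
    """Return the speculator's state_dict with the wrapper prefix stripped.

    If the wrapped model is stored under `self.draft`, every parameter key picks up
    a "draft." prefix that an external `from_pretrained` on the plain model would not
    expect, so the save pathway needs to remove it before writing the checkpoint.
    """
    prefix = "draft."
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in speculator.state_dict().items()
    }
```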
```python
speculators_model_type: str = "independent"
speculators_model_type: Literal["independent"] = "independent"
architectures: list[str] = Field(
```
I believe we should be able to get rid of this completely. Can you double check that this is pulled from the original config and set appropriately in the from_pretrained_config pathway?
Essentially we just want to ensure that saving/loading the config will result in speculators_model_type, speculators_version, and speculators_config being added while keeping the rest of the config exactly the same.
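To make that invariant concrete, something like the following property should hold after conversion (a sketch of the expectation, not a test in the repo; the model stub is the one used elsewhere in this thread):

```python
from transformers import AutoConfig

# IndependentSpeculatorConfig is the config class added in this PR.
original = AutoConfig.from_pretrained("meta-llama/Llama-3.2-3B-Instruct").to_dict()
converted = IndependentSpeculatorConfig.from_pretrained(
    "meta-llama/Llama-3.2-3B-Instruct"
).to_dict()

# Only the speculators_* keys are new; every pre-existing key is untouched.
assert set(converted) - set(original) <= {
    "speculators_model_type",
    "speculators_version",
    "speculators_config",
}
assert all(converted[key] == original[key] for key in original)
```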
```python
    default_factory=lambda: ["LlamaForCausalLM"],
    description=("List of model architectures that can be used with the model "),
)
draft_model: str = Field(
```
I'm not following why this would end up being used. We should have the verifier in the speculators_config, and the config we're injecting into would represent the draft/speculator model.
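Put differently (a hypothetical illustration; the attribute access pattern is assumed): the injected config already describes the draft model itself, so a separate field pointing at it would be redundant, and the verifier reference sits inside `speculators_config`:

```python
config = IndependentSpeculatorConfig.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")

# Draft/speculator details are just the original pretrained config's own keys...
print(config.hidden_size, config.num_hidden_layers)

# ...while the verifier lives under the injected speculators_config, so there is no
# need for a draft_model field.
print(config.speculators_config)
```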
Force-pushed from 83e148b to 856fcbf
I've updated the implementation so that we simply inject the `speculator_*` properties into the original config. Current behavior:

```python
IndependentSpeculatorConfig.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")  # works, updates config
SpeculatorModelConfig.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")  # fails because model is not a speculator model
IndependentSpeculator.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")  # almost works (see note on default SpeculatorConfig)
SpeculatorModel.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")  # fails because model is not a speculator model

model = ...  # IndependentSpeculator
model.save_pretrained("./indep")  # saves draft model only, but includes `speculator_*` additions to config
```

Changes required to get this working:
- `model_type: ClassVar[str] = "speculator_model"  # type: ignore[misc]`: This was fixing the
- `_auto_class: ClassVar[Optional[str]] = ""  # type: ignore[misc]`: This didn't seem to be needed for loading models (and currently seems to be incorrectly set), however if
Note on the default SpeculatorConfig value:
Currently, the default value for ... There are 3 potential fixes I see for this.
Note on output of
Note: this is a WIP and I've attached a bunch of clarifying questions below.
This PR:
- `IndependentSpeculator` class
- `IndependentSpeculatorConfig` class

Comments/Questions:
- I defaulted `architectures` to `["LlamaForCausalLM"]` based on what I saw in the `EagleSpeculatorConfig`, but I'm not sure about that.
- The `forward` method currently just wraps the `self.draft.forward` method. The arguments I used for this method are based on the eagle speculator implementation; however, I'm not sure that they're all guaranteed to exist for all `PreTrainedModel`s. Do we currently limit the model types this works with? Is there an API limiting this? I could also change this to `*args, **kwargs` if this call varies significantly across models (a quick sketch of that option follows below).
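For the `*args, **kwargs` option, a minimal sketch (class and attribute names follow this PR; the base-class import path and the final signature are assumptions):

```python
from speculators import SpeculatorModel  # import path is a guess


class IndependentSpeculator(SpeculatorModel):
    def forward(self, *args, **kwargs):
        # Defer entirely to the wrapped draft model so we make no assumptions about
        # which keyword arguments a particular PreTrainedModel's forward accepts.
        return self.draft(*args, **kwargs)
```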