[Bug] TypeError: BaseInjectedModule.__init__() got multiple values for argument 'prefill_device' #1526

@xxwweb

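Checklist

  • 1. I have searched for related issues but did not get the help I expected
  • 2. The bug has not been fixed in the latest version
  • 3. Please note that if a bug-related issue lacks environment information and a minimal reproducible example, it will be hard for us to reproduce and locate the problem, which reduces the chance of getting feedback
  • 4. If you are raising a question rather than a bug, please start a discussion at https://github.com/kvcache-ai/ktransformers/discussions. Otherwise the issue will be closed
  • 5. To help the community communicate, I will use Chinese/English or attach a Chinese/English translation (if using another language). Non-Chinese/English content without a translation may be closed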

Problem description

RAM: 64G
GPU: RTX4060
Running deepseek-r1-70b-k4_m fails with the following error:

root@8ebfaf9abff7:/models/gguf_path# python -m ktransformers.local_chat --gguf_path /models/gguf_path/DeepSeek-R1-Distill-Llama-70B-K4/ --model_path /models/model_path/DeepSeek-R1-Distill-Llama-70B/ --cpu_infer 33
2025-10-22 00:50:45,817 - INFO - flashinfer.jit: Prebuilt kernels not found, using JIT backend
found flashinfer
using custom modeling_xxx.py.
LlamaForCausalLM has generative capabilities, as prepare_inputs_for_generation is explicitly overwritten. However, it doesn't directly inherit from GenerationMixin. From 👉v4.50👈 onwards, PreTrainedModel will NOT inherit from GenerationMixin, and this model will lose the ability to call generate and other related functions.

  • If you're using trust_remote_code=True, you can get rid of this warning by loading the model with an auto class. See https://huggingface.co/docs/transformers/en/model_doc/auto#auto-classes
  • If you are the owner of the model architecture code, please modify your model class such that it inherits from GenerationMixin (after PreTrainedModel, otherwise you'll get an exception).
  • If you are not the owner of the model architecture class, please contact the model code owner to update it.
using default_optimize_rule for LlamaForCausalLM
Injecting model as ktransformers.operators.models . KLlamaModel
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/opt/conda/lib/python3.11/site-packages/ktransformers/local_chat.py", line 196, in <module>
    fire.Fire(local_chat)
  File "/opt/conda/lib/python3.11/site-packages/fire/core.py", line 135, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/fire/core.py", line 468, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
                                ^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/fire/core.py", line 684, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/ktransformers/local_chat.py", line 123, in local_chat
    optimize_and_load_gguf(model, optimize_config_path, gguf_path, config, default_device=device)
  File "/opt/conda/lib/python3.11/site-packages/ktransformers/optimize/optimize.py", line 128, in optimize_and_load_gguf
    inject(module, optimize_config, model_config, weights_loader)
  File "/opt/conda/lib/python3.11/site-packages/ktransformers/optimize/optimize.py", line 33, in inject
    inject_module=module_cls(key = inject_module_meta["key"], gguf_loader = gguf_loader, config = model_config, orig_module=child, **inject_module_meta["kwargs"])
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/ktransformers/operators/models.py", line 995, in __init__
    BaseInjectedModule.__init__(
TypeError: BaseInjectedModule.__init__() got multiple values for argument 'prefill_device'

Could you take a look at what is causing this?
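
For reference, here is a minimal sketch of how Python raises this kind of error in general. It is not the ktransformers source: `Base`, `Child`, and `try_build` are hypothetical stand-ins, and the assumption that a subclass forwards a `device` argument positionally while the optimize rule also supplies `prefill_device` in its kwargs is only a guess at the pattern, not a confirmed diagnosis.

```python
class Base:
    """Hypothetical stand-in for a base class that takes prefill_device."""

    def __init__(self, key, prefill_device="cuda", generate_device="cuda", **kwargs):
        self.key = key
        self.prefill_device = prefill_device
        self.generate_device = generate_device


class Child(Base):
    """Hypothetical subclass that forwards its own `device` positionally."""

    def __init__(self, key, device="cuda", **kwargs):
        # `device` is the second positional argument after `key`, so it binds
        # to Base's `prefill_device`. If kwargs also carries `prefill_device`,
        # that parameter receives two values and Python raises TypeError.
        Base.__init__(self, key, device, **kwargs)


def try_build(**rule_kwargs):
    """Build a Child from rule-style kwargs and report the outcome as a string."""
    try:
        Child(key="model", **rule_kwargs)
        return "ok"
    except TypeError as exc:
        return str(exc)


# kwargs that include prefill_device collide with the positional `device`.
result_clash = try_build(prefill_device="cuda", generate_device="cuda")
# kwargs without prefill_device go through.
result_ok = try_build(generate_device="cuda")

print(result_clash)
print(result_ok)
```

If the real code follows a similar pattern, the collision would come from the combination of the injected class and the kwargs in the optimize rule, so comparing those two is probably the first thing to check.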

Steps to reproduce

root@8ebfaf9abff7:/models/gguf_path# python -m ktransformers.local_chat --gguf_path /models/gguf_path/DeepSeek-R1-Distill-Llama-70B-K4/ --model_path /models/model_path/DeepSeek-R1-Distill-Llama-70B/ --cpu_infer 33

Environment

CPU: i7-13650HX
RAM: 64G
GPU: RTX4060

docker pull approachingai/ktransformers:v0.3.2-AVX2
