Description
Checklist
- 1. I have searched for related issues but did not find the help I needed
- 2. The problem has not been fixed in the latest version
- 3. I understand that if a bug report lacks environment information and a minimal reproducible example, it will be hard to reproduce and locate the problem, which lowers the chance of getting feedback
- 4. If this is a question rather than a bug, I will start a discussion at https://github.com/kvcache-ai/ktransformers/discussions instead; otherwise the issue will be closed
- 5. To facilitate community communication, I will use Chinese/English or attach a Chinese/English translation (if using another language). Non-Chinese/English content without a translation may be closed
Problem description
RAM: 64 GB
GPU: RTX 4060
Running deepseek-r1-70b-k4_m fails with the following error:

```
root@8ebfaf9abff7:/models/gguf_path# python -m ktransformers.local_chat --gguf_path /models/gguf_path/DeepSeek-R1-Distill-Llama-70B-K4/ --model_path /models/model_path/DeepSeek-R1-Distill-Llama-70B/ --cpu_infer 33
2025-10-22 00:50:45,817 - INFO - flashinfer.jit: Prebuilt kernels not found, using JIT backend
found flashinfer
using custom modeling_xxx.py.
LlamaForCausalLM has generative capabilities, as `prepare_inputs_for_generation` is explicitly overwritten. However, it doesn't directly inherit from `GenerationMixin`. From 👉v4.50👈 onwards, `PreTrainedModel` will NOT inherit from `GenerationMixin`, and this model will lose the ability to call `generate` and other related functions.
  - If you're using `trust_remote_code=True`, you can get rid of this warning by loading the model with an auto class. See https://huggingface.co/docs/transformers/en/model_doc/auto#auto-classes
  - If you are the owner of the model architecture code, please modify your model class such that it inherits from `GenerationMixin` (after `PreTrainedModel`, otherwise you'll get an exception).
  - If you are not the owner of the model architecture class, please contact the model code owner to update it.
using default_optimize_rule for LlamaForCausalLM
Injecting model as ktransformers.operators.models . KLlamaModel
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/opt/conda/lib/python3.11/site-packages/ktransformers/local_chat.py", line 196, in <module>
    fire.Fire(local_chat)
  File "/opt/conda/lib/python3.11/site-packages/fire/core.py", line 135, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/fire/core.py", line 468, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
                                ^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/fire/core.py", line 684, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/ktransformers/local_chat.py", line 123, in local_chat
    optimize_and_load_gguf(model, optimize_config_path, gguf_path, config, default_device=device)
  File "/opt/conda/lib/python3.11/site-packages/ktransformers/optimize/optimize.py", line 128, in optimize_and_load_gguf
    inject(module, optimize_config, model_config, weights_loader)
  File "/opt/conda/lib/python3.11/site-packages/ktransformers/optimize/optimize.py", line 33, in inject
    inject_module=module_cls(key = inject_module_meta["key"], gguf_loader = gguf_loader, config = model_config, orig_module=child, **inject_module_meta["kwargs"])
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/ktransformers/operators/models.py", line 995, in __init__
    BaseInjectedModule.__init__(
TypeError: BaseInjectedModule.__init__() got multiple values for argument 'prefill_device'
```
Can someone help me figure out what is causing this?
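For context, this `TypeError` is Python's standard complaint when a parameter gets bound twice: once positionally and once through `**kwargs`. Below is a minimal sketch of that mechanism; the class and argument names mirror the traceback but are illustrative, not the actual ktransformers implementation:

```python
class BaseInjectedModule:
    def __init__(self, key, gguf_loader, config, orig_module,
                 prefill_device="cuda", **kwargs):
        self.prefill_device = prefill_device


class KLlamaModel(BaseInjectedModule):
    def __init__(self, key, gguf_loader, config, orig_module, **kwargs):
        # The subclass forwards prefill_device positionally ("cuda" below)
        # while also forwarding **kwargs. If kwargs already contains
        # "prefill_device" (e.g. from an optimize-rule YAML), the parent
        # receives the same argument twice.
        BaseInjectedModule.__init__(
            self, key, gguf_loader, config, orig_module, "cuda", **kwargs
        )


# kwargs coming from the rule file also carry prefill_device:
KLlamaModel(key="model", gguf_loader=None, config=None, orig_module=None,
            prefill_device="cuda")
# TypeError: BaseInjectedModule.__init__() got multiple values for
# argument 'prefill_device'
```

If that is what is happening here, the duplicate `prefill_device` would come from the kwargs in the default optimize rule for LlamaForCausalLM colliding with an argument that `KLlamaModel.__init__` (ktransformers/operators/models.py, line 995) already passes explicitly; comparing the rule YAML's kwargs against that `__init__` signature should confirm it.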
Steps to reproduce
```
root@8ebfaf9abff7:/models/gguf_path# python -m ktransformers.local_chat --gguf_path /models/gguf_path/DeepSeek-R1-Distill-Llama-70B-K4/ --model_path /models/model_path/DeepSeek-R1-Distill-Llama-70B/ --cpu_infer 33
```
Environment
CPU: i7-13650HX
RAM: 64 GB
GPU: RTX 4060
Docker image: approachingai/ktransformers:v0.3.2-AVX2