
[Feat] Support Engine Selection #5028

Open
Bobholamovic wants to merge 119 commits into PaddlePaddle:develop from Bobholamovic:feat/transformers

Conversation

@Bobholamovic
Member

@Bobholamovic Bobholamovic commented Mar 2, 2026

Upgrade the architecture to support switching between different underlying inference engines.

@paddle-bot

paddle-bot bot commented Mar 2, 2026

Thanks for your contribution!

@Bobholamovic Bobholamovic marked this pull request as ready for review March 4, 2026 02:37
scyyh11 and others added 25 commits March 27, 2026 16:30
decoder.class_embed and decoder.bbox_embed are tied to enc_score_head
and enc_bbox_head (HF _tied_weights_keys). The HF safetensors only stores
the canonical enc_*_head keys; set_hf_state_dict now duplicates them
under the decoder.* paths so Paddle's set_state_dict does not emit
missing-key warnings. get_hf_state_dict drops the decoder.* duplicates
so the output matches the safetensors key layout.
Fix PP-DocLayoutV3 tied-weight key mapping for decoder heads
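The tied-weight key duplication described above can be sketched as follows. This is a minimal illustration only: the mapping table, key paths, and function names below are assumed for the example and are not the real PP-DocLayoutV3 mapping.

```python
# Canonical HF safetensors keys -> Paddle-side tied-weight aliases.
# (Illustrative paths; the real model ties more heads per decoder layer.)
TIED_KEY_MAP = {
    "model.enc_score_head.weight": "model.decoder.class_embed.0.weight",
    "model.enc_bbox_head.weight": "model.decoder.bbox_embed.0.weight",
}

def expand_tied_keys(state_dict):
    """Duplicate canonical enc_*_head tensors under decoder.* aliases,
    so a strict set_state_dict sees no missing keys (load direction)."""
    out = dict(state_dict)
    for canonical, alias in TIED_KEY_MAP.items():
        if canonical in out and alias not in out:
            out[alias] = out[canonical]  # shared reference, no copy needed
    return out

def drop_tied_duplicates(state_dict):
    """Inverse direction: strip the decoder.* aliases so the exported
    dict matches the safetensors key layout (save direction)."""
    aliases = set(TIED_KEY_MAP.values())
    return {k: v for k, v in state_dict.items() if k not in aliases}
```

Keeping the two directions symmetric means a load followed by a save round-trips to the original safetensors key set.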
The safetensors model directory uses the HF tokenizer format (vocab.json +
merges.txt), which is incompatible with QWenTokenizer's qwen.tiktoken
format. Load qwen.tiktoken directly instead of using from_pretrained
(which conflicts with the HF tokenizer_config.json).

Add extra_special_tokens parameter to QWenTokenizer to register the
GOT-OCR2 tokens (<img>, </img>, <imgpad>, etc.) during construction
so they encode as single tokens matching the model's expected IDs.

Signed-off-by: Bvicii <yizhanhuang2002@gmail.com>
Fix PP-Chart2Table tokenizer loading for paddle_dynamic with safetensors
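The special-token behavior described above can be shown with a toy tokenizer (this is not QWenTokenizer's real API; the class, the reserved ID range, and the fallback encoding are invented for illustration): unregistered markup like `<img>` falls back to multiple ordinary pieces, while a token registered at construction encodes as a single ID.

```python
import re

class ToyTokenizer:
    """Toy stand-in for a tokenizer with registrable special tokens."""

    def __init__(self, extra_special_tokens=()):
        # Assign each special token one ID from an assumed reserved range.
        self.special = {tok: 100000 + i
                        for i, tok in enumerate(extra_special_tokens)}

    def encode(self, text):
        if not self.special:
            # Naive per-character fallback for ordinary text.
            return [ord(c) for c in text]
        # Split on registered special tokens first, keeping the separators.
        pattern = "(" + "|".join(re.escape(t) for t in self.special) + ")"
        ids = []
        for piece in re.split(pattern, text):
            if piece in self.special:
                ids.append(self.special[piece])  # one ID per special token
            else:
                ids.extend(ord(c) for c in piece)
        return ids

plain = ToyTokenizer()
got = ToyTokenizer(extra_special_tokens=["<img>", "</img>", "<imgpad>"])
print(len(plain.encode("<img>")))  # 5 pieces: one per character
print(len(got.encode("<img>")))    # 1 piece: a single special-token ID
```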

The `get_transpose_weight_keys()` method was missing two categories of
Linear layer weights that need HF-to-Paddle transpose:

1. `out_proj` — The reverse key conversion changes Paddle's `o_proj` to
   HF's `out_proj`, but `"o_proj"` is not a substring of `"out_proj"`,
   so self-attention output projection weights were skipped.

2. `enc_output` — The encoder output Linear layer was not matched by any
   entry in the t_layers list.

These are all 256x256 (square) weights, so the shape appeared correct but
the values were wrong, causing PP-DocLayoutV3 dynamic inference to produce
near-zero confidence scores.

Signed-off-by: Bvicii <yizhanhuang2002@gmail.com>
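The matching gap described above can be sketched with plain substring tests (the pattern lists and key names below are illustrative examples, not the real `t_layers` contents):

```python
# Suffix patterns whose Linear weights get an HF->Paddle transpose.
T_LAYERS_BEFORE = ["q_proj", "k_proj", "v_proj", "o_proj"]
T_LAYERS_AFTER = T_LAYERS_BEFORE + ["out_proj", "enc_output"]

def transpose_keys(keys, patterns):
    """Select state-dict keys whose weights must be transposed."""
    return {k for k in keys if any(p in k for p in patterns)}

keys = [
    "model.encoder.layers.0.self_attn.out_proj.weight",  # square 256x256
    "model.encoder.enc_output.weight",                   # square 256x256
    "model.decoder.layers.0.self_attn.q_proj.weight",
]

# "o_proj" is not a substring of "out_proj" (no "o" is followed by "_"),
# so the old pattern list silently skips the first two keys:
print(sorted(transpose_keys(keys, T_LAYERS_BEFORE)))  # only the q_proj key
print(sorted(transpose_keys(keys, T_LAYERS_AFTER)))   # all three keys
```

Because the skipped weights are square, transposing (or failing to transpose) them never raises a shape error, which is why the bug surfaced only as near-zero confidence scores rather than a load failure.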
PaddlePaddle's meshgrid defaults to "ij" indexing while the position
embedding concat order [sin_h, cos_h, sin_w, cos_w] assumes "xy"
indexing. This mismatch caused transposed position assignments in
the AIFI encoder, producing wrong self-attention patterns.

The fix swaps the meshgrid arguments to achieve "xy" semantics,
matching the HF transformers reference implementation. This fully
aligns paddle_dynamic output with paddle_static (scores match to
4 decimal places).

Signed-off-by: Bvicii <yizhanhuang2002@gmail.com>
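The indexing fix described above can be illustrated with NumPy (used here as a stand-in, on the assumption that paddle.meshgrid behaves like `indexing="ij"`): swapping both the meshgrid arguments and the outputs of an "ij" meshgrid reproduces "xy" semantics exactly.

```python
import numpy as np

h = np.arange(3, dtype=np.float32)  # grid rows (height positions)
w = np.arange(5, dtype=np.float32)  # grid cols (width positions)

# "xy" semantics assumed by the [sin_h, cos_h, sin_w, cos_w] concat order:
grid_w_xy, grid_h_xy = np.meshgrid(w, h, indexing="xy")

# An "ij"-only meshgrid yields the same grids once both the arguments
# and the outputs are swapped:
grid_h_ij, grid_w_ij = np.meshgrid(h, w, indexing="ij")

print(np.array_equal(grid_w_xy, grid_w_ij))  # True
print(np.array_equal(grid_h_xy, grid_h_ij))  # True
```

Without the swap, the height and width coordinates land transposed, which is exactly the wrong position assignment the commit describes in the AIFI encoder.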
Fix missing weight transpose keys for PP-DocLayoutV2/V3 safetensors loading
Rewrite the PaddleX RT-DETR model to use HF transformers-compatible
parameter naming (model.backbone.model.*, model.encoder.*, model.decoder.*)
following the PP-DocLayoutV3 pattern. This enables both paddle_dynamic and
HF transformers backends to load the same HF-format safetensors without
heavy runtime key conversion.

- Add _config_rt_detr.py with RTDETRConfig (model_type="rt_detr")
- Rewrite rt_detr.py with HF-compatible architecture: separate q/k/v
  projections, per-layer class_embed/bbox_embed, HF-style layer naming
- Use lightweight set_hf_state_dict via _apply_rt_detr_key_conversion
  (shared with PP-DocLayoutV2/V3) instead of heavy runtime conversion
- Remove rtdetrl_modules/ (PaddleDet-style modules, no longer needed)
- Convert safetensors on HF to HF format for all 4 models:
  RT-DETR-L_wired_table_cell_det, RT-DETR-L_wireless_table_cell_det,
  PP-DocBlockLayout, PP-DocLayout_plus-L

Signed-off-by: Bvicii <yizhanhuang2002@gmail.com>
Rewrite RT-DETR with HF-compatible naming for dual-backend support
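A lightweight key-conversion pass in the spirit of `_apply_rt_detr_key_conversion` can be sketched as a prefix-rewrite over the state dict. The rename rules below are invented examples for illustration, not the real mapping table.

```python
# Hypothetical prefix-rename rules: old Paddle-style prefix -> HF-style prefix.
RENAME_RULES = [
    ("backbone.", "model.backbone.model."),
    ("transformer.encoder.", "model.encoder."),
    ("transformer.decoder.", "model.decoder."),
]

def convert_keys(state_dict):
    """Rewrite each key's prefix per RENAME_RULES; unmatched keys pass
    through unchanged. Runs in one pass, with no tensor copies."""
    out = {}
    for key, value in state_dict.items():
        for old, new in RENAME_RULES:
            if key.startswith(old):
                key = new + key[len(old):]
                break
        out[key] = value
    return out
```

Because the rewrite touches only key strings, it stays cheap enough to run at load time, which is what allows the paddle_dynamic and HF transformers backends to share one safetensors file.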
@scyyh11 scyyh11 force-pushed the feat/transformers branch from a445bb5 to 8821e3a on April 7, 2026 22:31
Bobholamovic and others added 4 commits April 8, 2026 20:42
These two nn.Linear layers are defined but never used in forward().
The 2D position embedding is computed by rel_bias_module instead.
Their presence causes a misleading "newly initialized" warning on
every model load since no checkpoint contains these weights.

Signed-off-by: Bvicii <yizhanhuang2002@gmail.com>
Remove unused rel_pos_x_bias and rel_pos_y_bias from PPDocLayoutV2
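The warning mechanism behind this commit can be sketched as a set difference: a loader compares the model's parameter names against the checkpoint's keys and reports model-only names as "newly initialized". The key sets below are invented examples.

```python
def newly_initialized(model_keys, checkpoint_keys):
    """Parameter names present in the model but absent from the
    checkpoint; these are the ones a loader warns about."""
    return sorted(set(model_keys) - set(checkpoint_keys))

checkpoint = {"encoder.weight", "decoder.weight"}
model_keys = checkpoint | {"rel_pos_x_bias.weight", "rel_pos_y_bias.weight"}

print(newly_initialized(model_keys, checkpoint))
# ['rel_pos_x_bias.weight', 'rel_pos_y_bias.weight']
```

Deleting the unused layers shrinks the model's key set to match the checkpoint, so the difference (and the misleading warning) disappears on every load.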


3 participants