
Conversation

@sungeunk (Contributor) commented Oct 14, 2025

Description of the issue (symptom, root cause, how it was resolved)

  • The onednn 3d conv post-op mem_desc needs to be canonicalized to 4d when the conv output format is blocked (a hedged sketch of the idea follows).
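
A minimal sketch of the canonicalization idea, assuming a plain dims-based descriptor; the function name, the dummy-dim position, and the choice of nChw16c are illustrative only, not the actual code in program_node.cpp:

#include <oneapi/dnnl/dnnl.hpp>

// Illustrative sketch: pad a 3d post-op tensor {N, C, W} to 4d {N, C, 1, W}
// so that a 4d blocked channel format (e.g. nChw16c) can describe it.
dnnl::memory::desc canonicalize_3d_post_op_desc(dnnl::memory::dims dims,
                                                dnnl::memory::data_type dt) {
    if (dims.size() == 3) {
        dims.insert(dims.begin() + 2, 1);  // {N, C, W} -> {N, C, 1, W}
    }
    return dnnl::memory::desc(dims, dt, dnnl::memory::format_tag::nChw16c);
}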

The code and line that caused this issue (if it is not changed directly)

  • src/plugins/intel_gpu/src/graph/program_node.cpp

Reproduction step and snapshot (if applicable. Do not attach for customer model)

  • reproduction step and model are attached in the ticket.
# Convert the IR: embedding_model.onnx -> FP32 -> INT8
$ ovc embedding_model.onnx --output_model model_FP32/embedding_model.xml --input "input[?,50,29]" --compress_to_fp16 False
$ python int8_quantization.py

# Run the test
$ python openvino_script.py --device GPU.1 --model ov_onnx_model/int8/model_INT8.xml --batch 1

Problematic graph

This issue does not depend on a specific graph pattern.

Checklist

  • Is it a proper fix? (not a workaround)
  • Did you include a test case for this fix, if necessary?
  • Did you review existing tests that could be extended to cover this scenario? Which test did you review?
    -- There is no existing test for this issue.

Tickets:

  • 174583

@sungeunk sungeunk added the category: GPU OpenVINO GPU plugin label Oct 14, 2025
@sungeunk sungeunk requested review from a team as code owners October 14, 2025 05:56
@p-durandin p-durandin added this to the 2025.4 milestone Oct 14, 2025
@sungeunk (Contributor, Author) commented Oct 14, 2025

[Fixed] The ov_gpu_unit_tests issue can be reproduced on a local machine (A770).

@jade-cho (Contributor) left a comment:

LGTM

@sungeunk (Contributor, Author) commented:

Passed the LLM daily test on BMG/A770/LNL.


dnnl::memory::desc in_scale_desc;
if (is_type<gemm>() || is_type<fully_connected>()) {
    in_scale_desc = onednn::layout_to_memory_desc(in_scale, onednn::get_default_data_format(in_scale));

A contributor commented:

Don't gemm and fc also need need_blocked when the output is blocked?

@sungeunk (Contributor, Author) replied:

It seems gemm/fc don't support the need_blocked flag; the fc test cases in the unit tests fail when it is set.

A contributor replied:

Sorry, but it is difficult to understand the logic. Let's discuss offline together with @jade-cho.

  • @jade-cho, is "need_blocked" a proper name? Maybe "allow_blocked" is the right name? Actually the behavior seems similar to !flatten :(
  • Why should we treat gemm and fc differently? If a test case fails, we may change the test case.
  • Should we maybe implement the logic within the function itself?

A contributor replied:

Discussed offline. We will clean this up after the initial PR is merged.

    auto mem_flag = cldnn::format::is_blocked(get_output_layout().format) ?
                        onednn::mem_flags::need_blocked : onednn::mem_flags::None;
    out_scale_desc = onednn::layout_to_memory_desc(out_scale, dnnl::memory::format_tag::undef, mem_flag);
}

A contributor commented:

By Sungeun:
step 1) Introduce a lambda function.

To be done by Jade (a hedged sketch of the proposed helper follows):
step 2) Introduce a new function: fused_op_layout_to_memory_desc(fused_op_layout, layer_output_layout, ...)
step 2) Rename: need_blocked --> respect_ov_layout
step 2) Remove the conditional handling for gemm and FC.
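
A minimal sketch of what the proposed step-2 helper could look like, assuming the rename to respect_ov_layout; the signature and body are illustrative, composed only of calls already shown in this thread, not the merged code:

// Hypothetical helper as proposed above; signature and body are a sketch.
dnnl::memory::desc fused_op_layout_to_memory_desc(const cldnn::layout& fused_op_layout,
                                                  const cldnn::layout& layer_output_layout) {
    // Respect the blocked OV layout when the layer output format is blocked.
    bool respect_ov_layout = cldnn::format::is_blocked(layer_output_layout.format);
    auto mem_flag = respect_ov_layout ? onednn::mem_flags::need_blocked
                                      : onednn::mem_flags::None;
    return onednn::layout_to_memory_desc(fused_op_layout,
                                         dnnl::memory::format_tag::undef,
                                         mem_flag);
}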


A contributor commented:

Random spot: could you add a test where a 1d conv is fused with quantize and a mismatch happens between the layout and the rank?

@isanghao (Contributor) commented:

No perf issue from the dGPU daily test.

@sungeunk sungeunk changed the title [GPU] set need_blocked to onednn::layout_to_memory_desc for blocked format [GPU] Canonicalize 3d shape to handle blocked format for onednn conv/deconv Oct 17, 2025
@sungeunk sungeunk requested review from isanghao and jade-cho October 17, 2025 10:59
@isanghao isanghao changed the title [GPU] Canonicalize 3d shape to handle blocked format for onednn conv/deconv [GPU] Canonicalize 3d shape for onednn conv/deconv post operations Oct 20, 2025
@isanghao (Contributor) left a comment:

LGTM

@isanghao isanghao added this pull request to the merge queue Oct 20, 2025
Merged via the queue into openvinotoolkit:master with commit e3a81e1 Oct 20, 2025
187 checks passed