[TASK] Locate Performance Bottlenecks in DeepSeek R1 Decode Subgraph #604

@zhanghb97

Deliverables

An experimental report in this issue.

Task Description

  • Run the default f32 inference workflow of DeepSeek and identify the decode subgraph (build/examples/BuddyDeepSeekR1/subgraph0_decode.mlir).

  • Following the method used in BuddyNext, split the subgraph into separate MLIR files for detailed analysis.

  • Evaluate performance using AOT (ahead-of-time) execution.

  • Summarize the performance metrics of each component, for example:

[Image: example performance summary]
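As a starting point for the report, the timing and summary steps above could be sketched in Python as follows. This is only a minimal sketch under stated assumptions, not the BuddyNext method: `time_component`, `summarize`, and `format_report` are hypothetical helper names, `run` is assumed to be any callable that executes one AOT-built component once, and the numbers in the demo are placeholders rather than measurements.

```python
import statistics
import time
from typing import Callable, Dict, List, Tuple

Row = Tuple[str, float, float]  # (component name, time in ms, share of total in %)


def time_component(run: Callable[[], None], warmup: int = 2, repeats: int = 5) -> float:
    """Return the median wall-clock time of run() in milliseconds.

    `run` is a hypothetical wrapper around one AOT-compiled component.
    """
    for _ in range(warmup):
        run()  # warm-up runs are executed but not recorded
    samples: List[float] = []
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        samples.append((time.perf_counter() - start) * 1e3)
    return statistics.median(samples)


def summarize(timings_ms: Dict[str, float]) -> List[Row]:
    """Sort components by time (slowest first) and attach each one's share of the total."""
    total = sum(timings_ms.values())
    rows = [
        (name, ms, 100.0 * ms / total if total else 0.0)
        for name, ms in timings_ms.items()
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


def format_report(rows: List[Row]) -> str:
    """Render the summary as a fixed-width text table."""
    lines = [f"{'component':<20}{'time (ms)':>12}{'share (%)':>12}"]
    for name, ms, share in rows:
        lines.append(f"{name:<20}{ms:>12.3f}{share:>12.1f}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder values for illustration only; these are NOT measured results.
    placeholder = {"component_a": 1.0, "component_b": 3.0, "component_c": 4.0}
    print(format_report(summarize(placeholder)))
```

Sorting slowest-first puts the likely bottleneck at the top of the table, and using the median of several runs reduces the effect of one-off outliers.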

Timeline

2025.10.30 – 2025.10.31
Code Review: Begins 2025.11.1

If completed early, the review process may start ahead of schedule.
