Performance regression on Iceberg Metadata queries on $files table #27001

@dotjdk

Description

It seems that #25677 has indeed improved the performance of queries on $files tables, but only for the first 1 to 2 executions after a cluster restart. The first 1-2 executions take less than a second each, but after that the same query consistently takes about 1 minute.

The cumulative user memory is 11-12 MB in the first execution but 1.72 GB in subsequent executions. In stage 0, CPU time is roughly the same, but blocked time goes from 3 minutes to 6 hours.

We can reproduce this behavior consistently after restarting the cluster. Downgrading to 476 eliminates the issue, and the query consistently completes in about 1.5 seconds every time.
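For anyone trying to confirm the pattern, a minimal timing sketch along these lines might help. It assumes a hypothetical `run_query` callable that issues the same `$files` query against a freshly restarted cluster (for example through a Trino client of your choice). The helper names and the 10x threshold are illustrative and not part of Trino.

```python
import time
from typing import Callable, List


def time_executions(run_query: Callable[[], object], runs: int = 5) -> List[float]:
    """Run the same query several times and return wall-clock seconds per run."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        run_query()  # hypothetical callable; would wrap the actual $files query
        durations.append(time.perf_counter() - start)
    return durations


def first_slow_run(durations: List[float], factor: float = 10.0) -> int:
    """Return the 1-based index of the first run slower than factor x the
    first run, or 0 if no run crosses that (assumed) threshold."""
    if not durations:
        return 0
    baseline = durations[0]
    for index, duration in enumerate(durations[1:], start=2):
        if duration > factor * baseline:
            return index
    return 0
```

With the timings described above (sub-second for the first one or two runs, then about a minute), `first_slow_run` would be expected to return 2 or 3 on the affected version and 0 after downgrading.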



Labels: iceberg (Iceberg connector)
