Skip to content

Conversation

@kdn36
Copy link
Collaborator

@kdn36 kdn36 commented Oct 1, 2025

fixes #22725

This PR introduces a lazy iterator over groups iter_groups_lazy(), which helps to avoid memory explosion when groups are overlapping (rolling).

The initial PR has the minimum code to fix the issue reported above. More work is needed: (i) add GroupsType::Idx implementation, (ii) provide an indexed parallel iterator for multi-threading , (iii) leverage the iterators in other expressions and/or implementation paths that may suffer from the same memory explosion pattern when aggregating from NotAggregated to AggregatedList on rolling groups.

Micro-benchmark results (/usr/bin/time -v on debug build on the MRE as documented in the issue, single-threaded).
Memory reduced from 1.6 GB to 132 MB.

  • Before: Maximum resident set size (kbytes): 1734688
  • After: Maximum resident set size (kbytes): 131732

For the original issue: from 11.8 GB to 0.36 GB.

Ping @coastalwhite

@kdn36 kdn36 changed the title fix: Optimize memory on rolling groups in ApplyExpr single input. fix: Optimize memory on rolling groups in ApplyExpr Oct 1, 2025
@github-actions github-actions bot added fix Bug fix python Related to Python Polars rust Related to Rust Polars and removed title needs formatting labels Oct 1, 2025
@kdn36 kdn36 added do not merge This pull requests should not be merged right now and removed do not merge This pull requests should not be merged right now labels Oct 1, 2025
@kdn36 kdn36 marked this pull request as draft October 1, 2025 12:35
@kdn36 kdn36 force-pushed the fix_22725_null_count_slow branch from c3c82fd to 7dd115f Compare October 1, 2025 13:02
@codecov
Copy link

codecov bot commented Oct 3, 2025

Codecov Report

❌ Patch coverage is 94.87179% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 81.80%. Comparing base (7173122) to head (ecc517d).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
crates/polars-expr/src/expressions/group_iter.rs 94.82% 3 Missing ⚠️
crates/polars-expr/src/expressions/apply.rs 95.00% 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main   #24709   +/-   ##
=======================================
  Coverage   81.79%   81.80%           
=======================================
  Files        1704     1704           
  Lines      235223   235290   +67     
  Branches     2997     2997           
=======================================
+ Hits       192410   192479   +69     
+ Misses      42048    42046    -2     
  Partials      765      765           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@kdn36 kdn36 force-pushed the fix_22725_null_count_slow branch from d5aa4b0 to d83135a Compare October 24, 2025 08:10
@kdn36 kdn36 force-pushed the fix_22725_null_count_slow branch from d83135a to ecc517d Compare October 24, 2025 08:25
@kdn36 kdn36 marked this pull request as ready for review October 24, 2025 08:49
@coastalwhite coastalwhite merged commit b2b4866 into pola-rs:main Oct 24, 2025
23 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

fix Bug fix python Related to Python Polars rust Related to Rust Polars

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Memory issue when counting null values in rolling context

2 participants