Skip to content

Conversation

Liyixin95
Copy link
Contributor

reading a 1.9GB ipc stream file.

before:

________________________________________________________
Executed in    3.05 secs    fish           external
   usr time    2.01 secs  276.00 micros    2.01 secs
   sys time    1.04 secs  443.00 micros    1.04 secs

peek memory: 3980.86 MB

after:

________________________________________________________
Executed in    2.22 secs    fish           external
   usr time    1.48 secs  265.00 micros    1.48 secs
   sys time    0.74 secs  439.00 micros    0.74 secs

peek memory: 2086.41 MB

@github-actions github-actions bot added performance Performance issues or improvements python Related to Python Polars rust Related to Rust Polars labels Sep 30, 2025
Copy link

codecov bot commented Sep 30, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 81.84%. Comparing base (f544238) to head (2bca949).
⚠️ Report is 43 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #24671      +/-   ##
==========================================
+ Coverage   81.83%   81.84%   +0.01%     
==========================================
  Files        1688     1689       +1     
  Lines      229999   230491     +492     
  Branches     2974     2974              
==========================================
+ Hits       188219   188650     +431     
- Misses      41027    41088      +61     
  Partials      753      753              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

scratch,
);

let new_pos = reader.stream_position()?;
Copy link
Member

@orlp orlp Oct 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make the reborrow of reader explicit here? That is, pass &mut (&mut *reader).take(block_length as u64) instead? Right now this only works because reader.take automatically gets transformed to (&mut *reader).take. I was really confused how this compiled since take takes self and should consume the &mut R.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have change this to &mut (&mut *reader).take(block_length as u64). But I still think it's a little verbose...

.unwrap_or_else(VecDeque::new);
let mut buffers: VecDeque<arrow_format::ipc::BufferRef> = buffers.iter().collect();

// check that the sum of the sizes of all buffers is <= than the size of the file
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain why this check was removed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't easily get file_size after removing the intermediate vec. But remove this check will indeed cause wrong result when file_size and buffer_size not match rather then report an error. Maybe we can change all the subsequent read_to_end to read_excat?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't easily get file_size after removing the intermediate vec. But remove this check will indeed cause wrong result when file_size and buffer_size not match rather then report an error. Maybe we can change all the subsequent read_to_end to read_excat?

@orlp kindly pin in case you forget this thread.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Liyixin95 I think that should be the way forward then yes. I do want the length to be checked.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance Performance issues or improvements python Related to Python Polars rust Related to Rust Polars

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants