-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Explicit heap trimming #6022
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Explicit heap trimming #6022
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #6022 +/- ##
=======================================
Coverage 78.6% 78.6%
=======================================
Files 818 820 +2
Lines 68976 69032 +56
Branches 8267 8234 -33
=======================================
+ Hits 54190 54259 +69
+ Misses 14786 14773 -13
🚀 New features to boost your workflow:
|
| auto const statusBefore = readFile(statusPath); | ||
| report.rssBeforeKB = detail::parseVmRSSkB(statusBefore); | ||
|
|
||
| report.trimResult = ::malloc_trim(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest, instead of calling malloc_trim(0), which would try to trim up to the main heap's top boundary (leaving only a minimum amount, 1 page, which is usually 4 or 8 KB, link for other readers), we should instead call malloc_trim(m);, where m is the minimum amount of memory we expect we will need soon enough. It could range from few hundred MBs to few GBs. The description of this PR mentions a rate of about ~1.2 GB/h (on the higher side). So, I would suggest we keep m=~2.4GB for starters. The reports we are accumulating in this class can then help us fine tune it.
Why?
Memory allocation is an expensive operation in itself. In this case(after calling malloc_trim(0)) any new allocations would also require heap extension. This will be exceptionally expensive(by the order of 1000 times, involving user to kernel space context switch and then allocation of new pages by OS). We can avoid that by keeping few GBs in reserve.
The trim after NetworkOps -> SyncComplete could be one of the suboptimal calls, if we do any heavy operations after SyncComplete, requiring memory allocations exceeding 1 page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This note in the manual suggests that attempting to use the trim padding is wasted effort:
Only the main heap (using sbrk(2)) honors the pad argument; thread heaps do not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A malloc_trim(m) call will check all the arenas (main and thread) to see if chunks of memory can be released. It will use the padding arg to decide(keep upto) the chunk that can be freed from main heap top.
For thread heaps(sub-heaps), malloc_trim anyway can't free the sub-heap or part of it(hence neglect padding), if there's even a small block in use. If a sub-heap region becomes completely empty after the last call to free(), allocator will anyway return the memory to OS. So padding doesn't make much sense for thread heaps. Thread heaps are anyway self-managed. So, we don't need to optimise for them. If there's fragmentation in the sub-heaps, that will remain until an entire sub-heap is empty(which will then be released).
lmaisons
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The points where we trim look reasonable. Unless there's some obvious caveat I'm not aware of, I think the RSS reporting should use the self and statm proc features so as to minimize parsing / clerical errors.
|
|
||
| std::string const tagStr = tag.value_or("default"); | ||
| std::string const statusPath = | ||
| "/proc/" + std::to_string(cachedPid) + "/status"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we not just use /proc/self/...? Also, it seems we would need to do fewer parsing contortions if we read from /proc/self/statm instead of /proc/self/status
| auto const statusBefore = readFile(statusPath); | ||
| report.rssBeforeKB = detail::parseVmRSSkB(statusBefore); | ||
|
|
||
| report.trimResult = ::malloc_trim(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This note in the manual suggests that attempting to use the trim padding is wasted effort:
Only the main heap (using sbrk(2)) honors the pad argument; thread heaps do not.
High Level Overview of Change
operating mode after initial sync, when sync-related temporary
allocations have been freed.
Context of Change
Type of Change
.gitignore, formatting, dropping support for older tooling)API Impact
libxrplchange (any change that may affectlibxrplor dependents oflibxrpl)