Conversation


@vlntb vlntb commented Nov 11, 2025

High Level Overview of Change

  • Introduces a MallocTrim helper in libxrpl to centralise calls to ::malloc_trim(0) on Linux/glibc and optionally record RSS before/after for debugging.
  • Wires mallocTrim into production paths:
    • Application::doSweep: call after sweeps, when we’ve just freed a meaningful amount of heap.
    • Online delete (SHAMapStore): call after clearCaches / prior-ledger cleanup, when online delete drops large chunks of in-memory SHAMap / ledger state.
    • NetworkOPs sync completion: call when transitioning to FULL operating mode after initial sync, when sync-related temporary allocations have been freed.
  • On non-Linux or non-glibc builds, the helper reports supported = false and is effectively a no-op.
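Based on the description above and the field names visible in the code snippets quoted later in this review (rssBeforeKB, trimResult), a minimal sketch of such a helper might look like the following; the exact API in the PR may differ, and the struct layout here is illustrative:

```cpp
#include <optional>

#if defined(__GLIBC__)
#include <malloc.h>
#endif

// Illustrative report shape; field names follow the snippets quoted in
// this review (rssBeforeKB, trimResult), the rest is an assumption.
struct TrimReport
{
    bool supported = false;           // false on non-Linux/non-glibc builds
    int trimResult = 0;               // ::malloc_trim return: 1 = memory released
    std::optional<long> rssBeforeKB;  // VmRSS before the trim, if readable
    std::optional<long> rssAfterKB;   // VmRSS after the trim, if readable
};

// Centralised entry point: trims on glibc, effectively a no-op elsewhere.
inline TrimReport
mallocTrim()
{
    TrimReport report;
#if defined(__GLIBC__)
    report.supported = true;
    // RSS sampling (e.g. from /proc) would go here; omitted for brevity.
    report.trimResult = ::malloc_trim(0);
#endif
    return report;
}
```

On non-glibc builds the preprocessor guard leaves the defaults in place, which matches the "supported = false and effectively a no-op" behaviour described above.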

Context of Change

  • Long-running nodes on Linux/glibc showed steady Resident growth under heavier ledger-state scenarios.
  • 24h baseline without malloc_trim:
    • Resident: ~14.0 GB → ~32.9 GB (+18.6 GB, ~0.76 GB/h).
    • Referenced: low teens → ~32.7 GB (~+29 GB, ~1.2 GB/h).
  • 24h run with malloc_trim from Application::doSweep + online delete:
    • Resident: ~18.3 GB → ~19.0 GB (+0.7 GB, ~0.03 GB/h).
    • Referenced: ~18.0 GB → ~18.8 GB (+0.8 GB, ~0.03 GB/h).
  • Net effect over 24h:
    • ~42% lower Resident/Referenced at the end of the run (~32.9 GB → ~19.0 GB).
    • Growth rate drops from ~0.76 GB/h → ~0.03 GB/h (Resident) and ~1.2 GB/h → ~0.03 GB/h (Referenced), i.e. ~25–30× slower accumulation.
  • Over a longer ~40h trimmed run, we see ~147 GB returned to the OS (~3.7–3.8 GB/h) while steady-state Resident lives in an 18–20 GB band and the Resident–Referenced gap stays small (~0.2 GB). This suggests:
    • malloc_trim is doing real work against fragmentation and short-lived churn.
    • The remaining slow drift is driven by genuine long-lived working set, which will need separate follow-up (caches, data-structure sizing, protocol state), but the production hooks in this PR already give a clear and measurable win.

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Refactor (non-breaking change that only restructures code)
  • Performance (increase or change in throughput and/or latency)
  • Tests (you added tests for code that already exists, or your new feature included in this PR)
  • Documentation update
  • Chore (no impact to binary, e.g. .gitignore, formatting, dropping support for older tooling)
  • Release

API Impact

  • Public API: New feature (new methods and/or new fields)
  • Public API: Breaking change (in general, breaking changes should only impact the next api_version)
  • libxrpl change (any change that may affect libxrpl or dependents of libxrpl)
  • Peer protocol change (must be backward compatible or bump the peer protocol version)

@vlntb vlntb marked this pull request as ready for review November 13, 2025 14:12
@vlntb vlntb requested a review from a team as a code owner November 13, 2025 14:12
codecov bot commented Nov 13, 2025

Codecov Report

❌ Patch coverage is 96.42857% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.6%. Comparing base (ad37461) to head (645fdda).

Files with missing lines Patch % Lines
src/libxrpl/basics/MallocTrim.cpp 97.8% 1 Missing ⚠️
src/xrpld/app/main/Application.cpp 0.0% 1 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##           develop   #6022   +/-   ##
=======================================
  Coverage     78.6%   78.6%           
=======================================
  Files          818     820    +2     
  Lines        68976   69032   +56     
  Branches      8267    8234   -33     
=======================================
+ Hits         54190   54259   +69     
+ Misses       14786   14773   -13     
Files with missing lines Coverage Δ
include/xrpl/basics/MallocTrim.h 100.0% <100.0%> (ø)
src/xrpld/app/misc/NetworkOPs.cpp 69.9% <100.0%> (+<0.1%) ⬆️
src/xrpld/app/misc/SHAMapStoreImp.cpp 76.1% <100.0%> (+0.7%) ⬆️
src/libxrpl/basics/MallocTrim.cpp 97.8% <97.8%> (ø)
src/xrpld/app/main/Application.cpp 68.5% <0.0%> (-0.1%) ⬇️

... and 5 files with indirect coverage changes



auto const statusBefore = readFile(statusPath);
report.rssBeforeKB = detail::parseVmRSSkB(statusBefore);

report.trimResult = ::malloc_trim(0);

@pratikmankawde pratikmankawde Nov 21, 2025

I would suggest that instead of calling malloc_trim(0), which tries to trim the heap down to the main heap's top boundary (leaving only a minimum amount, 1 page, which is usually 4 or 8 KB, link for other readers), we call malloc_trim(m), where m is the minimum amount of memory we expect to need again soon. It could range from a few hundred MBs to a few GBs. The description of this PR mentions a rate of about ~1.2 GB/h (on the higher side), so I would suggest we start with m = ~2.4 GB. The reports we are accumulating in this class can then help us fine-tune it.

Why?
Memory allocation is an expensive operation in itself. In this case (after calling malloc_trim(0)), any new allocation would also require heap extension, which is exceptionally expensive (on the order of 1000x slower, involving a user-to-kernel context switch and the allocation of new pages by the OS). We can avoid that by keeping a few GBs in reserve.

The trim after NetworkOps -> SyncComplete could be one of the suboptimal calls if we do any heavy operations after SyncComplete that require memory allocations exceeding 1 page.
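A sketch of that suggestion, with the reviewer's starter value of ~2.4 GB as the pad (the function name and constant are made up for this example, not part of the PR):

```cpp
#include <cstddef>

#if defined(__GLIBC__)
#include <malloc.h>
#endif

// Hypothetical pad (~2.4 GB): keep this much free heap in reserve so
// that allocation bursts right after a trim do not immediately force
// heap extension (sbrk/mmap plus kernel page faults).
constexpr std::size_t trimPadBytes = std::size_t{2400} * 1024 * 1024;

inline int
trimWithReserve()
{
#if defined(__GLIBC__)
    // malloc_trim(pad) leaves up to `pad` bytes of free space at the
    // top of the main heap instead of returning it all to the OS.
    // Returns 1 if some memory was released, 0 otherwise.
    return ::malloc_trim(trimPadBytes);
#else
    return 0;  // unsupported platform: nothing trimmed
#endif
}
```

The accumulated trim reports could then be compared across different pad values to tune the constant, as the comment proposes.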

This note in the manual suggests that attempting to use the trim padding is wasted effort:

Only the main heap (using sbrk(2)) honors the pad argument; thread heaps do not.

A malloc_trim(m) call will check all the arenas (main and thread) to see if chunks of memory can be released. It uses the padding arg to decide how much to keep (up to m) at the top of the main heap.

For thread heaps (sub-heaps), malloc_trim can't free the sub-heap or any part of it if there's even a small block in use (hence it ignores the padding there). If a sub-heap region becomes completely empty after the last call to free(), the allocator returns that memory to the OS anyway, so padding doesn't make much sense for thread heaps; they are self-managed, and we don't need to optimise for them. If there's fragmentation in the sub-heaps, it will remain until an entire sub-heap is empty (at which point it is released).
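As a small illustration of the main-heap path being discussed (the function name and allocation sizes are made up for the example), churning the main-thread heap and then trimming is the case where the pad argument actually applies:

```cpp
#include <cstdlib>
#include <vector>

#if defined(__GLIBC__)
#include <malloc.h>
#endif

// Allocate and free a burst of blocks on the main thread, then trim.
// Per the discussion above, the pad argument only constrains how much
// of the *main* heap's top is kept; fully empty thread sub-heaps are
// returned to the OS by the allocator on its own.
inline int
churnAndTrim()
{
    std::vector<void*> blocks;
    for (int i = 0; i < 2048; ++i)
        blocks.push_back(std::malloc(32 * 1024));  // 2048 x 32 KB = 64 MB
    for (void* b : blocks)
        std::free(b);
#if defined(__GLIBC__)
    return ::malloc_trim(0);  // 1 = memory released, 0 = nothing to release
#else
    return 0;
#endif
}
```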

@bthomee bthomee requested a review from lmaisons November 21, 2025 20:11
@lmaisons lmaisons left a comment

The points where we trim look reasonable. Unless there's some obvious caveat I'm not aware of, I think the RSS reporting should use /proc/self and the statm file so as to minimize parsing / clerical errors.


std::string const tagStr = tag.value_or("default");
std::string const statusPath =
"/proc/" + std::to_string(cachedPid) + "/status";

Could we not just use /proc/self/...? Also, it seems we would need to do fewer parsing contortions if we read from /proc/self/statm instead of /proc/self/status
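A sketch of that suggestion (the helper name is mine, not the PR's): /proc/self/statm holds a single line of space-separated page counts, so RSS is just the second field scaled by the page size, with no text parsing of VmRSS lines.

```cpp
#include <fstream>
#include <optional>
#include <unistd.h>

// Read resident set size in KB from /proc/self/statm (Linux only).
// The file's fields are page counts: size resident shared text lib data dt.
// Returns nullopt if the file is unavailable (e.g. non-Linux platforms).
inline std::optional<long>
rssKBFromStatm()
{
    std::ifstream statm("/proc/self/statm");
    long sizePages = 0;
    long rssPages = 0;
    if (!(statm >> sizePages >> rssPages))
        return std::nullopt;
    long const pageKB = ::sysconf(_SC_PAGESIZE) / 1024;
    return rssPages * pageKB;
}
```

Using /proc/self also avoids caching the pid, since the kernel resolves the calling process directly.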

@bthomee bthomee requested a review from g-ripple November 24, 2025 18:55