Cache Fsymbol and Rsymbols on disk #32

lkdvos · 2025-07-06T00:05:33Z

This PR updates the Fsymbols and Rsymbols to be stored on disk as well.
The main motivation is that the CGCs typically take up far more space than the actual Fsymbols and Rsymbols, so it may be reasonable to have a single machine precompute a large set of them and then skip copying the (possibly terabytes of) CGCs.

For reference, I currently have:

CGC disk cache info:
====================
* SU(3) - Float64 - 3439 entries - 203.759 MiB
* SU(4) - Float64 - 10833 entries - 51.241 GiB
* SU(5) - Float64 - 1680 entries - 1.004 GiB

F disk cache info:
==================
* SU(3) - 22079 files - 424083 entries - 101.769 MiB
* SU(4) - 16468 files - 130003 entries - 37.697 MiB
* SU(5) - 5848 files - 18989 entries - 7.093 MiB

R disk cache info:
==================
* SU(3) - 116 files - 512 entries - 146.276 KiB
* SU(4) - 141 files - 488 entries - 159.426 KiB
* SU(5) - 149 files - 542 entries - 173.374 KiB

lkdvos · 2025-07-06T00:09:36Z

@LHerviou How do you feel about these changes? I am mostly wondering how to organize the files, in the sense that it might be reasonable to simply store all Fs in one file and all Rs in another.
For the CGCs, I didn't feel comfortable doing that since those files get so large, but if our assumptions are correct then this shouldn't be the case here.
The only drawback I can think of is that this doesn't allow parallel read/write access, but I'm not sure if that was ever really an issue.
Additionally, if something goes wrong and the file gets corrupted mid-write, that would also be quite unpleasant...
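
One common way to limit the mid-write corruption risk with a single file is to write to a temporary file in the same directory and then rename it over the target. The sketch below is a language-neutral illustration in Python, not the code in this PR; `atomic_dump`, `load`, and the pickle format are hypothetical choices made only for the example.

```python
import os
import pickle
import tempfile


def atomic_dump(obj, path):
    """Write obj to path so that readers never see a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives next to the target so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as handle:
            pickle.dump(obj, handle)
            handle.flush()
            os.fsync(handle.fileno())  # push the bytes to disk before renaming
        # Replace the old file in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        # On any failure, drop the partial temporary file and keep the old cache.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load(path):
    """Read back an object written by atomic_dump."""
    with open(path, "rb") as handle:
        return pickle.load(handle)
```

With this pattern a crash during the write leaves the previous file intact. It does not by itself make concurrent writers safe: two processes rewriting the same file can still overwrite each other's new entries.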

codecov · 2025-07-06T00:11:58Z

Codecov Report

Attention: Patch coverage is 73.85621% with 40 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/caching.jl | 71.22% | 40 Missing ⚠️ |

| Files with missing lines | Coverage Δ | |
|---|---|---|
| src/sector.jl | `98.50% <100.00%> (ø)` | |
| src/caching.jl | `76.95% <71.22%> (-9.22%)` | ⬇️ |


LHerviou · 2025-07-07T12:36:05Z

@lkdvos Thanks for such a quick implementation.

I think a single file should be fine: read/write access should be much faster. It might be worth considering whether separate F_N_abc and R_N_ab files would be a better alternative, though that significantly increases the number of files. I am indeed worried about corruption on clusters, where instabilities can occur.

One or two quick checks/questions for parallel runs:

* Does the code check the storage file every time an unknown F is requested?
* The CGCs do stay in memory during a single Julia run, and are just not saved to disk?

lkdvos · 2025-07-07T14:59:25Z

Just to be clear: the current implementation uses separate files R_N_ab and F_N_abcd.
Do I understand correctly that you think it is better to just have a single file R_N and F_N?

Whenever an F or R or CGC is called, the callstack goes:

1. check if we have this in memory
2. check if we have this in file
3. compute and store

where computing F and R asks for the CGC, which will again go through the same process.
So this implementation is definitely storing everything on disk; it is mainly meant as a first step towards having a precomputed set of F and R, such that the CGCs are no longer really needed.
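
The lookup order described above can be sketched as follows. This is a minimal illustration in Python, not the Julia code in `src/caching.jl`; `TieredCache`, its `use_disk` flag, the one-file-per-key layout, and the pickle format are all assumptions made for the example.

```python
import os
import pickle


class TieredCache:
    """Look up a value in memory, then on disk, then compute and store it."""

    def __init__(self, directory, compute, use_disk=True):
        self.directory = directory
        self.compute = compute    # function called on a full cache miss
        self.use_disk = use_disk  # hypothetical on/off switch for persistence
        self.memory = {}
        if use_disk:
            os.makedirs(directory, exist_ok=True)

    def _path(self, key):
        # One file per key, e.g. the key (1, 2, 3) maps to "1_2_3.pkl".
        name = "_".join(str(part) for part in key)
        return os.path.join(self.directory, name + ".pkl")

    def get(self, key):
        # 1. check if we have this in memory
        if key in self.memory:
            return self.memory[key]
        path = self._path(key)
        if self.use_disk and os.path.exists(path):
            # 2. check if we have this in file
            with open(path, "rb") as handle:
                value = pickle.load(handle)
        else:
            # 3. compute and store
            value = self.compute(key)
            if self.use_disk:
                with open(path, "wb") as handle:
                    pickle.dump(value, handle)
        self.memory[key] = value
        return value
```

In a nested setup, the `compute` function for F or R would itself call `get` on a CGC cache, so a cached F or R never touches the CGCs at all.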

I'll add a switch for turning disk storage on and off as well, so you can just select which ones you want to store persistently?

lkdvos added 2 commits July 5, 2025 19:16

Add F_CACHE

ed3b957

add R_CACHE

6236657

Merge branch 'main' into Fcache

420ce0f
