-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Add support for THnSparseD histogram in RDataFrame #19975
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for THnSparseD histogram in RDataFrame #19975
Conversation
This comment was marked as outdated.
This comment was marked as outdated.
Thanks @ferdymercury , I will try to follow your instructions. I noticed in my local tests that the THnSparseD and THnD too do not work with multithreading. Both seems to work fine with a single thread. |
See https://root-forum.cern.ch/t/filling-histograms-in-parallel/35460 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot! Feel free to click on the button "Ready to Review" so that core developers are notified.
I see, thank you for advice. |
I implemented all required changes. I have few comments/questions:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concerning the link to the html file, leave it empty for the moment.
Thanks.
One further comment: commit messages usually start with the involved component: [math] fix sth |
765b955
to
689ef1d
Compare
Ok, I added [RDF] or [NFC] to commits |
Thanks again! In
nitpick wrt commit message: [hist] is usually small compared to [RDF] RDataFrame |
689ef1d
to
8b05de0
Compare
Hi @jackapet, thank you very much for this contribution! It looks good to me at the first glance, so I will let the CI run and once we see those results I will also give you a more detailed review. Thanks to @ferdymercury for giving the first reviews and hints as well! |
Test Results 22 files 22 suites 3d 22h 45m 36s ⏱️ For more details on these failures, see this check. Results for commit 7848f3c. ♻️ This comment has been updated with latest results. |
Hi @jackapet, so unfortunately there is an issue with your changes at the ROOT building stage. Have you tried building ROOT locally with all these changes? If you go through the logs of the CI workers, for example, you can see the following:
This will need to be fixed first. |
Hello @martamaja10 , I am able to compile code without tests. I was not able to run or compile tests because I am not able to install required system packages. I am using windows with wsl2. ROOT compilation fails with RVec compilation:
I guess wsl2 does not have this variable defined. Is there some workaround for this? When running on cluster, I am able to compile the ROOT but I can't install libraries. So, I was able to check rdataframe in my local script but not in required tests. Could you provide instructions how to compile tests? Do you have a singularity/docker image for developers with all required libraries installed? I can access cvmfs if it helps. Otherwise, I will try to build my own image. |
when calling cmake, did you enable flag -Dtesting=ON -Droottest=ON ? (Building is enough, no need to install) |
The failure is likely because there is something equivalent to Line 189 in 50551c3
that needs to be implemented in THnSparse.h and .cxx |
Ok, thank you. I am trying to recompile it. I made an apptainer image with settings below. I hope all libraries are now installed.
|
I added the test into the dataframe_concurency.cxx. The helgrind gives no error:
|
I guess helgrind with root.exe looks fine too? @martamaja10 could you restart the CI ? Thanks |
Ah, sorry, I forgot on .exe. It does not look well with root.exe. It crashes. However, it does not seem to be related with THnSparseD.
|
Thanks. Let's the wait for a developer to start the CI. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
congrats @jackapet the CI is green
Hello, that's great that tests works now. I see that code formatting checks fails. Is there a way how I can fix it? |
If you click on the clang-format and go to the very bottom, they explain you can do this on the already checked out branch:
However, sometimes clang-format does too aggressive changes and it introduces too much noise in the PR, so I'd say let's wait to see if a developer requests this formatting change or not. |
Hi @jackapet. Thank you for your changes and congrats on having the green CI. (P.S. sorry for my slow replies, I was sick last week). Regarding clang-format - it will be needed. My suggestion is to first clean up the commits a bit using interactive rebase - for example, squash the commits of RDF tests into one commit and improve the commit from "Adding tests" to "Adding tests for THnSparse action in RDataFrame". Similarly for the documentation. For the THnSparse implementation in RDataFrame you could just have one commit (so for example: Adding THnSparseDModel, Adding chunk size and Adding interfaces to HistoNSparseD could become a single commit). In the commit message you can add a small description of what exactly you did. For now, leave the [hist] commits as they are. Once you have this sorted, you will have only a few commits. Then you can use command: |
aa6b81b
to
1e296c0
Compare
Hi @jackapet. Thanks for squashing the commits! There are some issues with the CI - not because of your PR but we have just changed to LLVM 20 as of this morning. Could you rebase your branch to the current master and push your commits again? Your local build after rebasing the master will also fail. What you can then do inside your build directory is:
and then build again. If this doesn't work you might need to do a clean build from scratch. Btw have you managed to do |
1e296c0
to
7c054f2
Compare
Hi @martamaja10 , I rebased the branch and I used clang to format the code. Could you rerun the CI again, please? Petr |
7c054f2
to
18dce68
Compare
@martamaja10 , I fixed also the ruff formatting |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello, thanks for the contribution!
I added some comments for your consideration.
Hello @hageboeck , thank you for your review and comments. I think I implemented everything except of moving constructor of THnT to THn.cxx. I think this is not possible because it is a template class. |
1432f62
to
e43dedc
Compare
That makes a lot of sense! |
0cbd924
to
7294f6f
Compare
Hi @hageboeck , @martamaja10 , I fixed clang formatting in the last force push. May I ask to execute the pipeline again? Petr |
Thanks. (Remaining test failures are not related to this PR) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @jackapet - thank you very much for your PR, well done - many users will be very happy! Thank you for great communication and engagement - we really appreciate that :)
@jackapet just one very last request before merging - could you please squash the commits (to end up just with one hist and one rdf commit) |
THnSparseD constructors consistent Copy constructor is inmplemented for THnSparseD Making THn and THnSparseD constructors consistent Adding test of THn and THnSparseD constructors
THnSparseD model is added; Interfaces for HistoNSparseD are added Adding tests for HistoNSparseD including concurrency
7294f6f
to
7848f3c
Compare
Done |
Hi all, thank you very much for your help with the PR. It was very interesting to learn how the development works in root:) |
This Pull request:
Add support for THnSparseD histograms in RDF
Changes or fixes:
THnSparse(const THnSparse&) = delete;
Checklist:
Fixes #19969