
Conversation

cmhamel (Contributor) commented Mar 13, 2025

No description provided.

cmhamel (Contributor, Author) commented Mar 13, 2025

@lxmota I'm excited to say this PR has the first full example of a GPU solve using this framework.

Take a look at test/poisson/TestPoissonCUDA.jl. It's really simple, I know, but it's a baseline for building out a more general GPU assembler for coupled problems. There's one more PR I need to put together to fix this on problems that have more than one block; it's an easy fix, though.
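
For context, the end-to-end pattern is roughly the following (a sketch, not the actual contents of TestPoissonCUDA.jl; using Krylov.jl as the solver here is my stand-in assumption):

```julia
using CUDA, CUDA.CUSPARSE, Krylov, LinearAlgebra, SparseArrays

# Assemble a small SPD system on the host (stand-in for the Poisson assembly).
n = 1_000
K = sprand(n, n, 5 / n)
K = K + K' + 2n * I          # symmetrize and make diagonally dominant
f = rand(n)

# Move the sparse operator and right-hand side to the device...
K_gpu = CuSparseMatrixCSC(K)
f_gpu = CuVector(f)

# ...and solve entirely on the GPU with a Krylov method.
u_gpu, stats = cg(K_gpu, f_gpu)
```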

lxmota commented Mar 13, 2025

Awesome! I'll definitely take a look. Also, we should meet to discuss further developments along these lines in terms of possible impactful applications, and I mean beyond what we have been talking about so far.

cmhamel merged commit dfdc74e into main Mar 14, 2025
3 checks passed
cmhamel deleted the assemblers-gpu branch March 14, 2025 01:39
lxmota commented Mar 20, 2025

Question regarding ROCm support. Given that the implementation uses KernelAbstractions, I wonder how complex it would be to extend compatibility for AMD GPUs via AMDGPU.jl. Since AMDGPU.jl is also built on KernelAbstractions, the transition might be relatively straightforward, though I'm aware that there could be nuances in terms of performance tuning, ROCm-specific libraries, or platform quirks.

I'm particularly interested in this because I have an AMD GPU, and being able to develop on it would motivate me to get more involved with the project, since I'm good at finding excuses not to do it. If the conversion seems straightforward, I'm happy to explore this further. If it's likely to require substantial effort, we can set it aside for now and revisit when the timing makes more sense.
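
For reference, this is the kind of backend-agnostic kernel KernelAbstractions makes possible (a toy example, not code from this repo):

```julia
using KernelAbstractions

# The same kernel body compiles for any KernelAbstractions backend.
@kernel function axpy_kernel!(y, a, @Const(x))
    i = @index(Global)
    @inbounds y[i] += a * x[i]
end

backend = CPU()                      # swap in CUDABackend() or ROCBackend()
x = ones(1024); y = zeros(1024)      # would be CuArray / ROCArray on a GPU
kernel! = axpy_kernel!(backend, 256) # 256 is the workgroup size
kernel!(y, 2.0, x; ndrange = length(y))
KernelAbstractions.synchronize(backend)
```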

cmhamel (Contributor, Author) commented Mar 20, 2025

> Question regarding ROCm support. […]

The main difference will be in the sparse matrix constructors.

Currently I have a package extension for CUDA that specializes the assembler for CUDA CSC sparse matrix types. Other storage formats such as CSR are also possible.

I don't have an AMD GPU right now, so I probably can't help much here, but I'm happy to chat sometime next week when I'm back in the office.
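
Roughly, the extension pattern looks like this (illustrative names, not the actual assembler API in this repo):

```julia
using SparseArrays

# Core package: generic scatter-add into the nonzero storage of a host matrix.
add_into_nonzeros!(K::SparseMatrixCSC, vals, idx) = (K.nzval[idx] .+= vals; K)

# A CUDA package extension (loaded only when CUDA.jl is present) then adds a
# method for the device CSC type; note the capitalized field name on the
# CUSPARSE side:
#
#   using CUDA, CUDA.CUSPARSE
#   add_into_nonzeros!(K::CuSparseMatrixCSC, vals, idx) =
#       (K.nzVal[idx] .+= vals; K)

# Host-side demo of the generic method.
K = spdiagm(0 => ones(10))
add_into_nonzeros!(K, fill(0.5, 3), 1:3)
```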

lxmota commented Mar 20, 2025

Yes, I just looked and found that AMDGPU.jl supports both CSC and CSR sparse formats, among others. I'm willing to try the conversion, since that's the GPU I have at home. If all else fails, I can dust off an old NVIDIA card I have somewhere. Yes, let's chat next week.
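
Assuming the rocSPARSE wrappers mirror CUSPARSE (that's my reading of AMDGPU.jl, so treat the exact names as unverified), the device constructors would look like:

```julia
using AMDGPU, AMDGPU.rocSPARSE, SparseArrays

A = sprand(Float32, 100, 100, 0.05)
A_csc = ROCSparseMatrixCSC(A)  # CSC, analogous to CUDA.CUSPARSE.CuSparseMatrixCSC
A_csr = ROCSparseMatrixCSR(A)  # CSR alternative
```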
