Insufficient shared memory available on the GPU when using python binding on A100 #511

@WanqiYuan

Hi there,

Thanks for releasing such amazing code. I tried to replace several MLPs in my code with FullyFusedMLP through the tcnn Python binding. I set the number of neurons to 64 and am running on an A100. I compiled the Python binding on the A100 following the instructions. However, I got this error:

FullyFusedMLP: insufficient shared memory available on the GPU. Reduce n_neurons or use CutlassMLP (better compatibility but slower) instead.

I would expect the A100 to have plenty of shared memory, and neither the number of layers nor the number of neurons in the FullyFusedMLP is large. Am I missing something when compiling?
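For reference, here is a minimal sketch of the setup described above. The helper names `make_network_config` and `describe_gpu` are hypothetical, and the layer count, activations, and input/output sizes are placeholder values rather than the ones from my actual code. The config keys follow the tiny-cuda-nn network documentation, and `describe_gpu` only reports what PyTorch sees on the device (an A100 should report compute capability 8.0). Whether a mismatch between that value and the architecture the binding was compiled for explains the error is an assumption, not something confirmed here.

```python
import json

# Hidden-layer widths that the tiny-cuda-nn docs list as supported by FullyFusedMLP.
FULLY_FUSED_WIDTHS = (16, 32, 64, 128)


def make_network_config(n_neurons=64, n_hidden_layers=2, fully_fused=True):
    """Build a tiny-cuda-nn network config dict (hypothetical helper).

    Falls back to CutlassMLP when fully_fused is False or when the width
    is not one that FullyFusedMLP supports.
    """
    use_fused = fully_fused and n_neurons in FULLY_FUSED_WIDTHS
    return {
        "otype": "FullyFusedMLP" if use_fused else "CutlassMLP",
        "activation": "ReLU",          # placeholder choice
        "output_activation": "None",   # placeholder choice
        "n_neurons": n_neurons,
        "n_hidden_layers": n_hidden_layers,
    }


def describe_gpu():
    """Return (device name, compute capability) as seen by PyTorch, or None."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_name(0), torch.cuda.get_device_capability(0)


def build_network(n_input_dims, n_output_dims, fully_fused=True):
    """Create the network; needs a CUDA GPU and the compiled tcnn binding."""
    import tinycudann as tcnn
    config = make_network_config(n_neurons=64, fully_fused=fully_fused)
    return tcnn.Network(n_input_dims, n_output_dims, config)


if __name__ == "__main__":
    print(json.dumps(make_network_config(64), indent=2))
    print("GPU seen by PyTorch:", describe_gpu())
```

Calling `build_network(3, 16, fully_fused=False)` switches to the CutlassMLP path that the error message suggests, which can help tell a build problem apart from a problem in the surrounding code.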

Thanks!
