Description
The CPU backend always loads. With auto-fallback set to false and SkipCheck() enabled, loading crashes and throws a native API exception. Forcing CUDA does not work.
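For reference, the forced-CUDA configuration that triggers the crash looks roughly like this. This is a minimal sketch, not the exact code from my project; the specific `NativeLibraryConfig` call chain, model path, and `GpuLayerCount` value are assumptions:

```csharp
using LLama;
using LLama.Common;
using LLama.Native;

// Force the CUDA backend: disable CPU auto-fallback and skip the feature check.
// With 0.24.0 this configuration throws a native API exception on load.
NativeLibraryConfig.All
    .WithCuda()
    .WithAutoFallback(false)
    .SkipCheck();

var parameters = new ModelParams(@"models\qwen3.gguf") // hypothetical path
{
    GpuLayerCount = 35 // offload layers to the RTX 3090 (illustrative value)
};

// On 0.23.0 this loads with backend_ptrs.size() = 2 (CUDA + CPU);
// on 0.24.0 only one backend is reported and CUDA is never used.
using var weights = LLamaWeights.LoadFromFile(parameters);
```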
Temporary solution:
Use the NextCoder model instead of my Qwen3 model, with LLamaSharp version 0.23.0 and the CUDA backend.
Llama then loads and shows backend_ptrs.size() = 2; with 0.24.0 it shows 1.
The problem comes from 0.24.0 specifically.
Strange part:
I have 0.24.0 with CUDA working in a MAUI application that is available on the Microsoft Store, so I know 0.24.0 works for CUDA, but that app uses a custom library that does not seem to work for the console/API server application.
Changing the target platform from Any CPU to x64 is also required for the backend .dlls to be found; x86 does not work correctly.
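As a concrete example, pinning the platform in the project file avoids the Any CPU resolution problem. A hypothetical minimal .csproj fragment:

```xml
<PropertyGroup>
  <TargetFramework>net9.0</TargetFramework>
  <!-- Backend .dlls are only found when targeting x64; Any CPU and x86 fail -->
  <PlatformTarget>x64</PlatformTarget>
</PropertyGroup>
```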
Reproduction Steps
1. Create a .NET 9 console application or API server and reference the latest CUDA backend.
2. Attempt to use a Qwen3 model with the latest LLamaSharp CUDA backend.
3. Load the weights and check which backend they are assigned to.
Environment & Configuration
- Operating system: Windows 11
- .NET runtime version: .NET 9
- LLamaSharp version: 0.24.0
- CUDA version (if you are using cuda backend): 12.6 & 12.9
- CPU & GPU device: RTX 3090
Known Workarounds
No response