Description
Occasionally, CUDA errors occur during parallel processing, and it is difficult to pin down exactly where they originate. I suspect they occur when multiple contexts are created or deleted at the same time. Could you guard these calls at a low level with a lock, so that only one thread creates or deletes a context at a time?
Here is where the discussion ended up:
ggml-org/llama.cpp#11804
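To illustrate the kind of locking being requested, here is a minimal sketch in C++. It is not the library's actual code: `Context`, `native_create_context`, `native_free_context`, `create_context_locked`, `free_context_locked`, and `run_demo` are all hypothetical names made up for this example, and the "native" functions are simple stand-ins for whatever calls really create and free a context. The idea is just that one process-wide mutex serializes both creation and deletion.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a native context handle (not a real library type).
struct Context { int id; };

// One process-wide mutex guarding both creation and deletion.
std::mutex g_context_mutex;

// Instrumentation only: tracks how many threads are inside a native call at once.
std::atomic<int> g_inside{0};
std::atomic<int> g_max_inside{0};

void note_entry() {
    int now = ++g_inside;
    int prev = g_max_inside.load();
    // Record the highest concurrency level ever observed.
    while (now > prev && !g_max_inside.compare_exchange_weak(prev, now)) {}
}

void note_exit() { --g_inside; }

// Hypothetical stand-ins for the unsynchronized native calls.
Context* native_create_context(int id) {
    note_entry();
    Context* ctx = new Context{id};
    note_exit();
    return ctx;
}

void native_free_context(Context* ctx) {
    note_entry();
    delete ctx;
    note_exit();
}

// Locked wrappers: only one thread at a time may create or free a context.
Context* create_context_locked(int id) {
    std::lock_guard<std::mutex> lock(g_context_mutex);
    return native_create_context(id);
}

void free_context_locked(Context* ctx) {
    std::lock_guard<std::mutex> lock(g_context_mutex);
    native_free_context(ctx);
}

// Spawns n_threads workers that repeatedly create and free contexts through
// the locked wrappers, and returns the maximum number of threads that were
// ever inside a native call simultaneously.
int run_demo(int n_threads, int n_iterations) {
    g_inside = 0;
    g_max_inside = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < n_threads; ++t) {
        workers.emplace_back([t, n_iterations] {
            for (int i = 0; i < n_iterations; ++i) {
                Context* ctx = create_context_locked(t);
                assert(ctx != nullptr);
                free_context_locked(ctx);
            }
        });
    }
    for (auto& w : workers) w.join();
    return g_max_inside.load();
}
```

With the wrappers in place, `run_demo(8, 200)` should never observe more than one thread inside a native call. Using the same mutex for creation and deletion is deliberate, since the suspicion is that the two operations also conflict with each other, not only with themselves.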