
# Device Auto-Selection

## Table of Contents

1. Introduction
2. Device Selection Algorithm
3. Configuration Options
4. Error Handling and Fallback Behavior
5. CUDA and Metal Initialization
6. Practical Examples
7. Architecture Overview
8. Troubleshooting Guide

## Introduction

The device auto-selection feature enables the application to automatically choose the optimal compute device for model inference based on hardware availability and user preferences. This system prioritizes GPU acceleration through CUDA (NVIDIA) and Metal (Apple) frameworks, falling back to CPU execution when specialized hardware is unavailable or fails to initialize. The implementation provides both automatic detection and explicit configuration options, allowing users to control where their models run.

**Section sources**
- device.rs
- types.rs

## Device Selection Algorithm

The device auto-selection algorithm follows a hierarchical priority system that attempts to leverage the most performant available hardware. The selection process is implemented in the select_device function and operates according to a specific precedence order.
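In practice, a caller passes an optional preference and gets back a ready-to-use candle device. A minimal usage sketch (`select_device` and `DevicePreference` come from the project's device.rs and types.rs; the `candle_core` crate name is an assumption and may be aliased as `candle` in the project):

```rust
use candle_core::Device; // crate name assumed; the project may alias it as `candle`

// Ask for automatic selection and report what was actually chosen.
fn pick_and_report() -> Device {
    let device = select_device(Some(DevicePreference::Auto));
    match &device {
        Device::Cpu => println!("running on CPU"),
        Device::Cuda(_) => println!("running on CUDA"),
        Device::Metal(_) => println!("running on Metal"),
    }
    device
}
```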

```mermaid
flowchart TD
Start([Device Selection]) --> CheckPreference["Check User Preference"]
CheckPreference --> AutoMode{"Preference = Auto?"}
AutoMode --> |Yes| CheckCUDA["Check CUDA Availability"]
CheckCUDA --> CUDAAvailable{"CUDA Feature Enabled?"}
CUDAAvailable --> |Yes| TryCUDA["Attempt CUDA Device Creation"]
TryCUDA --> CUDAInit{"CUDA Init Success?"}
CUDAInit --> |Yes| ReturnCUDA["Return CUDA Device"]
CUDAInit --> |No| LogCUDAError["Log CUDA Error"]
LogCUDAError --> CheckMetal["Check Metal Availability"]
CUDAAvailable --> |No| CheckMetal
CheckMetal --> MetalAvailable{"Metal Feature Enabled?"}
MetalAvailable --> |Yes| TryMetal["Attempt Metal Device Creation"]
TryMetal --> MetalInit{"Metal Init Success?"}
MetalInit --> |Yes| ReturnMetal["Return Metal Device"]
MetalInit --> |No| LogMetalError["Log Metal Error"]
LogMetalError --> ReturnCPU["Return CPU Device"]
MetalAvailable --> |No| ReturnCPU
AutoMode --> |No| CheckSpecific["Check Specific Preference"]
CheckSpecific --> |CPU| ReturnCPU
CheckSpecific --> |CUDA| TrySpecificCUDA["Attempt Specific CUDA Device"]
TrySpecificCUDA --> CUDAError{"Init Failed?"}
CUDAError --> |Yes| LogSpecificCUDAError["Log Error, Return CPU"]
CUDAError --> |No| ReturnSpecificCUDA
CheckSpecific --> |Metal| TrySpecificMetal["Attempt Metal Device Creation"]
TrySpecificMetal --> MetalError{"Init Failed?"}
MetalError --> |Yes| LogSpecificMetalError["Log Error, Return CPU"]
MetalError --> |No| ReturnSpecificMetal
ReturnCUDA --> End([Selected Device])
ReturnMetal --> End
ReturnCPU --> End
ReturnSpecificCUDA --> End
ReturnSpecificMetal --> End
```


**Diagram sources**
- [device.rs](file://src-tauri/src/core/device.rs#L4-L65)

**Section sources**
- [device.rs](file://src-tauri/src/core/device.rs#L4-L65)

## Configuration Options
The device selection system supports multiple configuration options through the `DevicePreference` enum, which defines the possible device selection strategies. Users can specify their preferred execution environment, or allow the system to automatically determine the optimal device.

```mermaid
classDiagram
class DevicePreference {
+Auto
+Cpu
+Cuda(index : usize)
+Metal
}
class Device {
+Cpu
+Cuda(CudaDevice)
+Metal(MetalDevice)
}
class LoadRequest {
+Gguf
+HubGguf
+HubSafetensors
}
DevicePreference --> LoadRequest : "used in"
DevicePreference --> Device : "maps to"
```

**Diagram sources**
- types.rs
- device.rs

The available configuration options are:

- `Auto`: allows the system to automatically select the best available device, following the CUDA → Metal → CPU priority order
- `Cpu`: forces execution on the CPU regardless of GPU availability
- `Cuda { index }`: specifies a particular CUDA device by index (typically 0 for single-GPU systems)
- `Metal`: specifies Apple's Metal framework for GPU acceleration

These preferences can be set in various contexts, including model loading requests:

```rust
#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(tag = "format", rename_all = "lowercase")]
pub enum LoadRequest {
    Gguf {
        model_path: String,
        tokenizer_path: Option<String>,
        context_length: usize,
        device: Option<DevicePreference>,
    },
    HubGguf {
        repo_id: String,
        revision: Option<String>,
        filename: String,
        context_length: usize,
        device: Option<DevicePreference>,
    },
    HubSafetensors {
        repo_id: String,
        revision: Option<String>,
        context_length: usize,
        device: Option<DevicePreference>,
    },
}
```
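For illustration, a request that pins the first CUDA device could be constructed like this (the path and context length are placeholders, not project defaults):

```rust
// Hypothetical values, shown only to illustrate the shape of the request.
let request = LoadRequest::Gguf {
    model_path: "models/example.Q4_K_M.gguf".to_string(),
    tokenizer_path: None,
    context_length: 4096,
    device: Some(DevicePreference::Cuda { index: 0 }),
};
```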

**Section sources**
- types.rs

## Error Handling and Fallback Behavior

The device selection system implements comprehensive error handling with graceful fallback mechanisms to ensure application stability even when preferred devices cannot be initialized.

When a device fails to initialize, the system follows these fallback rules:

  1. For Auto preference: Attempts CUDA → Metal → CPU in sequence, with each step only proceeding if the previous failed
  2. For explicit device preferences: Attempts the specified device, falling back to CPU on failure
  3. For all cases: CPU serves as the final fallback option, ensuring the application remains functional

The error handling is implemented with detailed logging to aid troubleshooting:

```rust
// In auto-selection mode
if cuda_is_available() {
    match Device::new_cuda(0) {
        Ok(device) => {
            println!("[device] auto-selected CUDA");
            return device;
        }
        Err(e) => {
            eprintln!("[device] CUDA init failed: {}, falling back to next option", e);
        }
    }
}

// For explicit CUDA selection
DevicePreference::Cuda { index } => {
    match Device::new_cuda(index) {
        Ok(device) => device,
        Err(e) => {
            eprintln!("[device] CUDA init failed: {}, falling back to CPU", e);
            Device::Cpu
        }
    }
}
```

The system also includes safety checks through compile-time feature flags, ensuring that CUDA and Metal initialization are only attempted when the corresponding features are enabled during compilation.
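A minimal sketch of what that gating can look like, assuming the features are named `cuda` and `metal` and that candle's `utils` helpers are in scope (the actual structure of device.rs may differ):

```rust
use candle_core::utils::{cuda_is_available, metal_is_available};
use candle_core::Device;

// Branches guarded by #[cfg] are compiled only when the matching Cargo
// feature is enabled, so a CPU-only build never touches the GPU paths.
fn auto_select() -> Device {
    #[cfg(feature = "cuda")]
    {
        if cuda_is_available() {
            match Device::new_cuda(0) {
                Ok(device) => {
                    println!("[device] auto-selected CUDA");
                    return device;
                }
                Err(e) => eprintln!("[device] CUDA init failed: {e}, trying next option"),
            }
        }
    }

    #[cfg(feature = "metal")]
    {
        if metal_is_available() {
            match Device::new_metal(0) {
                Ok(device) => {
                    println!("[device] auto-selected Metal");
                    return device;
                }
                Err(e) => eprintln!("[device] Metal init failed: {e}, falling back to CPU"),
            }
        }
    }

    println!("[device] auto-selected CPU");
    Device::Cpu
}
```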

**Section sources**
- device.rs

## CUDA and Metal Initialization

The initialization process for CUDA and Metal devices involves specific requirements and potential failure points that users should understand for effective troubleshooting.

### CUDA Initialization

CUDA device creation is handled by the `Device::new_cuda(ordinal)` method in the candle-core library. Key requirements include:

- An NVIDIA GPU with a compatible compute capability
- Properly installed CUDA drivers and toolkit
- The `cuda` feature enabled during compilation
- Sufficient GPU memory for model loading

```mermaid
sequenceDiagram
participant User as "Application"
participant Select as "select_device()"
participant Candle as "candle::Device"
participant CUDA as "CUDA Runtime"
User->>Select : select_device(Some(Auto))
Select->>Candle : new_cuda(0)
Candle->>CUDA : Initialize Context
CUDA-->>Candle : Success/Failure
alt Initialization Success
    Candle-->>Select : CUDA Device
    Select-->>User : CUDA Device
else Initialization Failure
    Candle-->>Select : Error
    Select->>Candle : new_metal(0)
    Note over Select : Attempt Metal fallback
end
```


**Diagram sources**
- [device.rs](file://src-tauri/src/core/device.rs#L15-L36)
- [device.rs](file://example/candle/candle-core/src/device.rs#L233-L253)

### Metal Initialization
Metal device creation is handled by the `Device::new_metal(ordinal)` method. Requirements include:

- Apple device with Metal-capable GPU (macOS, iOS, or iPadOS)
- The `metal` feature enabled during compilation
- Adequate GPU memory for the model being loaded

The Metal backend uses Metal's memory management system with private storage mode for optimal performance, requiring careful buffer management to prevent memory leaks.
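An explicit Metal request mirrors the explicit-CUDA branch shown earlier. A sketch (the log wording is illustrative; the actual messages live in device.rs):

```rust
use candle_core::Device;

// Ordinal 0 selects the default Metal GPU on Apple hardware.
fn request_metal_or_cpu() -> Device {
    match Device::new_metal(0) {
        Ok(device) => device,
        Err(e) => {
            eprintln!("[device] Metal init failed: {e}, falling back to CPU");
            Device::Cpu
        }
    }
}
```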

**Section sources**
- [device.rs](file://example/candle/candle-core/src/device.rs#L257-L257)
- [metal_backend/device.rs](file://example/candle/candle-core/src/metal_backend/device.rs#L1-L279)

## Practical Examples
The device auto-selection system is used throughout the application, from initial startup to model loading operations.

### Application Initialization
During application startup, the system automatically selects a device using the Auto preference:

```rust
#[cfg_attr(mobile, tauri::mobile_entry_point)]
pub fn run() {
    // Use auto-selection for the initial device instead of hardcoding CPU
    let initial_device = select_device(Some(DevicePreference::Auto));
    let shared: SharedState<Box<dyn ModelBackend + Send>> =
        Arc::new(Mutex::new(ModelState::new(initial_device)));

    tauri::Builder::default()
        .manage(shared)
        // ... other configuration
        .run(tauri::generate_context!())
        .expect("error while running tauri application");
}
```


### API Usage
The device selection can be controlled through API calls that accept device preferences:

```rust
// In API handlers
pub fn load_model(
    state: tauri::State<'_, SharedState<Box<dyn ModelBackend + Send>>>,
    load_request: LoadRequest,
) -> Result<(), String> {
    // The device preference comes from whichever load request variant was sent
    let preference = match &load_request {
        LoadRequest::Gguf { device, .. }
        | LoadRequest::HubGguf { device, .. }
        | LoadRequest::HubSafetensors { device, .. } => device.clone(),
    };
    let device = select_device(preference);
    // ... use device for model loading
    Ok(())
}
```


### Testing Scenarios
The system includes comprehensive tests to verify device selection behavior:

```rust
#[test]
fn test_auto_device_selection() {
    let device = select_device(Some(DevicePreference::Auto));

    // Verify a valid device is returned
    match device {
        Device::Cpu => assert!(true),
        Device::Cuda(_) => assert!(true),
        Device::Metal(_) => assert!(true),
    }
}
```

```rust
#[test]
fn test_explicit_cpu_selection() {
    let device = select_device(Some(DevicePreference::Cpu));
    match device {
        Device::Cpu => assert!(true),
        _ => panic!("Expected CPU device"),
    }
}
```
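A similar check could cover the explicit-CUDA path, which per the fallback rules should yield either a CUDA device or the CPU fallback (a sketch, not part of the existing test file):

```rust
#[test]
fn test_explicit_cuda_falls_back_gracefully() {
    let device = select_device(Some(DevicePreference::Cuda { index: 0 }));
    // Either CUDA initialized, or the documented CPU fallback kicked in.
    assert!(matches!(device, Device::Cuda(_) | Device::Cpu));
}
```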


**Section sources**
- [lib.rs](file://src-tauri/src/lib.rs#L29-L51)
- [device_selection.rs](file://src-tauri/tests/device_selection.rs#L1-L34)

## Architecture Overview
The device auto-selection system is integrated into the application's core architecture, providing a flexible and robust mechanism for compute device management.

```mermaid
graph TB
subgraph "Application Core"
Device[Device Selection]
State[Shared State]
Model[Model Backend]
end
subgraph "Candle Framework"
CandleDevice[candle::Device]
CUDA[candle::CudaDevice]
Metal[candle::MetalDevice]
CPU[candle::CpuDevice]
end
subgraph "Hardware"
NVIDIA[NVIDIA GPU]
Apple[Apple GPU]
CPUHardware[CPU]
end
Device --> |select_device| CandleDevice
CandleDevice --> CUDA
CandleDevice --> Metal
CandleDevice --> CPU
CUDA --> NVIDIA
Metal --> Apple
CPU --> CPUHardware
State --> Device
Model --> CandleDevice
style Device fill:#f9f,stroke:#333
style CandleDevice fill:#bbf,stroke:#333
```

**Diagram sources**
- device.rs (src-tauri)
- device.rs (candle-core)

The architecture follows a layered approach where the application core delegates device management to the candle framework, which provides unified abstractions for different compute backends. This design allows the application to remain agnostic to the specific hardware while still leveraging optimal performance characteristics of each device type.
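Because every backend is reached through `candle::Device`, model code can stay device-agnostic. A small sketch of the pattern (the tensor call is from candle-core; the function itself is illustrative):

```rust
use candle_core::{DType, Device, Result, Tensor};

// The same code runs on CPU, CUDA, or Metal; the device is just a parameter.
fn make_input(device: &Device) -> Result<Tensor> {
    Tensor::zeros((1, 8), DType::F32, device)
}
```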

**Section sources**
- device.rs (src-tauri)
- device.rs (candle-core)

## Troubleshooting Guide

This section provides guidance for diagnosing and resolving common issues with device auto-selection, particularly related to CUDA and Metal initialization.

### Common CUDA Issues

Problem: "CUDA init failed" error message Solutions:

  1. Verify NVIDIA GPU is detected by the system
  2. Check CUDA drivers are properly installed (run nvidia-smi)
  3. Ensure the GPU has sufficient memory for the model
  4. Confirm the application was compiled with CUDA support
  5. Check for driver compatibility issues

**Problem:** Application falls back to CPU despite a GPU being available

**Solutions:**

  1. Verify the cuda feature is enabled in the build configuration
  2. Check for CUDA runtime errors in the application logs
  3. Ensure no other processes are exclusively using the GPU
  4. Verify GPU compute capability is supported

### Common Metal Issues

Problem: "Metal init failed" error message Solutions:

  1. Verify the Apple device supports Metal (2012 or later models)
  2. Check macOS version compatibility
  3. Ensure the application was compiled with Metal support
  4. Verify sufficient GPU memory is available

**Problem:** Poor performance on Metal devices

**Solutions:**

  1. Check for memory pressure in Activity Monitor
  2. Reduce model size or batch processing requirements
  3. Ensure the latest macOS updates are installed
  4. Close other GPU-intensive applications

### General Troubleshooting Steps

  1. Check compilation features: Verify the application was built with the appropriate features (cuda, metal)
  2. Review logs: Examine the detailed error messages from device initialization attempts
  3. Test with explicit preferences: Try forcing CPU execution to isolate hardware issues
  4. Verify system requirements: Confirm hardware meets minimum requirements for GPU acceleration
  5. Update drivers/runtime: Ensure graphics drivers and system software are up to date

The system's fallback behavior ensures that even when GPU initialization fails, the application remains functional using CPU execution, providing a graceful degradation path.
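As a first diagnostic, candle's `utils` helpers can report which accelerated backends the current build knows about at all (a sketch; the `candle_core` crate name is an assumption and depends on how candle is imported in the project):

```rust
use candle_core::utils::{cuda_is_available, metal_is_available};

// Prints whether CUDA/Metal support is present in this build; `false` here
// usually means the corresponding Cargo feature was not enabled at compile time.
fn report_backends() {
    println!("CUDA support:  {}", cuda_is_available());
    println!("Metal support: {}", metal_is_available());
}
```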

**Section sources**
- device.rs (src-tauri)
- utils.rs
- device.rs (candle-core)

## Referenced Files in This Document

- device.rs (src-tauri)
- types.rs
- lib.rs
- device_selection.rs
- device.rs (candle-core)
- utils.rs
- cuda_backend/device.rs
- metal_backend/device.rs
