9. Model Architecture Detection
Changes Made
- Updated documentation to reflect the unified ModelFactory singleton pattern for architecture detection
- Added integration details for GGUF and safetensors model loading APIs
- Enhanced section sources to include newly integrated API files
- Updated diagram sources to reflect actual code structure and file mappings
- Maintained consistency with the enhanced source tracking system
- Introduction
- Architecture Detection System
- Supported Architectures
- Detection Heuristics
- Implementation Details
- Integration Patterns
- Troubleshooting Guide
## Introduction
The Model Architecture Detection system is a core component of the Oxide-Lab application. It identifies a machine learning model's architecture from its metadata before initialization, so that the application can load and execute the model with the matching implementation. Detection covers the GGUF and safetensors formats and several popular LLM architectures.
Section sources
- src-tauri/tests/architecture_detection.rs
- src-tauri/src/models/registry.rs
## Architecture Detection System
The architecture detection system identifies model types from metadata extracted from model files. It asks each registered builder in turn whether it recognizes the metadata and returns the first match, or None if no builder matches. The system is extensible: a new architecture is added by registering another builder. Detection for both GGUF and safetensors formats goes through a single global ModelFactory singleton.
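The first-match loop described here can be sketched in a few lines. This is a minimal, self-contained illustration rather than the project's code: metadata is simplified to a string map, detectors are plain functions instead of builder objects, and `detect_llama` is a hypothetical second detector included only to show the "try next builder" step.

```rust
use std::collections::HashMap;

/// Subset of the documented `ArchKind` enum, enough for this sketch.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum ArchKind {
    Llama,
    Qwen3,
}

/// Assumption: metadata simplified to string keys and values.
pub type Metadata = HashMap<String, String>;

/// Assumption: a detector is a plain function; the project uses builder objects.
pub type Detector = fn(&Metadata) -> Option<ArchKind>;

/// Reads `general.architecture` and lowercases it, if present.
fn architecture_field(metadata: &Metadata) -> Option<String> {
    metadata.get("general.architecture").map(|v| v.to_lowercase())
}

pub fn detect_qwen(metadata: &Metadata) -> Option<ArchKind> {
    let arch = architecture_field(metadata)?;
    matches!(arch.as_str(), "qwen2" | "qwen3").then_some(ArchKind::Qwen3)
}

/// Hypothetical second detector, present only to show "try next builder".
pub fn detect_llama(metadata: &Metadata) -> Option<ArchKind> {
    let arch = architecture_field(metadata)?;
    (arch == "llama").then_some(ArchKind::Llama)
}

/// Asks each detector in turn and returns the first match, or `None`.
pub fn detect_arch(detectors: &[Detector], metadata: &Metadata) -> Option<ArchKind> {
    detectors.iter().find_map(|detect| detect(metadata))
}
```

With both detectors registered, metadata whose `general.architecture` is `Qwen3` falls through `detect_llama` and is claimed by `detect_qwen`; an unrecognized value yields `None`.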
```mermaid
flowchart TD
A["detect_arch(metadata)"] --> B["get_model_factory()"]
B --> C["ModelFactory"]
C --> D["Iterate through builders"]
D --> E["builder.detect_gguf_arch(metadata)"]
E --> F{"Match found?"}
F --> |Yes| G["Return ArchKind"]
F --> |No| H["Try next builder"]
H --> I{"All builders checked?"}
I --> |No| D
I --> |Yes| J["Return None"]
```
**Diagram sources**
- [src-tauri/src/models/registry.rs](file://d:/GitHub/Oxide-Lab/src-tauri/src/models/registry.rs#L26-L41)
- [src-tauri/src/models/common/builder.rs](file://d:/GitHub/Oxide-Lab/src-tauri/src/models/common/builder.rs#L145-L155)
**Section sources**
- [src-tauri/src/models/registry.rs](file://d:/GitHub/Oxide-Lab/src-tauri/src/models/registry.rs#L26-L53)
- [src-tauri/src/models/common/builder.rs](file://d:/GitHub/Oxide-Lab/src-tauri/src/models/common/builder.rs#L145-L155)
## Supported Architectures
The system currently supports the following model architectures, as defined in the ArchKind enum:
```rust
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum ArchKind {
    Llama,    // Covers Llama 2, 3, 3.1, 3.2, 3.3, 4 and CodeLlama
    Mistral,  // Covers Mistral 7B, Mistral Small, Mistral NeMo, Mistral Large
    Mixtral,
    Gemma,    // Covers Gemma 2, Gemma 3
    Qwen3,    // Covers Qwen 2, 2.5, 3, 3 Coder
    Yi,
    Phi3,     // Covers Phi-3, Phi-3.5
    DeepSeek, // Covers DeepSeek-R1 variants
    Pixtral,
    SmolLM2,  // SmolLM 2
}
```
The system is designed to support additional architectures by registering new model builders in the ModelFactory.
**Section sources**
- [src-tauri/src/models/registry.rs](file://d:/GitHub/Oxide-Lab/src-tauri/src/models/registry.rs#L4-L22)
## Detection Heuristics
The system combines several detection strategies to identify model architectures. Qwen3 is the architecture currently implemented, so the heuristics described in this section are those of the Qwen3 builder.
### Primary Detection Method
The primary detection method checks the `general.architecture` field in the model metadata:
```mermaid
flowchart TD
A["Check general.architecture field"] --> B{"Field exists?"}
B --> |No| C["Proceed to fallback detection"]
B --> |Yes| D["Convert to lowercase"]
D --> E{"Value is 'qwen2' or 'qwen3'?"}
E --> |Yes| F["Return ArchKind::Qwen3"]
E --> |No| C
```
Diagram sources
- src-tauri/src/models/qwen3_builder.rs
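The primary check can be sketched as follows. GGUF metadata values are typed, so this illustration uses a hypothetical, simplified `MetaValue` enum in place of the real value type; `detect_primary` is likewise an illustrative name and not the project's function.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum ArchKind {
    Qwen3,
}

/// Hypothetical, simplified metadata value; real GGUF values carry more types.
#[derive(Debug, Clone)]
pub enum MetaValue {
    Str(String),
    U32(u32),
}

/// Primary check: `general.architecture`, lowercased, must be "qwen2" or "qwen3".
/// `None` means "no decision", which lets a fallback check run afterwards.
pub fn detect_primary(metadata: &HashMap<String, MetaValue>) -> Option<ArchKind> {
    // A missing field or a non-string value both count as "no decision".
    let MetaValue::Str(arch) = metadata.get("general.architecture")? else {
        return None;
    };
    match arch.to_lowercase().as_str() {
        "qwen2" | "qwen3" => Some(ArchKind::Qwen3),
        _ => None,
    }
}
```

Returning `Option` rather than an error keeps the primary check composable: the caller can chain a fallback with `or_else` without distinguishing "field missing" from "field present but unrecognized".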
### Fallback Detection Method
When the primary method finds no match, the system performs a heuristic search through all metadata fields:
```mermaid
flowchart TD
A["Iterate through all metadata entries"] --> B{"Value contains 'qwen'?"}
B --> |Yes| C["Return ArchKind::Qwen3"]
B --> |No| D{"More entries?"}
D --> |Yes| A
D --> |No| E["Return None"]
```
**Diagram sources**
- [src-tauri/src/models/qwen3_builder.rs](file://d:/GitHub/Oxide-Lab/src-tauri/src/models/qwen3_builder.rs#L77-L87)
**Section sources**
- [src-tauri/src/models/qwen3_builder.rs](file://d:/GitHub/Oxide-Lab/src-tauri/src/models/qwen3_builder.rs#L65-L88)
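A minimal sketch of the heuristic search follows, under two assumptions that the source does not confirm: the comparison is case-insensitive, and metadata is a simple string map. Returning the matching key alongside the result is an addition made here for illustration, since it makes a false positive easy to spot.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum ArchKind {
    Qwen3,
}

/// Scans every metadata value for the substring "qwen" (assumed case-insensitive).
/// Returns the detected kind and the key that triggered the match.
pub fn detect_fallback(metadata: &BTreeMap<String, String>) -> Option<(ArchKind, &str)> {
    metadata
        .iter()
        .find(|(_, value)| value.to_lowercase().contains("qwen"))
        .map(|(key, _)| (ArchKind::Qwen3, key.as_str()))
}
```

Because any value can trigger the match, a Llama model whose description merely mentions Qwen would be classified as Qwen3 by this sketch. That is the kind of false positive covered in the troubleshooting section.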
## Implementation Details
The architecture detection system is implemented using a factory pattern with pluggable builders for different model types.
### Model Factory Pattern
The ModelFactory manages a collection of ModelBuilder instances, each responsible for a specific architecture. The implementation uses a singleton pattern with OnceLock to ensure a single global instance:
```rust
use std::sync::OnceLock;

/// Global model factory instance
static MODEL_FACTORY: OnceLock<ModelFactory> = OnceLock::new();

/// Get the global model factory instance
pub fn get_model_factory() -> &'static ModelFactory {
    MODEL_FACTORY.get_or_init(|| {
        let mut factory = ModelFactory::new();
        // Register Qwen3 builder
        factory.register_builder(crate::models::common::builder::ModelBuilder::Qwen3(
            Qwen3ModelBuilder::new(),
        ));
        factory
    })
}
```
```mermaid
classDiagram
class ModelFactory {
+builders : HashMap<ArchKind, ModelBuilder>
+register_builder(builder : ModelBuilder)
+build_from_gguf()
+detect_gguf_arch()
+detect_config_arch()
}
class ModelBuilder {
<<enum>>
+Qwen3(Qwen3ModelBuilder)
}
class Qwen3ModelBuilder {
+from_gguf()
+from_varbuilder()
+detect_gguf_arch()
+detect_config_arch()
+arch_kind()
}
ModelFactory --> ModelBuilder : "contains"
ModelBuilder --> Qwen3ModelBuilder : "variant"
```
Diagram sources
- src-tauri/src/models/common/builder.rs
- src-tauri/src/models/qwen3_builder.rs
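The relationship between the three types can be illustrated with a compilable sketch. It keeps the documented names but reduces each type to the minimum needed to show registration keyed by `ArchKind`, enum dispatch, and the `OnceLock` singleton. The method bodies and the `supports` helper are assumptions made for the example.

```rust
use std::collections::HashMap;
use std::sync::OnceLock;

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum ArchKind {
    Qwen3,
}

pub struct Qwen3ModelBuilder;

impl Qwen3ModelBuilder {
    pub fn arch_kind(&self) -> ArchKind {
        ArchKind::Qwen3
    }
}

/// One variant per architecture; methods forward to the wrapped builder.
pub enum ModelBuilder {
    Qwen3(Qwen3ModelBuilder),
}

impl ModelBuilder {
    pub fn arch_kind(&self) -> ArchKind {
        match self {
            ModelBuilder::Qwen3(builder) => builder.arch_kind(),
        }
    }
}

#[derive(Default)]
pub struct ModelFactory {
    builders: HashMap<ArchKind, ModelBuilder>,
}

impl ModelFactory {
    /// Stores the builder under the architecture it reports.
    pub fn register_builder(&mut self, builder: ModelBuilder) {
        self.builders.insert(builder.arch_kind(), builder);
    }

    /// Hypothetical helper, not part of the documented API.
    pub fn supports(&self, kind: &ArchKind) -> bool {
        self.builders.contains_key(kind)
    }
}

static MODEL_FACTORY: OnceLock<ModelFactory> = OnceLock::new();

/// Initializes the factory on first use and returns the same instance afterwards.
pub fn get_model_factory() -> &'static ModelFactory {
    MODEL_FACTORY.get_or_init(|| {
        let mut factory = ModelFactory::default();
        factory.register_builder(ModelBuilder::Qwen3(Qwen3ModelBuilder));
        factory
    })
}
```

Dispatching through an enum rather than `Box<dyn Trait>` means that adding an architecture requires a new variant and a new `match` arm, which the compiler enforces everywhere the enum is matched.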
### Configuration-Based Detection
For models that use configuration files (e.g., safetensors format), the system provides an alternative detection method:
```mermaid
flowchart TD
A["detect_arch_from_config(config)"] --> B["get_model_factory()"]
B --> C["ModelFactory"]
C --> D["Iterate through builders"]
D --> E["builder.detect_config_arch(config)"]
E --> F{"Match found?"}
F --> |Yes| G["Return ArchKind"]
F --> |No| H["Try next builder"]
H --> I{"All builders checked?"}
I --> |No| D
I --> |Yes| J["Return None"]
```
**Diagram sources**
- [src-tauri/src/models/registry.rs](file://d:/GitHub/Oxide-Lab/src-tauri/src/models/registry.rs#L48-L52)
- [src-tauri/src/models/qwen3_builder.rs](file://d:/GitHub/Oxide-Lab/src-tauri/src/models/qwen3_builder.rs#L90-L110)
**Section sources**
- [src-tauri/src/models/registry.rs](file://d:/GitHub/Oxide-Lab/src-tauri/src/models/registry.rs#L48-L52)
- [src-tauri/src/models/qwen3_builder.rs](file://d:/GitHub/Oxide-Lab/src-tauri/src/models/qwen3_builder.rs#L90-L110)
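A configuration-based check might look like the sketch below. It assumes the two relevant `config.json` fields (`model_type` and `architectures`) have already been parsed into a hypothetical `ConfigHints` struct. The exact strings accepted by the real builder are not shown in this document, so the matching rules here are assumptions.

```rust
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum ArchKind {
    Qwen3,
}

/// Hypothetical, already-parsed view of the two `config.json` fields of interest.
#[derive(Debug, Default)]
pub struct ConfigHints {
    pub model_type: Option<String>,
    pub architectures: Vec<String>,
}

pub fn detect_config_arch(config: &ConfigHints) -> Option<ArchKind> {
    // 1. `model_type`, compared case-insensitively (assumed values).
    if let Some(model_type) = &config.model_type {
        if matches!(model_type.to_lowercase().as_str(), "qwen2" | "qwen3") {
            return Some(ArchKind::Qwen3);
        }
    }
    // 2. Any class name in `architectures` that mentions "qwen",
    //    e.g. "Qwen2ForCausalLM" (assumed rule).
    config
        .architectures
        .iter()
        .any(|name| name.to_lowercase().contains("qwen"))
        .then_some(ArchKind::Qwen3)
}
```

Checking `model_type` first keeps the common case exact, and the looser `architectures` scan only runs when the exact check is inconclusive.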
## Integration Patterns
The system integrates with various model formats and provides adapters for different model implementations.
### GGUF Model Integration
For GGUF format models, the system uses the following integration pattern:
```mermaid
sequenceDiagram
participant User as "Application"
participant Detector as "Architecture Detector"
participant Factory as "ModelFactory"
participant Builder as "Qwen3ModelBuilder"
participant Loader as "candle_transformers"
User->>Detector: detect_arch(metadata)
Detector->>Factory: get_model_factory()
Factory->>Builder: detect_gguf_arch(metadata)
Builder->>Builder: Check general.architecture
Builder->>Builder: Fallback to heuristic search
Builder-->>Factory: ArchKind::Qwen3
Factory-->>Detector: ArchKind::Qwen3
Detector-->>User: ArchKind::Qwen3
```
Diagram sources
- src-tauri/src/api/model_loading/gguf.rs
- src-tauri/src/models/registry.rs
Section sources
- src-tauri/src/api/model_loading/gguf.rs
- src-tauri/src/models/registry.rs
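From the caller's side, the sequence above amounts to "detect, then either dispatch or fail with a readable error". The sketch below illustrates that shape only. `LoadedModel`, `load_gguf`, the simplified `detect_arch`, and the error text are all hypothetical and do not reflect the contents of `gguf.rs`.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum ArchKind {
    Qwen3,
}

/// Hypothetical placeholder for a loaded model.
#[derive(Debug, PartialEq)]
pub struct LoadedModel {
    pub arch: ArchKind,
}

/// Simplified stand-in for the detection entry point (string metadata assumed).
fn detect_arch(metadata: &HashMap<String, String>) -> Option<ArchKind> {
    let arch = metadata.get("general.architecture")?.to_lowercase();
    matches!(arch.as_str(), "qwen2" | "qwen3").then_some(ArchKind::Qwen3)
}

/// Detects the architecture first, then dispatches to the matching build step.
pub fn load_gguf(metadata: &HashMap<String, String>) -> Result<LoadedModel, String> {
    let arch = detect_arch(metadata)
        .ok_or_else(|| "could not detect model architecture from GGUF metadata".to_string())?;
    match arch {
        // A real loader would hand the file contents to the Qwen3 builder here.
        ArchKind::Qwen3 => Ok(LoadedModel { arch }),
    }
}
```

Turning `None` into an error at this boundary keeps detection itself free of error handling and gives the application a single place to report unsupported models.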
### Safetensors Model Integration
For safetensors format models, the system uses configuration-based detection:
```mermaid
sequenceDiagram
participant User as "Application"
participant Detector as "Architecture Detector"
participant Factory as "ModelFactory"
participant Builder as "Qwen3ModelBuilder"
participant Loader as "candle_transformers"
User->>Detector: detect_arch_from_config(config)
Detector->>Factory: get_model_factory()
Factory->>Builder: detect_config_arch(config)
Builder->>Builder: Check model_type field
Builder->>Builder: Check architectures array
Builder-->>Factory: ArchKind::Qwen3
Factory-->>Detector: ArchKind::Qwen3
Detector-->>User: ArchKind::Qwen3
```
**Diagram sources**
- [src-tauri/src/api/model_loading/safetensors.rs](file://d:/GitHub/Oxide-Lab/src-tauri/src/api/model_loading/safetensors.rs#L75-L80)
- [src-tauri/src/models/registry.rs](file://d:/GitHub/Oxide-Lab/src-tauri/src/models/registry.rs#L48-L52)
**Section sources**
- [src-tauri/src/api/model_loading/safetensors.rs](file://d:/GitHub/Oxide-Lab/src-tauri/src/api/model_loading/safetensors.rs#L75-L80)
- [src-tauri/src/models/registry.rs](file://d:/GitHub/Oxide-Lab/src-tauri/src/models/registry.rs#L48-L52)
### Model Backend Abstraction
The system provides a unified interface for different model implementations through the ModelBackend trait:
```mermaid
classDiagram
class ModelBackend {
<<trait>>
+forward_layered(input : &Tensor, position : usize) -> Result<Tensor, String>
}
class AnyModel {
-inner : Box<dyn ModelBackend + Send>
+from_qwen3()
+from_candle_qwen3()
+from_candle_llama()
+from_candle_phi()
}
class Qwen3CandleAdapter {
-inner : ModelForCausalLM
+forward_layered()
}
class LlamaCandleAdapter {
-inner : Llama
+forward_layered()
}
class PhiCandleAdapter {
-inner : Model
+forward_layered()
}
AnyModel --> ModelBackend : "implements"
Qwen3CandleAdapter --> ModelBackend : "implements"
LlamaCandleAdapter --> ModelBackend : "implements"
PhiCandleAdapter --> ModelBackend : "implements"
```
Diagram sources
- src-tauri/src/models/common/model.rs
- src-tauri/src/models/common/candle_llm.rs
Section sources
- src-tauri/src/models/common/model.rs
- src-tauri/src/models/common/candle_llm.rs
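The adapter arrangement in the diagram can be sketched without external crates. In this illustration `Tensor` is a stand-in for the candle tensor type, `ScaleAdapter` is a hypothetical adapter that replaces a real forward pass, and the `&mut self` receiver is an assumption (the diagram does not show the receiver).

```rust
/// Stand-in for the candle tensor type so the sketch compiles on its own.
#[derive(Debug, Clone, PartialEq)]
pub struct Tensor(pub Vec<f32>);

pub trait ModelBackend {
    /// Assumption: `&mut self`, since a real backend may update internal caches.
    fn forward_layered(&mut self, input: &Tensor, position: usize) -> Result<Tensor, String>;
}

/// Hypothetical adapter: scales its input instead of running a real forward pass.
pub struct ScaleAdapter {
    pub factor: f32,
}

impl ModelBackend for ScaleAdapter {
    fn forward_layered(&mut self, input: &Tensor, _position: usize) -> Result<Tensor, String> {
        if input.0.is_empty() {
            return Err("empty input tensor".to_string());
        }
        Ok(Tensor(input.0.iter().map(|x| x * self.factor).collect()))
    }
}

/// Type-erased wrapper: callers hold an `AnyModel` regardless of architecture.
pub struct AnyModel {
    inner: Box<dyn ModelBackend + Send>,
}

impl AnyModel {
    pub fn new(backend: impl ModelBackend + Send + 'static) -> Self {
        Self { inner: Box::new(backend) }
    }
}

impl ModelBackend for AnyModel {
    fn forward_layered(&mut self, input: &Tensor, position: usize) -> Result<Tensor, String> {
        // Delegate to whichever adapter was boxed at construction time.
        self.inner.forward_layered(input, position)
    }
}
```

The `Send` bound on the boxed backend lets an `AnyModel` be moved to a worker thread, which matters when generation runs off the UI thread.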
## Troubleshooting Guide
This section provides guidance for common issues encountered with the model architecture detection system.

**Issue: Architecture detection returns None**
- **Cause:** The model metadata does not contain recognizable architecture indicators
- **Solution:**
  - Verify the model file is not corrupted
  - Check that the model format is supported (GGUF or safetensors)
  - Ensure the metadata contains the `general.architecture` field or model-specific identifiers
  - For custom models, ensure the name contains the architecture name (e.g., "qwen" for Qwen models)

**Issue: Incorrect architecture detection**
- **Cause:** Heuristic detection matched on a false positive
- **Solution:**
  - Verify the model's actual architecture
  - Check the metadata for the correct `general.architecture` value
  - If using a custom model, ensure the metadata is properly set

**Issue: Unsupported architecture error**
- **Cause:** The detected architecture does not have a registered builder
- **Solution:**
  - Check if the architecture is in the supported list (Llama, Mistral, Mixtral, Gemma, Qwen3, Yi, Phi3, DeepSeek, Pixtral, SmolLM2)
  - If the architecture should be supported, ensure the appropriate builder is registered in the ModelFactory
  - For new architectures, implement a new ModelBuilder and register it

Debugging tips:
- **Inspect model metadata:** Use tools to examine the GGUF metadata or config.json to verify architecture information
- **Enable logging:** Add debug prints to trace the detection process
- **Test with known models:** Use verified model files to confirm the detection system is working correctly
- **Check file format:** Ensure the model file is in the expected format (GGUF for quantized models, safetensors for float models)
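When tracing why detection returned None or a wrong result, a small helper that lists the relevant metadata entries can help. The function below is a hypothetical debugging aid, not part of the project. It assumes string-valued metadata and reports the `general.architecture` field plus any other entry that the "qwen" heuristic would match.

```rust
use std::collections::BTreeMap;

/// Lists the metadata entries most relevant to architecture detection.
pub fn architecture_hints(metadata: &BTreeMap<String, String>) -> Vec<String> {
    let mut hints = Vec::new();
    match metadata.get("general.architecture") {
        Some(arch) => hints.push(format!("general.architecture = {arch}")),
        None => hints.push("general.architecture is missing".to_string()),
    }
    // Report every other entry that the substring heuristic would pick up.
    for (key, value) in metadata {
        if key != "general.architecture" && value.to_lowercase().contains("qwen") {
            hints.push(format!("heuristic match: {key} = {value}"));
        }
    }
    hints
}
```

Printing these hints before calling the detector shows whether a result came from the exact field or from a loose substring match.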
Section sources
- src-tauri/tests/architecture_detection.rs
- src-tauri/src/models/registry.rs
- src-tauri/src/models/qwen3_builder.rs
Referenced Files in This Document
- src-tauri/tests/architecture_detection.rs - Updated in recent commit
- src-tauri/src/models/registry.rs - Updated in recent commit
- src-tauri/src/models/common/builder.rs - Updated in recent commit
- src-tauri/src/models/qwen3_builder.rs - Updated in recent commit
- src-tauri/src/api/model_loading/gguf.rs - Added integration with ModelFactory
- src-tauri/src/api/model_loading/safetensors.rs - Added integration with ModelFactory
- src-tauri/src/api/model_loading/hub_gguf.rs - Added integration with ModelFactory