[WIP] refactor: Implement modular candle-binding architecture (#254) #266
base: main
Conversation
refactor: Implement modular candle-binding architecture

- Restructure codebase into modular layers (core/, ffi/, model_architectures/, classifiers/)
- Add unified error handling and configuration loading systems
- Implement dual-path architecture for traditional and LoRA models
- Add comprehensive FFI layer with memory safety

Maintains backward compatibility while enabling future model integrations.

Signed-off-by: OneZero-Y <[email protected]>
@OneZero-Y @Xunzhuo Let's have the following resolved before merging
feat: unit tests for candle refactoring

Signed-off-by: OneZero-Y <[email protected]>
@OneZero-Y now that we're working on the feature branch, how about using this branch for both the refactoring and the new embedding models?

@rootfs OK, I'll advance the embedding model on this branch.

@OneZero-Y that's great! I'll switch to this work as soon as I can.
```rust
let handles = vec![
    self.spawn_intent_task(texts_owned.clone(), Arc::clone(&intent_results)),
    self.spawn_pii_task(texts_owned.clone(), Arc::clone(&pii_results)),
    self.spawn_security_task(texts_owned, Arc::clone(&security_results)),
];

// Wait for all threads to complete
for handle in handles {
    handle.join().map_err(|_| {
        let unified_err = concurrency_error(
            "thread join",
            "Failed to join parallel classification thread",
        );
        candle_core::Error::from(unified_err)
    })?;
}
```
This could be simplified a bit. Something like:

```rust
let intent_handle = thread::spawn(|| intent_task(texts)); // slice is fine, no need to own the data
let pii_handle = ...;     // same
let security_handle = ...; // same

let intent_results = intent_handle.join()?; // map_err omitted
let pii_results = pii_handle.join()?;
let security_results = security_handle.join()?;
```
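Plain `thread::spawn` requires `'static` captures, so borrowing a slice as suggested needs `std::thread::scope`. A minimal sketch of that pattern, with hypothetical stand-in task functions (`intent_task`, `pii_task`, `security_task` here are illustrative, not the PR's actual signatures):

```rust
use std::thread;

// Hypothetical stand-ins for the real classifier tasks.
fn intent_task(texts: &[&str]) -> Vec<String> {
    texts.iter().map(|t| format!("intent:{t}")).collect()
}
fn pii_task(texts: &[&str]) -> Vec<String> {
    texts.iter().map(|t| format!("pii:{t}")).collect()
}
fn security_task(texts: &[&str]) -> Vec<String> {
    texts.iter().map(|t| format!("security:{t}")).collect()
}

fn main() {
    let texts = ["hello", "world"];
    // `thread::scope` lets spawned threads borrow `texts` directly,
    // so no cloning into owned Strings is needed.
    let (intent, pii, security) = thread::scope(|s| {
        let intent_handle = s.spawn(|| intent_task(&texts));
        let pii_handle = s.spawn(|| pii_task(&texts));
        let security_handle = s.spawn(|| security_task(&texts));
        (
            intent_handle.join().expect("intent thread panicked"),
            pii_handle.join().expect("pii thread panicked"),
            security_handle.join().expect("security thread panicked"),
        )
    });
    assert_eq!(intent[0], "intent:hello");
    assert_eq!(pii.len(), 2);
    assert_eq!(security[1], "security:world");
}
```

Scoped threads are guaranteed to finish before `thread::scope` returns, which is why the borrow of `texts` is sound.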
Since we're on the topic of threads, you may like some of the abstractions that the rayon crate provides.
thank you @ivarflakstad
```rust
pub fn parallel_detect(&self, texts: &[&str]) -> Result<Vec<PIIResult>> {
    let mut results = Vec::new();
    for text in texts {
        results.push(self.detect_pii(text)?);
    }
    Ok(results)
}
```
If you want this to run in parallel you could do something like:

```rust
// add `use rayon::prelude::*;` at the top of the file
texts.par_iter().map(|text| self.detect_pii(text)).collect()
```

Note that `collect()` can gather an iterator of `Result`s into a single `Result<Vec<_>>`, short-circuiting on the first error, so no `?` is needed inside the closure.
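The collect-into-`Result` behavior this suggestion relies on can be sketched with plain iterators (swap `iter` for rayon's `par_iter` to parallelize without changing the shape; `detect_pii` below is a hypothetical stand-in):

```rust
// Hypothetical stand-in for the real detector.
fn detect_pii(text: &str) -> Result<String, String> {
    if text.is_empty() {
        Err("empty input".to_string())
    } else {
        Ok(format!("ok:{text}"))
    }
}

fn main() {
    // All items succeed: collect yields Ok(Vec<..>).
    let good: Result<Vec<String>, String> =
        ["alice", "bob"].iter().map(|t| detect_pii(t)).collect();
    assert!(good.is_ok());
    assert_eq!(good.unwrap().len(), 2);

    // One item fails: collect short-circuits to the first Err.
    let bad: Result<Vec<String>, String> =
        ["alice", ""].iter().map(|t| detect_pii(t)).collect();
    assert!(bad.is_err());
}
```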
Though I'm starting to suspect that what you actually want, for the long term, is an async runtime.
@ivarflakstad thanks for looking into this. On a separate note, for async to run most efficiently, would you help look at whether the locking is done the right way?
Sure :)
Are you thinking about any specific locks in particular? (pr is fairly large 😉 )
thank you @ivarflakstad
`classify_text` is currently protected by a lock. This could cost us performance; would you share your ideas? Thanks.
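If `classify_text` only reads model state, one common way to reduce contention is to replace a `Mutex` with an `RwLock`, so concurrent readers don't serialize each other. A minimal sketch under that assumption (the `Classifier` type and `classify_text` signature here are illustrative, not the PR's actual ones):

```rust
use std::sync::RwLock;
use std::thread;

// Hypothetical stand-in for the classifier state.
struct Classifier {
    label: String,
}

impl Classifier {
    // Read-only inference: takes &self, so it can run under a read lock.
    fn classify_text(&self, text: &str) -> String {
        format!("{}:{}", self.label, text)
    }
}

fn main() {
    let clf = RwLock::new(Classifier { label: "intent".to_string() });

    // Multiple readers classify concurrently; a Mutex would serialize
    // them even though nothing is mutated.
    thread::scope(|s| {
        for text in ["a", "b", "c"] {
            let clf = &clf;
            s.spawn(move || {
                let guard = clf.read().expect("lock poisoned");
                let out = guard.classify_text(text);
                assert!(out.starts_with("intent:"));
            });
        }
    });
}
```

This only helps if inference really is `&self`; if the model mutates internal buffers per call, per-thread model instances or an async-aware lock would be the alternatives.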
What type of PR is this?
What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes #
Release Notes: Yes/No