Commit 2ef65a8

Improve Infera's configurability
1 parent fab208b commit 2ef65a8

7 files changed

+621
-29
lines changed
ROADMAP.md

Lines changed: 0 additions & 1 deletion
@@ -18,7 +18,6 @@ It outlines features to be implemented and their current status.
1818
* [x] Single `FLOAT` scalar output.
1919
* [x] Multiple `FLOAT` outputs as a `VARCHAR` containing JSON.
2020
* [x] Multiple `FLOAT` outputs as a `LIST[FLOAT]`.
21-
* [ ] Return multiple outputs as a `STRUCT` (requires table functions for dynamic schemas).
2221
* **Batch Processing**
2322
* [x] Inference on batches for models with dynamic dimensions.
2423
* [ ] Automatic batch splitting for models with a fixed batch size.

docs/CONFIGURATION.md

Lines changed: 252 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,252 @@
1+
## Infera's Configuration Guide
2+
3+
Infera supports configuration via environment variables to customize its behavior without code changes.
4+
5+
### Environment Variables
6+
7+
#### Cache Configuration
8+
9+
##### INFERA_CACHE_DIR
10+
11+
- **Description**: Directory path for caching remote models
12+
- **Type**: String (path)
13+
- **Default**: `$TMPDIR/infera_cache` (system temp directory)
14+
- **Example**:
15+
```bash
16+
export INFERA_CACHE_DIR="/var/cache/infera"
17+
```
18+
19+
##### INFERA_CACHE_SIZE_LIMIT
20+
21+
- **Description**: Maximum cache size in bytes
22+
- **Type**: Integer (bytes)
23+
- **Default**: `1073741824` (1GB)
24+
- **Example**:
25+
```bash
26+
## Set to 5GB
27+
export INFERA_CACHE_SIZE_LIMIT=5368709120
28+
29+
## Set to 500MB
30+
export INFERA_CACHE_SIZE_LIMIT=524288000
31+
```
32+
33+
##### INFERA_CACHE_EVICTION
34+
35+
- **Description**: Cache eviction strategy to use when the cache is full
36+
- **Type**: String (`LRU`, `LFU`, `FIFO`)
37+
- **Default**: `LRU` (Least Recently Used)
38+
- **Example**:
39+
```bash
40+
export INFERA_CACHE_EVICTION=LRU
41+
## Note: Currently only LRU is implemented; LFU and FIFO are planned
42+
```
43+
44+
#### HTTP Configuration
45+
46+
##### INFERA_HTTP_TIMEOUT
47+
48+
- **Description**: HTTP request timeout in seconds for downloading remote models
49+
- **Type**: Integer (seconds)
50+
- **Default**: `30`
51+
- **Example**:
52+
```bash
53+
export INFERA_HTTP_TIMEOUT=60
54+
```
55+
56+
##### INFERA_HTTP_RETRY_ATTEMPTS
57+
58+
- **Description**: Number of retry attempts for failed downloads
59+
- **Type**: Integer
60+
- **Default**: `3`
61+
- **Example**:
62+
```bash
63+
## Retry up to 5 times on failure
64+
export INFERA_HTTP_RETRY_ATTEMPTS=5
65+
```
66+
67+
##### INFERA_HTTP_RETRY_DELAY
68+
69+
- **Description**: Initial delay between retry attempts in milliseconds (uses exponential backoff)
70+
- **Type**: Integer (milliseconds)
71+
- **Default**: `1000` (1 second)
72+
- **Example**:
73+
```bash
74+
## Wait 2 seconds before the first retry (later retries back off further)
75+
export INFERA_HTTP_RETRY_DELAY=2000
76+
```
77+
78+
#### Logging Configuration
79+
80+
##### INFERA_VERBOSE
81+
82+
- **Description**: Enable verbose logging (deprecated; use `INFERA_LOG_LEVEL` instead)
83+
- **Type**: Boolean (`1`, `true`, or `0`, `false`)
84+
- **Default**: `false`
85+
- **Example**:
86+
```bash
87+
export INFERA_VERBOSE=1
88+
```
89+
90+
##### INFERA_LOG_LEVEL
91+
92+
- **Description**: Set logging level for detailed output
93+
- **Type**: String (`ERROR`, `WARN`, `INFO`, `DEBUG`)
94+
- **Default**: `WARN`
95+
- **Example**:
96+
```bash
97+
## Show all messages including debug
98+
export INFERA_LOG_LEVEL=DEBUG
99+
100+
## Show only errors
101+
export INFERA_LOG_LEVEL=ERROR
102+
103+
## Show informational messages and above
104+
export INFERA_LOG_LEVEL=INFO
105+
```
106+
107+
### Usage Examples
108+
109+
#### Example 1: Custom Cache Directory
110+
111+
```bash
112+
## Set custom cache directory
113+
export INFERA_CACHE_DIR="/mnt/fast-ssd/ml-cache"
114+
115+
## Start DuckDB
116+
./build/release/duckdb
117+
118+
## Check configuration (run these statements inside the DuckDB shell)
119+
SELECT infera_get_version();
120+
SELECT infera_get_cache_info();
121+
```
122+
123+
#### Example 2: Larger Cache for Big Models
124+
125+
```bash
126+
## Set cache to 10GB for large models
127+
export INFERA_CACHE_SIZE_LIMIT=10737418240
128+
129+
## Load large models from remote URLs
130+
./build/release/duckdb
131+
```
132+
133+
#### Example 3: Production Configuration
134+
135+
```bash
136+
## Complete production configuration
137+
export INFERA_CACHE_DIR="/var/lib/infera/cache"
138+
export INFERA_CACHE_SIZE_LIMIT=5368709120 ## 5GB
139+
export INFERA_HTTP_TIMEOUT=120 ## 2 minutes
140+
export INFERA_HTTP_RETRY_ATTEMPTS=5 ## Retry up to 5 times
141+
export INFERA_HTTP_RETRY_DELAY=2000 ## 2 second initial delay
142+
export INFERA_LOG_LEVEL=WARN ## Production logging
143+
export INFERA_CACHE_EVICTION=LRU ## LRU cache strategy
144+
145+
## Run DuckDB with Infera
146+
./build/release/duckdb
147+
```
148+
149+
#### Example 4: Development/Debug Configuration
150+
151+
```bash
152+
## Development setup with verbose logging
153+
export INFERA_CACHE_DIR="./dev-cache"
154+
export INFERA_LOG_LEVEL=DEBUG ## Detailed debug logs
155+
export INFERA_HTTP_TIMEOUT=10 ## Shorter timeout for dev
156+
export INFERA_HTTP_RETRY_ATTEMPTS=1 ## Fail fast in development
157+
158+
## Run DuckDB
159+
./build/release/duckdb
160+
```
161+
162+
#### Example 5: Slow Network Configuration
163+
164+
```bash
165+
## Configuration for slow or unreliable networks
166+
export INFERA_HTTP_TIMEOUT=300 ## 5 minute timeout
167+
export INFERA_HTTP_RETRY_ATTEMPTS=10 ## Many retries
168+
export INFERA_HTTP_RETRY_DELAY=5000 ## 5 second initial delay
169+
export INFERA_LOG_LEVEL=INFO ## Track download progress
170+
171+
./build/release/duckdb
172+
```
173+
174+
### Configuration Verification
175+
176+
You can verify your configuration at runtime:
177+
178+
```sql
179+
-- Check version and cache directory
180+
SELECT infera_get_version();
181+
182+
-- Check cache statistics
183+
SELECT infera_get_cache_info();
184+
```
185+
186+
Example output:
187+
188+
```json
189+
{
190+
"cache_dir": "/var/cache/infera",
191+
"total_size_bytes": 204800,
192+
"file_count": 3,
193+
"size_limit_bytes": 5368709120
194+
}
195+
```
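As a rough illustration of how this output might be consumed, the sketch below parses a JSON document shaped like the example above and reports how full the cache is. The helper name `cache_usage_fraction` is hypothetical, and the field names are taken from the example output only; the real output of `infera_get_cache_info()` may contain other fields.

```python
import json

# Sample shaped like the example output above (values copied from it).
SAMPLE = """
{
  "cache_dir": "/var/cache/infera",
  "total_size_bytes": 204800,
  "file_count": 3,
  "size_limit_bytes": 5368709120
}
"""


def cache_usage_fraction(cache_info_json: str) -> float:
    """Return used cache space as a fraction of the configured limit."""
    info = json.loads(cache_info_json)
    return info["total_size_bytes"] / info["size_limit_bytes"]


if __name__ == "__main__":
    print(f"Cache usage: {cache_usage_fraction(SAMPLE):.4%}")
```

A value close to 1.0 would suggest that eviction is about to start and that `INFERA_CACHE_SIZE_LIMIT` may need to be raised.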
196+
197+
### Retry Policy Details
198+
199+
When downloading remote models, Infera automatically retries failed downloads with exponential backoff:
200+
201+
1. **Attempt 1**: Download immediately
202+
2. **Attempt 2**: Wait `INFERA_HTTP_RETRY_DELAY` milliseconds (e.g., 1 second)
203+
3. **Attempt 3**: Wait `INFERA_HTTP_RETRY_DELAY * 2` milliseconds (e.g., 2 seconds)
204+
4. **Attempt N**: Wait `INFERA_HTTP_RETRY_DELAY * 2^(N-2)` milliseconds (the delay doubles with each retry)
205+
206+
This helps handle temporary network issues, server rate limiting, and transient failures.
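The schedule can be sketched in a few lines of Python. This is a minimal illustration, not Infera's implementation: it assumes the delay doubles on every retry (which matches the first three attempts listed above), and the function name `retry_delays_ms` is hypothetical.

```python
def retry_delays_ms(initial_delay_ms: int, attempts: int) -> list[int]:
    """Return the wait (in ms) applied before each attempt.

    Assumption: attempt 1 runs immediately and every later attempt
    waits twice as long as the previous one.
    """
    delays = []
    for attempt in range(1, attempts + 1):
        if attempt == 1:
            delays.append(0)  # first attempt: no wait
        else:
            delays.append(initial_delay_ms * 2 ** (attempt - 2))
    return delays


if __name__ == "__main__":
    # Default INFERA_HTTP_RETRY_DELAY of 1000 ms with four attempts.
    print(retry_delays_ms(1000, 4))  # [0, 1000, 2000, 4000]
```

Under this assumption the total worst-case wait grows quickly, so a large `INFERA_HTTP_RETRY_ATTEMPTS` combined with a large `INFERA_HTTP_RETRY_DELAY` can stall a query for a long time.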
207+
208+
### Logging Levels
209+
210+
Logging levels control the verbosity of output to stderr:
211+
212+
- **ERROR**: Only critical errors that prevent operations
213+
- **WARN**: Warnings about potential issues (default)
214+
- **INFO**: Informational messages about operations (cache hits/misses, downloads)
215+
- **DEBUG**: Detailed debugging information (retry attempts, file sizes, etc.)
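The ordering of the levels can be pictured as a simple threshold check. The sketch below is an assumption about how such filtering typically works rather than Infera's actual code; `should_log` and `log` are hypothetical names, and an unrecognized level falls back to `WARN` to mirror the documented default.

```python
import sys

# Lower number = more severe; a message is shown when its level is at
# or above the configured severity threshold.
LEVELS = {"ERROR": 0, "WARN": 1, "INFO": 2, "DEBUG": 3}


def should_log(message_level: str, configured_level: str = "WARN") -> bool:
    """Return True if a message at message_level passes the filter."""
    threshold = LEVELS.get(configured_level, LEVELS["WARN"])
    return LEVELS[message_level] <= threshold


def log(message_level: str, message: str, configured_level: str = "WARN") -> None:
    """Write the message to stderr if the configured level allows it."""
    if should_log(message_level, configured_level):
        print(f"[{message_level}] {message}", file=sys.stderr)


if __name__ == "__main__":
    log("INFO", "Cache hit for URL: https://example.com/model.onnx", "INFO")
    log("DEBUG", "This line is filtered out at INFO", "INFO")
```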
216+
217+
Example log output with `INFERA_LOG_LEVEL=INFO`:
218+
219+
```
220+
[INFO] Cache miss for URL: https://example.com/model.onnx, downloading...
221+
[INFO] Successfully downloaded: https://example.com/model.onnx
222+
[INFO] Cache hit for URL: https://example.com/model.onnx
223+
```
224+
225+
Example log output with `INFERA_LOG_LEVEL=DEBUG`:
226+
227+
```
228+
[DEBUG] Download attempt 1/3 for https://example.com/model.onnx
229+
[INFO] Successfully downloaded: https://example.com/model.onnx
230+
[DEBUG] Downloaded file size: 15728640 bytes
231+
```
232+
233+
### Cache Eviction Strategies
234+
235+
Currently implemented:
236+
237+
- **LRU (Least Recently Used)**: Evicts the files that have gone the longest without being accessed
238+
239+
Planned for future releases:
240+
241+
- **LFU (Least Frequently Used)**: Evicts files with the lowest access count
242+
- **FIFO (First In First Out)**: Evicts oldest downloaded files first
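To make the LRU behavior concrete, here is a small in-memory sketch. It is illustrative only: the class `LruCacheIndex` and its methods are hypothetical, and Infera's real cache manages files on disk and may track access differently.

```python
from collections import OrderedDict


class LruCacheIndex:
    """Tracks entry sizes in access order and evicts the least recently used."""

    def __init__(self, size_limit_bytes: int):
        self.size_limit_bytes = size_limit_bytes
        self.entries = OrderedDict()  # key -> size; oldest access first
        self.total_size_bytes = 0

    def touch(self, key: str) -> bool:
        """Record an access (cache hit); returns False on a miss."""
        if key in self.entries:
            self.entries.move_to_end(key)
            return True
        return False

    def add(self, key: str, size_bytes: int) -> list:
        """Insert an entry and return the keys evicted to stay under the limit."""
        if key in self.entries:
            self.total_size_bytes -= self.entries.pop(key)
        self.entries[key] = size_bytes
        self.total_size_bytes += size_bytes
        evicted = []
        # Evict from the least recently used end, but never the new entry itself.
        while self.total_size_bytes > self.size_limit_bytes and len(self.entries) > 1:
            old_key, old_size = self.entries.popitem(last=False)
            self.total_size_bytes -= old_size
            evicted.append(old_key)
        return evicted


if __name__ == "__main__":
    cache = LruCacheIndex(size_limit_bytes=100)
    cache.add("a.onnx", 40)
    cache.add("b.onnx", 40)
    cache.touch("a.onnx")            # a.onnx is now the most recently used
    print(cache.add("c.onnx", 40))   # ['b.onnx']
```

An LFU or FIFO variant would differ only in which entry is chosen for eviction: the one with the lowest access count, or the one inserted earliest.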
243+
244+
### Notes
245+
246+
- Environment variables are read once when Infera initializes
247+
- Changes to environment variables require restarting DuckDB
248+
- Invalid values fall back to defaults (no errors thrown)
249+
- Cache directory is created automatically if it doesn't exist
250+
- LRU eviction happens automatically when the cache size limit is reached
251+
- Logging output goes to stderr and doesn't interfere with SQL query results
252+
- Retry delays use exponential backoff to handle rate limiting gracefully
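The "invalid values fall back to defaults" rule can be sketched as follows. This is an assumption-laden illustration, not Infera's parsing code: the helper `read_int_setting` is hypothetical, and only the integer-valued variables and defaults documented above are included.

```python
import os

# Defaults copied from the variable descriptions in this guide.
DEFAULTS = {
    "INFERA_CACHE_SIZE_LIMIT": 1073741824,
    "INFERA_HTTP_TIMEOUT": 30,
    "INFERA_HTTP_RETRY_ATTEMPTS": 3,
    "INFERA_HTTP_RETRY_DELAY": 1000,
}


def read_int_setting(name: str, environ=os.environ) -> int:
    """Read an integer setting, silently using the default when the
    variable is unset or cannot be parsed as an integer."""
    raw = environ.get(name)
    if raw is None:
        return DEFAULTS[name]
    try:
        return int(raw)
    except ValueError:
        return DEFAULTS[name]


if __name__ == "__main__":
    for name in DEFAULTS:
        print(name, "=", read_int_setting(name))
```

Because a typo such as `INFERA_HTTP_TIMEOUT=60s` would be ignored without an error under this rule, checking the effective values at runtime is worthwhile after changing the environment.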

docs/README.md

Lines changed: 5 additions & 1 deletion
@@ -20,7 +20,7 @@ The table below includes the information about all SQL functions exposed by Infe
2020

2121
> [!NOTE]
2222
> The `features...` arguments accept `FLOAT` as well as values from `DOUBLE`, `INTEGER`, `BIGINT`, and `DECIMAL`
23-
> columns (casted to floats under the hood).
23+
> columns (all casted to floats under the hood).
2424
2525
---
2626

@@ -185,6 +185,10 @@ You also need to have Rust (nightly version) and Cargo installed.
185185
186186
---
187187

188+
### Configuration
189+
190+
See [CONFIGURATION.md](CONFIGURATION.md) for more information about how to configure various settings for Infera.
191+
188192
### Architecture
189193

190194
Infera is made up of two main components:
