Fine-tune large language models (LLMs) such as Mistral or LLaMA efficiently on custom instruction datasets using Unsloth and Hugging Face Transformers, then export to GGUF for fast inference via `llama.cpp`.
This project demonstrates how to fine-tune an LLM on a custom instruction-based dataset using:
- 🧠 Unsloth for memory-efficient fine-tuning (see the setup sketch below)
- 🔧 TRL's `SFTTrainer` from Hugging Face for supervised training
- 💾 GGUF export for inference-ready deployment (supports `llama.cpp`, `llamafile`, etc.)
- 📊 Optional W&B tracking for experiment visualization
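The typical entry point is Unsloth's `FastLanguageModel`. Below is a minimal setup sketch; the checkpoint name and LoRA hyperparameters are illustrative assumptions, not values fixed by this project:

```python
from unsloth import FastLanguageModel

# Load a 4-bit quantized base checkpoint (name is an example; any
# Unsloth-supported Mistral/LLaMA variant works)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/mistral-7b-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters so only a small fraction of the weights are trained
model = FastLanguageModel.get_peft_model(
    model,
    r=16,              # LoRA rank: an assumed value, tune for your task
    lora_alpha=16,
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
```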
| Component | Tool/Library |
|---|---|
| Model | Mistral / LLaMA |
| Trainer | TRL's `SFTTrainer` |
| Optimization | AdamW (8-bit), LR schedulers |
| Quantization | 8-bit / 4-bit via GGUF |
| Logging | Weights & Biases (optional) |
| Hardware Target | Colab / Kaggle GPU |
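The shape of the instruction dataset is up to you; a common convention is to flatten each record into a single `text` column with an Alpaca-style template. A sketch, assuming a hypothetical `instructions.jsonl` with `instruction`/`input`/`output` fields:

```python
from datasets import load_dataset

dataset = load_dataset("json", data_files="instructions.jsonl", split="train")

PROMPT = """### Instruction:
{instruction}

### Input:
{input}

### Response:
{output}"""

def to_text(example):
    # Flatten each record into the single "text" field the trainer consumes
    return {"text": PROMPT.format(**example)}

dataset = dataset.map(to_text)
```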
- 🧠 Fine-tunes Mistral or LLaMA models with minimal VRAM requirements
- 🔧 Supports instruction tuning for domain-specific and structured tasks
- ⚡ Trains with an 8-bit optimizer via `bitsandbytes` for faster and lighter execution
- 📦 Exports the final model in GGUF format, compatible with `llama.cpp`, `llamafile`, etc. (see the export sketch after this list)
- 🎯 Runs on Kaggle, Google Colab, or custom local GPU environments
- 📊 Optional Weights & Biases (W&B) logging for real-time experiment tracking
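For the GGUF step, Unsloth provides a one-call export helper. A sketch, assuming the fine-tuned `model` and `tokenizer` from the setup snippet above (quantization method strings follow llama.cpp naming):

```python
# After training: merge the LoRA weights and write a GGUF file
# "q8_0" = 8-bit; "q4_k_m" is a common smaller 4-bit alternative
model.save_pretrained_gguf("gguf_model", tokenizer, quantization_method="q8_0")
```

The resulting `.gguf` file in `gguf_model/` can then be loaded by `llama.cpp`, `llamafile`, or any other GGUF-aware runtime.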
| Hyperparameter | Value |
|---|---|
| Epochs | 2–3 |
| Batch Size | 2 (gradient accumulation = 8) |
| Max Steps | 100 |
| Learning Rate | 2e-4 |
| Optimizer | AdamW (8-bit) |
| Precision | fp16 / bf16 |
| Quantization | GGUF export (8-bit) |
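These values map directly onto the trainer configuration. A sketch reusing `model`, `tokenizer`, and `dataset` from the snippets above; exact argument names vary across `trl` versions (newer releases move some of these into `SFTConfig`):

```python
from transformers import TrainingArguments
from trl import SFTTrainer

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",           # column produced by the formatting step
    max_seq_length=2048,
    args=TrainingArguments(
        num_train_epochs=3,              # 2-3 passes; max_steps caps the run first
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,   # effective batch size of 16
        max_steps=100,
        learning_rate=2e-4,
        lr_scheduler_type="linear",
        optim="adamw_8bit",              # 8-bit AdamW from bitsandbytes
        fp16=True,                       # or bf16=True on GPUs that support it
        logging_steps=10,
        output_dir="outputs",
        report_to="wandb",               # set to "none" to skip W&B logging
    ),
)
trainer.train()
```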