An Ontology-Driven Resume Screening Framework for Bias-Free, Context-Autonomous Recruitment
Fairscan is an AI-powered resume screening framework designed to eliminate externally introduced bias and keep hiring context fully under organizational control. Unlike traditional systems that rely on generic pre-trained models, Fairscan leverages semantic reasoning and organization-specific ontologies to provide transparent, interpretable, and contextually relevant candidate evaluations.
Traditional AI resume screening systems face three critical challenges:
- Bias Propagation: Inherit irrelevant biases from external training datasets
- Privacy Vulnerabilities: Risk exposure of sensitive candidate information
- Context Misalignment: Fail to capture organization-specific requirements and values

Fairscan addresses these issues through semantic knowledge representation that accurately captures and utilizes organization-specific hiring contexts.
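To make the idea of an organization-specific ontology concrete, here is a hypothetical minimal sketch of how such hiring knowledge might be structured. The field names (`roles`, `required_skills`, `skill_synonyms`) are illustrative assumptions, not Fairscan's actual template schema:

```python
# Hypothetical minimal shape of an organization-specific hiring ontology;
# the real Fairscan ontology template may differ.
ontology = {
    "roles": {
        "data_scientist": {
            "required_skills": ["python", "machine learning", "sql"],
            "preferred_skills": ["mlops"],
            "min_experience_years": 2,
        },
    },
    # Map abbreviations the organization uses to canonical skill names.
    "skill_synonyms": {"ml": "machine learning"},
}

def normalize_skill(skill):
    """Resolve a skill mention to its canonical ontology name."""
    return ontology["skill_synonyms"].get(skill.lower(), skill.lower())

print(normalize_skill("ML"))  # → machine learning
```

Because the ontology is authored by the organization itself rather than learned from external data, the evaluation criteria stay auditable and free of inherited dataset bias.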
Fairscan implements a novel Cache Augmented Generation (CAG) methodology that combines semantic reasoning with efficient knowledge retrieval:
- PDF/DOCX resume processing; ontology document management
- Text extraction with privacy obfuscation; personal data anonymization
- KV Cache generation for semantic efficiency; context window preloading; job description extraction and structuring
- SBERT-based semantic similarity scoring; best-job classification and ranking; multi-criteria evaluation framework
- Resume text + matched job description fusion; ontology KV Cache integration for contextual reasoning; semantic analysis optimization
- Privacy-preserving local inference; semantic analysis and intelligent scoring; bias-free evaluation pipeline
- Criteria-based scoring with transparency; detailed reasoning and recommendations; HR review integration with audit trails
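The privacy-obfuscation step in the pipeline above can be sketched as regex-based PII masking. The patterns and placeholder tokens below are illustrative assumptions, not Fairscan's actual anonymization rules:

```python
import re

# Illustrative PII patterns: email addresses and phone numbers are replaced
# with neutral tokens before any text reaches the scoring model.
PII_PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[EMAIL]"),
    (re.compile(r"\+?\d[\d\s()-]{7,}\d"), "[PHONE]"),
]

def anonymize(text):
    """Mask personal identifiers so downstream scoring cannot see them."""
    for pattern, token in PII_PATTERNS:
        text = pattern.sub(token, text)
    return text

print(anonymize("Contact Jane at jane.doe@example.com or +1 555 123 4567"))
# → Contact Jane at [EMAIL] or [PHONE]
```

A production pipeline would extend this to names, addresses, and other protected attributes, but the principle is the same: strip identifying signals before semantic analysis.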
- GPU Memory: 6 GB VRAM minimum
- CUDA: enabled environment required
- MySQL Workbench: 8.0
- Python: 3.10
- llama-cpp-python: latest version
- PyTorch: latest with CUDA 12.4 support
- Torchvision: latest with CUDA 12.4 support
$Env:LLAMA_CUBLAS = "1"
$Env:FORCE_CMAKE = "1"
# Add path_to_your NVIDIA GPU Computing Toolkit
# for example
$Env:CMAKE_ARGS = "-DGGML_CUDA=on -DCMAKE_GENERATOR_TOOLSET=cuda='C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1'"
pip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu124
pip install -r ./requirements.txt
# Run in PowerShell as Administrator
wsl --install
# Update package list
sudo apt update
# Install Redis
sudo apt install redis-server
# Start Redis service
sudo service redis-server start
# Enable Redis to start on boot
sudo systemctl enable redis-server
# Test Redis connection
redis-cli ping
# Should return: PONG
# Create the directory if it doesn't exist
mkdir -p scan/main_engine/models
huggingface-cli download sentence-transformers/all-MiniLM-L6-v2 --local-dir scan/main_engine/models/all-MiniLM-L6-v2
Make sure you have the required packages installed:
pip install huggingface-hub sentence-transformers
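Once resume and job-description embeddings are produced (e.g., by all-MiniLM-L6-v2), the best-job classification step reduces to cosine-similarity ranking. A minimal sketch with toy 3-dimensional vectors standing in for real embeddings:

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def rank_jobs(resume_vec, job_vecs):
    """Rank job descriptions by similarity to the resume, best match first."""
    return sorted(job_vecs, key=lambda name: cosine(resume_vec, job_vecs[name]), reverse=True)

# Toy embeddings (real MiniLM vectors are 384-dimensional).
resume = [0.9, 0.1, 0.3]
jobs = {"data_scientist": [0.8, 0.2, 0.4], "hr_manager": [0.1, 0.9, 0.2]}
print(rank_jobs(resume, jobs))  # → ['data_scientist', 'hr_manager']
```

In Fairscan itself the embeddings come from the SBERT model downloaded above; this sketch only shows the ranking arithmetic applied after encoding.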
Download the Mistral-7B-Instruct-v0.3-Q3_K_L.gguf model into the scan/main_engine/models directory.
- Run database migrations
python manage.py makemigrations
python manage.py migrate
- Create superuser account
python manage.py createsuperuser
- Start Celery workers
celery -A fairscan worker --loglevel=info -E -Q chainprocessing --pool=threads
- Start the development server
python manage.py runserver
- Access the application: open your browser and navigate to http://localhost:8000
- Log in to the system with the superuser credentials you created earlier
- Upload the ontology document: you can download a template using the "Download Template" button, then upload your completed ontology document
- Upload the resume dataset (I used the dataset from Kaggle: https://www.kaggle.com/datasets/palaksood97/resume-dataset)
- Process the data: click the "Process" button to begin analysis