This repository contains the implementation of our experiments on parameter-efficient fine-tuning (PEFT) methods for adapting BERT to downstream tasks. Read the paper for full details.
In this project, we investigated:
- Which PEFT method performs best (LoRA, Houlsby adapters, or Adapter+) in terms of accuracy and parameter efficiency on a binary classification task (CoLA dataset)?
- Do all transformer layers need adapters? Specifically, what happens if we remove adapters from lower layers?
Key findings:
- Adapter+ is the most effective PEFT method on this binary classification task.
- LoRA trains stably but underperforms the adapter-based methods, even at comparable parameter counts.
- Layer ablation suggests that most of the gains come from adapters in the higher transformer layers.
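For context, all three methods add a small number of trainable weights to a frozen BERT; Houlsby adapters and Adapter+ do so with bottleneck modules inserted into each transformer layer. The sketch below is a minimal, illustrative bottleneck adapter, not the code used in our experiments; the class name, hidden size, bottleneck width, and activation are assumptions.

```python
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Illustrative Houlsby-style adapter: down-project, apply a nonlinearity,
    up-project, and add a residual connection. Only these weights are trained;
    the surrounding BERT weights stay frozen."""

    def __init__(self, hidden_size: int = 768, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck, hidden_size)
        # Start the up-projection at zero so the adapter initially acts
        # as an identity mapping around the pretrained sublayer output.
        nn.init.zeros_(self.up.weight)
        nn.init.zeros_(self.up.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(self.act(self.down(x)))
```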
- Install dependencies: `pip install -r requirements.txt`
- Hugging Face Authentication:
- Get your Hugging Face token from: https://huggingface.co/settings/tokens
- In `src/train.py`, replace `login()` with `login("your_token_here")`
- Or use my token (contact me to obtain it)
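For reference, the hard-coded login could look like the snippet below; this is a sketch, and the exact call site inside `src/train.py` may differ.

```python
from huggingface_hub import login

# Paste your personal access token from https://huggingface.co/settings/tokens.
# Do not commit a real token to version control.
login("your_token_here")
```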
`python train.py --figure figure4`
`python train.py --figure figure6`

Output: results are saved to the `results_figure4/` or `results_figure6/` directory as JSON files containing validation accuracy, parameter counts, and configuration details.
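To inspect the saved results quickly, a sketch like the one below can be used; the JSON field names (`val_accuracy`, `trainable_params`) are assumptions and may need to be adjusted to match what `train.py` actually writes.

```python
import json
from pathlib import Path

# Assumed layout: one JSON file per run in results_figure4/ (or results_figure6/).
for path in sorted(Path("results_figure4").glob("*.json")):
    run = json.loads(path.read_text())
    # Field names below are guesses; adjust them to your JSON schema.
    print(path.name,
          "val_accuracy:", run.get("val_accuracy"),
          "trainable_params:", run.get("trainable_params"))
```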
`python figure4.py`
`python figure6.py`

Output: displays plots showing the parameter-efficiency comparison and the layer-ablation results.
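If you prefer to build a figure4-style plot yourself, a minimal matplotlib sketch is shown below; it reuses the hypothetical field names from the loading example above and assumes one JSON file per run.

```python
import json
from pathlib import Path
import matplotlib.pyplot as plt

# Collect (trainable parameter count, validation accuracy) per run.
# Field names are assumptions; align them with the JSON written by train.py.
params, accs = [], []
for path in Path("results_figure4").glob("*.json"):
    run = json.loads(path.read_text())
    params.append(run["trainable_params"])
    accs.append(run["val_accuracy"])

plt.scatter(params, accs)
plt.xscale("log")
plt.xlabel("Trainable parameters")
plt.ylabel("Validation accuracy (CoLA)")
plt.title("Parameter-efficiency comparison (sketch)")
plt.show()
```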