
I tried to finetune Qwen3-1.7B on the simplescaling/s1K-1.1_tokenized dataset using this repository's code, but I cannot figure out why I get worse inference results #128


Description

@hzgdeer

s1.1 shows great improvement for LLMs serving as agents!
I tried to finetune Qwen3-1.7B on the simplescaling/s1K-1.1_tokenized dataset using this repository's code, but I cannot figure out why I get worse inference results.

Reference run: bash train/sft.sh

{'train_runtime': 5268.8407, 'train_samples_per_second': 0.949, 'train_steps_per_second': 0.119, 'train_loss': 0.1172730620391667, 'epoch': 5.0}
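As a rough sanity check on those log numbers (a minimal sketch, assuming s1K-1.1 contains 1,000 training examples):

```python
# Rough consistency check of the reported log line.
# Assumption: the s1K-1.1 dataset contains 1,000 training examples.
train_runtime = 5268.8407      # seconds, from the log above
samples_per_second = 0.949     # from the log above
total_samples = train_runtime * samples_per_second
print(round(total_samples))    # -> 5000, i.e. 1000 examples * 5 epochs, consistent with 'epoch': 5.0
```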

uid="$(date +%Y%m%d_%H%M%S)"
base_model="../models/Qwen-Qwen3-1.7B/"
lr=1e-5
min_lr=0
epochs=3
weight_decay=1e-4 # -> the same training pipe as slurm_training
micro_batch_size=1 # -> batch_size will be 16 if 16 gpus
gradient_accumulation_steps=16 # requires more GPU memory
max_steps=-1
gpu_count=$(nvidia-smi -L | wc -l)
push_to_hub=false

torchrun --nproc-per-node ${gpu_count} --master_port 12345 \
    train/sft-8B.py \
    --block_size=1024 \
    --per_device_train_batch_size=${micro_batch_size} \
    --per_device_eval_batch_size=${micro_batch_size} \
    --gradient_accumulation_steps=${gradient_accumulation_steps} \
    --num_train_epochs=${epochs} \
    --train_file_path="./simplescaling/s1K-1.1_tokenized" \
    --model_name=${base_model} \
    --warmup_ratio=0.05 \
    --fsdp="full_shard auto_wrap" \
    --fsdp_config="train/fsdp_config_qwen.json" \
    --bf16=True \
    --eval_strategy="no" \
    --logging_steps=1 \
    --save_strategy="no" \
    --lr_scheduler_type="cosine" \
    --learning_rate=${lr} \
    --weight_decay=${weight_decay} \
    --adam_beta1=0.9 \
    --adam_beta2=0.95 \
    --output_dir="ckpts/s1-${uid}" \
    --push_to_hub=${push_to_hub} \
    --save_only_model=True \
    --gradient_checkpointing=True
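For reference, the effective (global) batch size implied by the flags above works out as follows (a minimal sketch; gpu_count=1 reflects the single-A100 run described below):

```python
# Effective batch size = per-device batch * gradient accumulation steps * number of GPUs.
micro_batch_size = 1
gradient_accumulation_steps = 16
gpu_count = 1  # single A100 80GB in this run
effective_batch_size = micro_batch_size * gradient_accumulation_steps * gpu_count
print(effective_batch_size)  # -> 16 sequences per optimizer step
```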

Training was run on a single A100 80GB GPU.

The training progress was as follows:

[Image: screenshot of the training progress]

Could you please help me? Thank you!
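For reference, a minimal sketch of how the finetuned checkpoint could be loaded for generation (assumptions: Hugging Face transformers, a placeholder checkpoint path in place of the real ckpts/s1-<uid>, and that the tokenizer was saved alongside the model; this is illustrative, not the exact inference setup used here):

```python
# Illustrative only: load the finetuned checkpoint with Hugging Face transformers.
# The checkpoint path and prompt are placeholders, not values from this issue.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

ckpt = "ckpts/s1-20250101_000000"  # placeholder uid; substitute the real output_dir
tokenizer = AutoTokenizer.from_pretrained(ckpt)  # assumes the tokenizer was saved here too
model = AutoModelForCausalLM.from_pretrained(ckpt, torch_dtype=torch.bfloat16, device_map="auto")

messages = [{"role": "user", "content": "What is 17 * 23?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```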
