Skip to content

[SFT] Can't apply_chat_template prompt-completion dataset. so, training result always bad. #3468

@HERIUN

Description

@HERIUN

Reproduction

from datasets import load_dataset
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer
from trl import (
    SFTConfig,
    SFTTrainer
)

model_name = "Qwen/Qwen2.5-3B-Instruct"
dataset_name = "nvidia/OpenMathInstruct-2"
train_split = "train_1M"

raw_datasets = load_dataset(dataset_name, split=train_split)
dataset = raw_datasets.rename_columns({
        "problem" : "prompt",
        "generated_solution" : "completion"
    })

train_eval_dataset = dataset[train_split].train_test_split(test_size=0.1, seed=42)
train_dataset = train_eval_dataset["train"].take(1000)
eval_dataset = train_eval_dataset["test"].take(100)

training_args = SFTConfig(
    output_dir="/tmp",
)

model = AutoModelForCausalLM.from_pretrained(model_name)

trainer = SFTTrainer(
        model,
        args=training_args,  ## some training_args
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
    )

trainer.train()

In SFTTrainer'_prepare_dataset() doesn't apply chat_template to prompt-completion dataset. just prompt+completion+EOS

This works, but it doesn't seem to learn anything.(In my case repetition answer happen)

System Info

  • Platform: Linux-6.11.0-1013-gcp-x86_64-with-glibc2.39
  • Python version: 3.12.9
  • TRL version: 0.17.0
  • PyTorch version: 2.6.0
  • CUDA device(s): NVIDIA L4, NVIDIA L4, NVIDIA L4, NVIDIA L4
  • Transformers version: 4.51.3
  • Accelerate version: 1.3.0
  • Accelerate config: not found
  • Datasets version: 3.5.0
  • HF Hub version: 0.30.2
  • bitsandbytes version: 0.45.5
  • DeepSpeed version: 0.16.7
  • Diffusers version: 0.33.1
  • Liger-Kernel version: 0.5.9
  • LLM-Blender version: 0.0.2
  • OpenAI version: 1.76.0
  • PEFT version: 0.15.2
  • vLLM version: 0.8.5.post1

Checklist

  • I have checked that my issue isn't already filed (see open issues)
  • I have included my system information
  • Any code provided is minimal, complete, and reproducible (more on MREs)
  • Any code provided is properly formatted in code blocks, (no screenshot, more on code blocks)
  • Any traceback provided is complete

Metadata

Metadata

Assignees

No one assigned

    Labels

    🏋 SFTRelated to SFT🐛 bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions