README.md (1 addition, 1 deletion)
@@ -144,7 +144,7 @@ torchrun --nproc_per_node=4 --master_port=<your_random_port> train.py \
### Warning
`fsdp_transformer_layer_cls_to_wrap` must be set to the name of the specific decoder layer.
The LLaMA Hugging Face PR is not stable.
- Earlier commits used the name `LLaMADecoderLayer` for their decoder layer (the commit hash our code is based on this).
+ Earlier commits used the name `LLaMADecoderLayer` for their decoder layer (as did the commit hash our code is based on).
More recent commits use `LlamaDecoderLayer` (note the capitalization difference).
Not setting `fsdp_transformer_layer_cls_to_wrap` to the correct name will lead to drastic slowdowns in training.
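
A minimal way to check which name your installed LLaMA code actually uses is to list the decoder-layer classes in the modeling module. This is a sketch under one assumption: that the LLaMA port is already merged into your `transformers` install at the standard module path; older checkouts ship the modeling code elsewhere, so adjust the import accordingly.

```python
import inspect

# Module path assumes a transformers version that includes the merged
# LLaMA port; adjust for earlier commits of the LLaMA PR.
import transformers.models.llama.modeling_llama as modeling_llama

# Print every class whose name ends in "DecoderLayer", e.g.
# LlamaDecoderLayer (recent commits) or LLaMADecoderLayer (earlier ones).
for name, _ in inspect.getmembers(modeling_llama, inspect.isclass):
    if name.endswith("DecoderLayer"):
        print(name)
```

Pass whatever name this prints to `fsdp_transformer_layer_cls_to_wrap` in the `torchrun` command above.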
