Commit cd61d55

Commit message: fix
1 parent 4663868 commit cd61d55

File tree: 1 file changed (+11 −11 lines)

docs/run_maxtext/run_maxtext_via_xpk.md

Lines changed: 11 additions & 11 deletions
````diff
@@ -133,13 +133,13 @@ Ensure Docker is configured for sudoless use before running the build script. Fo
 - **For TPUs:**

   ```
-  bash dependencies/scripts/docker_build_dependency_image.sh DEVICE=tpu MODE=stable
+  bash src/dependencies/scripts/docker_build_dependency_image.sh DEVICE=tpu MODE=stable
   ```

 - **For GPUs:**

   ```
-  bash dependencies/scripts/docker_build_dependency_image.sh DEVICE=gpu MODE=stable
+  bash src/dependencies/scripts/docker_build_dependency_image.sh DEVICE=gpu MODE=stable
   ```

 ---
````
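The path change in this hunk moves the build script under `src/`. Before invoking Docker, the new location can be sanity-checked from the repository root; a minimal sketch (the helper name `check_script` is hypothetical, only the path comes from this commit's diff):

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check for the relocated build script.
check_script() {
  # $1: path to the dependency-image build script, relative to the repo root
  if [ -f "$1" ]; then
    echo "found: $1"
    return 0
  fi
  echo "missing: $1 (are you at the repository root?)" >&2
  return 1
}

# Path as of this commit; the old dependencies/scripts/... location is gone.
check_script "src/dependencies/scripts/docker_build_dependency_image.sh" || true
```

Running this from anywhere other than the repository root reports the script as missing, which is the same symptom the old pre-commit path produced.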
````diff
@@ -165,8 +165,8 @@ This guide focuses on submitting workloads to an existing cluster. Cluster creat
 2. **Configure gcloud CLI**

    ```
-   gcloud config set project ${PROJECT_ID}
-   gcloud config set compute/zone ${ZONE}
+   gcloud config set project ${PROJECT_ID?}
+   gcloud config set compute/zone ${ZONE?}
    ```

 ### A Note on multi-slice and multi-node runs
````
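The `${VAR}` → `${VAR?}` edits throughout this commit use POSIX parameter expansion: `${VAR?}` makes the shell abort the command with an error when `VAR` is unset, instead of silently substituting an empty string (which would, for example, set the gcloud project to nothing). A minimal sketch of the difference, using `CLUSTER_NAME` from the doc:

```shell
# ${VAR?} fails fast on an unset variable; plain ${VAR} expands to "".
unset CLUSTER_NAME

# Plain expansion: the command still runs, with an empty argument.
echo "plain: cluster='${CLUSTER_NAME}'"

# Guarded expansion inside a subshell: the subshell exits nonzero before
# echo runs, printing an error like "CLUSTER_NAME: parameter not set"
# (exact wording varies by shell).
if (echo "guarded: cluster='${CLUSTER_NAME?}'") 2>/dev/null; then
  echo "guarded expansion ran"
else
  echo "guarded expansion aborted: CLUSTER_NAME is unset"
fi
```

Note that `${VAR?}` triggers only when the variable is unset; the stricter `${VAR:?}` form also rejects a set-but-empty value.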
````diff
@@ -180,24 +180,24 @@ For instance, to run a job across **four TPU slices**, you would change `--num-s

 ```
 xpk workload create\
---cluster ${CLUSTER_NAME}\
+--cluster ${CLUSTER_NAME?}\
 --workload ${USER}-tpu-job\
 --base-docker-image maxtext_base_image\
 --tpu-type v5litepod-256\
 --num-slices 1\
---command "python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml run_name=${USER}-tpu-job base_output_directory=${BASE_OUTPUT_DIR} dataset_path=${DATASET_PATH} steps=100"
+--command "python3 -m maxtext.trainers.pre_train.train run_name=${USER}-tpu-job base_output_directory=${BASE_OUTPUT_DIR?} dataset_path=${DATASET_PATH?} steps=100"
 ```

 - **On your GPU cluster:**

 ```
 xpk workload create\
---cluster ${CLUSTER_NAME}\
+--cluster ${CLUSTER_NAME?}\
 --workload ${USER}-gpu-job\
 --base-docker-image maxtext_base_image\
 --device-type h100-80gb-8\
 --num-nodes 2\
---command "python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml run_name=${USER}-gpu-job base_output_directory=${BASE_OUTPUT_DIR} dataset_path=${DATASET_PATH} steps=100"
+--command "python3 -m maxtext.trainers.pre_train.train run_name=${USER}-gpu-job base_output_directory=${BASE_OUTPUT_DIR?} dataset_path=${DATASET_PATH?} steps=100"
 ```

 ---
````
````diff
@@ -233,7 +233,7 @@ The AOT artifact must be included in your Docker image. The `docker_upload_runne
 ```bash
 export CLOUD_IMAGE_NAME="${USER}-maxtext-aot-runner"

-bash dependencies/scripts/docker_upload_runner.sh CLOUD_IMAGE_NAME=${CLOUD_IMAGE_NAME}
+bash src/dependencies/scripts/docker_upload_runner.sh CLOUD_IMAGE_NAME=${CLOUD_IMAGE_NAME}
 ```

 ### Step 3: Create the XPK workload with the AOT artifact
````
````diff
@@ -276,13 +276,13 @@ Your job will now start faster by skipping the JAX compilation step on the clust
 - **List your jobs:**

   ```
-  xpk workload list --cluster ${CLUSTER_NAME}
+  xpk workload list --cluster ${CLUSTER_NAME?}
   ```

 - **Analyze output:** Checkpoints and other artifacts will be saved to the Google Cloud Storage bucket you specified in `BASE_OUTPUT_DIR`.

 - **Delete a job:**

   ```
-  xpk workload delete --cluster ${CLUSTER_NAME} --workload <your-workload-name>
+  xpk workload delete --cluster ${CLUSTER_NAME?} --workload <your-workload-name>
   ```
````
