Commit e0273ac

clean up example
Signed-off-by: vsoch <[email protected]>
1 parent 25bbbe7 commit e0273ac

1 file changed: +0 -23 lines changed

examples/test/jobset-mnist.yaml

Lines changed: 0 additions & 23 deletions
@@ -7,18 +7,6 @@ spec:
 pullPolicy: Never
 workflow:
 completed: 10
-# What we could try:
-# Having a change to epochs
-# Stopping the workflow when we reach a certain threshold accuracy, etc.
-# events:
-# Custom metric - derived from parsing lammps log
-# Max decreases down to 1 (default), with 3 breaks in between, and 3 times
-# - metric: mean.lammps.lammps-walltime
-# when: "<= 10"
-# action: shrink
-# repetitions: 3
-# backoff: 3
-# minSize: 1

 cluster:
 maxSize: 4
@@ -40,14 +28,3 @@ spec:
 PYTHONUNBUFFERED: "0"
 epochs: "1"
 script: torchrun --rdzv_id=123 --nnodes=${nodes} --nproc_per_node=1 --master_addr=${jobname}-jobset-0-0.${jobname}.default.svc.cluster.local --master_port=$MASTER_PORT --node_rank=$RANK mnist.py --epochs=${epochs} --log-interval=1
-
-# Event parsing. Assume for a log for now
-# events:
-# script: |
-# def parse_log(log):
-# import re
-# match = re.search('Total wall time: (?P<walltime>.*)', log)
-# walltime = match.groupdict()['walltime']
-# hours, minutes, seconds = walltime.split(':')
-# walltime = (float(hours) * 60 * 60) + (float(minutes) * 60) + (float(seconds))
-# return {"lammps-walltime": walltime}
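
For reference, the commented-out log parser removed in the second hunk can be written as a standalone Python function. This is a minimal sketch and not the project's implementation. It assumes the log contains a line of the form `Total wall time: H:MM:SS`. The stricter digit-only pattern and the empty-dict fallback are additions that the removed snippet did not have, and `sample_log` is a hypothetical excerpt used only for illustration.

```python
import re


def parse_log(log):
    """Return the wall time in seconds parsed from a LAMMPS-style log string."""
    # The removed snippet captured '.*'; matching digits only is an assumption
    # made here so that trailing text on the line cannot break the split below.
    match = re.search(r"Total wall time: (?P<walltime>\d+:\d+:\d+)", log)
    if match is None:
        # Added guard: the removed snippet assumed the line is always present.
        return {}
    hours, minutes, seconds = match.group("walltime").split(":")
    walltime = float(hours) * 60 * 60 + float(minutes) * 60 + float(seconds)
    return {"lammps-walltime": walltime}


# Hypothetical log excerpt, not taken from a real run.
sample_log = "Loop time of 4.2 on 4 procs\nTotal wall time: 0:01:05\n"
print(parse_log(sample_log))  # {'lammps-walltime': 65.0}
```

Returning an empty dict when the line is missing is one possible design choice. Whether the consumer of the `events` script would prefer an exception in that case is not stated in the commit.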
