Skip to content

Conversation

grondo
Copy link
Contributor

@grondo grondo commented Apr 20, 2022

This PR is a WIP definition of a distributed protocol for the Flux job execution module, and defines an initial design for distributing the job execution module across a Flux instance.

A prototype will be developed using this RFC as a starting point, with the idea that any or all of the specification detailed here may change along the way. However, having something proposed in writing will help capture the design constraints, along with discussion and changes along the way.

Copy link
Member

@chu11 chu11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome to see how this is coming together. just saw two nits when reading through it. no biggie, I know this is WIP

spec_33.rst Outdated
All job execution modules register a ``job-exec.hello`` service endpoint.
Downstream execution modules send a ``hello`` request to their upstream
peer to initiate the execution module protocol. An execution module SHALL
wait to send a ``hello`` response to its downstream peers until n an initial
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo here n an

spec_33.rst Outdated
Comment on lines 110 to 112
Possible values for ``type`` MAY be *add*, *remove*, or *check*, to *add*
a new job, *remove* an inactive job, or *check* that an existing job is
active as expected.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can replace the last sentence with "described below" (or equivalent)

@grondo
Copy link
Contributor Author

grondo commented Apr 21, 2022

Thanks @chu11! I fixed those two issues you noted and force-pushed the result.

Copy link
Member

@garlick garlick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think actually this is great as is. I had a couple of nit picky comments at the beginning but as I went on I realized that this is probably going to change as we get experience prototyping and so getting too focused on those details is likely a distraction. I left my initial comments in anyway, in case they are useful.

README.md Outdated
- [30/Job Urgency](spec_30.rst)
- [31/Job Constraints Specification](spec_31.rst)
- [32/Flux Job Execution Protocol Version 1](spec_32.rst)
- [33/Flux Job Execution Module Protocol Version 1](spec_33.rst)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: title the document something like "Distributed Job Control Protocol" or similar.

Main thought is to call out "distributed" to distinguish it from what we currently have, and to choose words that don't make it look too much like it is a replacement for RFC 32.

spec_33.rst Outdated
Comment on lines 43 to 44
in the job. Therefore, the execution service is necessarily distributed
among all ranks of a Flux instance.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we have an execution service that isn't currently distributed (at least, not like this one), maybe this last sentence should be dropped or replaced with something like

The initial execution service was a minimum viable implementation concentrated on rank 0, launching remote processes using the simple broker.rexec protocol. In contrast, the distributed job protocol takes advantage of the tree based overlay network structure to optimize performance, and is structured to allow jobs to be recovered upon Flux instance restart. It also leverages some design insights from the early wreck execution system.

(This is background after all so it doesn't hurt to add some back story here)

spec_33.rst Outdated
- Avoid presenting obstacles to the scaling of job size, the number of jobs
running concurrently, or job throughput.

- Support execution module reload.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be too much detail to add "assuming modules are loaded in root to leaves order"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I meant to remove this bullet, but yeah if we keep it then your suggestion should be added.

spec_33.rst Outdated
Comment on lines 85 to 96
the set of all running jobs for itself and all of its children. This state
SHALL include at a minimum the jobid, userid, job state, and the idset of
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One sentence refers to "the set" while next refers to "this state". Maybe use "state" for the first one too?

spec_33.rst Outdated
SHALL include at a minimum the jobid, userid, job state, and the idset of
execution targets on which the job has an allocation.

All job execution modules register a ``job-exec.hello`` service endpoint.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In other RFCs I think we've referred to the first word in the topic string as the "service" and the second as the "method". So maybe s/service endpoint/service method/ ?

execution targets on which the job has an allocation.

All job execution modules register a ``job-exec.hello`` service endpoint.
Downstream execution modules send a ``hello`` request to their upstream
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how it would break down exactly, but subheadings might be useful to make the document easier to scan. There's a transition from hello types to state update types that is a little run together here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(other subheading comments deleted as "too soon")

@grondo
Copy link
Contributor Author

grondo commented Apr 21, 2022

I think actually this is great as is. I had a couple of nit picky comments at the beginning but as I went on I realized that this is probably going to change as we get experience prototyping and so getting too focused on those details is likely a distraction. I left my initial comments in anyway, in case they are useful.

Feel free to push up any comments directly, I agree with them all. Whether you push individual commits or squash is up to you. I also imagine the details of the message types and contents will change as I try to implement a prototype.

Edit: and I apologize for the sloppy naming, this first draft was written and rewritten several times already

@garlick
Copy link
Member

garlick commented Apr 21, 2022

No worries, yeah I'll go ahead and push some changes if you're not working on it right now.

Problem: The distributed protocol between Flux job execution modules
is not designed or documented.

Add RFC 33 to cover a high-level design of a distributed job execution
protocol, used by the job execution system to launch, monitor, and
control the job shells of a Flux job.
@garlick
Copy link
Member

garlick commented Apr 21, 2022

Feel free to push up any comments directly, I agree with them all. Whether you push individual commits or squash is up to you.

I made the changes I had suggested including the title change and other changes mostly confined to the introductory sections. I didn't try to do any subheading reorg. I squashed my changes with yours, and then reread your comment and I realize you probably wanted me to just squash my changes together. Sorry?

@grondo
Copy link
Contributor Author

grondo commented Apr 21, 2022

and then reread your comment and I realize you probably wanted me to just squash my changes together. Sorry?

No, I meant do whatever you preferred.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants