Enable torch.Generator to support pytorch/xla generator implementation
#161369
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/161369
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures
As of commit 880d298 with merge base 8951df0.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Related issue: pytorch/xla#9159
Regarding out-of-tree accelerators, I think the PrivateUse1-based mechanism should be the recommended option - https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/detail/PrivateUse1HooksInterface.h#L76-L77
Hi @EikanWang, thanks for reviewing!
@EikanWang Thanks for the mention. @iwknow PyTorch supports registering generators for out-of-tree backends by inheriting from the AcceleratorHooksInterface class. You can refer to the link below for more information.
Therefore, it seems to me that you might need to add new files named
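For illustration only, a minimal sketch of what such a hooks class might look like for an out-of-tree backend. The method signatures are my assumption (mirroring the generator hooks referenced in the linked interfaces), not the exact header, and XlaGeneratorImpl is the backend's own GeneratorImpl, defined in pytorch/xla rather than here:

#include <ATen/core/Generator.h>
#include <ATen/detail/AcceleratorHooksInterface.h>

// Sketch of an out-of-tree hooks class; signatures are assumed to mirror
// AcceleratorHooksInterface and may differ from the real header.
struct XLAHooks final : public at::AcceleratorHooksInterface {
  // Pure virtual in the base interface, so any concrete hooks class needs it.
  bool hasPrimaryContext(c10::DeviceIndex device_index) const override {
    return true;
  }

  // Return the backend's cached default generator for the given device.
  const at::Generator& getDefaultGenerator(
      c10::DeviceIndex device_index) const override {
    // A real backend would keep one generator per device; a single static is
    // enough for this sketch. XlaGeneratorImpl lives in pytorch/xla.
    static at::Generator default_gen =
        at::make_generator<XlaGeneratorImpl>(device_index);
    return default_gen;
  }

  // Hand out a fresh generator, e.g. for torch.Generator(device='xla').
  at::Generator getNewGenerator(c10::DeviceIndex device_index) const override {
    return at::make_generator<XlaGeneratorImpl>(device_index);
  }
};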
torch/csrc/Generator.cpp (Outdated)
} else if (c10::impl::hasGenerator(device_type)) {
  self->cdata = at::Generator(c10::impl::makeGenerator(device));
} else {
  throw std::runtime_error("No generator available for device type: " +
      c10::toString(device_type));
So, we do not need to modify this one; all the accelerators except CPU will follow the same AcceleratorHooksInterface logic.
This is actually a brilliant idea, I will take a closer look.
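If I read the suggestion right, the constructor in torch/csrc/Generator.cpp would then route every non-CPU device through the accelerator hooks instead of per-device branches. A rough sketch of that dispatch, in the same fragment style as the quoted diff (not the final change, and assuming getNewGenerator is the relevant hook):

// Sketch only: CPU keeps its dedicated path, everything else goes through
// the hooks interface registered for its device type.
if (device.type() == at::kCPU) {
  self->cdata = at::make_generator<at::CPUGeneratorImpl>();
} else {
  self->cdata = at::globalContext()
                    .getAcceleratorHooksInterface(device.type())
                    .getNewGenerator(device.index());
}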
… from off-tree backends." This reverts commit 32c2b34.
Hi @FFFrog |
virtual DeviceIndex getNumDevices() const {
  return 0;
}
Why do we need this one? Can deviceCount from the base class AcceleratorHooksInterface help?
Good catch. I guess it was copy/pasted from getNumGPUs in XPUHooksInterface.h. Removed it to use DeviceIndex deviceCount from AcceleratorHooksInterface.
It would be better if we change the code below into: pytorch/aten/src/ATen/Context.h, lines 182 to 183 in f44ad54
@FFFrog Thanks for reviewing. PR updated.
Thanks, LGTM, let's trigger CI first.
@iwknow, I don't have valid approval permissions, so we'll have to get approval from @albanD.
Hey, @albanD, please take a look at this, thanks.
Also, it's necessary to find a better way to implement Hooks registration; otherwise, every time we add a new backend to PyTorch, we'll need to add a Hooks interface for it.
@albanD, a gentle ping.
Hi @FFFrog, can you please nominate another approver? It seems that @albanD is unresponsive.
That sounds OK to add to make pytorch/XLA work. Do you have a sister PR on the xla side to make it work that we should look at at the same time?
Also, I would like one of the pytorch/XLA maintainers to comment here to make sure this is good on their end.
@FFFrog xla is a bit of a weird backend for historical reasons. We don't plan on adding any other new backend that doesn't go through PrivateUse1 going forward, indeed.
Sister PR for xla: yes, I do have the code change for XLA. It's basically a concrete implementation of the interface, but it will be in the pytorch/XLA package. Would you like to review it? @qihqi has been reviewing the generator change; I think he can take a look at this change as well. @qihqi, please leave a comment if this change looks good to you and is what the torch/XLA team wants.
Thank you, I got it.
@qihqi ping again.
Maintainer approval is required before merging (you can merge on your own, see this link for more information).
@iwknow Could you open a PR in PyTorch/XLA with the
Also, rename one method in DeviceAccelerator to align with the interface.
@ysiraichi FYI, this is the XLA hooks implementation: pytorch/xla#9683
Thanks for the linked PR. I'm not sure if something else is needed on the xla side, but I'm happy to merge this one if it's enough.
Accepting here since @qihqi accepted as well and it doesn't impact other components.
aten/src/ATen/DeviceAccelerator.cpp (Outdated)
DETECT_AND_ASSIGN_ACCELERATOR_COMP(HIP)
DETECT_AND_ASSIGN_ACCELERATOR_COMP(MPS)
DETECT_AND_ASSIGN_ACCELERATOR_COMP(HPU)
DETECT_AND_ASSIGN_ACCELERATOR_COMP(XLA)
FYI, adding xla here has quite deep implications. It will change the behavior of many higher-level systems with respect to the xla device.
Should we add it then? It seems like the right thing to do, to my knowledge. If you feel we shouldn't make this change, I can revert it.
@albanD Could you give some examples of the behaviors that might change? Without this line, are we still going to see significant behavior changes w.r.t. PyTorch/XLA?
All the code that checks the current accelerator will now pick up xla when it didn't before.
https://github.com/search?q=repo%3Apytorch%2Fpytorch+current_accelerator&type=code are the ones in core, but there are many more out-of-core now.
Autograd will start to do stream handling for you, autocast will also have different behavior, pinned memory will start trying to use your host allocator, and out-of-core repos should use generic code instead of xla-specific code.
Yes, this concern is only about this one line (and the update below in this file). The rest of the changes are lower impact indeed.
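For readers following along, the kind of current-accelerator check being referred to looks roughly like this. This is an illustrative pattern only, assuming at::getAccelerator (ATen/DeviceAccelerator.h) is the entry point behind those call sites:

#include <ATen/DeviceAccelerator.h>

// Illustrative pattern: code like this "checks the current accelerator" and
// would start taking the accelerator branch for xla once XLA is added to the
// detection macro list above.
void maybe_take_accelerator_path() {
  const auto acc = at::getAccelerator(/*checked=*/false);
  if (acc.has_value()) {
    // Generic accelerator handling: streams, autocast, pinned memory, etc.
  } else {
    // CPU-only fallback.
  }
}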
After taking a closer look at getAcceleratorHooksInterface (pytorch/aten/src/ATen/Context.h, line 72 in 2fde10d):
const AcceleratorHooksInterface& getAcceleratorHooksInterface(
at::getAccelerator only makes a difference when opt_device_type doesn't have a value. When I search the usage of getAcceleratorHooksInterface (https://github.com/search?q=repo%3Apytorch%2Fpytorch%20getAcceleratorHooksInterface&type=code), it turns out that it is only used in a few places, and torch/csrc/autograd/engine.cpp is the only place that doesn't pass opt_device_type. The random number generator checks and passes opt_device_type. Therefore, getAcceleratorHooksInterface works properly even if at::getAccelerator doesn't recognize XLA.
Considering that evaluating the impact of changing at::getAccelerator is way beyond the scope of this change, I created a separate issue, #166054, to track the at::getAccelerator change and reverted the related change from this PR.
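To make the reasoning above concrete, a paraphrase of the two call patterns (my reading, not the verbatim PyTorch sources):

#include <ATen/Context.h>
#include <optional>

// Paraphrased call patterns, not copied from the PyTorch sources.
void illustrate_hooks_lookup(const at::Device& device) {
  // Generator path: the device type is passed explicitly, so the lookup never
  // consults at::getAccelerator and xla resolves to its own hooks.
  const auto& hooks_for_device =
      at::globalContext().getAcceleratorHooksInterface(device.type());

  // torch/csrc/autograd/engine.cpp-style call: no device type is supplied, so
  // the lookup falls back to at::getAccelerator() internally.
  const auto& hooks_for_current_accelerator =
      at::globalContext().getAcceleratorHooksInterface(std::nullopt);

  (void)hooks_for_device;
  (void)hooks_for_current_accelerator;
}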
@albanD please merge if there are no other questions. Thanks!
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
@FFFrog I am confused by the merge status. Did the merge fail? I see
Don't worry about the status of the PR; the changes have already been merged into the trunk. :D
Currently, the implementation of torch.Generator only supports the "cpu" and "cuda" device types: https://github.com/pytorch/pytorch/blob/main/torch/csrc/Generator.cpp#L55-L61
This change enables torch.Generator to support more device types by allowing any device backend to register its own generator factory through a Generator Registry. This is similar to what the "DeviceGuardImpl registry" does today.
Key Changes:
New registry API:
Python/C++ integration:
Backend extensibility:
Example usage:
This allows torch.Generator(device='xla') to return an XlaGeneratorImpl when the torch_xla extension is imported.
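For completeness, a rough sketch of what the backend-side registration under this Generator Registry idea might look like. registerGeneratorFactory is a hypothetical name; only the lookup side (c10::impl::hasGenerator / c10::impl::makeGenerator) appears in the diff discussed earlier in this thread, and per the review discussion the merged change routes through AcceleratorHooksInterface rather than this registry:

#include <ATen/core/Generator.h>
#include <c10/core/Device.h>
#include <c10/core/DeviceType.h>

namespace torch_xla {

// XlaGeneratorImpl is the backend's at::GeneratorImpl subclass (lives in the
// pytorch/xla tree, not shown here).
at::Generator make_xla_generator(const c10::Device& device) {
  return at::make_generator<XlaGeneratorImpl>(device.index());
}

// Register at library-load time so that torch.Generator(device='xla') can find
// the factory once torch_xla is imported. registerGeneratorFactory is a
// hypothetical registration entry point for the proposed registry.
static const bool kGeneratorFactoryRegistered = []() {
  c10::impl::registerGeneratorFactory(c10::DeviceType::XLA, &make_xla_generator);
  return true;
}();

}  // namespace torch_xla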