Added set_attribute + added script module to include attributes #1183
Conversation
|
@NiklasGustafsson one thing that has been bothering me about the previous PR and this one, with regard to […]: while stepping through the […], the same thing would happen when calling regular module.cuda().cuda() if not for the memo fields (_deviceType, _deviceIndex). Generally, when deviceIndex = -1 it means the default device. Is there any way to confirm what the default device is, to avoid all these extra copies?
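To make the memo-field idea concrete, here is a minimal, self-contained C# sketch of the pattern being described. The class, the field names, and the MoveParameters helper are illustrative only, not TorchSharp's actual implementation.

```csharp
using System;

enum DeviceType { CPU, CUDA }

class ModuleSketch
{
    // Memo fields: remember the device the module was last moved to.
    private DeviceType _deviceType = DeviceType.CPU;
    private int _deviceIndex = -1;

    public ModuleSketch to(DeviceType type, int index)
    {
        // If the memoized device matches the request, skip the move entirely.
        if (type == _deviceType && index == _deviceIndex)
            return this;

        MoveParameters(type, index);   // stand-in for copying parameters/buffers
        _deviceType = type;
        _deviceIndex = index;
        return this;
    }

    public ModuleSketch cuda(int index = -1) => to(DeviceType.CUDA, index);

    private void MoveParameters(DeviceType type, int index) =>
        Console.WriteLine($"moving parameters to {type}:{index}");
}

class Program
{
    static void Main()
    {
        var m = new ModuleSketch();
        m.cuda().cuda();   // second call is a no-op thanks to the memo fields
        m.cuda(0);         // index 0 still looks different from index -1, so it moves again
    }
}
```

With -1 left unresolved, cuda(-1) followed by cuda(0) looks like a device change even when the default device is 0, which is exactly the extra-copy concern here.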
|
Interesting. No, I haven't seen a way of enumerating the devices or getting metadata on them. -1 is supposed to imply the "best" available device, which isn't necessarily 0, if I understand the logic correctly. On my workstation, I have a P400 and a 2080 SUPER; -1 is supposed to pick the latter.
|
Ah, I see. |
I don't think that's true. There's CPU to consider, too, and for type conversion, if the source and target types are the same...
|
Right.
|
I browsed through the libtorch code, and I believe that for CUDA, index -1 gets converted into a real index here. Do we want to add handling so that when ".to()" is called with CUDA and index = -1, we pull the index using that function and use it as the basis for checking whether parameters need to be moved? Alternatively, since the parameters aren't actually being copied, it's not a huge performance issue and we can just let it be.
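A hedged sketch of that alternative, extending the ModuleSketch above: resolve index -1 to the actual current CUDA device before comparing against the memoized one. GetCurrentCudaDeviceIndex() is a hypothetical stand-in for a binding to the libtorch lookup mentioned above; it is not an existing TorchSharp API.

```csharp
using System;

class ModuleSketchResolved
{
    private DeviceType _deviceType = DeviceType.CPU;   // DeviceType as in the sketch above
    private int _deviceIndex = -1;

    public ModuleSketchResolved to(DeviceType type, int index)
    {
        var resolved = ResolveIndex(type, index);

        // After resolution, to(CUDA, -1) and to(CUDA, 0) compare equal on a machine
        // whose current device is 0, so no redundant move is attempted.
        if (type == _deviceType && resolved == _deviceIndex)
            return this;

        MoveParameters(type, resolved);
        _deviceType = type;
        _deviceIndex = resolved;
        return this;
    }

    private static int ResolveIndex(DeviceType type, int index) =>
        (type == DeviceType.CUDA && index == -1) ? GetCurrentCudaDeviceIndex() : index;

    // Hypothetical native query; hard-coded here so the sketch stands alone.
    private static int GetCurrentCudaDeviceIndex() => 0;

    private void MoveParameters(DeviceType type, int index) =>
        Console.WriteLine($"moving parameters to {type}:{index}");
}
```

Whether the extra lookup is worth it depends on how closely the memo check should mirror libtorch's own resolution, which is the trade-off discussed below.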
I'm all for following the PyTorch behavior as closely as possible, everywhere, with one exception: if a functionally equivalent alternative is higher-performance, then go with the higher performance.
|
The PyTorch behavior is to not allow a CUDA device index of -1.
|
For now, the behavior is the same as that of PyTorch, so I think it's fine leaving it. |
Following the discussion on #1126.