Closed
Labels
feature (Is an improvement or enhancement), help wanted (Open to be worked on), strategy: dp (removed in pl) (DataParallel)
Description
In many image-generation tasks with GANs, the generator and the discriminator are trained on the same generated image within a single iteration.
In PyTorch Lightning, the procedure is written as below:
def training_step(self, batch, batch_nb, optimizer_i):
    foo = batch['foo']
    bar = batch['bar']
    if optimizer_i == 0:  # train discriminator
        self.foo_out = self.netG(foo)  # register as an instance attribute
        # calc d_loss
        d_loss = ...
        return {'loss': d_loss}
    elif optimizer_i == 1:  # train generator
        # common reconstruction error
        g_loss = F.l1_loss(self.foo_out, bar)
        # other losses
        ...
        return {'loss': g_loss}

This works well on a single GPU; however, self.foo_out has already been flushed by the time the optimizer_i == 1 branch runs when DP is set.
I think this is undesired behavior. Any help or fix?
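
A minimal sketch of a workaround, reusing the hypothetical netG, batch keys, and losses from the snippet above: recompute the generator output in each branch instead of caching it on self. Under DataParallel the module is replicated onto each GPU for every forward pass and the replicas are discarded afterwards, so an attribute assigned inside training_step lands on a throwaway replica rather than on the original module.

def training_step(self, batch, batch_nb, optimizer_i):
    foo = batch['foo']
    bar = batch['bar']
    if optimizer_i == 0:  # train discriminator
        # detach so the discriminator loss does not backprop into the generator
        foo_out = self.netG(foo).detach()
        d_loss = ...  # discriminator loss computed from foo_out
        return {'loss': d_loss}
    elif optimizer_i == 1:  # train generator
        # recompute instead of reading a cached attribute; under DP the cached
        # value would only exist on a replica that has already been discarded
        foo_out = self.netG(foo)
        g_loss = F.l1_loss(foo_out, bar)
        # other losses
        ...
        return {'loss': g_loss}

This costs one extra generator forward pass per iteration, but it keeps each training_step self-contained, which is what DataParallel's replicate-scatter-gather execution model assumes.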