Closed as not planned
Labels: question (Further information is requested), wontfix (This will not be worked on)
Description
Is your feature request related to a problem?
I have a common workflow where I use xbatcher in conjunction with torchdata.datapipes to load data, and it works really well for chaining together transformations. But when the resulting datapipe is put into a torch.utils.data.DataLoader with num_workers > 1, each worker creates its own copy of the full dataset and iterates over it. As a result, a single "epoch" is actually num_workers passes over the full data.
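For reference, here is a minimal sketch that reproduces the duplication. The synthetic dataset and dims are made up for illustration; xbatcher.BatchGenerator and torchdata's IterableWrapper are the real APIs involved:

```python
import numpy as np
import xarray as xr
import xbatcher
from torch.utils.data import DataLoader
from torchdata.datapipes.iter import IterableWrapper

# Synthetic dataset: 100 time steps of a small 2-D variable.
ds = xr.Dataset({"var": (("time", "x"), np.random.rand(100, 10))})

# 10 batches of 10 time steps each.
bgen = xbatcher.BatchGenerator(ds, input_dims={"time": 10})

def to_array(batch):
    # Named function (not a lambda) so the pipe can be pickled to workers.
    return batch["var"].values

# Chain a transformation onto the datapipe, then hand it to a DataLoader.
dp = IterableWrapper(bgen).map(to_array)

# Each worker replays the *entire* datapipe, so this counts
# num_workers * 10 = 20 batches instead of the expected 10.
loader = DataLoader(dp, batch_size=None, num_workers=2)
print(sum(1 for _ in loader))
```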
Describe the solution you'd like
Ideally, it would be nice to have a hook into xbatcher that we can pass as a worker_init_fn to the torch.utils.data.DataLoader so that each worker only handles its own unique portion of the dataset. A sketch of what such a hook might look like is below.
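For illustration, here is a hedged sketch of the kind of sharding such a hook would need to do. ShardedBatchDataset is hypothetical (not part of xbatcher today); torch.utils.data.get_worker_info() is the real API that exposes the worker id and worker count inside each worker process:

```python
from torch.utils.data import DataLoader, IterableDataset, get_worker_info

class ShardedBatchDataset(IterableDataset):
    """Hypothetical wrapper that shards an xbatcher.BatchGenerator
    across DataLoader workers."""

    def __init__(self, bgen):
        self.bgen = bgen

    def __iter__(self):
        info = get_worker_info()
        if info is None:
            # Single-process loading: yield every batch.
            yield from self.bgen
            return
        # Round-robin shard: worker k yields batches k, k + n, k + 2n, ...
        # so the union over all workers covers each batch exactly once.
        for i, batch in enumerate(self.bgen):
            if i % info.num_workers == info.id:
                yield batch

# Reusing `bgen` from the sketch above: one pass over the loader now
# yields 10 batches total rather than 20.
loader = DataLoader(ShardedBatchDataset(bgen), batch_size=None, num_workers=2)
print(sum(1 for _ in loader))
```

If I understand the torchdata docs correctly, appending .sharding_filter() to the datapipe achieves a similar round-robin split on recent PyTorch versions, since the DataLoader applies worker-level sharding to pipes containing that filter; a built-in xbatcher hook would just make this behavior the default rather than something each user has to rediscover.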
Describe alternatives you've considered
No response
Additional context
No response