Skip to content

Legacy interface for prototype datasets? #5040

@pmeier

Description

@pmeier

The prototype datasets change the interface in two ways:

  1. The input parameters are a little different in some cases. For example in the current API the MNIST dataset would be instantiated with datasets.MNIST(..., train=True) whereas now it looks like datasets.load("mnist", split="train").
  2. The output is completely different. Before we returned a tuple (sometimes of varying length) whereas now we always return a dictionary. Furthermore, before we used PIL images and numpy arrays as return types, whereas now we always use tensor subclasses.

To lower the burden to move to the new style datasets a little, we could have a legacy: bool = False keyword argument on datasets.load(). For all datasets that have a legacy variant, we could simply implement two functions that map the input and output. If a dataset doesn't support a legacy variant, we could simply error out.

cc @pmeier @bjuncek

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions