Define a metadata format for textual inversion concepts #799

@allo-

Description

Is your feature request related to a problem? Please describe.
The textual inversion script currently creates a dict with one term and its associated embedding. Scripts typically load one or more embeddings and use the keys as tokens in the prompt.

This format does not allow metadata to be associated with the key(s). The longer the format exists, the more widely it will be used, so it would be good to define a more extensible format as soon as possible.

Describe the solution you'd like
Change the format from:

{
  "term": embedding_tensor
}

to something like

=== First version (leaving room for additional data for each term) ===

{
  "term": {
    "embedding": embedding_tensor
  }
}

or even

=== Second version (leaving room for additional data for each term and data for the whole file) ===

{
  "terms": {
    "term": {
      "embedding": embedding_tensor
    }
  }
}

which leaves room for adding metadata that not every script needs to understand like

=== First version ===

{
  "term": {
    "embedding": embedding_tensor,
    "training_steps": 2000,  # will be shown in the loading screen
    "training_files": ["hash1", "hash2", "hash3"],  # is used to continue training
    "some_other_metadata": "for_the_embedding"
  }
}

=== Second version ===

{
  "terms": {
    "term": {
      "embedding": embedding_tensor,
      "training_steps": 2000,  # will be shown in the loading screen
      "training_files": ["hash1", "hash2", "hash3"],  # is used to continue training
      "some_other_metadata": "for_the_embedding"
    }
  },
  "author": "My Name",  # can be extracted for attribution
  "license": "CC-0",
  "some_other_metadata": "for_the_file"
}

The additional keys can be read or ignored, which means front-end developers can agree on a standard later and convert between formats. Moving the tensor into a sub-structure is required, though: it is what makes it possible to add data that can be distinguished from the actual embedding without breaking other people's code.
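To illustrate how a loader could accept the old and the proposed layouts side by side, here is a minimal sketch. The function name `normalize_concepts` is hypothetical, plain Python lists stand in for embedding tensors, and it assumes that in the second version `terms` is a mapping from term to entry. A file is treated as second-version only when it has a top-level `terms` mapping that does not itself contain an `embedding` key.

```python
from typing import Any, Dict, Tuple


def normalize_concepts(
    data: Dict[str, Any],
) -> Tuple[Dict[str, Dict[str, Any]], Dict[str, Any]]:
    """Return (terms, file_metadata) for the legacy, first, or second layout.

    Hypothetical helper, not part of any existing script.
    """
    terms_value = data.get("terms")
    if isinstance(terms_value, dict) and "embedding" not in terms_value:
        # Second version: per-term entries live under "terms",
        # every other top-level key is file-level metadata.
        terms = terms_value
        file_metadata = {k: v for k, v in data.items() if k != "terms"}
    else:
        # Legacy or first version: every top-level key is a term.
        terms = data
        file_metadata = {}

    normalized = {}
    for term, value in terms.items():
        if isinstance(value, dict):
            # First/second version: keep the embedding and any extra keys.
            normalized[term] = dict(value)
        else:
            # Legacy: the value is the embedding itself.
            normalized[term] = {"embedding": value}
    return normalized, file_metadata


legacy = {"<cat>": [0.1, 0.2]}
second = {
    "terms": {"<cat>": {"embedding": [0.1, 0.2], "training_steps": 2000}},
    "author": "My Name",
}
legacy_terms, legacy_meta = normalize_concepts(legacy)
second_terms, second_meta = normalize_concepts(second)
```

A script that only needs the tensor reads `entry["embedding"]` and ignores everything else, so unknown metadata keys never break it.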

Describe alternatives you've considered
With the current structure, adding metadata would require workarounds such as a special "metadata" key that scripts loading the embeddings have to skip. Another option would be to store the metadata in a separate file.
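For comparison, the reserved-key workaround could look roughly like the sketch below. The names `RESERVED_KEYS` and `split_reserved` are hypothetical, and lists again stand in for tensors. It also shows the weakness: a loader that does not know about the reserved key treats it as just another term.

```python
from typing import Any, Dict, Tuple

# Hypothetical set of keys that loaders would have to agree to skip.
RESERVED_KEYS = {"metadata"}


def split_reserved(data: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Separate embeddings from the reserved "metadata" entry in a flat file."""
    embeddings = {k: v for k, v in data.items() if k not in RESERVED_KEYS}
    metadata = data.get("metadata", {})
    return embeddings, metadata


flat = {"<cat>": [0.1, 0.2], "metadata": {"author": "My Name"}}

# A loader aware of the convention sees one term plus metadata.
embeddings, metadata = split_reserved(flat)

# A loader unaware of the convention sees "metadata" as a prompt token.
naive_terms = sorted(flat)
```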

Additional context
It would often be useful to provide an embedding in different versions, and it should be easy to bundle the associated metadata with the embedding.
