Skip to content

Conversation

@vincentqb
Copy link
Contributor

@vincentqb vincentqb commented Sep 29, 2020

This PR takes the levenshtein distance from #632 and moves it to torchaudio. This also adds the docstring from vincentqb#3.

cc notebook

@vincentqb vincentqb requested a review from cpuhrsch September 29, 2020 23:20
@cpuhrsch cpuhrsch changed the title add levenshtein distance to torchaudio from pipeline example levenshtein distance Sep 30, 2020


class TestLevenshteinDistance(common_utils.TorchaudioTestCase):
@parameterized.expand(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd add edge cases such as ["abc", "", 3] or ["", "", 0] as well



def levenshtein_distance(r: Union[str, List[str]], h: Union[str, List[str]]):
def levenshtein_distance(r: Union[str, List[str]], h: Union[str, List[str]]) -> int:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a minimalism perspective, I can see that List[str] arguments can be useful, but aside from syntactic sugar, what else do they offer users?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, can you mix List[str] and str?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function runs with any two arbitrary sequences. I updated this. That being said, a user would only really using this for two Sequence[T] for the same T.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

relate to comment.

["aa", "aaa", 1],
["aaa", "aa", 1],
["abc", "bcd", 2],
[["hello", "world"], ["hello", "world", "!"], 1],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would [["hello", "world"], "world"] do?

Copy link
Contributor Author

@vincentqb vincentqb Sep 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would said that ["hello", "world"] is not the same as "w", and then "o" "r" "d" will need to be added, so the edit distance would be 1 replacement + 4 additions, so 5 edits. This is a particular case of comment.

Added as a test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could decide to detect such cases and compute the edit distance between the word, and each of the word, and return a list of edit distance instead.

This then lead to the question: what API do we expect a metric to offer, and how would this one align?

Copy link
Contributor Author

@vincentqb vincentqb Sep 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we go that path, then we'd treat as special cases:

  • [Sequence[T]], Sequence[T]]]
  • [Sequence[T], List[Sequence[T]]]
  • [List[Sequence[T]], Sequence[T]]
  • [List[Sequence[T]], List[Sequence[T]]]

The last is ambiguous: batch compare 1-1, compare all possible pairs, or edit distance between the two lists (e.g. comparing sentences)?

The two current use cases are

  • two strings
  • two sentences (list of strings)

Thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd stick to "two strings" and add the other behavior later on, if necessary. I know you already use it in your example, but how much worse does the code become?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I follow your suggestiong: are you saying the function only handles a reference and hypothesis string: levenshtein_distance("Hello", "ello")?

How do you suggest comparing the edit distance between the sentences "Hello World!" and "Bonjour World!"? Right now, the sentence is first split into by spaces, and the distance function is then applied on the two lists of words: levenshtein_distance(["Hello", "World!"], ["Bonjour", "World!"]).

Calculate the Levenshtein distance between two lists or strings.
The function computes an edit distance allowing deletion, insertion and substitution.
The result is an integer. Users may want to normalize by the length of the reference.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should imo also include an explanation of what happens when the user passes a list of strings and how that is different.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, and updated. Doc may need to be updated further following discussion from comment.

@vincentqb vincentqb force-pushed the levenshtein branch 3 times, most recently from b6ea6b7 to ec889c1 Compare September 30, 2020 16:31
@@ -0,0 +1,40 @@
from collections.abc import Sequence
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this really deserve to be in a new namespace, for user-side API perspective?
I could also see that this can live inside of torchaudio.functionals.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Splitting the implementation to the dedicated module makes sense. functionals.py grew too large in my opinion, but we can also import functions defined in this module in functionals.py so that users can access it via torchaudio.functionals

Copy link
Contributor Author

@vincentqb vincentqb Oct 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a very good question. We'll need to follow-up on where this should go. functional.py does seem like a good candidate.

@vincentqb
Copy link
Contributor Author

Closed by #1601

@vincentqb vincentqb closed this Jun 24, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants