
Conversation

@vincentqb vincentqb commented May 12, 2020

We implement a reference pipeline using the wav2letter model to train on LibriSpeech. The structure is inspired by torchvision's reference implementation.

As discussed here, this code was initially implemented in this python script, which was converted from this notebook to be run on SLURM using a bash script and sbatch.

There are at least a few more things to do:

  • Clean up code from the notebook conversion.
  • Replace the model implementation with a call to torchaudio.
  • Remove SLURM-specific code.
  • Add an option to activate torch.autograd.set_detect_anomaly(True).
  • Bring back the viterbi decoder.
  • Make distributed work with 1 GPU per python process?
  • Add 10 ms shift data augmentation.
  • Publish pre-trained weights.

Note:

  • 4795a72 removed the modified wav2letter model with dropout and custom hidden units in order to use the exact implementation currently available within torchaudio. This commit also removes the SLURM termination signal capture.
  • d8ee1e9 removed the 2-gram viterbi decoder, as it was not improving CER, along with an unused dataclass.

See also post by assemblyai, and internal.
cc @zhangguanheng66 for pytorch/text#767

@cpuhrsch cpuhrsch changed the title Example pipeline [WIP] Example pipeline May 12, 2020
@vincentqb vincentqb force-pushed the wav2letter branch 2 times, most recently from 2da10fd to 728f6c8 Compare May 12, 2020 19:43
return len(self._iterable)


class MapMemoryCache(torch.utils.data.Dataset):
Contributor

This is great! Seems like a pretty generic object that should also be useful for core. It's effectively a readthrough cache backed by RAM similar to how diskcache_iterator is a readthrough cache backed by disk.
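
For reference, a minimal sketch of such a read-through RAM cache over a map-style dataset (the class name follows the PR; the simplified __getitem__ matches a later review suggestion):

import torch

class MapMemoryCache(torch.utils.data.Dataset):
    """Wrap a map-style dataset and cache each item in RAM on first access."""

    def __init__(self, dataset):
        self.dataset = dataset
        self._cache = [None] * len(dataset)

    def __getitem__(self, n):
        if self._cache[n] is None:
            self._cache[n] = self.dataset[n]
        return self._cache[n]

    def __len__(self):
        return len(self.dataset)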

# return create(["train-clean-100", "train-clean-360", "train-other-500"]), create(["dev-clean", "dev-other"]), None


def which_set(filename, validation_percentage, testing_percentage):
Contributor

Why is this necessary as opposed to a seeded shuffle + split?

Contributor Author (@vincentqb, May 13, 2020)

This is the script recommended in SpeechCommands' readme for splitting train/dev/test, adapted to this use case. I suggest we include it with SpeechCommands in torchaudio.

The advantage of this approach is that words and speakers are better distributed between the different splits.

Contributor

How are they better distributed in comparison to a random shuffle?

Contributor Author

Sorry, the real advantage of this approach is listed in the docstring:

We want to keep files in the same training, validation, or testing sets even if new ones are added over time. This makes it less likely that testing samples will accidentally be reused in training when long runs are restarted for example.
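
For context, a minimal sketch of that hashing approach, roughly following the Speech Commands example script (the _nohash_ convention and the MAX_NUM_WAVS_PER_CLASS constant come from that script):

import hashlib
import os
import re

MAX_NUM_WAVS_PER_CLASS = 2 ** 27 - 1  # large modulus keeps the bucketing stable

def which_set(filename, validation_percentage, testing_percentage):
    """Deterministically assign a file to a split by hashing its name,
    so the assignment stays stable when new files are added later."""
    base_name = os.path.basename(filename)
    # Ignore anything after '_nohash_' so variants of the same recording
    # always land in the same split.
    hash_name = re.sub(r"_nohash_.*$", "", base_name)
    hash_value = int(hashlib.sha1(hash_name.encode("utf-8")).hexdigest(), 16)
    percentage_hash = (hash_value % (MAX_NUM_WAVS_PER_CLASS + 1)) * (
        100.0 / MAX_NUM_WAVS_PER_CLASS)
    if percentage_hash < validation_percentage:
        return "validation"
    if percentage_hash < validation_percentage + testing_percentage:
        return "testing"
    return "training"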

if c is None:
    c = count
else:
    c = c + count
Contributor (@cpuhrsch, May 13, 2020)

I think you could also use update and initialize c as an empty Counter.

Creating a Counter and adding to one over each iteration might be quite slow in comparison.
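
A sketch of the suggested pattern (items and the per-item count here are placeholders for the loop in the PR):

from collections import Counter

c = Counter()
for item in items:
    count = Counter(item)  # hypothetical per-item counts
    c.update(count)        # in-place; avoids allocating a new Counter per iteration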

return output[:, 0, :]


def levenshtein_distance_list(r, h):
Contributor

This seems unused

Contributor Author

Indeed, only one version is needed. The list version is faster than the pytorch version, though.
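
For context, a minimal sketch of the list-based dynamic-programming version under discussion (a standard two-row Levenshtein implementation, not necessarily the PR's exact code):

def levenshtein_distance_list(r, h):
    """Edit distance between sequences r and h via two-row dynamic programming."""
    prev = list(range(len(h) + 1))
    for i, r_i in enumerate(r, start=1):
        curr = [i] + [0] * len(h)
        for j, h_j in enumerate(h, start=1):
            cost = 0 if r_i == h_j else 1
            curr[j] = min(prev[j] + 1,         # deletion
                          curr[j - 1] + 1,     # insertion
                          prev[j - 1] + cost)  # substitution
        prev = curr
    return prev[-1]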

    data = LIBRISPEECH(
        root, tag, folder_in_archive=folder_in_archive, download=False)
else:
    data = torch.utils.data.ConcatDataset([LIBRISPEECH(
Contributor

This could also be done using sum, since a ConcatDataset can be created via add.

Contributor Author

Good point
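
A sketch of the suggestion (root, tags, and folder_in_archive are hypothetical; map-style Datasets implement +, which produces a ConcatDataset):

from torchaudio.datasets import LIBRISPEECH

datasets = [
    LIBRISPEECH(root, tag, folder_in_archive=folder_in_archive, download=False)
    for tag in tags
]
data = sum(datasets[1:], datasets[0])  # chains into a single ConcatDataset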

@cpuhrsch (Contributor)

Looks pretty good :)

@codecov

codecov bot commented May 14, 2020

Codecov Report

Merging #632 into master will not change coverage.
The diff coverage is n/a.


@@           Coverage Diff           @@
##           master     #632   +/-   ##
=======================================
  Coverage   89.99%   89.99%           
=======================================
  Files          35       35           
  Lines        2719     2719           
=======================================
  Hits         2447     2447           
  Misses        272      272           


@cpuhrsch (Contributor)

I'd add "split into multiple files" as another todo as well.

import torch


def levenshtein_distance(r: str, h: str, device: Optional[str] = None):
Contributor

This already seems worth a separate PR. Do you agree? In particular, with the C++ extension we can create a JIT-able, fast version of this.

return args


def signal_handler(a, b):
Contributor

The need for functions like this worries me, because I'd imagine most users are not aware of their necessity or purpose.

Contributor Author

They are not "needed" :) I'll remove them to avoid confusion.

@jimchen90 jimchen90 mentioned this pull request Jun 24, 2020
2 tasks
@vincentqb vincentqb force-pushed the wav2letter branch 7 times, most recently from c9b4d6c to d347fa4 Compare July 1, 2020 22:12
        weight_decay=args.weight_decay,
    )
else:
    raise ValueError("Selected optimizer not supported")
Contributor

Repeat the given option, i.e. "Selected optimizer {} not supported".format(args.optimizer), if you're going for this type of input sanitization, to make it easier for the user to debug.

Contributor

Also, this code is unreachable from the CLI, so NotImplementedError makes more sense: the only way to reach it is when you intend to add a new choice and changed the CLI parser but forgot to add the actual implementation.

Contributor

Also, I would extract this into a helper function, e.g. _get_optimizer(...), so that the main logic becomes readable.
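
A sketch combining the three suggestions (a hypothetical _get_optimizer helper; the argument names mirror the snippet above):

import torch

def _get_optimizer(model, args):
    if args.optimizer == "sgd":
        return torch.optim.SGD(
            model.parameters(),
            lr=args.learning_rate,
            momentum=args.momentum,
            weight_decay=args.weight_decay,
        )
    if args.optimizer == "adadelta":
        return torch.optim.Adadelta(
            model.parameters(),
            lr=args.learning_rate,
            weight_decay=args.weight_decay,
        )
    # Only reachable if the CLI parser accepts a choice not implemented here.
    raise NotImplementedError(
        "Selected optimizer {} not supported".format(args.optimizer))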

device=tensors[0].device,
)

tensors = torch.nn.utils.rnn.pad_sequence(tensors, batch_first=True)
Contributor

A wrapped / generalized version of this could form a useful torchaudio function.

Contributor Author

pad_sequence requires transposes as it is, since in torchaudio it is the last dimension that we want to pad. I re-implemented pad_sequence for this use case.
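
A sketch of the transpose-based workaround described here (pad_sequence pads along the first dimension of each element, while torchaudio waveforms are (..., time), so the time axis has to be moved first):

import torch

def pad_sequence_last_dim(tensors):
    """Pad a list of (channel, time) tensors along the last (time) dimension."""
    tensors = [t.transpose(0, -1) for t in tensors]  # move time to the front
    padded = torch.nn.utils.rnn.pad_sequence(tensors, batch_first=True)
    return padded.transpose(1, -1)  # (batch, time, channel) -> (batch, channel, time)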

def encode(self, iterable):
    if isinstance(iterable, list):
        return [self.encode(i) for i in iterable]
    else:
Contributor

What if I pass an iterable that yields lists? What's the base-case type here? Maybe that's an easier case to branch on. Also, as a very minor nit, I like using returns to avoid "else", so you could write:

if isinstance(iterable, list):
    return [self.encode(i) for i in iterable]
return [self.mapping[i] + self.mapping[self.char_blank] for i in iterable]

from torch import topk


class GreedyDecoder:
Contributor

You could generalize this file to be called "decoders.py" and also fold in things such as compute_error_rates

Contributor

This class is stateless. Can it be a function?

Contributor

It could be a functional corresponding to a transform, but really it's a step towards our beam-search work.
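
For context, a minimal sketch of a greedy (best-path) CTC decoder of the kind discussed here (assumes emissions of shape (batch, time, num_labels) and a known blank index; not necessarily the PR's exact interface):

import torch

class GreedyDecoder:
    """Best-path CTC decoding: argmax per frame, collapse repeats, drop blanks."""

    def __init__(self, blank=0):
        self.blank = blank

    def __call__(self, emissions):
        indices = emissions.argmax(dim=-1)  # (batch, time)
        decoded = []
        for seq in indices:
            seq = torch.unique_consecutive(seq)  # collapse repeated labels
            seq = seq[seq != self.blank]         # remove blank tokens
            decoded.append(seq.tolist())
        return decoded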

metric["dataset length"] += metric["batch size"]
metric["iteration"] += 1
metric["loss"] = loss.item()
metric["cumulative loss"] += metric["loss"]
Contributor

I'd abstract this accumulation for both training and evaluation and merge it into a single function. That way you'll always be sure that both are using the exact same calculations, since that's the last place you'd want a bug.

Contributor

Second that. All this logging logic should live on the logger side as a method; that will make the training loop more readable and achieve better decoupling.
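
A sketch of what moving the accumulation onto the logger could look like (a hypothetical record method on the PR's MetricLogger, which subclasses defaultdict):

from collections import defaultdict

class MetricLogger(defaultdict):
    def __init__(self, name):
        super().__init__(float)
        self["name"] = name

    def record(self, loss, batch_size):
        """Shared accumulation for both the training and evaluation loops."""
        self["batch size"] = batch_size
        self["dataset length"] += batch_size
        self["iteration"] += 1
        self["loss"] = loss
        self["cumulative loss"] += loss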


logging.info("Start time: %s", datetime.now())

# Explicitly set seed to make sure models created in separate processes
Contributor

If this is for distributed training I'd worry that this isn't already happening. Did you have a case where this became necessary in order to avoid a bug?

collate_fn=collate_fn_train,
**loader_training_params,
)
loader_validation = DataLoader(
Contributor

For validation "drop_last" is usually undesired because you can end up not running on the entire dataset.

"Checkpoint: loaded '%s' at epoch %s", args.checkpoint, checkpoint["epoch"]
)
else:
logging.info("Checkpoint: not found")
Contributor

This seems like a case I'd error on. If the user intends to resume from this checkpoint and it wasn't found, that's probably a mistake.

Contributor

Also, the logic here is strange. If the user does not give the checkpoint option (training from scratch), there is no need to say it was not found.
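
A sketch combining both suggestions (only complain when a checkpoint was explicitly requested; names mirror the snippet above):

import logging
import os
import torch

if args.checkpoint:
    if not os.path.isfile(args.checkpoint):
        raise FileNotFoundError(
            "Checkpoint not found: {}".format(args.checkpoint))
    checkpoint = torch.load(args.checkpoint)
    logging.info(
        "Checkpoint: loaded '%s' at epoch %s", args.checkpoint, checkpoint["epoch"]
    )
# else: training from scratch, nothing to report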

not args.reduce_lr_valid,
)

if not (epoch + 1) % args.print_freq or epoch == args.epochs - 1:
Contributor

nit: I sometimes like to save on the indent and write something like:

if epoch < args.epochs - 1 and (epoch + 1) % args.print_freq:
    continue


class UnsqueezeFirst(torch.nn.Module):
    def forward(self, tensor):
        return tensor.unsqueeze(0)
Contributor (@cpuhrsch, Sep 11, 2020)

This is, in my opinion, the sort of issue that makes me dislike using nn.Sequential over a function. You end up wrapping simple, small commands into modules.

However, if you write one (or two) collate functions you'll probably end up writing function factories that do essentially the same thing.
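
A sketch of the two alternatives being weighed (the module wrapper from the diff versus a hypothetical closure-based collate factory):

import torch

# Option 1: wrap the one-liner in a module so it composes with nn.Sequential.
class UnsqueezeFirst(torch.nn.Module):
    def forward(self, tensor):
        return tensor.unsqueeze(0)

# Option 2: a function factory that bakes the transform into a collate_fn.
def make_collate_fn(transform):
    def collate_fn(batch):
        return torch.stack([transform(item.unsqueeze(0)) for item in batch])
    return collate_fn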

def save_checkpoint(state, is_best, filename, disable):
    """
    Save the model to a temporary file first,
    then copy it to filename, in case the signal interrupts
Contributor

Has this happened? I think the scheduler is supposed to signal you, and then you get a bunch of time to catch the signal and shut down gracefully.

Contributor

Also, in my opinion this logic does not correspond to the name of the function. save_checkpoint should do saving and only saving. Handling a temporary file for the sake of interruption solves a different concern and should live in a different function.

Contributor Author

I'm not aware of this happening. I'll remove this logic.
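
For reference, a sketch of the separation being suggested, should the atomic-write behavior ever be wanted again (_atomic_save is a hypothetical helper; os.replace is atomic on POSIX):

import os
import tempfile
import torch

def _atomic_save(state, filename):
    """Write to a temp file in the same directory, then atomically replace."""
    fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(filename) or ".")
    os.close(fd)
    torch.save(state, tmp_path)
    os.replace(tmp_path, filename)

def save_checkpoint(state, filename):
    """Saving, and only saving."""
    _atomic_save(state, filename)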

@mthrok (Contributor) left a comment

I have concerns about the accuracy metrics. I strongly believe that the WER computation is incorrect; we should not re-invent the wheel and should use SCTK or something similar.

self.dataset = dataset
self._cache = [None] * len(dataset)

def __getitem__(self, n):
Contributor

This can be simplified.

if self._cache[n] is None:
    self._cache[n] = self.dataset[n]
return self._cache[n]

def __len__(self):
    return len(self.dataset)

def process_datapoint(self, item):
Contributor

This operation is not generic and requires a specific item type, and since it uses index slicing it is very difficult to understand what it does. Please add a docstring.

if isinstance(transforms, list):
    transform_list = transforms
else:
    transform_list = [transforms]
Contributor

Since this is example code and all the helper functions exist to make the main code simpler, making the helpers more specific helps with maintainability. Instead of allowing multiple types, it's simpler to allow only one type and do the equivalent type conversion in the client code.


def collate_fn(batch):

    tensors = [transforms(b[0]) for b in batch if b]
Contributor

It is very difficult to understand what is being transformed here.

  1. for b in batch if b — why is there a case where an item in the batch (denoted b) can be an invalid sample?

  2. What does b[0] represent?

Contributor Author (@vincentqb, Sep 16, 2020)

  1. The if b is no longer needed; removed :)
  2. b[0] is the waveform from the processed data point tuple; added a comment.

self.char_space = char_space
self.char_blank = char_blank

labels = [l for l in labels]
Contributor

Can this not be labels = list(labels)? What is the expected type of the input labels?

Contributor Author

Good catch. Yes, it's just a string.

from typing import List, Union


def levenshtein_distance(r: Union[str, List[str]], h: Union[str, List[str]]):
Contributor

If moving this into the library, docstring needs to be improved with the equation.
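
For reference, the standard edit-distance recurrence such a docstring would state, with $d_{i,j}$ the distance between the first $i$ tokens of $r$ and the first $j$ tokens of $h$:

d_{i,0} = i, \qquad d_{0,j} = j, \qquad
d_{i,j} = \min\bigl( d_{i-1,j} + 1,\; d_{i,j-1} + 1,\; d_{i-1,j-1} + [\, r_i \neq h_j \,] \bigr)

(The Iverson bracket $[\cdot]$ is 1 when the tokens differ and 0 otherwise.)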


class MetricLogger(defaultdict):
    def __init__(self, name, print_freq=1, disable=False):
        super().__init__(lambda: 0.0)
Contributor

I think super().__init__(float) is better.

"""

if disable:
return
Contributor

I do not think this should be the logic of the save_checkpoint function. This goes against the single-responsibility principle. It's the caller's responsibility to check when to save.

return

if filename == "":
    return
Contributor

I think giving an empty string as the file location should be an error.

@cpuhrsch (Contributor) left a comment

This PR is getting very, very large and has been up for a long time. Let's merge it, since it's already working, and revisit some of these suggested improvements on a PR-by-PR basis. It'll also help us share code across examples etc.

@vincentqb (Contributor Author)

Following the comment above, I reverted to the Aug 19 state, am merging, and have moved the follow-ups to vincentqb#3.

@vincentqb vincentqb merged commit 9c27422 into pytorch:master Sep 24, 2020
@vincentqb (Contributor Author)

As mentioned in the README, we can get below 13.8% "cer over target length" after 30 epochs. See the sample output, grepped for validation:

{"name": "validation", "epoch": 0, "cumulative loss": 70.70183157920837, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 1.0, "cumulative cer": 280923.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 1.0, "cumulative wer": 54008.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 3.3667538847242082, "validation time": 25.846306562423706}
{"name": "validation", "epoch": 1, "cumulative loss": 69.96176052093506, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 1.0, "cumulative cer": 280923.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 1.0, "cumulative wer": 54008.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 3.3315124057588124, "validation time": 6.5293145179748535}
{"name": "validation", "epoch": 2, "cumulative loss": 70.07647657394409, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 1.0, "cumulative cer": 280923.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 1.0, "cumulative wer": 54008.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 3.3369750749497187, "validation time": 6.5231428146362305}
{"name": "validation", "epoch": 3, "cumulative loss": 69.94338345527649, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 1.0, "cumulative cer": 280923.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 1.0, "cumulative wer": 54008.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 3.3306373073941185, "validation time": 5.516169548034668}
{"name": "validation", "epoch": 4, "cumulative loss": 69.97763347625732, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 1.0, "cumulative cer": 280923.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 1.0, "cumulative wer": 54008.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 3.3322682607741583, "validation time": 6.777941942214966}
{"name": "validation", "epoch": 5, "cumulative loss": 69.72349190711975, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 1.0, "cumulative cer": 280923.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 1.0, "cumulative wer": 54008.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 3.3201662812914168, "validation time": 6.666280746459961}
{"name": "validation", "epoch": 6, "cumulative loss": 70.0134003162384, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 1.0, "cumulative cer": 280923.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 1.0, "cumulative wer": 54008.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 3.3339714436304, "validation time": 6.872417688369751}
{"name": "validation", "epoch": 7, "cumulative loss": 69.52196455001831, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 1.0, "cumulative cer": 280923.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 1.0, "cumulative wer": 54008.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 3.3105697404770624, "validation time": 6.776198148727417}
{"name": "validation", "epoch": 8, "cumulative loss": 69.42171597480774, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 1.0, "cumulative cer": 280923.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 1.0, "cumulative wer": 54008.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 3.3057959988003685, "validation time": 6.710602760314941}
{"name": "validation", "epoch": 9, "cumulative loss": 69.25003266334534, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 1.0, "cumulative cer": 280923.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 1.0, "cumulative wer": 54008.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 3.2976206030164446, "validation time": 6.838287353515625}
{"name": "validation", "epoch": 10, "cumulative loss": 64.44925856590271, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 1.0, "cumulative cer": 280923.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 1.0, "cumulative wer": 54008.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 3.069012312662034, "validation time": 6.446615695953369}
{"name": "validation", "epoch": 11, "cumulative loss": 63.2327446937561, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 0.9225932936722553, "cumulative cer": 258065.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 1.0010559662090812, "cumulative wer": 54122.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 3.0110830806550526, "validation time": 7.578996181488037}
{"name": "validation", "epoch": 12, "cumulative loss": 57.679614543914795, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 0.6934829637641968, "cumulative cer": 195536.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 1.3079901443153819, "cumulative wer": 70233.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 2.7466483116149902, "validation time": 12.67360258102417}
{"name": "validation", "epoch": 13, "cumulative loss": 54.061622858047485, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 0.6738777717685235, "cumulative cer": 189715.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 1.3005983808518127, "cumulative wer": 69715.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 2.5743629932403564, "validation time": 14.44613003730774}
{"name": "validation", "epoch": 14, "cumulative loss": 42.6647093296051, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 0.6101270957274202, "cumulative cer": 169292.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 1.0091517071453713, "cumulative wer": 54202.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 2.0316528252192905, "validation time": 15.787317752838135}
{"name": "validation", "epoch": 15, "cumulative loss": 30.52291715145111, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 0.42664954029204977, "cumulative cer": 120448.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 0.9686730024639212, "cumulative wer": 51122.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 1.4534722453071958, "validation time": 18.890476942062378}
{"name": "validation", "epoch": 16, "cumulative loss": 22.910719513893127, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 0.3174012979989183, "cumulative cer": 91667.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 0.7529039070749736, "cumulative wer": 41637.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 1.0909866435187203, "validation time": 19.944958686828613}
{"name": "validation", "epoch": 17, "cumulative loss": 17.98588478565216, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 0.2589913466738778, "cumulative cer": 75363.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 0.6613868356212601, "cumulative wer": 36475.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 0.8564707040786743, "validation time": 19.98793649673462}
{"name": "validation", "epoch": 18, "cumulative loss": 15.67355227470398, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 0.22397241752298538, "cumulative cer": 64542.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 0.6117564237944386, "cumulative wer": 33368.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 0.7463596321287609, "validation time": 20.789494514465332}
{"name": "validation", "epoch": 19, "cumulative loss": 13.239276766777039, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 0.1870605732828556, "cumulative cer": 55027.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 0.5381907778951074, "cumulative wer": 29982.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 0.6304417507989066, "validation time": 20.70693564414978}
{"name": "validation", "epoch": 20, "cumulative loss": 12.41661947965622, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 0.180164954029205, "cumulative cer": 51669.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 0.5255191833861317, "cumulative wer": 28399.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 0.5912675942693438, "validation time": 20.92684292793274}
{"name": "validation", "epoch": 21, "cumulative loss": 11.557319521903992, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 0.17056517036235802, "cumulative cer": 49252.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 0.5072157690953889, "cumulative wer": 27600.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 0.5503485486620948, "validation time": 21.347500801086426}
{"name": "validation", "epoch": 22, "cumulative loss": 10.85356280207634, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 0.15548945375878853, "cumulative cer": 46219.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 0.46673706441393875, "cumulative wer": 26201.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 0.5168363239083972, "validation time": 20.922102212905884}
{"name": "validation", "epoch": 23, "cumulative loss": 10.527889400720596, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 0.1535289345592212, "cumulative cer": 44505.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 0.47624076029567053, "cumulative wer": 25758.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 0.5013280667009807, "validation time": 20.710388660430908}
{"name": "validation", "epoch": 24, "cumulative loss": 10.139740884304047, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 0.14710654407787993, "cumulative cer": 43546.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 0.44948961633227735, "cumulative wer": 25158.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 0.48284480401447843, "validation time": 21.18306064605713}
{"name": "validation", "epoch": 25, "cumulative loss": 10.286657720804214, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 0.1484586262844781, "cumulative cer": 42859.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 0.4463217177050334, "cumulative wer": 24725.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 0.48984084384781973, "validation time": 20.720789670944214}
{"name": "validation", "epoch": 26, "cumulative loss": 9.967010378837585, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 0.14264467279610601, "cumulative cer": 41970.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 0.43681802182330165, "cumulative wer": 24719.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 0.47461954184940885, "validation time": 20.847771406173706}
{"name": "validation", "epoch": 27, "cumulative loss": 9.573839098215103, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 0.13737155219037317, "cumulative cer": 40117.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 0.42696233720520943, "cumulative wer": 23860.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 0.45589709991500493, "validation time": 20.785948276519775}
{"name": "validation", "epoch": 28, "cumulative loss": 9.409523874521255, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 0.13608707409410492, "cumulative cer": 39419.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 0.42625835973248855, "cumulative wer": 23408.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 0.44807256545339313, "validation time": 21.98930525779724}
{"name": "validation", "epoch": 29, "cumulative loss": 9.617189735174179, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 0.1341265548945376, "cumulative cer": 39536.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 0.42203449489616335, "cumulative wer": 23571.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 0.4579614159606752, "validation time": 21.18846106529236}
{"name": "validation", "epoch": 30, "cumulative loss": 9.645921111106873, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 0.13608707409410492, "cumulative cer": 39568.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 0.4325941569869764, "cumulative wer": 23657.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 0.4593295767193749, "validation time": 20.77275061607361}
{"name": "validation", "epoch": 31, "cumulative loss": 9.530102461576462, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 0.13115197404002163, "cumulative cer": 38687.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 0.41358676522351284, "cumulative wer": 23131.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 0.4538144029322125, "validation time": 20.9629008769989}
{"name": "validation", "epoch": 32, "cumulative loss": 9.8547984957695, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 0.13662790697674418, "cumulative cer": 39482.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 0.42872228088701164, "cumulative wer": 23718.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 0.4692761188461667, "validation time": 21.710347414016724}
{"name": "validation", "epoch": 33, "cumulative loss": 9.720289260149002, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 0.13270686857760952, "cumulative cer": 38280.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 0.4237944385779655, "cumulative wer": 23243.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 0.4628709171499525, "validation time": 21.28827738761902}
{"name": "validation", "epoch": 34, "cumulative loss": 10.022859960794449, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 0.13067874526771228, "cumulative cer": 38105.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 0.41534670890531505, "cumulative wer": 23047.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 0.4772790457521166, "validation time": 20.7469642162323}
{"name": "validation", "epoch": 35, "cumulative loss": 9.8920236825943, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 0.1333829096809086, "cumulative cer": 37309.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 0.41745864132347765, "cumulative wer": 22699.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 0.47104874679020475, "validation time": 21.03089475631714}
{"name": "validation", "epoch": 36, "cumulative loss": 10.471008330583572, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 0.1385884261763115, "cumulative cer": 40063.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 0.43400211193241817, "cumulative wer": 23698.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 0.49861944431350347, "validation time": 21.33878517150879}
{"name": "validation", "epoch": 37, "cumulative loss": 10.184874802827835, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 0.13223363980530017, "cumulative cer": 38067.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 0.4104188665962689, "cumulative wer": 22916.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 0.4849940382298969, "validation time": 21.298787117004395}
{"name": "validation", "epoch": 38, "cumulative loss": 10.797764033079147, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 0.13770957274202272, "cumulative cer": 39543.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 0.4297782470960929, "cumulative wer": 23707.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 0.5141792396704356, "validation time": 21.03481435775757}
{"name": "validation", "epoch": 39, "cumulative loss": 11.389482617378235, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 0.13547863710113575, "cumulative cer": 39638.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 0.42027455121436114, "cumulative wer": 23703.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 0.5423563151132493, "validation time": 20.84154224395752}
{"name": "validation", "epoch": 40, "cumulative loss": 12.254583895206451, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 0.13040832882639264, "cumulative cer": 37159.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 0.4171066525871172, "cumulative wer": 22646.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 0.58355161405745, "validation time": 20.9519464969635}
