Conversation
@xenova xenova commented Mar 5, 2025

LiteASR is a compression scheme for automatic speech recognition (ASR) models that leverages the low-rank properties of activation values. Our method can compress OpenAI's Whisper encoder by up to ~50%.
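
For intuition, a rough sketch of the underlying idea (the notation here is ours, not from the PR): if the activations x entering a linear layer y = Wx lie close to a rank-k subspace spanned by principal components U_k, the layer can be split into two thin matrix multiplies:

$$ y = Wx \approx W U_k U_k^\top x = (W U_k)\,(U_k^\top x), $$

replacing a d_out × d_in weight with a (d_out × k, k × d_in) pair, which saves parameters and compute whenever k is well below half the layer dimension.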

Supported models: https://huggingface.co/models?library=transformers.js&other=lite-whisper&sort=trending


Example usage:

import { pipeline, read_audio } from "@huggingface/transformers";

// Load the ASR pipeline with a LiteASR-compressed Whisper checkpoint
const transcriber = await pipeline(
  "automatic-speech-recognition",
  "onnx-community/lite-whisper-large-v3-turbo-acc-ONNX",
  { dtype: { encoder_model: "fp32", decoder_model_merged: "q4" } },
);

// Fetch the audio and resample it to the model's expected sampling rate
const audio = await read_audio(
  "https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/jfk.wav",
  transcriber.processor.feature_extractor.config.sampling_rate,
);

const output = await transcriber(audio);
console.log(output);
// { text: ' And so, my fellow Americans, ask not what your country can do for you, ask what you can do for your country.' }


@xenova xenova merged commit 31dfd43 into main Mar 6, 2025
4 checks passed
@xenova xenova deleted the add-lite-whisper branch March 6, 2025 11:27
@decoder-sh-david

@xenova Which browser did you test this in? I can't get your code sample working in Chrome: while loading the model, the pipeline throws an error that is just the number 4273819248.

@xenova xenova commented Mar 29, 2025

@decoder-sh-david Be sure to specify device: "webgpu" to run in-browser with the current configuration.

Alternatively, you can set dtype: "q8" if you'd like to run on CPU.

The sample code above was run with Node.js (CPU).
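
For concreteness, a minimal sketch combining both suggestions (the checkpoint id and dtype map are taken from the example above; treat the exact option combination as an assumption, not a tested configuration):

// In-browser: run the fp32/q4 configuration on WebGPU
const transcriber = await pipeline(
  "automatic-speech-recognition",
  "onnx-community/lite-whisper-large-v3-turbo-acc-ONNX",
  {
    device: "webgpu", // required for this dtype configuration in the browser
    dtype: { encoder_model: "fp32", decoder_model_merged: "q4" },
  },
);

// CPU alternative: quantize all modules to q8
// const transcriber = await pipeline(
//   "automatic-speech-recognition",
//   "onnx-community/lite-whisper-large-v3-turbo-acc-ONNX",
//   { dtype: "q8" },
// );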

@decoder-sh-david

> @decoder-sh-david Be sure to specify device: "webgpu" to run in-browser with the current configuration.
>
> Alternatively, you can set dtype: "q8" if you'd like to run on CPU.
>
> The sample code above was run with Node.js (CPU).

That was the missing piece, thank you! Do you have any guidance on which quantizations work? I recall that for some ONNX exports of Whisper, certain quantizations simply don't work.

@decoder-sh-david

Additionally, I don't suppose that this model supports word-level timestamps, does it?
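
For reference, word-level timestamps are requested like this for standard Whisper exports in Transformers.js; whether the lite-whisper export preserves what's needed for this is exactly the open question (the option is a real Transformers.js parameter, but the output values below are illustrative):

// Ask the pipeline for per-word timestamps
const output = await transcriber(audio, { return_timestamps: "word" });
console.log(output.chunks);
// e.g. [{ text: " And", timestamp: [0.0, 0.64] }, ...] (illustrative)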

@decoder-sh-david

@xenova does this PR also support converting this model type using the script?

I'm trying to convert it to include timestamps in the ONNX export, but I keep running into the error ValueError: Unrecognized configuration class <class 'transformers_modules.efficient-speech.lite-whisper-large-v3-turbo.6697ac2a887e3256da5defc9e8472f76a2b0f16e.configuration_lite_whisper.LiteWhisperConfig'> to build an AutoTokenizer. which suggests that the Python transformers library may not support this model type?
