I have a working video transcription pipeline that uses a local OpenAI Whisper model. I would like to swap in the equivalent distilled model ("distil-small.en"), which is smaller and faster.
```python
import whisper

def transcribe(self):
    file = "/path/to/video"
    model = whisper.load_model("small.en")         # WORKS
    model = whisper.load_model("distil-small.en")  # DOES NOT WORK
    transcript = model.transcribe(audio=file, word_timestamps=True)
    print(transcript["text"])
```
However, I get an error that the model was not found:
```
RuntimeError: Model distil-small.en not found; available models = ['tiny.en', 'tiny', 'base.en', 'base', 'small.en', 'small', 'medium.en', 'medium', 'large-v1', 'large-v2', 'large-v3', 'large']
```
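As far as I can tell, `whisper.load_model()` only resolves names from a fixed registry, which the package exposes via `whisper.available_models()`; the distil-* checkpoints are not in it:

```python
import whisper

# Prints the same fixed list that appears in the RuntimeError above;
# no distil-* variants are registered.
print(whisper.available_models())
```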
I installed my dependencies with Poetry (which uses pip under the hood) as follows:
```toml
[tool.poetry.dependencies]
python = "^3.11"
openai-whisper = "*"
transformers = "*"                                # distilled whisper models
accelerate = "*"                                  # distilled whisper models
datasets = { version = "*", extras = ["audio"] }  # distilled whisper models
```
The Distil-Whisper documentation on GitHub appears to use a different approach to installing and using these models, going through Hugging Face Transformers rather than the openai-whisper package.
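For reference, here is a minimal sketch of that approach as I understand it from the Distil-Whisper README (the `distil-whisper/distil-small.en` model ID and the pipeline arguments are copied from that README, not from a setup I have verified myself):

```python
import torch
from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor, pipeline

device = "cuda:0" if torch.cuda.is_available() else "cpu"
torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32

model_id = "distil-whisper/distil-small.en"

# Load the distilled checkpoint from the Hugging Face Hub.
model = AutoModelForSpeechSeq2Seq.from_pretrained(
    model_id, torch_dtype=torch_dtype, low_cpu_mem_usage=True, use_safetensors=True
)
model.to(device)
processor = AutoProcessor.from_pretrained(model_id)

# Wrap model + tokenizer + feature extractor in an ASR pipeline.
pipe = pipeline(
    "automatic-speech-recognition",
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
    torch_dtype=torch_dtype,
    device=device,
)

# return_timestamps="word" appears to be the Transformers analogue of
# whisper's word_timestamps=True, though I have not confirmed this.
result = pipe("/path/to/video", return_timestamps="word")
print(result["text"])
```

This is quite different from `whisper.load_model()`, which is why I would prefer a drop-in option if one exists.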
Is it possible to use a Distil-Whisper model as a drop-in replacement for a regular Whisper model?