DataBubble

Model Detail

whisper-large-v3-turbo

Provider: OpenAI · Category: audio · Pipeline: automatic-speech-recognition

DB Score

1.0

Downloads

7.8M

Likes

Day

+0.0%

Week

+0.0%

Month

+41.1%

Overview

whisper-large-v3-turbo is an audio model with 809M parameters released by OpenAI. It is registered under the automatic-speech-recognition pipeline tag on Hugging Face and distributed under the permissive MIT license.

Technical

whisper-large-v3-turbo ships with 809M parameters. The MIT license allows commercial deployment and derivative work without per-seat fees, provided the copyright and license notice are retained in copies of the software.
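The automatic-speech-recognition pipeline tag corresponds to a task name in the Hugging Face transformers library, so a minimal loading sketch might look like the following. The repo id `openai/whisper-large-v3-turbo` is assumed from the provider and model name shown on this page, and `transcribe` is a hypothetical helper, not part of any library. It assumes transformers, a backend such as PyTorch, and ffmpeg are installed, and that the clip is short (roughly under 30 seconds).

```python
# Sketch only: the repo id is an assumption based on provider + model name.
MODEL_ID = "openai/whisper-large-v3-turbo"
TASK = "automatic-speech-recognition"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe one short audio file and return the recognized text."""
    # Imported lazily so this module loads even without transformers installed.
    from transformers import pipeline

    # Downloads the model weights on first use.
    asr = pipeline(TASK, model=model_id)
    result = asr(audio_path)
    return result["text"]
```

Calling `transcribe("clip.wav")` (a placeholder path) would download the weights on first use. Longer recordings generally need chunking or timestamp options, which this sketch leaves out.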

Trending Signal

Downloads of whisper-large-v3-turbo have moved +41.1% over the trailing thirty days. That is a clear uptrend, although the day and week figures are flat at +0.0%. These numbers are a signal, not a guarantee: download counts on Hugging Face also reflect mirror traffic, CI scrapes, and one-off benchmarking runs.
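As a rough illustration of how a trailing percentage change maps to a trend label, the sketch below computes the change between two download counts and classifies its sign. The function names, the sample counts, and the 1% "flat" band are all assumptions made for this example; the page does not document how its figures are derived.

```python
def pct_change(previous: float, current: float) -> float:
    """Percentage change from `previous` to `current`."""
    if previous <= 0:
        raise ValueError("previous must be positive")
    return (current - previous) / previous * 100.0


def classify_trend(change_pct: float, flat_band: float = 1.0) -> str:
    """Label a percentage change; `flat_band` is an assumed tolerance in percent."""
    if change_pct > flat_band:
        return "uptrend"
    if change_pct < -flat_band:
        return "downtrend"
    return "flat"


# Hypothetical counts chosen to reproduce a +41.1% move.
monthly = pct_change(1000, 1411)
print(round(monthly, 1), classify_trend(monthly))
print(classify_trend(0.0))
```

A positive change outside the band reads as growth, a negative one as decline, and anything inside the band as flat.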

Use Cases

whisper-large-v3-turbo is best suited to speech recognition and transcription; it is a speech-to-text model and does not perform speech synthesis. Treat this as a starting point rather than a benchmark verdict: the right deployment depends on an evaluation suite that mirrors your workload.
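One common way to build an evaluation that mirrors a workload is to score transcripts against reference text using word error rate (WER). The sketch below is a minimal pure-Python version; `word_error_rate` is a hypothetical helper, and its normalization (lowercasing and whitespace splitting) is a simplifying assumption that real evaluations usually replace with language-specific text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Levenshtein distance over words, keeping only one previous DP row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 words.
print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Averaging this score over recordings drawn from the target domain gives a workload-specific number that can be compared across candidate models.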

Research Paper

arXiv: 2212.04356

Model Info

License: MIT

Citations: 7,426 (1,015 influential)

Recent news

Robust Assamese Speech Recognition through Controlled Fine-Tuning of Whisper Models

arXiv:2607.17164v1. Developing Automatic Speech Recognition (ASR) for morphologically rich, low-resource languages such as Assamese is challenging due to insufficient annotated speech data. The pretrained Whisper model performs poorly on Assamese speech recognition tasks.

arxiv · neutral · 9h ago

On the Interpretability of Whisper Encodings Using Sparse Autoencoders

arXiv:2605.12225v2. While deep transformer-based models have advanced rapidly, their internal mechanisms remain largely a mystery. Recent work has prioritized understanding text-based transformer models, leaving ASR systems largely unexplored. In order to address this...

arxiv · neutral · 5d ago

When Audio Separation Hurts Zero-Shot ASR: Evaluating SAM-Audio with Whisper on Bengali and English Speech

arXiv:2603.04710v2. Recent advances in automatic speech recognition (ASR) and speech enhancement have strengthened the common belief that cleaner audio should lead to more accurate transcription. In this work, we examine whether this assumption holds for modern...

arxiv · 22d ago

From Dispersion to Attraction: Spectral Dynamics of Hallucination Across Whisper Model Scales

arXiv:2604.08591v2. Hallucinations in large ASR models present a critical safety risk. In this work, we propose the Spectral Sensitivity Theorem, which predicts a phase transition in deep networks from a dispersive regime (signal decay) to an attractor...

arxiv · 27d ago

Layer-wise Probing of wav2vec 2.0 and Whisper for Consonant Cluster Reduction in African American English

arXiv:2606.23948v1. Self-supervised and supervised speech models are increasingly used to investigate which linguistic information their internal representations encode, and at what level of abstraction they encode it. One underexplored phenomenon is consonant cluster red...

arxiv · 35d ago

Semi-Supervised Speech Confidence Detection using Pseudo-Labelling and Whisper Embeddings

arXiv:2606.16505v1. Understanding speaker confidence is crucial in educational settings, as it can enhance personalised feedback and improve learning outcomes. This study introduces a novel framework for detecting speaker confidence by integrating human-engineered featu...

Related Models

clip-vit-large-patch14

OpenAI · 33.1M downloads

clip-vit-base-patch32

OpenAI · 21.4M downloads

Kokoro-82M

hexgrad · 9.7M downloads

XTTS-v2

coqui · 9.2M downloads