DataBubble

Model Detail

speaker-diarization-3.1

▼ 1.9%

Provider: pyannote · Category: audio · Pipeline: automatic-speech-recognition

Downloads

10.6M

Likes

Day

-1.9%

Week

-7.8%

Month

+0.0%

Download History

Research Paper

arXiv: 2111.14448

A Novel Automatic Framework for Speaker Drift Detection in Synthesized Speech

arXiv:2604.06327v1 Announce Type: cross Abstract: Recent diffusion-based text-to-speech (TTS) models achieve high naturalness and expressiveness, yet often suffer from speaker drift, a subtle, gradual shift in perceived speaker identity within a single utterance. This underexplored phenomenon underm

arxiv · neutral · 7d ago

Expressive Prompting: Improving Emotion Intensity and Speaker Consistency in Zero-Shot TTS

arXiv:2409.18512v2 Announce Type: replace-cross Abstract: Recent advancements in speech synthesis have enabled large language model (LLM)-based systems to perform zero-shot generation with controllable content, timbre, speaker identity, and emotion through input prompts. As a result, these models he

arxiv · neutral · 7d ago

Speaker-Reasoner: Scaling Interaction Turns and Reasoning Patterns for Timestamped Speaker-Attributed ASR

arXiv:2604.03074v1 Announce Type: cross Abstract: Transcribing and understanding multi-speaker conversations requires speech recognition, speaker attribution, and timestamp localization. While speech LLMs excel at single-speaker tasks, multi-speaker scenarios remain challenging due to overlapping sp

arxiv · 13d ago

Pashto Common Voice: Building the First Open Speech Corpus for a 60-Million-Speaker Low-Resource Language

arXiv:2603.27021v1 Announce Type: new Abstract: We present the Pashto Common Voice corpus -- the first large-scale, openly licensed speech resource for Pashto, a language with over 60 million native speakers largely absent from open speech technology. Through a community effort spanning 2022-2025, t

huggingface · 1085d ago

Introducing HuggingFace blog for Chinese speakers: Fostering Collaboration with the Chinese AI community

Related Models

segmentation-3.0

pyannote · 10.8M downloads

speaker-diarization-community-1

pyannote · 2.4M downloads

speaker-diarization-3.1

pyannote · 10.6M downloads

Kokoro-82M

hexgrad · 9.8M downloads