DataBubble

Model Detail

speaker-diarization-community-1

Provider: pyannote

Category: audio

Pipeline: automatic-speech-recognition

DB Score: 2.9

Downloads: 4.6M

Likes: 802

Day: +0.0%

Week: +0.0%

Month: +0.0%

Overview

speaker-diarization-community-1 is an audio model released by pyannote. It is registered under the automatic-speech-recognition pipeline tag on Hugging Face and distributed under the permissive cc-by-4.0 license.

Technical

The cc-by-4.0 license allows commercial deployment and derivative works without per-seat fees, provided that attribution is given.

Use Cases

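Despite the automatic-speech-recognition pipeline tag, the name speaker-diarization-community-1 indicates a speaker diarization model: it labels who spoke when in a recording rather than transcribing or synthesizing speech. It is typically paired with a separate speech recognition model when speaker-attributed transcripts are needed. Treat this as a starting point rather than a benchmark verdict; the right deployment depends on an evaluation suite that mirrors your workload.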
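Because the model name points to speaker diarization, one way to build an evaluation that mirrors your workload is to score predicted speaker turns against hand-labelled turns from your own recordings. The sketch below is a simplified, frame-level error rate written for illustration only. It is not the pyannote scoring code, it ignores overlapping speech and forgiveness collars, and the function names, the `(start, end, speaker)` tuple format, and the 0.1 s frame step are all assumptions made here.

```python
from itertools import permutations


def frame_labels(segments, n_frames, step):
    """Turn (start, end, speaker) segments into one label per frame.

    Frames not covered by any segment stay None (silence). If segments
    overlap, the later one overwrites the earlier one (a simplification).
    """
    labels = [None] * n_frames
    for start, end, speaker in segments:
        first = max(int(round(start / step)), 0)
        last = min(int(round(end / step)), n_frames)
        for i in range(first, last):
            labels[i] = speaker
    return labels


def frame_error_rate(reference, hypothesis, duration, step=0.1):
    """Simplified diarization error: wrong frames / reference speech frames.

    Predicted speaker names are arbitrary, so every assignment of predicted
    speakers to reference speakers is tried and the best one is kept.
    This brute force is only practical for a handful of speakers.
    """
    n_frames = int(round(duration / step))
    ref = frame_labels(reference, n_frames, step)
    hyp = frame_labels(hypothesis, n_frames, step)

    speech_frames = sum(1 for r in ref if r is not None)
    if speech_frames == 0:
        raise ValueError("reference contains no speech")

    ref_speakers = sorted({r for r in ref if r is not None})
    hyp_speakers = sorted({h for h in hyp if h is not None})

    # Extra placeholder targets let a predicted speaker stay unmatched.
    targets = ref_speakers + [("unmatched", i) for i in range(len(hyp_speakers))]

    best = None
    for assignment in permutations(targets, len(hyp_speakers)):
        mapping = dict(zip(hyp_speakers, assignment))
        errors = 0
        for r, h in zip(ref, hyp):
            mapped = mapping[h] if h is not None else None
            if r != mapped:  # missed speech, false alarm, or wrong speaker
                errors += 1
        if best is None or errors < best:
            best = errors
    return best / speech_frames


if __name__ == "__main__":
    # Hypothetical 10-second clip with two speakers.
    reference = [(0.0, 5.0, "A"), (5.0, 10.0, "B")]
    hypothesis = [(0.0, 5.0, "spk1"), (5.0, 9.0, "spk0")]
    print(frame_error_rate(reference, hypothesis, duration=10.0))
```

A lower value means the predicted turns track the reference more closely. For results comparable to published numbers, a dedicated metrics library with proper overlap handling would be the better choice.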

Research Paper

arXiv: 2104.03603

Model Info

License: cc-by-4.0

Citations: 152 (20 influential)

Recent news

Do Speech Tokens Leak Voiceprints? Speaker Inversion Attacks Against End-to-End Speech Language Models

arXiv:2607.16870v1 Announce Type: cross Abstract: End-to-end speech language models increasingly represent user speech with speech tokens rather than relying exclusively on cascaded ASR--LLM--TTS pipelines. Although these tokens support expressive and low-latency spoken interaction, they may also pr

arxiv · neutral · 3d ago

Autoregressive Guidance of Deep Spatially Selective Filters using Bayesian Tracking for Efficient Extraction of Moving Speakers

arXiv:2603.23723v2 Announce Type: replace-cross Abstract: Deep spatially selective filters achieve high-quality enhancement with real-time capable architectures for stationary speakers of known directions. To retain this level of performance in dynamic scenarios where only the speakers' initial dire

arxiv · neutral · 3d ago

Large Audio Language Models for Spoofing-Aware Speaker Verification

arXiv:2607.14753v1 Announce Type: cross Abstract: Recent advances in text-to-speech and voice cloning make high-quality spoofing inexpensive and scalable, threatening voice authentication systems, especially automatic speaker verification (ASV). Existing defenses mainly address this threat through b

arxiv · neutral · 4d ago

Diarization-Guided Qwen-ASR Adaptation for Multilingual Two-Speaker Conversational Speech

arXiv:2607.08208v2 Announce Type: replace Abstract: This paper describes our self-designed system for Task 1 of the MLC-SLM 2026 Challenge for multilingual two-speaker conversational speech. The system combines a modular speaker diarization front end with a challenge-adapted Qwen3-ASR-1.7B recognize

techcrunch · neutral · 6d ago

OpenAI’s first hardware device is reportedly a screenless speaker that can move

The device is weirdly described as involving "mechanical elements that can move on their own" and the Bloomberg report includes the detail that the device is designed to "feel like a companion and become a physical manifestation of OpenAI’s ChatGPT."

theverge · neutral · 6d ago

OpenAI may announce a ChatGPT smart speaker this year

OpenAI's first device is set to be a smart speaker that lets you talk with ChatGPT, according to a report from Bloomberg. The device apparently won't have a screen, but will use a camera and additional sensors to "understand" your environment. The report comes just days after Apple filed a lawsuit a

Related Models

speaker-diarization-3.1

pyannote · 8.2M downloads

segmentation-3.0

pyannote · 6.1M downloads

Kokoro-82M

hexgrad · 9.8M downloads

XTTS-v2

coqui · 9.2M downloads