arXiv · March 31, 2026 at 4:00 AM · 1 min read
Pashto Common Voice: Building the First Open Speech Corpus for a 60-Million-Speaker Low-Resource Language
arXiv:2603.27021v1 Announce Type: new
Abstract: We present the Pashto Common Voice corpus -- the first large-scale, openly licensed speech resource for Pashto, a language with over 60 million native speakers largely absent from open speech technology. Through a community effort spanning 2022-2025, t