BaltiVoice: A Speech Corpus and Fine-tuned Whisper ASR System for the Balti Language

Source: arxiv.org

arXiv:2606.03504v2

Abstract: We present BaltiVoice, a 16.8-hour read-speech corpus for Balti (ISO 639-3: bft), a Tibetic language spoken in Gilgit-Baltistan, Pakistan, with no prior publicly available ASR resources. The corpus contains 10,060 validated utterances in nati

