Real and synthetic Punjabi speech datasets for automatic speech recognition

Singh S; Hou F; Wang R

dc.citation.issue	February 2024
dc.citation.volume	52
dc.contributor.author	Singh S
dc.contributor.author	Hou F
dc.contributor.author	Wang R
dc.coverage.spatial	Netherlands
dc.date.accessioned	2024-09-16T01:34:49Z
dc.date.available	2024-09-16T01:34:49Z
dc.date.issued	2024-02
dc.description.abstract	Automatic speech recognition (ASR) has been an active area of research. Training with large annotated datasets is the key to the development of robust ASR systems. However, most available datasets are focused on high-resource languages like English, leaving a significant gap for low-resource languages. Among these languages is Punjabi: despite its large number of speakers, it lacks high-quality annotated datasets for accurate speech recognition. To address this gap, we introduce three labeled Punjabi speech datasets: Punjabi Speech (real speech dataset) and Google-synth/CMU-synth (synthesized speech datasets). The Punjabi Speech dataset consists of read speech recordings captured in various environments, including both studio and open settings. In addition, the Google-synth dataset is synthesized using Google's Punjabi text-to-speech cloud services. Furthermore, the CMU-synth dataset is created using the Clustergen model available in the Festival speech synthesis system developed by CMU. These datasets aim to facilitate the development of accurate Punjabi speech recognition systems, bridging the resource gap for this important language.
dc.description.confidential	false
dc.format.pagination	109865-
dc.identifier.author-url	https://www.ncbi.nlm.nih.gov/pubmed/38146308
dc.identifier.citation	Singh S, Hou F, Wang R. (2024). Real and synthetic Punjabi speech datasets for automatic speech recognition. Data Brief. 52. February 2024. (pp. 109865-).
dc.identifier.doi	10.1016/j.dib.2023.109865
dc.identifier.eissn	2352-3409
dc.identifier.elements-type	journal-article
dc.identifier.issn	2352-3409
dc.identifier.number	109865
dc.identifier.pii	S2352-3409(23)00926-5
dc.identifier.uri	https://mro.massey.ac.nz/handle/10179/71460
dc.language	eng
dc.publisher	Elsevier Inc
dc.publisher.uri	https://www.sciencedirect.com/science/article/pii/S2352340923009265
dc.relation.isPartOf	Data Brief
dc.rights	(c) 2023 The Author/s
dc.rights	CC BY 4.0
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/
dc.subject	Automatic speech recognition
dc.subject	Punjabi language
dc.subject	Speech dataset
dc.subject	low-resource languages
dc.title	Real and synthetic Punjabi speech datasets for automatic speech recognition
dc.type	Journal article
pubs.elements-id	485150
pubs.organisational-group	Other

Files

Original bundle

Name: 485150 PDF.pdf
Size: 1.03 MB
Format: Adobe Portable Document Format
Description: Published version.pdf

Collections

Journal Articles