Real and synthetic Punjabi speech datasets for automatic speech recognition

dc.citation.issue: February 2024
dc.citation.volume: 52
dc.contributor.author: Singh S
dc.contributor.author: Hou F
dc.contributor.author: Wang R
dc.coverage.spatial: Netherlands
dc.date.accessioned: 2024-09-16T01:34:49Z
dc.date.available: 2024-09-16T01:34:49Z
dc.date.issued: 2024-02
dc.description.abstract: Automatic speech recognition (ASR) has been an active area of research. Training with large annotated datasets is the key to the development of robust ASR systems. However, most available datasets are focused on high-resource languages like English, leaving a significant gap for low-resource languages. Among these languages is Punjabi: despite its large number of speakers, it lacks high-quality annotated datasets for accurate speech recognition. To address this gap, we introduce three labeled Punjabi speech datasets: Punjabi Speech (real speech dataset) and Google-synth/CMU-synth (synthesized speech datasets). The Punjabi Speech dataset consists of read speech recordings captured in various environments, including both studio and open settings. In addition, the Google-synth dataset is synthesized using Google's Punjabi text-to-speech cloud services. Furthermore, the CMU-synth dataset is created using the Clustergen model available in the Festival speech synthesis system developed by CMU. These datasets aim to facilitate the development of accurate Punjabi speech recognition systems, bridging the resource gap for this important language.
dc.description.confidential: false
dc.format.pagination: 109865-
dc.identifier.author-url: https://www.ncbi.nlm.nih.gov/pubmed/38146308
dc.identifier.citation: Singh S, Hou F, Wang R. (2024). Real and synthetic Punjabi speech datasets for automatic speech recognition. Data Brief. 52. February 2024. (pp. 109865-).
dc.identifier.doi: 10.1016/j.dib.2023.109865
dc.identifier.eissn: 2352-3409
dc.identifier.elements-type: journal-article
dc.identifier.issn: 2352-3409
dc.identifier.number: 109865
dc.identifier.pii: S2352-3409(23)00926-5
dc.identifier.uri: https://mro.massey.ac.nz/handle/10179/71460
dc.language: eng
dc.publisher: Elsevier Inc
dc.publisher.uri: https://www.sciencedirect.com/science/article/pii/S2352340923009265
dc.relation.isPartOf: Data Brief
dc.rights: (c) 2023 The Author/s
dc.rights: CC BY 4.0
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: Automatic speech recognition
dc.subject: Punjabi language
dc.subject: Speech dataset
dc.subject: low-resource languages
dc.title: Real and synthetic Punjabi speech datasets for automatic speech recognition
dc.type: Journal article
pubs.elements-id: 485150
pubs.organisational-group: Other
Files
Original bundle
Name: Published version.pdf
Size: 1.03 MB
Format: Adobe Portable Document Format
Description: 485150 PDF.pdf
License bundle
Name: license.txt
Size: 9.22 KB
Format: Plain Text