CHILDREN’S SPEECH RECOGNITION BY FINE-TUNING SELF-SUPERVISED ADULT SPEECH REPRESENTATIONS

Authors

  • MEERA VALI CHERUKURI, Assoc. Professor, Dept. of ECE, RISE KrishnaSai Gandhi group of institutions
  • K ANJI REDDY, Assoc. Professor, Dept. of ECE, RISE KrishnaSai Gandhi group of institutions
  • Dr. S V SWAMI NATHAN, Professor, Dept. of ECE, RISE KrishnaSai Gandhi group of institutions

Abstract

Children’s speech recognition is a vital yet largely overlooked domain in building inclusive speech technologies. The major challenge impeding progress in this domain is the lack of adequate child speech corpora; however, recent advances in self-supervised learning have created a new opportunity for overcoming this problem of data scarcity. In this paper, we leverage self-supervised adult speech representations and use three well-known child speech corpora to build models for children’s speech recognition. We assess the performance of fine-tuning on both native and non-native children’s speech, examine the effect of cross-domain child corpora, and investigate the minimum amount of child speech required to fine-tune a model which outperforms a state-of-the-art adult model. We also analyze speech recognition performance across children’s ages. Our results demonstrate that fine-tuning with cross-domain child corpora leads to relative improvements of up to 46.08% and 45.53% for native and non-native child speech respectively, and absolute improvements of 14.70% and 31.10%. We also show that with as little as 5 hours of transcribed children’s speech, it is possible to fine-tune a children’s speech recognition system that outperforms a state-of-the-art adult model fine-tuned on 960 hours of adult speech.

Keywords: Children’s speech recognition, Self-supervised learning, Speech representations, Transformer-based learning
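
The abstract reports both relative and absolute improvements, and the two are easy to confuse. The sketch below shows how each is computed from a baseline and a fine-tuned error rate. It assumes the metric is word error rate (WER) expressed in percent, which the abstract does not state explicitly; the function names and the sample values are hypothetical and are not taken from the paper.

```python
def absolute_improvement(baseline_wer: float, new_wer: float) -> float:
    """Error-rate reduction in percentage points (baseline minus new)."""
    return baseline_wer - new_wer


def relative_improvement(baseline_wer: float, new_wer: float) -> float:
    """Error-rate reduction as a percentage of the baseline error rate."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    # Hypothetical values for illustration only (not results from the paper).
    baseline = 30.0    # assumed WER (%) of an adult model on child speech
    fine_tuned = 18.0  # assumed WER (%) after fine-tuning on child speech

    print(f"absolute: {absolute_improvement(baseline, fine_tuned):.2f} points")
    print(f"relative: {relative_improvement(baseline, fine_tuned):.2f} %")
```

Under this reading, an absolute improvement is a difference in percentage points, while a relative improvement scales that difference by the baseline, so the same absolute gain looks larger in relative terms when the baseline error is low.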

Published

2021-01-29

Section

Articles

How to Cite

CHILDREN’S SPEECH RECOGNITION BY FINE-TUNING SELF-SUPERVISED ADULT SPEECH REPRESENTATIONS. (2021). International Journal of Multidisciplinary Engineering In Current Research, 6(1), 56-73. https://ijmec.com/index.php/multidisciplinary/article/view/57