Title:
VOICE DATA ENHANCING METHOD AND DEVICE IN VOICE RECOGNITION BASED ON RECURRENT NEURAL NETWORK
Document Type and Number:
WIPO Patent Application WO/2019/024008
Kind Code:
A1
Abstract:
A voice data enhancement method based on a recurrent neural network, in the field of voice recognition processing, addresses the excessive modeling of word dependencies, caused by irregular grammatical phenomena, when a recurrent neural network is used for voice recognition. The method comprises: extracting, from input voice data, acoustic features that characterize the energy values of the voice at various frequencies, and generating acoustic feature vectors (201); obtaining a statement label sequence of the voice data according to a preset labeling file and the acoustic feature vectors (202); obtaining an alignment file after a decision clustering operation, by means of the labeling file preset for the decision clustering and the statement label sequence (203); generating a first random number γ in the interval [0, 1] and comparing it with a preset adjusting proportion α (204); and, if the first random number γ is greater than the adjusting proportion α, performing enhancement processing on the voice data at the position indicated by a boundary file (205). The method allows irregular spoken-language phenomena to be added to the training data quickly and conveniently.
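Steps (204) and (205) amount to a random gate: an utterance is enhanced only when γ exceeds α. The following Python sketch illustrates that gate only. The abstract does not say what the enhancement operation is, so `duplicate_at_boundaries` is a hypothetical stand-in, and the function names, the frame representation, and the use of integer frame indices for the boundary positions are all assumptions made for illustration.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple

Frame = List[float]  # assumed representation: one acoustic feature vector per frame


def duplicate_at_boundaries(frames: Sequence[Frame],
                            boundaries: Sequence[int]) -> List[Frame]:
    """Hypothetical placeholder enhancement (NOT taken from the patent):
    repeat the frame found at each boundary index."""
    marked = set(boundaries)
    out: List[Frame] = []
    for i, frame in enumerate(frames):
        out.append(list(frame))
        if i in marked:
            out.append(list(frame))  # the repeated copy
    return out


def maybe_enhance(
    frames: Sequence[Frame],
    boundaries: Sequence[int],
    alpha: float,
    enhance: Callable[[Sequence[Frame], Sequence[int]], List[Frame]] = duplicate_at_boundaries,
    rng: Optional[random.Random] = None,
) -> Tuple[List[Frame], bool]:
    """Draw gamma and apply `enhance` only if gamma > alpha (steps 204 and 205).

    Returns the (possibly enhanced) frames and a flag saying whether
    enhancement was applied.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    rng = rng if rng is not None else random.Random()
    gamma = rng.random()  # uniform in [0.0, 1.0); the abstract states [0, 1]
    if gamma > alpha:
        return enhance(frames, boundaries), True
    return [list(f) for f in frames], False
```

Because γ is drawn uniformly, the gate passes with probability 1 − α, so under this reading α = 0.7 would leave roughly 70% of utterances unchanged and enhance the remaining 30%.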

Inventors:
ZHAO YUANYUAN (CN)
XU SHUANG (CN)
XU BO (CN)
Application Number:
PCT/CN2017/095668
Publication Date:
February 07, 2019
Filing Date:
August 02, 2017
Assignee:
INST AUTOMATION CAS (CN)
International Classes:
G10L15/16; G10L15/02; G10L15/06; G10L15/20
Foreign References:
CN107437417A    2017-12-05
CN101582264A    2009-11-18
CN103021420A    2013-04-03
US9208794B1     2015-12-08
Other References:
KUMAR, A. ET AL.: "Speech Enhancement in Multiple-Noise Conditions Using Deep Neural Networks", INTERSPEECH 2016, 12 September 2016 (2016-09-12), pages 3738 - 3742, XP055567333
XU, YONG ET AL.: "A Regression Approach to Speech Enhancement Based on Deep Neural Networks", ACM TRANSACTIONS ON AUDIO, SPEECH, vol. 23, no. 1, 31 January 2015 (2015-01-31), pages 7 - 19, XP055567335
ZHAO, YUANYUAN ET AL.: "Multidimensional Residual Learning Based on Recurrent Neural Networks for Acoustic Modeling", INTERSPEECH 2016, 12 September 2016 (2016-09-12), pages 3419 - 3423
Attorney, Agent or Firm:
HANRAY LAW FIRM (CN)