Title:
SINGLE-SOUND CHANNEL ROBUSTNESS SPEECH KEYWORD REAL-TIME DETECTION METHOD
Document Type and Number:
WIPO Patent Application WO/2021/062705
Kind Code:
A1
Abstract:
A robust single-channel method for real-time speech keyword detection, comprising the following steps: receiving noisy speech in electronic form; converting the time-domain speech signal into a frequency-domain signal, frame by frame, by means of a short-time Fourier transform; processing the frequency-domain signal with a Mel filter to obtain Mel features as the acoustic features; passing the Mel features frame by frame through a neural network and then applying a normalized exponential function (softmax) to obtain confidence information for each keyword; when the confidence of any keyword exceeds a predefined threshold, splicing the current frame with several preceding frames to serve as the output of the neural network; and passing this output in sequence through an attention mechanism and a feed-forward deep neural network, then applying the normalized exponential function to obtain sentence-level confidence information for each keyword. When a confidence value exceeds the predefined threshold, the keyword is considered detected; otherwise, the keyword is considered not detected. The method maintains a high wake-up rate in noisy environments, is widely applicable, and can greatly reduce the false alarm rate of the neural network and improve keyword detection performance.
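The two-stage flow described in the abstract can be illustrated with a minimal NumPy sketch. This is not the applicant's implementation: the layer shapes, the random weights in `random_params`, the use of log-compressed Mel energies, the additive attention vector `v`, the choice of class 0 as "no keyword", and all default parameter values (16 kHz sampling rate, 512-point FFT, 40 Mel bands, 30 frames of context) are assumptions made only for illustration. The abstract does not specify the network architecture or which frame data is spliced; here the hidden outputs of the frame-level network are spliced.

```python
import numpy as np

def softmax(z):
    """Normalized exponential function over the last axis."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def mel_filterbank(n_mels, n_fft, sr):
    """Triangular Mel filters, shape (n_mels, n_fft // 2 + 1)."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    edges_hz = mel_to_hz(np.linspace(0.0, hz_to_mel(sr / 2), n_mels + 2))
    bin_hz = np.linspace(0.0, sr / 2, n_fft // 2 + 1)
    fb = np.zeros((n_mels, bin_hz.size))
    for i in range(n_mels):
        lo, mid, hi = edges_hz[i:i + 3]
        rising = (bin_hz - lo) / (mid - lo)
        falling = (hi - bin_hz) / (hi - mid)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def log_mel_frames(signal, sr=16000, n_fft=512, hop=160, n_mels=40):
    """Frame-by-frame STFT followed by Mel filtering (log compression assumed)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    fb = mel_filterbank(n_mels, n_fft, sr)
    feats = np.empty((n_frames, n_mels))
    for t in range(n_frames):
        frame = signal[t * hop: t * hop + n_fft] * window
        power = np.abs(np.fft.rfft(frame)) ** 2
        feats[t] = np.log(fb @ power + 1e-10)
    return feats

def random_params(n_mels=40, hidden=16, n_keywords=2, seed=0):
    """Hypothetical untrained weights; a real system would load trained ones."""
    rng = np.random.default_rng(seed)
    n_cls = n_keywords + 1  # class 0 is assumed to mean "no keyword"
    return {
        "W1": rng.normal(0, 0.1, (n_mels, hidden)),  # frame-level network
        "W2": rng.normal(0, 0.1, (hidden, n_cls)),   # frame-level classifier
        "v": rng.normal(0, 0.1, hidden),             # attention scoring vector
        "W3": rng.normal(0, 0.1, (hidden, hidden)),  # feed-forward layer
        "W4": rng.normal(0, 0.1, (hidden, n_cls)),   # sentence-level classifier
    }

def detect(feats, p, frame_thr=0.5, sent_thr=0.5, context=30):
    """Return (frame_index, keyword_index) pairs that pass both stages."""
    hidden = np.tanh(feats @ p["W1"])       # stand-in frame-level network
    frame_conf = softmax(hidden @ p["W2"])  # stage 1: per-frame confidence
    hits = []
    for t in range(len(feats)):
        if frame_conf[t, 1:].max() <= frame_thr:
            continue                        # stage 1 did not fire on this frame
        ctx = hidden[max(0, t - context): t + 1]  # current + preceding frames
        attn = softmax(ctx @ p["v"])        # attention weights over the frames
        pooled = attn @ ctx                 # weighted sum of frame outputs
        sent_conf = softmax(np.tanh(pooled @ p["W3"]) @ p["W4"])  # stage 2
        k = int(sent_conf[1:].argmax()) + 1
        if sent_conf[k] > sent_thr:
            hits.append((t, k))
    return hits
```

In this sketch the attention and feed-forward stage runs only on frames where the frame-level confidence already exceeds its threshold, so the second stage acts as a check on the first. Calling `detect(log_mel_frames(audio), random_params())` on a one-dimensional array of samples returns the frames and keyword indices that pass both thresholds; with untrained weights the result is meaningless and serves only to exercise the data flow.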

Inventors:
HU PENG (CN)
YAN YONGJIE (CN)
Application Number:
PCT/CN2019/109603
Publication Date:
April 08, 2021
Filing Date:
September 30, 2019
Assignee:
ELEVOC TECH CO LTD (CN)
International Classes:
G10L15/02; G10L15/14
Foreign References:
CN110097870A	2019-08-06
CN108615526A	2018-10-02
CN103559881A	2014-02-05
US20060190259A1	2006-08-24
Other References:
KUMAR RAJATH, YERUVA VAISHNAVI, GANAPATHY SRIRAM: "On Convolutional LSTM Modeling for Joint Wake-Word Detection and Text Dependent Speaker Verification", INTERSPEECH 2018, ISCA, 1 January 2018 (2018-01-01), pages 1121-1125, XP055797254, DOI: 10.21437/Interspeech.2018-1759
Attorney, Agent or Firm:
SHENZHEN KUAIMA PATENT & TRADEMARK OFFICE et al. (CN)