Title:
SPEECH ENHANCEMENT METHOD, MODEL TRAINING METHOD, AND RELATED DEVICE
Document Type and Number:
WIPO Patent Application WO/2022/161277
Kind Code:
A1
Abstract:
A speech enhancement method, a model training method, and a related device. A speech enhancement model comprises a speech prediction neural network module, a noise estimation neural network module, and a linear filtering module. The model training method comprises: obtaining a noisy speech amplitude spectrum and a pure speech amplitude spectrum of each speech pair in a training set (S110); obtaining a first feature set and a second feature set according to the noisy speech amplitude spectrum (S120); inputting the first feature set into the speech prediction neural network module, so as to output a first quasi-estimated pure speech amplitude spectrum and a prediction error (S130); inputting the second feature set into the noise estimation neural network module, so as to output an estimated noise energy (S140); inputting the first quasi-estimated pure speech amplitude spectrum, the prediction error, and the estimated noise energy into the linear filtering module, the linear filtering module being used for outputting an estimated pure speech amplitude spectrum (S150); and calculating model loss according to the pure speech amplitude spectrum and the estimated pure speech amplitude spectrum, so as to train the speech enhancement model (S160). The present model can implement speech enhancement optimization.
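The data flow of steps S130 to S160 can be illustrated with a minimal NumPy sketch. The abstract gives neither the filter equations nor the loss function, so everything specific here is an assumption made for illustration: the Kalman-style gain in `linear_filter`, the use of the noisy amplitude spectrum as the filter's observation, the mean-squared-error loss in `model_loss`, and the constant arrays that stand in for the outputs of the two neural network modules. All function and variable names are hypothetical.

```python
import numpy as np


def linear_filter(pred_amp, pred_err, noise_energy, noisy_amp, eps=1e-8):
    """Combine the predicted spectrum with the noisy observation (S150).

    ASSUMPTION: a Kalman-style update. The abstract only says that a linear
    filtering module receives the predicted spectrum, the prediction error,
    and the estimated noise energy; the gain form and the extra `noisy_amp`
    input are illustrative choices, not taken from the source.
    """
    gain = pred_err / (pred_err + noise_energy + eps)  # lies in [0, 1)
    return pred_amp + gain * (noisy_amp - pred_amp)


def model_loss(clean_amp, est_amp):
    """ASSUMPTION: mean-squared error; S160 does not name a specific loss."""
    return float(np.mean((clean_amp - est_amp) ** 2))


# Toy amplitude spectra: 3 frames x 4 frequency bins (S110).
rng = np.random.default_rng(0)
clean_amp = rng.uniform(0.5, 1.0, size=(3, 4))
noisy_amp = clean_amp + rng.uniform(0.0, 0.3, size=(3, 4))

# Hypothetical stand-ins for the network outputs (not real model predictions).
pred_amp = 0.9 * clean_amp               # S130: first quasi-estimated spectrum
pred_err = np.full((3, 4), 0.01)         # S130: prediction error
noise_energy = np.full((3, 4), 0.09)     # S140: estimated noise energy

est_amp = linear_filter(pred_amp, pred_err, noise_energy, noisy_amp)  # S150
loss = model_loss(clean_amp, est_amp)                                 # S160
```

Under this assumed gain, a small prediction error relative to the noise energy keeps the output close to the network's prediction, while a large one pulls it toward the noisy observation. In actual training, the loss would be backpropagated through the filter into both networks.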

Inventors:
XUE WEI (CN)
CAI YUYU (CN)
WU JUNYI (CN)
QUAN GANG (CN)
ZHANG CHAO (CN)
YANG FAN (CN)
DING GUOHONG (CN)
HE XIAODONG (CN)
Application Number:
PCT/CN2022/073197
Publication Date:
August 04, 2022
Filing Date:
January 21, 2022
Assignee:
BEIJING WODONG TIANJUN INFORMATION TECHNOLOGY CO LTD (CN)
BEIJING JINGDONG CENTURY TRADING CO LTD (CN)
International Classes:
G10L21/0208
Foreign References:
CN113808602A    2021-12-17
CN110211598A    2019-09-06
CN109767781A    2019-05-17
Other References:
XUE WEI; QUAN GANG; ZHANG CHAO; DING GUOHONG; HE XIAODONG; ZHOU BOWEN: "Neural Kalman Filtering for Speech Enhancement", ICASSP 2021 - 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), IEEE, 6 June 2021 (2021-06-06), pages 7108 - 7112, XP033955006, DOI: 10.1109/ICASSP39728.2021.9413499
Attorney, Agent or Firm:
BEIJING INTELLEGAL INTELLECTUAL PROPERTY AGENT LTD. (CN)