SPEECH ENHANCEMENT METHOD, MODEL TRAINING METHOD, AND RELATED DEVICE - BEIJING WODONG TIANJUN INFORMATION TECHNOLOGY CO LTD

Title:

SPEECH ENHANCEMENT METHOD, MODEL TRAINING METHOD, AND RELATED DEVICE

Document Type and Number:

WIPO Patent Application WO/2022/161277

Kind Code:

A1

Abstract:

A speech enhancement method, a model training method, and a related device. A speech enhancement model comprises a speech prediction neural network module, a noise estimation neural network module, and a linear filtering module. The model training method comprises: obtaining a noisy speech amplitude spectrum and a pure speech amplitude spectrum of each speech pair in a training set (S110); obtaining a first feature set and a second feature set according to the noisy speech amplitude spectrum (S120); inputting the first feature set into the speech prediction neural network module, so as to output a first quasi-estimated pure speech amplitude spectrum and a prediction error (S130); inputting the second feature set into the noise estimation neural network module, so as to output an estimated noise energy (S140); inputting the first quasi-estimated pure speech amplitude spectrum, the prediction error, and the estimated noise energy into the linear filtering module, the linear filtering module being used for outputting an estimated pure speech amplitude spectrum (S150); and calculating a model loss according to the pure speech amplitude spectrum and the estimated pure speech amplitude spectrum, so as to train the speech enhancement model (S160). The model can thereby optimize speech enhancement.
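
The data flow of steps S120 to S160 can be sketched as follows. This is a minimal illustration under stated assumptions, not the method claimed in the application: the abstract does not define the feature sets, the network architectures, the filter equation, or the loss, so every function below is a hypothetical stand-in. In particular, the filter is assumed to be a Kalman-style update that blends the predicted spectrum with the noisy observation, and the loss is assumed to be a mean squared error. The gradient update that would actually train the two modules is omitted.

```python
# Hypothetical sketch of one forward pass and loss computation (S120-S160).
# All function names, constants, and formulas are illustrative assumptions.

def extract_features(noisy_mag):
    """S120: derive two feature sets from the noisy amplitude spectrum.
    Assumption: amplitudes for the first set, energies for the second."""
    first = list(noisy_mag)
    second = [m * m for m in noisy_mag]
    return first, second

def speech_prediction_module(first_features):
    """S130: placeholder for the speech prediction network.
    Returns (quasi-estimated pure spectrum, prediction error) per bin."""
    quasi = [0.8 * f for f in first_features]          # dummy prediction
    pred_err = [0.1 * f * f for f in first_features]   # dummy error energy
    return quasi, pred_err

def noise_estimation_module(second_features):
    """S140: placeholder for the noise estimation network.
    Returns an estimated noise energy per bin."""
    return [0.2 * e + 1e-8 for e in second_features]   # dummy, kept positive

def linear_filter(quasi, pred_err, noise_energy, noisy_mag):
    """S150: assumed Kalman-style linear update (not stated in the abstract).
    gain = pred_err / (pred_err + noise_energy); a large noise energy keeps
    the prediction, a large prediction error trusts the observation."""
    estimate = []
    for q, p, n, y in zip(quasi, pred_err, noise_energy, noisy_mag):
        gain = p / (p + n)
        estimate.append(q + gain * (y - q))
    return estimate

def mse_loss(pure_mag, estimated_mag):
    """S160: assumed mean-squared-error loss between the spectra."""
    diffs = [(c - e) ** 2 for c, e in zip(pure_mag, estimated_mag)]
    return sum(diffs) / len(diffs)

def forward_loss(noisy_mag, pure_mag):
    """Run S120-S160 for one speech pair and return the model loss."""
    first, second = extract_features(noisy_mag)
    quasi, pred_err = speech_prediction_module(first)
    noise_energy = noise_estimation_module(second)
    estimate = linear_filter(quasi, pred_err, noise_energy, noisy_mag)
    return mse_loss(pure_mag, estimate)

if __name__ == "__main__":
    # One toy speech pair (S110): noisy and pure amplitude spectra, 3 bins.
    print(forward_loss([1.0, 2.0, 0.5], [0.8, 1.5, 0.4]))
```

Under this assumed filter, the gain always lies between 0 and 1, so the estimated spectrum in each bin falls between the network's prediction and the noisy observation. In a real training loop the placeholder modules would be neural networks, and the loss would be backpropagated through the filter into both of them.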

Inventors:

XUE WEI (CN)
CAI YUYU (CN)
WU JUNYI (CN)
QUAN GANG (CN)
ZHANG CHAO (CN)
YANG FAN (CN)
DING GUOHONG (CN)
HE XIAODONG (CN)

Application Number:

PCT/CN2022/073197

Publication Date:

August 04, 2022

Filing Date:

January 21, 2022

Assignee:

BEIJING WODONG TIANJUN INFORMATION TECHNOLOGY CO LTD (CN)
BEIJING JINGDONG CENTURY TRADING CO LTD (CN)

International Classes:

G10L21/0208

Foreign References:

CN113808602A	2021-12-17
CN110211598A	2019-09-06
CN109767781A	2019-05-17

Other References:

XUE WEI; QUAN GANG; ZHANG CHAO; DING GUOHONG; HE XIAODONG; ZHOU BOWEN: "Neural Kalman Filtering for Speech Enhancement", ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, 6 June 2021 (2021-06-06), pages 7108-7112, XP033955006, DOI: 10.1109/ICASSP39728.2021.9413499

Attorney, Agent or Firm:

BEIJING INTELLEGAL INTELLECTUAL PROPERTY AGENT LTD. (CN)
