Title:
SPEECH SAMPLE SCREENING METHOD AND APPARATUS BASED ON GEOMETRY, AND COMPUTER DEVICE AND STORAGE MEDIUM
Document Type and Number:
WIPO Patent Application WO/2022/116442
Kind Code:
A1
Abstract:
A geometry-based speech sample screening method and apparatus (100), and a computer device (500) and a storage medium, which relate to artificial intelligence technology. The method comprises: acquiring an initial speech sample set, and extracting a speech feature corresponding to each piece of initial speech sample data in the initial speech sample set, so as to constitute a speech feature set (S110); acquiring a Euclidean distance between speech features in the speech feature set by means of a dynamic time warping algorithm, and performing K-means clustering on that basis to obtain a clustering result (S120); calling a preset sample subset screening condition, and acquiring, from the clustering result, the clusters that meet the sample subset screening condition, so as to constitute a target cluster set (S130); and acquiring an annotated value corresponding to each speech feature in the target cluster set, so as to obtain a current speech sample set corresponding to the target cluster set (S140). Samples with relatively low redundancy are thereby selected automatically for training a speech recognition model, which reduces the annotation cost of a speech recognition task in a deep learning context and improves the training speed of the speech recognition model.
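Steps S120 and S130 can be illustrated with a minimal Python sketch. It is not the applicant's implementation and rests on several assumptions: each speech feature is taken to be a sequence of equal-dimension frame vectors; the clustering is a k-medoids stand-in rather than K-means proper, because a cluster mean is not directly defined for variable-length sequences under dynamic time warping; and the screening condition, which the abstract leaves unspecified, is passed in as a caller-supplied predicate. All function names are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping cost between two sequences of frame vectors,
    using the Euclidean distance between frames as the local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def pairwise_dtw(features):
    """Symmetric matrix of DTW distances between all speech features."""
    n = len(features)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = dtw_distance(features[i], features[j])
    return dist


def cluster_by_medoids(dist, k, max_iters=20):
    """k-medoids clustering on a precomputed distance matrix
    (assumed stand-in for the K-means step). Returns lists of sample indices."""
    n = len(dist)
    # Deterministic farthest-first initialisation of the medoids.
    medoids = [0]
    while len(medoids) < k:
        medoids.append(max(range(n), key=lambda i: min(dist[i][m] for m in medoids)))
    for _ in range(max_iters):
        # Assign every sample to its nearest medoid.
        clusters = {m: [] for m in medoids}
        for i in range(n):
            clusters[min(medoids, key=lambda m: dist[i][m])].append(i)
        # Move each medoid to the member with the smallest total distance.
        updated = [
            min(members, key=lambda c: sum(dist[c][o] for o in members)) if members else m
            for m, members in clusters.items()
        ]
        if updated == medoids:
            break
        medoids = updated
    return [members for members in clusters.values() if members]


def screen_clusters(clusters, condition):
    """Keep only the clusters for which the caller-supplied condition holds."""
    return [members for members in clusters if condition(members)]
```

The indices in the clusters returned by `screen_clusters` identify the samples that would then be sent for annotation (S140). Passing the condition as a function keeps the sketch neutral about what the preset screening condition actually is.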
Inventors:
LUO JIAN (CN)
WANG JIANZONG (CN)
CHENG NING (CN)
Application Number:
PCT/CN2021/083934
Publication Date:
June 09, 2022
Filing Date:
March 30, 2021
Assignee:
PING AN TECH SHENZHEN CO LTD (CN)
International Classes:
G10L15/02; G10L15/06; G10L15/08; G10L15/26
Foreign References:
CN112530409A | 2021-03-19
CN110931043A | 2020-03-27
CN111966798A | 2020-11-20
CN111813905A | 2020-10-23
US10699719B1 | 2020-06-30
Attorney, Agent or Firm:
SHENZHEN TALENT PATENT SERVICE (CN)