

Title:
METHODS, DEVICES AND MEDIA FOR IMPROVING KNOWLEDGE DISTILLATION USING INTERMEDIATE REPRESENTATIONS
Document Type and Number:
WIPO Patent Application WO/2022/217853
Kind Code:
A1
Abstract:
Methods, devices and processor-readable media for knowledge distillation (KD) using intermediate representations are described. A student model is trained using a Dropout-KD approach in which intermediate layer selection is performed efficiently, so that the skip, search, and overfitting problems of intermediate-layer KD may be solved. Teacher intermediate layers are selected randomly at each training epoch, with the layer order preserved to avoid breaking information flow. Over the course of multiple training epochs, all of the teacher intermediate layers are used for knowledge distillation. A min-max data augmentation method, based on the intermediate layer selection of the Dropout-KD training method, is also described.
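
The per-epoch layer selection described in the abstract can be illustrated with a minimal sketch. It is not the claimed implementation. The abstract does not specify the loss, the layer mapping, or the hidden sizes, so the following are assumptions made only for illustration: one teacher layer is drawn per student intermediate layer, sampling is uniform without replacement, matched layers share the same hidden size, and the intermediate loss is a mean squared error. All function names are hypothetical.

```python
import random


def select_teacher_layers(num_teacher_layers, num_student_layers, rng):
    """Randomly pick one teacher layer per student layer, keeping depth order.

    Assumption: uniform sampling without replacement. Sorting the sampled
    indices preserves the teacher's layer order.
    """
    if num_student_layers > num_teacher_layers:
        raise ValueError("student has more intermediate layers than teacher")
    chosen = rng.sample(range(num_teacher_layers), num_student_layers)
    return sorted(chosen)


def layer_mse(student_hidden, teacher_hidden):
    """Mean squared error between two hidden vectors of equal length (assumed loss)."""
    diffs = [(s - t) ** 2 for s, t in zip(student_hidden, teacher_hidden)]
    return sum(diffs) / len(diffs)


def intermediate_loss(student_states, teacher_states, selected):
    """Sum the per-layer loss: student layer i is matched to teacher layer selected[i]."""
    return sum(
        layer_mse(student_states[i], teacher_states[t])
        for i, t in enumerate(selected)
    )


def layers_seen(num_teacher_layers, num_student_layers, num_epochs, seed=0):
    """Re-draw the selection every epoch and record which teacher layers were used."""
    rng = random.Random(seed)
    seen = set()
    for _ in range(num_epochs):
        seen.update(select_teacher_layers(num_teacher_layers, num_student_layers, rng))
    return seen
```

Because the selection is redrawn at every epoch, each teacher layer is increasingly likely to have been used as training proceeds. In this sketch that coverage is probabilistic rather than guaranteed, and `layers_seen` is only a helper for checking it.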

Inventors:
HAIDAR MD AKMAL (CA)
REZAGHOLIZADEH MEHDI (CA)
Application Number:
PCT/CN2021/120956
Publication Date:
October 20, 2022
Filing Date:
September 27, 2021
Assignee:
HUAWEI TECH CO LTD (CN)
International Classes:
G06N3/04
Foreign References:
CN112508169A  2021-03-16
CN113326941A  2021-08-31
CN113435208A  2021-09-24
CN113281048A  2021-08-20
US20160078339A1  2016-03-17