Title:
METHODS, DEVICES AND MEDIA FOR IMPROVING KNOWLEDGE DISTILLATION USING INTERMEDIATE REPRESENTATIONS
Document Type and Number:
WIPO Patent Application WO/2022/217853
Kind Code:
A1
Abstract:
Methods, devices and processor-readable media for knowledge distillation using intermediate representations are described. A student model is trained using a Dropout-KD approach in which intermediate layer selection is performed efficiently such that the skip, search, and overfitting problems in intermediate layer KD may be solved. Teacher intermediate layers are selected randomly at each training epoch, with the layer order preserved to avoid breaking information flow. Over the course of multiple training epochs, all of the teacher intermediate layers are used for knowledge distillation. A min-max data augmentation method is also described based on the intermediate layer selection of the Dropout-KD training method.
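The layer selection described in the abstract can be illustrated with a minimal sketch. It is not the applicants' implementation: the function name `select_teacher_layers`, the layer counts (12 teacher, 6 student), and the use of uniform sampling without replacement are all assumptions made for illustration. The sketch draws one teacher layer per student layer at the start of each epoch and sorts the drawn indices so that the teacher's layer order is preserved.

```python
import random


def select_teacher_layers(num_teacher_layers, num_student_layers, rng):
    """Return one teacher layer index per student layer, in ascending order.

    Hypothetical helper: uniform sampling without replacement is an
    assumption, not something the abstract specifies.
    """
    if num_student_layers > num_teacher_layers:
        raise ValueError("student cannot have more layers than the teacher")
    # Draw distinct teacher layers at random for this epoch.
    chosen = rng.sample(range(num_teacher_layers), num_student_layers)
    # Sorting keeps the teacher's original layer order, so student layer i
    # is always matched to a shallower teacher layer than student layer i + 1.
    return sorted(chosen)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only to make the demo repeatable
    used = set()
    for epoch in range(5):
        mapping = select_teacher_layers(12, 6, rng)
        used.update(mapping)
        print(f"epoch {epoch}: student layers 0..5 -> teacher layers {mapping}")
    print(f"teacher layers used so far: {sorted(used)}")
```

Because a fresh subset is drawn every epoch, the set of teacher layers that have taken part in distillation grows over training, which is consistent with the abstract's statement that all teacher intermediate layers are eventually used.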
Inventors:
HAIDAR MD AKMAL (CA)
REZAGHOLIZADEH MEHDI (CA)
Application Number:
PCT/CN2021/120956
Publication Date:
October 20, 2022
Filing Date:
September 27, 2021
Assignee:
HUAWEI TECH CO LTD (CN)
International Classes:
G06N3/04
Foreign References:
CN112508169A | 2021-03-16
CN113326941A | 2021-08-31
CN113435208A | 2021-09-24
CN113281048A | 2021-08-20
US20160078339A1 | 2016-03-17