Title:
MIXTURE-OF-EXPERTS MODEL IMPLEMENTATION METHOD AND SYSTEM, ELECTRONIC DEVICE, AND STORAGE MEDIUM
Document Type and Number:
WIPO Patent Application WO/2023/201981
Kind Code:
A1
Abstract:
The present disclosure relates to artificial intelligence fields such as deep learning and distributed storage, and provides a mixture-of-experts model implementation method and system, an electronic device, and a storage medium. The method may comprise: constructing a communication group that includes a tensor parallel communication group, where the tensor parallel communication group comprises at least two computing devices and the sparse parameters of the computing devices in the same tensor parallel communication group are partitioned by tensor parallelism; and training a mixture-of-experts model on the basis of the communication group. Applying the solution of the present disclosure can ensure that model training proceeds normally.
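The two steps named in the abstract (forming tensor parallel communication groups, then partitioning sparse parameters across the devices in each group) can be illustrated with a minimal single-process sketch. It is not the claimed implementation. The abstract does not state how devices are assigned to groups or along which dimension the parameters are split, so the contiguous rank layout, the column-wise split, and all function names below are assumptions made for illustration. "Sparse parameters" is read here as one expert's weight matrix.

```python
def build_tensor_parallel_groups(world_size, tp_degree):
    """Partition ranks 0..world_size-1 into contiguous groups of tp_degree devices.

    Assumption: contiguous rank layout; the abstract only requires that each
    tensor parallel communication group contain at least two devices.
    """
    if tp_degree < 2:
        raise ValueError("a tensor parallel group needs at least two devices")
    if world_size % tp_degree != 0:
        raise ValueError("world_size must be divisible by tp_degree")
    return [list(range(start, start + tp_degree))
            for start in range(0, world_size, tp_degree)]


def split_columns(weight, num_shards):
    """Split a matrix (list of rows) into num_shards equal column blocks.

    Assumption: column-wise split; each block would live on one device
    of the tensor parallel group.
    """
    cols = len(weight[0])
    if cols % num_shards != 0:
        raise ValueError("column count must be divisible by num_shards")
    width = cols // num_shards
    return [[row[i * width:(i + 1) * width] for row in weight]
            for i in range(num_shards)]


def matmul(x, w):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)]
            for row in x]


def sharded_expert_forward(x, shards):
    """Multiply the input by every shard and concatenate the column blocks.

    The concatenation stands in for the all-gather a real tensor parallel
    group would perform across its devices.
    """
    partial = [matmul(x, shard) for shard in shards]
    return [sum((p[r] for p in partial), []) for r in range(len(x))]


# Four devices, two per tensor parallel group.
groups = build_tensor_parallel_groups(world_size=4, tp_degree=2)

# One expert's weight (2 x 4), split across the two devices of a group.
expert_weight = [[1, 2, 3, 4],
                 [5, 6, 7, 8]]
shards = split_columns(expert_weight, num_shards=2)

tokens = [[1, 1]]
full_output = matmul(tokens, expert_weight)
sharded_output = sharded_expert_forward(tokens, shards)
```

In this sketch each device holds half of the expert's columns, and the sharded forward pass reproduces the unsharded result, so an expert too large for one device can still be evaluated by the group.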

Inventors:
SHEN LIANG (CN)
WANG HAIFENG (CN)
WU HUACHAO (CN)
GONG WEIBAO (CN)
WU ZHIHUA (CN)
YU DIANHAI (CN)
Application Number:
PCT/CN2022/119752
Publication Date:
October 26, 2023
Filing Date:
September 20, 2022
Assignee:
BEIJING BAIDU NETCOM SCI & TECH CO LTD (CN)
International Classes:
G06N3/04
Foreign References:
CN114841315A    2022-08-02
CN114282681A    2022-04-05
CN114186633A    2022-03-15
CN114169427A    2022-03-11
US20190251423A1    2019-08-15
US20200151580A1    2020-05-14
CN202210430519A    2022-04-22
Attorney, Agent or Firm:
BEIJING WISPRO INTELLECTUAL PROPERTY LLP. (CN)