Title:
INFERENCE METHOD AND APPARATUS FOR NEURAL NETWORK MODEL, AND RELATED DEVICE
Document Type and Number:
WIPO Patent Application WO/2024/066676
Kind Code:
A1
Abstract:
The present application discloses an inference method for a neural network model. The method is applied to a computing cluster that comprises a plurality of inference servers and a memory pool, where each inference server comprises at least one inference card and a local memory. In the method, a first inference card of a first inference server in the computing cluster receives an inference task. If the parameters for executing the inference task are not hit in the first inference card, the first inference card acquires them from the local memory of the first inference server; if they are not hit in that local memory either, they are acquired from the memory pool. The first inference card can then execute the inference task on the basis of all the acquired parameters. Because the local memory of the first inference server supports high-speed reads and writes, the first inference card can acquire the parameters faster, which reduces its parameter acquisition latency and helps satisfy the low-latency requirements of the inference task. The present application further provides a corresponding apparatus, a computing cluster, and a storage medium.
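
The lookup order described in the abstract (inference card, then the server's local memory, then the memory pool) can be illustrated with a minimal Python sketch. This is not the claimed implementation: the class name `InferenceCard`, the `fetch` method, and the dictionary-based tiers are all hypothetical. Copying a parameter into the faster tiers after a miss is an added assumption that the abstract does not state.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class InferenceCard:
    """Hypothetical model of one inference card and the tiers behind it."""
    card_cache: Dict[str, float] = field(default_factory=dict)    # on-card storage
    local_memory: Dict[str, float] = field(default_factory=dict)  # server local memory
    memory_pool: Dict[str, float] = field(default_factory=dict)   # cluster memory pool

    def fetch(self, key: str) -> Tuple[float, str]:
        """Return (value, tier that served it), trying the fastest tier first."""
        if key in self.card_cache:
            return self.card_cache[key], "card"
        if key in self.local_memory:
            value = self.local_memory[key]
            self.card_cache[key] = value  # assumption: keep a copy on the card
            return value, "local_memory"
        # Last tier: raises KeyError if the pool does not hold the parameter.
        value = self.memory_pool[key]
        self.local_memory[key] = value    # assumption: promote into local memory
        self.card_cache[key] = value      # assumption: and onto the card
        return value, "memory_pool"


# Illustrative data only: three parameters spread across the tiers.
pool = {"w1": 0.5, "w2": 1.5, "w3": 2.5}
card = InferenceCard(card_cache={"w1": 0.5},
                     local_memory={"w2": 1.5},
                     memory_pool=pool)
sources = [card.fetch(k)[1] for k in ("w1", "w2", "w3")]
```

In this sketch, `sources` records which tier served each parameter. Under the promotion assumption, a second `card.fetch("w3")` is served from the card itself.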

Inventors:
WANG GUOWEI (CN)
XU HUA (CN)
Application Number:
PCT/CN2023/107683
Publication Date:
April 04, 2024
Filing Date:
July 17, 2023
Assignee:
HUAWEI TECH CO LTD (CN)
International Classes:
G06F16/9535; G06N5/04
Foreign References:
CN111881358A 2020-11-03
CN106998370A 2017-08-01
CN113111123A 2021-07-13
CN114035748A 2022-02-11
CN114911596A 2022-08-16
US20170091246A1 2017-03-30
US20190073590A1 2019-03-07
Attorney, Agent or Firm:
SHENPAT INTELLECTUAL PROPERTY AGENCY (CN)