Title:
METHOD AND SYSTEM FOR PREDICTING LATENCY OF DEEP LEARNING MODEL BY DEVICE
Document Type and Number:
WIPO Patent Application WO/2023/017884
Kind Code:
A1
Abstract:
A method and a system for predicting latency of a deep learning model by a device are disclosed. A method for predicting latency according to an embodiment may comprise the steps of: generating a latency lookup table that includes information on a single neural network layer and the latency of that single neural network layer on an edge device; training a latency predictor, by using the latency lookup table, so that the latency predictor predicts the latency of an input neural network layer; and predicting the on-device latency of an input deep learning model by using the trained latency predictor.
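
The three steps in the abstract can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the `ConvLayer` description, the MAC-count feature, the one-variable least-squares predictor, the summation of per-layer predictions into a model latency, and the synthetic `fake_device_latency_ms` function (a stand-in for timing a layer on a real edge device) are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConvLayer:
    """Hypothetical description of a single convolution layer."""
    in_channels: int
    out_channels: int
    kernel_size: int
    output_size: int  # output feature map is output_size x output_size

    def macs(self) -> int:
        # Multiply-accumulate count of a standard 2-D convolution.
        return (self.in_channels * self.out_channels
                * self.kernel_size ** 2 * self.output_size ** 2)


def fake_device_latency_ms(layer: ConvLayer) -> float:
    """Synthetic stand-in for a measurement on an edge device (not real data)."""
    return 0.05 + 2e-7 * layer.macs()


def build_lookup_table(layers, measure):
    """Step 1: pair each single layer with its measured latency."""
    return {layer: measure(layer) for layer in layers}


def train_predictor(table):
    """Step 2: fit latency = slope * MACs + intercept by least squares."""
    xs = [layer.macs() for layer in table]
    ys = list(table.values())
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda layer: slope * layer.macs() + intercept


def predict_model_latency(model_layers, predictor):
    """Step 3: sum per-layer predictions (assumes sequential execution)."""
    return sum(predictor(layer) for layer in model_layers)


# Layers "measured" on the device to populate the lookup table.
measured = [ConvLayer(c, 2 * c, k, s)
            for c in (16, 32, 64) for k in (1, 3) for s in (14, 28)]
table = build_lookup_table(measured, fake_device_latency_ms)
predictor = train_predictor(table)

# A model made of layers that are absent from the lookup table.
model = [ConvLayer(24, 48, 3, 56), ConvLayer(48, 96, 3, 28)]
predicted_ms = predict_model_latency(model, predictor)
print(f"predicted on-device latency: {predicted_ms:.3f} ms")
```

The point of training a predictor on the table, rather than using the table directly, is that the model's layers need not have been measured beforehand. A practical predictor would likely use richer layer features and a non-linear regressor; the linear fit here only keeps the sketch short.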

Inventors:
KIM JEONG HO (KR)
KIM MINSU (KR)
KIM TAE-HO (KR)
Application Number:
PCT/KR2021/011006
Publication Date:
February 16, 2023
Filing Date:
August 19, 2021
Assignee:
NOTA INC (KR)
International Classes:
G06N3/08; G06N3/04
Other References:
ZHANG LI LYNA; HAN SHIHAO; WEI JIANYU; ZHENG NINGXIN: "nn-Meter: towards accurate latency prediction of deep-learning model inference on diverse edge devices", PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, ACMPUB27, NEW YORK, NY, USA, 24 June 2021 (2021-06-24) - 3 December 2021 (2021-12-03), pages 81 - 93, XP058761876, ISBN: 978-1-4503-8457-5, DOI: 10.1145/3458864.3467882
JUSSI HANHIROVA; TEEMU KÄMÄRÄINEN; SIPI SEPPÄLÄ; MATTI SIEKKINEN; VESA HIRVISALO; ANTTI YL…: "Latency and Throughput Characterization of Convolutional Neural Networks for Mobile Computer Vision", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 26 March 2018 (2018-03-26), XP080859116
VÉSTIAS MÁRIO P.: "A Survey of Convolutional Neural Networks on Edge with Reconfigurable Computing", ALGORITHMS, vol. 12, no. 8, 31 July 2019 (2019-07-31), pages 154, XP093034724, DOI: 10.3390/a12080154
MOHAMMED SHADY A.; SHIRMOHAMMADI SHERVIN; ALTAMIMI SADI: "A Multimodal Deep Learning-Based Distributed Network Latency Measurement System", IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, IEEE, USA, vol. 69, no. 5, 20 January 2020 (2020-01-20), pages 2487 - 2494, XP011781913, ISSN: 0018-9456, DOI: 10.1109/TIM.2020.2967877
ZADEH ALI HADI; EDO ISAK; AWAD OMAR MOHAMED; MOSHOVOS ANDREAS: "GOBO: Quantizing Attention-Based NLP Models for Low Latency and Energy Efficient Inference", 2020 53RD ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE (MICRO), IEEE, 17 October 2020 (2020-10-17), pages 811 - 824, XP033856366, DOI: 10.1109/MICRO50266.2020.00071
Attorney, Agent or Firm:
BAEK, Jong Ung et al. (KR)