Title:
BIG DATA PROCESSING METHOD BASED ON DIRECT CALCULATION OF COMPRESSED DATA
Document Type and Number:
WIPO Patent Application WO/2022/199305
Kind Code:
A1
Abstract:
Provided is a big data processing method based on direct calculation of compressed data, comprising the following steps: 1) compressing the original input data with an improved Sequitur compression method at a granularity specified by the user, and converting the result into a directed acyclic graph (DAG) represented by numbers; and 2) determining an optimal traversal mode, and traversing the DAG from step 1) top-down or bottom-up according to the determined mode, so that the compressed data is processed directly. Combining the improved Sequitur algorithm with top-down and bottom-up traversal strategies enables direct processing of compressed data; further representations for higher-level document analysis can also be derived on this basis. The invention can be widely applied in the field of big data processing.
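
The two traversal directions named in the abstract can be illustrated with a minimal sketch. It is not the claimed method: the grammar below is written by hand rather than produced by the improved Sequitur compressor, the selection of an optimal traversal mode is not shown, and all names (`GRAMMAR`, `RULE_BASE`, `count_bottom_up`, `count_top_down`, `expand`) are hypothetical. It assumes a Sequitur-style grammar stored as a numeric DAG, where small integers are word ids and integers at or above `RULE_BASE` are rule ids, and it computes word counts without expanding the text.

```python
from collections import Counter

# Assumption: ids >= RULE_BASE denote grammar rules, smaller ids denote words.
RULE_BASE = 100

# Hand-written toy grammar (hypothetical, not Sequitur output).
# Word ids: 0="a", 1="b", 2="c", 3="d"
#   R100 (root) -> R101 R102 R101 d
#   R101        -> a b
#   R102        -> R101 c
GRAMMAR = {
    100: [101, 102, 101, 3],
    101: [0, 1],
    102: [101, 2],
}


def is_rule(sym):
    return sym >= RULE_BASE


def count_bottom_up(grammar, root):
    """Compute each rule's word counts once, then merge them into its parents."""
    memo = {}

    def visit(rule):
        if rule in memo:
            return memo[rule]
        local = Counter()
        for sym in grammar[rule]:
            if is_rule(sym):
                local.update(visit(sym))  # add the child's counts
            else:
                local[sym] += 1
        memo[rule] = local
        return local

    return visit(root)


def count_top_down(grammar, root):
    """Push occurrence weights from the root down, then weight local words.

    Assumes every rule in `grammar` is reachable from `root`.
    """
    # Number of references to each rule that have not been processed yet.
    pending = Counter()
    for body in grammar.values():
        for sym in body:
            if is_rule(sym):
                pending[sym] += 1

    weight = Counter({root: 1})  # how often each rule occurs in the full text
    ready = [root]
    total = Counter()
    while ready:
        rule = ready.pop()
        for sym in grammar[rule]:
            if is_rule(sym):
                weight[sym] += weight[rule]
                pending[sym] -= 1
                if pending[sym] == 0:  # all parents seen: weight is final
                    ready.append(sym)
            else:
                total[sym] += weight[rule]
    return total


def expand(grammar, rule):
    """Full decompression, used here only to check the two results."""
    out = []
    for sym in grammar[rule]:
        out.extend(expand(grammar, sym) if is_rule(sym) else [sym])
    return out
```

Both functions visit each rule body once, so the work scales with the size of the grammar rather than with the length of the expanded text. Which direction is cheaper in practice would depend on the task and the data, which is presumably what the mode selection in step 2) addresses.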

Inventors:
ZHANG FENG (CN)
DU XIAOYONG (CN)
Application Number:
PCT/CN2022/077227
Publication Date:
September 29, 2022
Filing Date:
February 22, 2022
Assignee:
UNIV RENMIN CHINA (CN)
International Classes:
G06F16/174
Foreign References:
CN113064870A 2021-07-02
CN103326730A 2013-09-25
Other References:
ZHANG FENG; ZHAI JIDONG; SHEN XIPENG; WANG DALIN; CHEN ZHENG; MUTLU ONUR; CHEN WENGUANG; DU XIAOYONG: "TADOC: Text analytics directly on compression", VLDB JOURNAL, SPRINGER VERLAG, BERLIN, DE, vol. 30, no. 2, 1 March 2021 (2021-03-01), pages 163-188, XP037407734, ISSN: 1066-8888, DOI: 10.1007/s00778-020-00636-3
FENG ZHANG; JIDONG ZHAI; XIPENG SHEN; ONUR MUTLU; WENGUANG CHEN: "Efficient document analytics on compressed data", PROCEEDINGS OF THE VLDB ENDOWMENT, ASSOC. OF COMPUTING MACHINERY, NEW YORK, NY, vol. 11, no. 11, 1 July 2018 (2018-07-01), pages 1522-1535, XP058416505, ISSN: 2150-8097, DOI: 10.14778/3236187.3236203
FENG ZHANG; ZAIFENG PAN; YANLIANG ZHOU; JIDONG ZHAI; XIPENG SHEN; ONUR MUTLU; XIAOYONG DU: "G-TADOC: Enabling Efficient GPU-Based Text Analytics without Decompression", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 13 June 2021 (2021-06-13), XP081988704, DOI: 10.1109/ICDE51399.2021.00148
PAN ZAIFENG; ZHANG FENG; ZHOU YANLIANG; ZHAI JIDONG; SHEN XIPENG; MUTLU ONUR; DU XIAOYONG: "Exploring Data Analytics Without Decompression on Embedded GPU Systems", IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, IEEE, USA, vol. 33, no. 7, 12 October 2021 (2021-10-12), pages 1553-1568, XP011886857, ISSN: 1045-9219, DOI: 10.1109/TPDS.2021.3119402
Attorney, Agent or Firm:
JEEKAI & PARTNERS (CN)