Title:
MULTI-THREAD FAST STORAGE LOSSLESS COMPRESSION METHOD AND SYSTEM FOR FASTQ DATA
Document Type and Number:
WIPO Patent Application WO/2017/214765
Kind Code:
A1
Abstract:
Provided is a multi-threaded, fast-storage lossless compression method for FASTQ data, applied to the compression of DNA sequences. The method comprises: a data classification step of inputting original FASTQ data and dividing each short read of the original FASTQ data into three data streams, namely metadata, quality scores, and the base sequence (S11); a data compression step of: for the metadata, using incremental encoding to detect and eliminate redundant information; for the quality scores, using a bit-level PPM prediction model and arithmetic coding for compression; and for the base sequence, using improved fixed-order arithmetic coding for compression (S12); and a data output step of archiving and merging the compression results of the different data streams and outputting the final compressed data (S13). The solution can improve compression efficiency and compression speed.
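
The classification step (S11) and the metadata part of the compression step (S12) can be illustrated with a minimal sketch. It is not taken from the application: the function names `split_fastq`, `delta_encode`, and `delta_decode` are hypothetical, the input is assumed to be four-line FASTQ records, and the incremental encoding is approximated here by simple front coding (shared-prefix length plus remaining suffix), which may differ from the scheme the application actually claims. The PPM and arithmetic-coding stages are not shown.

```python
from os.path import commonprefix


def split_fastq(text):
    """Split FASTQ text into three streams: metadata, base sequences, quality strings.

    Assumes each record occupies exactly four lines (header, bases, '+', qualities).
    """
    lines = text.strip().split("\n")
    if len(lines) % 4 != 0:
        raise ValueError("expected 4 lines per FASTQ record")
    meta, bases, quals = [], [], []
    for i in range(0, len(lines), 4):
        header, seq, plus, qual = lines[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed record starting at line {i + 1}")
        meta.append(header[1:])  # drop the leading '@'
        bases.append(seq)
        quals.append(qual)
    return meta, bases, quals


def delta_encode(headers):
    """Front coding (an assumed stand-in for the incremental encoding):
    store each header as (length of prefix shared with the previous header, suffix)."""
    encoded, prev = [], ""
    for h in headers:
        k = len(commonprefix([prev, h]))
        encoded.append((k, h[k:]))
        prev = h
    return encoded


def delta_decode(encoded):
    """Invert delta_encode, showing that the encoding is lossless."""
    headers, prev = [], ""
    for k, suffix in encoded:
        prev = prev[:k] + suffix
        headers.append(prev)
    return headers


sample = "@SRR001.1 len=4\nACGT\n+\nIIII\n@SRR001.2 len=4\nTTGA\n+\nII#I\n"
meta, bases, quals = split_fastq(sample)
encoded = delta_encode(meta)
```

Because consecutive read headers usually share a long prefix, most entries in `encoded` reduce to a small integer and a short suffix, which is the redundancy the metadata stage aims to remove. Keeping the three streams separate also lets each one be handled by its own model, and potentially by its own thread.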

Inventors:
ZHU ZEXUAN (CN)
HUANG ZHIAN (CN)
SUN YIWEN (CN)
WEN ZHENKUN (CN)
Application Number:
PCT/CN2016/085426
Publication Date:
December 21, 2017
Filing Date:
June 12, 2016
Assignee:
UNIV SHENZHEN (CN)
International Classes:
H03M7/30; G16B50/50
Domestic Patent References:
WO2015120170A1, 2015-08-13
Foreign References:
CN103559020A, 2014-02-05
US20150227686A1, 2015-08-13
Other References:
ZHANG, YONGPENG: "Lossless Compression of High-through DNA Sequence Data", CHINA MASTER'S THESES FULL-TEXT DATABASE BASIC SCIENCE, 15 December 2015 (2015-12-15), pages 3-5, 12-15 and 20-25, ISSN: 1674-0246
AKGUN, METE ET AL.: "A new PPM model for quality score compression", SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 26 April 2013 (2013-04-26), XP032423184, ISSN: 2165-0608
BEHZADI, BEHSHAD ET AL.: "DNA Compression Challenge Revisited: A Dynamic Programming Approach", COMBINATORIAL PATTERN MATCHING, 22 June 2005 (2005-06-22), XP055297614, ISSN: 0302-9743, DOI: 10.1007/11496656_17
Attorney, Agent or Firm:
HENSEN INTELLECTUAL PROPERTY FIRM (CN)