

Title:
CHAN FRAMEWORK, CHAN CODING AND CHAN CODE
Document Type and Number:
WIPO Patent Application WO/2018/020328
Kind Code:
A1
Abstract:
A framework and the associated method, schema and design for processing digital data, whether random or not, through encoding and decoding losslessly and correctly for the purpose of encryption/decryption or compression/decompression or both. There is no assumption about the digital information to be processed before processing.

Inventors:
CHAN KAM FU (CN)
Application Number:
PCT/IB2017/050985
Publication Date:
February 01, 2018
Filing Date:
February 22, 2017
Assignee:
CHAN KAM FU (CN)
International Classes:
H04N19/176
Foreign References:
CN103125119A (2013-05-29)
CN105594209A (2016-05-18)
CN101252694A (2008-08-27)
CN102948145A (2013-02-27)
Claims:
Claims

[1] CHAN FRAMEWORK, method of creating order out of digital data information, whether random or not, being characterized by an order of data or a data order or a data structure or a data organization created from any digital data set, whether random or not, consisting of Code Unit as the basic unit of bit container containing binary bits of a digital data set for use; according to the design and schema chosen for processing for the purpose of encoding and decoding, Code Unit being classified primarily by the maximum possible number of data values a Code Unit is defined to hold or to represent, i.e. the value size of a Code Unit, where each of the possible unique values of a Code Unit could have the same bit size or different bit sizes; and Code Unit then being classified by the number of bits all the possible unique data values altogether of a Code Unit occupy, i.e. the sum of the bit size of each of the possible unique data values of a Code Unit; and Code Unit being further classified by the Head Design, i.e. whether it is of 0 Head Design or 1 Head Design; whereby Code Unit of a certain value size under CHAN FRAMEWORK could have different definitions and versions;
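The classification in claim [1] can be illustrated with a minimal sketch; the class and field names below are hypothetical, not terms from the application. A Code Unit definition is characterized by its value size (the number of distinct values it can represent), the total bit count of all its unique values, and its Head Design:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CodeUnitDefinition:
    """Hypothetical model of a Code Unit classification under claim [1]."""
    values: tuple        # the possible unique bit patterns, e.g. ("0", "10", "11")
    head_design: int     # 0 for 0 Head Design, 1 for 1 Head Design

    @property
    def value_size(self):
        """Maximum possible number of data values the unit can represent."""
        return len(self.values)

    @property
    def total_bit_size(self):
        """Sum of the bit sizes of all the possible unique values."""
        return sum(len(v) for v in self.values)

# Two versions of a value-size-3 Code Unit, differing in Head Design:
cu_0head = CodeUnitDefinition(("0", "10", "11"), head_design=0)
cu_1head = CodeUnitDefinition(("1", "01", "00"), head_design=1)
assert cu_0head.value_size == 3 and cu_0head.total_bit_size == 5
```

Both definitions above have the same value size and total bit size, differing only in Head Design, which is why a Code Unit of a certain value size can have different definitions and versions.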

[2] CHAN FRAMEWORK, method of creating order out of digital data information, whether random or not, being characterized by an order of data or a data order or a data structure or a data organization created from any digital data set, whether random or not, consisting of Processing Unit(s) which is made up by a certain number of Code Units as sub-units according to the design and schema chosen for processing for the purpose of encoding and decoding;

[3] CHAN FRAMEWORK, method of creating order out of digital data information, whether random or not, being characterized by an order of data or a data order or a data structure or a data organization created from any digital data set, whether random or not, consisting of Super Processing Unit(s) which is made up by a certain number of Processing Unit(s) as sub-units according to the design and schema chosen for processing for the purpose of encoding and decoding;

[4] CHAN FRAMEWORK, method of creating order out of digital data information, whether random or not, being characterized by an order of data or a data order or a data structure or a data organization created from any digital data set, whether random or not, consisting of Un-encoded Section which is made up by a certain number of binary bits, which do not make up to the size of one Processing Unit, thus left as un-encoded or left as it is according to the design and schema chosen for processing for the purpose of encoding and decoding;

[5] CHAN FRAMEWORK, method of creating order out of digital data information, whether random or not, being characterized by an order of data or a data order or a data structure or a data organization created from any digital data set, whether random or not, consisting of Un-encoded Section which is made up by a certain number of binary bits, which do not make up to the size of one Processing Unit, thus left as un-encoded or left as it is according to the design and schema chosen for processing for the purpose of encoding and decoding;

[6] CHAN FRAMEWORK, method of creating order out of digital data information, whether random or not, being characterized by an order of data or a data order or a data structure or a data organization created from any digital data set, whether random or not, consisting of traits or characteristics or relations that are derived from Code Unit(s), Processing Unit(s), Super Processing Unit(s) and Un-encoded Section as well as their combination in use according to the design and schema chosen for processing for the purpose of encoding and decoding;

[7] CHAN FRAMEWORK, method of creating order out of digital data information, whether random or not, being characterized by a descriptive language that is used to describe the traits or characteristics or relations of any digital data set using the terminology for describing the traits or characteristics or relations of Code Unit, Processing Unit, Super Processing Unit and Un-encoded Section;

[8] CHAN CODING, method of encoding and decoding, being characterized by techniques for processing data for the purpose of encoding and decoding under CHAN FRAMEWORK;

[9] CHAN CODING, method of encoding and decoding, being characterized by the resultant CHAN CODE created out of any digital data set using techniques of CHAN CODING;

[10] CHAN CODING, method of encoding and decoding, being characterized by the technique using Absolute Address Branching Technique with range;

[11] CHAN CODING, method of encoding and decoding, being characterized by the technique using mathematical formula(e) for representing the relations between Code Units of a Processing Unit of the data order created under CHAN FRAMEWORK;

[12] CHAN CODING, method of encoding and decoding, being characterized by the technique of placement, placing the values or encoded codes as represented by mathematical formula(e) as well as those values or encoded codes of Code Unit, Processing Unit, Super Processing Unit and Un-encoded Section in different position order;

[13] CHAN CODING, method of encoding and decoding, being characterized by a technique of classification, i.e. the assignment of 0 Head Design or 1 Head Design or both, represented by the associated bit pattern, to trait(s) or characteristic(s) of the digital data under processing that is/are used to classify or group data values for processing for the purpose of encoding and decoding;

[14] CHAN CODING, method of encoding and decoding, being characterized by a technique of classification, i.e. the use of trait(s) or characteristic(s) in terms of Rank and Position of the data values of the digital data under processing for classifying or grouping data values for processing for the purpose of encoding and decoding;

[15] CHAN CODING, method of encoding and decoding, being characterized by a technique of classification, i.e. the use of code re-distribution, including re-distribution of unique data values as well as unique address codes from one class to another class of the classification scheme by use of any one of the following techniques including code swapping, code re-assignment and code re-filling for processing digital data set for the purpose of encoding and decoding;

[16] CHAN CODING, method of encoding and decoding, being characterized by techniques of code adjustment, including any one of the following techniques including code promotion, code demotion, code omission as well as code restoration for processing for the purpose of encoding and decoding;

[17] CHAN CODING, method of encoding and decoding, being characterized by technique of using Terminating Condition or Terminating Value for defining the size of a Processing Unit or a Super Processing Unit for processing for the purpose of encoding and decoding;

[18] CHAN CODING, method of encoding and decoding, being characterized by technique of using Code Unit Definition as Reader of digital data values or encoded code values;

[19] CHAN CODING, method of encoding and decoding, being characterized by technique of using Code Unit Definition as Writer of digital data values or encoded code values;

[20] CHAN CODING, method of encoding and decoding, being characterized by technique of using Super Processing Unit for sub-dividing a digital data set into sub-sections of data of which at least one sub-section is not in random for processing for the purpose of encoding and decoding;

[21] A method of Claim [20] being characterized by further classifying the Super Processing Units of the digital data set into classes, two or more, using a classifying condition, such as the number of value entries appearing in the Super Processing Unit for a particular class; and by designing mapping tables which are appropriate to the data distribution of each of these classes for encoding and decoding; and by encoding and decoding the data values of each of these Super Processing Units with the use of their respective mapping table appropriate to the data distribution of the data values of each of these Super Processing Units; and using indicators to make distinction between these classes of Super Processing Units for the use in decoding, such indicators being kept at the head of each of these Super Processing Units or elsewhere as in separate CHAN CODE FILES;
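The scheme of claim [21] can be sketched in miniature; the tables, the classifying condition and the one-character indicators below are all illustrative assumptions, not definitions from the application. Each Super Processing Unit is classified by the number of distinct values it contains, encoded with the mapping table for its class, and prefixed with an indicator that tells the decoder which table to use:

```python
# Hypothetical mapping tables for two classes of Super Processing Units:
TABLE_SPARSE = {"a": "0", "b": "1"}                         # few distinct values
TABLE_DENSE = {"a": "00", "b": "01", "c": "10", "d": "11"}  # more distinct values

def encode_spu(spu):
    """Classify by number of distinct values, prefix a class indicator."""
    table = TABLE_SPARSE if len(set(spu)) <= 2 else TABLE_DENSE
    indicator = "S" if table is TABLE_SPARSE else "D"
    return indicator + "".join(table[v] for v in spu)

def decode_spu(code):
    """Read the indicator, then decode with the matching table."""
    indicator, body = code[0], code[1:]
    table = TABLE_SPARSE if indicator == "S" else TABLE_DENSE
    inverse = {bits: v for v, bits in table.items()}
    out, buf = [], ""
    for bit in body:          # both tables are prefix-free, so greedy matching works
        buf += bit
        if buf in inverse:
            out.append(inverse[buf])
            buf = ""
    return out

spu = ["a", "b", "a", "a"]
assert decode_spu(encode_spu(spu)) == spu
```

In the claim the indicator may be kept at the head of each Super Processing Unit, as here, or elsewhere in separate CHAN CODE FILES.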

[22] A method of Claim [20] being characterized by further classifying the Super Processing Units of the digital data set into classes, two or more, using a classifying condition, such as the number of value entries appearing in the Super Processing Unit for a particular class; and by designing mapping tables which are appropriate to the data distribution of each of these classes for encoding and decoding; and by encoding and decoding the data values of each of these Super Processing Units with the use of their respective mapping table appropriate to the data distribution of the data values of each of these Super Processing Units; and by setting criteria appropriate to the data distribution of the classes of Super Processing Units and the corresponding mapping tables used for encoding and decoding for use in assessing the encoded code for making Artificial Intelligence distinction between the classes of Super Processing Units so that the use of indicators could be dispensed with;

[23] A method of Claim [20] being characterized by further classifying the Super Processing Units of the digital data set into two classes, using a classifying condition, such as the number of value entries appearing in the Super Processing Unit for a particular class; and by designing mapping tables which are appropriate to the data distribution of each of these classes for encoding and decoding, whereby at least one of these mapping tables could serve and thus be chosen to serve as an unevener and such an unevener could also be adjusted through the use of code re-distribution that it could take advantage of the data distribution of the data values of at least one class of Super Processing Units so that the unevener mapping table after code adjustment through code re-distribution could serve and thus be chosen as the mapping table of a compressor for at least one class of Super Processing Units; and by encoding all the Super Processing Units using the unevener in the first cycle; and then by encoding at least one class of the Super Processing Units using the compressor where compression of data of the respective Super Processing Unit under processing is feasible in the second cycle, i.e. encoded with the use of the unevener in the first cycle and the compressor in the second cycle, leaving those Super Processing Unit with data incompressible as it is, i.e. 
encoded with the use of the unevener only; and decoding the data values of each of these Super Processing Units with the use of their respective mapping table(s) appropriate to the data distribution of the data values of each of these Super Processing Units, whereby in the first cycle of decoding, the encoded code formed out of unevener encoding and compressor encoding is decoded so that the layer of compressor encoding is removed, and in the second cycle of decoding, the encoded code, consisting of only unevener encoded code, of all the Super Processing Units is decoded by the unevener decoder; and by setting criteria appropriate to the data distribution of the classes of Super Processing Units and the corresponding mapping tables used for encoding and decoding for use in assessing the encoded code for making Artificial Intelligence distinction between the classes of Super Processing Units so that the use of indicators could be dispensed with;

[24] CHAN CODING, method of encoding and decoding, being characterized by the technique of creating Unevener Encoder and Unevener Decoder by building a mapping table and using the unique code addresses of the said mapping table for mapping the unique data values of the digital data input in one to one mapping whereby the number of bit(s) used by the unique data values and that used by the corresponding mapped unique table code addresses of the corresponding mapped pair is the same; by using the said mapping table for encoding and decoding;
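The mapping-table structure of claim [24] can be sketched as follows; the particular table chosen here is an arbitrary assumption for illustration. Each unique 2-bit data value maps one to one to a 2-bit code address, so the encoded output has exactly the same bit size as the input while the distribution of 0s and 1s may change:

```python
# Hypothetical unevener table: bijective, and bit-size-preserving per mapped pair.
UNEVENER = {"00": "11", "01": "10", "10": "01", "11": "00"}
UNEVENER_INV = {code: value for value, code in UNEVENER.items()}

def uneven_encode(bits):
    """Map each 2-bit value to its same-length code address."""
    return "".join(UNEVENER[bits[i:i + 2]] for i in range(0, len(bits), 2))

def uneven_decode(bits):
    """Invert the one-to-one mapping to restore the original bits."""
    return "".join(UNEVENER_INV[bits[i:i + 2]] for i in range(0, len(bits), 2))

data = "00011011"
encoded = uneven_encode(data)
assert len(encoded) == len(data)         # same bit count: mapping is size-neutral
assert uneven_decode(encoded) == data    # lossless round trip
```

Because each mapped pair uses the same number of bits, such a table is size-neutral on its own; per claims [25] and [26] it is used in combination with an evener or compressor stage.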

[25] CHAN CODING, method of encoding and decoding, being characterized by the technique of using Unevener Encoder and Unevener Decoder for processing for the purpose of encoding and decoding;

[26] CHAN CODING, method of encoding and decoding, being characterized by the technique of using Unevener Encoder and Unevener Decoder together with an Evener Encoder and Decoder or a Compressor and Decompressor for processing for the purpose of encoding and decoding;

[27] CHAN CODING, method of encoding and decoding, being characterized by technique of dynamic adjustment of the size of Processing Unit or Super Processing Unit in the context of changing data distribution and in accordance with the Terminating Condition used under processing;

[28] CHAN CODING, method of encoding and decoding, being characterized by technique of dynamic adjustment of Code Unit Definition in accordance with the data distribution pattern of the data values under processing;

[29] CHAN CODE being characterized by Classification Code and Content Code, which are created out of any digital data set using techniques of CHAN CODING for processing for the purpose of encoding and decoding;

[30] CHAN CODE being characterized by Classification Code, Content Code and Un-encoded Code Section, which are created out of any digital data set using techniques of CHAN CODING for processing for the purpose of encoding and decoding;

[31] CHAN CODE being characterized by Header, Classification Code and Content Code, which are created out of any digital data set using techniques of CHAN CODING for processing for the purpose of encoding and decoding, whereby the said Header contains indicator(s) resulting from the use of CHAN CODING technique(s) for processing for the purpose of encoding and decoding;

[32] CHAN CODE being characterized by Header, Classification Code, Content Code and Un-encoded Code Section, which are created out of any digital data set using techniques of CHAN CODING for processing for the purpose of encoding and decoding, whereby the said Header contains indicator(s) resulting from the use of CHAN CODING technique(s) for processing for the purpose of encoding and decoding, such indicator(s) including any of the following: Checksum Indicator, Signature for CHAN CODE FILES, Mapping Table Indicator, Number of Cycle Indicator, Code Unit Definition Indicator, Processing Unit Definition Indicator, Super Processing Unit Definition Indicator, Last Identifying Code Indicator, Scenario Design Indicator, Unevener/Evener Indicator, Recycle Indicator, Frequency Indicator;

[33] Encoder and Decoder being characterized by being embedded with techniques of CHAN CODING for processing;

[34] Encoder and Decoder being characterized by being embedded with techniques of CHAN CODING and Header Indicator(s) for processing, such indicator(s) including any of the following: Checksum Indicator, Signature for CHAN CODE FILES, Mapping Table Indicator, Number of Cycle Indicator, Code Unit Definition Indicator, Processing Unit Definition Indicator, Super Processing Unit Definition Indicator, Last Identifying Code Indicator, Scenario Design Indicator, Unevener/Evener Indicator, Recycle Indicator, Frequency Indicator;

[35] CHAN CODE FILES, being digital information files containing CHAN CODE;

[36] CHAN CODE FILES, being digital information files containing additional information for the use by CHAN CODING techniques, including Header and the indicator(s) contained therein, such indicators including any of the following: Checksum Indicator, Signature for CHAN CODE FILES, Mapping Table Indicator, Number of Cycle Indicator, Code Unit Definition Indicator, Processing Unit Definition Indicator, Super Processing Unit Definition Indicator, Last Identifying Code Indicator, Scenario Design Indicator, Unevener/Evener Indicator, Recycle Indicator, Frequency Indicator;

[37] CHAN MATHEMATICS being a mathematical method using techniques whereby data values are put into an order that is able to be described in mathematical formula(e) corresponding to the respective CHAN SHAPE, including the associated mathematical calculation logic and techniques used in merging and separating digital information, such digital information including values of Code Units of a Processing Unit in processing digital information, whether at random or not, for the purpose of encoding and decoding;

[38] CHAN FORMULA(E) being formula(e) describing the characteristics and relations between basic components, the Code Units, and derived components such as RP Piece of CHAN CODE and other derived components, such as the Combined Values or sums or differences of values of basic components of a Processing Unit for processing digital information, whether at random or not, for the purpose of encoding and decoding;

[39] CHAN SHAPES, including CHAN DOT, CHAN LINES, CHAN TRIANGLE, CHAN RECTANGLES, CHAN TRAPEZIA, CHAN SQUARES and CHAN BARS, representing the characteristics and relations of the basic components of a Processing Unit;

[40] COMPLEMENTARY MATHEMATICS using a constant value or a variable containing a value as a COMPLEMENTARY CONSTANT or COMPLEMENTARY VARIABLE for mathematical processing, making the mirror value of a value or a range or ranges of values being obtainable for use;
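The mirror-value idea of claim [40] can be shown with a small numeric sketch; the constant chosen below is an assumption for illustration only. With a COMPLEMENTARY CONSTANT C, the mirror of a value v is C - v, and mirroring twice restores the original value, so a value and its mirror are interchangeable without loss:

```python
# Hypothetical COMPLEMENTARY CONSTANT; any fixed value covering the value range works.
C = 7

def mirror(v):
    """Return the mirror value of v under the complementary constant C."""
    return C - v

values = [0, 2, 5, 7]
mirrored = [mirror(v) for v in values]
assert mirrored == [7, 5, 2, 0]                 # range of values mirrored
assert [mirror(m) for m in mirrored] == values  # mirroring is its own inverse
```

A COMPLEMENTARY VARIABLE works the same way, except that the value of C is itself carried as data rather than fixed by the design.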

[41] CHAN MATHEMATICS using COMPLEMENTARY MATHEMATICS and normal mathematics or either of them alone for processing;

[42] Use of CHAN FRAMEWORK for the purpose of encryption/decryption or compression/decompression or both;

[43] Use of CHAN CODING for the purpose of encryption/decryption or compression/decompression or both;

[44] Use of CHAN CODE for the purpose of encryption/decryption or compression/decompression or both;

[45] Use of CHAN CODE FILE(S) for the purpose of encryption/decryption or compression/decompression or both;

[46] Use of CHAN MATHEMATICS for the purpose of encryption/decryption or compression/decompression or both;

[47] Use of COMPLEMENTARY MATHEMATICS for the purpose of encryption/decryption or compression/decompression or both;

[48] Use of CHAN SHAPE(S) for the purpose of encryption/decryption or compression/decompression or both;

[49] A method of parsing digital data set, whether random or not, for collecting statistics about digital data set for the purpose of encoding and decoding, characterized by using design and schema of data order defined under CHAN FRAMEWORK;

[50] A method of describing digital data set, whether random or not, characterized by using CHAN FRAMEWORK LANGUAGE; and

[51] CHAN CODING, method of encoding and decoding, being characterized by technique of using Posterior Classification Code or Interior Classification Code or Modified Content Code as Classification Code.

Description

CHAN FRAMEWORK, CHAN CODING AND CHAN CODE

Technical Field

[0] Let him that hath understanding count the number

[1] This invention claims priority of two earlier PCT Applications, PCT/IB2016/054562 filed on 29 July 2016 and PCT/IB2016/054732 filed on 05 August 2016, submitted by the present inventor. This invention relates to the use of the concept and techniques revealed in the aforesaid two PCT Applications and improved on in the present Application, presenting a framework for describing digital data, whether random or not, for encoding and decoding purposes, including compression and decompression as well as encryption and decryption. In addition, the present invention also reveals the relationship between different components of CHAN SHAPES (including CHAN RECTANGLES, CHAN TRAPEZIA, CHAN SQUARES, CHAN TRIANGLE, CHAN LINE, CHAN DOT and CHAN BARS or other shapes which describe the relations and characteristics of the basic components of the Processing Unit) and the respective techniques in making coding (including encoding and decoding) of digital information for the use and protection of intellectual property, expressed in the form of digital information, including digital data as well as executable code for use in device(s), including computer system(s) or computer-controlled device(s) or operating-system-controlled device(s) or system(s) that is/are capable of running executable code or using digital data. Such device(s) is/are mentioned hereafter as Device(s).

[2] In particular, this invention relates to the framework and method as well as its application in processing, storing, distribution and use in Device(s) of digital information, including digital data as well as executable code, such as boot code, programs, applications, device drivers, or a collection of such executables constituting an operating system in the form of executable code embedded or stored into hardware, such as embedded or stored in all types of storage medium, including read-only or rewritable or volatile or non-volatile storage medium (referred hereafter as the Storage Medium) such as physical memory or internal DRAM (Dynamic Random Access Memory) or hard disk or solid state disk (SSD) or ROM (Read Only Memory), or read-only or rewritable CD/DVD/HD-DVD/Blu-Ray DVD or hardware chip or chipset etc. The method of coding revealed, i.e. CHAN CODING, when implemented produces an encoded code, CHAN CODE, that could be decoded and restored losslessly back into the original code; and if such coding is meant for compression, such compressed code could also be re-compressed time and again until it reaches its limit.
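The cycle of re-compression "time and again until it reaches its limit" described above can be sketched as a control loop; the toy run-collapsing encoder below is a stand-in for illustration, not the CHAN CODING encoder:

```python
def recompress(data, encode):
    """Apply an encoder repeatedly until no further size reduction is obtained."""
    cycles = 0
    while True:
        out = encode(data)
        if len(out) >= len(data):
            return data, cycles          # limit reached: stop re-compressing
        data, cycles = out, cycles + 1

def toy_encode(s):
    """Stand-in encoder: collapse each run of repeated characters once."""
    out, i = "", 0
    while i < len(s):
        j = i
        while j < len(s) and s[j] == s[i]:
            j += 1
        run = j - i
        out += s[i] + (str(run) if run > 1 else "")
        i = j
    return out

result, cycles = recompress("aaaaaaaaaabbbb", toy_encode)
assert cycles == 1 and result == "a10b4"
```

The loop stops as soon as a cycle no longer shrinks the data, which is the "limit" the paragraph refers to; the number of cycles performed is the kind of information a Number of Cycle Indicator would record for decoding.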

[3] In essence, this invention reveals a framework for creating an order for digital data so that digital data could be described and its characteristics could be investigated for the purpose of making compression / decompression or encryption / decryption of digital information. In this relation, it makes possible the processing, storing, distribution and use of digital information in Device(s) connected over local clouds or internet clouds for the purpose of using and protecting intellectual property. As with the use of other compression methods, without proper decompression using the corresponding methods, the compressed code could not be restored correctly, so it could also be considered an encrypted code as well. CHAN CODING AND CHAN CODE (CHAN CODING AND CHAN CODE including the concepts and techniques and the resultant code so produced as revealed in the aforesaid PCT Applications and in the present Application) could also be used in other scientific, industrial and commercial endeavors in various kinds of applications to be explored. The use of it in the Compression Field demonstrates vividly its tremendous use.

[4] However, the framework, the associated schema, design and method as well as its application revealed in this invention are not limited to delivery or exchange of digital information over clouds, i.e. local area network or internet, but could be used in other modes of delivery or exchange of information.

Background Art

[5] In the field of Compression Science, there are many methods and algorithms published for compressing digital information, and an introduction to commonly used data compression methods and algorithms could be found at http://en.wikipedia.org/wiki/Data_compression .

The present invention describes a novel method that could be used for making lossless data compression (besides also being suitable for use for the purpose of making encryption and lossless decryption) and its restoration.

Relevant part of the aforesaid wiki on lossless compression is reproduced here for easy reference:

"Lossless data compression algorithms usually exploit statistical redundancy to represent data more concisely without losing information, so that the process is reversible. Lossless compression is possible because most real-world data has statistical redundancy. For example, an image may have areas of colour that do not change over several pixels; instead of coding "red pixel, red pixel, ..." the data may be encoded as "279 red pixels". This is a basic example of run-length encoding; there are many schemes to reduce file size by eliminating redundancy.

The Lempel-Ziv (LZ) compression methods are among the most popular algorithms for lossless storage. [6] DEFLATE is a variation on LZ optimized for decompression speed and compression ratio, but compression can be slow. DEFLATE is used in PKZIP, Gzip and PNG. LZW (Lempel-Ziv-Welch) is used in GIF images. Also noteworthy is the LZR (Lempel-Ziv-Renau) algorithm, which serves as the basis for the Zip method. LZ methods use a table-based compression model where table entries are substituted for repeated strings of data. For most LZ methods, this table is generated dynamically from earlier data in the input. The table itself is often Huffman encoded (e.g. SHRI, LZX). A current LZ-based coding scheme that performs well is LZX, used in Microsoft's CAB format.

The best modern lossless compressors use probabilistic models, such as prediction by partial matching. The Burrows-Wheeler transform can also be viewed as an indirect form of statistical modelling. [7]

The class of grammar-based codes are gaining popularity because they can compress highly repetitive text, extremely effectively, for instance, biological data collection of same or related species, huge versioned document collection, internet archives, etc. The basic task of grammar-based codes is constructing a context-free grammar deriving a single string. Sequitur and Re-Pair are practical grammar compression algorithms for which public codes are available.

In a further refinement of these techniques, statistical predictions can be coupled to an algorithm called arithmetic coding. Arithmetic coding, invented by Jorma Rissanen, and turned into a practical method by Witten, Neal, and Cleary, achieves superior compression to the better-known Huffman algorithm and lends itself especially well to adaptive data compression tasks where the predictions are strongly context-dependent. Arithmetic coding is used in the bi-level image compression standard JBIG, and the document compression standard DjVu. The text entry system Dasher is an inverse arithmetic coder. [8]"
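The run-length encoding example quoted above ("279 red pixels" instead of repeating "red pixel" 279 times) can be sketched as:

```python
def rle_encode(pixels):
    """Collapse consecutive runs of equal values into (count, value) pairs."""
    runs, i = [], 0
    while i < len(pixels):
        j = i
        while j < len(pixels) and pixels[j] == pixels[i]:
            j += 1
        runs.append((j - i, pixels[i]))
        i = j
    return runs

def rle_decode(runs):
    """Expand (count, value) pairs back into the original sequence."""
    return [colour for count, colour in runs for _ in range(count)]

pixels = ["red"] * 279 + ["blue"] * 2
runs = rle_encode(pixels)
assert runs == [(279, "red"), (2, "blue")]   # 281 pixels stored as two pairs
assert rle_decode(runs) == pixels            # lossless: the process is reversible
```

This is the simplest illustration of the statistical-redundancy point in the quotation: the scheme shrinks data exactly when runs are long, and expands it when values rarely repeat.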

[6] In the aforesaid wiki, it says that "LZ methods use a table-based compression model where table entries are substituted for repeated strings of data". The use of tables for translation, encryption, compression and expansion is common, but the ways in which tables are used for such purposes are various and could be novel in one way or another.

[7] The present invention presents a novel method, CHAN CODING, that produces an amazing result that has never been revealed elsewhere. This represents a successful challenge and a revolutionary ending to the myth of the Pigeonhole Principle in Information Theory. CHAN CODING demonstrates how the technical problems described in the following section are approached and solved.

Disclosure of Invention

Technical Problem

[8] The technical problem presented in the challenge of lossless data compression is how longer entries of digital data code could be represented in shorter entries of code and yet could be recoverable. While shorter entries could be used for substituting longer data entries, it seems inevitable that some other information, in digital form, has to be added in order to make it possible or to tell how to recover the original longer entries from the shortened entries. If too much such digital information has to be added, it makes the compression efforts futile and sometimes, the result is expansion rather than compression.

[9] The way of storing such additional information presents another challenge to the compression process. If the additional information for one or more entries of the digital information is stored interspersed with the compressed data entries, how to differentiate the additional information from the original entries of the digital information is a problem, and the separation of the compressed entries of the digital information during recovery presents another challenge, especially where the original entries of the digital information are to be compressed into different lengths and the additional information may also vary in length accordingly.

[10] This is especially problematic if the additional information and the compressed digital entries are to be recoverable after re-compression again and again. More often than not, compressed data could not be re-compressed and even if re-compression is attempted, not much gain could be obtained and very often the result is an expansion rather than compression.

[11] The digital information to be compressed also varies in nature; some are text files, others are graphic, music, audio or video files, etc. Text files usually have to be compressed losslessly, otherwise its content becomes lost or scrambled and unrecognizable.

[12] And some text files are ASCII-based while others are UNICODE-based. Text files of different languages also have different characteristics as expressed in the frequency and combination of the digital codes used for representation. This means a framework and method which has little adaptive or all-embracing power (i.e. catering for all possible cases) could not work best for all such scenarios. Providing a more adaptive and flexible or an all-embracing framework and method for data compression is therefore a challenge.

Technical Solution

[13] It has long been held in the data compression field that pure random binary numbers could not be shown to be definitely subject to compression until the present invention. By providing a framework and method for lossless compression that suits digital information, whether random or not, of different types and of different language characteristics, the present invention enables one to compress random digital information and to recover it successfully. The framework as revealed in this invention, CHAN FRAMEWORK, makes possible the description and creation of order of digital information, whether random or not, in an organized manner so that the characteristics of any digital information could be found out, described, investigated and analyzed, and so that such characteristics and the content of the digital information could be used to develop techniques and methods for the purposes of lossless encryption / decryption and compression / decompression in cycles. This puts an end to the myth of the Pigeonhole Principle in Information Theory. Of course, there is a limit. It is obvious that one could not further compress digital information of only 1 bit. The limit of compressing digital information as revealed by the present invention varies with the schema and method chosen by the relevant implementation in making compression, as determined by the size of the header used, the size of Processing Unit (containing a certain number of Code Units) or the size of Super Processing Units (containing a certain number of Processing Units) used, as well as the size of un-encoded binary bits, which do not make up to a Processing Unit or Super Processing Unit. So this limit of compressing any digital information could be kept to just one or two hundred binary bits or less depending on design.

Using CHAN FRAMEWORK AND CHAN CODING, the random digital information to be compressed and recovered need not be known beforehand. CHAN FRAMEWORK will be defined in the course of the following description where appropriate. The following diagram is used to explain the features of CHAN FRAMEWORK as revealed in the present invention for encoding and decoding (i.e. including the purposes of compression / decompression and encryption / decryption):

Diagram 1

CHAN FRAMEWORK AND LEGEND where a and b are two pieces of digital information, each representing a unit of code, Code Unit (being the basic unit of code of a certain number of binary bits of 0s and 1s). The content or the value of Code Units, represented by a certain number of binary bits of 0s and 1s, is read one after another; for instance a is read as the first Code Unit, and b the second; a piece of digital information constitutes a Code Unit, and two such Code Units in Diagram 1 constitute a Processing Unit (the number of Code Units a Processing Unit contains could vary, depending on the schema and techniques used in the coding design, which is decided by the code designer and which could therefore be different from the case used in the present illustration using Diagram 1); for convenience and ease of computation, each Code Unit is best of equal definition, such as in terms of bit size for one cycle of coding process, using the same number scale without having to do scale conversion; consistency and regularity in encoding and decoding is significant to the successful recovery of digital information losslessly after encoding. Consistency and regularity in encoding and decoding means that the handling of digital information follows certain rules so that logical deduction could be employed in encoding and decoding in such ways that digital information could be translated or transformed, making possible alteration of data distribution such as changing the ratio of binary bits 0 and binary bits 1 of the digital information, and dynamic code adjustment (including code promotion, code demotion, code omission, and code restoration). Such rules for encoding and decoding are determined by the traits and contents of the Code Units or Processing Units or Super Processing Units and the associated schema, design and method of encoding and decoding used.
Such rules or logic of encoding and decoding could be recorded into the encoded code or the header using binary bit(s) as indicators or embedded in the encoder and decoder where consistency and regularity of the schema, design and method of encoding and decoding allows; the Code Unit could be expressed and represented on any appropriate number scale of choice, including binary scale, octal, hexadecimal, etc.; the size of Code Unit, Code Unit Size, could be of any appropriate choice of size, for instance on binary scale, such as 4 bits or 8 bits or 16 bits or 32 bits or 64 bits or any bit size convenient for computation could be used as Code Unit Size (the definition of Code Unit will be improved upon beginning from Paragraph [55]); the digital number or value of each Code Unit represents the digital content of the Code Unit, the digital number signifying the bit signs of all the bits of the Code Unit; and the relations between the Code Units used could be designed, found out and described; to show how CHAN CODING works using two Code Units as a demonstration of the concept and the techniques used, it is to be defined using mathematical formula(e) as follows: where a and b are the two Code Units making up one Processing Unit in CHAN CODING applied in the present schema in Diagram 1, each being the digital number representing the content or values of the digital information conveyed in the respective Code Units; a being read before b; where a could be a bigger or lesser value than b, and one could use another two variable names to denote the ranking in value of these two Code Units:

A, being the bigger value of the two Code Units;

B, being the smaller value of the two Code Units; and where a and b are equal in value, then the one read first is to be A and the second one B; so A is bigger than or equal in value to B; and so a could be A or B, depending on its value in relation to b. Where, in view of the above, a bit, the RP Bit (i.e. the Rank and Position Bit), has to be used to indicate whether the first Code Unit has a bigger / equal or smaller value than the second one; this bit of code therefore signifies the relation between the position and ranking of the values of the two Code Units read; where, to encode a and b, one could simply add the values of a and b together into one single value, using a bit size of Code Unit Size plus one bit as follows:

Diagram 2

Before encoding, the data of Diagram 2 is as in Diagram 1, assuming Code Unit Size of 64 bits, having a Processing Unit with two Code Units:

Diagram 3

After encoding, the resultant Code, the CHAN CODE, consisting of RP Piece and CV Piece:

where the RP Bit (1 bit), the first piece, the RP Piece of CHAN CODE and the combined value of a and b, A+B, (65 bits, i.e. 64 bits plus one, being bit size of Code Unit Size plus one bit), i.e. the second piece, the Coded Value Piece or Content Value Piece (the CV Piece), of CHAN CODE makes up the resultant coded CHAN CODE, which also includes the associated header information, necessary for indicating the number of encoding cycles that has been carried out for the original digital information as well as necessary for remainder code processing. Such header information formation and processing has been mentioned in another PCT Patent Application, PCT/IB2015/056562, dated August 29, 2015 that has also been filed by the present inventor and therefore it is not repeated here.
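The two-Code-Unit encoding just described can be sketched in code. This is a minimal illustration, not the patent's reference implementation; the function name and the 64-bit Code Unit Size are assumptions made for the example:

```python
# Sketch of encoding one Processing Unit of two Code Units into an RP Bit
# (the RP Piece) plus a CV Piece holding the combined value A+B. With a
# Code Unit Size of 64 bits, A+B needs at most 65 bits (Code Unit Size
# plus one bit).

CODE_UNIT_SIZE = 64  # assumed Code Unit Size for this illustration

def encode_processing_unit(a: int, b: int) -> tuple[int, int]:
    """Return (rp_bit, cv_piece) for two Code Unit values a and b.

    rp_bit is 0 when the first value read is the bigger/equal one
    (a is A), 1 otherwise; cv_piece is the combined value A+B.
    """
    rp_bit = 0 if a >= b else 1
    cv_piece = a + b
    assert cv_piece < 2 ** (CODE_UNIT_SIZE + 1)  # fits in 65 bits
    return rp_bit, cv_piece

rp, cv = encode_processing_unit(100, 50)  # rp = 0, cv = 150
```

As the description notes further on, the RP Bit and A+B alone are not yet enough to separate A and B again; a further sub-piece is needed for lossless recovery.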

People skilled in the Art could easily make use of header processing mentioned in the aforesaid PCT Patent Application or in other designs together with the resultant CHAN CODE, i.e. the RP Piece and the CV Piece of the CHAN CODE, for decoding purpose. As to be revealed later in the present invention, the CV piece could be further subdivided into sub-pieces when more Code Units are to be used according to schema and method of encoding and decoding so designed;

RP Piece is a piece of code that represents certain trait(s) or characteristic(s) of the corresponding Code Units, representing the characteristics of Rank and Position between the two corresponding Code Units of a Processing Unit here. And RP Piece is a subset of code to a broader category of code, which is named as Traits Code or Characteristics Code or Classification Code (so called because of the traits or the characteristics concerned being used to classify or group Code Units with similar traits or characteristics). The CV Piece represents the encoded code of the content of one or more Code Units. Sometimes, depending on the schema and method of encoding and decoding, part of the content of the Code Units is extracted to become the Classification Code, so that what is left in the CV Piece is just the remaining part of the content of the corresponding Code Units. The CV Piece constitutes the Content Code of CHAN CODE. So depending on the schema and method of encoding and decoding, CHAN CODE therefore includes at least Content Code, and where appropriate plus Classification Code, and where appropriate or necessary plus other Indicator Code as contained in or mixed inside with the Content Code itself or contained in the Header, such as for instance the Coding method or Code mapping table being used in processing a Super Processing Unit. This will be apparent in the description of the present invention in due course. Up to here, CHAN FRAMEWORK contains the following elements: Code Unit, Processing Unit, Super Processing Unit, Un-encoded Code Unit (containing un-encoded Code), Header Unit (containing indicators used in the Header of the digital information file, applied to the whole digital data file), Content Code Unit, and where appropriate plus Classification Code Unit (used hereafter meaning the aforesaid Traits Code or Characteristics Code Unit), and Indicator Code mixed inside with Content Code (for instance, specific to the respective Super Processing Unit). This framework will be further refined and elaborated in due course.

[15] After finding out the relations of the components, the two basic Code Units of the Processing Unit, i.e. the Rank and Position as well as the sum listed out in Paragraph [14] above, such relations are represented in the RP Piece and the CV Piece of CHAN CODE using the simplest mathematical formula, A+B in the CV Piece. The RP Piece simply contains 1 bit, either 0 or 1, indicating Bigger / Equal or Smaller in value of the first value a in relation to the second value b of the two Code Units read in one Processing Unit.

[16] Using the previous example, and on the 64 bit personal computers prevalent in the market today, if each Code Unit of 64 bits on binary scale is represented using 64 bits, there could be no compression or encryption possible. So more than 1 Code Unit has to be used as the Processing Unit for each encoding step made. A digital file of digital information has to be broken down into one or more Processing Units or Super Processing Units for making each of the encoding steps, and the encoded codes of each of the Processing Units or Super Processing Units thus made are elements of CHAN CODE, consisting of one RP Piece and one CV Piece for each Processing Unit, a unit of CHAN CODE in the present case of illustration. The digital file of the digital information after compression or encryption using CHAN CODING therefore consists of one or more units of CHAN CODE, being the CHAN CODE FILE. The CHAN CODE FILE, besides including CHAN CODE, may also include, but not necessarily, any remaining un-encoded bits of original digital information, the Un-encoded Code Unit, which does not make up to one Processing Unit or one Super Processing Unit, together with other added digital information representing the header or the footer which is usually used for identifying the digital information, including the check-sum and the signature or indicator as to when the decoding has to stop, or how many cycles of encoding or re-encoding were made, or how many bits of the original un-encoded digital information are present in the beginning or at the end or somewhere as indicated by the identifier or indicator in the header or footer. Such digital information left not encoded in the present encoding cycle could be further encoded during the next cycle if required. This invention does not cover how such additional digital information is to be designed, to be placed and used. The use of such additional digital information will be mentioned where appropriate for the purpose of clarifying how it is to be used. This invention therefore mainly covers the CHAN CODE produced by the techniques and methods used in encoding and decoding, i.e. CHAN CODING within CHAN FRAMEWORK. CHAN CODE could also be divided into two or more parts to be stored; for instance, sub-pieces of CHAN CODE may be separately stored into separate digital data files for use in decoding or for delivery for convenience or for security sake.
The CHAN CODE Header or Footer could also be stored in another separate digital data file and delivered for the same purposes. Files consisting such CHAN CODE and CHAN CODE Header and Footer files are all CHAN CODE FILES, which is another element added to CHAN FRAMEWORK defined in Paragraph [14].

[17] CHAN CODE is the encoded code produced using CHAN CODING. CHAN CODING produces encoded code or derived code out of the original code. Used in making compression, CHAN CODE therefore represents the compressed code (if compression is made possible under the schema and method used), which uses fewer bits than the original code, whether random or not in data distribution. Random data over a certain size tends to be even, i.e. the ratio between bits 0 and bits 1 being one to one. CHAN CODE represents the result of CHAN CODING, and in the present example is produced by using the operation specified by the corresponding mathematical formula(e), i.e. the value of the RP Piece and the addition operation for making calculation and encoding, the mathematical formula(e) expressing the relations between the basic components, the Code Units, of the Processing Unit and producing a derived component, i.e. A+B in the CV Piece in Diagram 3. Using the rules and logic of encoding described above, the original code could be restored. The RP Piece represents the indicator information, indicating the Rank and Position Characteristics of the two Code Units of a Processing Unit, produced by the encoder for the recovery of the original code to be done by the decoder. This indicator, specifying the rule of operation to be followed by the decoder upon decoding, is included in the resultant encoded code. The rule of operation represented by the mathematical formula, A+B, however could be embedded in the encoder and decoder because of its consistency and regularity of application in encoding and decoding. Derived components are components made up of one or more basic components or together with other derived component(s) after being operated on by certain rule(s) of operation such as represented by mathematical formula(e), for instance including addition, subtraction, multiplication or division operations.

CHAN CODE, as described above, obtained after the processing through using CHAN CODING, includes the digital bits of digital information, organized in one unit or broken down into sub-pieces, representing the content of the original digital information, whether random or not in data distribution, that could be recovered correctly and losslessly. The above example of course does not allow for correct lossless recovery of the original digital information. It requires, for instance, another mathematical formula, such as A minus B, and the corresponding piece of Content Code to be present before the original digital information could be restored. The above example is just used for the purpose of describing and defining CHAN FRAMEWORK and its elements so far. After the decision being made on the selection of the number scale used for computation, the bit size of the Code Unit and the components for the Processing Unit (i.e. the number of the Code Units for one Processing Unit; the simplest case being the use of just two Code Units for one Processing Unit as described above) and their relations being defined in mathematical formula(e) and being implemented in executable code used in digital computers when employed, what CHAN CODING does for encoding when using mathematical formula(e) as rules of operation (there being other techniques to be used as will be revealed in due course) includes the following steps: (1) read in the original digital information, (2) analyze the digital information to obtain its characteristics, i.e.
the components of the Compression Unit and their relations, (3) compute, through applying mathematical formula or formulae designed, which describe the characteristics of or the relations between the components of the original digital information so obtained after the analysis of CHAN CODING, that the characteristics of the original digital data are represented in the CHAN CODE; if compression is made possible, the number of digital bits of CHAN CODE is less than the number of digital bits used in the original code, whether in random data distribution or not; the CHAN CODE being a lossless encoded code that could be restored to the original code lossless on decoding [using mathematical formula(e) and the associated mathematical operations in encoding does not necessarily make compression possible, which for instance depends very much on the formula(e) designed together with the schema and method used, such as the Code Unit Definition and the technique of Code Placement]; and (4) produce the corresponding CHAN CODE related to the original digital information read in step (1). What the CHAN CODING does for decoding the corresponding CHAN CODE back into the digital original code includes the following steps: (5) read in the corresponding CHAN CODE, (6) obtain the characteristics of the corresponding CHAN CODE, (7) apply in a reverse manner mathematical formula(e) so designed, which describe the characteristics of or the relations between the components of the original digital information so obtained after the analysis of CHAN CODING, to the CHAN CODE, including the use of normal mathematics and COMPLEMENTARY MATHEMATICS; (8) produce after using step (7) the original code of the original digital information lossless, whether the original digital information is random or not in data distribution. So on decoding, the CHAN CODE in Diagram 3 is restored correctly and losslessly to the original digital data code in Diagram 2. 
This could be done because of using another inventive feature, broadening the definition of Code Unit to provide a more flexible and novel way of data description, later to be introduced to CHAN FRAMEWORK beginning at Paragraph [55]. Of course, up to now before revealing this inventive feature, it is not the case, as another CV sub-piece representing A minus B, for instance, is missing in the above Diagrams; even if this CV sub-piece is present, using the existing Code Unit definition (Code Unit being defined in terms of unified bit sizes), the resultant CHAN CODE is not guaranteed to be less in size than the original code, depending on the schema and method used, such as the Code Unit Definition and the technique of Code Placement, as well as the original data distribution. But with the presence of the CV sub-piece representing the result of the operation of the mathematical formula, A minus B, the resultant CHAN CODE could be regarded as an encrypted code that could be used to recover the original digital code correctly and losslessly; the encrypted CHAN CODE so produced however may be an expanded code, not necessarily a compressed code. The method and the associated techniques for producing compressed code out of digital data whether random or not in distribution, putting an end to the myth of the Pigeonhole Principle in Information Theory, will be revealed in due course later when discussing the inventive feature of novel Code Unit definition.
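The recovery logic described above, with a second CV sub-piece holding A minus B alongside A plus B, can be sketched as follows. This is an illustration under the paragraph's own caveat, not a compressing implementation: the encoded form may be larger than the original, and the function names are assumptions for the example:

```python
# Sketch: encode a Processing Unit of two Code Units into an RP Bit and
# two CV sub-pieces (A+B and A-B), then recover (a, b) losslessly.

def encode(a: int, b: int) -> tuple[int, int, int]:
    """Return (rp_bit, A+B, A-B) for the two Code Unit values a and b."""
    rp = 0 if a >= b else 1              # RP Bit: position of the bigger value
    A, B = (a, b) if a >= b else (b, a)  # A is the bigger/equal value
    return rp, A + B, A - B

def decode(rp: int, s: int, d: int) -> tuple[int, int]:
    """Recover (a, b) from the RP Bit and sub-pieces s = A+B, d = A-B."""
    A = (s + d) // 2                     # (A+B) + (A-B) = 2*A
    B = (s - d) // 2                     # (A+B) - (A-B) = 2*B
    return (A, B) if rp == 0 else (B, A)

assert decode(*encode(100, 50)) == (100, 50)
assert decode(*encode(50, 100)) == (50, 100)
```

The RP Bit is what allows the ranked values A and B to be put back into their original read positions a and b.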

[19] To pave the way for understanding the concept of range and its use in applying the technique of Absolute Address Branching, the use of which could help compress random data together with the use of the aforesaid inventive feature of novel Code Unit definition, an explanation of what COMPLEMENTARY MATHEMATICS does is given below in the following Diagram:

Diagram 4

COMPLEMENTARY MATHEMATICS

CC - A = A^c and A^c + A = CC; or

B^c + B = CC; and

(A^c + B) = (CC - A) + B

where CC is Complementary Constant or Variable, being a Constant Value or Variable Value chosen for the operation of COMPLEMENTARY MATHEMATICS, which is defined as using the Complementary Constant or Variable (it could be a variable when different Code Unit Size is used in different cycles of encoding and decoding) to make mathematical calculation or operation having addition and subtraction logic as explained in the present invention. Depending on the situation, more than one Complementary Constant or Variable could be designed and used for different operations or purposes where necessary or appropriate;

A is the value being operated on; the example used here is the Rank Value A, A being bigger than or equal in value to B in the present case of using two Code Unit Values only; so in the first formula:

CC - A = A^c

where CC minus A is equal to A Complement, i.e. denoted by A^c, which is the Complementary Value of A, or a mirror value, using the respective Complementary Constant or Variable; for instance, let CC be a constant of the maximum value of the Code Unit Size, such as 8 bits having 256 values; then CC is 256 in value; and let A be 100 in value, then A^c is equivalent to 256 minus 100 = 156; and the reverse operation is therefore A^c + A = CC, representing the operation of 100 + 156 = 256; and in the fourth formula, (A^c + B) = (CC - A) + B; and let B be 50, then A^c + B = (256 - 100) + 50 = 156 + 50 = 206.
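The complement operation of Diagram 4 can be illustrated with a short sketch, using the text's own example values (CC = 256 for an 8-bit Code Unit with 256 values; the function name is an assumption for the example):

```python
CC = 256  # Complementary Constant: value count of an 8-bit Code Unit

def complement(x: int) -> int:
    """Return the mirror value x^c = CC - x under the Complementary Constant."""
    return CC - x

A, B = 100, 50
A_c = complement(A)             # 256 - 100 = 156
assert A_c + A == CC            # the reverse operation: A^c + A = CC
assert A_c + B == (CC - A) + B  # the fourth formula: 156 + 50 = 206
```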

[20] Diagram 4 gives the logic of the basic operations of the COMPLEMENTARY MATHEMATICS invented by the present inventor that is sufficient for the decoding process to be introduced later. However, for a more complete illustration of the addition and subtraction operations of COMPLEMENTARY MATHEMATICS, such logic is defined and explained in Diagram 5 below:

Diagram 5

More Logic of COMPLEMENTARY MATHEMATICS defined:

CC - (A + B) = (A+B)^c or = CC - A - B; and

CC - (A - B) = A^c + B; and

CC - A + B may be confusing; this should better be represented clearly as: either

(CC - A) + B = A^c + B or

(CC - B) + A = B^c + A or

CC - (A + B) = CC - A - B;

so to further illustrate the above logic of the subtraction operations of COMPLEMENTARY MATHEMATICS, let CC be 256, A be 100 and B be 50, then

CC - (A + B) = (A+B)^c or = A^c - B or = B^c - A

i.e. 256 - (100 + 50) = (100 + 50)^c = 256 - 150 = 106 = A^c - B = 156 - 50 = 106 or

B^c - A = 206 - 100 = 106; and

CC - (A - B) = A^c + B

i.e. 256 - (100 - 50) = 256 - (50) = 206 = 156 + 50 = 206; and

(CC - A) + B = A^c + B

i.e. (256 - 100) + 50 = 156 + 50 = 206 or

(CC - B) + A = B^c + A

i.e. (256 - 50) + 100 = 206 + 100 = 306

[21] Using the above logic of the addition and subtraction operations of COMPLEMENTARY MATHEMATICS, one could therefore proceed with showing more details about how COMPLEMENTARY MATHEMATICS works in the following Diagram 6:

Diagram 6

Operation on Data Values or Data Ranges using COMPLEMENTARY MATHEMATICS

Let CC be 256, A be 100 and B be 50

(1) normal mathematical processing:

divide 150 by 2, i.e. get the average of A and B:

= (A+B)/2 = 1/2 A + 1/2 B = 75; and since A is the bigger value in A+B, therefore

= A - 1/2(A-B) = 100 - 1/2(100-50) = 100 - 1/2(50) = 100 - 25 = 75; and

= B + 1/2(A-B) = 50 + 1/2(100 - 50) = 50 + 1/2(50) = 50 + 25 = 75;

(2) COMPLEMENTARY MATHEMATICS processing:

make an operation of (CC - A) + B, i.e. operating CC on A, not B:

= (CC - A) + B = A^c + B = (256 - 100) + 50 = 156 + 50 = 206; noting that to do the operation in this step, A and B must be separated first; the step is meant to illustrate the operation of COMPLEMENTARY MATHEMATICS here

(3) CHAN CODING using CHAN MATHEMATICS (normal mathematical processing and COMPLEMENTARY MATHEMATICS processing):

add the result of Step (1) to the result of Step (2), using A - 1/2(A-B):

= A^c + B + A - 1/2(A-B) = A^c + A + B - 1/2(A-B)

= CC + B - 1/2(A-B) = 256 + 50 - 1/2(100-50)

= 256 + 50 - 25

= 256 + 25;

(4) CHAN CODING using CHAN MATHEMATICS:

subtract CC from the result of Step (3):

= [CC + B - 1/2(A-B)] - CC = B - 1/2(A-B) = [256 + 50 - 25] - 256

= [50 - 25];

(5) CHAN CODING using CHAN MATHEMATICS:

add the result of Step (1) to Step (4), using B + 1/2(A-B):

= [B - 1/2(A - B)] + [B + 1/2(A - B)]

= 2B

= [50 - 25] + [50 + 25]

= 25 + 75

= 100

(6) normal mathematical processing:

divide 2B by 2 to get the value of B:

= 2B/2 = B

= 100/2 = 50

(7) normal mathematical processing:

get the value of A by subtracting B from A+ B:

= A+B - B

= 150 - 50

= 100

The above serves to show the differences amongst normal mathematical processing, COMPLEMENTARY MATHEMATICS processing, and CHAN CODING using CHAN MATHEMATICS.
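The sequence of operations in Diagram 6 can be transcribed directly into code as a check on the arithmetic. This is only a transcription of the worked example, with the same assumed values CC = 256, A = 100 and B = 50:

```python
# Step-by-step transcription of Diagram 6, combining normal mathematical
# processing with COMPLEMENTARY MATHEMATICS to recover B and then A.

CC, A, B = 256, 100, 50  # values assumed in Diagram 6

avg = (A + B) // 2                  # Step (1): (A+B)/2 = 75
step2 = (CC - A) + B                # Step (2): A complement plus B = 206
step3 = step2 + (A - (A - B) // 2)  # Step (3): CC + B - 1/2(A-B) = 281
step4 = step3 - CC                  # Step (4): B - 1/2(A-B) = 25
two_b = step4 + (B + (A - B) // 2)  # add B + 1/2(A-B): gives 2B = 100
b = two_b // 2                      # divide by 2: B = 50
a = (A + B) - b                     # subtract B from A+B: A = 100
assert (a, b) == (100, 50)
```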

[22] COMPLEMENTARY MATHEMATICS performed in Step (2) above could be made only after A and B are separated and known beforehand; therefore another piece of data information, i.e. (A - B), has to be added (i.e. before the novel Code Unit Definition, which is to be revealed later, being invented), so that A and B could be separated using the formulae (A+B) + (A - B) = 2*A and 2*A / 2 = A as well as (A+B) - (A - B) = 2*B and 2*B / 2 = B. And Step (2) just shows how COMPLEMENTARY MATHEMATICS works when operating on such basic components. Using the RP Bit, A and B after separation could be restored correctly to the position of the first value and the second value read as a and b.

[23] COMPLEMENTARY MATHEMATICS does not directly help to meet the challenge of the Pigeonhole Principle in Information Theory. However it does highlight the concept of using range for addition and subtraction of data values and the concept of a mirror value given a Complementary Constant or Value. It is with this insight of range that the challenge of Pigeonhole Principle in Information Theory is met with successful result as range is essential in the operation of using Absolute Address Branching or latent in the way how data value or number is to be represented and defined.

Before confirming the end to the myth of the Pigeonhole Principle in Information Theory, the present invention reveals in greater detail how using mathematical formula(e) under CHAN FRAMEWORK could produce countless algorithms for encoding and decoding. The illustration begins with Diagram 7, in which four Code Units, four basic components, make up one Processing Unit:

In most cases, the four basic components of a Processing Unit could be arranged into 3 Arms, i.e. the Long Arm, the Middle Arm and the Short Arm, with 2 pairs of basic components, representing the Upper Corner (being the pair of the two basic components with a bigger sum) and the Lower Corner (being the pair of the two basic components with a smaller sum) of the respective arms. However, in rare cases the values of these pairs happen to be the same in one way or another, so that there may be fewer than 3 arms, such as only 2 arms or 1 arm or even becoming a dot shape. Therefore the distribution of the values of the four basic components of a Processing Unit could be represented in different CHAN SHAPES as follows:

Diagram 7

CHAN SHAPES

CHAN DOT ·

This is where all four basic components have the same value;

CHAN LINES

There are 2 CHAN LINES as follows:

CHAN LINE 1: The three arms all overlap together with the Short Arm having values [1]+[4] being the Upper Corner and [2]+[3] being the Lower Corner.

CHAN LINE 2: The three arms all overlap together with the Short Arm having values [2]+[3] being the Upper Corner and [1]+[4] being the Lower Corner.

CHAN TRIANGLE

There are 2 arms, the Long Arm and the Middle Arm, and the Short Arm becomes a dot as its pair of values [1]+[4] and [2]+[3] are equal.

CHAN RECTANGLES AND TRAPEZIA AND SQUARES

CHAN RECTANGLE 1 showing the incoming stream of data values of 4 Code Units one after the other in sequence

The above CHAN RECTANGLE shows that the first Code Unit Value a of the Processing Unit is B, the second in ranking amongst the four Code Units; the second Code Unit Value b is C, the third in ranking; the third Code Unit Value c is A, the first in ranking; and the fourth Code Unit Value d is D, the last in ranking.

CHAN TRAPEZIA showing the relationship between the four basic components of the CHAN RECTANGLES

CHAN TRAPEZIUM 1

Upper Corners of the 3 arms are [1]+[2], [1]+[3] and [1]+[4] and

Lower Corners of the 3 arms are [3]+[4], [2]+[4] and [2]+[3].

CHAN TRAPEZIUM 1 shows the relationship amongst the four basic components, the four values of the four Code Units shown in CHAN RECTANGLE 2, where A is re-denoted by [1], B by [2], C by [3] and D by [4]; and (A+B) = [1]+[2], (A - B) = [1]-[2], and the like in the same way. It could be seen that the values of the four basic components of the Processing Unit [1], [2], [3] and [4] could be arranged into three arms, being ([1]+[2]) - ([3]+[4]) i.e. the Long Arm, ([1]+[3]) - ([2]+[4]) the Middle Arm and ([1]+[4]) - ([2]+[3]) the Short Arm. The sum of the values of [1]+[2]+[3]+[4] is always the same for all the three arms. The differences amongst the three arms are reflected in their lengths, i.e. the differences in values between the upper corners and lower corners of the three arms.

The Long Arm and Middle Arm always stay the same way in ranked value arrangement. The Upper Corner and Lower Corner of the Short Arm however would swap depending on the value distribution of the four basic components. So there are two scenarios: either [1]+[4] is bigger in value than [2]+[3] as in CHAN TRAPEZIUM 1, or the other way round, which is represented in CHAN TRAPEZIUM 2 as follows:

CHAN TRAPEZIUM 2

Upper Corners of the 3 arms are [1]+[2], [1]+[3] and [2]+[3] and Lower Corners of the 3 arms are [3]+[4], [2]+[4] and [1]+[4], the Upper Corner and Lower Corner of the Short Arm being swapped as compared with CHAN TRAPEZIUM 1.

In CHAN TRAPEZIUM 1, the values of the Long Arm, the Middle Arm and the Short Arm could be redistributed as follows:

Long Arm = ([1]+[2]) - ([3]+[4]) = ([1]-[4]) + ([2]-[3]) = ([1]-[3]) + ([2]-[4]);

Middle Arm = ([1]+[3]) - ([2]+[4]) = ([1]-[4]) - ([2]-[3]) = ([1]-[2]) + ([3]-[4]); and

Short Arm = ([1]+[4]) - ([2]+[3]) = ([1]-[3]) - ([2]-[4]) = ([1]-[2]) - ([3]-[4]).

In CHAN TRAPEZIUM 2, the values of the Long Arm, the Middle Arm and the Short Arm could also be redistributed as follows:

Long Arm = ([1]+[2]) - ([3]+[4]) = ([1]-[4]) + ([2]-[3]) = ([2]-[4]) + ([1]-[3]);

Middle Arm = ([1]+[3]) - ([2]+[4]) = ([1]-[4]) - ([2]-[3]) = ([3]-[4]) + ([1]-[2]); and

Short Arm = ([2]+[3]) - ([1]+[4]) = ([2]-[4]) - ([1]-[3]) = ([3]-[4]) - ([1]-[2]).

So in CHAN TRAPEZIUM 1 and 2, the Long Arm is always equal to or bigger than the Middle Arm by 2*([2]-[3]).

But because of the two possible scenarios of swapping in values of the Upper Corner and Lower Corner of the Short Arm, in CHAN TRAPEZIUM 1, the Long Arm is always equal to or bigger than the Short Arm by 2*([2]-[4]) and the Middle Arm is always equal to or bigger than the Short Arm by 2*([3]-[4]).

And in CHAN TRAPEZIUM 2, the Long Arm is always equal to or bigger than the Short Arm by 2*([1]-[3]) and the Middle Arm is always equal to or bigger than the Short Arm by 2*([1]-[2]).
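The arm relationships above can be checked with a small sketch. The variable names v1 to v4 stand for the ranked values [1] to [4], and the sample values are illustrative only:

```python
def arms(v1: int, v2: int, v3: int, v4: int) -> tuple[int, int, int]:
    """Return (Long Arm, Middle Arm, Short Arm) lengths for ranked
    values v1 >= v2 >= v3 >= v4 (the patent's [1], [2], [3], [4])."""
    long_arm = (v1 + v2) - (v3 + v4)
    middle_arm = (v1 + v3) - (v2 + v4)
    # The Short Arm's corners may swap (CHAN TRAPEZIUM 1 vs 2), so take
    # the absolute difference as its length.
    short_arm = abs((v1 + v4) - (v2 + v3))
    return long_arm, middle_arm, short_arm

v1, v2, v3, v4 = 90, 70, 40, 10
long_arm, middle_arm, short_arm = arms(v1, v2, v3, v4)
assert long_arm - middle_arm == 2 * (v2 - v3)  # Long exceeds Middle by 2*([2]-[3])
assert long_arm >= middle_arm >= short_arm
```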

CHAN TRAPEZIUM 3 or CHAN SQUARE 1

This is where the Middle Arm overlaps with the Long Arm with the Upper Corner and Lower Corner of the Short Arm being [1]+[4] and [2]+[3] respectively. If the two arms therein are not equal in length, it is a trapezium, otherwise it is a square:

CHAN TRAPEZIUM 4 or CHAN SQUARE 2

This is where the Middle Arm overlaps with the Long Arm with the Upper Corner and Lower Corner of the Short Arm being [2]+[3] and [1]+[4] respectively. If the two arms therein are not equal in length, it is a trapezium, otherwise it is a square:

CHAN TRAPEZIUM 5 or CHAN SQUARE 3

This is where the Short Arm overlaps with the Middle Arm with the Upper Corner and Lower Corner of the Short Arm being [1]+[4] and [2]+[3] respectively. If the two arms therein are not equal in length, it is a trapezium, otherwise it is a square:

CHAN TRAPEZIUM 6 or CHAN SQUARE 4

This is where the Short Arm overlaps with the Middle Arm with the Upper Corner and Lower Corner of the Short Arm being [2]+[3] and [1]+[4] respectively. If the two arms therein are not equal in length, it is a trapezium, otherwise it is a square:

To make possible data encoding and decoding in the present illustration, the four values of the four basic components have to be represented by 1 CV Piece consisting of 4 sub-pieces of values (produced by the use of four formulae designed for such purpose; one could attempt to make use of three or fewer formulae, though so far the efforts do not appear to show promising results; one however should not rule out such possibility as there are plenty of opportunities to introduce new techniques to CHAN FRAMEWORK as the present invention will show in this Application) in addition to the RP Piece, which is used to indicate the relationship between the Position and Rank of the values of the 4 basic components as shown in the following Diagram 8:

Diagram 8

CHAN RECTANGLES showing details of the positions and ranking of the four incoming basic components and the resultant CHAN CODE

CHAN RECTANGLE 3 showing the Ranking and Position of the incoming stream of data values of 4 Code Units and the 64 bit size used

CHAN RECTANGLE 4 showing CHAN CODE, the compressed code created by using CHAN CODING, with details of the RP Piece and CV Piece

One very distinguishing characteristic of the present invention is the varying bit sizes of the values of the 4 sub-pieces making up the CV Piece; the RP Piece itself varies between 4 bits and 5 bits; and despite their varying bit sizes, the CHAN CODING techniques to be revealed later could be used to decode the relevant CHAN CODE and restore it losslessly and correctly back into the original incoming digital data codes. For the purpose of making compression, the varying bit sizes used are intended to raise the compression ratio further, through using CHAN CODING techniques, over the compression ratio that could be achieved using mathematical formulae alone.

The RP Piece is to be explained here first. The RP Piece is used for indicating the relative positions of the 4 Ranked Values of the four basic components, the four Code Units, of a Processing Unit. Because the Ranking of the four basic components may vary with their positions, there is no fixed rule for determining the relationship between the position and ranking of the values of the four basic components. There are altogether 24 combinations of Position and Ranking, as shown in the following Diagram 9:

Diagram 9

Rank Position Code Table

Possible Combinations of Positions and Ranks of the 4 Basic Components

As there are altogether 24 variations of Rank and Position of the values of the four basic components in combination, one would normally have to use 5 bits to house and indicate these 24 variations of the Rank and Position Combination, so that on decoding the correct Rank and Position of the values of the four basic components could be restored correctly, i.e. the four ranked values of the basic components could be placed back into their correct positions corresponding to the positions of these values in the incoming digital data input. However, a technique called Absolute Address Branching could be used to avoid wasting space: 5 bits provide 32 seats for housing only 24 variations, and 8 seats would be left empty and wasted if Absolute Address Branching were not used.

[28] To take the simplest case, suppose there are only 3 values; then normally 2 bits have to be used to provide 4 seats for the 3 variations of values. However, when Absolute Address Branching is used, for the case where the value = 1, only 1 bit is used, and for the cases where the value = 2 or = 3, 2 bits have to be used. The retrieving process works as follows: (1) read 1 bit first; (2) if the bit is 0, representing the value 1, there is no need to read a second bit; if the bit is 1, then a second bit has to be read: if the second bit is 0, the value is 2, and if the second bit is 1, the value is 3. This saves some space for housing the 3 values in question: 1/3 of the cases or variations use 1 bit and the other 2/3 of the cases or variations use 2 bits for indication.
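The 3-value retrieving process described above can be sketched as a small variable-length code; this is a minimal illustration with hypothetical helper names, not part of the claimed invention:

```python
# Absolute Address Branching for 3 values:
# value 1 -> "0" (1 bit); values 2 and 3 -> "10" and "11" (2 bits).

def aab_encode(value):
    """Encode one of the values 1, 2, 3 as a variable-length bit string."""
    return {1: "0", 2: "10", 3: "11"}[value]

def aab_decode(bits, pos=0):
    """Read bits starting at pos; return (value, next position)."""
    if bits[pos] == "0":          # first bit 0 -> value 1, stop reading
        return 1, pos + 1
    if bits[pos + 1] == "0":      # second bit 0 -> value 2
        return 2, pos + 2
    return 3, pos + 2             # second bit 1 -> value 3

# Round trip a short stream of values.
stream = "".join(aab_encode(v) for v in [1, 3, 2, 1])
decoded, pos = [], 0
while pos < len(stream):
    v, pos = aab_decode(stream, pos)
    decoded.append(v)
assert decoded == [1, 3, 2, 1]
```

Because the codes are prefix-free, the decoder knows after the first bit whether a second bit needs to be read, which is exactly the space saving described.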

[29] So using Absolute Address Branching, 8 variations out of the 24 variations require only 4 bits to house and the remaining 16 variations require 5 bits. That is, 4 bits provide only 16 seats while 5 bits provide 32 seats. As there are 24 variations, 8 variations exceed the seats provided by 4 bits, so 8 of the 16 seats provided by 4 bits have to be reserved for representing 2 variations each. One could then read 4 bits first; if it is found that the value is between 1 and 8, one could stop and does not have to read in another bit. However, if after reading 4 bits the value is between 9 and 16, then for these 8 variations one has to read in another bit to determine which value it represents. For instance, after 9 is determined, it could represent 9 or another value such as 17, so one has to read in another bit: if it is 0, the value stays as 9, and if it is 1, the value is 17, representing a Rank Position Code having a value of 17, indicating the RP pattern that the values of [1], [2], [3] and [4] have to be put into the positions 3, 4, 1 and 2 correspondingly, by referring to and looking up the Rank Position Code Table in Diagram 9 above. Absolute Address Branching is therefore a design in which an address, instead of indicating one value as it normally does, could branch to identify 2 or more values using one or more extra bits, depending on design. It is used when the range limit is known, i.e. the maximum possible combinations or options from which a variable value is to be chosen for its determination.

For instance, in the above RP Table, it is known that there are only 24 combinations of Rank and Position, so the maximum possible combinations are only 24, and this could be used as the range limit for indicating which particular value of the RP combination a Processing Unit has, indicating how the values of [1], [2], [3] and [4] are to be put into the first, the second, the third and the fourth positions of the incoming digital data stream. Because this range limit is known and Absolute Address Branching is used, on average fewer than 5 bits (about 4 2/3 bits over the 24 combinations, since 8 of them use 4 bits and 16 use 5 bits) are required instead of the 5 bits normally required for these 24 combinations.
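The 24-combination retrieval scheme just described can be sketched as follows; this is an illustrative sketch with hypothetical helper names, assuming codes 1 to 8 take 4 bits and codes 9 to 24 share the remaining 4-bit patterns with one extra branching bit:

```python
def rp_encode(v):
    """Encode Rank Position Code v (1..24) with 4 or 5 bits."""
    assert 1 <= v <= 24
    if v <= 8:                          # 8 codes use only 4 bits
        return format(v - 1, "04b")
    if v <= 16:                         # shared 4-bit pattern, extra bit 0
        return format(v - 1, "04b") + "0"
    return format(v - 9, "04b") + "1"   # shared 4-bit pattern, extra bit 1

def rp_decode(bits, pos=0):
    """Read 4 bits; branch with a 5th bit only when the value is 9..16."""
    v = int(bits[pos:pos + 4], 2) + 1
    pos += 4
    if v <= 8:
        return v, pos                   # no branching needed
    if bits[pos] == "1":
        v += 8                          # e.g. 9 branches to 17
    return v, pos + 1

# Every code round-trips correctly.
for v in range(1, 25):
    out, _ = rp_decode(rp_encode(v))
    assert out == v
```

Note the seat counting: 32 leaves minus 24 codes leaves 8 spare seats, which is why exactly 8 codes fit in 4 bits.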

[30] It now comes to the determination of the ranked values of the four basic components, A=[1], B=[2], C=[3] and D=[4]. To determine the values of [1], [2], [3] and [4], one could use formulae with respect to the CHAN RECTANGLES AND CHAN TRAPEZIA to represent the essential relations and characteristics of the four basic components, where the RP Piece as explained in Paragraph [29] above and the CV Piece altogether take up a bit size less than the total bit size taken up by the 4 incoming basic components, a, b, c and d, i.e. 4 times the size of the Compression Unit for a Processing Unit under the schema presented in the present invention using CHAN RECTANGLES AND CHAN TRAPEZIA as presented above.

[31] After meticulous study of the characteristics and relations between the four basic components making up a Processing Unit as represented in CHAN SHAPES, the following combination of formulae, represented in 3 sub-pieces of the CV Piece, is the first attempt at illustrating the principle at work behind them. There could be other similar formulae to be found and used, so the possibilities are not limited to, but include, the formulae presented below with reference to CHAN SHAPES. So this first attempt is:

(1) ([4] - 1/2 ([3] - [4]))

(2) ([l] - [4])

(3) (([2] - [3]) + 1/2 ([3] - [4]))

The above 3 values represented in the formulae of Step (1) to Step (3) are different from those presented in the PCT Application, PCT/IB2016/054732 filed on 05 August 2016, mentioned earlier. In that PCT Application, the use of COMPLEMENTARY MATHEMATICS combined with the use of Rank and Position Processing was asserted to be able to put an end to the myth of the Pigeonhole Principle in Information Theory. Upon more careful examination, it is found that the three formulae thus used are not good enough to achieve that end. So the right formulae and formula design are vital for the application of the techniques of CHAN CODING. In the aforesaid PCT Application, CHAN CODING is done using formulae designed using the characteristics and relations between the basic components, i.e. the Code Units of the Processing Unit, as expressed in CHAN SHAPES.

[32] Formulae Design is more an art than a science. Because one could not exhaust all combinations of the characteristics and relations between the different basic as well as derived components of the Processing Units, a novel mind will help in making a successful hunch at the right formulae to use. The 3 formulae used in the aforesaid PCT Application are designed in accordance with positive thinking, in order that, using the 3 formulae, one is able to reproduce the associated CHAN SHAPES, including CHAN TRAPEZIUM or CHAN SQUARE or CHAN TRIANGLE or CHAN DOT or CHAN LINE as the case may be. But it is found that, using this mindset, the basic components could not be separated out (or easily separated out, because the combinations for calculation could scarcely be exhausted) from their derived components as expressed in the 3 formulae so designed using the techniques introduced in that PCT Application.

[33] In order to meet the challenge of the Pigeonhole Principle in Information Theory, a novel mindset was lacking in the aforesaid PCT Application, and it is attempted here. When things do not work in the positive way, they might work in the reverse manner. This is also the mindset or paradigm associated with COMPLEMENTARY MATHEMATICS. So if the formulae designed to reproduce the correct CHAN SHAPE do not give a good result, one could try to introduce discrepancy into these 3 formulae. So the technique of Discrepancy Introduction is revealed in the present invention in order to show the usefulness of COMPLEMENTARY CODING, as well as the usefulness of the technique of Discrepancy Introduction itself during the formula design phase, in ending the myth of the Pigeonhole Principle in Information Theory with the use of all the techniques of CHAN CODING, of which Discrepancy Introduction and COMPLEMENTARY CODING may be useful ones.

[34] So during the design phase, the first step is to design the formulae for encoding as usual, so that CHAN SHAPES could be reproduced using the formulae so designed. For instance, using the example given in the aforesaid PCT Application, the 3 formulae of Step (1) to Step (3), from which the values and the encoded codes of the 3 sub-pieces of the CV Piece of CHAN CODE are derived and obtained, are:

(1) = ([1] - [4]);

(2) = ([2] - [3]); and

(3) = ([3] + [4]).

Using normal mathematics, Step (4) to Step (9) in the aforesaid PCT Application, cited below, reproduce the associated CHAN SHAPE as follows:

(4) = (1) + (2); i.e. Step (1) + Step (2)

= ([1] - [4]) + ([2] - [3]); upon re-arrangement or re-distribution of these 4 ranked values, leading to:

= ([1] + [2]) - ([3] + [4]); the Long Arm obtained;

= ([1] - [3]) + ([2] - [4]); for comparing the difference in length with other arms;

(5) = (1) - (2);

= ([1] - [4]) - ([2] - [3]);

= ([1] + [3]) - ([2] + [4]); the Middle Arm obtained;

= ([1] - [2]) + ([3] - [4]);

(6) = (1) + (3);

= ([1] - [4]) + ([3] + [4]);

= ([1] + [3]); the Upper Corner of the Middle Arm;

(7) = (2) + (3);

= ([2] - [3]) + ([3] + [4]);

= ([2] + [4]); the Lower Corner of the Middle Arm;

(8) = (6) + (7);

= ([1] + [3]) + ([2] + [4]); being the sum of [1] + [2] + [3] + [4], very useful for finding the Upper Corner of the Long Arm;

= ([1] + [2] + [3] + [4]);

(9) = (8) - (3);

= ([1] + [2] + [3] + [4]) - ([3] + [4]); where [3] + [4] = Step (3), given as the Lower Corner of the Long Arm;

= ([1] + [2]); the Upper Corner of the Long Arm;
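The derivation above can be checked numerically; the sketch below uses arbitrary illustrative ranked values and verifies that Steps (4) to (9) reproduce the arms and corners exactly as stated:

```python
# Illustrative ranked values, [1] >= [2] >= [3] >= [4].
r1, r2, r3, r4 = 13, 9, 6, 2

s1 = r1 - r4          # Step (1)
s2 = r2 - r3          # Step (2)
s3 = r3 + r4          # Step (3)

s4 = s1 + s2          # Long Arm: ([1]+[2]) - ([3]+[4])
s5 = s1 - s2          # Middle Arm: ([1]+[3]) - ([2]+[4])
s6 = s1 + s3          # Upper Corner of the Middle Arm: [1]+[3]
s7 = s2 + s3          # Lower Corner of the Middle Arm: [2]+[4]
s8 = s6 + s7          # sum of all four: [1]+[2]+[3]+[4]
s9 = s8 - s3          # Upper Corner of the Long Arm: [1]+[2]

assert s4 == (r1 + r2) - (r3 + r4)
assert s5 == (r1 + r3) - (r2 + r4)
assert s6 == r1 + r3
assert s7 == r2 + r4
assert s8 == r1 + r2 + r3 + r4
assert s9 == r1 + r2
```

The check passes for any four values, which confirms the algebra of Steps (4) to (9); it also shows why the three encoded values alone do not separate the four basic components, since only sums and differences of pairs are recoverable.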

[35] It could be seen from the above steps that the two corners of the Long Arm and the Middle Arm, as well as the arms themselves, are properly reproduced using normal mathematics from Step (4) to Step (9). However, using the 3 proper formulae, the basic components are merged and bonded together so nicely in the derived components that the basic components could not be easily separated out from each other. So one could try to introduce discrepancy into the 3 properly designed formulae in order to carry the processing further and see if a new world could be discovered. One should not introduce discrepancy in a random manner, following the principle of garbage in, garbage out. One should focus on what is required for providing useful information to the 3 properly designed formulae.

[36] In the above example, one could easily notice that two derived components are missing that are important in providing additional information for solving the problem at hand, i.e. separating the 4 basic components out from the derived components. These two derived components are identified to be [1] - [2] and [3] - [4]. Having either of these two derived components, one could easily separate the basic components out through addition and subtraction between ([1] - [2]) and ([1] + [2]) obtained at Step (9), as well as between ([3] - [4]) and ([3] + [4]) obtained at Step (3). So one could try introducing either or both of [1] - [2] and [3] - [4] into the 3 properly designed formulae as mentioned above. And where necessary, further adjustment of the formulae could be made.

[37] After countless trials and errors under the CHAN FRAMEWORK so far outlined, no successful formula design has come up for correct decoding using only 3 formulae in the schema of using 4 Code Units as a Processing Unit, even when the feature of Discrepancy Introduction is attempted in the formula design as found in Paragraph [31]. So a fourth formula such as [1] - [2] or [3] - [4], i.e. Step (4) = [1] - [2] or Step (4) = [3] - [4], has to be introduced for correct decoding. Or a wiser soul may be able to come up with a solution using only 3 formulae; so there is still hope in this respect. What is novel about CHAN FRAMEWORK is that it provides the opportunity of making possible different and numerous ways of ordering or organizing digital data, creating orders or structures out of digital data that could be described, so that different techniques could be devised for encoding and decoding for the purposes of compression and encryption for protection of digital information.

[38] Even if 4 CV sub-pieces, resulting from using 4 formulae, together with the RP Piece have to be used for successfully separating out the values of the 4 basic components or Code Units of a Processing Unit upon decoding for correct recovery of the original digital information, this still provides opportunities for compression, depending on the formula design and the data distribution of the digital information. With the introduction of another technique, the use of the Super Processing Unit, the technique of formula design may yield fruitful results even in compressing random and/or even data. Nevertheless, formula design used in CHAN FRAMEWORK serves to provide limitless ways or algorithms for encryption and decryption of digital data for the purpose of data protection. And this is an easy way of doing encryption and decryption that could easily be practised even by a layman. For the less wise souls, the values of [1], [2], [3] and [4] are separated out from each other using the formulae as expressed in Steps (1) to (4) and other derivative steps as outlined in Paragraphs [34] and [36]. Further formula adjustment and steps could be designed for space optimization, modelled on the examples as outlined in Paragraphs [43] and [44] below where applicable.
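As a hedged sketch of the role of the fourth formula, the following round trip uses the plainly designed Steps (1) to (3) of Paragraph [34] together with Step (4) = [3] - [4], one of the two fourth formulae named above, and shows that the four ranked values can then be separated out exactly by addition and subtraction:

```python
def encode(r1, r2, r3, r4):
    """Ranked values [1] >= [2] >= [3] >= [4] -> 4 CV sub-piece values."""
    return (r1 - r4,      # Step (1) = [1] - [4]
            r2 - r3,      # Step (2) = [2] - [3]
            r3 + r4,      # Step (3) = [3] + [4]
            r3 - r4)      # Step (4) = [3] - [4], the added fourth formula

def decode(s1, s2, s3, s4):
    """Separate the ranked values back out from the derived components."""
    r3 = (s3 + s4) // 2   # ([3]+[4]) + ([3]-[4]) = 2*[3]
    r4 = (s3 - s4) // 2   # ([3]+[4]) - ([3]-[4]) = 2*[4]
    r1 = s1 + r4          # ([1]-[4]) + [4]
    r2 = s2 + r3          # ([2]-[3]) + [3]
    return r1, r2, r3, r4

assert decode(*encode(13, 9, 6, 2)) == (13, 9, 6, 2)
```

This is only an illustration of why a fourth value makes exact separation possible; the Application's own Step (I) to Step (IV) formulae of Paragraph [37] may differ in form.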

[39] The values of the data calculated using the formulae stated in Step (I) to Step (IV) in Paragraph [37] are now put into the four sub-pieces of the CV Piece of CHAN CODE during the encoding process. These four values are stored into the CHAN CODE FILE as the four sub-pieces of the CV Piece, together with the corresponding RP Piece, upon encoding. The value range limit for each of the CV sub-pieces should be large enough to accommodate all the possible values that could come up using the respective formula. During decoding, the RP Piece and the CV Piece are read out by using the Absolute Address Branching technique and by looking up the retrieved value, the Rank Position Code of the corresponding Processing Unit, against the relevant Rank Position Code Table as in Diagram 9, to determine where the ranked values of [1], [2], [3] and [4] of the Processing Unit are to be placed during decoding. The ranked values of [1], [2], [3] and [4] are determined as shown in the steps outlined in Paragraph [38], using the values of the 4 sub-pieces of the corresponding CV Piece stored in Step (I) to Step (IV) in Paragraph [37]. The 4 sub-pieces of the CV Piece are to be placed using the techniques revealed in Paragraphs [43] and [44], which elaborate on the value of COMPLEMENTARY MATHEMATICS AND COMPLEMENTARY CODING in determining the range limit for the placement of the CV sub-pieces, with adjustment of design where appropriate. The Absolute Address Branching technique is a technique for optimizing the space saving here. Simple substitution of a, b, c, d, replacing [1], [2], [3] and [4] in the four formulae as described in Paragraph [37], also indicates that the RP Piece could be dispensed with. This means that, through such substitution, the formulae outlined in Paragraphs [37] and [38] also work without RP processing. But without RP processing, the advantage of the possible space saving resulting from the use of range limits is then lost. That could result in more space being wasted than using RP processing.

[40] COMPLEMENTARY MATHEMATICS AND COMPLEMENTARY CODING help very much in making the design for the placement of the CV sub-pieces for space saving, which may result in adjustment of the original formula design where appropriate. Diagram 10 below illustrates the contribution made by using COMPLEMENTARY MATHEMATICS AND COMPLEMENTARY CODING during the formula design phase in the present endeavor, using the formula design in Paragraph [31]:

Diagram 10

CHAN BARS

Visual Representation of Range Limits under the paradigm of COMPLEMENTARY MATHEMATICS using the Code Unit Size as the Complementary Constant CC

[41] From the above Diagram, the ranges represented by the 3 formulae in Paragraph [31] expressing the values of the 3 CV sub-pieces are shown together with their complementary range(s), the unknown data. X is not known by itself and is merged as part of the formula: ([2] - [3]) + 1/2 ([3] - [4]). The anomaly or discrepancy or adjustment, 1/2 ([3] - [4]), is introduced mainly into the properly designed formulae of [3] + [4] and [2] - [3], which describe neatly the associated CHAN SHAPE. Because the average of [3] + [4] is either ([4] + 1/2 ([3] - [4])) or ([3] - 1/2 ([3] - [4])), one could use either of these to introduce the anomaly or discrepancy or adjustment. ([4] + 1/2 ([3] - [4])) is taken to be the modified formula after the introduction of the formula discrepancy or adjustment, the one now used in Step (1), and to balance this discrepancy or adjustment, the third formula used in Step (3) is adjusted to be (([2] - [3]) + 1/2 ([3] - [4])) correspondingly. Because this is a brand new world that no one has charted before, one has to learn by trial and error. As the formulae designed in Paragraph [31] are also not successful in providing the solution to the aforesaid challenge, more adjustment is required. People with wiser souls may also design other formulae suitable for use with CHAN CODING for separating merged data or basic components of a Processing Unit out from derived (or combined with basic) components represented by the formulae so designed.
The technique of introducing Formula Discrepancy or Formula Anomaly or Formula Adjustment includes the following steps: (i) designing formulae which could be used to reproduce the values or relations and characteristics of derived or basic components; (ii) finding which derived or basic components are missing but essential for supplying additional values for the purpose of separating basic components out from the components represented by the formulae designed; (iii) designing formula anomaly or formula adjustment or formula discrepancy using formulae that could supply these missing components, such formula anomaly or formula adjustment or formula discrepancy being introduced into the formulae used to obtain the values of the CV sub-piece(s) of CHAN CODE; and (iv) incorporating the formula anomaly or formula adjustment or formula discrepancy into the formulae previously designed in Step (i) above, making a new set of formulae suitable for use for the purpose of separating basic components out from the components represented by the formulae newly designed or adjusted.

[42] Assuming the use of 4 formulae as described in Paragraph [38], after determining the ranked values of [1], [2], [3] and [4] and using the RP Piece, the original digital data input of the corresponding Processing Unit could now be decoded losslessly and restored correctly into the right positions. The placement and the bit size used for storing the code represented by the formulae of Step (I) to Step (IV), as the 4 sub-pieces of the CV Piece of the CHAN CODE, could now be considered and further optimized for bit storage during the encoding process. This uses the concept of range limit.

[43] To consider which sub-piece of the 4 sub-pieces of the CV Piece of the CHAN CODE is to be put first, one could consider whether placing one sub-piece could give information for the placement of the other ensuing sub-pieces, so that storage space could be reduced. The following discussion uses the 3 formulae as described in Paragraph [31] for elucidation purposes. In order to appreciate the value of COMPLEMENTARY MATHEMATICS, by comparing the three formulae at Steps (1) to (3) in Paragraph [31]: ([4] - 1/2 ([3] - [4])), ([1] - [4]), and (([2] - [3]) + 1/2 ([3] - [4])), it could be seen that the ranges represented by the first 2 formulae, [4] - 1/2 ([3] - [4]) and ([1] - [4]), do not align: neither of them embraces the other. So apparently either of them could be placed first. However, using CC on [4] - 1/2 ([3] - [4]), it becomes obvious that the mirror value of [4] - 1/2 ([3] - [4]), that is [4]c + 1/2 ([3] - [4]), should be able to embrace the range of [1] - [4]; so [4]c + 1/2 ([3] - [4]), the mirror value of [4] - 1/2 ([3] - [4]), could be used as a range limit for placing the value of [1] - [4]. And thus the value of [4] - 1/2 ([3] - [4]) is to be placed as the first CV sub-piece, so that the second CV sub-piece, represented by the formula of [1] - [4], could use the mirror value of (obtained by making the CC operation on) [4] - 1/2 ([3] - [4]) as the range limit for storing the value of [1] - [4].

[44] However, as the value of [4] - 1/2 ([3] - [4]) in some rare cases could be negative, the value of [4]c + 1/2 ([3] - [4]) would then be over the Code Unit Size; in those cases, one could revert to using the CC value, i.e. the Code Unit Size, as the range limit for the value of [1] - [4]. That is, one is able to choose the shorter of the two ranges provided by [4]c + 1/2 ([3] - [4]) and the CC value for use as the range limit of the value of [1] - [4]. In most other cases, the range limit as represented by [4]c + 1/2 ([3] - [4]) would be less than the CC value, so bit storage saving could be achieved. [1] - [4] could be used as the range limit for ([2] - [3]) + 1/2 ([3] - [4]); this is because the mirror value of [2] - [3] using [1] - [4] as the Complementary Variable is ([1] - [2]) plus ([3] - [4]), and [3] - [4] should be not less than 1/2 ([3] - [4]), so it is certain that ([2] - [3]) + 1/2 ([3] - [4]) lies within the range of [1] - [4]. Therefore [1] - [4] could be put as the second CV sub-piece, serving as the range limit for the third CV sub-piece, ([2] - [3]) + 1/2 ([3] - [4]), for more bit storage saving. The range limit for placing the first CV sub-piece is the CC value, as no other range limit is available before it as a reference. Also, for the rare cases where the value of [4] - 1/2 ([3] - [4]) could become negative, an extra sign bit has to be used for it. And because the value could carry a fraction of 0.5 due to the halving operation, a fraction bit has also to be included for such indication. So altogether it is the CC bit size + 2 bits. So if [1] - [4] is of the value of 1,000 instead of the maximum value of 2^64, then 1,000 could be used as the range limit for storing ([2] - [3]) + 1/2 ([3] - [4]). Absolute Address Branching could also be used so that the limit of 1,024 could be reduced exactly to 1,000, though in this case the saving is very small.

The bit size used is then either 10 bits or 9 bits instead of the 64 bits normally required for a 64 bit Code Unit. However, as with the first CV sub-piece, [4] - 1/2 ([3] - [4]), the third CV sub-piece, ([2] - [3]) + 1/2 ([3] - [4]), may also have a value fraction of 0.5 because of the halving operation, so one more fraction bit has to be assigned on top of the range limit set by the value of [1] - [4]. The placement of these 3 sub-pieces of the CV Piece of the CHAN CODE could then be done for Steps (1) to (3) in Paragraph [31] revealed above. So it is apparent that the 3 sub-pieces of the CV Piece, and thus the whole CV Piece, could vary in size from one Processing Unit to another if the concepts and techniques of Range Limit and Absolute Address Branching are also used for optimization of storage space. It should be borne in mind that one should make sure that the range limit used is able to embrace all the possible values that could appear for which the relevant range limit is designed to be used. In certain cases, the range limit may require adjustment by adding the numerical value 1 to it; this is because the rank values are ranked according to the value itself and, when equal in value, they are ranked according to their positions in the incoming digital data stream. One therefore has to be careful and consider the range limit for a particular value case by case, and make certain that the range limit designed for a particular value is able to cover all the possible values that could come up.
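The seat counting behind Range Limit and Absolute Address Branching can be sketched as follows; this assumes a known range limit of n values and is only one way to count the 4/5-bit and 9/10-bit splits mentioned above, not the patented procedure itself:

```python
from math import ceil, log2

def aab_bit_counts(n):
    """For a known range limit of n values (n >= 2), Absolute Address
    Branching assigns some values b-1 bits and the rest b bits, where
    b = ceil(log2(n)). A full tree of depth b-1 has 2**(b-1) leaves;
    each leaf either holds one value directly (a short, b-1 bit code)
    or branches once more to hold two values (full b bit codes)."""
    b = ceil(log2(n))
    short = 2 ** b - n        # values needing only b-1 bits
    long_ = 2 * n - 2 ** b    # values needing the full b bits
    return b, short, long_

# Range limit 3 (Paragraph [28]): one 1-bit code, two 2-bit codes.
assert aab_bit_counts(3) == (2, 1, 2)
# Range limit 24 (the Rank Position Code Table): 8 codes of 4 bits, 16 of 5.
assert aab_bit_counts(24) == (5, 8, 16)
# Range limit 1000 (the example above): some 9-bit codes, the rest 10 bits.
assert aab_bit_counts(1000) == (10, 24, 976)
```

The function name and the uniform-split counting are illustrative assumptions; the point is simply that a known range limit below the next power of two always frees some codes to use one bit fewer.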

[45] For instance, the space or bit size required for placing the following three CV sub-pieces of CHAN CODE (not the ones in Paragraph [31], and assuming Formula (IV) uses the standard bit size of a Code Unit), represented by formulae designed as follows, for instance for encryption purposes:

Formula (I) = 3*([1] - [2] + [3] - [4]) + ([2] - [3]) + ([1] + [3]);

Formula (II) = 3*([1] - [2] + [3] - [4]) + ([2] - [3]) - ([2] + [4]); and

Formula (III) = [3] - [4];

is estimated to be 5 Code Units, 3 Code Units and 1 Code Unit respectively. The RP Piece providing for 24 combinations uses up another 5 bits; if Absolute Address Branching is used, some combinations may use up only 4 bits. So if the Code Unit Size is taken to be 64 bits, then 68 bits, 66 bits and 64 bits are used for Formula (I), (II) and (III) respectively, without counting the space optimization that could be achieved using Range Limiting. Using Range Limiting, it is obvious that the value of Formula (I) is bigger than that of Formula (II), and that of Formula (II) bigger than that of Formula (III). So Formula (I) should be placed first, then Formula (II) and then Formula (III). Using such a placement technique, bit storage could be minimized.

[46] Upon further reflection, it appears that COMPLEMENTARY MATHEMATICS provides very useful concepts and more technical tools for saving storage space. However, its importance lies rather in the paradigm it provides, including range processing, mirror value, as well as base shifting; for instance, the base for indicating the mirror value of a value is the CC value, i.e. the Code Unit Size, counting reversely, instead of the base of Value 0 used in normal mathematical processing.

[47] Using the characteristics and relations as revealed in CHAN SHAPES, one may design formulae or other shapes with the use of different numbers of Code Units for a Processing Unit, as the case may be, that could perhaps meet the challenge of the Pigeonhole Principle in Information Theory using CHAN CODING even with just normal mathematical processing. No one can be sure that it is never possible, given the endless combinations of Code Units and formula designs (as well as the possibility of adding other techniques for use with them) that could be constructed under CHAN FRAMEWORK. And it will be revealed later that these other techniques are able to end the myth of the Pigeonhole Principle in Information Theory even without using the feature of formula design.

[48] COMPLEMENTARY MATHEMATICS AND COMPLEMENTARY CODING help in the formula design phase and in encoding and decoding. So CHAN MATHEMATICS AND CHAN CODING are a superset of mathematical processing, including normal mathematics and COMPLEMENTARY MATHEMATICS, used in conjunction or separately, with reference to CHAN SHAPES and the characteristics and relations so revealed, in making encoding and decoding of digital information, whether random or not.

[49] Under further examination for optimization, it appears even the RP Piece and the related RP processing could be dispensed with by just substituting the a, b, c and d values of a Processing Unit for the values of [1], [2], [3] and [4] in the formulae outlined in Paragraphs [37] and [38]. However, the placement of the CV sub-pieces would then require more space than that required by using the RP Piece and RP processing. It should also be realized that the value of RP Processing lies in giving a clearer picture of the relationship between the four basic components (a, b, c and d) of a Processing Unit in the form of [1], [2], [3] and [4], so that these values could be put into perspective when being used and represented by CHAN SHAPES. This further paves the way for designing correct formulae for applying other encoding and decoding techniques outlined in the present invention. So whether or not to use the RP Piece and RP processing for encoding and decoding is a matter of choice that could be decided case by case.

[50] Using 4 CV sub-pieces represented by 4 formulae so designed could give one a very free hand in designing multivariate algorithms for encryption. Such encrypted digital data could then be compressed using other techniques to be introduced. In this way, both highly easy and secure encryption / decryption and compression / decompression algorithms and processes could be designed and implemented. The number of cycles of encryption / decryption and compression / decompression also has effects on the outcome code. And if such information, including the formulae designed and the number of cycles of encryption / decryption and compression / decompression implemented, is sent separately from the data intended to be sent, data security could be enhanced and protected on an unprecedented level. To enhance the data protection further, different types of CHAN CODE could be separately stored and sent out in the correct order for recovery of the original digital information; for instance, the sign bits for the CV sub-pieces being in one sign bit file, each CV sub-piece being in a separate CV sub-piece file, the RP Pieces being in an RP Piece file, and the header or footer containing additional information relating to the processing of the corresponding CHAN CODE being in a separate header or footer file.

[51] Using 4 formulae producing 4 CV sub-pieces does not necessarily mean that compression could not be achieved. One could select the shortest range identified and make a formula representing that shortest range for use for the 4th value to be added for compression. For example, if [1] is very near the CC value, i.e. the biggest value of the Code Unit, and if CC minus ([1] + [2]) is the shortest range identified through the processing of the first three formulae in Step (1) to Step (3), then one could choose to make the fourth formula [1] and place the value of [1] using the range limit of CC minus ([1] + [2]). Further research could be done in this respect about formula design and the placement of CHAN CODE pieces.

[52] Even if using 4 CV sub-pieces could not compress every Processing Unit, one could design different sets of formulae that are suitable for different types of data distribution (including the frequency of the individual data values and the corresponding data value distribution, i.e. the frequency of each of the values present and the number and the identity of the corresponding values present) and break the whole digital input file of random data into Super Processing Units that are not random. The technique of using Super Processing Units that are not random will be illustrated later, when other techniques much simpler than using formula design are added to it for implementation. In the course of such discussion, the relevant concept of the Super Processing Unit will be revealed. Through using different sets of formula design for Super Processing Units of different data distribution, re-compression could be attempted and achieved, likewise as in the case where these other techniques are used with Super Processing Units.

[53] It could therefore be seen that using the very simple logic of CHAN MATHEMATICS, including normal mathematical processing and COMPLEMENTARY MATHEMATICS, tremendous progress has been made in the field of Compression and Decompression Science and Art. To bring the end to the myth of the Pigeonhole Principle in Information Theory, as announced in PCT/IB2016/054562 and confirmed in PCT/IB2016/054732, other techniques however are required; these are revealed as follows:

[54] Before revealing such techniques, CHAN FRAMEWORK has to be further fine-tuned. Up to now, CHAN FRAMEWORK is characterized by the following structural features:

(a) Code Unit;

(b) Processing Unit;

(c) Super Processing Unit;

(d) Un-encoded Code Section; and

(e) Header or Footer.

The schema and design of CHAN FRAMEWORK here refers to the definition chosen for any of the above structural elements, if used, for processing for the purpose of encoding and decoding a particular digital data set corresponding to the chosen techniques of CHAN CODING in use.

[55] Code Unit is the basic unit of CHAN FRAMEWORK. Up to now, its size, i.e. Code Unit Size, is measured in number of binary bits (bit size), and the maximum number of values (Code Content) that a Code Unit can represent is therefore limited by the bit size of the Code Unit. For example, if the Code Unit has only one bit in size, then it can only be used to represent two values, bit value 0 or bit value 1, at one instance. If the Code Unit has the bit size of 3, then it can represent at most 8 bit values, namely 000, 001, 010, 011, 100, 101, 110, and 111. This is the conventional way of using binary bits to represent data values. Code Unit is the basic unit of data that is read in one by one from the data input stream by the encoder for encoding purposes. It is this conventional way of reading and representing data which gives rise to the myth of the Pigeonhole Principle in Information Theory, details of which could be found at: https://en.wikipedia.org/wiki/Pigeonhole_principle

Its essence is expressed as:

"In mathematics, the pigeonhole principle states that if n items are put into m containers, with n > m, then at least one container must contain more than one item."

In other words, if one container can only take up one item, then the number of items that could be taken up is limited by the number of containers; i.e. the number of items that could be taken up cannot exceed the number of containers used to take them up. This is a one-to-one correspondence between item and container.

Applying this to the use of Code Unit for encoding here, if the Code Unit has a bit size of 3 bits, it could only provide 8 addresses and so it could only be used to represent at most 8 values, one value at an instance of time. In the conventional way, the number of addresses that a Code Unit can have is measured in binary bit size: the bigger the number of binary bits used for a Code Unit, the more addresses the Code Unit can have for representing Code Content values in one-to-one correspondence. So the number of addresses that a Code Unit has is equal to 2 to the power of the bit size of the Code Unit, i.e. the number of binary bits measuring the size of the Code Unit.
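As a simple illustration of this conventional one-to-one addressing (a Python sketch for illustration only; the patent's own programs are in autoit), the addresses of a 3-bit Code Unit can be enumerated directly:

```python
from itertools import product

def conventional_addresses(bit_size):
    """All bit patterns a conventional Code Unit of the given bit size can
    hold; each pattern is one address, so there are 2**bit_size of them."""
    return [''.join(bits) for bits in product('01', repeat=bit_size)]

addrs = conventional_addresses(3)
print(len(addrs))  # 8, i.e. 2 to the power of 3
print(addrs)       # ['000', '001', '010', '011', '100', '101', '110', '111']
```

Under this conventional scheme the address count is fixed by the bit size alone, which is exactly the restriction the re-definition of Code Unit below removes.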

[56] So encoding for making compression has so far been possible only because the encoding method takes advantage of the uneven nature of data distribution. For a given bit size of Code Unit, such as a 3-bit Code Unit for instance, if the data input stream contains only 3 different unique Code Values, such as 000, 001, and 100, then the data input stream could be compressed. Or if the data input stream contains all the 8 different unique Code Values, namely 000, 001, 010, 011, 100, 101, 110 and 111, it still could be compressed if the frequency distribution of these 8 Code Values is not even, i.e. the frequencies of these 8 unique Code Values are not the same as each other. Usually, the more the unevenness in data distribution, the more compression saving could be achieved. Random data tends to be even in the ratio between the number of bit 0s and bit 1s, where bit 0 and bit 1 appear in a random way, i.e. without any regularity or predictable pattern. So it has long been held that random data could not be compressed, giving rise to the myth of the Pigeonhole Principle in Information Theory.

[57] The Pigeonhole Principle in Information Theory is very true, but only by itself! The myth lies in the beliefs related to this Principle that random data could not be compressed, or that data could not be re-compressed time and again. So to end the myth, the essence is to create unevenness out of random data. One fundamental technique is to create unevenness through fine-tuning the way of defining and measuring Code Unit and the respective Code Values that the Code Unit is used to represent. So this is a significant novel feature of the present invention: redefining the notion of Code Unit used in CHAN FRAMEWORK, a structural change or improvement to the very basic element of CHAN FRAMEWORK, the Code Unit and its definition. Unevenness could therefore be easily created in random data. Capitalizing on this structural change of CHAN FRAMEWORK, one could easily design schema that provides more addresses than the values that a Code Unit is supposed to accommodate. So the number of code addresses available for representing code values is not the limiting factor for compressing and re-compressing a random data set in cycles. What is left out and neglected by the Pigeonhole Principle in Information Theory is the frequency distribution characteristic of a digital data set. To be able to compress and re-compress a random data set in cycles, one has also to pay attention to the nature of data distribution in terms of the frequency distribution of the code values present in the data set as well as their corresponding bit lengths. These two issues have defied previous efforts at encoding and decoding for compressing and re-compressing a random data set in cycles: the number of code addresses available to unique code values, and the frequency distribution of the unique code values present in the digital data set. They will be addressed one by one in the following discussion.

[58] Firstly, about the issue of the number of code addresses available to unique code values in a data set: after the novel feature of re-defining Code Unit is introduced into CHAN FRAMEWORK, Code Unit under CHAN FRAMEWORK is firstly measured by the number of Code Values that a Code Unit is used to represent or hold, secondly by the number of binary bits, and thirdly by the Head Design of the Code Unit Definition, where appropriate, as will be seen in Diagram 14 of Paragraph [62] below. So the nomenclature for referring to a Code Unit changes from 3-bit Code Unit to 8-value Code Unit, using the number of values that a Code Unit is used to hold (one at a time, of course) as the first or primary factor, rather than the number of bits, which could then be used as the second or secondary factor, for distinguishing the size of a Code Unit. This novel feature does not prevent one from designing a schema using a standard bit size for all the values of a Code Unit; it only provides the opportunity for designing schema using different bit sizes for different unique Code Values of a Code Unit. That means the different unique Code Values of a Code Unit could have different bit sizes; in addition, it also allows giving each of the unique Code Values of a Code Unit the same number of bits in size, depending on the Code Value Definition and the related Code Unit Definition used at a certain time of the encoding and decoding processes under a specific schema and design of encoding and decoding. For instance, for an 8-value Code Unit, all the 8 unique Code Values could be defined as having the same bit size of 3 bits under a specific schema and design such as in Diagram 11:

Diagram 11

Definition of Code Values of an 8-value Code Unit having the same bit size

Or under another specific schema and design, these 8 values of the Code Unit could be redefined as having different bit sizes as in Diagram 12:

Diagram 12

Definition of Code Values of an 8-value Code Unit having different bit sizes

[59] So Code Unit under CHAN FRAMEWORK is now measured firstly by the number of Code Values that the Code Unit is used to represent; the number of binary bits of a Code Unit becomes the secondary factor for size measurement. So the Code Values of a Code Unit could have the same bit size or have different bit sizes, such option depending on how it is defined under a specific schema and design used for encoding and decoding. Such Code Value Definition or Code Unit Definition could also change where necessary and appropriate in the course of encoding and decoding, using the code adjustment technique of CHAN CODING.

[60] Using this novel feature of Code Unit Definition and Code Value Definition under CHAN FRAMEWORK, techniques for creating unevenness in data distribution, including random data, could be easily designed. It also makes it possible to investigate the nature of random data and allows ways of describing data distribution in terms of Code Values and their related frequencies of occurrence in a specific digital data input stream, so that appropriate techniques could be used for encoding and decoding, such as for the purpose of making compression/decompression. Before demonstrating the technique for creating unevenness in any particular data set, the schema and design for the 3-value Code Unit is introduced here to end the myth of the Pigeonhole Principle in Information Theory, namely the assumption that the number of Code Addresses is no more than the number of unique Code Values. The number of Code Addresses being no more than the number of unique Code Values is true only when Code Unit Size is measured just in terms of bit size, such as 1-bit Code Unit, 2-bit Code Unit, and so on. It is not true when Code Unit is measured in terms of the number of Code Values that a Code Unit is designed to hold. The conventional way of measuring Code Unit in terms of the number of binary bits puts up a restriction that the number of Code Values that a Code Unit could hold is determined by the number of binary bits used for the Code Unit; for instance, a 3-bit Code Unit could hold at maximum 8 unique Code Values, each using 3 bits to represent, no more and no less. A 3-bit Code Unit could not hold more than 8 unique Code Values. What is more, when reading data using this conventional definition, the encoder or decoder could not avoid reading all the 8 unique Code Values if they are present; that means the encoder or decoder could not, say, read just 3 unique Code Values and disregard or discard the other 5 unique Code Values if they are present in the data set. So because of this traditional design of reading and interpreting data, the number of Code Addresses available is by design exactly the same as the number of Code Values that the Code Unit is designed to hold. So if all the unique Code Values appear in the data set, all the Code Addresses are exhausted, so that compression of a random data set could not be made possible by techniques capitalizing only on unevenness in the frequency distribution of the Code Values present in the data set (except for the use of CHAN CODING); for a random data set, such frequency distribution for all the Code Values of the data set tends to be even, i.e. the ratio between bit 0 and bit 1 of the whole data set is 1 to 1, and the frequencies of all the Code Values of the Code Unit are about the same. So no unevenness in the frequency distribution of the Code Values of a Code Unit of a random data set could be utilized for making compression by techniques of the prior art so far designed, there also being no more Code Addresses available than the number of Code Values present in the data set.

Diagram 13 shows the design for a 3-value Code Unit using the novel feature (i.e. Code Unit being measured by the number of Code Values; the number of binary bits is used as a secondary measurement for the Code Unit as a whole) just introduced into CHAN FRAMEWORK:

Diagram 13

Definition of Code Values of a 3-value Code Unit using 5 bits,
with 2 versions, one having bit 0 to bit 1 ratio as 2:3 and the other 3:2

These two versions of design for the 3-value Code Unit definition are meant to illustrate that more Code Addresses could be created than the number of unique Code Values that a Code Unit is designed for, thus providing more addresses than the number of unique values appearing in the data set. Suppose a schema using a Processing Unit made up of three 3-value Code Units is designed for encoding and decoding a digital data input stream for making data compression. In this design, the Code Units of the digital data input stream are read one by one, and 3 adjacent Code Units are encoded (or decoded for restoration afterward) as one unit, the Processing Unit, using the definition of the 0 Head Design. What the encoding or decoding does is read three 3-value Code Units (one by one) using the Code Unit definition of the 0 Head Design as Reader, then treat the code of the three Code Units as one piece of code (the code of one Processing Unit) and exchange it with another piece of code, for instance using the Code Unit definition of the 1 Head Design as Writer to encode it or write it upon encoding; or restore it to the original code upon decoding by reading the encoded code with the 1 Head Design Code Unit definition and writing it back with the 0 Head Design Code Unit definition; or using mapping tables of other designs for encoding or decoding. Because there are 3 unique Code Values of a Code Unit used here, a Processing Unit is designed to hold at maximum 27 unique values (3 values to the power of 3 Code Units, i.e. 3 x 3 x 3 = 27 unique values) for representation. The number of addresses that is available could be calculated using the following mathematical formula:

2 to the power of (The average bit size of the Code Values of a Code Unit * The number of Code Units of a Processing Unit)

So for a Processing Unit consisting of three 3-value Code Units using the 0 Head Design, the number of addresses available for use is:

2 to the power of (5 bits / 3 values * 3 units) = 2 to the power of 5 = 32

There are 32 unique addresses available for 27 unique values. This is the first sign that spells the end to the myth of the Pigeonhole Principle in Information Theory. Using this design, there could be more addresses than the number of unique values that have to be represented. So one could, for instance, use the Absolute Address Branching technique to reduce the number of bits that have to be used for representing the 27 unique values from a flat 5 bits to 4 or 5 bits [for instance, using Absolute Address Single Branching, the value range of 4 bits is 16 (the lower value range) and the value range of 5 bits is 32 (the upper value range), and the actual value range of the Processing Unit here is 27 (the actual value range); therefore 27 - 16 = 11, so there are 11 values that have to be single-branched out of the 4-bit range; therefore there should be 5 value addresses using 4 bits and 22 value addresses using 5 bits]. So some bit saving is achieved. What is more, in another design, one could reserve 1 or 2 or more addresses (up to 5) on top of the 27 unique addresses for use as special addresses for indicating special processing to be done. For example, the 28th address, when present in the encoded code, could be used to indicate that the next two Processing Units contain the same data values, and thus the same encoded code, as the last Processing Unit. In this way, it provides more flexibility for encoding and decoding data for compression as well as for encryption. If 28 addresses are to be used, i.e. 27 unique value addresses and 1 special processing address, then there are 4 addresses [16 - 12 (reserved for single branching) = 4] using 4 bits and 24 addresses [(28 - 16 = 12) * 2 = 24] using 5 bits. The use of this schema and design of a Processing Unit of three 3-value Code Units and the respective encoding and decoding processes will be elaborated in greater detail later in providing a proof and an example of how a random data set could be compressed.
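The Absolute Address Single Branching arithmetic just described (the same calculation performed by the AABBits1Base routine in Diagram 16b) can be restated as a short sketch; this is an illustrative Python rendering, not one of the patent's own autoit programs:

```python
def single_branch_split(num_values):
    """Absolute Address Single Branching: for num_values unique addresses,
    return (short_bits, n_short, n_long), i.e. how many addresses can use
    floor(log2(num_values)) bits and how many need one bit more."""
    short_bits = num_values.bit_length() - 1  # floor(log2(num_values))
    overflow = num_values - 2 ** short_bits   # values branched into longer codes
    n_short = 2 ** short_bits - overflow      # addresses using short_bits bits
    n_long = 2 * overflow                     # addresses using short_bits + 1 bits
    return short_bits, n_short, n_long

# A Processing Unit of three 3-value Code Units has 27 unique values,
# against 2 ** (5 bits / 3 values * 3 units) = 32 available addresses:
print(single_branch_split(27))  # (4, 5, 22): 5 addresses of 4 bits, 22 of 5 bits
# With one extra special processing address (28 addresses in total):
print(single_branch_split(28))  # (4, 4, 24): 4 addresses of 4 bits, 24 of 5 bits
```

In both cases n_short + n_long equals the total number of addresses, matching the figures worked out in the paragraph above.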
For the time being, techniques of creating unevenness in a particular data set are to be discussed first as follows:

To understand the technique of changing the ratio between bit 0 and bit 1 of a data set for creating unevenness in the respective data distribution for the purpose of making compression possible, more examples of Code Value Definition and Code Unit Definition are illustrated below in Diagram 14, showing a 6-value Code Unit with different Code Value Definitions:

Diagram 14a

Definition of Code Values of a 6-value Code Unit using 20 bits: 1 Multiple Branching, with 2 versions, one having bit 0 to bit 1 ratio as 5:15 and the other 15:5

Diagram 14b

Definition of Code Values of a 6-value Code Unit using 16 bits: 3-pair Single Branching, with 2 versions, one having bit 0 to bit 1 ratio as 7:9 and the other 9:7 (Skewed Distribution)

Diagram 14c

Definition of Code Values of a 6-value Code Unit using 16 bits: 2-pair No Skew Single Branching,
with 2 No Skew versions of equal bit 0 to bit 1 ratio,
both having bit 0 to bit 1 ratio as 8:8 (No Skew Distribution)

Diagram 14d

Definition of Code Values of a 6-value Code Unit using 17 bits: 1-pair Single and 1 Multiple Branching, with 2 versions, one having bit 0 to bit 1 ratio as 6:11 and the other 11:6

Diagram 14e

Definition of Code Values of a 6-value Code Unit using 18 bits: 1-pair Single and 1 Multiple Branching, with 2 versions, one having bit 0 to bit 1 ratio as 6:12 and the other 12:6

Diagram 14f

Definition of Code Values of a 6-value Code Unit using 19 bits: 2-pair Single Branching, with 2 versions, one having bit 0 to bit 1 ratio as 6:13 and the other 13:6

One can see from Diagram 14 that there could be more than one definition for a 6-value Code Unit, using from 16 bits to 20 bits with different bit 0 to bit 1 ratios. So a Code Unit could be classified primarily by the number of unique data values it holds, then by its bit size, and then by the version of the Head Design, whether 0 Head or 1 Head. This schema of defining Code Unit allows great flexibility in using Code Unit as a basic unit for manipulating digital data, in addition to using it as the basic unit of a language (CHAN FRAMEWORK LANGUAGE, using terminology as revealed in the present invention for describing the traits or characteristics of the structural elements of CHAN FRAMEWORK and the techniques of CHAN CODING) for describing the traits or characteristics of a digital data set. This differs from the conventional way of defining Code Unit just in terms of bit size, in which a Code Unit of a certain bit size could have only 1 version of code definition, which could vary neither in the number of unique values to be represented (for instance, for a 3-bit Code Unit defined in the conventional way, there are 8 unique values to represent, and one could not represent only 6 unique values out of the 8 possible combinations and ignore the other two, leaving them neither handled nor processed; i.e. one simply could not handle only 6 unique values without handling the other two when they do appear in the data set with a 3-bit Code Unit defined in the conventional way) nor in Head Design; whereas the Code Definition schema under CHAN FRAMEWORK allows a Code Unit Definition to have many different versions, varying in Code Unit total bit size, in the bit sizes of Code Unit values, and in the number of unique values that the Code Unit is designed to hold, as well as in the 0 Head or 1 Head Design.

One could utilize the aforesaid differences between different schemas and definitions of Code Values for each of the different design types of 6-value Code Units to create unevenness in an existing digital data set, for instance by changing the ratio of bit 0 to bit 1. For instance, Diagram 14c provides 2 No Skew versions (i.e. 0 Head Design and 1 Head Design) of a 16-bit 6-value Code Unit. The 0 Head Design version is used for the discussion hereafter except where mentioned otherwise specifically. Comparing it with the corresponding 3-pair Skewed Single Branching version in Diagram 14b, the No Skew version and the 3-pair Skewed Single Branching version both use 16 bits for representing the 6 unique values of the 6-value Code Unit; they differ only in the pattern of bit codes used and the ratio between bit 0 and bit 1, with the No Skew version having the ratio as 8:8 and the 3-pair Skewed Single Branching version 7:9. So one could do a cross mapping between these 2 sets of 6 Code Values in order to increase, say, the number of bit 1s of the No Skew version, so that the ratio of bit 0 to bit 1 of the new data set after mapping translation changes from 8:8 to a ratio towards the 7:9 side. However, after some trials, the change of this ratio for a random data set is found to be possible but relatively small using one pass of encoding translation. This is because of the nature of the frequency distribution of the 6 unique Code Values found in a random data set. Diagram 15 gives one instance of the result generated by running the autoit program given in Diagram 16 using 80,000 random binary bits, as follows:

Diagram 15

Frequency Distribution of the 6 unique Code Values of a 6-value Code Unit for the 0 Head Design of the No Skew Version

Diagram 16a

autoit Programme: main.au3 for generating the Frequency Distribution in Diagram 15

#include <Array.au3>
#include "helper.au3"

global $ShowInput = false
global $ShowDecode = false
global $LastDecode

func ReadSix(byref $data)
    local $b1, $b2, $b3
    $b1 = ReadOneBit($data)
    $b2 = ReadOneBit($data)
    if $b1 = 0 then
        if $b2 = 0 then
            $b3 = ReadOneBit($data)
            if $b3 = 0 then
                $LastDecode = '000'
                return 0
            else
                $LastDecode = '001'
                return 1
            endif
        else
            $LastDecode = '01'
            return 2
        endif
    else
        if $b2 = 0 then
            $LastDecode = '10'
            return 3
        else
            $b3 = ReadOneBit($data)
            if $b3 = 0 then
                $LastDecode = '110'
                return 4
            else
                $LastDecode = '111'
                return 5
            endif
        endif
    endif
endfunc

global $data[10000]
GenerateRandomData($data)
local $DecodeTable[8]
local $FreqTable1[6]
ZeroArray($FreqTable1)
InitPointer($data)
while not $DataEnded
    $index = ReadSix($data)
    $DecodeTable[$index] = $LastDecode
    if $ShowDecode then ConsoleWrite($LastDecode & '(' & $index & ') ')
    $FreqTable1[$index] = $FreqTable1[$index] + 1
wend
if $ShowDecode then ConsoleWrite(@CRLF)
ConsoleWrite('max6' & @CRLF)
for $i = 0 to 5
    ConsoleWrite($DecodeTable[$i] & ' : ' & $FreqTable1[$i] & @CRLF)
next

Diagram 16b

autoit Programme: helper.au3

global $CurrentDataIndex
global $CurrentBitPos
global $DataEnded
global $DataSize
global $Mask[8] = [128, 64, 32, 16, 8, 4, 2, 1]
global $BitLen = 8
global $MaxNum = 256

func ReadOneBit(byref $data)
    if $DataEnded then return 0
    if BitAND($data[$CurrentDataIndex], $Mask[$CurrentBitPos]) <> 0 then
        $r = 1
    else
        $r = 0
    endif
    $CurrentBitPos = $CurrentBitPos + 1
    if $CurrentBitPos >= $BitLen then
        $CurrentBitPos = 0
        $CurrentDataIndex = $CurrentDataIndex + 1
        if $CurrentDataIndex >= $DataSize then $DataEnded = true
    endif
    return $r
endfunc

func GenerateRandomData(byref $arr)
    for $i = 0 to UBound($arr) - 1
        $arr[$i] = Random(0, $MaxNum - 1, 1)
    next
endfunc

func DumpArray(byref $arr)
    for $i = 0 to UBound($arr) - 1
        ConsoleWrite($arr[$i] & ' ')
    next
    ConsoleWrite(@CRLF)
endfunc

func ZeroArray(byref $arr)
    for $i = 0 to UBound($arr) - 1
        $arr[$i] = 0
    next
endfunc

func InitPointer(byref $data)
    $CurrentDataIndex = 0
    $CurrentBitPos = 0
    $DataEnded = false
    $DataSize = UBound($data)
endfunc

func DumpBinaryData($data)
    InitPointer($data)
    $pos = 0
    while not $DataEnded
        ConsoleWrite(ReadOneBit($data))
        $pos = $pos + 1
        if $pos = 8 then
            ConsoleWrite(' ')
            $pos = 0
        endif
    wend
    ConsoleWrite(@CRLF)
endfunc

Func AABBits1Base($range, $value)
    if $value < 1 then
        ConsoleWrite('Invalid input 0')
        exit
    endif
    if $value > $range then
        ConsoleWrite('Invalid range')
        exit
    endif
    Local $max = 1
    local $bits = 0
    while $max <= $range
        $max = $max * 2
        $bits = $bits + 1
    WEnd
    $max = $max / 2
    $bits = $bits - 1
    local $sub = $range - $max
    If $value <= $max - $sub Then
        Return $bits
    Else
        Return $bits + 1
    Endif
EndFunc

Func AABBits0Base($range, $value)
    return AABBits1Base($range + 1, $value + 1)
EndFunc

[64] Diagram 16a is the main program for generating the Frequency Distribution of a 6-value Code Unit in Diagram 15, using Diagram 16b as an #include program. Diagram 15 is just one instance of such generation. Running Diagram 16a once will generate one such instance, and each instance will differ a little from the others. But in general, such instances of the frequency distribution of the 6-value Code Unit maintain roughly the same proportions each time for the 6 unique Code Values under concern. Cross mapping between the Code Values of the No Skew version and the 3-pair Single Branching version of the 6-value Code Unit, and the related calculation, is shown in Diagram 17:

Diagram 17

Cross Mapping between the 6 Code Values of the No Skew and 3-pair Single Branching versions of the 6-value Code Unit

So it could be seen that by such cross mapping, the number of bit 1s has been increased by 41 bits out of 80,000 total binary bits. This is a relatively small figure. However, if such a trend of increasing bit 1 could be continued, then the data distribution would gradually be skewed towards bit 1 with multiplier effects. The greater the skew, the greater the compression saving that could be achieved. So more experiments should be tried to understand more about the patterns of cross mapping between Code Values of different designs of a Code Unit. In this case, both the No Skew and the 3-pair Single Branching versions use 16 bits, and the mapping is done in such a way that 2-bit values are mapped to 2-bit values and 3-bit values to 3-bit values, so there is no change in bit usage but only a slight change in the bit 0 to bit 1 ratio. What is more, for all the above versions of the 6-value Code Unit Design, using from 16 bits to 20 bits, each bit size has 2 corresponding versions. So cross mapping between the code values of those two versions (or even within one version itself, as in Diagram 18) could be utilized from one cycle of encoding to another for the purpose of changing the ratio between bit 0 and bit 1 in the data set without having to change the total bit usage. Say the first cycle of encoding could use cross mapping between the two versions using 16 bits as shown in Diagram 17, the next cycle could use the 20-bit versions, and the third the 18-bit versions, so on and so forth. Of course, in the course of doing such cross mapping, the frequency distribution of the Code Values read should be found out first and the best cross mapping table designed, so that the trend of increasing a particular bit, either bit 0 or bit 1, in terms of the bit ratio between these 2 bit values is maintained from one cycle of encoding to the next.
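This cross mapping idea can be sketched in Python (the patent's own programs are in autoit, and the actual mapping table of Diagram 17 is in a figure not reproduced in this text, so the mapping table below is a hypothetical one). The No Skew code value set {000, 001, 01, 10, 110, 111} is taken from the ReadSix routine of Diagram 16a; mapping 2-bit values to 2-bit values and 3-bit values to 3-bit values leaves the total bit usage unchanged while shifting the bit 0 to bit 1 ratio:

```python
import random

# No Skew 6-value Code Unit (0 Head Design), as read by ReadSix in Diagram 16a.
NO_SKEW = ['000', '001', '01', '10', '110', '111']

# Hypothetical cross-mapping table: each code maps to a code of the same
# bit length, chosen here only to push the ratio towards bit 1.
CROSS_MAP = {'000': '111', '001': '110', '01': '10',
             '10': '01', '110': '001', '111': '000'}

def decode_units(bits):
    """Greedily cut a bit string into No Skew code values; any trailing
    fragment too short to form a code value is returned separately."""
    units, i = [], 0
    while i < len(bits):
        for code in NO_SKEW:
            if bits.startswith(code, i):
                units.append(code)
                i += len(code)
                break
        else:
            return units, bits[i:]  # incomplete trailing unit
    return units, ''

random.seed(1)
data = ''.join(random.choice('01') for _ in range(80000))
units, tail = decode_units(data)
encoded = ''.join(CROSS_MAP[u] for u in units)
consumed = len(data) - len(tail)

print(consumed == len(encoded))  # True: total bit usage is unchanged
print(encoded.count('1') - data[:consumed].count('1'))  # net change in bit 1s
```

Because the 6 code values form a prefix-free set, decoding is unambiguous; and since every mapped code has the same length as its source, the bit usage is invariant and only the bit 0 to bit 1 ratio moves, which is exactly the behaviour described for Diagram 17.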
What is more, not only cross mapping between 2 versions of Code Unit Design using the same bit size could be used for this purpose. Cross mapping using just 1 version of Code Unit Design itself could also be used, as illustrated in Diagram 18:

Diagram 18

Cross Mapping amongst the 6 Code Values of the No Skew version of 6-value Code Unit itself

It could be seen that using cross mapping among the Code Values of any Code Unit Design itself could also change the ratio between bit 0 and bit 1. And the result in this case, as shown in Diagram 18, is even better than the result of the change using 2 versions of Code Unit Design in Diagram 17.

[65] Furthermore, such cross mapping for the purpose of tilting the bit 0 to bit 1 ratio towards one side could be done not just using the 6-value Code Unit Design, but using the Code Unit Design for any X-value Code Unit, in different cycles of encoding. For instance, the first cycle could use the 6-value Code Unit Design and the next the 12-value Code Unit Design, and so on, as long as the trend of tilting is maintained. So there is endless opportunity for changing the ratio between bit 0 and bit 1 for any specific data set, starting from whatever point of bit 0 to bit 1 ratio in the data distribution spectrum. And as long as the pattern of such cross mapping is found and recorded, the logic of cross mapping and the path it follows for a data set starting from whatever point of bit 0 to bit 1 ratio in the data distribution spectrum could be embedded in the encoder and decoder. Or else such logic and path of cross mapping could be put as indicators into the header of the encoded code for each cycle of encoding, so that the original code could be recovered correctly and losslessly upon decoding later. Of course, embedding such logic and path of cross mapping of code values in the encoder and decoder helps to further minimize the bit usage during the phase of compression encoding. So changing the data distribution by changing the ratio between bit 0 and bit 1 in the data set, without changing the bit usage, using cycles of encoding through cross mapping of Code Values of the same Code Unit Design alone or of different Code Unit Designs, could be used as an intermediate step for creating unevenness in the data set so that compression of a specific digital data set could be made possible or enhanced later during the phase of encoding for compression using other techniques.
What is more, whether the bit size of a data set changes during a particular phase or at any cycle of encoding is not that important; what is important is the end result. So changing the data distribution as well as the bit usage of any data set is always associated with the encoding step. The novel feature of this revelation here is that encoding should be designed in such a way that the change of data distribution of any data set is tilted in general towards one direction in terms of changing the bit 0 to bit 1 ratio for the purpose of making data compression. Besides using Code Unit Definition as Reader and Writer for changing the bit 0 : bit 1 ratio of a digital data set, the Processing Unit Definition could also serve the same purpose, as demonstrated in Paragraph [115] and Diagram 58 as well as in Paragraph [116] and Diagram 59, where the result is much better. It is therefore apparent that using a bigger size of Code Unit or a bigger size of Processing Unit, greater differences are generated, captured and produced as unevenness in the data distribution of a particular data set.

[66] So after the intermediate phase of changing the data distribution of any data set in terms of tilting the bit 0 : bit 1 ratio towards one direction in general (up to a certain point where appropriate), the digital data set could be compressed using technique(s) suitable for compressing a data set of such distribution at that point of bit 0 : bit 1 ratio. So if at first a random data set is given for making compression, the ratio of bit 0 : bit 1 could be altered, tilting towards one direction, using the techniques outlined in Paragraphs [62] to [64] above or [115] and [116] below; then, depending on the data distribution, one could use the cross mapping technique of code values for making compression, this time using cross mapping of Code Values of Code Units of the same value size but of different bit sizes. For instance, in the example now being used with 6-value Code Units having different bit sizes, this means reading the data set using a 6-value Code Unit of 20-bit size (or any other bit size where appropriate) and cross mapping such Code Values with those of a 6-value Code Unit of 19-bit size (or any other bit size where appropriate) for encoding purposes, depending on the frequency distribution of the Code Values of the data set under processing. So in brief, in doing cross mapping of Code Values for changing the data distribution in terms of bit 0 : bit 1 ratio, same X-value Code Units with code addresses mapped to data values of the same bit size in one-to-one correspondence are used; whereas for making data compression by reducing bit usage, same X-value Code Units with code addresses mapped to data values of different bit sizes in one-to-one correspondence are used instead. However, this does not preclude one from using Code Units of different Code Value Sizes for both changing the bit usage and changing the ratio of bit 0 to bit 1 in one go. The same applies to using Processing Unit for such purposes.
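The bookkeeping behind this bit-reducing cross mapping can be shown with a minimal Python sketch (the concrete code tables below are hypothetical, since the actual 6-value definitions are in the Diagram 14 figures, which are not reproduced in this text): given the measured frequency of each of the 6 code values, the total bit usage under a candidate Writer definition is the frequency-weighted sum of its code lengths, and a saving results whenever a definition assigning shorter codes to the more frequent values can be chosen:

```python
def total_bits(freqs, code_lengths):
    """Frequency-weighted bit usage of one 6-value Code Unit definition:
    sum over the 6 values of (occurrences * bit length of its code)."""
    return sum(f * length for f, length in zip(freqs, code_lengths))

# Hypothetical bit-length tables for two 16-bit 6-value definitions
# (illustrative only; the real tables are in the Diagram 14 figures):
reader_lengths = [3, 3, 2, 2, 3, 3]  # No Skew style Reader
writer_lengths = [2, 2, 3, 3, 3, 3]  # Writer favouring the first two values

# A skewed frequency distribution of the 6 code values in some data set:
freqs = [9000, 9000, 4000, 4000, 3000, 3000]

before = total_bits(freqs, reader_lengths)  # 88000 bits
after = total_bits(freqs, writer_lengths)   # 78000 bits
print(before - after)  # 10000 bits saved by choosing the better definition
```

Choosing between candidate definitions this way pays off only when the frequency distribution is skewed; for a perfectly even distribution the two totals coincide, which is why the unevenness-creating phase above comes first.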

[67] So it could be seen from the above that, under the revised CHAN FRAMEWORK, Code Unit Size could now be measured by the number of Code Values as well as by the number of binary bits, as illustrated by the examples given in Diagram 14 for 6-value Code Units having different bit sizes. Other characteristics of the data set could also be investigated and found out, such as the ratio of bit 0 to bit 1 and the frequency distribution of the unique Code Values of a Code Unit for any Code Unit Definition with a certain number of values and a certain bit size. This facilitates the description of any digital data set as well as the use of appropriate techniques for encoding and decoding for any purposes, including the purposes of making data encryption and data compression together with correct and lossless recovery of the original digital data.

[68] With the change of the definition of Code Unit Size, whereby a Code Unit could now be measured both in terms of the number of Code Values that the Code Unit is designed to hold and in terms of the number of binary bits used for the Code Unit as a whole, a Processing Unit (the unit for encoding or for writing a piece of encoded code) could be made up of only one Code Unit, as Diagram 14 shows that Code Units having the same number of Code Values could be designed to have different bit sizes for each of the code values and for the Code Unit as a whole. So a Code Unit, i.e. the basic Read Unit, could by itself be used alone, without having to combine with other Read Unit(s) to form a Processing Unit, the basic Write/Read Unit, for writing encoded code in the encoding process and for reading it back during the decoding process.

[69] Random data has long been held to be incompressible. It is time to make compressing it possible using CHAN FRAMEWORK as described above, in terms of changing the ratio of bit 0 to bit 1 in the data set, as well as by other techniques to be revealed as follows:

[70] These other techniques could be classified as coding techniques used in CHAN CODING under CHAN FRAMEWORK. Identifying and organizing traits or characteristics of Code Units, either singly or in combination, for producing the Classification Code of CHAN CODE is one such technique, such as producing RP Code using the rank and position of Code Units, or designing mathematical formulae that could be used to describe Code Values of Code Units either singly or in combination and that could also be used for encoding and decoding purposes, for instance for creating Content Value Code, the CV sub-pieces, of CHAN CODE. So the basic part of CHAN CODE (the part that describes the traits or characteristics of Code Units) under CHAN FRAMEWORK could be divided into Classification Code and Content Value Code (or Content Code in short). Other parts that could be regarded as belonging to CHAN CODE include other identification codes or indicators, for instance included in the Header of CHAN CODE FILES, for identifying the cross mapping table (Mapping Table Indicator) used for encoding and decoding the basic part of CHAN CODE, the number of cycles (Number of Cycles Indicator) of encoding and decoding for a particular digital data input, the checksum (Checksum Indicator) calculated for CHAN CODE FILES, the Un-encoded Code Section of the digital data for any particular cycle of encoding and decoding if any, as well as other indicators designed by the designer for the purposes of identifying as well as encoding and decoding CHAN CODE where appropriate and necessary. One such indicator, for instance, could be the number of Processing Units making up a Super Processing Unit for use in encoding and decoding; others could be an indicator for adjustment of the Classification Code of CHAN CODE by frequency and an indicator for adjustment of the Content Code of CHAN CODE by frequency where appropriate and desirable (i.e. Frequency Indicators for adjusting the use of 0 Head or 1 Head Design for the Classification Code and for the Content Code as appropriate to the pattern of the bit 0 : bit 1 ratio of the digital data set under processing, Frequency Indicators in short). The concept and usage of Super Processing Units and of the adjustment of CHAN CODE by frequency will also be revealed later.

[71] One essential technique used in CHAN CODE, Absolute Address Branching, has already been discussed in several places in the aforesaid discussion. It is worth elaborating on how this technique is used in compressing a random data set, i.e. reducing the bit storage usage of a random data set as a whole. This usage has been briefly touched upon in Paragraphs [60] and [61] in discussing the relation between the number of code addresses and the number of code values. To refresh memory: in that discussion, a Processing Unit made up of three 3-value Code Units is used to reveal that the number of code addresses could be made more than the number of code values that a Processing Unit for encoding and decoding is designed to hold. This is made possible by using a new definition for a Code Unit, so that the size of a Code Unit could be designed to be measured by the number of Code Values the Code Unit holds and by the number of bits used for the different unique Code Values of the Code Unit as well as for the Code Unit as a whole. This feature is shown in Diagram 14 using different designs and definitions for a 6-value Code Unit. And this feature is also made possible because of the use of the technique of Absolute Address Branching.

[72] That discussion also briefly touches upon how the 27 unique code values of a Processing Unit could be represented by code addresses, as well as how some spare addresses could be used as special processing addresses for encoding and decoding purposes. For instance, in one design the 27 unique code values could be represented by five 4-bit basic addresses and twenty-two 5-bit single-branched addresses. How the code values could be cross mapped with the code addresses is illustrated by the following examples:

[73] Diagram 19

Classified Absolute Address Branching Code Table (CAABCT) For 27 Values

Example I

Sorted PU Bit Representation in CHAN CODE, including Class Bit, Normal Bits and Branch Bit

Using the above Example I of the Classified Absolute Address Branching Code Table (CAABCT), the 27 Code Values of the three 3-value Code Units of a Processing Unit could be cross mapped one by one into CHAN CODE, including the Class Bit (Classification Code) and the Normal Bits & Branch Bit (Content Code). However, using this version, Example I, of CAABCT, a random data set could not be compressed because of the frequency distribution of random data. It could however compress a digital data set in which the frequencies of all the 27 unique code values are roughly the same or roughly even (for instance, if the frequency of each of the 27 unique code values in a particular data set is 1, then all the 27 unique code values together use up a total of 135 bits as found in Diagram 21 below, and if they are cross mapped with the code addresses in Diagram 19 above, which use a total of 130 bits only, there should be a compression saving of 5 bits). So not only does the number of code addresses matter; the frequency distribution of the unique code values presents another hurdle in the attempt to compress a random data set, which is now to be addressed.
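The two totals quoted above can be checked with a short Python sketch (for verification only; it assumes the 0 Head Design codes '0', '10', '11' and Example I's five 4-bit plus twenty-two 5-bit addresses):

```python
from itertools import product

unit_codes = ['0', '10', '11']                 # 3-value Code Unit, 0 Head Design
pu_values = [''.join(t) for t in product(unit_codes, repeat=3)]
data_bits = sum(len(v) for v in pu_values)     # one count of each of the 27 values
table_bits = 5 * 4 + 22 * 5                    # five 4-bit + twenty-two 5-bit addresses
print(len(pu_values), data_bits, table_bits)   # prints: 27 135 130
```

With one count of each value, cross mapping the 135 bits of data values onto the 130 bits of table addresses yields the 5-bit saving; under a random distribution, however, the frequencies are far from one count each.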

[74] Because of the nature of the definition of a 3-value Code Unit shown in Diagram 20, where the ratio of bit 0 : bit 1 of a Code Unit is 2 : 3, the frequency distribution of the 27 code values of a Processing Unit of three 3-value Code Units is not an even distribution, as shown in Diagram 21:

Diagram 20

3-value Code Unit Definition: 0 Head Design

with Bit 0 : Bit 1 Ratio = 2 : 3

Diagram 21

An instance of Unsorted Frequency Distribution of the 27 unique Code Values of a Processing Unit made up of three 3-value Code Units read from a random data set of 80,000 bits using the Code Unit Definition in Diagram 20

The instance of frequency distribution as shown in Diagram 21 is generated by running the autoit programmes in Diagram 22 as follows:

Diagram 22a

autoit programme: main27values.au3 using helper.au3 of Diagram 22b for generating one instance of frequency distribution as shown in Diagram 21

#include <Array.au3>
#include "helper.au3"

global $ShowInput = false
global $ShowDecode = false
global $MapTable1[27] = [4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5]
global $MapTable2a[27] = [3, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6]
global $MapTable2b[27] = [3, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 4, 4, 4, 4, 4, 4, 6, 6, 4, 4, 4, 4, 6, 6, 6, 6]
global $MapTable3[27] = [3, 3, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6]
global $LastDecode
global $Original

; decode one 3-value Code Unit of 0 Head Design: '0' -> 0, '10' -> 1, '11' -> 2
func ReadThree(byref $data)
    $r = ReadOneBit($data)
    if $r = 0 then
        $LastDecode = '0'
        return 0
    endif
    $r = ReadOneBit($data)
    if $r = 0 then
        $LastDecode = '10'
        return 1
    endif
    $LastDecode = '11'
    return 2
endfunc

; total bit usage of encoding the sorted frequency counts with a map table
func RunTable(ByRef $table, $name)
    local $i, $total = 0
    for $i = 0 to 26
        $total = $total + $FreqTable[$i] * $table[$i]
    next
    ConsoleWrite($name & ' size: ' & $total & @CRLF)
endfunc

func RunTableOriginal(ByRef $table, $name)
    local $i, $total = 0
    for $i = 0 to 26
        $total = $total + $FreqTable[$i] * $table[$i]
    next
    Return $total
endfunc

global $data[10000]
GenerateRandomData($data)
if $ShowInput then
    DumpArray($data)
    DumpBinaryData($data)
endif

local $a, $b, $c, $index
global $FreqTable[27]
ZeroArray($FreqTable)
global $DecodeTable[27]
InitPointer($data)
while not $DataEnded
    $a = ReadThree($data)
    $d = $LastDecode
    $b = ReadThree($data)
    $d = $d & '-' & $LastDecode
    $c = ReadThree($data)
    $d = $d & '-' & $LastDecode
    $index = $a * 9 + $b * 3 + $c
    $DecodeTable[$index] = $d
    if $ShowDecode then ConsoleWrite($LastDecode & '(' & $index & ') ')
    $FreqTable[$index] = $FreqTable[$index] + 1
wend
if $ShowDecode then ConsoleWrite(@CRLF)
for $i = 0 to 26
    ConsoleWrite($DecodeTable[$i] & ' : ' & $FreqTable[$i] & @CRLF)
next
_ArraySort($FreqTable, 1)
ConsoleWrite('Freq: ')
DumpArray($FreqTable)
$Original = RunTableOriginal($MapTable2a, 'Table2')
ConsoleWrite('Original: ' & $Original & @CRLF)
RunTable($MapTable1, 'Table1')
RunTable($MapTable2a, 'Table2a')
RunTable($MapTable2b, 'Table2b')
RunTable($MapTable3, 'Table3')
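For readers without AutoIt, the core of the experiment in Diagrams 22a/22b can be condensed into a few lines of Python (a sketch, not part of the listings): read a random bit stream as Processing Units of three 3-value Code Units of 0 Head Design and tally the frequencies of the 27 unique code values.

```python
import random
from collections import Counter

random.seed(0)
bits = [random.randint(0, 1) for _ in range(80000)]  # a random data set of 80,000 bits

def read_unit(bits, i):
    # 0 Head Design: '0' -> value 0 (1 bit); '10' -> 1, '11' -> 2 (2 bits)
    if bits[i] == 0:
        return 0, i + 1
    return (1, i + 2) if bits[i + 1] == 0 else (2, i + 2)

freq = Counter()
i = 0
while i + 6 <= len(bits):   # one PU needs at most 6 bits; the tail is left un-encoded
    a, i = read_unit(bits, i)
    b, i = read_unit(bits, i)
    c, i = read_unit(bits, i)
    freq[a * 9 + b * 3 + c] += 1

print(len(freq), freq.most_common(1))   # all 27 values occur; the all-'0' PU dominates
```

As in Diagram 21, the shortest PU value (the all-'0' combination, index 0) is by far the most frequent, since it carries probability 1/8 against 1/64 for the longest combinations.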

Diagram 22b

autoit programme: helper.au3 for main27values.au3 of Diagram 22a

global $CurrentDataIndex
global $CurrentBitPos
global $DataEnded
global $DataSize
global $Mask[8] = [128, 64, 32, 16, 8, 4, 2, 1]
global $BitLen = 8
global $MaxNum = 256

; read the next bit from the byte array, most significant bit first
func ReadOneBit(byref $data)
    if $DataEnded then return 0
    if BitAND($data[$CurrentDataIndex], $Mask[$CurrentBitPos]) <> 0 then
        $r = 1
    else
        $r = 0
    endif
    $CurrentBitPos = $CurrentBitPos + 1
    if $CurrentBitPos >= $BitLen then
        $CurrentBitPos = 0
        $CurrentDataIndex = $CurrentDataIndex + 1
        if $CurrentDataIndex >= $DataSize then $DataEnded = true
    endif
    return $r
endfunc

func GenerateRandomData(byref $arr)
    for $i = 0 to UBound($arr) - 1
        $arr[$i] = Random(0, $MaxNum - 1, 1)
    next
endfunc

func DumpArray(byref $arr)
    for $i = 0 to UBound($arr) - 1
        ConsoleWrite($arr[$i] & ' ')
    next
    ConsoleWrite(@CRLF)
endfunc

func ZeroArray(byref $arr)
    for $i = 0 to UBound($arr) - 1
        $arr[$i] = 0
    next
endfunc

func InitPointer(byref $data)
    $CurrentDataIndex = 0
    $CurrentBitPos = 0
    $DataEnded = false
    $DataSize = UBound($data)
endfunc

func DumpBinaryData($data)
    InitPointer($data)
    $pos = 0
    while not $DataEnded
        ConsoleWrite(ReadOneBit($data))
        $pos = $pos + 1
        if $pos = 8 then
            ConsoleWrite(' ')
            $pos = 0
        endif
    wend
    ConsoleWrite(@CRLF)
endfunc

; number of bits used by the address of $value (1-based) among $range
; absolute-address-branching addresses
Func AABBits1Base($range, $value)
    if $value < 1 then
        ConsoleWrite('Invalid input 0')
        exit
    endif
    if $value > $range then
        ConsoleWrite('Invalid range')
        exit
    endif
    Local $max = 1
    local $bits = 0
    while $max <= $range
        $max = $max * 2
        $bits = $bits + 1
    WEnd
    $max = $max / 2
    $bits = $bits - 1
    local $sub = $range - $max
    If $value <= $max - $sub Then
        Return $bits
    Else
        Return $bits + 1
    EndIf
EndFunc

Func AABBits0Base($range, $value)
    return AABBits1Base($range + 1, $value + 1)
EndFunc
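The AABBits1Base routine above computes how many bits the absolute-address-branching address of a given value occupies. A direct Python transcription (a sketch for verification, not part of the listing) confirms that, for a range of 27 values, it yields the five 4-bit basic addresses and twenty-two 5-bit single-branched addresses of Paragraph [72], 130 bits in total:

```python
def aab_bits_1base(rng, value):
    # bits used by the address of `value` (1-based) among `rng` AAB addresses
    assert 1 <= value <= rng
    mx, bits = 1, 0
    while mx <= rng:          # find the smallest power of two above rng
        mx *= 2
        bits += 1
    mx //= 2                  # largest power of two not above rng
    bits -= 1
    sub = rng - mx            # addresses that must branch one bit deeper
    return bits if value <= mx - sub else bits + 1

def aab_bits_0base(rng, value):
    return aab_bits_1base(rng + 1, value + 1)

sizes = [aab_bits_1base(27, v) for v in range(1, 28)]
print(sizes.count(4), sizes.count(5), sum(sizes))   # prints: 5 22 130
```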

[75] By running another pass of the autoit programmes in Diagram 22, the result is generated and listed in Diagram 23:

Diagram 23

Result generated by autoit programmes in Diagram 22

0-0-0 : 2273
0-0-10 : 1175
0-0-11 : 1149
0-10-0 : 1123
0-10-10 : 531
0-10-11 : 593
0-11-0 : 1060
0-11-10 : 548
0-11-11 : 542
10-0-0 : 1045
10-0-10 : 542
10-0-11 : 576
10-10-0 : 551
10-10-10 : 276
10-10-11 : 288
10-11-0 : 559
10-11-10 : 266
10-11-11 : 294
11-0-0 : 1072
11-0-10 : 508
11-0-11 : 561
11-10-0 : 540
11-10-10 : 277
11-10-11 : 279
11-11-0 : 591
11-11-10 : 262
11-11-11 : 304

Freq: 2273 1175 1149 1123 1072 1060 1045 593 591 576 561 559 551 548 542 542 540 531 508 304 294 288 279 277 276 266 262

Original: 80001

Table1 size: 82133

Table2a size: 80001

Table2b size: 84373

Table3 size: 81444

It could be seen that the frequency distribution of the 27 unique code values of the Processing Unit of three 3-value Code Units being used in the present example is similar in proportion to the one listed in Diagram 21. Table1 size is the bit size resulting from cross mapping between the 27 unique code values of Diagram 23 sorted by bit usage (such a sorting of the 27 unique code values is listed in Diagram 25 in Paragraph [78] below) and the 27 unique codes found in the Classified Absolute Address Branching Code Table (CAABCT) For 27 Values used in Example I as listed in Diagram 19 of Paragraph [73]. Table2a size in Diagram 23 is the result of a cross mapping using another CAABCT, as found in Example II below in Diagram 24:

Diagram 24

Classified Absolute Address Branching Code Table (CAABCT) For 27 Values

Example II

Sorted PU Bit Representation in CHAN CODE, including Class Bit, Normal Bits and Branch Bit

It could be seen that, as Example II above uses the same bit usage for its 27 unique table codes as that of the 27 unique code values (sorted by bit usage) produced out of a data set read using the 3-value Code Unit 0 Head Design (being value 1 = 0, value 2 = 10 and value 3 = 11) for the three Code Units making up a Processing Unit, the bit usage result is the same as that of the instance of the original random data set generated and listed in Diagram 23. The random data set generated and listed in Diagram 23, read using the aforesaid three 3-value Code Units of 0 Head Design, takes up 80,001 bits, and the bit usage of encoding that random data set using cross mapping with the table codes of the CAABCT in Example II (Table2a in Diagram 23) is therefore the same: 80,001 bits. The CAABCT in Example II uses up 135 bits for all its 27 unique table codes at 1 count each; the CAABCT in Example I uses up only 130 bits. But because of the uneven frequency distribution of the 27 unique code values in a random data set (made uneven under the 0 Head Design of the 3-value Code Unit used in reading, rather than by the conventional way of reading data in a standard and uniform bit size designed for each value of a Code Unit), the CAABCT in Example I (using 130 bits for an even frequency distribution of 1 count each of its 27 unique table codes) however uses more bits than the CAABCT in Example II (using 135 bits for an even frequency distribution of 1 count each of its 27 unique table codes) when encoding the original random data set of 80,001 bits, producing an encoded code using 82,133 bits instead, as seen in Diagram 23: an expansion rather than a compression. The results for Table2a to Table3 (Table2a being the cross mapping result using CAABCT 2, Table2b using CAABCT 1, and Table3 using CAABCT 0, as discussed in Paragraphs [85] to [93] and Diagram 31) are listed in Diagram 23 and extracted below:

Table2a size: 80001

Table2b size: 84373

Table3 size: 81444

The above gives the bit usage results of the encoding after cross mapping the PU values (sorted by bit usage) read from the random data set with table codes of other bit usage patterns, as listed in the autoit program listing in Diagram 22a and reproduced below:

global $MapTable1[27] = [4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5]
global $MapTable2a[27] = [3, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6]
global $MapTable2b[27] = [3, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 4, 4, 4, 4, 4, 4, 6, 6, 4, 4, 4, 4, 6, 6, 6, 6]
global $MapTable3[27] = [3, 3, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6]

It could therefore be seen that it seems difficult to design a code table with 27 unique table codes that, when used in encoding the 27 unique code values read from a random data set, could achieve any saving or compression in bit usage. So in order to compress a random data set, additional techniques have to be designed, developed and used for such a purpose. One such design uses Super Processing Units together with specially designed Code Tables for mapping or encoding, as revealed below.
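Why none of the listed map tables compresses random data can be sketched analytically in Python: weight each table's code sizes by the theoretical probabilities of the 27 PU values under the 0 Head Design ('0' with probability 1/2, '10' and '11' each with 1/4), zipped against the descending-sorted frequencies just as RunTable does. The table names follow Diagram 22a; the figures are expected values, not the patent's measured ones.

```python
from itertools import product

unit = {'0': 0.5, '10': 0.25, '11': 0.25}   # code : probability in a random bit stream
# probabilities of the 27 PU values, ranked by descending frequency
probs = sorted((unit[a] * unit[b] * unit[c]
                for a, b, c in product(unit, repeat=3)), reverse=True)
# expected bits per PU of the raw reading itself
original = sum((len(a) + len(b) + len(c)) * unit[a] * unit[b] * unit[c]
               for a, b, c in product(unit, repeat=3))
tables = {
    'Table1':  [4] * 5 + [5] * 22,
    'Table2a': [3] + [4] * 6 + [5] * 12 + [6] * 8,
    'Table2b': [3, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 4, 4, 4, 4, 4, 4, 6, 6, 4, 4, 4, 4, 6, 6, 6, 6],
    'Table3':  [3] * 2 + [4] * 6 + [5] * 5 + [6] * 14,
}
expected = {name: sum(s * p for s, p in zip(sizes, probs))
            for name, sizes in tables.items()}
print(original, expected)   # Table2a matches the original 4.5 bits; none is smaller
```

Scaling these expectations by 80,000 / 4.5 Processing Units gives roughly the Table sizes printed in Diagram 23: about 82,200 for Table1, 84,400 for Table2b and 81,700 for Table3, with Table2a exactly breaking even.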

[76] The concept of using Super Processing Units arises from the understanding that a random data set could in fact be considered as being made up of a number of uneven data sub-sections. If a random data set could not be compressed on account of the nature of the frequency distribution of its data code values, one could perhaps divide the whole digital data input file of random data into sub-sections of uneven data, called Super Processing Units here, so that techniques which capitalize on uneven data distribution for making compression possible could be applied to such individual Super Processing Units one by one, making compression possible for the whole digital data input of random distribution. This is the Divide and Conquer strategy.

[77] So the whole digital data input file could be regarded as a Huge Processing Unit consisting of all the Processing Units of the digital data at a certain point on the data distribution spectrum, from random or even to wholly uneven at either extreme. In the field of Compression Science, compression techniques are made possible by taking advantage of the uneven nature of a data set, and a data set of random distribution has so far been considered incompressible. So a single Huge Processing Unit consisting of Processing Units of random data as a whole could be divided into sub-sections consisting of a certain number of Processing Units, called Super Processing Units, so that techniques could be designed for compressing such data sub-sections. A Huge Processing Unit is thus defined as the whole unit consisting of all the data codes that are to be put into encoding and decoding, therefore excluding the Un-encoded Code Section, which is made up of data codes that are not subject to the process of encoding and decoding, for instance because they do not make up the size of one Processing Unit or one Super Processing Unit where appropriate. A Huge Processing Unit could be divided into a number of Super Processing Units for encoding and decoding for the sake of a certain purpose, such as compressing random data through such division, or other purposes. The encoding and decoding of data for a Super Processing Unit may require some adjustment to the encoding and decoding techniques or processes that are used for a Processing Unit. Therefore, a Super Processing Unit is a unit of data consisting of one or more Processing Units subject to some coding adjustment relative to the encoding and decoding made for a Processing Unit.

[78] In order to understand how Super Processing Units are used for the purpose of compressing random data, the 27 unique code values of a Processing Unit made up of three 3-value Code Units of 0 Head Design in Diagram 21 are first sorted and listed in Diagram 25 below:

Diagram 25

An instance of Sorted/Unsorted Frequency Distribution of the 27 unique Code Values of a Processing Unit made up of three 3-value Code Units read from a random data set of 80,000 bits using the Code Unit Definition in Diagram 20

Column headings: Processing Unit Code Values, Bits used, Frequency, Code (sorted/unsorted), Sorted/Unsorted Sequence

It should be noted that the sorted ranked values of the 27 unique code values in Diagram 25 above could be divided into 4 groups in terms of bit usage: Group 1 of 1 code value of 3 bits, Group 2 of 6 code values of 4 bits, Group 3 of 12 code values of 5 bits and Group 4 of 8 code values of 6 bits. The ranking of each of the code values within a particular group may vary slightly from one random data set to another because of the slight variation in the frequency distribution of random data generated from time to time. But code values will not move from one group to another in terms of their frequencies between one instance of random data and another. If they do change so wildly, the data distribution is not random at all.

[79] Since random data has such similar frequency distributions of data code values, different versions of CAABCT could be designed for cross mapping with them, and the resulting bit usage after encoding using such CAABCTs has been mentioned and shown in Diagram 23. For cross mapping of table codes of a CAABCT with data code values of a particular schema and design of Processing Unit and Code Unit under CHAN FRAMEWORK for processing a particular data set of a certain data distribution, the frequency ranking of the 27 unique data code values under concern may be different from that of a random data set. So the order of such 27 unique code value frequencies must be known, so that the cross mapping of the table codes of the CAABCT and the 27 unique code values could be designed to attempt the best result for compression. Such an order of the unique code value frequencies should be obtained by parsing the data set under concern first, and such information has to be made available to the encoder and decoder for their use in processing. Such information could vary from one data set to another, so it could be included in the Header of the encoded code for use later by the decoder for correct recovery of the original data set. This holds for a data set in random distribution as well, for the assignment of the cross mapping of data code values and table code values for encoding and decoding. However, if slight variation of the frequency ranking of the code values within a group for a random data set is considered acceptable, such information could be spared from the Header. Nonetheless, an indicator of which CAABCT is to be used (or the content of the CAABCT as a whole) for processing still has to be retained in the Header, or made available to or built into the encoder and decoder where appropriate. CAABCT is used here because the AAB technique is used in designing the mapping code table for the 27 unique code values of the Processing Unit using three 3-value Code Units of 0 Head Design. Other mapping code tables not using the AAB technique could also be designed for use where appropriate. So the mention of CAABCT in the present case of discussion applies to the use of mapping code tables in encoding and decoding in general.

[80] It is time to see how CAABCT is used in encoding Super Processing Units for making compression for a random data set.
Since the use of Super Processing Units is for the purpose of breaking a random data set into sub-sections of data, Super Processing Units are therefore designed to have a data distribution different from that of a random data set, so that techniques for compressing uneven data could be used for making compression. For example, Super Processing Units that have a number of Processing Units equal to or less than a full set of Processing Units (in the present example, 27 unique entries of data values) are guaranteed to have an uneven data distribution. However, this does not mean that all uneven sub-sections of data are compressible, since any compression technique or mapping code table that is useful for compressing data of a certain data distribution may not suit data of another, different data distribution. This means that more than one compression technique or mapping code table has to be used in compressing Super Processing Units of different data distributions.

[81] In adopting this approach and technique of dividing a random data set into sub-sections of data in the form of Super Processing Units, the first attempt is to classify the Super Processing Units into two groups, using one CAABCT for encoding and decoding the Processing Units of the Super Processing Units of one group and another CAABCT for the other group. In such a way, a one-bit indicator of which CAABCT is used for either of the two groups has to be used for each Super Processing Unit. So additional bit usage has to be incurred for making compression using this approach. And the encoding implemented using the relevant CAABCTs for making compression should result in bit usage savings that are more than the bit usage that has to be incurred by using the CAABCT bit indicator for each Super Processing Unit, in addition to other additional information such as that contained in the Header. This is a very challenging task ahead.

[82] The techniques so far suggested to be used for this purpose are:

(a) use of Super Processing Units;

(b) dividing Super Processing Units into groups, two in the present case here first; and

(c) use of CAABCT (two for the present case) for cross mapping between unique data code values and unique table code values (of the two groups of Super Processing Units) and use of CAABCT indicator.

Questions arise as to the size into which the Super Processing Units are to be sub-divided for use, how the Super Processing Units are to be grouped or classified, and what CAABCTs are to be used.

[83] Answers to these questions have to be found in the example used in the above case of discussion in Diagram 25, in which a Processing Unit is designed as comprising three 3-value Code Units of the 0 Head Design, having therefore 27 unique code values sorted in frequency ranking for a random data set of around 80,000 bits. Diagram 23 shows that the two CAABCTs used for discussion in Paragraphs [73] to [75] could not make compression possible for a random data set. This is where subdividing the random data set into Super Processing Units for encoding is suggested as a solution. Therefore, the Super Processing Units have to be divided in such a way that each of them does not have a random data distribution. So Super Processing Units having a fixed size, with a certain number of Processing Units that guarantees that the data distribution within each Super Processing Unit is not random, could be used. The discussion in Paragraphs [73] to [75] suggests that it is certain that a Super Processing Unit made up of 27 Processing Units or fewer should meet this criterion, as 27 unique code values, whether all present or not, do not constitute a random data set of Code Units designed in the conventional sense using fixed bit sizes. A random data set of Code Units designed in the conventional sense using fixed bit sizes, when read using the schema and design of a Processing Unit made up of three 3-value Code Units of 0 Head Design, exhibits the characteristic data frequency distribution shown in Diagram 25 rather than the one count for each of the 27 unique values shown in Diagram 19. Using a fixed-size Super Processing Unit could be one way of subdivision. So for the time being, a Super Processing Unit is first considered as having the size of 27 Processing Units.

[84] A Super Processing Unit having the size of 27 Processing Units does not guarantee that each of the 27 unique code values will be present in each of the Super Processing Units so divided. Each of these Super Processing Units may have a different data distribution: some having all the 27 unique code values present, some having some unique code values absent while other unique code values occur more than once, all in different ways. So for simplicity, the example here divides such different data patterns into two groups for encoding and decoding purposes. The statistics of bit usage and frequency distribution of a random data set of 80,000 bits in Diagram 25 are refined in Diagram 26 as follows:

Diagram 26

An instance of Statistics of the 27 unique Code Values of a Processing Unit made up of three 3-value Code Units read from a random data set of 80,003 bits using the Code Unit Definition in Diagram 20

Processing Unit

[85] From Diagram 26 above, it can be seen that the 27 unique values could be divided into 4 categories in terms of bit usage or frequency characteristics. If Categories 1 and 2 form one group (Group 0), it takes up about 49.7% in terms of frequency counts, while Categories 3 and 4 form another group (Group 1), taking up about 50.2%. The frequency counts of these two groups are roughly the same. So by dividing a random data set into Super Processing Units (of 27 Processing Units each) with uneven data distribution, it is expected that some Super Processing Units will have more unique code values coming from Group 0 and others more from Group 1. So one mapping code table (CAABCT 0) could be designed to give less bit usage to the unique table code values cross mapped to the unique data code values in Group 0 than to those in Group 1, and another mapping code table (CAABCT 1) to give less bit usage to the unique table code values cross mapped to the unique data code values in Group 1 than to those in Group 0. Then those Super Processing Units with more unique data code values from Group 0 will benefit from using CAABCT 0 for encoding for compression purposes, and those Super Processing Units with more from Group 1 will benefit from using CAABCT 1 for the same purpose. However, there is an issue about the additional expenditure on the use of one indicator bit for indicating which mapping code table is used for each Super Processing Unit. On the other hand, Group 0 Super Processing Units (i.e. those Super Processing Units having more data values from Group 0) will sometimes have more than 1 entry more of data values from Group 0 than from Group 1, and this is also true for Group 1 Super Processing Units. So this additional expenditure of the mapping table indicator bit for every Super Processing Unit may still have a chance to be offset by the aforementioned pattern of code value occurrence. What is more, some other techniques could be used to help produce more bit usage savings for this encoding and decoding technique of using Super Processing Units and Mapping Code Tables.
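A toy simulation in Python illustrates this two-group idea (the grouping rule, sampling scheme and counts here are assumptions drawn from the four categories of Diagram 26, not figures from the patent): each Super Processing Unit of 27 Processing Units is assigned the CAABCT favouring whichever group dominates it.

```python
import random

random.seed(1)

def sample_pu_bits():
    # bit length of one PU of three 3-value units read from random bits:
    # each unit is '0' (1 bit) with prob 1/2, else '10'/'11' (2 bits)
    return sum(1 if random.random() < 0.5 else 2 for _ in range(3))

def choose_table(spu):
    # Group 0 = PU values of 3 or 4 bits (Categories 1-2), Group 1 = 5 or 6 bits;
    # pick the CAABCT favouring whichever group dominates this SPU
    g0 = sum(1 for n in spu if n <= 4)
    return 0 if g0 > len(spu) - g0 else 1

spus = [[sample_pu_bits() for _ in range(27)] for _ in range(100)]
choices = [choose_table(s) for s in spus]
print(choices.count(0), choices.count(1))   # both tables get used across SPUs
```

Since each PU falls in Group 0 with probability about 1/2, the per-SPU majority swings both ways, so both tables are indeed selected and one indicator bit per Super Processing Unit (or an AI criterion replacing it, as discussed in [86]) is needed to record the choice.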

[86] One of these other techniques is the use of an Artificial Intelligence (AI) technique for dispensing with the use of the Mapping Code Table Bit Indicator for every Super Processing Unit in encoding and decoding. The Artificial Intelligence technique used here consists in setting up AI criteria by which to distinguish, from the content of each of the encoded Super Processing Units, which of the two mapping code tables was used for encoding the corresponding Super Processing Unit. In this way, if the AI criteria are set up appropriately, the Mapping Code Table Bit Indicator could be dispensed with. Below are some suggestions about the AI criteria that could be used:

(a) a code value (Identifying Code Value) present in the Super Processing Unit that could be used for identifying which CAABCT is used for its encoding; so taking into account of this criterion, such a code value should be encoded in different table code values by the two different CAABCTs; and this requirement has to be catered for during the stage of designing the two CAABCTs under concern; also because of this requirement, the Terminating Condition or Criterion for stopping the use of one mapping code table for determining the size of the Super Processing Unit for encoding is to be changed; for instance at first it is taken that the size of the Super Processing Unit is to be 27 processing unit code values, and as the Identifying Code Value may not always be found amongst the 27 code values present in a Super Processing Unit, so the Terminating Condition has to be modified to: either

(i) using the Identifying Code Value as the Terminating Value, whereby the code values before it, together with itself, are to be encoded using one CAABCT; after this Identifying Code Value, a new assessment of which CAABCT is to be used for encoding is made for the code values ahead, up to and including the next Identifying Code Value; using this technique, the last section of the original code values without the Identifying Code Value could also be assessed and encoded using either of the two CAABCTs; however, in such a case, an Identifying Code Value has to be added to the end of this section after encoding, and an indicator about whether this last Identifying Code Value is one that has been added or is part of the original code values has to be added in the Header, so that upon decoding the Identifying Code Value is either decoded or removed from the recovered codes; and then following this section, there may be the Un-encoded Code Section, containing code bit(s) that do not make up one processing unit code value, if there are any; or

(ii) using the Super Processing Unit containing the Identifying Code Value as the Terminating Condition; this means that if, from the head of the digital data input, only the third Super Processing Unit (of 27 code values in this case) contains the Identifying Code Value, then all the code values of the first 3 Super Processing Units are to be encoded using one CAABCT upon assessment, and a new assessment is to be made about which CAABCT to use for encoding the code values ahead, up to and including all the values of the next Super Processing Unit containing the Identifying Code Value; and at the end, the last section of the original code values not making up one Super Processing Unit, with or without the Identifying Code Value, could be processed in a way like that in (i) above or just left as included in the Un-encoded Code Section;

(b) unsuccessful decoding; code values encoded using one CAABCT sometimes may not be successfully decoded using the other CAABCT; so the encoded code values have to be decoded using both CAABCTs, and as one CAABCT must be the one selected for encoding, the decoding process using it should be successful; this may not be the case when decoding using the other CAABCT, which was not used for encoding;

(c) shorter encoded code; because the encoding is used for the purpose of making compression, the CAABCT that produces the shorter encoded code will certainly be selected; so upon decoding, using the CAABCT that was not used for encoding will certainly produce encoded code values that, as a whole, are longer in bit usage;

(d) unsuccessful re-encoding; upon decoding using two different CAABCTs, two different sets of decoded codes are produced; these two sets of decoded codes are to be encoded again using the two CAABCTs interchangeably, and sometimes re-encoding using the CAABCT other than the one chosen may not be successful; this is so especially when code values in trios using different Head Designs are employed in the two different CAABCTs; for instance, one CAABCT using 0 Head Design such as:

0

10

11

as suffix to the code values of the trios, and another CAABCT using 1 Head Design such as:

1

01

00

as suffix to the code values of the trios (this point will be explained later); further evidence of unsuccessful re-encoding is that the last bits of the decoded code upon re-encoding do not form one code value, and that, when using Super Processing Units of fixed size, the re-encoded code values do not make up the designed fixed size, with either more encoded code values or fewer than the fixed size of the Super Processing Unit being produced upon re-encoding;

(e) an additional bit to be added after the encoded code where necessary; there is a very rare chance that, after assessing with the above AI criteria, it is still not possible to identify the CAABCT chosen for encoding, in which case an additional bit has to be added to the end of the section or unit of encoded code for making such a distinction; however the chance of this seems very rare; this additional bit, if necessary, is only provided as an exception handling technique and a safe escape from incorrect distinction in ambivalent cases where all the above AI criteria could not provide a clear-cut answer, and it may never be required to be implemented and thus may not actually use up bit storage; in view of this, such AI assessment should be done during the encoding process as well, after each section or unit of encoding is finished; and

(f) other criteria where found to be appropriate and valid for use.
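By way of a non-limiting illustration, criterion (b) above can be sketched in a few lines of Python. The two tables below are simple stand-ins using the 0 Head Design and 1 Head Design trio suffixes already described, not the full 27-value CAABCTs of Diagrams 27 and 28; the helper names encode, try_decode and identify_table are hypothetical.

```python
# Sketch of AI criterion (b) "unsuccessful decoding" from Paragraph [86].
# TABLE_A / TABLE_B are illustrative 3-value trio tables, not the CAABCTs.

TABLE_A = {"v1": "0", "v2": "10", "v3": "11"}   # 0 Head Design trio
TABLE_B = {"v1": "1", "v2": "01", "v3": "00"}   # 1 Head Design trio

def encode(values, table):
    """Concatenate the table codes of the given values into one bit string."""
    return "".join(table[v] for v in values)

def try_decode(bits, table):
    """Decode a prefix-free bit string; return the values, or None on failure."""
    rev = {code: v for v, code in table.items()}
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in rev:
            out.append(rev[buf])
            buf = ""
    return out if buf == "" else None   # leftover bits: unsuccessful decoding

def identify_table(bits):
    """Criterion (b): a failed decode rules that table out; when both decodes
    succeed, the further criteria (c) to (e) would have to be applied."""
    a = try_decode(bits, TABLE_A)
    b = try_decode(bits, TABLE_B)
    if a is not None and b is None:
        return "A"
    if b is not None and a is None:
        return "B"
    return "ambiguous"
```

For instance, the single value v2 encoded with TABLE_B yields the bits 01, which TABLE_A cannot decode without leftover bits, so the table used is identified without any indicator bit.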

To put the above revelation in concrete terms, two such CAABCTs are designed for further elaboration in Diagrams 27 and 28:

Diagram 27

Classified Absolute Address Branching Code Table (CAABCT 0) For 27 Values

Example III

Bit Representation in CHAN CODE, including Group No.,

Normal Bits and Branch Bit/Suffix

Diagram 28

Classified Absolute Address Branching Code Table (CAABCT 1) For 27 Values

Example IV

PU Bit Representation in CHAN CODE, including Group No.,

Normal Bits and Branch Bit/Suffix

It could be seen from the above two CAABCTs that the grouping of table code values is adjusted a little, so as to group the first 11 table codes together into Group 0 and the remaining into Group 1. This adjustment of grouping is a result of the need for easier code arrangement. Those table code values in brackets are code values in trio. In CAABCT 0, there is only 1 trio whereas in CAABCT 1, there are 6. In CAABCT 0, the trio is in 0 Head Design, that is, having suffix in the form of:

0

10

11

whereas in CAABCT 1, the trios are in 1 Head Design, having suffix in the form of:

1

01

00

The use of suffixes of different designs is meant for AI distinction as discussed in Paragraph [86](d). The suffix design is another usage that results from using the AAB technique.

[89] Diagram 29 below gives a consolidated view of the cross mapping of the 27 unique data code values of the Processing Unit of three 3-value Code Units, sorted according to bit size, with the table code values of CAABCT 0 and CAABCT 1:

Diagram 29

Cross Mapping between Data Code Values (Diagram 26) and Table Code Values of CAABCT 0 and CAABCT 1

[90] When comparing the statistics above, it is not absolutely clear whether a random data set could be compressed using the above two CAABCTs, as the mapping tables do not apply to the whole random data set but to appropriate Super Processing Units where either one of the CAABCTs is better in terms of bit usage. The pattern of distribution of data code values of a random data set into Super Processing Units of uneven data distribution is yet to be ascertained. Selecting which CAABCT to use for encoding any particular Super Processing Unit is based upon the actual encoding result, not upon counting the number of data code values of Group 0 and Group 1 found, as the bit usage results produced by actually carrying out the respective encoding are a more accurate indicator of which CAABCT is best for use for a particular Super Processing Unit.

[91] There could be enhancements to the above technique, for instance by using one CAABCT which has exactly the same bit usage distribution for all the 27 unique table code values as that of the 27 unique data code values characteristic of a random data set. One such CAABCT is CAABCT 1. As it stands, CAABCT 1 is distributed in an order which is optimized for creating differences in cross mapping, in the hope of capitalizing on such differences for making compression saving in bit usage. CAABCT 1 could be redistributed for cross mapping purposes as follows in Diagram 30:

Diagram 30

Cross Mapping between Data Code Values (Diagram 26) and Table Code Values of CAABCT 2 and CAABCT 1

[92] CAABCT 2 is exactly the same as CAABCT 1 except that:

(a) CAABCT 2 uses the 0 Head Design for its 6 trios (i.e. 0, 10, 11) whereas CAABCT 1 uses the 1 Head Design (i.e. 1, 01, 00) for the respective suffixes to their trios;

(b) when used in cross mapping, unique table code values of CAABCT 2 are mapped to unique data code values of the random data set with exactly the same bit size, i.e. a 3-bit table code value is mapped to a 3-bit data code value, 4 bits to 4 bits, 5 bits to 5 bits and 6 bits to 6 bits; mapping in such a way results in the same bit usage after encoding, with neither compression nor expansion of the data size of the random data set.

[93] For encoding and decoding for compressing a random data set, the technique introduced in using Super Processing Units could be slightly adjusted as follows. Firstly, when a random data set is given for encoding, CAABCT 2 is used to cross map the data code values of the random data set for encoding; i.e. the random data set is read using the definition of the 3-value Code Unit of 0 Head Design one by one, and three such consecutive Read Units form one Processing Unit for use with CAABCT 2 as a mapping table for encoding. The Processing Units are then encoded one by one, as are the Super Processing Units as described above. This is a cross-the-board translation of all the random data set except the few bits left in the Un-encoded Code Section which do not make up the size of 1 Processing Unit for encoding, resulting in translated code in accordance with CAABCT 2. For making compression of this translated data set (which is not random now after the cross-mapping translation using CAABCT 2), CAABCT 1 is then used with Super Processing Units sub-divided from the translated data set using the chosen Terminating Condition, such as the original code value of 000, which is now translated into 010, the corresponding table code value of CAABCT 2. So wherever the Super Processing Unit under processing is susceptible to encoding using CAABCT 1 to produce encoded code which is less in bit usage than the translated code of the same Super Processing Unit, it is encoded using CAABCT 1. As the Terminating Condition includes the encoded CAABCT table code value of 010, if CAABCT 1 is used to encode it, 010 is encoded into 011. So in the resultant encoded code after encoding using CAABCT 2 and CAABCT 1, the original data code value 000 is translated into 010 and then 011. If using CAABCT 1 could not reduce the size of the CAABCT 2 encoded code of the Super Processing Units, the CAABCT 2 encoded code values of those Super Processing Units are then left untouched.
So the original data code value 000 remains as 010. This could also be used as one of the AI criteria for distinguishing between CAABCT 2 code and CAABCT 1 code. As the suffixes of the 6 trios of CAABCT 2 are different from those of CAABCT 1, such suffix indicators could be used for AI distinction purposes as well. All the AI operations mentioned in Paragraph [86] could be used as well for distinguishing CAABCT 1 code from CAABCT 2 code. As there is no need to translate it back to the original data code values for AI distinction, the decoding process should be successful without question. So for decoding the encoded code after the cross-the-board mapping using CAABCT 2 and then the selective cross mapping using CAABCT 1, the AI techniques for making AI distinction of Super Processing Units containing CAABCT 1 code mentioned in Paragraph [86] could be used. And after such Super Processing Units containing CAABCT 1 code are identified, the corresponding CAABCT 1 code is then decoded back into CAABCT 2 code for those Super Processing Units just identified. After all CAABCT 1 code is translated back into CAABCT 2 code, cross-the-board decoding of CAABCT 2 code to the original data code values could be achieved using the code table of CAABCT 2. In this way, it could be asserted that whenever there are Super Processing Units having code values that are subject to compression by using CAABCT 1, the random data set containing such Super Processing Units could be compressed. Or one could use CAABCT 0 for cross mapping with CAABCT 2 instead of using CAABCT 1, or use CAABCT 0 and CAABCT 1 interchangeably for cross mapping with CAABCT 2 where appropriate; in these cases, the AI criteria may have to be duly adjusted or added to for determining the mapping code table used for any particular Super Processing Unit. Diagram 31 below shows the cross mapping that could be done using all 3 CAABCTs:

Diagram 31

Cross Mapping between Data Code Values (Diagram 26) and Table Code Values of CAABCT 2, CAABCT 0 and CAABCT 1

Processing Unit

Sorted Code
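A minimal sketch of this two-stage scheme follows: an across-the-board first pass, then a per-Super-Processing-Unit second pass that is kept only where it reduces bit usage. The tables below are hypothetical stand-ins for CAABCT 2 and CAABCT 1 (the text fixes only the example translation of 000 to 010, and of 010 to 011); the stage-2 recoding here is shortened artificially so that the "keep only if shorter" rule can fire, and the decoder's AI distinction of Paragraph [86] is not reproduced.

```python
# Sketch of the two-stage mapping of Paragraph [93]: a bijective same-length
# translation pass (CAABCT 2 style), then a selective per-Super-Processing-Unit
# recoding pass (CAABCT 1 style) kept only when it shortens the code.

STAGE1 = {"000": "010", "010": "000", "111": "111"}   # bijective, same lengths
STAGE2 = {"010": "01"}                                # a (toy) shorter recoding

def two_stage_encode(values, spu_size=2):
    translated = [STAGE1[v] for v in values]          # cross-the-board pass
    out = []
    for i in range(0, len(translated), spu_size):
        spu = translated[i:i + spu_size]
        candidate = [STAGE2.get(v, v) for v in spu]
        # keep the second-pass recoding only where it reduces bit usage,
        # otherwise leave the stage-1 code of this Super Processing Unit untouched
        if sum(map(len, candidate)) < sum(map(len, spu)):
            out.extend(candidate)
        else:
            out.extend(spu)
    return out
```

Units whose stage-2 candidate is not shorter keep their stage-1 code, mirroring the "left untouched" rule above.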

And since the Terminating Condition for dividing Super Processing Units could be adjusted or fine-tuned, such as changing the fixed size of the Super Processing Unit from 27 data code values to something less or something more, or changing the Terminating Value used from 000 to another code value, or using just a Terminating Value for determining the size of the Super Processing Unit (in this way, the Super Processing Units could be of varying sizes) instead of using a Terminating Value with a fixed-size Super Processing Unit; or one could attempt other sizes of Code Units (for instance using 6-value Code Units and the sets of CAABCTs designed for them) or other sizes of Processing Units as well, such as using four 6-value Code Units instead of three 6-value Code Units; there could be endless such variations under CHAN FRAMEWORK, therefore one cannot be certain that a random data set could never be compressed. The opposite is more certain instead. By the way, the technique of changing the bit 0 : bit 1 ratio mentioned in Paragraphs [62] to [66] could be used first to change the frequency distribution of the random data set to an uneven data set, which is then amenable to compression by techniques capitalizing on uneven data distribution.
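The option of dividing by a Terminating Value alone, so that Super Processing Units vary in size, can be sketched as follows; the value strings and the helper name split_spus are illustrative only.

```python
# Sketch of dividing a stream of Processing Unit values into Super Processing
# Units by a Terminating Value alone (units then vary in size). "000" as the
# Terminating Value follows the example of Paragraph [93].

def split_spus(values, terminator="000"):
    spus, cur = [], []
    for v in values:
        cur.append(v)
        if v == terminator:       # the Terminating Value closes the unit
            spus.append(cur)
            cur = []
    return spus, cur              # cur: trailing values without a terminator
```

The trailing remainder without a terminator corresponds to the last section discussed in Paragraph [86](a)(i).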

[95] The above inventive revelation discloses many novel techniques for encoding and decoding digital data sets, whether random or not, for both the purposes of encryption/decryption and compression/decompression. Such techniques could be combined to achieve such purposes as intended by the designer, implementer and user. Other techniques could also be designed and implemented for use, utilizing the structural traits and coding techniques introduced here under CHAN FRAMEWORK. Before closing the revelation, there is another technique, much simpler and also useful, that could be used as well, either alone or in combination with the techniques introduced above. This is encoding and decoding using dynamic Processing Units of different sizes, together with dynamic adjustment of Code Value Definition using Artificial Intelligence technique.

[96] In the above discussion of Super Processing Units, the use of the occurrence of just a Terminating Value for determining the size of a Super Processing Unit results in Super Processing Units varying in size, i.e. a different number of Processing Units making up a Super Processing Unit at different positions of the digital data set. One could also encode and decode a Code Value dynamically using different numbers of Code Units as a Processing Unit, by designing an appropriate Terminating Condition for such division for the purpose of encoding and decoding. In the course of encoding and decoding, there could also be dynamic adjustment to the size, and therefore the definition, of the code values under processing.

[97] Let one turn to revealing the technique mentioned in Paragraph [96], i.e. the technique of using different sizes of Processing Units (in the present case, Processing Units of 3 Code Units and of 4 Code Units are used for illustration) dynamically in the context of changing data distribution, by using the design of 3-value Code Units of 0 Head Design as listed in Diagram 32 below:

Diagram 32

Processing Unit of One 3-value Code Unit

So the Processing Unit is made up of either three or four 3-value Code Units of 0 Head Design, depending on the changing context of the data distribution at any point under processing. This novel technique simplifies the processing of encoding and decoding data of any type of data distribution, whether random or not.

Designing this technique is an outcome of using the concept of the Terminating Condition. The Terminating Condition conceived here is that the Termination Point of a Processing Unit (using the 3-value Code Unit of 0 Head Design here) is based on whether all 3 unique Code Values of the 3-value Code Unit have come up. By logical deduction, it is apparent that the Processing Unit should not be smaller than 3 Code Units for a 3-value Code Unit.

So if all 3 unique code values have come up in three consecutive occurrences, the Processing Unit size is 3 Code Units; if not, the size of the Processing Unit should be more than 3 Code Units. It could be 4 or 5 or 6 and so on and so forth. It is simpler, then, to use 4 Code Units as another Termination Point if the Termination Point of the Processing Unit under the context of the data distribution is not 3 Code Units. That means that when 3 consecutive code values are read using the definition of the 3-value Code Unit of 0 Head Design and not all 3 unique data code values (i.e. v1, v2 and v3 as listed in Diagram 32) are present, the Termination Point stops at the fourth code value read, so that this is a Processing Unit of 4 Code Units; whereas if all 3 unique data code values are present upon reading 3 consecutive code values, the Termination Point is at the third code value read and the Processing Unit is made up of 3 Code Units. So the size of the Processing Unit, measured in terms of the number of Code Units it is made up of, varies dynamically with the context of the data distribution of the digital data set under processing. According to the context of data distribution, if the Processing Unit should be of the size of 4 Code Units, then there are two scenarios for this:

(i) all the 3 unique code values are present; and

(ii) not all 3 unique code values are present.

So altogether there are 3 scenarios: (a) Processing Unit of 3 Code Units where all 3 unique code values are present;

(b) Processing Unit of 4 Code Units where all 3 unique code values are present; and

(c) Processing Unit of 4 Code Units where not all 3 unique code values are present.
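The dynamic division just described can be sketched as follows, a minimal illustration assuming the 0 Head Design trio (0, 10, 11) and the hypothetical helper names read_units and divide:

```python
# Sketch of Paragraphs [97]-[98]: read 3-value Code Units of 0 Head Design and
# cut a Processing Unit after 3 Code Units when all three unique values have
# appeared, otherwise after 4 Code Units, labelling each of the 3 scenarios.

CODES = {"0": "v1", "10": "v2", "11": "v3"}   # 0 Head Design trio

def read_units(bits):
    """Tokenize a bit string into 3-value Code Unit values (prefix-free)."""
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in CODES:
            out.append(CODES[buf])
            buf = ""
    return out, buf   # buf holds any Un-encoded Code Section remainder

def divide(units):
    """Split into Processing Units of 3 or 4 Code Units with scenario labels."""
    pus, i = [], 0
    while i + 3 <= len(units):
        first3 = units[i:i + 3]
        if len(set(first3)) == 3:
            pus.append(("a", first3))             # Scenario (a): 3 Code Units
            i += 3
        elif i + 4 <= len(units):
            pu = units[i:i + 4]
            scen = "b" if len(set(pu)) == 3 else "c"
            pus.append((scen, pu))                # Scenario (b) or (c)
            i += 4
        else:
            break                                 # tail: Un-encoded Code Section
    return pus, units[i:]
```

A unit of 4 Code Units carries all 3 unique values in Scenario (b) but not in Scenario (c), exactly as enumerated above.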

So one could assign Classification Codes to these 3 scenarios as listed out in Diagram 33a:

Diagram 33a

Scenario Classification Code (Part of CHAN CODE) for the 3 scenarios of Paragraph [98]

Depending on the frequency distribution of these 3 scenarios, the scenario that has the highest frequency could be adjusted to using the least number of binary bits. So, assuming Scenario (c) has the highest frequency, the assignment of scenarios to Scenario Classification Codes could be adjusted to Diagram 33b as listed below:

Diagram 33b

Scenario Classification Code (Part of CHAN CODE) for the 3 scenarios of Paragraph [98]

So for encoding and decoding the whole digital data input file, one could first parse the whole file, find out which scenario has the highest frequency, assign it to using the shortest Classification Code, and push the other scenarios downwards. A Scenario Design Indicator (indicating which Scenario Classification Schema or Design is to be used) then has to be included in the Header so that decoding could be done correctly.
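This parsing step can be sketched as follows, assuming the three scenario labels and an illustrative prefix-free code set of 0, 10 and 11 (the exact Diagram 33 assignments are not reproduced in this text):

```python
# Sketch of the frequency-driven Scenario Classification Code assignment:
# the most frequent scenario receives the shortest code.

from collections import Counter

def assign_classification_codes(scenarios):
    """scenarios: iterable of labels such as 'a', 'b', 'c' from a parse pass."""
    ranked = [s for s, _ in Counter(scenarios).most_common()]
    codes = ["0", "10", "11"]          # shortest code first (assumed code set)
    return dict(zip(ranked, codes))
```

The resulting mapping is what the Scenario Design Indicator in the Header would have to convey to the decoder.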

After discussing how the Classification Code of CHAN CODE could be used in the present example of illustration, it comes time to see how the Content Code of CHAN CODE could be designed and manipulated using another technique of CHAN CODING, i.e. dynamic code adjustment. For Scenario (a), one could use the following coding:

Scenario Classification Code + Rank and Position Code

Because Scenario (a) is a Processing Unit of 3 Code Units where all 3 unique code values are present, using 2 or 3 bits (using the AAB technique of CHAN CODING; for instance, here the actual value range is 6, the lower value range is 2 bits, equivalent to 4 values, and the upper value range is 3 bits, equivalent to 8 values) for the Rank and Position Code could be enough for covering all the 6 possible combinations of the 3 unique code values distinguished by their ranks and positions, as follows in Diagram 34:

Diagram 34

RP Code assignment for 6 possible combinations of 3 unique code values in terms of rank and position

To do CHAN CODING for the Content Code of CHAN CODE for Scenarios (b) and (c), one could use the value of the fourth data code value read, as determined by the Terminating Condition. This fourth data code value, the Terminating Value, is to be placed exactly as it is read using the definition of the 3-value Code Unit of 0 Head Design, without having to change any of its code. So the encoding for the Content Code part for Scenarios (b) and (c) each includes the following steps:

(a) reading four consecutive data code values coming in using the definition of the Code Unit under concern;

(b) writing the fourth data code value exactly as it is read;

(c) writing the first data code value using technique of code adjustment where appropriate;

(d) writing the second data code value using technique of code adjustment where appropriate;

(e) writing the third data code value using technique of code adjustment where appropriate; and

(f) looping back to Step (a) after finishing encoding the 4 consecutive code values read in Step (a), until it is up to the point where the Un-encoded Code Section begins;

and the technique of code adjustment mentioned in Steps (c) to (e) above includes content code rank and position coding, content code promotion and content code omission, content code demotion, and content code restoration where appropriate. For Scenario (b), the fourth data code value could be any one of the 3 data code values: v1, v2 or v3. So under Scenario (b), the sub-scenarios are:

Diagram 35

3 Sub-Scenarios of Scenario (b) and RP Coding

For each of the 3 sub-scenarios of Scenario (b), there are also 6 possible combinations. One could use the technique of rank and position coding for each of these sub-scenarios as shown in Diagram 35 using 2 or 3 bits for each of their respective 6 possible combinations.

Or one could use the technique of code promotion as well as code omission where appropriate, as follows in Diagram 36a:

Diagram 36a

3 Sub-Scenarios of Scenario (b) and Code Promotion & Code Omission

Or the values that are to be placed after placing the 4th code value could be re-arranged as in Diagram 36b as follows:

Diagram 36b

3 Sub-Scenarios of Scenario (b) and Code Promotion & Code Omission

If the technique of code promotion and code omission is to be used, the placement of code values in Diagram 36b may be preferred to that of Diagram 36a for the sake of consistency, as such a placement arrangement may be a better choice for Scenario (c), which is explained in Paragraph [103] below.

Taking sub-scenario (i) of Scenario (b) above, code promotion is a result of logical deduction used in order to reduce bit usage. For instance, since Scenario (b) is a scenario whereby the 3 unique Code Values must all appear in the 4 consecutive code values read, after placing the Scenario Classification Code as used in Diagram 33b and the fourth code value, for instance v1, the encoded code becomes as listed out in Diagram 37:

Diagram 37

Encoding for Scenario Classification and the 4th Code Value Seen

(b) 4th code value 3rd code value 2nd code value 1st code value

v1

10 0

and the encoded code for the remaining 3 code values is to be filled out. Since it is Scenario (b), the first 3 code values, i.e. the first to the third, must be different from the 4th code value, as it is the 4th code value that makes the Processing Unit meet the Terminating Condition designed for Scenario (b). So the remaining 3 code values are either v2 or v3. And since there are only 2 choices, it only requires 1 bit, either bit 0 or bit 1, to represent these 2 different value occurrences. Originally v2 and v3 are represented by 10 and 11 respectively. These code values are then promoted to using 0 and 1 respectively for saving bit usage. This is the technique of code promotion, a technique of CHAN CODING. And if the third and the second code values are both v2, then the first one must be v3, as it is so defined for Scenario (b); otherwise it could not meet Scenario (b)'s Terminating Condition. So v3 could be omitted by logical deduction because of the above reasoning. The whole piece of encoded code using the techniques of Code Promotion and Code Omission of CHAN CODING for the 4 Code Unit Processing Unit just mentioned is therefore represented in Diagram 38 as follows:

Diagram 38

Encoding using Code Promotion & Code Omission of CHAN CODING

It could be observed that using the code promotion and code omission technique gives the same bit usage result (2 * 2 bits + 4 * 3 bits, as listed out in Diagrams 36a and 36b) as that using the rank and position coding technique in Diagram 35; these two techniques differ only in the resulting bit pattern arrangement.
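The Code Promotion and Code Omission content coding for Scenario (b) can be sketched as below: the 4th value is written verbatim, then the 3rd, 2nd and (where not deducible) 1st values as promoted 1-bit codes, following the order of Diagram 37. The Scenario Classification Code that would precede this in the full scheme is omitted, and the 1-bit promotion assignment is an assumption.

```python
# Sketch of Code Promotion & Code Omission for a Scenario (b) Processing Unit
# of 4 Code Units: the first 3 values hold exactly the two uniques other than
# the 4th, so each needs only 1 bit (promotion); when the 3rd and 2nd values
# are equal, the 1st is deducible and omitted.

VALUE_BITS = {"v1": "0", "v2": "10", "v3": "11"}   # 0 Head Design trio

def encode_scenario_b(pu):
    """pu: 4 values with all of v1/v2/v3 present, the 4th completing the trio."""
    fourth = pu[3]
    others = sorted(v for v in ("v1", "v2", "v3") if v != fourth)
    promote = {others[0]: "0", others[1]: "1"}     # promotion to 1-bit codes
    bits = VALUE_BITS[fourth]                      # 4th value written verbatim
    third, second, first = pu[2], pu[1], pu[0]
    bits += promote[third] + promote[second]
    if third != second:
        bits += promote[first]                     # else omitted: first is forced
    return bits
```

For the 6 possible fillings of the first 3 positions, 2 yield 2 content bits and 4 yield 3, matching the 2 * 2 bits + 4 * 3 bits tally noted above.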

Likewise for Scenario (c), the fourth data code value could be any one of the 3 data code values: v1, v2 or v3. So under Scenario (c), the sub-scenarios are listed in Diagram 39:

Diagram 39

3 Sub-Scenarios of Scenario (c) and RP Coding

For each of the 3 sub-scenarios of Scenario (c), there are also 15 possible combinations. One could use the technique of rank and position coding for each of these sub-scenarios, as in Diagram 35, using 3 or 4 bits for each of their respective 15 possible combinations, as shown in Diagram 39 above.

Or one could use the technique of code promotion, code omission, as well as other forms of code adjustment where appropriate. Diagram 40 shows one way of code adjustment, by encoding the remaining 3 code values in the order of the 3rd code value, the 2nd code value and the 1st code value after placing the 4th code value first. Because in Scenario (c) only 2 unique code values are present, the 4th code value already counts as one; so the remaining 3 positions have to be filled up by the one same as the 4th code value and another one out of the remaining 2 unique code values. However, because there are then 2 choices out of 3 options, to eliminate uncertainty for reducing bit usage, the one other than the 4th code value had better be determined first. So for directly encoding the remaining 3 code values, encoding the code value of the 3rd position of the Processing Unit under processing in the incoming digital data input may be the preferred choice. This is based on the conventional assumption that the chance of having 2 or more of the same code values going one after another is lower than that of having 2 different code values. Of course, if there is information available about the pattern of frequency distribution amongst the 3 unique code values in the digital data input under processing, such placement choice could be adjusted where such information warrants the change. However, Diagram 40 adopts the placement arrangement in the order of the 4th, 1st, 2nd and 3rd positions for convenience.

Let it be assumed that the 4th code value is v3, so the values of the other 3 positions could be any of v1, v2 or v3. The earlier the other value present is known, the more bit saving could be achieved by using the technique of code promotion. But because either of the two code values present could be any of the 3 values v1, v2 and v3, and one of them could be v3, it is logical to promote v3, i.e. 11, to 0 first, and v1 is then demoted to v2's code and v2 to v3's code. And when the other code value turns up, the choices could be limited to the 2 unique code values that have already turned up. Since the fourth code value already takes the rank of v1, using the code value of bit 0, the second unique code value that turns up could take the code value of bit 10. Using this logic, Diagram 40 is produced as follows:

Diagram 40

3 Sub-Scenarios of Scenario (c) and Code Adjustment with Scenario Classification Code using 1 bit (bit 0)

It could be observed for Scenario (c) here that using the code promotion and code omission technique gives roughly the same bit usage result (using the code promotion technique here is apparently slightly better) as that using the rank and position coding technique.

From the above result, another observation is that those code value entries which after encoding result in expansion are those entries having more v1 code values. So if the data distribution of the data set has more bit 0 than bit 1, it would be better to use the 1 Head Design as the definition of the Code Unit for reading the digital data set for encoding using the techniques introduced above; the three unique code values then become:

1

01

00

In this way, bit 0 will be sampled upon reading into v2 and v3 only, instead of going into v1. So it is apparent that using the technique of making dynamic adjustment to the size of the Processing Unit, corresponding to the changing data distribution pattern of the digital data input as outlined above, allows more flexibility of dynamic code adjustment during encoding. What is more, during the data parsing stage, information could be collected for arranging and assigning the three Scenarios (a), (b) and (c) with Scenario Classification Codes, giving the most frequent scenario the least number of bits. And the use of the 0 Head Design or 1 Head Design of the Code Unit could also be selected in accordance with the frequency distribution of bit 0 and bit 1. And the technique of changing the ratio of bit 0 and bit 1 in the data set has been introduced in Paragraph [62] and onwards, and could be applied to the random data set when it is to be compressed together with the other techniques revealed above.
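The Head Design selection suggested above can be sketched as a simple frequency test; the dictionary values are the code value definitions already given, and the helper name pick_head_design is hypothetical.

```python
# Sketch of Head Design selection: when bit 0 outnumbers bit 1, read with the
# 1 Head Design trio (1, 01, 00) so that the frequent bit 0 is sampled into
# the 2-bit values v2/v3 rather than the 1-bit v1.

def pick_head_design(bits):
    zeros = bits.count("0")
    ones = len(bits) - zeros
    if zeros > ones:
        return {"1": "v1", "01": "v2", "00": "v3"}   # 1 Head Design
    return {"0": "v1", "10": "v2", "11": "v3"}       # 0 Head Design
```

The chosen design would be recorded in the Header so that the decoder reads the data set with the same Code Unit definition.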

Through examining the bit usage results of Scenarios (a) and (b) in Diagrams 34 and 36b, it is noted that where the Processing Unit has all 3 unique code values present, it is easier to make the encoding because of fewer patterns of varying data distribution, requiring less bit usage for representing those patterns, be it using RP Coding or the technique of code adjustment through code promotion and code omission. So the aforementioned design of using Processing Units of varying sizes, in terms of the number of Code Units used, could be further improved on by changing the Terminating Condition so that any Processing Unit used should contain all the 3 unique code values of v1, v2 and v3 (i.e. 0, 10 and 11).

So Scenario (c) discussed above has to be eliminated by replacing it with a Processing Unit of the size of 5 Code Units, or 6 Code Units, and so on and so forth, until all 3 unique code values of v1, v2 and v3 have come up and the Termination Point stops at the Code Unit which contains the last appearing unique code value of the trio v1, v2 and v3. The Scenario Classification Code therefore is changed to:

Diagram 41a

Scenario Classification Code (a) for Processing Units of varying sizes based on the appearance of the last unique code values

Diagram 41b

Scenario Classification Code (b) for Processing Units of varying sizes based on the appearance of the last unique code values

The above Scenario Classification Codes all end on bit 0, and there will not be any Scenario Classification Code ending on bit 1 if so designed. Otherwise, a Scenario Classification Code ending on bit 1 would be just similar to Scenario (c), which is fixed in the number of Code Units containing only one or two unique values up to that point; i.e. instead of Scenario (c) being 4 Code Units containing less than 3 unique values, it would be 5 Code Units containing less than 3 unique values, or 6 or 7, and so on and so forth, depending on the number of binary bits the Scenario Classification Code has.

So the bit usage diagrams for Scenarios (a) and (b) could be revised as follows:

Diagram 42

RP Code assignment for 6 possible combinations of 3 unique code values in terms of rank and position

Diagram 43

3 Sub-Scenarios of Scenario (b) and Code Promotion & Code Omission

It is noted from Diagram 42 that, because of the use of a single bit 0 for Scenario (a), now renamed as the Scenario of 3 Code Units, the encoding result is even better for this scenario. The bit usage result for the Scenario of 4 Code Units in Diagram 43 is just the same as before. However, considering that there might be many different (or, in the worst cases, even infinitely many) varying sizes of Processing Units, there must be a simpler logic so that programming could cater for the infinite number of scenarios that may occur. So the logic for encoding all such scenarios could be changed to:

(a) reading 3 consecutive data code values coming in using the definition of the Code Unit under concern and determining if the Terminating Condition is met; the Terminating Condition being that the consecutive data code values so far read contain all the unique data code values of the Code Unit according to design [i.e. in this case, when the code units read so far do not contain all 3 unique code values, going to Step (b); otherwise going to Step (c)];

(b) reading 1 more data code value [i.e. when the code units read so far do not contain all 3 unique code values] and evaluating each time if the Terminating Condition is met, until the code units read contain all 3 unique code values [i.e. the Terminating Condition in this case]; and going to Step (c) when the Terminating Condition is met;

(c) when the data code values so read contain all unique data code values [3 in this case of the 3-value Code Unit], counting the number of data code values so read, determining the corresponding Scenario Classification Code Value, writing it, and then writing the last data code value read exactly as it is read;

(d) using and writing 1 bit of code for identifying which one of the other two unique code values is present [for this case of the 3-value Code Unit: bit 0 for the unique data code value with the higher ranking of the remaining two unique data code values, discounting the unique data code value of the last one read and written in Step (c), and bit 1 for the lower-ranking one; or vice versa depending on design where appropriate], starting from writing [i.e. replacing or encoding it with either bit 0 or bit 1 mentioned in this Step (d)] the one read in the first position up to the one in the last but one position, using the technique of content code adjustment [including content code rank and position coding, content code promotion, content code omission, content code demotion or content code restoration] where appropriate;

(e) looping back to Step (a) after finishing encoding the last but one data code value of the Processing Unit under processing, until the point where the Un-encoded Code Section begins is reached.
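The encoding loop in Steps (a) to (e) can be sketched in Python (a minimal illustration, not from the source; the prefix codes and the Scenario Code assignment for sizes 3, 4 and 5 onwards follow the 10 / 0 / 110 pattern discussed around Diagrams 42 to 45, and all names are illustrative):

```python
# Illustrative sketch of Steps (a)-(e) for the 3-value Code Unit of 0 Head
# Design. PREFIX and scenario_code are assumptions drawn from the surrounding
# discussion, not a definitive implementation.
PREFIX = {'v1': '0', 'v2': '10', 'v3': '11'}

def read_unit(bits, pos):
    # decode one Code Unit value from a string of '0'/'1' characters
    if bits[pos] == '0':
        return 'v1', pos + 1
    return ('v2', pos + 2) if bits[pos + 1] == '0' else ('v3', pos + 2)

def read_processing_unit(bits, pos):
    # Steps (a)-(b): keep reading until all 3 unique values have appeared
    values, seen = [], set()
    while len(seen) < 3:
        v, pos = read_unit(bits, pos)
        values.append(v)
        seen.add(v)
    return values, pos

def scenario_code(n):
    # assumed prefix-free Scenario Code assignment: 4 units -> '0',
    # 3 units -> '10', and k >= 5 units -> '1' * (k - 3) + '0'
    if n == 4:
        return '0'
    if n == 3:
        return '10'
    return '1' * (n - 3) + '0'

def encode_processing_unit(values):
    # Steps (c)-(d): Scenario Code, then the Terminating Value as read,
    # then 1 bit per earlier position choosing between the other two values
    out = scenario_code(len(values)) + PREFIX[values[-1]]
    remaining = sorted(set(PREFIX) - {values[-1]})
    for v in values[:-1]:
        out += '0' if v == remaining[0] else '1'
    return out

values, pos = read_processing_unit('100011', 0)
print(values, encode_processing_unit(values))
```

Here the 6 input bits '100011' decode to v2, v1, v1, v3, a 4-Code-Unit Processing Unit, and re-encode into 6 bits, i.e. breakeven for this entry.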

Following the above revised encoding steps, one could revise the bit usage diagrams starting from Scenario 3 Code Units as follows:

Diagram 44

Encoding and Bit Usage for Scenario 3 Code Units

where [] stands for code omission by logical deduction; and bit 0 and bit 1 encoding for the first position data code value is a code adjustment using code promotion where appropriate.

Diagram 45

Encoding and Bit Usage for Scenario 4 Code Units

It could be seen from the above figures that, with Scenario 3 Code Units using 10 and Scenario 4 Code Units using 0 as the scenario classification codes, all entries have either breakeven or bit usage saving results.

The encoding and bit usage for Scenario 5 Code Units is a bit too long and complex to list out in full. However, the encoding follows the same logic, and the bit usage result could be briefly discussed as follows.

Scenario 4 Code Units using Scenario Code 10 is used as a basis for discussion. For 18 encoded entries, bit usage is reduced by 6 bits (i.e. 10 bits saved minus 4 bits lost on average). If Scenario 5 Code Units uses Scenario Code 110, that means one more bit is spent on the Scenario Code for every encoded entry, but there is the chance of encoding another additional data code value. The frequency distribution in Diagram 46 of the 3 unique data code values of the 3-value Code Unit of 0 Head Design is produced by running the autoit program in Diagram 47 below (together with the helper.au3 listed out in Diagram 16b):

Diagram 46

Frequency Distribution of 3-value Code Unit using 80000 random bits

0 : 26536

10 : 13156

11 : 13576

Diagram 47

autoit program producing the frequency distribution in Diagram 46 of 3-value Code Unit using 80000 random bits

#include <Array.au3>
#include "helper.au3"

global $ShowInput = false
global $ShowDecode = false
global $LastDecode

func Read1(byref $data)
    $r = ReadOneBit($data)
    if $r = 0 then
        $LastDecode = '0'
        return 0
    endif
    $r = ReadOneBit($data)
    if $r = 0 then
        $LastDecode = '10'
        return 1
    endif
    $LastDecode = '11'
    return 2
endfunc

func Read2(byref $data)
    $r = ReadOneBit($data)
    if $r = 0 then
        $LastDecode = '0'
        return 0
    endif
    $r = ReadOneBit($data)
    if $r = 0 then
        $LastDecode = '10'
        return 1
    endif
    $r = ReadOneBit($data)
    if $r = 0 then
        $LastDecode = '110'
        return 2
    endif
    $LastDecode = '111'
    return 3
endfunc

func Read3(byref $data)
    $r = ReadOneBit($data)
    if $r = 0 then
        $LastDecode = '0'
        return 0
    endif
    $r = ReadOneBit($data)
    if $r = 0 then
        $LastDecode = '10'
        return 1
    endif
    $r = ReadOneBit($data)
    if $r = 0 then
        $LastDecode = '110'
        return 2
    endif
    $r = ReadOneBit($data)
    if $r = 0 then
        $LastDecode = '1110'
        return 3
    endif
    $LastDecode = '1111'
    return 4
endfunc

global $data[10000]
GenerateRandomData($data)
local $DecodeTable[8]
local $FreqTable1[3]
ZeroArray($FreqTable1)
InitPointer($data)
while not $DataEnded
    $index = Read1($data)
    $DecodeTable[$index] = $LastDecode
    if $ShowDecode then ConsoleWrite($LastDecode & '(' & $index & ') ')
    $FreqTable1[$index] = $FreqTable1[$index] + 1
wend
if $ShowDecode then ConsoleWrite(@CRLF)
for $i = 0 to 2
    ConsoleWrite($DecodeTable[$i] & ' : ' & $FreqTable1[$i] & @CRLF)
next

[106] It could be seen that the frequency of v2 and v3 together accounts for slightly more than 50%, i.e. around half, and v1 for slightly less than 50%, also around half. So about half of the chance the data code value that comes up is v1, and in the other half of the chance, 25% is v2 and 25% is v3. So when v1 is the 5th data code value, the additional data code value that comes up must be either v2 or v3, so for that half of the chance 1 bit is saved; whereas if v2 or v3 is the 5th data code value, then either v1 or v3 comes up for the case of v2 being the 5th data code value, and v1 or v2 for the case of v3 being the 5th data code value, so that is half of the half chance that 1 bit will be saved. So on average, about three quarters of the chance 1 bit is saved, i.e. 3/4 bit saved. Also, by logical deduction, if the first, the second and the third values are the same unique data code value, then the fourth value could be deduced; so the bit usage for those cases will be either 2 bits or 1 bit. So overall, there is not much that is lost when the Scenario counts up from 4 Code Units to 5 Code Units and so on, given that Scenario 4 Code Units using 10 (2 bits) produces a bit usage saving of 6 bits on average over the 18 encoded entries. The chance of having overall bit usage saving for the whole random data set is very likely, given the fact that the frequency distribution of these Scenarios for 80000 random bits decreases as the Scenario number increases from 3 Code Units onwards. What is more, one could attempt to reshuffle the assignment of the first 3 most frequent Scenarios in a manner that produces the best result of bit usage saving.
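The "3/4 bit saved" estimate in paragraph [106] can be checked with a few lines of Python (a sketch under the paragraph's own simplifying assumption that the two remaining unique values after the 5th one are equally likely):

```python
# Sketch checking paragraph [106]'s estimate: with the 0 Head Design code,
# P(v1) = 1/2 (code '0') and P(v2) = P(v3) = 1/4 (codes '10' and '11').
p = {'v1': 0.5, 'v2': 0.25, 'v3': 0.25}
bits = {'v1': 1, 'v2': 2, 'v3': 2}

expected_saving = 0.0
for fifth in p:
    for nxt in (v for v in p if v != fifth):
        # re-encoding a 2-bit value with 1 bit saves 1 bit; a 1-bit value saves none;
        # the two remaining unique values are treated as equally likely (0.5 each)
        expected_saving += p[fifth] * 0.5 * (bits[nxt] - 1)

print(expected_saving)  # 0.75, i.e. the "3/4 bit saved" of the text
```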

[107] Re-examining Diagram 44 [i.e. Scenario 3 Code Units or (a)], Diagram 45 [i.e. Scenario 4 Code Units or (b)] and Diagram 40 [i.e. Scenario (c)] with a Scenario Code assignment of 2 bits (10), 2 bits (11) and 1 bit (bit 0) according to the ordering of (a), (b) and (c), it could be seen that Scenario (a) saves 2 bits out of 6 encoded entries (2 bits saved versus 0 bits lost), Scenario (b) saves 6 bits out of 18 entries (10 bits saved versus 4 bits lost), and Scenario (c) saves 3 bits out of 45 entries (i.e. 20 bits saved versus 17 bits lost). It appears that all 3 Scenarios (a), (b) and (c) have bit usage saving. However, the result is still subject to the frequency distribution of the encoded entries under each of these three Scenarios. The frequency distribution in Diagram 49 of these 3 scenarios using 80000 random bits could be produced by running the autoit programs listed out in Diagram 48 as follows:

Diagram 48a

autoit program producing the frequency distribution of Scenarios listed out in Diagram 49 using helper2.au3 listed in Diagram 48b

#include "helper2.au3"

local $data[1]
if not FileExists('ff') then GenerateRandomFile('ff', 10000)
ReadDataFile('ff', $data)
local $map = ['0', '10', '11']
local $obj[1]
MapInit($obj, $map)

func GetPow2($v)
    if $v = 0 then
        return 1
    elseif $v = 1 then
        return 2
    else
        return 4
    endif
endfunc

func CountNum($last)
    local $input[1]
    local $index, $i, $j
    InputInit($input, $data)
    local $count[10000]
    local $total = 0
    local $rest = 0
    local $lastIndex = 0
    ZeroArray($count)
    while not InputEnded($input)
        $i = GetPow2(MapRead($input, $obj))
        $i = BitOr($i, GetPow2(MapRead($input, $obj)))
        for $j = 0 to $last - 3
            $i = BitOr($i, GetPow2(MapRead($input, $obj)))
            if $j > $lastIndex then $lastIndex = $j
            if $i = 7 then
                $count[$j] = $count[$j] + 1
                exitloop
            endif
            if $last = 10000 and InputEnded($input) then
                exitloop
            endif
        next
        if $i <> 7 then $rest = $rest + 1
        $total = $total + 1
    wend
    ConsoleWrite('case ' & $last & @CRLF)
    ConsoleWrite('all : ' & $total & @CRLF)
    for $j = 0 to $lastIndex
        ConsoleWrite('cu' & ($j + 3) & ' : ' & $count[$j] & @CRLF)
    next
    ConsoleWrite('rest: ' & $rest & @CRLF)
endfunc

CountNum(10000)

Diagram 48b

autoit program: helper2.au3 for use by the program listed in Diagram 48a

#include <Array.au3>
#include <FileConstants.au3>

global const $Mask[8] = [128, 64, 32, 16, 8, 4, 2, 1]
global const $BitLen = 8
global const $MaxNum = 256

func PrintError($msg)
    ConsoleWrite($msg & @CRLF)
    exit
endfunc

; Input functions
func InputInit(byref $obj, byref $data)
    redim $obj[5]
    $obj[0] = $data
    $obj[1] = UBound($data) ; size
    $obj[2] = 0 ; byte pos
    $obj[3] = 0 ; bit pos
    $obj[4] = false ; is ended
endfunc

func InputEnded(byref $obj)
    return $obj[4]
endfunc

func InputReadBit(byref $obj)
    if $obj[4] then return 0
    local $r
    if BitAND(($obj[0])[$obj[2]], $Mask[$obj[3]]) <> 0 then
        $r = 1
    else
        $r = 0
    endif
    $obj[3] = $obj[3] + 1
    if $obj[3] >= $BitLen then
        $obj[3] = 0
        $obj[2] = $obj[2] + 1
        if $obj[2] >= $obj[1] then $obj[4] = true
    endif
    return $r
endfunc

; Internal array functions
func ArrayCreate($size)
    local $arr[$size]
    $arr[0] = 0
    return $arr
endfunc

func ArraySetData(byref $arr, $index, $v)
    $arr[$index] = $v
endfunc

func ArrayRedim(byref $arr, $size)
    redim $arr[$size]
endfunc

; Output functions
func OutputInit(byref $obj)
    redim $obj[4]
    $obj[0] = ArrayCreate(1000)
    $obj[1] = UBound($obj[0]) ; size
    $obj[2] = 0 ; byte pos
    $obj[3] = 0 ; bit pos
endfunc

func OutputWriteBit(byref $obj, $r)
    if $r <> 0 then ArraySetData($obj[0], $obj[2], BitOr(($obj[0])[$obj[2]], $Mask[$obj[3]]))
    $obj[3] = $obj[3] + 1
    if $obj[3] >= $BitLen then
        $obj[3] = 0
        $obj[2] = $obj[2] + 1
        if $obj[2] >= $obj[1] then
            $obj[1] = $obj[1] + 1000
            ArrayRedim($obj[0], $obj[1])
        endif
        ArraySetData($obj[0], $obj[2], 0)
    endif
endfunc

func OutputGetData(byref $obj)
    $obj[1] = $obj[2]
    if $obj[3] <> 0 then $obj[1] = $obj[1] + 1
    if $obj[1] = 0 then PrintError('No output data')
    ArrayRedim($obj[0], $obj[1])
    return $obj[0]
endfunc

; Random data functions
func GenerateRandomFile($name, $size)
    local $fh = FileOpen($name, BitOr($FO_OVERWRITE, $FO_BINARY))
    if $fh = -1 then PrintError('File open fails')
    local $i
    for $i = 0 to $size - 1
        FileWrite($fh, BinaryMid(Binary(Random(0, $MaxNum - 1, 1)), 1, 1))
    next
    FileClose($fh)
endfunc

func GenerateRandomData(byref $arr, $size)
    redim $arr[$size]
    local $i
    for $i = 0 to UBound($arr) - 1
        $arr[$i] = Random(0, $MaxNum - 1, 1)
    next
endfunc

; File reader
func ReadDataFile($name, byref $array)
    local $fh = FileOpen($name, BitOr($FO_READ, $FO_BINARY))
    if $fh = -1 then PrintError('File open fails')
    local $data = FileRead($fh)
    local $len = BinaryLen($data)
    redim $array[$len]
    local $i
    for $i = 1 to $len
        $array[$i - 1] = Int(BinaryMid($data, $i, 1))
    next
    FileClose($fh)
endfunc

; File writer
func WriteDataFile($name, byref $array)
    local $fh = FileOpen($name, BitOr($FO_OVERWRITE, $FO_BINARY))
    if $fh = -1 then PrintError('File open fails')
    local $i
    for $i = 0 to UBound($array) - 1
        FileWrite($fh, BinaryMid(Binary($array[$i]), 1, 1))
    next
    FileClose($fh)
endfunc

; Array helpers
func DumpArray(byref $arr)
    local $i
    for $i = 0 to UBound($arr) - 1
        ConsoleWrite($arr[$i] & ' ')
    next
    ConsoleWrite(@CRLF)
endfunc

func ZeroArray(byref $arr)
    local $i
    for $i = 0 to UBound($arr) - 1
        $arr[$i] = 0
    next
endfunc

func DumpBinaryArray(byref $data)
    local $input[1]
    InputInit($input, $data)
    local $pos = 0
    while not InputEnded($input)
        ConsoleWrite(InputReadBit($input))
        $pos = $pos + 1
        if $pos = 8 then
            ConsoleWrite(' ')
            $pos = 0
        endif
    wend
    ConsoleWrite(@CRLF)
endfunc

; Map functions
func MapInit(byref $obj, byref $map)
    redim $obj[5]
    $obj[0] = ArrayCreate(10) ; 0 branch
    $obj[1] = ArrayCreate(10) ; 1 branch
    $obj[2] = 0 ; index
    $obj[3] = $map
    $obj[4] = ArrayCreate(UBound($map)) ; freq
    ZeroArray($obj[4])
    local $i, $j, $c, $pos, $branch, $v, $len
    for $i = 0 to UBound($map) - 1
        $pos = 0
        $len = StringLen($map[$i])
        for $j = 1 to $len
            $c = StringMid($map[$i], $j, 1)
            if $c = '0' then
                $branch = 0
            elseif $c = '1' then
                $branch = 1
            else
                PrintError('invalid char in map')
            endif
            $v = ($obj[$branch])[$pos]
            if $v < 0 then
                PrintError('invalid map')
            elseif $v > 0 then
                $pos = $v
            elseif $j < $len then
                $obj[2] = $obj[2] + 1
                if $obj[2] >= UBound($obj[0]) then
                    ArrayRedim($obj[0], $obj[2] + 10)
                    ArrayRedim($obj[1], $obj[2] + 10)
                endif
                ArraySetData($obj[0], $obj[2], 0)
                ArraySetData($obj[1], $obj[2], 0)
                ArraySetData($obj[$branch], $pos, $obj[2])
                $pos = $obj[2]
            endif
        next
        if $v <> 0 then PrintError('invalid map')
        ArraySetData($obj[$branch], $pos, -1 - $i)
    next
    for $i = 0 to $obj[2]
        if ($obj[0])[$i] = 0 or ($obj[1])[$i] = 0 then PrintError('invalid map')
    next

endfunc

func MapRead(byref $input, byref $obj)
    local $bit, $pos
    $pos = 0
    while $pos >= 0
        $bit = InputReadBit($input)
        $pos = ($obj[$bit])[$pos]
    wend
    return -$pos - 1
endfunc

func MapClearFreq(byref $obj)
    ZeroArray($obj[4])

endfunc

func MapPrintFreq(byref $obj, byref $data, $detail = false)
    local $input[1]
    local $index, $i, $j
    InputInit($input, $data)
    MapClearFreq($obj)
    while not InputEnded($input)
        $index = MapRead($input, $obj)
        if $detail then ConsoleWrite(($obj[3])[$index] & ' ')
        ArraySetData($obj[4], $index, ($obj[4])[$index] + 1)
    wend
    if $detail then ConsoleWrite(@CRLF)
    local $c0 = 0
    local $c1 = 0
    for $i = 0 to UBound($obj[3]) - 1
        ConsoleWrite(($obj[3])[$i] & ' : ' & ($obj[4])[$i] & @CRLF)
        for $j = 1 to StringLen(($obj[3])[$i])
            if StringMid(($obj[3])[$i], $j, 1) = '0' then
                $c0 = $c0 + ($obj[4])[$i]
            else
                $c1 = $c1 + ($obj[4])[$i]
            endif
        next
    next
    ConsoleWrite('0:1 = ' & $c0 & ':' & $c1 & @CRLF)

endfunc

func MapOutput(byref $obj, byref $data, byref $outmap, $outfile = '')
    local $input[1]
    local $index, $i, $j, $str
    InputInit($input, $data)
    MapClearFreq($obj)
    if UBound($outmap) <> UBound($obj[3]) then PrintError('map size not match')
    local $output[1]
    OutputInit($output)
    local $o0 = 0
    local $o1 = 0
    while not InputEnded($input)
        $index = MapRead($input, $obj)
        $str = $outmap[$index]
        for $i = 1 to StringLen($str)
            if StringMid($str, $i, 1) = '0' then
                OutputWriteBit($output, 0)
                $o0 = $o0 + 1
            else
                OutputWriteBit($output, 1)
                $o1 = $o1 + 1
            endif
        next
        ArraySetData($obj[4], $index, ($obj[4])[$index] + 1)
    wend
    local $c0 = 0
    local $c1 = 0
    for $i = 0 to UBound($obj[3]) - 1
        ConsoleWrite(($obj[3])[$i] & '->' & $outmap[$i] & ' : ' & ($obj[4])[$i] & @CRLF)
        for $j = 1 to StringLen(($obj[3])[$i])
            if StringMid(($obj[3])[$i], $j, 1) = '0' then
                $c0 = $c0 + ($obj[4])[$i]
            else
                $c1 = $c1 + ($obj[4])[$i]
            endif
        next
    next
    ConsoleWrite('old 0:1 = ' & $c0 & ':' & $c1 & @CRLF)
    ConsoleWrite('new 0:1 = ' & $o0 & ':' & $o1 & @CRLF)
    if $outfile <> '' then
        WriteDataFile($outfile, OutputGetData($output))
    endif
endfunc

Diagram 49 so produced is listed as follows:

Diagram 49

Frequency Distribution of Scenarios using 80000 random bits

case 10000
all : 8449
cu3 : 1606
cu4 : 1578
cu5 : 1292
cu6 : 953
cu7 : 774
cu8 : 575
cu9 : 422
cu10 : 311
cu11 : 238
cu12 : 191
cu13 : 143
cu14 : 94
cu15 : 56
cu16 : 49
cu17 : 42
cu18 : 33
cu19 : 18
cu20 : 16
cu21 : 6
cu22 : 13
cu23 : 7
cu24 : 7
cu25 : 10
cu26 : 4
cu27 : 1
cu28 : 3
cu29 : 2
cu30 : 1
cu31 : 2
cu32 : 0
cu33 : 0
cu34 : 0
cu35 : 0
cu36 : 0
cu37 : 0
cu38 : 1
rest: 1

It could be seen from the above that, when the 80000 bits generated at random are read using the 3-value Code Unit of 0 Head Design with the Terminating Condition that all 3 unique data code values should be present in the Processing Unit (the Processing Unit ending once the last appearing unique data code value, the Terminating Value, comes up), these 80000 random bits produce Processing Units of varying sizes, from 3 Code Units to 38 Code Units, with the rest being the Un-encoded Code Section. These Processing Units of varying Code Unit sizes, from 3 to 38, are listed in Diagram 49 with their frequency of occurrence in the 80000 random bits so generated at one instance. It could be seen that the frequency of the Processing Units in general decreases as the Processing Unit size increases, and quite steadily from 3 Code Units to 20 Code Units. The frequency for Scenario 3 Code Units or (a) and 4 Code Units or (b) is 1606 and 1578 respectively out of 8449 Processing Units, i.e. 1606 + 1578 = 3184 or 37.68% together. So the frequency for Scenario (c) is 8449 - 3184 = 5265 or 62.32%.
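The experiment behind Diagram 49 can be reproduced in outline with a short Python simulation (an illustrative re-implementation, not the source program; exact counts vary with the random bits drawn, so only the decreasing shape of the distribution is expected to match):

```python
# Illustrative re-run of the Diagram 48/49 experiment: generate 80000 random
# bits, read them as 3-value Code Units ('0', '10', '11'), and count how many
# Code Units each Processing Unit spans before all 3 values have appeared.
import random
from collections import Counter

random.seed(1)  # arbitrary seed, for reproducibility of this sketch only
bits = ''.join(random.choice('01') for _ in range(80000))

def next_unit(bits, pos):
    # returns (value, new_pos), or (None, pos) when too few bits remain
    if pos >= len(bits):
        return None, pos
    if bits[pos] == '0':
        return 0, pos + 1                                            # v1
    if pos + 1 >= len(bits):
        return None, pos
    return (1, pos + 2) if bits[pos + 1] == '0' else (2, pos + 2)    # v2 / v3

sizes = Counter()
pos, done = 0, False
while not done:
    seen, n = set(), 0
    while len(seen) < 3:
        v, pos = next_unit(bits, pos)
        if v is None:
            done = True  # the tail bits form the Un-encoded Code Section
            break
        seen.add(v)
        n += 1
    if len(seen) == 3:
        sizes[n] += 1

for n in sorted(sizes):
    print(f'cu{n} : {sizes[n]}')
```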

Given the above information generated out of a data set of 80000 random bits, one could make another improvement on the encoding design. For instance, if one wishes to increase the bit 1 ratio in the data set as against the bit 0 ratio, one could use the following codes as Scenario Classification Codes (or Scenario Codes in short) for Scenarios (a), (b) and (c) in Diagram 50:

Diagram 50

Scenario Code Assignment for Scenario (a), (b) and (c)

As Scenario (c) accounts for most of the Processing Units, it should be given the shortest Scenario Code, bit 1; and because it is intended to increase the bit 1 ratio, the 1 Head Design of Scenario Code is adopted. For the same reasons, Scenarios (a) and (b) are assigned Scenario Codes bit 01 and bit 00 respectively.
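The reasoning for giving Scenario (c) the shortest code can be quantified with the Diagram 49 frequencies (a hypothetical comparison; the alternative assignment shown is illustrative, not from the source):

```python
# Total Scenario Code bits when Scenario (c) gets the 1-bit code, versus an
# alternative (illustrative) assignment giving (c) a 2-bit code.
freq = {'a': 1606, 'b': 1578, 'c': 5265}

chosen = {'a': 2, 'b': 2, 'c': 1}    # (a) = '01', (b) = '00', (c) = '1'
swapped = {'a': 1, 'b': 2, 'c': 2}   # e.g. (a) = '1', (b) = '00', (c) = '01'

def cost(lengths):
    # total Scenario Code bits spent over all Processing Units
    return sum(freq[s] * lengths[s] for s in freq)

print(cost(chosen), cost(swapped))  # 11633 versus 15292
```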

Another improvement is to use a reverse placement of encoded data codes in the order of the 4th, 3rd, 2nd and 1st positions. On the assumption that there is less chance of two same data code values being adjacent to each other, this is good for Scenario (c), increasing the chance of the appearance of the next unique data code value in addition to the 4th data code value. What is more, upon further analysis, the reverse placement of encoded data codes in accordance with their relative position could create another trait or characteristic (the trait being whether the last data code value is different from the last but one data code value) in the ordering of data codes that could be capitalized upon in making compression for bit storage saving. This characteristic is related to the design of the Terminating Condition. For simplicity, another similar Terminating Condition could be used first for illustration. Now the Terminating Condition stops at 3 Code Units; the data code values are divided into just two groups or classes: one being 3 Code Units having all the unique data code values of the 3-value Code Unit, the other being 3 Code Units NOT having all the unique data code values. That means the first class has 3 unique code values present, and the second has 2 unique code values present with 1 unique code value missing. These two classes have the following frequency (as listed in Diagram 50) according to the result of Diagram 49:

Diagram 50

Frequency Distribution of Processing Units of two classes in a random data set of 80000 bits: one having all 3 unique data code values (Class A), the other having less than 3 (Class B)

Class     Frequency     %
A         1606          19
B         6843          81
Overall   8449          100
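The observed 19% / 81% split is close to what the idealized value probabilities of the 0 Head Design predict; a quick Python cross-check (an assumption-labeled sketch, not from the source):

```python
# Under the idealized probabilities of the 0 Head Design prefix code
# (P(v1) = 1/2, P(v2) = P(v3) = 1/4), the chance that 3 consecutive Code Units
# contain all 3 unique values is the sum over the 3! orderings of v1, v2, v3.
from itertools import permutations

p = {'v1': 0.5, 'v2': 0.25, 'v3': 0.25}
p_class_a = sum(p[a] * p[b] * p[c] for a, b, c in permutations(p))
print(p_class_a)  # 0.1875, close to the observed 1606 / 8449 (about 19%)
```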

Class B has the overwhelming majority of the Processing Units. One could however divide these Processing Units according to the trait that is related to the reverse placement of data code values with respect to their relative positions. For reverse placement of these data code values, one schema and design is placing the 3rd data code value first, then the 2nd data code value, and then the 1st one. So for the Processing Units having all three unique data code values, the 3rd and the 2nd data code values must be different; for those Processing Units NOT having all three unique data code values, the 3rd and the 2nd data code values could be either the same or different in value. It appears that using this trait as another classifying criterion could produce better results in saving bit storage, so Scenario Codes could be assigned likewise accordingly: sub-Scenario Code bit 1 could be assigned to those Processing Units where the 3rd and the 2nd data code values are the same, and bit 0 to those where they are different. For the Scenario Class 0 here, an additional sub-scenario code bit may be assigned, or such sub-scenario code bit could be combined with the Content Code Bit using another novel feature of CHAN CODING, namely the use of Posterior Classification, or the placement of the Posterior Classification Code within the Content Code of CHAN CODE. This encoding technique of CHAN CODING is better explained using the actual encoding of a Processing Unit of three 3-value Code Units using the Terminating Condition as used in Diagram 50, as listed out in Diagram 51 as follows:

Diagram 51

Encoding and Bit Usage and Bit 1/Bit 0 change with Scenario Code assigned to Scenario Class 0 and Scenario Class 1

Data Code                 Content Code with Code Adjustment
3rd 2nd 1st   Bit   Scen.   3rd   2nd   1st   Bit   +/-   Bit 1/Bit 0
    Value           Code

Class B
v1 v1 v1      3     1       0     []    0     1+2   0     +1/-1

It could be seen from the above that the result is very close to breakeven. The logic for encoding Processing Units assigned with Scenario Code 1 in Diagram 51 is as follows:

(a) after reading the data code values from the digital data input, and after determining the nature of data distribution of the Processing Unit under processing, if the Processing Unit belongs to the Class where the 3rd and the 2nd data code values are the same, writing the Scenario Code bit 1;

(b) writing the 3rd data code value as it is;

(c) omitting the 2nd data code value by logic; since the 2nd data code value is known to be the same as the 3rd one, it could be omitted by logical deduction; and

(d) writing the 1st data code value using the original data code value as read using the design of the Code Unit; as the Processing Unit is one which does not have all 3 unique data code values, it could have one or two unique data code values only. Since one data code value has appeared as the 3rd one already, there could still be 3 choices to select from, so the 1st position value could only be written as it is read directly (or the code value present in the 3rd position is promoted to bit 0, the other two remaining values adjusted to bit 10 or bit 11 depending on their relative rank, and the 1st position value then uses such adjusted code).

The logic for encoding the Processing Units assigned with Scenario Code 0 in Diagram 51 is as follows:

(i) after reading the data code values from the digital data input, and after determining the nature of data distribution of the Processing Unit under processing, if the Processing Unit belongs to the Class where the 3rd and the 2nd data code values are NOT the same and where all unique data code values are present, writing Scenario Code bit 00 for it; if the Processing Unit belongs to the Class where the 3rd and the 2nd data code values are NOT the same but where all unique data code values are NOT present, writing Scenario Code bit 01 for it;

(ii) writing the 3rd data code value as it is;

(iii) writing the 2nd data code value for a Processing Unit with Scenario Code 00 using the encoding logic that: as it has all unique data code values and as one data code value has appeared as the 3rd one, there remain two choices to be selected from, so using one bit for indicating which one appears as the 2nd data code value (bit 0 for the smaller value, bit 1 for the bigger value, where in the 0 Head Design one could design v1 as the smallest value and v3 as the biggest value where appropriate); or writing the 2nd data code value for a Processing Unit with Scenario Code 01 using the encoding logic that: as it does not have all unique data code values, and as one data code value has appeared as the 3rd one, there could still be only 2 choices (the two unique values not yet present) to be selected from, because Scenario Class 0 here is defined as the class where the 3rd and the 2nd data code values are NOT the same, so using one bit for indicating which one appears as the 2nd data code value (bit 0 for the smaller remaining value, e.g. v1, and bit 1 for the bigger remaining value, e.g. v3); and

(iv) for a Processing Unit assigned with Scenario Code 00, the 1st data code value could be omitted; for a Processing Unit assigned with Scenario Code 01, the 1st data code value could be encoded and written using the encoding logic that: as two different data code values have appeared in the 3rd and the 2nd positions, and Scenario Class 01 is where NOT all 3 unique data code values are present, the data code value in the 1st position must be one of the two values in the 3rd and the 2nd positions, so encoding and writing the 1st position data code value using another bit (bit 0 for the smaller of those two values and bit 1 for the bigger one).
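The two encoding logics above can be sketched together in Python (an illustrative reading of steps (a) to (d) and (i) to (iv); the prefix codes and the rank order v1 < v2 < v3 are assumptions, not the source's definitive implementation):

```python
# Sketch: encode one Processing Unit of three 3-value Code Units, written in
# reverse placement (3rd value first, then 2nd, then 1st).
PREFIX = {'v1': '0', 'v2': '10', 'v3': '11'}
RANK = {'v1': 0, 'v2': 1, 'v3': 2}

def encode_3unit_pu(first, second, third):
    if third == second:
        # Scenario Code 1, steps (a)-(d): 2nd omitted, 1st written as read
        return '1' + PREFIX[third] + PREFIX[first]
    # the two values other than the 3rd, smaller rank first
    others = sorted(set(PREFIX) - {third}, key=RANK.get)
    second_bit = '0' if second == others[0] else '1'
    if len({first, second, third}) == 3:
        # Scenario Code 00 (Class A): the 1st value is deducible and omitted
        return '00' + PREFIX[third] + second_bit
    # Scenario Code 01 (Class B, 3rd != 2nd): 1st must equal the 2nd or the 3rd
    smaller = min(second, third, key=RANK.get)
    first_bit = '0' if first == smaller else '1'
    return '01' + PREFIX[third] + second_bit + first_bit

print(encode_3unit_pu('v1', 'v1', 'v1'))  # '100': 3 bits, breakeven
print(encode_3unit_pu('v2', 'v1', 'v3'))  # '00110': 5 bits for a Class A unit
```

The v1 v1 v1 case reproduces the Diagram 51 row: Scenario Code 1, the 3rd value written as 0, the 2nd omitted, the 1st written as 0, using 3 bits against the original 3 bits.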

The use of Posterior Classification in one form may help to further reduce the bit storage a little. In the present example, the placement of the Posterior Classification Code could be done in two ways:

(a) for Processing Units assigned with Scenario Codes 00 and 01, the second bit distinguishes whether the Processing Unit belongs to Class A (the class with all unique data code values) or Class B (the class not having all unique data code values). The second bit of the Scenario Code could be dispensed with by combining the following content codes: the bit for encoding the 2nd data code value of Class A, combined with the bit for the 2nd data code value and the bit for the 1st data code value of Class B. As there are 6 combinations of these encoded codes to be represented, 2 or 3 bits are to be used for writing the encoded code values of these 6 combinations in the following assignment in Diagram 52:

Diagram 52

Scenario Code combined with Content Code

00 for Class A Processing Units under Scenario Code 0 where bit 0 is assigned to the 2nd data code value;

01 for Class A Processing Units under Scenario Code 0 where bit 1 is assigned to the 2nd data code value;

100 for Class B Processing Units under Scenario Code 0 where bit 0 is assigned to the 2nd data code value and bit 0 to the 1st data code value;

101 for Class B Processing Units under Scenario Code 0 where bit 0 is assigned to the 2nd data code value and bit 1 to the 1st data code value;

110 for Class B Processing Units under Scenario Code 0 where bit 1 is assigned to the 2nd data code value and bit 0 to the 1st data code value; and

111 for Class B Processing Units under Scenario Code 0 where bit 1 is assigned to the 2nd data code value and bit 1 to the 1st data code value.

Using the above assignment, the second bit of Scenario Codes 00 and 01 could be taken away; the first encoded code written for these Processing Units is Scenario Code 0, followed by the 3rd data code value written as it is read, and then followed by the above combined Scenario and Content Code using 2 to 3 bits. The result is exactly the same as that produced in Diagram 51 in terms of bit usage; and
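The Diagram 52 assignment forms a prefix-free code, so a decoder can always tell a 2-bit Class A code from a 3-bit Class B code by the first bit; a minimal sketch (the tuple labels are illustrative, not from the source):

```python
# Sketch of the Diagram 52 combined Scenario-and-Content code as a
# prefix-free map: '00'/'01' cover Class A, '100'..'111' cover Class B.
COMBINED = {
    '00':  ('A', '2nd=0'),
    '01':  ('A', '2nd=1'),
    '100': ('B', '2nd=0, 1st=0'),
    '101': ('B', '2nd=0, 1st=1'),
    '110': ('B', '2nd=1, 1st=0'),
    '111': ('B', '2nd=1, 1st=1'),
}

def read_combined(bits, pos):
    # first bit 0 -> 2-bit Class A code; first bit 1 -> 3-bit Class B code
    n = 2 if bits[pos] == '0' else 3
    return COMBINED[bits[pos:pos + n]], pos + n

print(read_combined('01100', 0))  # (('A', '2nd=1'), 2)
```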

(b) however, there is another novel way of combining Scenario Code with Content Code, using the following logic:

(i) upon encoding and writing the Scenario Code 0, the 3rd data code value and the 2nd data code value for those Processing Units under Scenario Code 0 (the second bit of which is designed to be done away with), an exiting code, combined out of the second bit of the Scenario Code 0 and the Content Code of the 1st data code value of Class B under Scenario Code 0, could be created as follows in Diagram 53:

Diagram 53

Posterior Hybrid Classification and Content Code as Exiting Code

bit 1 as exiting code for and representing a Class B Processing Unit under Scenario Code 0 where the 1st data code value is bit 0;

bit 01 as exiting code for and representing a Class B Processing Unit under Scenario Code 0 where the 1st data code value is bit 1;

bit 00 as exiting code for and representing a Class A Processing Unit under Scenario Code 0, where no processing is to be done upon exiting after the 3rd and the 2nd data code values have been encoded and written; and such exiting codes are to be used upon decoding for correct restoration of the original digital data information.

Depending on the frequency distribution of the Processing Units involved in the data distribution, the above technique could be used for better bit storage saving. For the random data set used in the aforesaid example of 80000 bits, the frequency of Class B Processing Units is 6843 out of a total of 8449 Processing Units, as listed out in Diagram 50. About half of this 6843 goes to Scenario 01, and half of these have the 1st data code value using bit 0 upon encoding, which is now represented by exiting code bit 1 while the second bit of the Scenario Code 01 is stripped off, so saving 1 bit for this half of the 6843 Processing Units, equivalent to saving about 3422 / 2, i.e. about 1711 bits. The exiting code for the other half of these 6843 Processing Units is bit 01, so 2 bits of exiting code replace the original 2nd bit of the Scenario Code 01 and the bit 0/1 used for the encoded code representing the original 1st data code value; so there is no loss in bit usage for this half of the Class B Processing Units. For Class A Processing Units under Scenario Code 00, the exiting code bit 00 is used; the original second bit of the Scenario Code 00, now stripped off, could only account for 1 bit of the 2 bits of the exiting code, and as the 1st data code value is omitted by logic, the other bit of the exiting code could not be accounted for and represents a bit usage loss. The frequency for these Class A Processing Units is 1606. Against the 1711 bits saved above for the Class B Processing Units, the balance is 1711 minus 1606 = 105 bits of bit usage saving. Out of the 80000 random bits, this novel feature alone could help to save around 105 bits. The technique so far presented could also be applied to other Scenarios, such as Scenarios (a), (b) and (c), or Scenarios 3 Code Units, 4 Code Units and the rest.
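The arithmetic behind the "about 105 bits" estimate can be traced in a few lines (using the Diagram 50 frequencies; the halves are approximate, as in the text):

```python
# Tracing the "about 105 bits" estimate with the Diagram 50 frequencies.
class_b = 6843
class_a = 1606

scenario_01 = class_b / 2   # about half of Class B has 3rd != 2nd (about 3422)
saved = scenario_01 / 2     # half of those save 1 bit via exiting code '1'
net = saved - class_a       # each Class A unit costs 1 extra exiting-code bit

print(round(net))  # about 105
```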

The previous example of Processing Units of Classes A and B being divided into two Scenarios 0 and 1 shows how data code values could be classified with hybrid Classification and Content Code in combination and placed in a posterior manner, as contrary to the conventional placement in the anterior position. Another novel feature of data classification could only be used with embedded or interior Classification Code. Diagram 54 shows the result using this novel feature:

Diagram 54a

Anterior Classification Code and Class A Processing Units

Diagram 54b

Interior or Embedded Classification Code for Class B Processing Units sub-divided using the criterion of whether the 3rd and the 2nd data code values are the same value or not

As for Class A, there are 2 bits of saving out of 6 entry combinations; for Class B with different values for the 3rd and the 2nd data code values, there is apparently no saving or loss out of 12 combinations; and for Class B with the same value for the 3rd and the 2nd data code values, there are apparently 3 bits of saving out of 9 entry combinations. As said before, one could use other techniques, such as the technique of changing the ratio of bit 0 : bit 1 or using Super Processing Units having uneven data, for the whole random data set first before applying the techniques in this example. And this is only one example out of the many possible scenarios that could be designed using the techniques mentioned in this example.

The above example of classification is based on four classes using Classification Code: 00 for Class A Processing Units. What is novel, however, is the use of 01, 10 and 11, which are actually Content Codes in themselves with a slight modification for v1, from bit 0 to bit 01, so that these Content Codes are qualified to be used as Classification Codes; whereas when used in the encoding processing as part of the content codes, the encoded code value of v1, i.e. 01, is reverted to the shorter form 0, as this will not be mistaken inside the content code part, not being used as Classification Code at the head of the encoded Processing Unit. The above example therefore also demonstrates vividly the usefulness of this technique of using Content Code as Classification Code, albeit with modification. This is another technique of CHAN CODING used in producing CHAN CODE.

Diagram 55 shows that Classification Code could use Content Code (with slight modification) as a substitute as follows:

Diagram 55

Modified Content Code used as Classification Code for Class B Processing Units subdivided using the criterion of whether the 3rd and the 2nd data code values are the same value or not

Code Value     Data Code                  Content Code               Code Adjustment
               3rd 2nd 1st    Bit         3rd 2nd 1st    Bit        +/-    Bit 1 / Bit 0

Class B (where the 3rd and the 2nd are different):
v1v2v1                        4           01 10 0        5          +1     0/+1

Class B (where the 3rd and the 2nd are the same):
(remaining entries of Diagram 55 not reproduced here)

After revealing the aforesaid various classification techniques, one could evaluate which techniques are most useful for the digital data set under processing. According to the selection of classification techniques for use, a re-classification and re-distribution of data code values for encoding and decoding may be found necessary and appropriate for the intended purpose. Diagram 56 is the result of such deliberation. The most useful technique is to make correct identification of the trait that could be used to make the right classification of data code values. Under the present design and schema being discussed, many techniques have been developed, designed and implemented for use in the diagrams presented in the preceding paragraphs. It could be observed that, while some code entry combinations make bit saving, the saving is far more than offset by others which produce bit loss. So this is the trait or characteristic that has to be investigated: that is, to find out the offending entry combinations that make bit loss on encoding using the various techniques of CHAN CODING discussed above. It is apparent that if the offending entry combinations making losses are grouped together, and the friendly entry combinations making savings are grouped likewise, the chance of success is enhanced. Diagram 56 uses this as the primary criterion for data classification, together with other techniques of CHAN CODING, such as code adjustment through code promotion, code omission, code replacement and, above all, the most essential technique of Absolute Address Branching with the use of range, for the subsequent encoding:

Diagram 56

Data Classification based on Compressible and Incompressible data value entries before code re-distribution, with Frequency using Diagram 23 in Paragraph [75]

Class Compressible: where the 3rd and the 2nd position values are different, encoding using code adjustment techniques

Class Compressible: where the following 2 entry combinations are exceptions for redistribution to the two preceding entry combinations as appropriate

Class Compressible: where all 3 code values are unique, encoded code to be re-assigned

It could be seen from the above Diagram 56 that there are 2 types of code adjustment that have to be made:

(a) code swapping: the encoded codes of v2v1v2 and v3v1v3 each use 6 bits instead of the 5 bits used by the original code, resulting in bit loss; so their encoded codes have to be swapped with those of v2v2v3 and v3v3v2, each of which uses 5 bits instead of the 6 bits used by the original code, resulting in bit gain; swapping the encoded codes between these pairs balances bit usage, resulting in neither bit loss nor bit gain;

(b) code re-assignment or re-distribution or re-filling: there are 2 vacant code seats or addresses in the Class Incompressible with the same 3rd and 2nd values; the encoded codes for these two vacant code addresses of v1v1v2 and v1v1v3 are 1100 and 1101, each using 4 bits. And there are 4 vacant code seats or addresses in the Class Incompressible with different 3rd and 2nd values; the encoded codes for these four vacant code addresses are as follows: v1v2v1 with encoded code 10000 using 5 bits, v1v3v1 with encoded code 10010 using 5 bits, v2v1v1 with encoded code 101000 using 6 bits and v3v1v1 with encoded code 101100 using 6 bits. So there are now 6 vacant code addresses to be re-filled. Two of the vacant code seats could be filled up first by using up the two exception entry combinations: v2v2v2 and v3v3v3. So there remain only 4 vacant code seats, 2 using 4 bits and 2 using 5 bits, for accommodating 6 entry combinations of Class Compressible with Processing Units having 3 unique data code values, each using 5 bits. So the first two un-accommodated 5-bit Processing Units with all 3 unique data code values could be used to re-fill the two 5-bit vacant code seats first; the remaining 4 un-accommodated 5-bit Processing Units are left to be placed into the remaining two 4-bit vacant code addresses. Diagram 57 shows the situation upon such code re-distribution as follows:

Diagram 57

Data Classification based on Compressible and Incompressible data value entries upon code re-distribution, with Frequency using Diagram 23 in Paragraph [75]

Class Incompressible: using AAB technique for encoding with range

Class Compressible: where the 3rd and the 2nd position values are different, encoding using code adjustment techniques

So up to here, it seems two of the remaining four 5-bit Processing Units have to be used to re-fill the remaining two 4-bit vacant code addresses, resulting in 1 bit of saving for each seat; and the yet remaining two un-accommodated 5-bit Processing Units have to use the AAB technique to double up with two 6-bit code addresses (choosing the entry combinations with the lowest frequency) already taken up by two other entry combinations, resulting in four 7-bit entry combinations and a loss of 2*(7-5 + 7-6) = 6 bits. However, a more appropriate use of the AAB technique is available: it could be used in a novel way for the purpose of creating new code addresses. Using the AAB technique, by adding one more bit to the two 4-bit vacant code addresses, one could get 2*2 5-bit vacant code addresses, making four 5-bit vacant code addresses available, enough for holding the four 5-bit un-accommodated Processing Units with all 3 unique data code values, without having to incur any bit loss. The final result, all breakeven in terms of bit usage, without bit gain or bit loss for this example, is then listed out in Diagram 58 as follows:
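The address-creating use of the AAB technique described above can be sketched as follows; treating "1100" and "1101" (the 4-bit addresses named in the example) as the two addresses to be branched is an illustration of the step, and the prefix-freeness helper is added here for verification:

```python
# Sketch of using the AAB technique to create new code addresses: one
# vacant code address is split into two longer ones by appending a bit.
# "1100" and "1101" are used purely to illustrate the branching step.
def branch(address):
    """Append one bit both ways: one n-bit address -> two (n+1)-bit ones."""
    return [address + "0", address + "1"]

vacant_4bit = ["1100", "1101"]
vacant_5bit = [child for addr in vacant_4bit for child in branch(addr)]
print(vacant_5bit)   # → ['11000', '11001', '11010', '11011']

# The four new 5-bit addresses do not collide with one another, so they
# can hold the four un-accommodated 5-bit Processing Units without loss.
def prefix_free(codes):
    return not any(a != b and b.startswith(a) for a in codes for b in codes)

assert prefix_free(vacant_5bit)
```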

Diagram 58

Breakeven upon re-distribution of code addresses and code entry values listed out in Diagram 56

Class Incompressible: using AAB technique for encoding with range

Class Compressible: where the 3rd and the 2nd position values are different, encoding using code adjustment techniques

The frequencies of the Class Incompressible and the Class Compressible above differ only slightly, with 8897 Processing Units for the first class and 8888 for the second. However, as the second class is assigned Scenarios 10 and 11, the skew is intended towards increasing the bit 1 ratio as against the bit 0 ratio. The result, however, is quite surprising: it turns out that after encoding 1 cycle, bit 1 decreases by 297 bits whereas bit 0 increases by 297 bits, as shown in Diagram 58 above. So even though there is no bit gain or bit loss after the encoding, the way the bit 0 : bit 1 ratio changes makes it quite possible to make the data compressible, the encoded data now being much more uneven in terms of the ratio distribution between bit 0 and bit 1. If the assignment of Scenario Code is adopted in the opposite way, the bit 0 : bit 1 gain or loss is as shown in the next Diagram 59:

Diagram 59

Using the opposite assignment of Scenario Code, different from that used in Diagram 58

Class Incompressible: using AAB technique for encoding with range

Class Compressible: where the 3rd and the 2nd position values are different, encoding using code adjustment techniques

out of 12 entry combinations

Class Compressible: where the 3rd and the 2nd position values are the same, encoding using code adjustment techniques

It turns out that the result this time has been improved and skewed towards the other direction, increasing bit 1 by 1955 and decreasing bit 0 by the same amount for a random data set of 80000 bits.

It could be concluded from the above that an indicator in the Header could be reserved for indicating how the Scenario Code is to be assigned, and also whether the 0 Head Design or the 1 Head Design of the Code Unit is being used to read the digital data, as well as which Head Design is being used to encode the data values after being read, as these need not be the same. Such variations could affect the ratio of bit 0 : bit 1 in the resulting encoded CHAN CODE. Such indicators are used for selecting the scenario that best serves the purpose of the encoding required.

For the above example of CHAN CODING used as an unevener encoder increasing bit 1 and decreasing bit 0 by 1955 each, bit 1 is therefore 3910 bits more than bit 0 in the resultant code thus encoded. For compressing such a data set, further change of the uneven distribution of bit 1 and bit 0 could be performed up to the point at which such uneven distribution could be compressed by using other techniques of CHAN CODING. For instance, one could use the Super Processing Unit with AI Distinction technique for making further compression. Or the different Code Unit Definitions for 6-value Code Units introduced in Paragraph [62] could be used, one Definition for reading the data and the other for encoding and writing the data. For instance, if bit 1 is more frequent than bit 0, the 17-bit 6-value Code Unit Definition of 1 Head Design could be used to read the data, whereas encoding and writing could be done with the 18-bit 6-value Code Unit Definition of the same Head Design. Since the 17-bit Definition uses 2 bits of bit code 11 for the value v1, and the 18-bit Definition uses 1 bit of bit code 1 for representing v1, upon writing using the 18-bit Definition the 2 bits of bit code 11 read are encoded into 1 bit of bit code 1 for each v1 read by the 17-bit Definition, thus reducing 2 bits into 1 bit. If the frequency of bit 1 is higher than that of bit 0, this may help in making compression.
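The Reader/Writer idea can be sketched as follows. The two code tables below are hypothetical prefix-free 6-value Code Unit Definitions whose codeword lengths sum to 17 and 18 bits respectively; only the v1 codewords ("11" read, "1" written) are taken from the text, the rest are assumptions chosen so that, as the text states, each v1 saves 1 bit, each v2 or v3 loses 1 bit, and v4 to v6 break even:

```python
# Hypothetical 17-bit Reader and 18-bit Writer definitions (prefix-free).
reader = {"v1": "11",  "v2": "10",  "v3": "011",
          "v4": "010", "v5": "001", "v6": "0000"}   # codeword lengths sum to 17
writer = {"v1": "1",   "v2": "011", "v3": "0011",
          "v4": "010", "v5": "000", "v6": "0010"}   # codeword lengths sum to 18

def transcode(bits):
    """Read with the 17-bit definition, re-write with the 18-bit one."""
    decode = {code: v for v, code in reader.items()}
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in decode:            # greedy match works: reader is prefix-free
            out.append(writer[decode[buf]])
            buf = ""
    return "".join(out), buf         # buf would be the un-encoded remainder

# 5 x v1 (save 1 bit each), one v2 and one v3 (lose 1 bit each): net -3 bits.
source = "11" * 5 + "10" + "011"     # 15 bits read
encoded, rest = transcode(source)
print(len(source), len(encoded))     # → 15 12
```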

[117] For instance, the 17-bit to 19-bit 6-value Code Units of 1 Head Design are shown in Diagram 60:

Diagram 60

Definition of Code Values of a 6-value Code Unit of 1 Head Design using 17, 18 and 19 bits

As discussed before, if a digital data set has more bit 1 than bit 0, then, using a 1 Head Design Code Unit Reader, v1 would have a much higher frequency than the other values individually. Comparing the above three 6-value Code Unit Definitions, it could be seen that if v1 has a higher frequency than the others, using the 17-bit Definition as Reader and writing with the 18-bit Writer, every count of v1 read saves 1 bit, and every count of v2 or v3 read loses 1 bit each; for v4, v5 and v6 the bit usage is breakeven. So as long as the frequency of v1 is higher than the frequencies of v2 and v3 added together, bit saving could be achieved. The 19-bit Writer could also be used, depending on the frequency distribution between the pair of v2 and v3 and that of v3 and v4 as compared to the frequency of v1.

[118] So, in order to take advantage of the slight differences in the bit 0 : bit 1 patterns of data sets for making compression in an easy way, one could construct pairs of 0 Head Design and 1 Head Design Code Unit Definitions of varying bit sizes for Code Units of other value sizes. For instance, the Reader and the Writer (or the Encoder) used in Paragraph [117] are based on the 6-value Code Unit; one could, however, construct such Readers and Writers based on other value sizes, such as 7-value, 8-value, 9-value, 10-value, and so on and so forth. The bigger the value size of the Code Unit Definition, the finer the bit 0 : bit 1 difference the Reader and the Writer could pick up and turn into bit storage saving. One could also carry out the unevening process several times (i.e. use the Unevener to do several cycles of changing the bit 0 : bit 1 ratio of the digital data set, skewing it towards one direction first) before encoding for compression. If one unevener could not always skew the data distribution pattern towards one direction, one could change the assignment pattern of the Classification Code from 0 Head to 1 Head or vice versa, or change or interchange the Head Design of the Reader and the Writer from one Head Design to another, or use a different design of Code Unit Definition of varying bit size as well as varying value size, as long as the path of making such changes is well recorded in the Header of the CHAN CODE or built into the encoder and decoder for use. What is more, the Unevener could be built with different Content Code. For instance, the Unevener discussed in Paragraph [114] and Diagram 57 uses Content Code classified into 2 Classes, one Class Incompressible and the other Class Compressible, before slight modification due to code redistribution optimization. This Unevener maps every unique code address to a unique data value one by one with the same bit size, so that every encoded code for any data value under processing is exactly the same bit size as that of the original corresponding data value, resulting in neither bit gain nor bit loss. Diagrams 58 and 59 show two different arrangements of Classification Code assignment to the same set of Content Code, using 0 Head and 1 Head Design, resulting in different ratio changes in terms of the bit 0 : bit 1 distribution pattern.
One could, however, design other Uneveners with different Content Code in a similar way, using the techniques introduced in Paragraph [114]: selecting a Terminating Condition for defining the nature of the Processing Unit to be processed, classifying the unique value entries of the Processing Unit according to the characteristics or traits identified, and making code adjustment, code swapping and code redistribution, so that another Unevener Encoder similar to the one made in Diagrams 57 to 59 could be created. Another example is found in the discussion of the Super Processing Unit in Paragraph [91] and Diagram 30. Using the Unevener in Diagram 30, the ratio of bit 0 : bit 1 of a data set is changed differently from the way it is changed using the Unevener in Diagram 58 or 59. So, as discussed before, when one Unevener could not change the ratio of bit 1 to bit 0 towards one constant direction anymore, such unidirectional change could be further taken up by another appropriately designed Unevener, and so on and so forth. The role of an Unevener Encoder is thus to un-even the distribution pattern of bit 0 and bit 1 of a digital data set. An unevener encoder resulting in more bit 1 than bit 0 (as compared with the even distribution of bit 0 : bit 1 in 1 : 1 ratio) could be called a Bit 1 Unevener, as against an unevener encoder skewing the data distribution (as compared with the even distribution of bit 0 : bit 1 in 1 : 1 ratio) towards more bit 0, which could then be called a Bit 0 Unevener. On the other hand, those encoders which tend to make the bit 0 : bit 1 ratio more even than before encoding (i.e. moving the bit 0 : bit 1 ratio towards the 1 : 1 direction) could be called Evener Encoders, or just Eveners.
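A minimal sketch of a Bit 1 Unevener in the sense just defined: a one-to-one mapping in which every encoded code has exactly the same bit size as the original value (no bit gain, no bit loss), chosen so that a bit-0-heavy input is skewed towards bit 1. The 2-bit mapping table is an illustrative assumption, not a table from the text:

```python
# A bijective, size-preserving mapping over 2-bit blocks; its inverse
# table guarantees lossless decoding.
UNEVENER = {"00": "11", "01": "10", "10": "01", "11": "00"}
INVERSE = {v: k for k, v in UNEVENER.items()}

def apply_table(table, bits):
    """Map the bit string 2 bits at a time (length must be even)."""
    return "".join(table[bits[i:i+2]] for i in range(0, len(bits), 2))

data = "0001000100010010"                   # bit-0 heavy: 12 zeros, 4 ones
out = apply_table(UNEVENER, data)
assert apply_table(INVERSE, out) == data    # lossless, same bit size
print(data.count("1"), out.count("1"))      # → 4 12
```

The same bits in, the same number of bits out, yet the bit 0 : bit 1 ratio is now skewed towards bit 1; chaining differently designed tables corresponds to using different Uneveners in succession.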

[119] In such a way, the technique of using Code Unit Definitions as Reader and Writer for making compression, as outlined in Paragraphs [115] to [117], could be used together with the Unevener(s) mentioned in Paragraph [118] as a definite proof of the end of the myth of the Pigeonhole Principle in Information Theory, and that any data set, whether random or not, is subject to compression in cycles up to a limit as explained previously in this invention.

[120] A compressor, in the course of compressing a digital data set, is apparently performing the role of an evener; otherwise any data set could be made compressible in cycle (of course up to a certain limit, as this invention reveals). The fact that, before this invention, methods in the Compression Field do not seem to have achieved the long desired goal of making data sets of any pattern of data distribution, whether random or not, compressible in cycle speaks for itself. With the revelation of this invention in the ways discussed in Paragraph [118], i.e. using Unevener and Evener in alternation, or in other ways as discussed in Paragraph [124], the aforesaid goal is definitely in sight.

[121] For encoding and decoding using Unevener and Evener in alternation, the Header of the resultant encoded CHAN CODE FILE basically includes the following indicators for use:

(a) Check-sum Indicator: present if appropriate; the decoder uses it to identify whether the file to be decoded is a valid CHAN CODE FILE, so it includes the Signature Indicator designed by the designer for the corresponding CHAN CODE FILE produced by the encoder, indicating whether the file is a valid file for use;

(b) Recycle Bit or Indicator: a bit written by the encoder for use by the decoder, indicating whether the decoder has to stop after decoding the current cycle of processing; and

(c) The Mapping Table or Code Unit Definition Indicator used (by the encoder in encoding or) for decoding the layer of digital data of the current cycle of encoding; one could use another Indicator Bit (the Unevener/Evener Indicator) for distinguishing whether the current layer of encoded CHAN CODE has been produced using an Unevener Mapping Table or an Evener Mapping Table.

[122] Besides the Header, the resultant CHAN CODE FILE also includes two other sections:

(A) The CHAN CODE Section: containing the encoded CHAN CODE, using the Reader of the chosen Code Unit Definition for reading the input digital data and using the Writer for writing or encoding the digital data read, the Writer being the encoder using either the Code Unit Definition or the Mapping Table as indicated in the Header for writing, or containing the programming logic for implementing the Code Unit Definition or the Mapping Table for use in encoding; the encoded CHAN CODE here contains Classification and Content Code where appropriate; and

(B) The Un-encoded Code Section: this is the section of binary bits representing the part of the input digital data which is left un-encoded as it is read, usually placed at the end of the resultant CHAN CODE FILE; it is designed as the section of code whose number of bits is not enough to make up one Processing Unit or one Super Processing Unit, so that it could not be encoded by the encoding techniques being used.
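The Header indicators of Paragraph [121] and the two sections of Paragraph [122] can be sketched as one flat file layout. All field names and widths here (1-bit flags, a 4-bit table id) are assumptions for illustration; the text fixes only which indicators and sections exist, not their encoding:

```python
# Sketch of a CHAN CODE FILE as a flat bit string: Header indicators
# first, then the CHAN CODE Section, with the Un-encoded Code Section
# placed at the end.
def pack_file(signature_ok, recycle, table_id, chan_code, unencoded):
    header = ("1" if signature_ok else "0")   # Check-sum / Signature Indicator
    header += ("1" if recycle else "0")       # Recycle Bit
    header += format(table_id, "04b")         # Mapping Table / Code Unit Definition Indicator
    return header + chan_code + unencoded     # (A) CHAN CODE, then (B) at the end

def unpack_header(file_bits):
    return {
        "signature_ok": file_bits[0] == "1",
        "recycle": file_bits[1] == "1",
        "table_id": int(file_bits[2:6], 2),
        "body": file_bits[6:],                # CHAN CODE + Un-encoded Section
    }

f = pack_file(True, False, 5, "101001", "11")
print(unpack_header(f)["table_id"])           # → 5
```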

[123] As mentioned previously, the separate pieces of digital information identified in Paragraphs [121] and [122] could be separately placed into different CHAN CODE FILES as separate entities for storage. The corresponding design should enable the decoder to gain access to and correctly identify them for use in decoding. Upon decoding, the decoder uses either the Code Unit Definition or the Mapping Table as indicated in the Header of the respective resultant CHAN CODE FILE for reading the input encoded CHAN CODE FILE(S), or contains the programming logic for implementing the Code Unit Definition or the Mapping Table for use in decoding (i.e. using the corresponding Code Unit Definition or Mapping Table, or the corresponding built-in programming logic, for translating the encoded code back to the input digital data code and writing it out); the resultant decoded code is the input digital data code before that cycle of encoding.

[124] If not used with an evener in alternation, the unevener could also be used for compressing digital data when carried to the extreme at which all the incoming binary bits are reduced to either bit 0 or bit 1 through cycles of unevening processing. The resultant CHAN CODE FILE (of the unevening path) being produced is then just the information relating to the path the unevener encoder takes, including the bit size of the original digital data input, the number of unevening cycles made where necessary or appropriate, and the Code Unit Definitions or Mapping Tables used in each cycle of unevening. The unevener decoder could therefore rely on such information on the unevening path the unevener encoder takes to restore the original digital data correctly and losslessly. In this way, the encoding process could include one or more than one cycle of unevener encoding but without evener encoding or compressor encoding.

Another variant way of using the unevener is to use it for a number of cycles before the last cycle of processing, in which the encoding is done by a compressor (or an evener used for compression). In this way, the encoding process includes one or more than one cycle of unevener encoding before the last cycle of compressor encoding. The structure of the resultant CHAN CODE and CHAN CODE FILE(S) of this variant is similar to that described in Paragraphs [121] to [123]. As an evener is a compressor which tends to make the ratio of bit 0 : bit 1 of the data set more even, when used as the encoder for the last cycle or last layer of encoding, the direction towards which the data distribution is skewed as a result of the use of such an evener or compressor is not important anymore; the skew could be either way as long as it compresses the data as intended. The term Evener or Evener Encoder is taken to be the same as Compressor or Compressor Encoder in the present invention.

[125] CHAN FRAMEWORK and CHAN CODING as revealed above at least demonstrate that the number of code addresses as against the number of unique data values is not the factor (as misguided by the myth of the Pigeonhole Principle in Information Theory in the past) that imposes the limit for making data compression. What matters is the frequency distribution of the data values of the input data set. The CHAN CODING method implemented using the techniques introduced in the above paragraphs performs the feat that, in encoding the input data set, whether random or not, it puts into the encoded code the information of the input data plus the information of the Classification Code representing the traits of the input data, as well as altering the data distribution, so that upon decoding the encoded code using the corresponding techniques of CHAN CODING, the input data could be correctly restored losslessly; and up to a certain limit, the corresponding digital data could be encrypted and compressed in cycles. The limit, subject to the design and schema being used or implemented, consists of the code bits representing the Header and the relevant Indicators used, the CHAN CODE being fitted into the size of one Processing Unit or one Super Processing Unit, and the Un-encoded Code Section, if any (as there may not be any such un-encoded binary bits left behind), which is the un-encoded binary bits whose number is less than the size of one Processing Unit or one Super Processing Unit but more than 0. In this paragraph, the CHAN CODE thus mentioned is the CHAN CODE CORE, being the encoded code for the content or data value part of the original input data; the other additional information as contained in the Header and the indicators it contains (as well as that part of information and programming logic built into the Encoder and Decoder) and the Un-encoded Code Section also belong to CHAN CODE, being the CHAN CODE PERIPHERAL, used together with the CHAN CODE CORE in the encoding and decoding process so that the original input data could be perfectly encoded and the encoded code perfectly decoded and restored to the original input data correctly and losslessly.

[126] All in all, the conclusion is again:

Let him that hath understanding count the number

Advantageous Effects

[127] As there is no assumption about the incoming digital information, any numbers, including random numbers or numbers in even distribution or not, could be encrypted or compressed in cycle subject to the limit described above. In the present era of information explosion, a method that enables encryption and compression of digital data, random or not in distribution, in cycle makes a great contribution to the whole of mankind, which makes use of and relies on the exchange and storage of digital data in every aspect of life. It surely could also contribute to the effort of manned space exploration or resettlement.

Best Mode

[128] The best mode introduced so far in the present invention is the use of Unevener and Evener in alternation for encoding and decoding, as this provides a definite proof that any digital data set could be encoded and decoded in cycle up to the limit described, a proof that puts an end to the myth of the Pigeonhole Principle in Information Theory. That does not mean that other techniques of CHAN CODING in other modes could not produce the same result or the same proof. It is predicted that the same result and the same proof could also be provided using other modes.

Mode for Invention

[129] Other modes include the use of Super Processing Units for breaking down a random data set into sub-sections or sub-units of uneven data that are susceptible to compression, especially through the technique of setting criteria for using AI distinction of such sub-sections; the use of Processing Units of varying sizes with appropriate use of the Terminating Condition and criteria of classification according to traits or characteristics of the content of the digital data values for encoding and decoding; as well as the use of mathematical formula(e) and the placement of their corresponding values for encoding and decoding, especially for easy design of encrypting schema.

[130] What is of the most importance is that CHAN FRAMEWORK, as seen from the above discussion, provides a framework that could be used to create order from data, whether random or not, allowing statistics to be generated from it in terms of the schema and design one chooses for describing the particular data set under processing (the schema and design including the design of Code Unit, Processing Unit, Super Processing Unit, Un-encoded Code Section, and the Header containing essential indicators designed for use, or such information and programming logic built into the Encoder and Decoder, resulting in CHAN CODE to be represented in digital binary bits in the form of CHAN CODE FILES), and allowing the use of techniques of CHAN CODING for encoding and decoding for the purposes of compression and encryption where appropriate. Such aforesaid statistics include the sizes of the Code Unit, Processing Unit and Super Processing Unit, their frequency distribution, the rank and position of the data code values, and other characteristic information such as the relations between different data code values as expressed in mathematical formulae, the terminating value and terminating condition, the ratio between bit 0 and bit 1, the data ranges, etc., as discussed above. Such characteristics or traits of the data set could be described so that relations or derived traits could be created for encoding and decoding purposes. For instance, one particularly useful trait is the Absolute Address Branching Code, which could also be used, for example, as a Code Unit Definition by itself, or as the Content Code, or as the Scenario Classification Code, as well as a suffix to trios of Content Code for use as a criterion in making AI Distinction.

So CHAN FRAMEWORK is a rich framework allowing great flexibility during the design stage when used in creating order out of any data set of whatever data distribution, which is made describable under the Framework so that techniques can be developed for seizing differences between data values, which could then be conscientiously manipulated, such as through cycles of altering the ratio between bit 0 and bit 1 of a data set so that the unevenness of the data distribution could be multiplied for the purpose of making re-cycling data compression possible, or through the design of mathematical formula(e) expressing the relationship between different components of a Processing Unit for the purpose of encrypting the corresponding digital data set, either in itself or before making further compression for it again.
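The kind of statistics listed above can be generated by a short sketch, here under an assumed design of 2-bit Code Units, with the leftover bits forming the Un-encoded Code Section:

```python
# Generating framework statistics from a raw bit stream under a chosen
# (assumed) design: 2-bit Code Units, leftover bits left un-encoded.
from collections import Counter

def framework_stats(bits, unit_size=2):
    whole = len(bits) - len(bits) % unit_size
    units = [bits[i:i+unit_size] for i in range(0, whole, unit_size)]
    return {
        "bit0_bit1_ratio": (bits.count("0"), bits.count("1")),
        "unit_frequencies": Counter(units),      # frequency distribution of Code Units
        "unencoded_section": bits[whole:],       # Un-encoded Code Section
    }

stats = framework_stats("010011101")             # 9 bits -> 4 Code Units + 1 leftover
print(stats["bit0_bit1_ratio"], stats["unencoded_section"])   # → (4, 5) 1
```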

[131] Which mode to use is a matter of choice, depending on the primary purpose of encoding and decoding, be it for encryption or for compression or both. However, as re-compression in cycle could easily be made, it is insignificant to make the distinction.

[132] In essence, the present invention is characterized by:

CHAN FRAMEWORK, method of creating order out of digital data information, whether random or not, being characterized by an order of data or a data order or a data structure or a data organization created from any digital data set, whether random or not, consisting of Code Unit as the basic unit of bit container containing binary bits of a digital data set for use; according to the design and schema chosen for processing for the purpose of encoding and decoding, Code Unit being classified primarily by the maximum possible number of data values a Code Unit is defined to hold or to represent, i.e. the value size of a Code Unit, where each of the possible unique values of a Code Unit could have the same bit size or different bit sizes; and Code Unit then being classified by the number of bits all the possible unique data values altogether of a Code Unit occupy, i.e. the sum of the bit size of each of the possible unique data values of a Code Unit takes up; and Code Unit being further classified by the Head Design, i.e. whether it is of 0 Head Design or 1 Head Design; whereby Code Unit of a certain value size under CHAN FRAMEWORK could have different definitions and versions;

CHAN FRAMEWORK, method of creating order out of digital data information, whether random or not, being characterized by an order of data or a data order or a data structure or a data organization created from any digital data set, whether random or not, consisting of Processing Unit(s) which is made up by a certain number of Code Units as sub-units according to the design and schema chosen for processing for the purpose of encoding and decoding;

CHAN FRAMEWORK, method of creating order out of digital data information, whether random or not, being characterized by an order of data or a data order or a data structure or a data organization created from any digital data set, whether random or not, consisting of Super Processing Unit(s) which is made up by a certain number of Processing Unit(s) as sub-units according to the design and schema chosen for processing for the purpose of encoding and decoding;

CHAN FRAMEWORK, method of creating order out of digital data information, whether random or not, being characterized by an order of data or a data order or a data structure or a data organization created from any digital data set, whether random or not, consisting of Un-encoded Section which is made up by a certain number of binary bits, which do not make up to the size of one Processing Unit, thus left as un-encoded or left as it is according to the design and schema chosen for processing for the purpose of encoding and decoding;


CHAN FRAMEWORK, method of creating order out of digital data information, whether random or not, being characterized by an order of data or a data order or a data structure or a data organization created from any digital data set, whether random or not, consisting of traits or characteristics or relations that are derived from Code Unit(s), Processing Unit(s), Super Processing Unit(s) and Un-encoded Section as well as their combination in use according to the design and schema chosen for processing for the purpose of encoding and decoding;

CHAN FRAMEWORK, method of creating order out of digital data information, whether random or not, being characterized by a descriptive language that is used to describe the traits or characteristics or relations of any digital data set using the terminology for describing the traits or characteristics or relations of Code Unit, Processing Unit, Super Processing Unit and Un-encoded Section;

CHAN CODING, method of encoding and decoding, being characterized by techniques for processing data for the purpose of encoding and decoding under CHAN FRAMEWORK;

CHAN CODING, method of encoding and decoding, being characterized by the resultant CHAN CODE created out of any digital data set using techniques of CHAN CODING;

CHAN CODING, method of encoding and decoding, being characterized by the technique using Absolute Address Branching Technique with range;

CHAN CODING, method of encoding and decoding, being characterized by the technique using mathematical formula(e) for representing the relations between Code Units of a Processing Unit of the data order created under CHAN FRAMEWORK;

CHAN CODING, method of encoding and decoding, being characterized by the technique of placement, placing the values or encoded codes as represented by mathematical formula(e) as well as those values or encoded codes of Code Unit, Processing Unit, Super Processing Unit and Un-encoded Section in different position order;

CHAN CODING, method of encoding and decoding, being characterized by a technique of classification, i.e. the assignment of 0 Head Design or 1 Head Design or both, represented by the associated bit pattern, to trait(s) or characteristic(s) of the digital data under processing that is/are used to classify or group data values for processing for the purpose of encoding and decoding;

CHAN CODING, method of encoding and decoding, being characterized by a technique of classification, i.e. the use of trait(s) or characteristic(s) in terms of Rank and Position of the data values of the digital data under processing for classifying or grouping data values for processing for the purpose of encoding and decoding;

CHAN CODING, method of encoding and decoding, being characterized by a technique of classification, i.e. the use of code re-distribution, including re-distribution of unique data values as well as unique address codes from one class to another class of the classification scheme by use of any one of the following techniques including code swapping, code re-assignment and code re-filling for processing digital data set for the purpose of encoding and decoding;
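Of the re-distribution techniques named above, code swapping is the simplest to picture: two unique data values exchange their assigned address codes between classes. The sketch below is a hypothetical illustration of that one operation only, with invented names and an invented mapping table.

```python
# Hypothetical sketch of code swapping: two unique data values exchange
# their assigned unique address codes, so that, for example, a more
# frequent value can receive the shorter code.
def swap_codes(mapping, value_a, value_b):
    mapping = dict(mapping)  # leave the original mapping table untouched
    mapping[value_a], mapping[value_b] = mapping[value_b], mapping[value_a]
    return mapping

table = {"A": "110", "B": "0"}        # initial one-to-one mapping table
table2 = swap_codes(table, "A", "B")  # re-distribute the two address codes
assert table2 == {"A": "0", "B": "110"}
```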

CHAN CODING, method of encoding and decoding, being characterized by techniques of code adjustment, including any one of the following techniques including code promotion, code demotion, code omission as well as code restoration for processing for the purpose of encoding and decoding;

CHAN CODING, method of encoding and decoding, being characterized by technique of using Terminating Condition or Terminating Value for defining the size of a Processing Unit or a Super Processing Unit for processing for the purpose of encoding and decoding;

CHAN CODING, method of encoding and decoding, being characterized by technique of using Code Unit Definition as Reader of digital data values or encoded code values;

CHAN CODING, method of encoding and decoding, being characterized by technique of using Code Unit Definition as Writer of digital data values or encoded code values;

CHAN CODING, method of encoding and decoding, being characterized by technique of using Super Processing Unit for sub-dividing a digital data set into sub-sections of data of which at least one sub-section is not in random for processing for the purpose of encoding and decoding;

A method of Claim [20] being characterized by further classifying the Super Processing Units of the digital data set into classes, two or more, using a classifying condition, such as the number of value entries appearing in the Super Processing Unit for a particular class; and by designing mapping tables which are appropriate to the data distribution of each of these classes for encoding and decoding; and by encoding and decoding the data values of each of these Super Processing Units with the use of their respective mapping table appropriate to the data distribution of the data values of each of these Super Processing Units; and using indicators to make distinction between these classes of Super Processing Units for the use in decoding, such indicators being kept at the head of each of these Super Processing Units or elsewhere as in separate CHAN CODE FILES;

A method of Claim [20] being characterized by further classifying the Super Processing Units of the digital data set into classes, two or more, using a classifying condition, such as the number of value entries appearing in the Super Processing Unit for a particular class; and by designing mapping tables which are appropriate to the data distribution of each of these classes for encoding and decoding; and by encoding and decoding the data values of each of these Super Processing Units with the use of their respective mapping table appropriate to the data distribution of the data values of each of these Super Processing Units; and by setting criteria appropriate to the data distribution of the classes of Super Processing Units and the corresponding mapping tables used for encoding and decoding for use in assessing the encoded code for making Artificial Intelligence distinction between the classes of Super Processing Units so that the use of indicators could be dispensed with;

A method of Claim [20] being characterized by further classifying the Super Processing Units of the digital data set into two classes, using a classifying condition, such as the number of value entries appearing in the Super Processing Unit for a particular class; and by designing mapping tables which are appropriate to the data distribution of each of these classes for encoding and decoding, whereby at least one of these mapping tables could serve and thus be chosen to serve as an unevener, and such an unevener could also be adjusted through the use of code re-distribution so that it could take advantage of the data distribution of the data values of at least one class of Super Processing Units, whereby the unevener mapping table after code adjustment through code re-distribution could serve and thus be chosen as the mapping table of a compressor for at least one class of Super Processing Units; and by encoding all the Super Processing Units using the unevener in the first cycle; and then by encoding at least one class of the Super Processing Units using the compressor where compression of data of the respective Super Processing Unit under processing is feasible in the second cycle, i.e. encoded with the use of the unevener in the first cycle and the compressor in the second cycle, leaving those Super Processing Units with incompressible data as they are, i.e. encoded with the use of the unevener only; and decoding the data values of each of these Super Processing Units with the use of their respective mapping table(s) appropriate to the data distribution of the data values of each of these Super Processing Units, whereby in the first cycle of decoding, the encoded code formed out of unevener encoding and compressor encoding is decoded so that the layer of compressor encoding is removed, and in the second cycle of decoding, the encoded code, consisting of only unevener encoded code, of all the Super Processing Units is decoded by the unevener decoder; and by setting criteria appropriate to the data distribution of the classes of Super Processing Units and the corresponding mapping tables used for encoding and decoding for use in assessing the encoded code for making Artificial Intelligence distinction between the classes of Super Processing Units so that the use of indicators could be dispensed with;

CHAN CODING, method of encoding and decoding, being characterized by the technique of creating Unevener Encoder and Unevener Decoder by building a mapping table and using the unique code addresses of the said mapping table for mapping the unique data values of the digital data input in one to one mapping whereby the number of bit(s) used by the unique data values and that used by the corresponding mapped unique table code addresses of the corresponding mapped pair is the same; by using the said mapping table for encoding and decoding;
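The defining property of such an unevener is that each mapped pair uses the same number of bits, so encoding changes only the distribution of bit patterns, never the size. The table below is a minimal hypothetical sketch under that assumption; the specific mapping is invented for illustration.

```python
# Hypothetical sketch of an Unevener Encoder/Decoder: a one-to-one
# mapping table in which each unique data value and its mapped unique
# table code address occupy the same number of bits, so the round trip
# is lossless and the output size equals the input size.
UNEVENER = {"00": "01", "01": "00", "10": "10", "11": "11"}
DECODER = {code: value for value, code in UNEVENER.items()}

def unevener_encode(bits):
    return "".join(UNEVENER[bits[i:i + 2]] for i in range(0, len(bits), 2))

def unevener_decode(bits):
    return "".join(DECODER[bits[i:i + 2]] for i in range(0, len(bits), 2))

data = "00011011"
enc = unevener_encode(data)
assert len(enc) == len(data)           # same bit count per mapped pair
assert unevener_decode(enc) == data    # lossless round trip
```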

CHAN CODING, method of encoding and decoding, being characterized by the technique of using Unevener Encoder and Unevener Decoder for processing for the purpose of encoding and decoding;

CHAN CODING, method of encoding and decoding, being characterized by the technique of using Unevener Encoder and Unevener Decoder together with an Evener Encoder and Decoder or a Compressor and Decompressor for processing for the purpose of encoding and decoding;

CHAN CODING, method of encoding and decoding, being characterized by technique of dynamic adjustment of the size of Processing Unit or Super Processing Unit in the context of changing data distribution and in accordance with the Terminating Condition used under processing;

CHAN CODING, method of encoding and decoding, being characterized by technique of dynamic adjustment of Code Unit Definition in accordance with the data distribution pattern of the data values under processing;

CHAN CODE being characterized by Classification Code and Content Code, which are created out of any digital data set using techniques of CHAN CODING for processing for the purpose of encoding and decoding;

CHAN CODE being characterized by Classification Code, Content Code and Un-encoded Code Section, which are created out of any digital data set using techniques of CHAN CODING for processing for the purpose of encoding and decoding;

CHAN CODE being characterized by Header, Classification Code and Content Code, which are created out of any digital data set using techniques of CHAN CODING for processing for the purpose of encoding and decoding, whereby the said Header contains indicator(s) resulting from the use of CHAN CODING technique(s) for processing for the purpose of encoding and decoding;

CHAN CODE being characterized by Header, Classification Code, Content Code and Un-encoded Code Section, which are created out of any digital data set using techniques of CHAN CODING for processing for the purpose of encoding and decoding, whereby the said Header contains indicator(s) resulting from the use of CHAN CODING technique(s) for processing for the purpose of encoding and decoding, such indicator(s) including any of the following: Checksum Indicator, Signature for CHAN CODE FILES, Mapping Table Indicator, Number of Cycle Indicator, Code Unit Definition Indicator, Processing Unit Definition Indicator, Super Processing Unit Definition Indicator, Last Identifying Code Indicator, Scenario Design Indicator, Unevener/Evener Indicator, Recycle Indicator, Frequency Indicator;

Encoder and Decoder being characterized by being embedded with techniques of CHAN CODING for processing;

Encoder and Decoder being characterized by being embedded with techniques of CHAN CODING and Header Indicator(s) for processing, such indicator(s) including any of the following: Checksum Indicator, Signature for CHAN CODE FILES, Mapping Table Indicator, Number of Cycle Indicator, Code Unit Definition Indicator, Processing Unit Definition Indicator, Super Processing Unit Definition Indicator, Last Identifying Code Indicator, Scenario Design Indicator, Unevener/Evener Indicator, Recycle Indicator, Frequency Indicator;

CHAN CODE FILES, being digital information files containing CHAN CODE;

CHAN CODE FILES, being digital information files containing additional information for the use by CHAN CODING techniques, including Header and the indicator(s) contained therein, such indicator(s) including any of the following: Checksum Indicator, Signature for CHAN CODE FILES, Mapping Table Indicator, Number of Cycle Indicator, Code Unit Definition Indicator, Processing Unit Definition Indicator, Super Processing Unit Definition Indicator, Last Identifying Code Indicator, Scenario Design Indicator, Unevener/Evener Indicator, Recycle Indicator, Frequency Indicator;

CHAN MATHEMATICS being a mathematical method using techniques whereby data values are put into an order that is able to be described in mathematical formula(e) corresponding to the respective CHAN SHAPE, including the associated mathematical calculation logic and techniques used in merging and separating digital information, such digital information including values of Code Units of a Processing Unit in processing digital information, whether at random or not, for the purpose of encoding and decoding;

CHAN FORMULA(E) being formula(e) describing the characteristics and relations between basic components, the Code Units, and derived components such as RP Piece of CHAN CODE and other derived components, such as the Combined Values or sums or differences of values of basic components of a Processing Unit for processing digital information, whether at random or not, for the purpose of encoding and decoding;

CHAN SHAPES including CHAN DOT, CHAN LINES, CHAN TRIANGLE, CHAN RECTANGLES, CHAN TRAPEZIA, CHAN SQUARES and CHAN BARS representing the characteristics and relations of the basic components of a Processing Unit;

COMPLEMENTARY MATHEMATICS using a constant value or a variable containing a value as a COMPLEMENTARY CONSTANT or COMPLEMENTARY VARIABLE for mathematical processing, making the mirror value of a value or a range or ranges of values being obtainable for use;
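As a minimal illustration of the mirror relation described above: with a COMPLEMENTARY CONSTANT c, the mirror of a value v is c - v, and mirroring twice restores the original value. The constant 255 below is an assumed choice for 8-bit values, not one prescribed by the claims.

```python
# Hypothetical illustration of COMPLEMENTARY MATHEMATICS: given a
# COMPLEMENTARY CONSTANT c, every value v has a mirror value c - v,
# and a range of values maps onto its mirrored range.
COMPLEMENTARY_CONSTANT = 255  # assumed constant for 8-bit values

def mirror(value, c=COMPLEMENTARY_CONSTANT):
    return c - value

assert mirror(100) == 155
assert mirror(mirror(100)) == 100                          # mirroring is its own inverse
assert [mirror(v) for v in (0, 1, 2)] == [255, 254, 253]   # mirrored range
```

The same relation holds for a COMPLEMENTARY VARIABLE: substituting any other value of c for the constant leaves the double-mirror identity intact.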

CHAN MATHEMATICS using COMPLEMENTARY MATHEMATICS and normal mathematics or either of them alone for processing;

Use of CHAN FRAMEWORK for the purpose of encryption/decryption or compression/decompression or both;

Use of CHAN CODING for the purpose of encryption/decryption or compression/decompression or both;

Use of CHAN CODE for the purpose of encryption/decryption or compression/decompression or both;

Use of CHAN CODE FILE(S) for the purpose of encryption/decryption or compression/decompression or both;

Use of CHAN MATHEMATICS for the purpose of encryption/decryption or compression/decompression or both;

Use of COMPLEMENTARY MATHEMATICS for the purpose of encryption/decryption or compression/decompression or both;

Use of CHAN SHAPE(S) for the purpose of encryption/decryption or compression/decompression or both;

A method of parsing digital data set, whether random or not, for collecting statistics about digital data set for the purpose of encoding and decoding, characterized by using design and schema of data order defined under CHAN FRAMEWORK;

A method of describing digital data set, whether random or not, characterized by using CHAN FRAMEWORK LANGUAGE; and

CHAN CODING, method of encoding and decoding, being characterized by technique of using Posterior Classification Code or Interior Classification Code or Modified Content Code as Classification Code.

Industrial Applicability

[133] There are numerous industrial applications that could use CHAN FRAMEWORK and CHAN CODING and its related design and schema at an advantage, including all computer applications that process digital information, including all types of digital data, whether in random distribution or not.

[134] The prior art for the implementation of this invention includes computer languages and compilers for making executable code and operating systems as well as the related knowledge for making applications or programs; the hardware of any device(s), whether networked or standalone, including computer system(s) or computer-controlled device(s) or operating-system-controlled device(s) or system(s), capable of running executable code; and computer-executable or operating-system-executable instructions or programs that help perform the steps for the method of this invention. In combination with the use of the technical features contained in the prior art stated above, this invention makes possible the implementation of CHAN FRAMEWORK using CHAN CODING for the processing of digital information, whether at random or not, through encoding and decoding losslessly and correctly the relevant digital data, including digital data and digital executable codes, for the purpose of encryption/decryption or compression/decompression or both; and in this relation, is characterized by the following claims:

Sequence List Text

[]