Login| Sign Up| Help| Contact|

Patent Searching and Data


Title:
METHOD FOR RANDOMISING DATA COMPRESSION DICTIONARIES
Document Type and Number:
WIPO Patent Application WO/2016/162792
Kind Code:
A1
Abstract:
A method of randomising a connection structure forming part of a dictionary in a computer memory device is provided. The structure includes a plurality of leaf connections and a plurality of non-leaf connections, with each leaf connection representing a symbol, and each non-leaf connection representing a relationship between two symbols, a symbol and a non-leaf connection, a non-leaf connection and a symbol, or two non-leaf connections. The method including copying connections, storing the copies at randomly determined available address in a new dictionary, and maintaining the relationships between the copied connections that existed in the original dictionary.

Inventors:
SMITH RODNEY (NZ)
HARTE SAMUEL (NZ)
Application Number:
PCT/IB2016/051929
Publication Date:
October 13, 2016
Filing Date:
April 05, 2016
Export Citation:
Click for automatic bibliography generation   Help
Assignee:
SMITH RODNEY (NZ)
International Classes:
H03M7/30; G06F21/62; H04L9/06; H04L9/12; H04L9/28
Domestic Patent References:
WO2015065203A12015-05-07
Foreign References:
US20090309769A12009-12-17
US8665124B22014-03-04
US6674908B12004-01-06
US20110246503A12011-10-06
Download PDF:
Claims:
CLAIMS

What is claimed is: 1. A method of randomising a connection structure forming part of a dictionary stored in a computer memory device,

said structure comprising a plurality of leaf connections and a plurality non-leaf connections, each leaf connection representing a symbol, and

each non-leaf connection representing a relationship between two symbols, a symbol and a non-leaf connection, a non-leaf connection and a symbol, or two non-leaf connections,

each connection existing at a unique address in the dictionary;

said method comprising:

identifying a randomly determined connection address at which a connection may be stored, replacing each reference to the address of an existing connection with a reference to the randomly determined connection address,

moving or copying the existing connection to the randomly determined connection address. 2. A method of randomising a dictionary comprising a plurality of connections stored in computer memory wherein each initially exiting connection and connection address reference in the structure is randomised according to claim 1.

3. A method according to claim 1 comprising:

obtaining a random number,

assigning the random number to a given connection,

replacing each reference in the dictionary to that connection with the random number, moving or copying that connection to the address referenced by the random number.

4. A method according to claim 3 comprising:

creating a list of unique random numbers whose values lie within the range of the lowest connection address number in the dictionary and the highest connection address number in the dictionary,

processing each address field in a connection and replacing the value stored in the field with a random number from the random number list.

5. A method according to claim 4 of modifying a connection comprising:

creating a position pointer P as an index to the list of random numbers,

assigning the address number stored in a given address field in the connection to P, looking up the random number stored at index P in the random number list, writing this found random number to the address field in the connection, and performing this for each address field in the connection.

6. A method according to claim 4 comprising:

assigning to P the address of the connection,

looking up the random number list at index P,

adopting the random number located at index P as the address of the now modified connection,

storing the now modified connection at this address.

Description:
METHOD FOR RANDOMISING DATA COMPRESSION DICTIONARIES

TECHNICAL FIELD

This invention relates to the fields of data compression and data privacy and particularly to the field of the security of data transmitted in electronic form.

BACKGROUND

Nowadays, communication often takes the form of information embodied in electronic transmissions including transmissions along metal wires, glass fibres, by electromagnetic radiation through air, and in other ways. Electronic digital computers often facilitate the transmission and storage of such information, and respective computing devices might include desktop computers, laptops, net books, tablets, smart phones and other devices.

It is well known that data transmitted by electronic or other means may be intercepted, copied and stored by unknown parties for one or more unknown purposes. For business, personal and other reasons, this is not ideal.

One form of encrypted transmission comprises addresses of information stored in a data compression dictionary, which dictionary is not transmitted with the respective address stream. A new instance of the original information is created by an algorithm going to the locations in the same or a copy of the dictionary whose addresses are those of the transmission. The algorithm emits as output the leaf node symbols contained in the structures at the found addresses, and thus creates a new instance of the original information. Such leaf node symbols could be any of the 256 values of a byte.

One shortcoming of such a dictionary-based method is that transmission of a dictionary over the Internet or other transmission means may be intercepted, and the dictionary come into the hands of an attacking party, which party may then intercept a respective address stream, then recover the original using the intercepted dictionary.

It would be better if there were a method of transmitting then using a dictionary where, even if that transmission were intercepted and the dictionary copied, the intended recipient of the dictionary could still use the dictionary in such a way that transmitted address streams were not vulnerable to successful attack by an interceptor who had intercepted and now possesses a copy of the dictionary.

SUMMARY OF THE INVENTION

The present invention enables an original dictionary comprising structural elements to be transformed into a new dictionary in which the original structural elements or copies of them are mostly or completely located at randomly determined addresses inside the new dictionary.

A crypto system is fully defined by the algorithm (process) and key (shared secret). One use of such a dictionary of structural elements is to be a key for communicating data in an encrypted form. The addresses of the structures inside the dictionary are used as the codewords emitted as cipher-text during the encryption process, and thus the encrypted output of the encryption process comprises a sequence of addresses of places inside the dictionary where structural elements are stored.

A such dictionary, regarded as a key, can be provided to a user. The user will desire that this key be unique to themselves. This would not be the case if a given dictionary were sent, for instance vial email, to a number of users, or if a transmitted dictionary were intercepted and copied.

This problem of certainty of uniqueness of key can be addressed by giving a user a dictionary and also a means to randomise the addresses of the places inside the dictionary where the structural elements are located, where each randomisation session uses entropy from the environment and seeds the randomisation process with adequate entropy.

Such a randomisation process produces a "randomised" dictionary that is unique to the intended recipient, even though copies of the sent dictionary may be in the hands of multiple agents.

The present invention provides a method of randomising dictionaries that contain structural elements located at accessible addresses, including binary tree structures of nodes and connections. In many areas of business, community and personal life, security and privacy of information is important. The present invention has a general applicability in improving security and privacy for business, community and personal use of electronic communication generally. The present invention may be used in a variety of different ways whose primary utility may not be limited to or may not relate to privacy and security of information. The purpose and use of the present invention is therefore expressly not limited to the purpose and use exemplified in the embodiments described herein. BRIEF DESCRIPTION OF THE DRAWINGS

The above described advantages and operation of the present invention will be more fully understood upon reading the following description of the preferred embodiment in conjunction with the drawings, of which:

FIG. 1 illustrates a leaf connection structure and a non-leaf connection structure.

FIG. 2 is a flow chart illustrating the process of creating a randomised dictionary.

FIG. 3 is a flow chart illustrating the process of creating a randomised connection which is part of the process of creating a randomised dictionary.

DESCRIPTION OF THE INVENTION

As required, a detailed embodiment of the present invention is disclosed herein; it is to be understood that the disclosed embodiment is merely exemplary of the invention, which may be embodied in various forms. Therefore specific structural and functional details disclosed herein are not to be interpreted as limiting, but merely as a basis for the claims and as a representative basis for teaching one skilled in the art to variously employ the present invention in virtually any appropriately detailed structure. The present invention includes by way of reference the invention described in the patent specification of the present inventory published as W098/39723.

Dictionary structure An existing dictionary of a type the same as or relevantly similar to that described in W098/39723 comprising a structure of connections realised as a set of connection records of the types indicated in FIG 1 is loaded into computer memory. This connection structure comprises four fields Fl - F4, each containing a binary integer.

Creating a randomised dictionary

Referring now to FIG. 1, the connection record structure 110 is a non-leaf connection and exists in the dictionary at address 25470 and contains four fields Fl - F4 each of which stores a value which is the address of a connection record. The first field Fl contains the integer 65498 which identifies the connection record at address 65498.

The dictionary structure may be abstractly conceived of as a forest of binary trees realised as leaf connection records and non-leaf connection records. The leaf connection record 120 is characterised by the first field Fl containing NULL, or binary zero, and the second field F2 containing an identifier of the leaf symbol, typically the byte value of the leaf symbol. For example, the leaf symbol "A" might be stored in F2 as the binary value 01000001, or hex 41, decimal 65. Referring now to FIG. 2, the dictionary to be randomised ("original dictionary") ("OD") is loaded into processing memory. A random number array (RNA) is initialised 201 and populated with unique random numbers. These numbers range in value from 1 to the maximum address number MAN in OD. For example, if the maximum address number in OD is 785,902, then RNA is created with length 785,902 and each index position of the array is populated with a random number, without duplication.

The connection counter for OD, P, is initialised 210 and set to 1, so P points to the first connection in OD. The connection pointed to by P is identified 215. This connection or a copy of it is randomised 220 then FIG. 3, then the randomised connection record is inserted into the new dictionary at the address stored in RNA at index position P, 225.

A test is made on the value of P. If this value equals the maximum address number MAN, the randomisation process terminates 235 Y. If this test on P returns False, P is incremented by one 240 and the sub-process of randomising the next connection in OD executes. Creating a randomised connection

Referring now to FIG. 3, the process of randomising a connection is illustrated, which comprises replacing each address stored in the connection record with an address that has been randomly generated.

A connection record contains either two addresses or four addresses. The first type of connection record 110 is a non-leaf connection and contains four addresses, one in each field of the record F1 - F4.

The second type of connection record, the leaf connection, contains addresses in only the third and fourth fields 120, F3 and F4. Fl and F2 of this second type of connection define the leaf and do not point to the location of a connection record within the dictionary. In W098/39723 this second type of connection is referred to as an interface connection connection.

In respect of the current connection record OC from the original dictionary OD, where the value of Fl = NULL 301 Y, Fl and F2 define and identify the leaf, and Fl and F2 of OC are copied to randomised connection Fl and F2 without change, then the field number FN is set to 3, 310. Where the value of Fl of OC is not null 301 N then the field number FN is set to 1, 315.

Randomising each address stored in a connection

For fields Fl, F2, F3 and F4 of non-leaf connections 110, and for fields F3 and F4 of leaf connections 120 the field value variable FV is loaded with the value of the current field (which will be an integer address) 320.

Field value variable FV now contains the address value stored in the current field of OC. The random number array RNA is looked up using the index = FV 325. For example, if the address in F3 of OC is 573,245, then 573,245 is the index used into RNA, that is, the contents of the array element at position 573,245 in the array is accessed. This content is a randomly generated integer with a value greater than zero and less than or equal to the maximum address number MAN in OD. The random number found at index FV in RNA is copied into the current field, i.e., field number FN, of the randomised connection being newly created 330. In the example, the random number stored at index 573,245 in RNA is copied into the current field of the randomised connection.

If integer field number FN is not less than 4 then the randomised connection has been fully created, and processing returns to FIG. 2, 335 N, the current connection pointer P is incremented by 1 240 and the next connection in OD now becomes the current connection 215.

If FN is less than 4 then the randomised connection has not yet been fully created 335 Y, and FN is incremented by one 340, and processing moves to the next field in the current connection OC 320