

Title:
FRACTAL DATA COMPRESSION
Document Type and Number:
WIPO Patent Application WO/1995/020296
Kind Code:
A1
Abstract:
An image compression system for compressing an image frame (1) made up of an array of pixels. The system subdivides the image frame (1) into a number of contiguous small tiles (2) and a large number of overlapping large patches (3). For each of the tiles (2) and patches (3) attribute characteristics are generated which are representative of the pixel data in that tile or patch. By comparing attribute characteristics, a best matching large patch is identified for each tile. For each matching pair an affine transform is determined which maps the tile to the patch. The compressed image is stored as an array of identifiers mapping a tile location to the matching patch and a corresponding array of affine transforms.

Inventors:
MCGREGOR DOUGLAS ROBERT (GB)
COCKSHOTT WILLIAM PAUL (GB)
FRYER RICHARD JOHN (GB)
LAMBERT ROBERT BARTHOLEMEW (GB)
Application Number:
PCT/GB1995/000093
Publication Date:
July 27, 1995
Filing Date:
January 19, 1995
Assignee:
UNIV STRATHCLYDE (GB)
MCGREGOR DOUGLAS ROBERT (GB)
COCKSHOTT WILLIAM PAUL (GB)
FRYER RICHARD JOHN (GB)
LAMBERT ROBERT BARTHOLEMEW (GB)
International Classes:
H04N7/30; G06T9/00; H03M7/30; H04N1/41; H04N7/26; (IPC1-7): H04N7/28; G06T9/00; H04N7/26
Foreign References:
EP0633701A2 (1995-01-11)
Other References:
A. JACQUIN: "Fractal image coding: a review", PROCEEDINGS OF THE IEEE, vol. 81, no. 10, pages 1451-1465
L. WALL AND W. KINSNER: "A fractal block coding technique employing frequency sensitive competitive learning", PROCEEDINGS OF IEEE WESCANEX 93, SASKATOON, CANADA, pages 320-329, XP000419836
K. BARTHEL AND T. VOYE: "Adaptive fractal image coding in the frequency domain", JOURNAL ON COMMUNICATIONS, IMAGE PROCESSING I, vol. 45, pages 33-38
Claims:
CLAIMS
1. A method of compressing an array of data into a form from which the original data array can be substantially reconstructed, the method comprising generating a first plurality of data blocks by a first division of the data array, generating a second plurality of data blocks by a second division of the same or a further data array, each of the second plurality of data blocks being the same size as or larger than those of the first plurality, evaluating for each of the first and second plurality of data blocks at least one attribute characteristic which is a function of the data contained in that block, determining which of the second plurality of data blocks best matches each of the first plurality of data blocks from a comparison of respective attribute characteristics, and obtaining for each of the first plurality of data blocks an affine transform which maps the data contained in the matching one of the second plurality of data blocks onto that one of the first plurality, wherein said at least one attribute characteristic is selected to be substantially invariant under the affine transform.
2. A method of compressing an array of data into a form from which the original data array can be substantially reconstructed, the method comprising: generating a first plurality of data blocks by a first division of the data array; evaluating for each of the blocks of the first plurality at least one attribute characteristic which is a function of the data contained in that block; storing a plurality of reference blocks different from the first plurality of data blocks, or storing means for generating a plurality of such reference blocks, together with associated attribute characteristics; and determining which one of the plurality of reference blocks best matches each of the first plurality of data blocks from a comparison of respective attribute characteristics, wherein the compressed data array comprises for each data block of the first plurality at least an identifier identifying the matching one of the plurality of reference blocks.
3. A method of decompressing an array of data compressed using the method of claim 2, the method comprising receiving the compressed data array, using said identifiers to select those ones of the reference blocks which match the data blocks of the first plurality, and reconstructing the original data array using the selected ones of the reference blocks.
4. Apparatus for compressing an array of data into a form from which the original data array can be substantially reconstructed, the apparatus comprising: input means for receiving the entries of the data array as a data stream; calculation means for calculating for each of a first plurality of blocks of the data array, from the entries within the blocks, at least one attribute characteristic; memory means for storing a second plurality of reference blocks different from the first plurality, or for storing means for generating a plurality of such reference blocks, together with associated attribute characteristics; comparator means for comparing the attribute characteristic(s) of the first plurality of blocks against those of the second plurality of blocks and for identifying therefrom which ones of the second plurality of blocks best match each of the first plurality of blocks; and output means for outputting a data stream comprising a plurality of identifiers where each identifier maps a corresponding one of the first plurality of data blocks onto the best matching one of the second plurality of data blocks.
5. Apparatus for decompressing an array of data compressed using the apparatus of claim 4, the apparatus comprising an input for receiving the compressed data array, comparison means for selecting those ones of the reference blocks which match the data blocks of the first plurality, and array generation means for assembling the original data array from the selected ones of the reference blocks.
6. A method or apparatus according to any one of the preceding claims, wherein the attribute characteristics are obtained from a Fourier transform of each data block.
7. A method or apparatus according to any one of claims 1 to 5, wherein the attribute characteristics are obtained from a weighted sum or sums of the data values in each data block.
9. A method or apparatus according to any one of the preceding claims, wherein the data array to be compressed represents an image.
10. A method of compressing an array of data into a form from which the original data array can be substantially reconstructed, the method comprising generating a first plurality of data blocks by a first division of the data array, generating a second plurality of data blocks by a second division of the same or a further data array, storing the second plurality of data blocks using a self-organising memory, for example a neural network, to create a code book the entries of which match the second plurality of blocks to corresponding outputs of the self-organising memory, determining which of the second plurality of data blocks best matches each of the first plurality of data blocks using the self-organising memory, and storing as the compressed image a plurality of said code book entries.
Description:
FRACTAL DATA COMPRESSION

The present invention relates to data compression and in particular, though not necessarily, to a system for compressing digitised two dimensional image frames. Where it is necessary to store, transmit or process large quantities of data it is often helpful, or even necessary, to compress the data in order to reduce the storage space required and/or the processing time. It is generally important that it be possible to reconstruct the original uncompressed data from the compressed data as accurately as possible.

A number of techniques for compressing data are known. JPEG is an ANSI standard for the compression and storage of single images, produced by the Joint Photographic Expert Group (see "Digital Compression and Coding of Continuous-tone Still Images, Part 1: Requirements and Guidelines", ISO/IEC CD 10918-1, alternate number SC2 N2215). It adopts a single fixed patch size best suited to moderate degrees of compression at reasonably high quality. As a result, reconstructed JPEG images begin to show an effect (an artifact of the sample patch boundaries generally described as "cobblestones") when the average compression ratio exceeds approximately 25:1. This is a result of the fixed-size partitions adopted, each of which is transformed into 64 Discrete Cosine Transform (DCT) coefficients.

At higher compression ratios increasing numbers of these partitions are reduced to a single term which, despite interpolation techniques, leaves the edges of the square delineated by a slope discontinuity. It is a symmetric technique, requiring equal amounts of computation for compression and decompression, and is readily implemented by both software and hardware methods.

MPEG is a video compression standard, similar to JPEG, but obtaining further compression by combining frames and deriving approximations for succeeding frames from their predecessors (see G.K. Wallace, "The JPEG Still Picture Compression Standard", Comm. of the ACM, Vol. 34, No. 4, 1991). Blocks are matched individually to their best-matched position in successor frames and the differences are coded in a compressed form. As it is primarily intended for video data, considerations of hardware implementation have been influential in formulating its design.

A fractal compression system, for compressing digitised image frames, is known from U.S. Patent No. 5,065,447, which system relies upon the modelling of each small region of an image frame, comprising a two dimensional data array of numbered pixels, as a selected larger region of the same image frame which has undergone some transform. Application of those transforms, known as affine transforms, to corresponding large areas of some arbitrary image frame, whether or not corresponding in size to the original image frame, followed by successive iterative applications to the resulting image frame, ultimately results in a reconstruction which closely approximates the original data.

Thus, the data contained in the entire image frame can be stored by storing only the affine transforms and the transform addresses. The system of U.S. Patent No. 5,065,447 sub-divides the image frame to be compressed into an array of small patches (tiles). For each of these tiles a larger patch of the frame is chosen which, when reduced in size to correspond to the size of the tile, closely matches that tile. The matching large patch and any transformation which it must undergo to improve the match are chosen by comparing, on a pixel by pixel basis, each of the small squares successively against each of the possible large squares which can be created within the image frame. For a typical image frame of size 512 x 512 pixels, tiles of size 8 x 8 pixels, large patches of size 16 x 16 pixels, and given that each of the large patches must be tested in 8 rotational orientations to determine an appropriate transformation, over 1 x 10^11 individual operations must be carried out in the system of U.S. Patent No. 5,065,447 to compress a single image frame. Given this complexity, the compression of an individual image frame may take over 30 minutes on conventional equipment.

According to a first aspect of the present invention there is provided a system for compressing an array of data into a form from which the original data array can be substantially reconstructed, the system comprising generating a first plurality of data blocks by a first sub-division of the data array, generating a second plurality of data blocks by a second sub-division of the same or a further data array, each of the second plurality of data blocks being the same size as or larger than those of the first plurality, evaluating for each of the first and second plurality of data blocks at least one attribute characteristic which is a function of the data contained in that block, determining which of the second plurality of data blocks best matches each of the first plurality of data blocks from a comparison of respective attribute characteristics, and obtaining for each of the first plurality of data blocks an affine transform which maps the data contained in the matching one of the second plurality of data blocks onto that one of the first plurality, wherein said at least one attribute characteristic is selected to be substantially invariant under the affine transform.

Preferably every data entry in the array to be compressed is contained in one of the first plurality of data blocks.

The above first aspect of the present invention is applicable to one, two, three and higher dimensional arrays of data. In particular, when applied in two dimensions, in comparison with U.S. 5,065,447, the number of individual comparisons required to compress an individual image frame may be greatly reduced by, for example, a factor of at least 3 or 4, this reduction arising from the fact that each large patch needs to be considered only once to obtain the corresponding attribute characteristics.

In one embodiment of the first aspect of the present invention a plurality of attribute characteristics each of which is substantially invariant under the affine transform are evaluated for each of the first and second plurality of data blocks. Suitable attribute characteristics may be, for example, Fourier coefficients obtainable by taking a digital Fourier transform of the data blocks. Only the most significant Fourier coefficients are generally used.
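By way of illustration only, the following Python sketch shows one plausible way of deriving such substantially invariant attribute characteristics from a digital Fourier transform; the block size and the choice of keeping the largest few non-DC magnitudes are assumptions, not the claimed method:

    import numpy as np

    def fourier_attributes(block, n_coeffs=4):
        # 2-D digital Fourier transform of the pixel block
        spectrum = np.fft.fft2(block.astype(float))
        # coefficient magnitudes are insensitive to the translation part of
        # an affine map; sorting them makes the resulting set insensitive to
        # rotation and reflection of the block as well
        mags = np.sort(np.abs(spectrum).ravel())[::-1]
        # drop the largest (typically the DC, i.e. overall level) term and
        # keep the next most significant few as the attribute characteristics
        return mags[1:1 + n_coeffs]

    tile_attrs = fourier_attributes(np.random.rand(8, 8))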

In a second embodiment of the first aspect of the invention, the best matching one of the second plurality of data blocks is obtained by first arranging the evaluated attribute characteristics of those blocks in the form of an associative table, preferably a binary tree and more preferably a K-D tree, in which case the compression time may be reduced by a further factor on the order of 50. Thus, a reduction in the computational complexity involved in the search may be obtained. In order to reduce the risk of error in selecting a best match from a K-D tree associative table, a plurality of best matches may initially be selected from the tree. The final best match may then be selected from the plurality by comparing the tile under consideration against each of the plurality on a pixel-by-pixel basis, for example by using some measure of dissimilarity such as the Hausdorff distance or the mean squared difference.

In another embodiment of the first aspect of the invention, the affine transform corresponding to each of the first plurality of data blocks comprises an offset or address parameter and at least one of a level parameter, a contrast parameter, and a rotation parameter. Preferably the transform comprises all four parameters. The derivation of the rotation parameter may involve a comparison of the position of the centroid of matching data blocks. If a K-D tree is used to reduce the search complexity, the centroid comparison may be carried out after selecting a best match from the tree or it may be integrated into the K-D tree search. This may reduce the compression time further by around a factor of four.

In an embodiment of the above first aspect of the present invention the attribute characteristic(s) for each said block is (are) calculated from a weighted sum of pixel intensities in the block and, more preferably, from the separate sums of the pixel intensities in each of the four quadrants of a block or in some other subdivision of the block, e.g. into 8 or 16 regions.

Preferably, in order to evaluate the attribute characteristic(s) for a block, the degree of rotation and/or reflection required to orient the block in a standard orientation, for example with the quadrant providing the largest pixel intensity sum being arranged at the top left of the block, is determined. Preferably also, the parameters required to normalise the intensity of pixels within a block, or in order to normalise the quadrant sums, with respect to level and/or contrast are determined. When a block of the first plurality of data blocks has been matched to a block of the second plurality the corresponding affine transform may be determined by combining the transformations applied to the two matching blocks in order to arrange them in said standard orientation and by combining the normalising transforms required to normalise the level and/or contrast of each block.

It is noted that the approach described above is based upon two principles:
1) the location within the same image of range blocks which approximate the content of domain blocks, and the identification of these within the transform file to allow the reconstruction (as an approximation) of the original image; and
2) the use of iteration as a means of reconstruction of the (approximated) image content.

According to a second aspect of the present invention there is provided a method of compressing an array of data into a form from which the original data array can be substantially reconstructed, the method comprising: generating a first plurality of data blocks by a first division of the data array; evaluating for each of the blocks of the first plurality at least one attribute characteristic which is a function of the data contained in that block; storing a plurality of reference blocks different from the first plurality of data blocks, or storing means for generating a plurality of such reference blocks, together with associated attribute characteristics; and determining which one of the plurality of reference blocks best matches each of the first plurality of data blocks from a comparison of respective attribute characteristics, wherein the compressed data array comprises for each data block of the first plurality at least an identifier identifying the matching one of the plurality of reference blocks.

Preferably, every data entry in the array to be compressed is contained in one of the first plurality of data blocks. In a preferred embodiment of the above second aspect of the invention, the system is used for compressing a 2D image, either in isolation or as part of a film sequence. The data entries of the data blocks of the first and second plurality correspond to individual image pixels and at least said first plurality of data blocks are contiguous with one another. The reference blocks are generated from an appropriate reference image which may, for example, be a photograph or which may be generated using a fractal synthesis technique.

Using the preferred embodiment it is possible to prevent unauthorised receivers from decompressing a compressed image by withholding access to the reference image or to the means for generating the reference image. Alternatively, unauthorised decompression may be prevented by encrypting the compressed image using an encryption key, the key being provided only to authorised users.

According to a third aspect of the present invention there is provided a method of decompressing an array of data compressed using the above second aspect, the method comprising receiving the compressed data array, using said identifiers to select those ones of the reference blocks which match the data blocks of the first plurality, and reconstructing the original data array using the selected ones of the reference blocks. In a preferred embodiment of the above third aspect, where the data array is a 2D image, the image is displayed using two levels of display indirection, where the first level may store a combination of brightness and contrast shift information in each word. The display may use a modified conventional textual or graphical information display.

Embodiments of the second and third aspects of the present invention give rise to at least two significant advances in the field of data compression: 1) the image patches which are used to recreate an approximation of the original uncompressed image are not drawn from the uncompressed image; and 2) recursion need not be used in the image reconstruction scheme. In a preferred embodiment image patches approximating sub-areas of the image to be compressed are selected in an appropriate fashion from a supplied reference image having a range of structure and texture suiting it for use in approximating a range of image types. A preferred reference image can be produced by a fractal synthesis process. For example the reference image could be an appropriately selected visualisation of part of the Mandelbrot set, or alternatively the topography map of a fractal landscape (Mandelbrot, B., 'The Fractal Geometry of Nature', W.H. Freeman and Co., San Francisco, 1982; Miller, G., 'The Definition and Rendering of Terrain Maps', Computer Graphics, 20(4), August 1986). The system also works if the reference image is an ordinary digitised photograph although there may be additional advantages to using a synthesised fractal image.

The compressed data set therefore consists of the means of relating specified sub-areas of the compressed image to specified sub-areas of the reference image together with brightness and/or contrast adjustments to improve the approximation achieved. In essence reconstruction therefore consists of copying from the reference image held at the site of decompression into a suitable medium for holding the decompressed image, making in the process such adjustments to brightness and/or contrast as are specified in the compressed data set. It will be recognised that if the reference image is held only at the compression and decompression sites (which need not be different) then the compression scheme represents a form of data encryption also, since decompression without access to the reference image is impossible. If the reference image chosen is fractally synthesised, the compressor and decompressor can construct it using a small number of numeric parameters which could be transmitted along with the compressed image. For instance, in the case of the Mandelbrot set the necessary parameters would be the real and imaginary co-ordinates of the origin of the image in the complex plane and the absolute length of its diagonal in the complex plane. For a fractal landscape, the image could be specified by sending the seed to a pseudo random number generator used to drive the image construction process.
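As an illustration of the former case, the following Python sketch (an assumed escape-time rendering; the parameter names are hypothetical) reconstructs a Mandelbrot-set reference image from just the transmitted origin co-ordinates and diagonal length:

    import numpy as np

    def mandelbrot_reference(re0, im0, diagonal, size=512, max_iter=64):
        # re0, im0: co-ordinates of the image origin in the complex plane;
        # diagonal: absolute length of the image diagonal (square image assumed)
        side = diagonal / 2 ** 0.5
        step = side / size
        re = re0 + step * np.arange(size)
        im = im0 + step * np.arange(size)
        c = re[None, :] + 1j * im[:, None]
        z = np.zeros_like(c)
        count = np.zeros(c.shape, dtype=np.uint16)
        for _ in range(max_iter):
            mask = np.abs(z) <= 2.0      # points that have not yet escaped
            z[mask] = z[mask] ** 2 + c[mask]
            count[mask] += 1
        # scale the escape counts to a 0..255 grayscale reference image
        return (count * (255.0 / max_iter)).astype(np.uint8)

Both ends construct an identical reference image from the same handful of numeric parameters, so only those parameters need be transmitted.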

Means for rapidly and efficiently determining optimum block matches without exhaustive searching are described hereinbelow, and may be utilised in implementing the present invention. These means utilise data structures such as the K-D tree during search and operate by the rapid computation of patch attributes, i.e. of summary descriptions of patches that require only simple operations for their generation, and the use of these attributes to index into potentially equivalent sub-area/patch pairs without the need to consider in detail non-equivalent or poor quality matches.

The image compression scheme of the above second and third aspects of the present invention may be extended to the case of an image sequence. By noting that in an image sequence there is in general a degree of equivalence (often considerable) between one image and the next image in the sequence, a further increase in compression may be obtained. In such cases it is necessary, as is well known, only to record those areas where the image has changed significantly and to encode the data necessary to reconstruct such changed areas. Since the present invention intrinsically analyses and compresses the images on a local basis, such a checking and conditional compression/decompression procedure may readily be included.

The standard fractal decompression process disclosed in US 5,065,447 decompresses by copying patches of image data from a source to a destination buffer (adjusting brightness and contrast in the process), replacing the source with the destination buffer and then iterating until the picture quality is good enough. Embodiments of the above second and third aspects of the present invention may differ from this in that:

1) there need only be one buffer, a source buffer, and instead of data being copied from there into a destination buffer the contents of this single buffer can, after suitable transformation, be directly transmitted to the video display hardware. There is no requirement to maintain a representation of the uncompressed image in a destination buffer as no iterative process is carried out;

2) as the process is not iterative, the image data stream for a single frame is decompressed in a single pass producing a significant saving in decompression time; and

3) the source buffer contains a reference image which need not be in any way derived from the image being transmitted.
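A Python sketch may help to fix ideas; it illustrates the single-pass scheme of points 1) to 3) under some simplifying assumptions (the per-tile record format is hypothetical, and the rotation/reflection part of the transform is omitted for brevity):

    import numpy as np

    def decompress_frame(records, reference, tile=8):
        # records: one entry per tile, assumed to hold the tile position, the
        # matching reference-patch offset and the contrast/level parameters
        side = int(len(records) ** 0.5) * tile   # square frame assumed
        out = np.empty((side, side))
        for (ty, tx, py, px, contrast, level) in records:
            patch = reference[py:py + 2 * tile, px:px + 2 * tile]
            # shrink the large patch to tile size by 2 x 2 pixel averaging
            small = patch.reshape(tile, 2, tile, 2).mean(axis=(1, 3))
            out[ty:ty + tile, tx:tx + tile] = contrast * small + level
        return out   # one pass, one source buffer: no iteration is required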

According to a fourth aspect of the present invention there is provided apparatus for compressing an array of data into a form from which the original data array can be substantially reconstructed, the apparatus comprising: input means for receiving the entries of the data array as a data stream; calculation means for calculating for each of a first plurality of blocks of the data array, from the entries within the blocks, at least one attribute characteristic; memory means for storing a second plurality of reference blocks different from the first plurality, or for storing means for generating a plurality of such reference blocks, together with associated attribute characteristics; comparator means for comparing the attribute characteristic(s) of the first plurality of blocks against those of the second plurality of blocks and for identifying therefrom which ones of the second plurality of blocks best match each of the first plurality of blocks; and output means for outputting a data stream comprising a plurality of identifiers where each identifier maps a corresponding one of the first plurality of data blocks onto the best matching one of the second plurality of data blocks.

In an embodiment of the above fourth aspect of the present invention the data stream is a raster scan of a 2-dimensional image frame where each entry of the array corresponds to the intensity of a single pixel of the image frame. Preferably, said at least one attribute characteristic for each of the blocks is obtained by summing/subtracting the intensities of the pixels within the block under the application of a mask. In order to achieve this, the calculation means may comprise an adder/subtractor unit coupled for input and output to a "circular" buffer, where the adder/subtractor unit includes an accumulator which in use holds a running sum of the pixel intensities of the block across which the scan is currently travelling. When the data stream crosses a block boundary in the scan direction the accumulator is arranged to write its presently held sum to the circular buffer and to recover the running sum for the next block. When the data stream crosses a patch boundary in the direction perpendicular to the scan direction, the accumulator is arranged to transfer its contents to an attribute characteristics store and to commence calculation for the next set of blocks. If a plurality of attribute characteristics are to be calculated for each block, the calculation means may comprise a corresponding plurality of adder/subtractor units, together with respective circular buffers, arranged to receive scan data in parallel.

According to a fifth aspect of the present invention there is provided apparatus for decompressing an array of data compressed using the apparatus of the above fourth aspect, the apparatus comprising an input for receiving the compressed data array, comparison means for selecting those ones of the reference blocks which match the data blocks of the first plurality, and array generation means for assembling the original data array from the selected ones of the reference blocks.

According to a sixth aspect of the present invention there is provided a method of compressing an array of data into a form from which the original data array can be substantially reconstructed, the method comprising generating a first plurality of data blocks by a first division of the data array, generating a second plurality of data blocks by a second division of the same or a further data array, storing the second plurality of data blocks using a self-organising memory, for example a neural network, to create a code book the entries of which match the second plurality of blocks to corresponding outputs of the self-organising memory, determining which of the second plurality of data blocks best matches each of the first plurality of data blocks using the self-organising memory, and storing as the compressed image a plurality of said code book entries.

For a better understanding of the invention and to show how the same may be carried into effect, reference will now be made, by way of example, to the accompanying drawings in which:

Fig. 1 shows the creation of a plurality of tiles and large patches from a two dimensional image frame to be compressed;

Fig. 2 shows, by way of illustration, a K-D tree type associative table;

Fig. 3 shows a table for determining a rotation parameter of an affine transform;

Fig. 4 shows schematically an embodiment of the invention having a store for containing attribute characteristics for each of the large patches;

Fig. 5 shows a first weighting function for application to a patch;

Fig. 6 shows a second weighting function for application to a patch;

Figure 7 illustrates a number of masks which may be applied to a data block in order to calculate attribute characteristics for that block;

Figure 8 illustrates how quadrant sums for a data block can be evaluated from the quadrant sums of the preceding data block in the same row by evaluating only half column sums;

Figure 9 shows a system for compressing frames of a video signal;

Figure 10 shows a unit of the system of Figure 9 for calculating geometric and global attributes;

Figure 11 shows in detail an attribute extraction unit of the system of Figure 9; and

Figure 12 shows a system for decompressing frames of a video signal compressed using the system of Figure 9.

Whilst the present invention is applicable to systems for compressing digitised data in general, for the purpose of illustration, the following description is limited to a system for compressing digitised two dimensional image frames.

As has been briefly mentioned above, it is possible to reconstruct each small region or data block of a two dimensional image frame by iteratively applying a predetermined affine transform to a larger region or data block of an arbitrary digitised image frame. (For a general discussion of affine transforms see the article by Louisa F. Anson, BYTE, October 1993, pages 195 to 202.) In order to determine the appropriate transforms, the image frame to be compressed is sub-divided by a first sub-division into a number of small regions (termed tiles) for each of which a separate transform must be obtained.

Fig. 1 illustrates a digitised two-dimensional image frame 1 which, for the purpose of the following example, comprises a 512 x 512 array of pixels, where each of the pixels is assigned a grayscale value of, say, between 0 and 255, 0 representing a black pixel and 255 a white pixel. The image frame has been sub-divided into an 8 x 8 array of non-overlapping, contiguous tiles 2, each of which contains 64 x 64 pixels. Whilst in the present example the tiles (and the large patches described below) are square, it is noted that any tileable polygonal shape can be adopted.

A set of equally sized large data patches 3, each comprising a 128 x 128 array of contiguous pixels, which patches are greater in size than the tiles, are then selected by a second sub-division of the image frame 1. This set may comprise a subset of all possible arrays 3 but preferably comprises all of the arrays 3, in which case, with a 512 x 512 pixel image, (512 - 128 + 1) x (512 - 128 + 1) = 148,225 large patches are extracted from the image frame, i.e. approximately 150,000.

The selection process may be visualised by considering a 128 x 128 pixel window positioned over the 128 x 128 pixel array located at the top left hand corner of the image 1. This first pixel array provides the first large data patch. The window is then moved to the right by one pixel and the underlying 128 x 128 pixel array provides the second large patch. The window is moved across the top row of the image frame, one pixel at a time, to extract further large data patches until the right hand boundary of the window meets the right hand boundary of the image. The window is then moved back to the top left hand corner and is displaced downwards by one pixel to the next row. The underlying 128 x 128 pixel array provides the next large data patch. Displacement of the window along each row and from row-to-row in a similar manner is repeated until the window is positioned over the 128 x 128 pixel array located at the bottom right hand corner of the image frame. If it is desired to select only a subset of all the possible large patches, the selection may be made by selecting only every mth patch of the array, by using some predetermined algorithm, or may be carried out at random.
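The window positions can be enumerated as in the following Python sketch, which also confirms the patch count quoted above:

    def patch_positions(frame_size=512, patch_size=128):
        # every window position, moving one pixel at a time, left to right
        # and then row by row, as described above
        last = frame_size - patch_size
        for y in range(last + 1):
            for x in range(last + 1):
                yield x, y

    # (512 - 128 + 1) ** 2 = 148225, i.e. approximately 150,000 patches
    assert sum(1 for _ in patch_positions()) == 148225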

The next step in the process is to consider each of the tiles 2 in turn and to determine which of the approximately 150,000 large patches 3 provides an initial best match, i.e. which is visually the most similar to the tile 2 under consideration. Whilst visual similarity may appear to be a somewhat subjective characteristic, there are in fact a number of parameters (or gross characteristics) which can be selected and evaluated to provide a measure of this similarity.

In selecting appropriate parameters, however, it is important that they be invariant under the affine transforms which are used to map large patches onto corresponding ones of the tiles. For example, if the affine transform which is applied to a patch results in the location of the centroid of that patch changing, then the location of the centroid cannot be used as one of the 'match' parameters. In the present example, these parameters, which are hereinafter referred to as attribute characteristics, are Fourier coefficients obtained by transforming the intensity data of each large patch into the spatial frequency domain using a two dimensional digital Fourier transform. However, a suitable alternative to Fourier coefficients would be Discrete Cosine Transform coefficients. Alternatively, one might use the result of summing the pixels in the large patch under one or more different masks, or the principal moment of the image in the large patch, as the attribute characteristics. Other characteristics which could be used result from identifying the large patches in terms of the number of intensity maxima or minima (turning points) or average level crossing points in the x and y directions of each patch.

The process by which Fourier transforms are obtained is well known (see Buerger, M.J., 'Crystal Structure Analysis', Chapters 13 to 18, John Wiley and Sons, 1960, Lib. of Congress Card No. 60-10310). A digital Fourier transform produces a series of discrete Fourier components, only a few of which are generally required to characterise the data in the spatial frequency domain of the patches. Typically, three or four coefficients are sufficient, these coefficients providing suitable attribute characteristics for use in determining the initial best match for each tile.

Fourier coefficients are determined for each of the large patches and also for each of the tiles. In order to find the initial best match for each of the tiles, the Fourier coefficients for each tile are compared in turn against the coefficients determined for the large patches. This matching process can be carried out by comparing the coefficients for each tile against the coefficients of each and every large patch, using a measure such as Euclidean distance. However, it is significantly more computationally efficient to arrange the coefficients of the large patches in the form of a binary tree type associative table, and in particular a K-D tree, prior to carrying out the comparison.

A description of K-D trees is given in Friedman, J.H., Bentley, J.L., and Finkel, R.A., "An Algorithm for Finding Best Matches in Logarithmic Expected Time", ACM Transactions on Mathematical Software, Vol. 3, No. 3, 1977. Briefly, a K-D tree is a means for storing a number of data records, each record being characterised by the same number of attribute characteristics. The tree has a threshold value for a particular one of the attribute characteristics at each node of the tree. This threshold determines the partition of the search at that node, i.e. which of the two branches leading from that node should be pursued to determine the best match. At each node there are also upperbound and lowerbound limits which determine the nearest subsequent node of the tree in terms of the single attribute characteristic considered by the node. Consecutive nodes of the tree consider different attribute characteristics, in a cyclical manner.

The process of constructing a K-D tree can be illustrated by considering a number of records, I to P, each having two attribute characteristics A and B as follows:

Record    A    B
I         6    2
J         2    0.5
K         2    4
M         1    6
N         5    6
O         5    2
P         6    5

The first step in constructing the tree requires the records to be listed in ascending order in dependence upon the value of A. The average value of the two middle entries in the list is calculated (in this case 3.5). This value provides the threshold for the first node, or root node, of the tree as shown in Fig. 2. The distance of the A value from the threshold for the two middle values provides the upper and lower bound values for the first node. The records are divided into two sets, one containing records having an A value below the threshold and the other containing records having an A value above the threshold. Similarly, sub-trees are created for each of the subsets as shown in Fig. 2.
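A minimal Python sketch of this construction, using the records tabulated above, might read as follows (the node layout is illustrative rather than prescribed by the text):

    def build_kdtree(records, depth=0):
        # records: list of (attributes, label) pairs; the attribute used for
        # splitting cycles with the depth of the node
        if len(records) <= 1:
            return records[0] if records else None      # leaf node
        axis = depth % len(records[0][0])
        records = sorted(records, key=lambda r: r[0][axis])
        mid = len(records) // 2
        lo, hi = records[mid - 1][0][axis], records[mid][0][axis]
        return {
            "axis": axis,
            "threshold": (lo + hi) / 2.0,   # average of the two middle values
            "lower": lo, "upper": hi,       # bounds used when backtracking
            "left": build_kdtree(records[:mid], depth + 1),
            "right": build_kdtree(records[mid:], depth + 1),
        }

    table = [((6, 2), "I"), ((2, 0.5), "J"), ((2, 4), "K"), ((1, 6), "M"),
             ((5, 6), "N"), ((5, 2), "O"), ((6, 5), "P")]
    tree = build_kdtree(table)   # root threshold on A is (2 + 5) / 2 = 3.5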

Once a K-D tree has been created, in order to find a best match for some search record from the K-D tree, the threshold value at the root node is compared with the corresponding attribute characteristic of the search record. The comparison proceeds down one of the tree branches, repeating the comparison at subsequent sub-nodes for appropriate attribute characteristics until a final, leaf node, is reached. The leaf node is the initial candidate for a nearest neighbour match. For this initial candidate a multiple dimensional distance can then be calculated from the search record, using all of the attribute characteristics, as a measure of similarity between the initial candidate and the search record. The multiple dimensional distance D can be calculated as, for example, the squared Euclidean distance between the respective sets of attribute characteristics, i.e.

D = (x_1 - y_1)^2 + (x_2 - y_2)^2 + ... + (x_j - y_j)^2

where x and y are the attribute coefficients of the two records and j is the number of attribute coefficients.

In practice the distance function used is a weighted Euclidean distance over the chosen invariant attribute characteristics corresponding to the patches being compared. The weight applied to a particular attribute value should optimally correspond to the accuracy of the attribute itself (more attention being given to parameters which on average are found to be more accurate than to those which on average are found to be less accurate). The weight must also, however, reflect the discriminating power of the attribute (e.g. an attribute which was constant over all patches would be useless even though consistent and accurate).
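Such a weighted distance is straightforward to express; for example, in Python (the weight values below are purely illustrative):

    def weighted_distance(x, y, weights):
        # weighted squared Euclidean distance over attribute characteristics;
        # a larger weight gives an attribute more influence on the match
        return sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y))

    # trust the first attribute twice as much as the other two (illustrative)
    d = weighted_distance((6, 2, 1), (5, 2, 3), weights=(2.0, 1.0, 1.0))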

The values of the weights may be determined by a standard procedure during set-up as follows:
1. First process a set of images by the method of U.S. 5,065,447, obtaining the exhaustively searched correct best-match transform.
2. Calculate all invariant attribute characteristics for all small patches and their corresponding large patches.
3. Using these values to provide the information, optimally adjust the weights, either by a statistical procedure as indicated above, or by an empirical optimisation process.

The effect of this process is that the measured distance between each pair matched by the method of U.S. 5,065,447 is decreased while the distance between non-matched pairs is increased.

The search process then moves to the node preceding the leaf node which provided the initial nearest neighbour match and determines from the bound (upper or lower) on the opposite side of the threshold from the leaf node whether any of the nodes on that opposite side could possibly be the true nearest neighbour (the lowest possible squared Euclidean distance for any leaf node on that opposite side is the square of the bound). If so, a search is carried out down that opposite branch until a leaf node of that branch is reached. This process of travelling down and up the leaf nodes continues until a pre-selected number, K, of nearest neighbours has been obtained. The final best match is that one of the K nearest neighbours which provides the lowest root mean square distance.

When a best matching large patch has been determined for a tile using the K-D tree method, it is necessary to then determine a transform which, when applied to the chosen large patch, maps that large patch onto the small patch. The parameters of this transform include:

Offset
The offset parameter identifies the location of the large patch providing the best match (e.g. by an x and y coordinate transform).

Level
The level parameter is that constant which must be added to the intensity of each pixel of the chosen large patch to obtain the average pixel intensity of the small patch.

Contrast
The contrast parameter is the constant by which the intensity of each pixel of the chosen large patch must be multiplied in order to achieve the contrast, i.e. maximum intensity change, of the tile.

Reduction
The reduction parameter is that factor by which the number of pixels in the large patch must be reduced to give the number of pixels in the tile. If the large patch contains 16 x 16 pixels and the tile 8 x 8, then the reduction factor is 4.

Rotation
The rotation parameter represents that angle (for a square patch the angles normally used are 0°, 90°, 180° and 270°, although it is noted that in general any angle can be considered) by which the large patch must be rotated in order to produce the best match. The rotation parameter may also consider whether reflecting each of the rotated patches in a 'mirror' produces a better match. Whilst it is possible to determine the rotation parameter by calculating the Hausdorff distance for each small patch in respect of each of the rotated patches (and also for the non-rotated large patch if this has not already been determined) and by selecting that one of the patches having the lowest Hausdorff distance, it is computationally more efficient to calculate a suitable rotation dependent parameter, such as the identity of the quadrant containing the centroid, or the principal components, for both the tile and the large patches, and to perform a comparison to determine the best match. In the preferred method the quadrant location of the centroid is selected as the parameter for determining rotation and is determined using the normal formula, i.e.

x coordinate = (Σ x_i · I_i) / (Σ I_i)

y coordinate = (Σ y_i · I_i) / (Σ I_i)

where x_i and y_i represent the x and y co-ordinates of each of the m pixels with respect to the centre of a patch, I_i represents the intensity at pixel i, and each sum is taken over i = 1 to m.
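A Python sketch of this computation, mapping the centroid to one of the four quadrants for use in the look-up table described below (the quadrant numbering is an assumption):

    import numpy as np

    def centroid_quadrant(patch):
        h, w = patch.shape
        ys, xs = np.mgrid[0:h, 0:w].astype(float)
        # co-ordinates measured from the centre of the patch, as in the formula
        xs -= (w - 1) / 2.0
        ys -= (h - 1) / 2.0
        total = float(patch.sum())
        cx = float((xs * patch).sum()) / total   # x coordinate of the centroid
        cy = float((ys * patch).sum()) / total   # y coordinate of the centroid
        # quadrants numbered 0..3: top-left, top-right, bottom-left, bottom-right
        return int(cy >= 0) * 2 + int(cx >= 0)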

In order to increase the speed with which the rotation parameter can be obtained it is preferable to construct another data structure. This is a small table or array which is indexed along its x-axis with the centroid location of the tile and along the y-axis with the centroid location of the large patch. This gives to both the large patch and the tile a relatively small symmetry group, and the resulting table has a number of elements which is the square of the cardinality of the symmetry groups. This is sufficiently small to be cheaply held in memory and can be rapidly indexed. This means it is possible to directly index the appropriate mirror reflection or rotational transformations to be applied to map the large patch on to the tile.

As an example, Fig. 3 shows a table derived for the case of rotational and mirror plane symmetries, with the centroid function applied using a 2 x 2 partitioning of the patches, which gives rise to a 4 x 4 table. An analogous larger 4 x 4 partitioning gives a corresponding 16 x 16 table. Note that in the larger table the mirror symmetry elements and rotational elements are not equivalent as they are in the simple 4 x 4 case adopted here for reasons of simplicity and clarity of description. Higher order moments can also be used instead of, or in addition to, the centroid computation. It is noted that instead of determining the rotational transform subsequent to selecting a best match from the K-D tree as set out above, each of the large patches originally extracted from the image may be rotated prior to assembling the K-D tree and the centroid determined at that time. All of the resulting large patches, rotated and unrotated, may then be assembled into a K-D tree.

It is, however, always possible that the original image frame may contain a significant amount of noise, in which case obtaining squared Euclidean distances may not select the actual best match from the K nearest neighbours obtained from the K-D tree. Whilst the possibility of selecting a poor match may be reduced by increasing the number of attribute characteristics (in this case the number of Fourier coefficients), it is preferable to proceed by retaining at least some of the K nearest neighbours and to determine more precisely the final best match by using, for example, a supplementary test such as the Hausdorff distance based method set out in U.S. Patent No. 5,065,447. That test involves determining what is known as the Hausdorff distance between the search record and each of the selected nearest neighbours. The nearest neighbour giving rise to the smallest Hausdorff distance is chosen as the best match. Alternatively one may choose that large patch of the K nearest neighbours that gives rise to the lowest mean squared error under the transform when compared on a pixel by pixel basis with the small patch. It is important to note that the transforms for each of the K patches must be obtained and applied to the large patches prior to determining a final best match by the supplementary, pixel by pixel, test.

As the supplementary test is carried out on only relatively few records, e.g. one to twenty or, more preferably, three or four, the resulting increase in complexity and hence in processing time is not great.

If instead of only uniformly sub-dividing the original image frame into square tiles of equal size the frame is sub-divided using "quadtree" partitioning, it is possible to improve the matching process and thus improve the accuracy with which images can be reconstructed. Quadtree partitioning involves, in addition to sub-dividing the image frame into squares of equal size, further sub-dividing each of those squares into four equally sized squares. If a sufficiently good match cannot be found for a square of the image frame produced by the first sub-division, then those squares produced by the second sub-division, third sub-division etc. are considered. The method of quadtree partitioning is discussed in a paper by Yuval Fisher of the San Diego Super Computer Center, University of San Diego, U.S.A., entitled "Fractal Image Compression", SIGGRAPH '92 course notes.

Once a transform has been obtained for each of the tiles, all of the transforms are assembled into a table which provides the final compressed image frame. The original two-dimensional image frame can then be discarded.
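The quadtree subdivision lends itself to a recursive Python sketch (the match test and the minimum tile size below are placeholders):

    def quadtree_tiles(x, y, size, has_good_match, min_size=4):
        # yield (x, y, size) for every tile finally kept; a square is split
        # into four equal squares whenever no sufficiently good match exists
        if size <= min_size or has_good_match(x, y, size):
            yield (x, y, size)
            return
        half = size // 2
        for dy in (0, half):
            for dx in (0, half):
                yield from quadtree_tiles(x + dx, y + dy, half,
                                          has_good_match, min_size)

    # e.g. subdivide a 64 x 64 square until tiles reach 16 x 16
    tiles = list(quadtree_tiles(0, 0, 64, lambda x, y, s: s <= 16))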

The process for recovering the original image frame from the compressed data is identical to that set out in U.S. Patent No. 5,065,447. Briefly, the starting point for the reconstruction process is an arbitrary digitised image frame, for example with the intensity of all pixels set to a value midway between the maximum and minimum values, of the same dimensions as the original uncompressed image frame which is to be reconstructed. Each tile of a first level reconstruction, of the same size as the tiles into which the original image frame was sub-divided, is constructed by applying the transform associated with that tile to the large patch of the arbitrary image frame identified by the transform, i.e. the corresponding large patch is selected and the location, contrast, level and orientation transformed and the appropriate reduction made (reduction may be achieved by pixel averaging). The first level reconstruction for each tile of the frame is a very crude approximation to the original image. The reconstruction is refined by reapplying the set of transforms to the first level whole-frame reconstruction to obtain an improved, second level, whole-frame reconstruction. The set of transforms is iteratively applied until a reconstruction level is reached at which further iteration produces little or no improvement in image frame quality. That is to say that at each reconstruction level bar the first, new tiles are created by shrinking large patches which are themselves constructed from a plurality of tiles created at the previous reconstruction level. It is this step-wise increase in tile complexity which gives rise to the corresponding increase in reconstruction accuracy.
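For comparison with the single-pass scheme sketched earlier, the iterative reconstruction can be sketched in Python as follows (the transform record format is hypothetical and the rotation/reflection step is again omitted for brevity):

    import numpy as np

    def iterative_reconstruct(transforms, shape, tile=8, n_iter=10):
        # arbitrary starting frame: all pixels mid-way between 0 and 255
        frame = np.full(shape, 128.0)
        for _ in range(n_iter):
            out = np.empty_like(frame)
            for (ty, tx, py, px, contrast, level) in transforms:
                patch = frame[py:py + 2 * tile, px:px + 2 * tile]
                # reduction by pixel averaging, then level/contrast adjustment
                small = patch.reshape(tile, 2, tile, 2).mean(axis=(1, 3))
                out[ty:ty + tile, tx:tx + tile] = contrast * small + level
            frame = out   # each pass refines the previous reconstruction
        return frame

Unlike the single-pass scheme, the source of each patch here is the evolving reconstruction itself, which is why iteration is needed.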

By selecting the number of pixels in the initial arbitrary image to be either greater than or less than that in the original image frame, an enlargement or reduction of the original frame can be achieved.

Fig. 4 illustrates one possible system for carrying out the above described process of data compression, which may be implemented in software, hardware, or a combination of the two. A memory block 9 is provided for storing the two dimensional image frame to be compressed and the plurality of tiles and large patches into which it has been sub-divided. The input memory block is coupled to a stage A processor 4 which receives from the input memory block the data contained in the large patches. Data is transmitted serially in units of one patch. The stage A processor 4 determines for each large patch the required attribute characteristics and constructs a K-D tree accordingly. The K-D tree information is written to a memory block 5. When all the large patches have been processed and a K-D tree constructed, the data contained in each of the tiles is transmitted from the input memory block 9 to a stage B processor 6 which determines the attribute characteristics for each of the tiles in turn. As each set of characteristics is determined, the stage B processor determines the best matching large patch from the K-D tree stored in memory block 5. A stage C processor 7 receives in sequence from the memory block 5 the data contained in each of the best matching large patches. The stage C processor also receives from the stage B processor the data contained in the corresponding tiles, from which it is able to determine the appropriate transforms. The resulting transforms are written to an output memory block 8.

Using the above described system, images of good quality are produced on decompression. However, a visible artifact boundary can sometimes be seen at tile edges. This results from choosing the best matching large patch in a way which does not constrain its data values, and their gradients at the edges of the patches, to be continuous with those of adjacent patches.

This may be overcome, for example, by one of two basic methods:

1. Additionally weighting boundary (and near-boundary) data values when choosing matching patches.

2. Using overlapping patches to tile the space twice.

In method 1, the weight of the pixel values in comparisons (or indirectly in invariant attribute characteristics) is varied smoothly, being a maximum at the boundaries. This is shown in Fig. 5.

In method 2, the image frame is sub-divided twice using tiles which are twice the size of those in a corresponding single-layer system. The compression ratio is thus the same. The pixel data values closest to the centre of each patch are given the highest weight (1.0), those at the boundary (the centre of an adjacent patch) are given the value 0.0, while those mid-way between patches are given equal contributions (0.5). This weighting function is illustrated in Fig. 6. A non-linear function CosSq(i,j) is applied to the (i,j)th pixel. This gives a higher-than-linear weighting to near-centre pixels, and a lower-than-linear weighting to those beyond the half-way distance to an adjacent patch. The sum of the contribution weights is 1.0.

This technique is known and has been adopted as part of the Discrete Cosine Transform as used in the JPEG and MPEG Standards Definitions.
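One plausible form of such a weighting function, consistent with the values quoted above though not given explicitly in the text, is sketched below in Python:

    import numpy as np

    def cos_sq_weight(d):
        # d: distance from the patch centre, normalised so that the centre
        # of the adjacent patch lies at d = 1
        return np.cos(np.pi * d / 2.0) ** 2

    # 1.0 at the centre, 0.5 mid-way between patches, 0.0 at the adjacent
    # patch centre, and the two overlapping contributions always sum to 1:
    assert abs(cos_sq_weight(0.3) + cos_sq_weight(0.7) - 1.0) < 1e-12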

Whilst the above described embodiment is restricted to grayscale images it will be appreciated that the present invention is also applicable to compressing black and white image frames, i.e. where each pixel is assigned a value of 1 or 0, and colour image frames, i.e. where each pixel has red, green and blue components of variable intensity. A possible system for processing colour image frames is to treat each of the three colours as a separate image frame. Each separate image frame is treated as set out above and a final, composite, image frame is obtained by combining the three separate frames. An alternative system, which treats each of the three primary colours as a separate dimension in 3D space, utilises the principles set out in U.S. 5,065,447. Unlike the previously known methods, the much reduced computational complexity of data compression offered by the above described system makes it possible to apply it practically to 3D data, whether derived from spatial dimensions, e.g. MRI (magnetic resonance imaging) 3D scan data, or from video sources in which time is the third dimension. Very large compressions are potentially possible as a result.

Embodiments of the present invention (for data compression using affine transforms) can be applied to even higher dimension data sets, such as those derived from areas of the atmosphere or ocean, involving variation over time and involving additional attributes in additional dimensions.

It will also be apparent that the present invention is applicable, in addition, to 1-D data including audio or other data rendered as a serial stream.

The invention can also be applied, retaining the aspects of precomputation of characteristics of separately sampled data patches, the similarity-based associative matching unit and the configuration determining unit, to the recognition and matching of multidimensional data generally. It has particular relevance to:

(a) Fast Compression of Images using the MPEG method.

(b) Mapping of DNA fragments into a set of larger DNA components and similar applications.
(c) Mapping or discovering similar atomic environments in a set of molecular compounds.

An alternative method of computing attribute characteristics of patches will now be described. Reference is made to Figure 1 which shows a 512 by 512 pixel image subdivided into a plurality of small tiles. For each of the small tiles it is desired to identify a large patch, from all possible large patches contained within the image, by computing and comparing attribute characteristics for the tiles and for the large patches. The technique described hereinbelow is concerned with a new process by which the appropriate attribute characteristics can be computed with minimal effort.

With reference to Figure 7 of the accompanying drawings there are shown eleven masks, each of which may be used to calculate an attribute characteristic of a block of pixels underlying the mask. The attribute characteristic for each mask is computed by adding the intensities of pixels underlying a black area whilst subtracting the intensities of pixels underlying a white area. It will be readily apparent that the value of each attribute characteristic, corresponding to a given one of the masks, can be computed from the intensity sums of the four quadrants (s1, s2, s3 and s4) of the block underlying the masks. For example, the attribute characteristic for mask 1 corresponds to (s1 + s3) - (s2 + s4) whilst the attribute characteristic for mask 3 corresponds to (s3 + s2) - (s1 + s4). The effort required to calculate a required plurality of attribute characteristics can be reduced considerably using the masks of Figure 7 as compared to the use of the Fourier coefficients described above.

As described above, the large patches are extracted from the image frame by sliding a window across and down the frame one pixel at a time. Consider a data block displaced by some number of pixels from the left hand edge of the frame as shown in Figure 8. It will be apparent that in order to evaluate the four quadrant sums s1, s2, s3, s4 necessary to compute the plurality of attribute characteristics for that block, and providing that the quadrant sums for the preceding block in the same row have been retained, the only additional data required is (1) the sum of the pixel intensities in the column covered by each quadrant of the current block which was not covered by the corresponding quadrant of the preceding block, i.e. b, c, e and f, and similarly (2) the sum of the pixel intensities in the column covered by each quadrant of the preceding block which is not covered by the corresponding quadrant of the current block, i.e. a, b, d and e. The sums a to f are termed half column sums hereinafter.

The quadrant sums s1' to s4' of the current block can be calculated as follows from the quadrant sums s1 to s4 of the preceding block:

s1' = s1 - a + b
s2' = s2 - b + c
s3' = s3 - d + e
s4' = s4 - e + f
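By way of illustration, the following sketch (in Python; an illustrative rendering, not part of the original specification) computes the quadrant sums for every block position in a row using the update equations above, assuming the quadrants are laid out with s1, s2 across the top and s3, s4 across the bottom, each of side n.

def half_column_sums(image, x, y, n):
    # Top and bottom half sums of column x, for a block whose top left
    # corner lies on row y and whose quadrants have side n.
    top = sum(image[y + i][x] for i in range(n))
    bottom = sum(image[y + n + i][x] for i in range(n))
    return top, bottom

def quadrant_sums_row(image, y, n):
    # Yields (x, s1, s2, s3, s4) for each block position in the row at a
    # cost of O(n) per step rather than O(n * n).
    width = len(image[0])
    s1 = sum(image[y + i][x] for i in range(n) for x in range(n))
    s2 = sum(image[y + i][x] for i in range(n) for x in range(n, 2 * n))
    s3 = sum(image[y + n + i][x] for i in range(n) for x in range(n))
    s4 = sum(image[y + n + i][x] for i in range(n) for x in range(n, 2 * n))
    yield 0, s1, s2, s3, s4
    for x in range(1, width - 2 * n + 1):
        a, d = half_column_sums(image, x - 1, y, n)          # leaving s1, s3
        b, e = half_column_sums(image, x - 1 + n, y, n)      # moving from s2 to s1, s4 to s3
        c, f = half_column_sums(image, x - 1 + 2 * n, y, n)  # entering s2, s4
        s1, s2 = s1 - a + b, s2 - b + c
        s3, s4 = s3 - d + e, s4 - e + f
        yield x, s1, s2, s3, s4

In a full implementation the half column sums themselves would be cached as described below, so that only c and f are computed afresh for each new block.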

Given that the large patches are in fact relatively small, for example 16 by 16 pixels, it is possible to store all of the half column sums underlying a given location of the window used to extract the large patches, i.e. 32 individual sums, so that for each new block only c and f need be calculated anew. Thus, for each quadrant of side n, the computation of the sum can be reduced from n x n operations to n + 1 operations (the n corresponding to the number of new pixels overlapped by the quadrant and the 1 corresponding to the previously calculated and stored half column sum to be subtracted).

Prior to calculating the attribute characteristics corresponding to the masks of Figure 7 it is necessary first to determine the transform required to orient a patch under consideration into a standard orientation, such that the comparison between the attribute characteristics for a tile and a patch can be made substantially independent of orientation.

Consider a large data patch for which the four quadrant sums a, b, c, d have been calculated, the patch being arranged as follows:

a b
c d

An efficient orientation procedure is to compute the value 3a - 2d - c for each possible orientation; the orientation giving the maximum value is taken as the standard one. The criterion is derived as follows. The standard orientation requires:

a >= b, a >= c, a >= d, b >= d

which is equivalent to:

a - b >= 0, a - c >= 0, a - d >= 0, b - d >= 0.

Summing these inequalities gives:

(a - b) + (a - c) + (a - d) + (b - d) >= 0

or

3a - 2d - c >= 0.

In practice the maximum of 3a - 2d - c is found.

This arrangement is achieved by rotating the patch about its central axis, if necessary, so that the quadrant giving rise to the largest sum is located in the top left position. Then, again if necessary, the quadrant sums at the top right and bottom left are interchanged. It is noted that these allowed transformations do not disrupt the adjacency of pixels within the patch: this restriction can be visualised by considering the patch to be drawn on a sheet of paper with the allowed transformations involving reflection and rotation and not requiring that the paper be torn.

If the coordinates of the four quadrants are {(0,0), (1,0), (1,1), (0,1)}, the transformations can be represented by a matrix M representing rotation, reflection and scaling, and a vector V representing a translation, so that the transformed coordinate is given by x' = Mx + V.

The eight transformations are given in Table 1 below.

Table 1

Transform 1:    1  0  0
                0  1  0

Transform 2:    0  1  0
               -1  0  1

Transform 3:   -1  0  1
                0 -1  1

Transform 4:    0 -1  1
                1  0  0

Transform 5:    0  1  0
                1  0  0

Transform 6:   -1  0  1
                0  1  0

Transform 7:    0 -1  1
               -1  0  1

Transform 8:    1  0  0
                0 -1  1

(In each entry the first two columns give the rows of M and the third column the corresponding component of V.)

For example, consider the effect of the transforms on the patch:

 5 30
20 10

Transform 1:    5 30
               20 10

Transform 2:   30 10
                5 20

Transform 3:   10 20
               30  5

Transform 4:   20  5
               10 30

Transform 5:    5 20
               30 10

Transform 6:   30  5
               10 20

Transform 7:   10 30
               20  5

Transform 8:   20 10
                5 30
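The eight transforms can be checked mechanically. The following Python sketch (illustrative only; the coordinate convention, with (0,0) the top left quadrant, x increasing rightwards and y downwards, is an assumption) applies each (M, V) pair of Table 1 to the example patch and reproduces the results above.

TRANSFORMS = [                       # (M, V) pairs from Table 1
    (((1, 0), (0, 1)), (0, 0)),      # Transform 1 (identity)
    (((0, 1), (-1, 0)), (0, 1)),     # Transform 2
    (((-1, 0), (0, -1)), (1, 1)),    # Transform 3
    (((0, -1), (1, 0)), (1, 0)),     # Transform 4
    (((0, 1), (1, 0)), (0, 0)),      # Transform 5
    (((-1, 0), (0, 1)), (1, 0)),     # Transform 6
    (((0, -1), (-1, 0)), (1, 1)),    # Transform 7
    (((1, 0), (0, -1)), (0, 1)),     # Transform 8
]

patch = {(0, 0): 5, (1, 0): 30, (0, 1): 20, (1, 1): 10}

for number, (M, V) in enumerate(TRANSFORMS, start=1):
    out = {}
    for (x, y), value in patch.items():
        nx = M[0][0] * x + M[0][1] * y + V[0]    # x' = Mx + V
        ny = M[1][0] * x + M[1][1] * y + V[1]
        out[(nx, ny)] = value
    print(f"Transform {number}: {out[(0, 0)]:3} {out[(1, 0)]:3}")
    print(f"             {out[(0, 1)]:3} {out[(1, 1)]:3}")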

When the orientation transformation for the data patch has been determined the quadrant sums are normalised. This is achieved by determining the multiplication factor (contrast) and the addition factor (level) which, when applied to each of the quadrant sums, ensure that the smallest sum equals a constant, e.g. 0, and the largest sum equals some other constant, e.g. 100.

Once the orientation of a data patch has been determined and its level and contrast normalised, the masks of Figure 7 are applied in turn to the transformed patch and the corresponding attribute characteristics determined from the appropriate summation of the four quadrant sums. As described above, the attribute characteristics for each of the large data patches are assembled into a table, preferably in the form of a K-D tree although other structures such as a hash table may be appropriate, to facilitate rapid searching. In addition to storing the attribute characteristics, the orientation, level and contrast transforms are stored for each of the large patches. In order to find a best match for each of the tiles into which the image has been subdivided it is necessary to determine the attribute characteristics for each of the tiles in turn. This is done in an identical manner to that described above for the large data patches, with the required transform being determined and stored.

It will be appreciated that the contrast and level normalising step enables a comparison to be made between the attribute characteristics of the large data patches and the tiles despite the fact that the large data patches comprise a greater number of pixels.

The K-D tree containing the attribute characteristics for each of the large data patches is searched to find the best match for the tile under consideration. Once found, it is necessary to determine the affine transform which maps the tile onto the chosen large data patch. This can be done relatively simply by combining the normalising transforms obtained for both the tile and the large patch, which transforms are relative to a common origin. The following table can be used to determine a combined orientation transform, with the transform number for the tile providing the row subscript and the transform number of the matching large patch providing the column subscript.

                      large patch subscript no.

                  1   2   3   4   5   6   7   8

 tile         1   1   2   3   4   5   6   7   8
 subscript    2   4   1   2   3   6   7   8   5
 number       3   3   4   1   2   7   8   5   6
              4   2   3   4   1   8   5   6   7
              5   5   6   7   8   1   2   3   4
              6   6   7   8   5   4   1   2   3
              7   7   8   5   6   3   4   1   2
              8   8   5   6   7   2   3   4   1
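Since each normalising orientation is one of only eight transforms, combining them reduces to a table lookup. A minimal Python sketch, with the values transcribed from the table above (function and variable names are illustrative):

COMBINE = (
    (1, 2, 3, 4, 5, 6, 7, 8),
    (4, 1, 2, 3, 6, 7, 8, 5),
    (3, 4, 1, 2, 7, 8, 5, 6),
    (2, 3, 4, 1, 8, 5, 6, 7),
    (5, 6, 7, 8, 1, 2, 3, 4),
    (6, 7, 8, 5, 4, 1, 2, 3),
    (7, 8, 5, 6, 3, 4, 1, 2),
    (8, 5, 6, 7, 2, 3, 4, 1),
)

def combined_orientation(tile_no, patch_no):
    # Row subscript: transform number of the tile; column subscript:
    # transform number of the matching large patch (both 1-based).
    return COMBINE[tile_no - 1][patch_no - 1]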

Whilst the masks of Figure 7 provide a number of attribute characteristics, and therefore enable matching blocks to be identified with a high degree of reliability, it is sufficient in some instances to use the four quadrant sums directly as attribute characteristics. A suitable algorithm is as follows. Consider a square made up from four quadrants with values a, b, c, d arranged as:

a b
d c

The first step in normalisation is to select the smallest quadrant sum and, retaining this as an Intensity Offset, convert the squares to relative intensities, e.g. the square

  500  450
  600  200

becomes

  300  250     (Intensity Offset: 200)
  400    0

Contrast Scaling (the largest relative sum is scaled to 1000):

  750  625     (parameters: 200, 2.5)
 1000    0

Normalising Orientation (Transform 4):

 1000  750     (parameters: 200, 2.5, 4)
    0  625
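The whole procedure can be expressed compactly. The Python sketch below (an illustration; the 0 to 1000 range follows the worked example, and the orientation step implements the rotate-then-swap rule given earlier) reproduces the example above.

def normalise(tl, tr, bl, br, top=1000):
    # Level: subtract the smallest quadrant sum (the Intensity Offset).
    offset = min(tl, tr, bl, br)
    rel = [q - offset for q in (tl, tr, bl, br)]
    # Contrast: scale so that the largest relative sum equals top.
    scale = top / max(rel) if max(rel) else 1.0
    tl, tr, bl, br = (q * scale for q in rel)
    # Orientation: rotate the largest sum into the top left position,
    # then, if necessary, interchange top right and bottom left.
    quads = [tl, tr, br, bl]                  # clockwise from top left
    k = quads.index(max(quads))
    quads = quads[k:] + quads[:k]
    tl, tr, br, bl = quads
    swapped = bl > tr
    if swapped:
        tr, bl = bl, tr
    # (k, swapped) identifies which of the eight transforms was applied;
    # for the worked example it corresponds to Transform 4 of Table 1.
    return (tl, tr, bl, br), offset, scale, (k, swapped)

print(normalise(500, 450, 600, 200))
# ((1000.0, 750.0, 0.0, 625.0), 200, 2.5, (3, False))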

There will now be described an image compression system which results in improved compression times.

This system makes use of a reference image to provide the large data patches to which the tiles of an image to be matched are compared. There is shown in Figure 9 a system for compressing a video signal. Analog signals from a video source are input to the system and digitised in an analog to digital converter 1, to some set number of bits of accuracy, at times controlled by timing signals 2 from an appropriate timing unit 3. The effect of converting the analog signal is to produce for each frame of the signal a sequence of binary digital values representing local estimates of the image brightness, such estimates being made on a regular rectangular grid of points distributed across the image frame. The digital sample sequence is passed to an attribute extraction unit 4, illustrated in detail in Figure 10, which computes geometric attributes of a set of contiguous subsets of the total sample sequence representing the frame currently being compressed. Typically, but not exclusively, such subsets will correspond to rectangular sub-arrays of the rectangular frame. Simultaneously with the computation of the geometric attributes of each sub-frame, global [to the sub-image] brightness and contrast attributes are computed which subsequently form part of the compressed data.

For each frame, the set of attributes is passed to a comparison unit 5 which has the function of comparing the current attributes of each sub-frame with those determined for the equivalent sub-image of the previous frame and of identifying from the comparisons those sub-frames of the complete frame the attributes of which have significantly changed. The attribute characteristics for the previous frame are stored in a linked storage unit 6 which is connected to the comparison unit 5. Following the comparison operation, the storage unit is updated by replacing the attribute characteristics of the previous frame with those of the current frame. An associative store 8 contains a set of attribute characteristics derived for a corresponding set of large patches extracted from a fractal reference image. This reference image may be, for example, a photograph of a fractal generated image. The attributes corresponding to those sub-frames which have been flagged as significantly changed (or alternatively the attributes of all sub-images, with those which have significantly changed being appropriately labelled for future identification) are passed to a unit 7 which identifies the entries in the associative store 8 which most closely match the flagged sub-frames. To achieve this match, the attributes are normalised in terms of brightness and contrast and are used to access the associative store. The patch index of the best matching patch is passed to a data stream formation unit 9 which combines the patch index with the corresponding brightness and contrast values to form the data which make up the image brightness transformations which comprise the data stream.

The attribute extraction unit 4 is shown in detail in Figure 11. The function of this unit is to compute the attributes of each patch of the source frame to enable it to be matched against the nearest equivalent patch of the fractal reference image.

The attributes for a patch are logically calculated by summing the pixel intensities of the patch under a mask as described above. A problem with this approach, however, is that a television camera presents the image in raster scan order rather than patch by patch. If we assume, for the sake of a concrete example, that the image is of dimensions 320 by 240 resolvable points and that patches of size 8 by 8 resolvable points are used, then there will be 40 patches across the screen. After the camera has scanned 8 lines of the screen enough information has been transmitted to calculate the attributes for 40 patches. It is necessary, however, for the system to maintain information about the stage of calculation of each attribute and each patch whilst the 8 lines that make up a patch are being scanned.
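A minimal software analogue of this bookkeeping (Python; an illustration, not the hardware itself) shows one adder/subtractor unit with its circular buffer accumulating a single mask attribute for a row of 8 by 8 patches as pixels arrive in raster scan order, following the numbered steps set out below.

from collections import deque

def row_attributes(lines, mask, patch=8):
    # lines: the 8 scan lines of one row of patches; mask: 8 x 8 entries
    # of +1 or -1, standing in for the signal from the mask generator.
    n_patches = len(lines[0]) // patch
    buf = deque()                              # the circular buffer
    for y, line in enumerate(lines):
        for p in range(n_patches):
            # First line: clear A (step 1); later lines: reload this
            # patch's partial sum from the head of the buffer (4.1, 5.1).
            acc = 0 if y == 0 else buf.popleft()
            for x in range(patch):
                # Add or subtract each arriving pixel (steps 2, 4.2, 5.2).
                acc += mask[y][x] * line[p * patch + x]
            # Output to the tail of the buffer (steps 3, 4.3); after the
            # last line these totals form the output stream S (step 5.3).
            buf.append(acc)
    return list(buf)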

The attribute extraction unit therefore consists of as many adder/subtractor units as there are attributes to be calculated. Associated with each of these is a circular buffer that can be used to store the state of the attribute calculations for all of the patches currently being scanned. The adder/subtractor units contain an accumulator A and are fed with both the digitised raw video information and a value from the circular buffer pertaining to one particular patch. The sequence of operations is as follows:

Assume that the camera is just starting to scan the first line of a row of patches.

1) Just before the first pixel of each patch arrives the accumulator A is cleared.

2) As each of the 8 pixels on the scan line arrives from the camera, it is added to or subtracted from the accumulator total under the control of a signal from the mask generator.

3) Once 8 pixels have been processed the accumulator total is output to the tail of the circular buffer and step 1 is repeated.

4) Once the first scan line has been processed the next 6 lines are processed in a similar way as follows:

4.1) Just before the first pixel of each patch arrives, the accumulator A is loaded from the head of the circular buffer.

4.2) As each of the 8 pixels on the scan line arrives from the camera, it is added to or subtracted from the accumulator total under the control of a signal from the mask generator.

4.3) Once 8 pixels have been processed the accumulator total is output to the tail of the circular buffer and step 4.1 is repeated.

5) The last scan line of the row of patches is processed as follows:

5.1) Just before the first pixel of each patch arrives, the accumulator A is loaded from the head of the circular buffer.

5.2) As each of the 8 pixels on the scan line arrives from the camera, it is added to or subtracted from the accumulator total under the control of a signal from the mask generator.

5.3) Once 8 pixels have been processed the accumulator total is output to the output stream S and step 5.1 is repeated.

In order to decompress a compressed image, a video decompressor is provided which is designed to be integrated into the bus of a personal computer, enabling it to:

1. Decompress video images.
2. Operate as a standard character display.
3. Operate as a graphics display.

It is well known that the character map displays which are widely used in desktop computers are faster to update than bitmap displays. This is because when the image of a character is to be placed on the screen only its ascii value has to be placed in the display buffer rather than a complete bitmap of the character image. The actual bitmap is looked up by the display hardware using an appropriate memory indexed by the ascii character, the character set memory. This means that the data-rate required to update the screen is perhaps 1/64th of that required for a 256 colour bitmapped display.

The video decompressor shown in Figure 12 is an extension of this technology. The design substitutes a patch index buffer for the normal character buffer. This is substantially the same as a character buffer except that, instead of characters (represented in ascii) to be displayed, the words stored in the buffer contain, for a monochrome display, 3 fields {I,b,c} where:

I. is an index into a fractal reference image;
b. is a brightness shift to be applied to the reference image; and
c. is a contrast shift to be applied to the reference image.

The patch index buffer is a random access memory which can be addressed by the host processor as well as by the display timing logic. The host processor can write the fields into any of the words of the patch index buffer. If a picture is to be shown, the host processor loads an appropriate set of these words into the patch index buffer.

The fractal reference image is stored in a second memory and consists of an array of pixels, for each of which sufficient bits are stored to encode the brightness of the fractal image at that point. If we assume that the patches on the screen are, for instance, 8 x 8 pixels square then the decompression process proceeds as follows:
a) as the CRT display gun passes into a particular patch on the screen, the corresponding word of the patch index buffer is accessed;
b) the image index field I is used in conjunction with a line number n to fetch a line L within a patch of the fractal reference image R such that L = R[I,n];
c) the line L is now shifted out pixel by pixel into a digital to analogue converter (call the output of this a); and
d) the analog data stream is sent to a shift multiply unit where each pixel intensity is transformed into a regenerated video stream v by multiplying by the contrast shift c and adding the brightness shift b, so that v = b + (a x c).

The resulting pixel intensity values are sent to a CRT or other display hardware. Certain variations on the basic design are also possible.
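The per-pixel work of the decompressor is small enough to state in a few lines. The Python sketch below (data layout and names are assumptions) regenerates one screen patch from a word (I, b, c) of the patch index buffer and the reference image R, line by line, as in steps a) to d) above.

def regenerate_patch(word, R, patch=8):
    # word: the fields (I, b, c) for one screen patch; R[I][n] is line n
    # of patch I of the fractal reference image.
    I, b, c = word
    return [[b + a * c for a in R[I][n]]       # v = b + (a x c)
            for n in range(patch)]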

If the patch index buffer is termed level one and the reference image buffer level two, then the significant features of the above described embodiments are:
i) the use of two-level display indirection as a technique for decompressing images;
ii) the use of a fractal reference image as the contents of the second level;
iii) the storage of a combination of brightness and contrast shift information in each word of the first level of the display; and
iv) the use of the same mechanism for decompression of images as is used to display conventional textual or graphical information.

It will be appreciated by the skilled person that various modifications may be made to the above described embodiments within the scope of the present invention.

For example, it would be feasible to perform the brightness shift and the contrast adjustment on the patches from the reference image by means of a digital add/multiply unit set between the digital output stream from the fractal reference buffer and the digital to analogue convertor. Alternatively, the operations could be performed using a lookup table in an appropriate transformation memory.

Colour displays may be provided using the standard technique of running 3 data paths in parallel to generate the red, green and blue or chrominance, luminance and hue components of the image. This requires a widening of the words in the patch index buffer to allow indices for each of the components to be included. Three reference image buffers would have to be provided (or a single reference image buffer would be time multiplexed between the three colour components).

If the reference image buffer is large enough and is made readable and writable by the host processor, parts of the buffer may be used to store a character set for text display in the conventional manner whilst other parts may be used to store uncompressed bitmap images as in a conventional bitmap display. By setting the patch index buffers to point at an area of the reference image containing characters, characters are displayed on the screen. By setting the patch index buffers to point at portions of an uncompressed image, the uncompressed image is displayed on the screen.

In the vector quantization method described above, an image is divided up into a set of patches which "tile" its area, in an overlapping or, more usually, a non-overlapping fashion. We can envisage this, for the sake of illustration, as analogous to a page of text being represented by an arrangement of tiles, each of which contains one letter. The Vector Quantization System employs a code book of entries which associate a pattern of pixels (such as may occur in an image) with a standard code number. This association in the "letter" illustration would be between the shape of a particular character and its standard code. Whereas the pattern itself requires a number of pixel values, the code number is compact and can be represented by a single number, taking only a few bits when expressed in binary representation.

The vector quantization method matches each tile of the image to be compressed to the nearest entry in its code book and then stores, or sends to a remote site, only the corresponding short code. When it is desired to decompress the compressed format, the decompressor (which must also have access to an identical code book) can reverse the process, looking up each short code and extracting the corresponding larger set of individual pixel values which represent the pattern.

Techniques of rapidly locating the nearest code in the code book during the compression phase have been described above.

One of the difficulties concerns the code book's set of standard patterns. Conventionally 8x8, 16x16 or similar blocks of pixels are employed as tiles, each pixel being represented by 24 bits of data. Code books have only a relatively small number of standard entries and, although these are normally adjusted by level and contrast or other parameters, the fact remains that a relatively tiny number of standard codes must be used to optimally represent a potentially enormous number of different possible patterns.

This problem has been handled in the past by either adopting a set of patches from typical pictures, or adopting an adaptive strategy in which a special dedicated code book dictionary is created and stored with, or sent with, the compacted image or series of images. Neither approach is optimal.

There will now be described an image compression process the basis of which is the application of a form of self-organising memory system such as is described in PCT/GB9400777. Such a system can be set up to determine a given number of patterns which best match a given set of patterns in a large representative set of "training images". As described for the conventional approach, each input tile pattern is first normalised to standardise its level and contrast before it is applied to the code book as a candidate tile pattern. The effect of the self-organising memory is to produce a "best average" set of individual patterns. The system attempts initially to store each pattern individually; however, this soon becomes impossible and the incoming pattern is stored as a weighted composite of its nearest match already in the memory. The memory also ensures that each candidate pattern accepts an approximately equal number of component input patterns to make its composite.
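In outline, the training loop can be sketched as follows (Python; a simplification under stated assumptions: the combining and splitting of under- and over-used candidate patterns performed by the self-organising store of PCT/GB9400777 is omitted, and a fixed blending weight stands in for its weighted-composite rule).

def train_codebook(tile_patterns, candidate_codes, passes=4, weight=0.1):
    # tile_patterns and candidate_codes are lists of equal-length vectors
    # of normalised pixel intensities.
    for _ in range(passes):
        for t in tile_patterns:
            # Allocate the pattern to its nearest-match candidate.
            best = min(range(len(candidate_codes)),
                       key=lambda k: sum((t[i] - candidate_codes[k][i]) ** 2
                                         for i in range(len(t))))
            # Fold the pattern into the candidate as a weighted composite.
            candidate_codes[best] = [
                (1 - weight) * candidate_codes[best][i] + weight * t[i]
                for i in range(len(t))]
    return candidate_codes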

The effect is thus that the system produces codes which will be of approximately equal frequency when the code book is used to compress an image of similar composition to those in the training set.

The code book of tile patterns thus produced is then used for coding (and thus compressing) subsequent images. There are a number of significant advantages in the new approach:

1. The New System using Optimal Codes as described here is capable of achieving much greater degrees of compression, while maintaining the perceptual quality of the image, than other systems which use conventional methods of producing their code books, or Fractal Image Compression (in which the code book is implicit, being the image itself).

2. The Optimal Codes retain their ability to represent images over a remarkably wide range of images. In fact they make it feasible to construct and maintain a general-purpose code book which has good average performance over the complete range of images.

3. Once produced by the training process, the Optimal Codes can be applied directly by fast, simple hardware and software methods. Unlike dynamic (Adaptive) Vector Quantization methods there is no need to dynamically provide new versions of the code book for decompression and, unlike Fractal Methods, the decompression process is a simple, linear and non-iterative one.

4. Notwithstanding the good general-purpose nature of the codes, the method can also be applied in a dynamic way, if this is required.

5. The properties of the Optimal Codes make them suitable elements in other applications; for example those involving recognition of picture elements.

There are three systems involved in the overall operation:
(1) the Optimal Code Construction System;
(2) the Compression System; and
(3) the Decompression System.

The code-forming process uses a training set (A) which is representative of the sample of the images to be compressed, both with regard to frequency and content. A scanner (B) splits each image into a set of Actual Tile Patterns, which are presented one at a time to a Code Maker (C). The Code Maker presents each Actual Tile Pattern to its self-organising store of Candidate Codes, and the associative properties of this store result in the Actual Pattern being allocated a "nearest match" Candidate Pattern. Two processes then interact. The first takes account of the number of Actual Tile Patterns which already contribute to this Candidate Code and to other Candidate Codes: Candidate Codes with atypically few contributors are combined, and those with atypically many are split. The second combines the new Actual Tile Pattern with its selected Candidate Code as a weighted average representation. The operations of the self-organising store are described in PCT/GB9400777.

The image to be compressed is presented to the Scanner, which splits it into Actual Tile Patterns and normalises them to standard level and contrast, retaining the appropriate parameters as part of the compressed representation. The Actual Tile Pattern is then associatively matched against the code book. This may be done either by a simple exhaustive matching procedure, by a fast associative search as described above, or by an associative search process analogous to that used in constructing the optimal codes. The net result is to identify the code number of the most nearly matching optimal code, which is stored as the remaining part of the compressed representation of the Actual Tile. In this way each Actual Tile of the image is processed, and the whole image is eventually compressed.
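A sketch of this compression pass (Python; exhaustive matching is used for brevity where the text also permits the fast associative searches described earlier, and the names are illustrative):

def compress(tile_patterns, code_book):
    compressed = []
    for t in tile_patterns:
        level = min(t)                         # level parameter
        contrast = (max(t) - level) or 1       # contrast parameter
        norm = [(v - level) / contrast for v in t]
        # Exhaustive nearest match against the optimal codes.
        code = min(range(len(code_book)),
                   key=lambda k: sum((norm[i] - code_book[k][i]) ** 2
                                     for i in range(len(norm))))
        compressed.append((code, level, contrast))
    return compressed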

Adaptive versions of the system continue the optimal code training process to produce a new Code Book V(n+1), while continuing to code using Code Book V(n). When the Code Book V(n+1) is judged to be sufficiently more optimal than the outdated V(n), the new code book is made available for both compression and decompression.

The decompression system is a conventional Vector Quantization decompression sub-system as described above. The compressed representation of each tile is processed in turn, each being converted to an Actual Tile Representation closely approximating the original. To do this the system applies the code number to the code book associative memory, thus addressing the corresponding optimal Code Pattern. This is then modified by the remaining parameters of the compressed representation for the tile, e.g. the level and contrast parameters.
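The corresponding decompression step is direct, as a sketch under the same illustrative conventions as the compression sketch above shows: each stored code number addresses the optimal Code Pattern, which is then denormalised by the tile's level and contrast parameters.

def decompress(compressed, code_book):
    # Invert the normalisation: pixel value = level + code value * contrast.
    return [[level + v * contrast for v in code_book[code]]
            for (code, level, contrast) in compressed]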