Title:
A PROGRESSIVE JPEG BITSTREAM TRANSCODER AND DECODER
Document Type and Number:
WIPO Patent Application WO/2015/041652
Kind Code:
A1
Abstract:
A method, system, and computer program for providing a progressive JPEG image (40) over a flexible range of memory available for a progressive JPEG decode (24). Progressive JPEG decode (24) involves transcoding of a progressive JPEG image (22) by insertion of Restart/ReSync Markers (27, 31, 32). An Application Data Segment is inserted in the progressive JPEG bitstream for specifying a Region of Interest (ROI) and the operations to be performed on the ROI. The system scales itself according to the memory available (36) for the progressive JPEG decode. It reduces the intermediate memory required for decoding a progressively encoded JPEG bitstream (22). It simultaneously reduces the computation resources required to decode a progressively encoded image (JPEG/JPEG-XR) bitstream (22). It increases performance efficiency in terms of real-time performance and reduces the power consumed for decoding a progressively encoded JPEG bitstream (22).

Inventors:
GOEL ANURAG (IN)
Application Number:
PCT/US2013/060626
Publication Date:
March 26, 2015
Filing Date:
September 19, 2013
Assignee:
ENTROPIC COMMUNICATIONS INC (US)
International Classes:
H03M7/40
Foreign References:
US20110150351A12011-06-23
US20120155767A12012-06-21
US20130148139A12013-06-13
Attorney, Agent or Firm:
BACHAND, Richard A. (San Diego, California, US)
Claims:
CLAIMS

1. A method of transcoding and reconstructing of at least one MCU/data unit segment of at least one image component comprising the steps of:

segmenting (27, 31, 32) the scans of the at least one image component into a number of multiple segments by inserting predetermined markers (32) and application data segments in a JPEG bitstream (22);

decoding (28) available scans of the at least one MCU/data unit segment; and reconstructing (34, 38, 40) the at least one image. 2. The method of claim 1 wherein the predetermined markers (32) comprise at least one member from the group consisting of a restart interval segment, restart markers (32), an application data segment and resync markers (32) and/or wherein the step of decoding comprises decoding frequency coefficients of at least one MCU/data unit segment of a first scan of an image component and then decoding a MCU/data unit segment of a second scan which corresponds to a MCU/data unit segment of previous scan and/or wherein the step of decoding comprises decoding Multiple Frequency Coefficients in parallel on different scans of an Image component and/or wherein the step of decoding comprises entropy decoding (28) of one complete segment of each multiple segmented scan but reconstructing only the MCU/data units which correspond to a minimum restart interval among all the restart intervals of scans of an image component and/or wherein the step of segmenting comprises segmenting each scan of an image component using a different restart interval for each scan and/or further comprising splitting an end of band entropy code into multiple end of band entropy codes for creation of uniform restart intervals in a scan, a last restart interval being an exception and/or further comprising the step of inserting a default restart interval marker segment in a table/miscellaneous marker segment before a frame header and inserting corresponding restart markers (RSTm) in the scan and/or further comprising the step of inserting a scan specific restart interval marker segment in a table or miscellaneous marker segment present in a scan and inserting corresponding RSTm markers in the scan and/or further comprising the step of storing information about zero runs of the MCU/data units of a scan to prevent parsing of zero runs of the MCU/data units when zero runs of MCU/data units straddle across a synchronization point and/or wherein the predetermined markers comprises multi byte resync marker that comprise a member from the group of restart markers (RSTm), start of image (SOI) markers, end of image (EOI) markers , Reserved Markers, and start of frame (SOF) markers and further comprises the step of inserting the predetermined marker by replacing bits in the compressed bitstream to indicate a location up to which entropy encoding has been done and/or further comprising the step of randomly seeking regions of interest according to an inserted restart interval segment, an application data segment, restart and resync markers and/or further comprising the step of buffering the entropy decoded data of available scans wherein a size of an intermediate buffer memory comprises:

2*Bytes_per_DCT_coefficient*MCU_data_unit_size bytes to 2*Bytes_per_DCT_coefficient*MCU_data_unit_size * Max_number_of_non_zero_data_units_between_2_synchronization_points bytes and/or wherein the step of reconstructing comprises reconstructing up to 1/4th of frequency coefficients which are present in a top-left 4x4 array of an 8x8 block or only up to the frequency coefficients which correspond to half a height and width of a frequency transformed data unit block; and

reconstructing the complete JPEG image and/or wherein the step of reconstructing comprises reconstructing approximations of the JPEG image using stored DCT coefficients and/or wherein the step of reconstructing comprises reconstructing up to 4 or 5 most significant bits (MSB's) of frequency coefficients and/or wherein the step of reconstructing further comprises reconstructing of scans of one image component before proceeding to decode other image components. 3. The method of claim 2 wherein the application data segment comprises application data bytes and a multi byte Application_specific_marker/ReSync Marker. 4. A system for transcoding and reconstructing of at least one MCU/data unit segment of at least one image component comprising:

means for segmenting (27, 31, 32) the scans of the at least one image component into a number of multiple segments by inserting predetermined markers (32) and application data segments in a JPEG bitstream (22); means for decoding (28) available scans of the at least one MCU/data unit segment; and

means for reconstructing (34, 38, 40) the at least one image. 5. The system of claim 4 wherein the predetermined markers comprise at least one member from the group consisting of a restart interval segment, restart markers, an application data segment and resync markers and/or wherein the means for decoding comprises a means for decoding frequency coefficients of at least one MCU/data unit segment of a first scan of an image component and then decoding a MCU/data unit segment of a second scan which corresponds to a MCU/data unit segment of previous scan and/or wherein the means for decoding comprises a means for decoding Multiple Frequency Coefficients in parallel on different scans of an Image component and/or wherein the means for decoding comprises a means for entropy decoding of one complete segment of each multiple segmented scan but reconstructing only the MCU/data units which correspond to a minimum restart interval among all the restart intervals of scans of an image component and/or wherein the means for segmenting comprises a means for segmenting each scan of an image component using a different restart interval for each scan and/or further comprising a means for splitting an end of band entropy code into multiple end of band entropy codes for creation of uniform restart intervals in a scan, a last restart interval being an exception and/or further comprising a means for inserting a default restart interval marker segment in a table/miscellaneous marker segment before a frame header and inserting corresponding restart markers (RSTm) in the scan and/or further comprising a means for inserting a scan specific restart interval marker segment in a table or miscellaneous marker segment present in a scan and inserting corresponding RSTm markers in the scan and/or further comprising a means for storing information about zero runs of the MCU/data units of a scan to prevent parsing of zero runs of the MCU/data units when zero runs of MCU/data units straddle across a synchronization point and/or wherein the predetermined markers comprises multi byte resync marker that comprise a member from the group of restart markers (RSTm), start of image (SOI) markers, end of image (EOI) markers , Reserved Markers, and start of frame (SOF) markers and further comprises the step of inserting the predetermined marker by replacing bits in the compressed bitstream to indicate a location up to which entropy encoding has been done and/or further comprising a means for randomly seeking regions of interest according to an inserted restart interval segment, an application data segment, restart and resync markers and/or further comprising a means for buffering the entropy decoded data of available scans wherein a size of an intermediate buffer memory comprises:

2*Bytes_per_DCT_coefficient*MCU_data_unit_size bytes to 2*Bytes_per_DCT_coefficient*MCU_data_unit_size * Max_number_of_non_zero_data_units_between_2_synchronization_points bytes and/or wherein the means for reconstructing comprises a means for reconstructing up to 1/4th of frequency coefficients which are present in a top-left 4x4 array of an 8x8 block or only up to the frequency coefficients which correspond to half a height and width of a frequency transformed data unit block; and reconstructing the complete JPEG image and/or wherein the means for reconstructing comprises a means for reconstructing approximations of the JPEG image using stored DCT coefficients and/or wherein the means for reconstructing comprises a means for reconstructing up to 4 or 5 most significant bits (MSB's) of frequency coefficients and/or wherein the means for reconstructing further comprises a means for reconstructing of scans of one image component before proceeding to decode other image components. 6. The system of claim 5 wherein the application data segment comprises application data bytes and a multi byte Application_specific_marker/ReSync Marker. 7. A non-transitory computer-executable storage medium comprising program instructions which are computer-executable to implement transcoding and reconstructing of at least one MCU/data unit segment of at least one image component comprising:

program instructions that cause segmentation (27, 31, 32) of the scans of the at least one image component into a number of multiple segments by inserting predetermined markers (32) and application data segments in a JPEG bitstream (22);

program instructions that cause available scans of the at least one MCU/data unit segment to be decoded (28); and

program instructions that cause a reconstruction (34, 38, 40) of the at least one image.

8. The non-transitory computer-executable storage medium of claim 7 wherein the application data segment comprises application data bytes and a multi byte

Application_specific_marker/ReSync Marker and/or wherein the program instructions that cause available scans of the at least one MCU/data unit segment to be decoded comprises decoding frequency coefficients of at least one MCU/data unit segment of a first scan of an image component and then decoding a MCU/data unit segment of a second scan which corresponds to a MCU/data unit segment of previous scan and/or wherein the program instructions that cause available scans of the at least one MCU/data unit segment to be decoded comprises decoding Multiple Frequency Coefficients in parallel on different scans of an Image component and/or wherein the program instructions that cause available scans of the at least one MCU/data unit segment to be decoded comprises entropy decoding of one complete segment of each multiple segmented scan but reconstructing only the MCU/data units which correspond to a minimum restart interval among all the restart intervals of scans of an image component and/or wherein the program instructions that cause segmentation comprises segmenting each scan of an image component using a different restart interval for each scan and/or further comprising program instructions that cause a split of an end of band entropy code into multiple end of band entropy codes for creation of uniform restart intervals in a scan, a last restart interval being an exception and/or further comprising program instructions that cause insertion of a default restart interval marker segment in a table/miscellaneous marker segment before a frame header and inserting corresponding restart markers (RSTm) in the scan and/or further comprising program instructions that cause insertion of a scan specific restart interval marker segment in a table or miscellaneous marker segment present in a scan and inserting corresponding RSTm markers in the scan and/or further comprising program instructions that cause information about zero runs of the MCU/data units of a scan to be stored to prevent parsing of zero runs of the MCU/data units when zero runs of MCU/data units straddle across a synchronization point and/or wherein the predetermined markers comprises multi byte resync marker that comprise a member from the group of restart markers (RSTm), start of image (SOI) markers, end of image (EOI) markers , Reserved Markers, and start of frame (SOF) markers and further comprising program instructions that cause insertion of the predetermined marker by replacing bits in the compressed bitstream to indicate a location up to which entropy encoding has been done and/or further comprising program instructions that cause randomly seeking regions of interest according to an inserted restart interval segment, an application data segment, restart and resync markers and/or further comprising program instructions that cause the entropy decoded data of available scans be buffered wherein a size of an intermediate buffer memory comprises:

2*Bytes_per_DCT_coefficient*MCU_data_unit_size bytes to 2*Bytes_per_DCT_coefficient*MCU_data_unit_size * Max_number_of_non_zero_data_units_between_2_synchronization_points bytes and/or wherein the program instructions that cause a reconstruction comprise reconstructing up to 1/4th of frequency coefficients which are present in a top-left 4x4 array of an 8x8 block or only up to the frequency coefficients which correspond to half a height and width of a frequency transformed data unit block; and reconstructing the complete JPEG image and/or wherein the program instructions that cause a reconstruction comprise reconstructing approximations of the JPEG image using stored DCT coefficients and/or wherein the program instructions that cause a reconstruction comprise reconstructing up to 4 or 5 most significant bits (MSB's) of frequency coefficients and/or wherein the program instructions that cause a reconstruction further comprise reconstructing of scans of one image component before proceeding to decode other image components. 9. The non-transitory computer-executable storage medium of claim 8 wherein the predetermined markers (32) comprise at least one member from the group consisting of a restart interval segment, restart markers, an application data segment and resync markers.

Description:
A PROGRESSIVE JPEG BITSTREAM TRANSCODER AND DECODER

FIELD

[0001] The disclosed method and apparatus relate to communication systems, and more particularly to embodiments that reduce the memory and computation resources required to decode an image bitstream encoded in Progressive JPEG mode.

BACKGROUND INFORMATION

[0002] Under the Joint Photographic Experts Group (JPEG) standard, an image may be encoded as successive approximations. A JPEG image encoded in such a manner is termed a Progressive JPEG bitstream. A Progressive JPEG bitstream is transmitted by sending successive approximations of the JPEG image over a network in succession. The JPEG standard suggests storing entropy-decoded Discrete Cosine Transform (DCT) coefficients for all the components in memory for an image bitstream encoded in progressive JPEG mode. As soon as a subset of frequency coefficients of all the components, as partitioned by an image or JPEG encoder, becomes available, it is stored and decoded, and an image that is a coarse approximation of the original image is displayed. DCT coefficients that have been decoded are stored because they are required for decoding an improved approximation of the image after a remaining portion of the bitstream has been received. As more frequency coefficients of all the components, as partitioned by the image/JPEG encoder, become available, they are stored and decoded along with the previously stored frequency coefficients, and an image that is an improvement over the previous coarse approximation of the original image is displayed.

[0003] JPEG image decoding as described above requires an intermediate memory on the order of image_width * image_height * no_of_components * frequency_coefficient_num_bytes, i.e., W*H*N*2, for storing the DCT coefficients of all image components. No_of_components (N) is the total number of different components which, when combined, represent a multi-component JPEG image; for example, a YUV JPEG image consists of the three components Y, U and V. Frequency_coefficient_num_bytes is the number of bytes required to represent a frequency coefficient. The JPEG library developed by the Independent JPEG Group requires this much intermediate memory for decoding a progressively encoded JPEG bitstream.

[0004] If the original image dimensions are 2K x 2K, then approximately 2K * 2K * 3 * 2 bytes of memory are needed for the YUV 4:4:4 color format, which is approximately 24 MB. Successive improvements in an image require repeated Inverse Discrete Cosine Transform (IDCT) computation, hence increasing the computation resources by a factor proportional to the number of successive improvements.
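As a quick check of the arithmetic in paragraphs [0003] and [0004], the short Python sketch below (illustrative only; the function name is ours, not the patent's) computes the prior-art intermediate buffer size W*H*N*2.

# Sketch of the prior-art intermediate memory estimate described above:
# image_width * image_height * no_of_components * frequency_coefficient_num_bytes.
def full_coefficient_buffer_bytes(width, height, num_components, bytes_per_coeff=2):
    """Bytes needed to hold every DCT coefficient of every image component."""
    return width * height * num_components * bytes_per_coeff

if __name__ == "__main__":
    # 2K x 2K image, YUV 4:4:4, 2 bytes per coefficient -> 24 MB, matching [0004].
    size = full_coefficient_buffer_bytes(2048, 2048, 3)
    print(f"{size / (1024 * 1024):.1f} MB")  # 24.0 MB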

[0005] Researchers have proposed various solutions toward efficient implementation of a progressive JPEG decoder.

[0006] U.S. Patent No. 7,313,281 B2, entitled "Method and Related Apparatus for JPEG Decoding," to Chi-Cheng Ju et al., teaches decoding each of the scans into partial decoded pixels and summing each newly generated partial decoded pixel.

[0007] U.S. Patent Application No. 2003/0091240 A1, entitled "Method and Apparatus for Progressive JPEG Image Decoding," to Chi-Cheng Ju et al., and U.S. Patent Application No. 2007/0098275 A1, entitled "Method and Apparatus for Progressive JPEG Image Decoding," to Kun-Bin Lee, teach decoding a progressive JPEG image by dividing each of the scans into multiple regions and then decoding the regions individually. Finally, the decoded coefficients of the current decoding region of all scans are output in order to construct a portion of the image data.

[0008] U.S. Patent Application No. 2005/0008234 A1, entitled "Process and Functional Unit for the Optimization of Displaying Progressively Coded Image Data," to Uwe-Erik Martin, teaches a method for optimizing the downloading of progressively coded image data. The wait times between the time points of directly consecutive decoding steps are calculated using statistical image quality parameters of the received partial image data in such a manner that decoding steps which do not lead to a perceptible improvement in the quality of the reconstructed image are suppressed.

U.S. Patent Application No. 2006/0067582 A1, entitled "Progressive JPEG Decoding System," to Mi Michael Bi et al., teaches a method in which DCT coefficients in a particular decoded scan are classified into two categories, namely, most significant DCT coefficients and least significant DCT coefficients. The least significant DCT coefficients are not stored directly in memory. They are binarized and represented by either "0" or "1" indicating whether they are zero or non-zero coefficients. The binarized bitmap for the least significant DCT coefficients and the actual values of the most significant DCT coefficients are stored in memory and, thus, the overall memory requirements are significantly reduced.

U.S. Patent Application No. 2008/0130746A1, entitled "Decoding a Progressive JPEG Bitstream as a Sequentially Predicted Hybrid Video Bitstream" to Soroushian, et. al, teaches generating an intermediate bitstream by parsing a JPEG bitstream carrying a picture. The intermediate bitstream generally includes one or more encoded frames each representing a portion of the picture. A second circuit may be configured to (i) generate one or more intermediate images by decoding the encoded frames, and (ii) recreate the picture using the intermediate images.

U.S. Patent Application No. 2008/0310741 A1, entitled "Method for Progressive JPEG Image Decoding," to Yu-Chi Chen et al., describes a method of using a nonzero history table and a sign table of each Variable Length Decoding (VLD) result, which are recorded and used as a reference for decoding the next scan layer. The decoded coefficients are no longer directly stored in memory, which saves memory space.

U.S. Patent Application No. 2009/0067732 A1, entitled "Sequential Decoding of Progressive Coded JPEGs," to Sukesh V. Kaithakapuzha, teaches that progressive-scan-encoded JPEGs are decoded sequentially on a Minimum Coded Unit (MCU) basis. Address pointers are used to index into each scan, and coded data from each scan is output to form an entropy decoded MCU. Each of these attempts to solve the problem addressed by this disclosure has the similar shortcomings of increased memory usage and increased decode latency.

Secondly, the prior art references perform IDCT, data copy and color format conversion, such as YCbCr to RGB, for the entire set of MCUs/data units for every reconstruction of an approximation of an image.

SUMMARY

The problem solved by this disclosure is to reduce the memory and computation resources required to decode an image bitstream encoded in Progressive JPEG Mode.

For an understanding of the features in this disclosure, a brief description of the state of the art is provided.

Fig. 1 shows the major operations involved in a typical prior art JPEG encoder 10. A color image is generally represented as a mixture of various color component images; for example, a color image may be represented by a combination of Red, Green and Blue color component images. For higher compression and efficient implementation of various use cases, an RGB source image is generally pre-processed to convert it into a YUV source image. The Source Image Data, which is the input to the JPEG encoder, may or may not be a multi-component image; for example, a YUV source image is a three-component image. JPEG encoder 10 illustrates encoding of a single-component source image. Typically, operations similar to those performed for encoding a single-component source image are performed by JPEG encoder 10 for encoding a multi-component source image (such as shown in Fig. 1A).

Encoder 10 partitions the source image data into MCUs/data units and performs the encoding operation on each MCU/data unit. A data unit is an 8x8 block of samples of one component in DCT-based processes.

The FDCT block performs a mathematical transformation of the data unit to convert a block of samples into a corresponding block of DCT coefficients. One of the DCT coefficients is referred to as the DC coefficient and the rest are the AC coefficients.

[0019] JPEG Encoder 10 selects a Quantization Table from the Table Specifications block. The Quantizer block quantizes the DCT coefficients by using a specific quantization value for each positional DCT coefficient. The positional quantization value is obtained from the Quantization Table:

Iq_uv = round(I_uv / Q_uv)

where Iq_uv is the quantized DCT coefficient at frequency (u,v), I_uv is the DCT coefficient at frequency (u,v), and Q_uv is the quantization value at frequency (u,v).
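The quantization rule above can be illustrated with the following Python sketch (an illustration only; the helper names are ours, and the inverse step anticipates the decoder-side dequantization described later).

def quantize_block(dct_block, quant_table):
    # Iq_uv = round(I_uv / Q_uv), applied position by position to an 8x8 block.
    return [[round(dct_block[v][u] / quant_table[v][u]) for u in range(8)]
            for v in range(8)]

def dequantize_block(quant_block, quant_table):
    # Decoder-side inverse: R_uv = Iq_uv * Q_uv.
    return [[quant_block[v][u] * quant_table[v][u] for u in range(8)]
            for v in range(8)]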

[0020] After quantization, and in preparation for entropy encoding, JPEG Encoder 10 encodes the quantized DC coefficient as the difference from the DC term of the previous block in the encoding order (defined in the following), as shown in Fig. 2A.

[0021] After quantization, and in preparation for entropy encoding, the quantized AC coefficients are converted to a stream of coefficients as per the zig-zag order. The zig-zag sequence is specified in Fig. 2B.
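For illustration, the standard zig-zag traversal of Fig. 2B can be generated as follows (a sketch; the function names are ours).

def zigzag_indices():
    # (row, col) pairs of an 8x8 block in zig-zag order: walk the anti-diagonals,
    # alternating direction, exactly as in Fig. 2B.
    order = []
    for s in range(15):
        diagonal = [(v, s - v) for v in range(8) if 0 <= s - v < 8]
        order.extend(diagonal if s % 2 else reversed(diagonal))
    return order

def to_zigzag(block):
    # Flatten an 8x8 block into the 64-coefficient zig-zag sequence.
    return [block[v][u] for v, u in zigzag_indices()]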

[0022] The Entropy Encoder block (Fig. 1) encodes the quantized DCT coefficients by performing either of two entropy coding methods, i.e., Huffman or Arithmetic coding. The corresponding entropy encoding tables are selected from the Table Specifications block. The Entropy Encoder block may format the encoded stream as a progressive JPEG bitstream or as a sequential JPEG bitstream.

[0023] A progressive JPEG encoder typically stores all the quantized DCT coefficients of an image in an intermediate image buffer that exists between the Quantizer block and the Entropy Encoder block. There are two procedures, i.e., spectral selection and successive approximation, by which the quantized coefficients in the buffer may be partially encoded within a scan; a scan contains data from all the MCUs/data units present in an image component. Reference is now made to Fig. 1B. In spectral selection, the zig-zag sequence of DCT coefficients is segmented into frequency bands. The same frequency bands from all the MCUs/data units are encoded sequentially to form a scan. DC coefficients are always coded separately from AC coefficients. The DC coefficient scan may have interleaved blocks from more than one component. All other scans will have only one component.

Successive approximation is a progressive coding process in which the coefficients are coded with reduced precision. DCT coefficients are divided by a power of two before coding.

An encoder or decoder implementing a full progression uses spectral selection within successive approximation. As indicated above, Fig. 1B illustrates the spectral selection and successive approximation.
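A minimal sketch of the two procedures, assuming an arbitrary band layout and point-transform schedule (neither is mandated by the text above): spectral selection splits the zig-zag positions into bands that each form their own scans, while successive approximation first codes coefficients divided by a power of two and then refines them.

# Illustrative only: an assumed spectral-selection band layout for the 63 AC
# coefficients (DC is always coded in its own scan).
AC_BANDS = [(1, 5), (6, 20), (21, 63)]          # (start, end) positions in zig-zag order

def spectral_selection_scans(zigzag_coeffs):
    # One coefficient band per scan; the same band from every data unit of the
    # component would be concatenated to form that scan.
    return [zigzag_coeffs[start:end + 1] for start, end in AC_BANDS]

def point_transform(coeff, al):
    # Successive approximation: the first pass codes the coefficient divided by
    # 2**al (truncated toward zero); later passes refine one bit at a time.
    return coeff >> al if coeff >= 0 else -((-coeff) >> al)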

Fig. 3 shows the major operations involved in a JPEG decoder 14. A progressive JPEG decoder typically stores all quantized DCT coefficients of an image before decoding the DCT coefficients of the image that are present in separate scans. As soon as a subset of DCT coefficients, as partitioned by the JPEG encoder, becomes available, it is decoded for all image components and the image is sent for display. The maximum number of times a new approximation of an image can be displayed may be equal to the minimum number of scans of an image component.

The JPEG Decoder parses the compressed JPEG bitstream and determines whether the JPEG bitstream to be decoded is a Progressive, Sequential, Hierarchical or Lossless JPEG bitstream. The Entropy Decoding tables, i.e., Huffman or Arithmetic Decoding tables, and the Quantization tables to be used for decoding are obtained from the compressed bitstream.

The Entropy Decoder block performs an Entropy Decoding operation on the compressed bitstream using the Entropy Decoding Tables specified in the bitstream. The Entropy Decoding Table is obtained from the Table Specification block on the basis of information parsed from the JPEG bitstream. A typical progressive JPEG entropy decoder entropy decodes a particular scan completely before proceeding to decode the next scan. In this manner, the progressive entropy decoder entropy decodes all the scans, and hence all image components. The Entropy Decoder block generates quantized DCT coefficients.

[0030] An intermediate Image Buffer exists between the Quantizer block and the Entropy Decoder block. The progressive JPEG entropy decoder stores all the quantized DCT coefficients of an image in this intermediate image buffer.

[0031] The Dequantizer block (Fig. 3) performs a dequantization operation using the Dequantization Tables specified in the bitstream. The De-Quantization Table is obtained from the Table Specification block on the basis of information parsed from the JPEG bitstream:

R_uv = Iq_uv * Q_uv

where R_uv is the inverse quantized DCT coefficient at frequency (u,v), Iq_uv is the entropy decoded DCT coefficient at frequency (u,v), and Q_uv is the quantization value at frequency (u,v).

[0032] If successive approximation was used by the Progressive JPEG Encoder, then the JPEG decoder multiplies the quantized DCT coefficients by a power of two before computing the IDCT. The power of two to be used is obtained from the encoded bitstream.

[0033] The IDCT block performs an Inverse DCT operation on an 8x8 block of inverse quantized DCT coefficients to generate an 8x8 block of image samples of a particular image component.

[0034] JPEG Decoder decodes all MCUs/data units to form Reconstructed Image Data.

[0035] Decoding of a progressively encoded JPEG image usually shows successive improved approximations of an entire image, as shown in Fig. 4.

[0036] Decoding of a sequentially encoded JPEG image usually shows a row-by-row buildup of a final image, as shown in Fig. 5.

[0037] A restart marker segment and its placement in the compressed JPEG bitstream are defined in the JPEG compression standard, as identified previously.

[0038] A restart marker (RSTm) is a conditional marker which is placed between entropy-coded segments only if restart is enabled. There are eight unique restart markers (m = 0 - 7) which repeat in sequence from 0 to 7, starting with zero for each scan, to provide a modulo 8 restart interval count, as shown in Fig. 6. Fig. 6 shows the syntax for sequential DCT-based, progressive DCT-based, and lossless modes of operation. As per the JPEG standard, the value of a Restart Marker is X'FFD0' through X'FFD7', with the value represented in hexadecimal.
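A small sketch of this marker convention (the helper names are ours): RSTm markers are the two-byte codes X'FFD0' through X'FFD7' and cycle modulo 8 within a scan.

RST0 = 0xFFD0

def restart_marker(interval_index):
    # Two-byte RSTm marker for the given restart-interval index (m = index mod 8).
    return (RST0 + (interval_index % 8)).to_bytes(2, "big")

def is_restart_marker(two_bytes):
    return two_bytes[0] == 0xFF and 0xD0 <= two_bytes[1] <= 0xD7

# restart_marker(0) == b'\xff\xd0', restart_marker(9) == b'\xff\xd1'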

[0039] Fig. 7 shows the layout of the tables/miscellaneous marker segment syntax. If any table specification for a particular destination occurs in the compressed image data, it shall replace any previous table specified for this destination, and shall be used whenever this destination is specified in the remaining scans in the frame or subsequent images represented in the abbreviated format for compressed image data. If a table specification for a given destination occurs more than once in the compressed image data, each specification shall replace the previous specification.

[0040] Fig. 8 shows the parameters of a restart interval segment. A restart interval marker segment is the marker segment which defines the restart interval, where DRI 16 represents a Define Restart Interval Marker and marks the beginning of the parameters which define the restart interval. As per the JPEG standard, the value of the DRI Marker is X'FFDD', again represented in hexadecimal. Lr 18 defines the restart interval segment length, which specifies the length of the parameters in the DRI segment shown in Fig. 8, and Ri 20 is the restart interval, which specifies the number of MCUs in the restart interval. A DRI marker segment with Ri nonzero enables restart interval processing for the following scans. A DRI marker segment with Ri equal to zero disables restart intervals for the following scans. The DRI segment is inserted in the JPEG bitstream in the Tables or Miscellaneous Marker Segment, as shown in Fig. 6 and Fig. 7.
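For illustration, a DRI marker segment as laid out in Fig. 8 can be built and parsed as below (a sketch with assumed helper names; Lr is 4 for this segment and Ri is the restart interval in MCUs).

import struct

def build_dri_segment(restart_interval_mcus):
    # DRI marker (0xFFDD) + Lr (4 for this segment) + Ri.
    return struct.pack(">HHH", 0xFFDD, 4, restart_interval_mcus)

def parse_dri_segment(data, offset=0):
    marker, lr, ri = struct.unpack_from(">HHH", data, offset)
    assert marker == 0xFFDD and lr == 4
    return ri                      # Ri == 0 disables restart intervals

# build_dri_segment(8) == b'\xff\xdd\x00\x04\x00\x08'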

The claimed invention provides a significant improvement over the prior art by transcoding and reconstruction of at least one MCU/data unit segment of at least one image component, segmenting the scans of an image into a number of multiple segments by insertion of restart interval segment markers, restart markers, resync markers, and application data segments in the JPEG bitstream. One or more image components are reconstructed in multiple steps. Each step corresponds to reconstruction of identified segments of compressed data which are present in one or more scans of an image component. Entropy coded segments of each scan of each component are divided into multiple segments of smaller sizes. Each MCU/data unit segment of an image component is reconstructed before proceeding to start the reconstruction of the next MCU/data unit segment of this image component or any other image component. Entropy coded segments of various scans of an image component can be decoded sequentially or in parallel.

Restart/Resync Markers are inserted in the entropy coded segment of each scan to create multiple entropy coded segments of relatively smaller sizes in each scan. Entropy coded segments with the same ID in different scans of an image component may or may not contain the same number of MCUs/data units. Hence, each entropy decoder may decode a different number of MCUs/data units for an entropy coded segment per scan. Restart/Resync markers can be used to randomly seek entropy coded segments. Restart/Resync markers also allow parallel decoding of multiple entropy coded segments.

The embodiments disclosed herein can be used in conjunction with another related co-pending patent application for efficient IDCT computation for reconstruction of approximate JPEG images. This patent application is entitled "An Efficient Progressive JPEG Decode Method", Serial No. PCT/US2013/059899. The claimed invention performs efficient region of interest reconstruction by randomly seeking regions of interest according to inserted restart markers, and explicitly specifies regions of interest and the operations to be performed on these regions of interest by specifying the same in the JPEG bitstream as application data segments. The prior art JPEG library uses W*H*N*2 bytes of intermediate memory to store DCT coefficients. In comparison, this invention reduces the memory requirement to approximately:

2*Bytes_per_DCT_coefficient*MCU_Data_Unit_size bytes to

2*Bytes_per_DCT_coefficient*MCU_Data_Unit_size * Max_Number_of_Non_Zero_Data_Units_between_2_Synchronization_Points bytes.

W is typically 2048 and W*H is typically 4 to 9 MB or more.
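To make the comparison concrete, the following sketch contrasts the prior-art buffer with the claimed range; the segment parameters used in the example are assumptions for illustration, not figures taken from the patent.

def prior_art_buffer_bytes(w, h, n, bytes_per_coeff=2):
    # W * H * N * 2, as used by the prior-art JPEG library.
    return w * h * n * bytes_per_coeff

def segment_buffer_range(bytes_per_dct_coefficient, mcu_data_unit_size,
                         max_nonzero_data_units_between_sync_points):
    # Lower and upper bounds of the claimed intermediate buffer size.
    low = 2 * bytes_per_dct_coefficient * mcu_data_unit_size
    high = low * max_nonzero_data_units_between_sync_points
    return low, high

if __name__ == "__main__":
    print(prior_art_buffer_bytes(2048, 2048, 3))       # ~24 MB
    print(segment_buffer_range(2, 64, 16))             # e.g. (256, 4096) bytes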

[0045] The claimed invention allows progressive JPEG decoding over a very flexible range of memory available for progressive JPEG decode. This inventive solution scales itself according to the memory available for progressive JPEG decode.

[0046] There are various uses which require a JPEG Decoder to be present on a Set Top Box (STB), which include:

• Internet browsing on set top box;

• Decode and display of JPEG images downloaded from internet/intranet, which may be wired or wireless or a combination thereof;

• Decode and display of JPEG images shared via storage devices such as USB stick, USB hard disk, SATA hard disk, Flash, etc; and

• Decode and display of JPEG images shared over a relatively close-distance wired or wireless channel such as Bluetooth, Wi-Fi, etc.

[0047] JPEG images encoded for the internet/intranet/Wide Area Network (WAN)/Local Area Network (LAN), or downloaded from the internet/intranet/WAN/LAN, may be available in JPEG progressive mode; hence, it is important for a Set Top Box to support decoding of progressively encoded JPEG content. STBs have a limited amount of memory, so it is essential that a reduced memory footprint be used for decoding Progressive JPEG images. Computational resources are becoming economical, and relatively more computation power is available; higher-capability computation resources have spread into almost all strata of society in developed and developing countries, and to some extent in underdeveloped countries. The resolution supported by digital cameras has also gone up considerably. However, the same is not true for the bandwidth available to consumers; hence, progressively encoded JPEG content is becoming, and is likely to remain, available on the internet.

There are two embodiments described for decoding a progressively encoded JPEG bitstream using a reduced amount of memory: transcoding of a progressive JPEG bitstream into another progressive JPEG bitstream, and updating of a progressive JPEG bitstream by removal of the portions that have already been decoded.

These embodiments are equally applicable to both use cases, i.e., decode of a JPEG bitstream from a file and decode of a JPEG bitstream streaming over a wired/wireless network or a combination thereof.

The claimed embodiments are a significant improvement over the prior art due to memory reduction, because they allow progressive JPEG decoding over a very flexible range of memory available for a progressive JPEG decode. This solution scales itself according to the memory available for progressive JPEG decode. The embodiments provide for efficient decoding of consecutive approximations of a JPEG image.

Further, the embodiments provide decode and display of JPEG image approximations for all components, and include efficient region of interest decoding.

BRIEF DESCRIPTION OF THE DRAWINGS

The disclosed method and apparatus, in accordance with one or more various embodiments, is described with reference to the following figures. The drawings are provided for purposes of illustration only, and merely depict examples of some embodiments of the disclosed method and apparatus. These drawings are provided to facilitate the reader's understanding of the disclosed method and apparatus. They should not be considered to limit the breadth, scope, or applicability of the claimed invention. It should be noted that for clarity and ease of illustration these drawings are not necessarily made to scale.

[0053] Fig. 1 shows a typical prior art JPEG encoder.

[0054] Fig. 1A shows a multi-component source image.

[0055] Fig. 1B illustrates progressive encoding with spectral selection shown on the left side, and successive approximation shown on the right side.

[0056] Fig. 2A shows how a prior art DC prediction is done.

[0057] Fig. 2B shows the reordering of the DCT coefficients in a zigzag order.

[0058] Fig. 3 shows a typical prior art JPEG decoder.

[0059] Fig. 3A illustrates the system level flow of JPEG decoding in a Set Top Box for one component.

[0060] Fig. 4 depicts prior art progressive JPEG decoding.

[0061] Fig. 5 depicts prior art sequential JPEG decoding.

[0062] Fig. 6 shows the bitstream syntax for sequential DCT-based, progressive DCT-based, and lossless modes of JPEG operation.

[0063] Fig. 7 shows the layout of the Tables/miscellaneous marker segment syntax.

[0064] Fig. 8 shows the parameters of a Restart Interval Segment.

[0065] Fig. 9 shows how a Scan of an original JPEG bitstream is transcoded by inserting DRI, Application Data Segment and Restart Markers.

[0066] Fig. 10 shows the transcoding of a progressive JPEG bitstream into another progressive JPEG bitstream for an individual component.

[0067] Fig. 11 shows the decoding of a transcoded/modified progressive JPEG bitstream for simultaneous parallel decoding of one or more image components.

Fig. 12 depicts the process of decoding a transcoded/modified progressive JPEG bitstream with multiple entropy segments in each scan.

Fig. 13 shows one way in which decoding proceeds by insertion and replacement of a ReStart Marker.

Fig. 14 shows transcoding and decoding of a progressive JPEG bitstream by insertion and replacement of a restart/resync marker.

Fig. 15 shows a default region of interest.

Fig. 16 shows an Application Data Segment which explicitly specifies region of interest and actions to be performed on region of interest.

Fig. 17 shows a decode and display of an approximation of a JPEG image using an upscaling operation.

The figures are not intended to be exhaustive or to limit the claimed invention to the precise form disclosed. It should be understood that the disclosed method and apparatus can be practiced with modification and alteration, and that the invention should be limited only by the claims and the equivalents thereof.

DETAILED DESCRIPTION

The following presents a simplified summary of one or more embodiments in order to provide a basic understanding of some aspects of such embodiments. This summary is not an extensive overview of the one or more embodiments, and is intended to neither identify key or critical elements of the embodiments nor delineate the scope of such embodiments. Its sole purpose is to present some concepts of the described embodiments in a simplified form as a prelude to the more detailed description that is presented later.

A progressively encoded JPEG bitstream is likely to contain AC frequency coefficients of each image component in multiple scans. An encoder could have partitioned the frequency coefficients according to spectral selection and/or successive approximation as described in the JPEG standard, as referenced above. This disclosure teaches the transcoding and then the reconstruction of one or more image components in multiple steps. Each step corresponds to reconstruction of identified segments of compressed data which are present in one or more scans of an image component. Entropy coded segments of each scan of each component are divided into multiple segments of smaller sizes. Each MCU/data unit segment of an image component is reconstructed before proceeding to start the reconstruction of the next MCU/data unit segment of this image component or any other image component. Entropy coded segments of various scans of an image component can be decoded sequentially or in parallel.

Resync markers are inserted in the entropy coded segment of each scan to create multiple entropy coded segments of relatively smaller sizes in each scan. Entropy coded segments with the same ID in different scans of an image component may or may not contain the same number of MCUs/data units. Hence, each entropy decoder may decode a different number of MCUs/data units per entropy coded segment per scan. Resync markers can be used to randomly seek entropy coded segments. Resync markers also allow parallel decoding of multiple entropy coded segments.

Fig. 9 shows how a Scan of an original JPEG bitstream is transcoded by inserting DRI, Application Data Segment and Restart Markers.

A first embodiment showing the transcoding of a progressive JPEG bitstream 22 into another progressive JPEG bitstream 24 is shown in Fig. 10 for an individual/multiple component transcode, and Fig. 11 for a simultaneous parallel decode of more than one image component. Fig. 10 depicts the process of transcoding a progressive JPEG bitstream 22 by creating multiple entropy segments in each scan via insertion of Resync markers. Scan Parser 27 finds a scan to be transcoded and schedules it to be decoded by Entropy Decoder 28. Entropy Decoder 28 entropy decodes the bitstream via Table Specifications 30 and sends information about the current MCU/data unit to Re-Sync Marker Inserter 32. Re-Sync Marker Inserter 32 inserts a Re-Sync Marker after each decided MCU/data unit segment of a scan. Entropy Encoder 29 inserts the Application Data Segment and DRI segment in the Tables or Miscellaneous Marker Segment as exemplified in Fig. 9. Encoder 29 may also split an end of band entropy code into multiple end of band entropy codes for creation of uniform restart intervals in a scan. This process is repeated until the entire scan has been processed, and is in turn repeated until all scans of Progressive JPEG bitstream 22 have been processed. Encoder 29 formats the modifications in the Progressive JPEG bitstream to form a transcoded/modified progressive JPEG bitstream 24.

This embodiment performs parallel decoding of the available portion (available scans of available image components) of transcoded/modified progressive JPEG bitstream 24. Scan parser 27 decides which scans from the available portion (available scans of available image components) of transcoded/modified progressive JPEG bitstream 24 are to be decoded. Scan parser 27 finds a scan from this subset of scans and schedules it for entropy decoding. Each scan of a component, via an input buffer 26, is scheduled to be decoded, via an entropy decoder 28, by a separate thread/process pursuant to a table of specifications 30. Each thread/process can run on one or more processors; hence, decoding of one or more scans can occur on one or more processors, as shown in Fig. 11. The processor can be a DSP processor or a general purpose processor. Each entropy decoder 28 running on a thread/process entropy decodes an entropy coded segment until a Resync marker 32 is found.
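A high-level sketch of the transcoding pass of Fig. 10 follows; the helper name, the mcu_segments input (already entropy-coded bytes, one entry per MCU/data unit) and the fixed segment_size are assumptions made for illustration, and byte alignment, bit padding and the end-of-band splitting performed by Encoder 29 are omitted.

def transcode_scan(mcu_segments, segment_size):
    # Insert an RSTm marker after every `segment_size` MCUs/data units so that the
    # output scan is divided into short entropy-coded segments.
    out = bytearray()
    marker_index = 0
    for i, coded_mcu in enumerate(mcu_segments):
        out += coded_mcu
        end_of_segment = (i + 1) % segment_size == 0
        if end_of_segment and i + 1 < len(mcu_segments):
            out += (0xFFD0 + marker_index % 8).to_bytes(2, "big")
            marker_index += 1
    return bytes(out)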

Decoding of progressive JPEG stream 24 is synchronized by communication of a synchronization signal. The synchronization signal may be implemented using various inter-thread, inter-process or inter-processor communication methods; message queues, semaphores, mutexes or interrupts can be used as some of these methods. If resync markers 32 are separated by a large number of MCUs/data units, then the memory required to decode an image component will increase, and if resync markers 32 are separated by a relatively small number of MCUs/data units, then inter-thread/process/processor communication will increase. Depending on the memory and computation resources available, an optimum size of entropy coded segment is chosen. An entropy decoder 28 is run on each thread/process. Each entropy decoder 28 decodes an entropy coded segment which has the same entropy coded segment ID. Entropy decoded DCT coefficients are stored in a shared buffer memory 36 which is accessible by all threads/processes. Double buffering, via shared memory 36, is used to efficiently use the computational resources and keep all the decoding modules busy decoding JPEG bitstream 22. When the de-quantization 34 and IDCT 38 tasks get the information, i.e., the synchronization signal, that all entropy decoders 28 have reached a synchronization point, i.e., a restart/resync marker, this task then schedules the entropy decoded DCT coefficients for de-quantization 34, inverse point transform (not shown), IDCT 38, and other operations involved in decoding of JPEG bitstream 22. This procedure is repeated until all the entropy segments of the selected scans of an image component have been decoded. Scan Parser 27 uses Restart Markers to locate entropy coded segments of interest.
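The synchronization just described can be sketched as follows; the structure below is an assumption for illustration (one worker thread per scan, and a barrier standing in for the synchronization and buffer-release signals), not the patent's implementation.

import threading

def decode_component_in_parallel(scans, num_segments, decode_segment, reconstruct_segment):
    # One worker per scan entropy-decodes up to the next resync marker; the
    # reconstruction task runs once every scan has reached the same point.
    barrier = threading.Barrier(len(scans) + 1)      # workers + reconstruction task

    def worker(scan):
        for seg_id in range(num_segments):
            decode_segment(scan, seg_id)             # fill the shared DCT buffer
            barrier.wait()                           # synchronization point reached
            barrier.wait()                           # wait for the buffer release signal

    threads = [threading.Thread(target=worker, args=(s,)) for s in scans]
    for t in threads:
        t.start()
    for seg_id in range(num_segments):
        barrier.wait()                               # all scans reached the resync marker
        reconstruct_segment(seg_id)                  # dequantize, IDCT, write output, free buffer
        barrier.wait()                               # buffer release signal
    for t in threads:
        t.join()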

Once data units of an image component are reconstructed 40, some are then written to output buffer 44. After the DCT coefficients of data units present up to the synchronization point have been reconstructed 40, shared buffer memory 36, which was being used for storing DCT coefficients, is released to be used by the entropy decoders 28, and a buffer release signal is communicated to all the entropy decoders 28. Hence, the intermediate memory required to store the DCT coefficients is in the following range:

• 2*Max_Number_of_Data_Units present between 2 Resync Markers to 2*Max_Number_of_Non_Zero_Data_Units present between 2 Resync Markers.

A record of zero data units in scans is kept separately. During entropy decoding and reconstruction this record is looked at to avoid reparsing of scans if a non-uniform number of MCUs/data units is present between 2 Resync markers 32 (Restart Markers).

After available scans of an image component are decoded, the process is then repeated for chosen scans from the available scans of the rest of the image components. The output of the JPEG decoder may go through scaling to be coherent with the display resolution and may go through image post processing operations for enhancement of image quality.

[0086] If a progressive JPEG bitstream 22 is being transmitted over a relatively narrow bandwidth network, then only a few scans of each image component will be available to the JPEG Decoder at a given instant of time. Hence, for a streaming progressive JPEG bitstream, multiple approximations of a JPEG image will be sent to the display until the complete JPEG bitstream is received by the JPEG decoder. The transcoded/modified progressive JPEG bitstream may be stored to aid the reconstruction of multiple approximations of a JPEG image and finally the reconstruction of the complete JPEG image.

[0087] Parallel decoding means that various portions of one or more Data Units are being entropy decoded in parallel. Parallel decoding continues until a synchronization point is reached. Once a synchronization point is reached, another buffer is ready to be used by entropy decoder 28. If buffer 36 is not ready, entropy decoders 28 will wait until they receive the buffer available information.

[0088] Next, the process of decoding a progressive JPEG bitstream is discussed and illustrated in Fig. 12. Fig. 12 depicts the process of decoding a transcoded/modified progressive JPEG bitstream with multiple entropy segments in each scan. Scan parser 27 decides which scans from the available portion (available scans of available image components) of transcoded/modified progressive JPEG bitstream 24 are to be decoded. The scan parser finds a scan from this subset of scans and schedules it for entropy decoding. Entropy decoder 28 decodes an entropy coded segment of a scan until a Resync marker 32 is found. Entropy decoded DCT coefficients are stored in shared buffer memory 36. On reaching the Resync marker 32, the scan parser finds another scan from the subset of scans of the same image component. Entropy decoder 28 decodes an entropy coded segment of this scan, whose ID is the same as that of the entropy coded segment from the previous scan, until a Resync marker 32 is found. Entropy decoded DCT coefficients are stored in shared buffer memory 36. This process is repeated until entropy coded segments with the same IDs from each selected scan of an image component have been entropy decoded. The MCUs/data units which correspond to the minimum restart interval among all the selected scans of an image component are reconstructed 40 (de-quantization, inverse point transform, IDCT, and other operations involved in decoding of a JPEG bitstream) and these reconstructed MCUs/data units are written to the output buffer. Memory 36, which was used for holding the entropy decoded DCT coefficients that correspond to a subset of entropy coded segments of scans of an image component, is also used for holding the entropy decoded DCT coefficients that correspond to the next subset of entropy coded segments of scans of an image component. This procedure is repeated until all the entropy segments of the selected scans of an image component have been decoded.
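The sequential flow of Fig. 12 can be sketched compactly as below; the callbacks decode_segment, reconstruct_mcus and write_output are hypothetical stand-ins for the entropy decoder 28, the reconstruction path 40 and the output buffer.

def decode_component_sequentially(scans, num_segments, decode_segment,
                                  reconstruct_mcus, write_output):
    for seg_id in range(num_segments):
        shared_buffer = {}                           # reused for every segment ID
        min_mcus = None
        for scan in scans:
            mcus = decode_segment(scan, seg_id, shared_buffer)
            min_mcus = mcus if min_mcus is None else min(min_mcus, mcus)
        # Reconstruct only the MCUs covered by the smallest restart interval.
        blocks = reconstruct_mcus(shared_buffer, min_mcus)
        write_output(seg_id, blocks)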

The above procedure is repeated until all available/chosen components of a JPEG image have been reconstructed. Output of JPEG decoder may go through scaling to be coherent with the display resolution and may go through image post processing operations for enhancement of image quality.

If the progressive JPEG bitstream 22 is being transmitted over a relatively narrow bandwidth network, then only a few scans of each image component will be available to the JPEG decoder at a given instant of time. Hence, for a streaming progressive JPEG bitstream, multiple approximations of a JPEG image will be reconstructed and sent to the display until the complete JPEG bitstream is received by the JPEG decoder.

If Resync markers are separated by a large number of MCUs/data units, then the memory required to decode an image component will increase; if Resync markers are separated by a relatively small number of MCUs/data units, then the processing done by the scan parser, which involves searching for and scheduling the next decodable entropy coded segment, will increase.

If decoding processes other than entropy decoding are running on another processing unit, then double buffering is used to efficiently use the computational resources and keep all the decoding modules busy decoding a JPEG bitstream. When the de-quantization 34 and IDCT tasks get the information, i.e., the synchronization signal, and the decided number of MCUs/data units (decided on the basis of the minimum restart interval for a given entropy coded segment) have been decoded, this task then schedules the entropy decoded DCT coefficients for de-quantization, inverse point transform, IDCT, and other operations involved in decoding of a JPEG bitstream. Hence the intermediate memory required to store DCT coefficients is in the following range:

2*Max_Number_of_Data_Units present between 2 Resync Markers to 2*Max_Number_of_Non_Zero_Data_Units present between 2 Resync Markers.

[0093] A record of zero data units in scans is kept separately. During entropy decoding and reconstruction, this record is reviewed to avoid reparsing of scans if a non-uniform number of MCUs/data units is present between two Resync markers (Restart Markers).

[0094] This process can either perform individual component decode, i.e., decode one component at a time, or simultaneous decoding of all image components.

[0095] For the deletion of a compressed bitstream in a progressive JPEG bitstream decode, the scan parser decides which scans from the available portion (available scans of available image components) of the progressive JPEG bitstream are to be decoded. The scan parser finds a scan from this subset of scans and schedules it for entropy decoding. Entropy decoder 28 decodes an entropy coded segment of a scan until a synchronization point is reached. Entropy decoded DCT coefficients are stored in a shared buffer memory 36. On reaching the synchronization point, the remaining compressed bitstream of this scan is copied to the start of the entropy coded segment section. The entire scan header except for the SOS marker is also deleted. After deletion of the parsed portion of the compressed bitstream, the scan parser finds another scan from the subset of scans of the same image component. Entropy decoder 28 decodes an entropy coded segment of this scan until a synchronization point is reached. Entropy decoded DCT coefficients are stored in shared buffer memory 36. Upon reaching the synchronization point, the remaining compressed bitstream of this scan is copied to the start of the entropy coded segment section. This process is repeated until entropy coded segments from each selected scan of an image component have been entropy decoded up to the synchronization point. The MCUs/data units that are present between two synchronization points are reconstructed (de-quantization, inverse point transform, IDCT, and other operations involved in decoding of a JPEG bitstream) and these reconstructed MCUs/data units are written to the output buffer. Memory 36, which was used for holding the entropy decoded DCT coefficients that correspond to a subset of entropy coded segments of scans of an image component up to the synchronization point, is reused for holding the entropy decoded DCT coefficients that correspond to the next subset of entropy coded segments of scans of an image component. This procedure is repeated until all the entropy segments of the selected scans of an image component have been decoded.
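The deletion step can be sketched at the byte level as below (assumed buffer handling; the offsets and the in-place bytearray are illustrative choices, not the patent's data structures).

def drop_decoded_portion(scan_buffer, entropy_data_start, sync_point_offset):
    # Discard the already-parsed compressed bytes of a scan by copying the
    # remaining bytes to the start of the scan's entropy-coded data section.
    remaining = scan_buffer[sync_point_offset:]
    scan_buffer[entropy_data_start:entropy_data_start + len(remaining)] = remaining
    del scan_buffer[entropy_data_start + len(remaining):]    # shrink the scan
    return scan_buffer

# scan_buffer is a bytearray holding one scan; entropy_data_start is the offset of
# the scan's entropy-coded data and sync_point_offset the current synchronization point.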

[0096] The above procedure is repeated until all available/chosen components of a JPEG image have been reconstructed. The output of JPEG decoder may go through scaling to be coherent with the display resolution and may go through image post processing operations for enhancement of image quality.

[0097] If a progressive JPEG bitstream is being transmitted over a relatively narrow bandwidth network, then only a few scans of each image component will be available to the JPEG decoder at a given instant of time. Thus, for a streaming progressive JPEG bitstream, multiple approximations of a JPEG image will be reconstructed and sent to the display until the complete JPEG bitstream is received by the JPEG decoder. Progressive JPEG bitstream 22 may be stored to aid the reconstruction of multiple approximations of a JPEG image and finally the reconstruction of the complete JPEG image.

[0098] If synchronization points are separated by a large number of MCUs/data units, then the memory required to decode an image component will increase; if synchronization points are separated by a relatively small number of MCUs/data units, then the processing required for copying and deletion of the bitstream will increase. Depending on the memory and computation resources available, an optimum size of entropy coded segment is chosen.

[0099] If decoding processes other than entropy decoding are running on another processing unit, then double buffering is used to efficiently use the computational resources as it keeps all the decoding modules busy decoding a JPEG bitstream. When the de-quantization and IDCT task gets the information, i.e., the synchronization signal, and the resultant number of MCUs/data units have been decoded, this task then schedules the entropy decoded DCT coefficients for de-quantization, inverse point transform, IDCT, and other operations involved in decoding of a JPEG bitstream. Hence the intermediate memory required to store DCT coefficients is in the range of

2*Max_Number_of_Data_Units present between 2 Synchronization Points to

2*Max_Number_of_Non_Zero_Data_Units present between 2 Synchronization Points.

[00100] A record of zero data units in scans is kept separately. During entropy decoding and reconstruction, this record is consulted to avoid reparsing of scans if a non-uniform number of MCUs/data units is present between two synchronization points.

[00101] Again, this process can either perform individual component decode, i.e., decode one component at a time, or simultaneous decoding of all image components.

[00102] Fig. 13 shows one way in which decoding proceeds by insertion and replacement of a ReStart Marker.

[00103] For the insertion of a resync marker during a progressive JPEG bitstream decode, scan parser 27 decides which scans from the available portion (the available scans of the available image components) of the progressive JPEG bitstream are to be decoded. This is shown in Fig. 13 and Fig. 14. Scan parser 27 selects a scan from this subset of scans and schedules it for entropy decoding 28. Entropy decoder 28 decodes an entropy coded segment of the scan until a synchronization point is reached. The entropy decoded DCT coefficients are stored in shared buffer memory 36. Upon reaching the synchronization point, Re-Sync Marker Inserter 32 inserts a resync marker into the entropy coded segment of this scan and the compressed bits replaced by the resync marker are stored. After insertion of the resync marker, scan parser 27 selects another scan from the subset of scans of the same image component. Entropy decoder 28 decodes an entropy coded segment of the next scan until the synchronization point is reached. The entropy decoded DCT coefficients are stored in shared buffer memory 36. Upon reaching the synchronization point, Re-Sync Marker Inserter 32 inserts a resync marker into the entropy coded segment of this scan and the compressed bits replaced by the resync marker are stored. This process is repeated until entropy coded segments from each selected scan of an image component have been entropy decoded up to the synchronization point. The MCU/data units present between two synchronization points are reconstructed (de-quantization, inverse point transform, IDCT, and the other operations involved in decoding a JPEG bitstream) and these reconstructed MCU/data units are written to an output buffer. Memory 36, which held the entropy decoded DCT coefficients corresponding to one subset of entropy coded segments of the scans of an image component up to the synchronization point, is reused to hold the entropy decoded DCT coefficients corresponding to the next subset of entropy coded segments of the scans of that image component.

[00104] Search and replace resync marker 31 searches for the inserted resync marker when the remaining portion of the entropy coded segment of a scan is to be decoded. The resync marker is replaced by the stored compressed bits and entropy decoding of the remaining portion of the entropy coded segment of the scan continues until a synchronization point is reached. Upon reaching the synchronization point, a resync marker is inserted into the compressed stream and the compressed bits replaced by the resync marker are stored. This procedure is repeated until all the entropy coded segments of the selected scans of an image component have been decoded.
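A minimal sketch of the marker insertion and restoration bookkeeping described in the two preceding paragraphs is given below. The two-byte marker length, the struct layout and the function names are assumptions for illustration; a real decoder would also have to handle bit alignment and byte stuffing around the synchronization point.

    #include <stdio.h>
    #include <string.h>

    #define RESYNC_MARKER_LEN 2   /* assume a two-byte marker for this sketch */

    typedef struct {
        unsigned char saved[RESYNC_MARKER_LEN]; /* compressed bits replaced */
        size_t        pos;                      /* where the marker went    */
    } ResyncPatch;

    /* Overwrite the bytes at the current synchronization point with a resync
     * marker and remember the original bytes so decoding can resume later.  */
    static void insert_resync(unsigned char *stream, size_t pos,
                              const unsigned char marker[RESYNC_MARKER_LEN],
                              ResyncPatch *patch) {
        memcpy(patch->saved, stream + pos, RESYNC_MARKER_LEN);
        patch->pos = pos;
        memcpy(stream + pos, marker, RESYNC_MARKER_LEN);
    }

    /* Put the stored compressed bits back so that entropy decoding of the
     * remaining portion of the scan can continue past the marker.           */
    static void restore_resync(unsigned char *stream, const ResyncPatch *patch) {
        memcpy(stream + patch->pos, patch->saved, RESYNC_MARKER_LEN);
    }

    int main(void) {
        unsigned char stream[16] = {0};                       /* fake scan data */
        const unsigned char marker[RESYNC_MARKER_LEN] = { 0xFF, 0xD8 }; /* e.g. SOI */
        ResyncPatch patch;
        insert_resync(stream, 4, marker, &patch);
        printf("after insert : %02X %02X\n", stream[4], stream[5]);
        restore_resync(stream, &patch);
        printf("after restore: %02X %02X\n", stream[4], stream[5]);
        return 0;
    }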

[00105] The above procedure is repeated until all available/chosen components of a JPEG image have been reconstructed. The output of the JPEG decoder may go through scaling to be coherent with the display resolution and may go through image post-processing operations to enhance image quality.

[00106] If a progressive JPEG bitstream is being transmitted over a relatively narrow bandwidth network, then only a few scans of each image component will be available to the JPEG decoder at a given instant of time. Hence, for a streaming progressive JPEG bitstream, multiple approximations of a JPEG image will be reconstructed and sent to the display until a complete JPEG bitstream is received by the JPEG decoder.

[00107] If synchronization points are separated by a large number of MCU/data units, then the memory required to decode an image component will increase; if synchronization points are separated by a relatively small number of MCU/data units, then the processing done by scan parser 27, Re-Sync Marker Inserter 32 and Search and replace resync marker 31, which involves searching for and scheduling the next decodable entropy coded segment, will increase. Depending on the memory and computation resources available, an optimum size of entropy coded segment is chosen.

[00108] If decoding processes other than entropy decoding are running on another processing unit, then double buffering is used to make efficient use of the computational resources and to keep all the decoding modules busy decoding the JPEG bitstream. When the de-quantization and IDCT task receives the information, i.e., the synchronization signal, that the decided number of MCU/data units have been entropy decoded, it schedules the entropy decoded DCT coefficients for de-quantization, inverse point transform, IDCT, and the other operations involved in decoding a JPEG bitstream. Hence the intermediate memory required to store DCT coefficients lies in the range from 2*Max_Number_of_Data_Units present between 2 synchronization points to 2*Max_Number_of_Non_Zero_Data_Units present between 2 synchronization points.
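A serialized sketch of the double-buffering pattern referred to in paragraph [00108] is shown below. It is an illustration only: the buffer size, the stub producer/consumer functions and the names (DoubleBuffer, entropy_decode_into, dequant_idct_from) are assumptions, and in a real system the consumer stage would run concurrently on another processing unit, triggered by the synchronization signal; this factor of two in buffering is where the 2* terms in the range above come from.

    #define COEFFS_PER_SEGMENT (64 * 32)   /* example: 32 data units per segment */

    typedef struct {
        short coeffs[2][COEFFS_PER_SEGMENT];  /* the two halves of the buffer */
        int   fill;                           /* index of the half being filled */
    } DoubleBuffer;

    /* Stub stages. */
    static void entropy_decode_into(short *dst)    { (void)dst; }
    static void dequant_idct_from(const short *src){ (void)src; }

    static void decode_with_double_buffer(DoubleBuffer *db, int n_segments) {
        int have_pending = 0;
        for (int seg = 0; seg < n_segments; seg++) {
            short *cur = db->coeffs[db->fill];
            entropy_decode_into(cur);              /* producer fills one half   */
            if (have_pending)                      /* consumer drains the other */
                dequant_idct_from(db->coeffs[db->fill ^ 1]);
            have_pending = 1;
            db->fill ^= 1;                         /* swap halves for next segment */
        }
        if (have_pending)                          /* drain the last segment    */
            dequant_idct_from(db->coeffs[db->fill ^ 1]);
    }

    int main(void) {
        static DoubleBuffer db;    /* zero-initialised */
        decode_with_double_buffer(&db, 6);
        return 0;
    }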

[00109] A record of zero data units in the scans is kept separately. During entropy decoding and reconstruction, this record is consulted to avoid reparsing of scans if a non-uniform number of MCU/data units is present between two synchronization points.

[00110] The above process can either decode one component at a time (individual component decode) or decode all image components simultaneously. Not all regions of a JPEG image are of immediate and simultaneous importance to the user. First, a user might be interested in only some portions of a JPEG image; it is a waste of computational resources to decode unwanted regions, so decoding of those regions can be skipped. Secondly, the resolution of a display device may be much less than the resolution of the JPEG image, in which case one has to either downscale or crop the reconstructed image; it is again a waste of computational resources to decode unwanted regions and then downscale or crop them away. Finally, region of interest decoding of image quality can be used according to a particular use case, one example being the decoding and display of approximations of a JPEG image in which certain regions are reconstructed with lower quality while relatively higher quality is maintained in other regions.

The regions of interest can be specified in many ways using resync markers. Resync markers enable efficient decode of regions of interest.

A default region of interest is generally centered in an image. One way to choose a default region of interest is to divide the image into a 4x4 array of regions, as shown in Fig. 15. Dark region 42 depicts the region of interest.
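As an illustration only, a centered default region of interest over such a 4x4 grid might be computed as follows; the choice of the 2x2 centre block and the example image size are assumptions, not taken from the figure.

    #include <stdio.h>

    int main(void) {
        int img_w = 1920, img_h = 1080;      /* example image size            */
        int cell_w = img_w / 4, cell_h = img_h / 4;

        int roi_x = cell_w;                  /* start of grid column 1 of 0..3 */
        int roi_y = cell_h;                  /* start of grid row 1 of 0..3    */
        int roi_w = 2 * cell_w;              /* the two centre columns         */
        int roi_h = 2 * cell_h;              /* the two centre rows            */

        printf("default ROI: x=%d y=%d w=%d h=%d\n", roi_x, roi_y, roi_w, roi_h);
        return 0;
    }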

A user guided region of interest can be supplied as user input, which specifies regions of interest to the JPEG decoder via mouse, keyboard, touch screen, or the like.

An explicitly specified region of interest can also be used. As shown in Fig. 9 and Fig. 16, the system segments the JPEG bitstream into multiple regions by placing restart markers in the bitstream. Explicit commands in the JPEG bitstream indicate the various regions and the various actions to be performed for these regions. Application data segment 44, as referenced in the before mentioned JPEG standard, is used to indicate regions of interest and the actions to be performed for these regions. These can include application data marker (APPn) 46, which marks the beginning of an application data segment, and application data segment length (Lp) 48, which specifies the length of the application data segment carried in application data bytes (Api) 50. The interpretation of Api 50 is left to the application. As specified in the JPEG standard, application data marker (APPn) 46 takes values from X'FFE0' through X'FFEF', with the values represented in hexadecimal. The application data segment is inserted in the JPEG bitstream in a Tables or Miscellaneous Marker Segment as shown in Fig. 6 and Fig. 7.

[00116] This embodiment uses the resync markers and the region of interest information carried in the bitstream to locate the regions which have to go through the decoding process and the manner in which they are to be decoded, and thereby efficiently decodes the regions of interest.

Application Data Segments are shown in the following table:

    APPn  |  Lp  |  Action/Operation Byte

Application data bytes 50 are preferably inserted in the following order.

• An Application_specific_marker/ReSync Marker, which comprises a 2, 4 or 8 byte marker, is inserted to ascertain that the subsequent bytes belong to a particular application. Depending on the JPEG bitstream being decoded, the Application_specific_marker can be any of the following: a Start of Image (SOI) marker, i.e., X'FFD8'; an End of Image (EOI) marker, i.e., X'FFD9'; the Reserved Markers, i.e., X'FF01', X'FF02' through X'FFBF'; a Start of Frame (SOF) marker, i.e., X'FFC0' to X'FFCF'; etc.

• Region of interest coordinates comprise 2 bytes for the top-left MCU/data unit address of the region of interest, 2 bytes for the region of interest width in units of MCU/data units, and 2 bytes for the region of interest height in units of MCU/data units.

• Actions/operations for regions comprise 1 byte, representing low, medium, or high importance.

Thus, the memory required for specifying a preferred region of interest in a bitstream for this embodiment comprises application_specific_marker_bytes (2, 4, or 8 bytes) + 7 bytes per region to specify regions of interest.
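A sketch of how such an application data segment might be assembled is shown below. The APPn value, the use of a 2-byte SOI-style application specific marker, the big-endian packing and the function names (build_roi_app_segment, put_u16) are illustrative assumptions; only the field order and sizes follow the description above.

    #include <stdint.h>
    #include <stdio.h>
    #include <stddef.h>

    static size_t put_u16(uint8_t *p, uint16_t v) {   /* big-endian, as in JPEG */
        p[0] = (uint8_t)(v >> 8);
        p[1] = (uint8_t)(v & 0xFF);
        return 2;
    }

    /* Returns the number of bytes written into buf (buf must hold >= 13 bytes). */
    static size_t build_roi_app_segment(uint8_t *buf,
                                        uint16_t app_marker,   /* e.g. 0xFFD8 (SOI) */
                                        uint16_t top_left_mcu,
                                        uint16_t roi_w_mcu,
                                        uint16_t roi_h_mcu,
                                        uint8_t  action)       /* low/medium/high   */
    {
        size_t n = 0;
        n += put_u16(buf + n, 0xFFE0);                 /* APPn marker (assumed)     */
        n += put_u16(buf + n, 0);                      /* Lp, patched below         */
        n += put_u16(buf + n, app_marker);             /* 2-byte application marker */
        n += put_u16(buf + n, top_left_mcu);           /* ROI coordinates: address  */
        n += put_u16(buf + n, roi_w_mcu);              /* ROI width in MCUs         */
        n += put_u16(buf + n, roi_h_mcu);              /* ROI height in MCUs        */
        buf[n++] = action;                             /* 1 action/operation byte   */
        put_u16(buf + 2, (uint16_t)(n - 2));           /* Lp counts itself, not APPn */
        return n;
    }

    int main(void) {
        uint8_t buf[32];
        /* Example only: action value 2 is an assumed encoding of "high importance". */
        size_t n = build_roi_app_segment(buf, 0xFFD8, 0x0010, 8, 6, 2);
        for (size_t i = 0; i < n; i++)
            printf("%02X ", buf[i]);
        printf("\n");
        return 0;
    }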

The preferred embodiment completely avoids the use of an intermediate buffer and instead uses a small portion of the output buffer as an intermediate buffer to store a relatively small number of DCT coefficients.

The output buffer size can vary from "pixel_depth_in_bytes x N x M x number_of_color_components" to "pixel_depth_in_bytes x (N x M + N x M x (number_of_color_components - 1) x ¼)" depending on the output color format. Output color formats can be YUV 4:4:4, YUV 4:2:2, YUV 4:2:0, CMYK, etc. Normally YUV 4:4:4, YUV 4:2:2 and YUV 4:2:0 are used by digital cameras, and the content present on the internet is largely in these color formats. Pixel_depth_in_bytes can be 1 byte or 2 bytes depending on whether the input sample precision is 8 bpp (bits per pixel) or 12 bpp.
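For illustration, the two bounding cases of this output buffer calculation can be written out as below; the image dimensions and component count are example values only.

    #include <stdio.h>

    /* 4:4:4: no chroma sub-sampling. */
    static long out_buf_444(long N, long M, int comps, int pix_bytes) {
        return (long)pix_bytes * N * M * comps;
    }

    /* 4:2:0: N x M luma plus (comps - 1) chroma planes at 1/4 resolution. */
    static long out_buf_420(long N, long M, int comps, int pix_bytes) {
        return (long)pix_bytes * (N * M + N * M * (comps - 1) / 4);
    }

    int main(void) {
        long N = 1920, M = 1080;        /* example dimensions           */
        int comps = 3, pix_bytes = 1;   /* YCbCr, 8 bpp sample precision */
        printf("4:4:4 output buffer: %ld bytes\n", out_buf_444(N, M, comps, pix_bytes));
        printf("4:2:0 output buffer: %ld bytes\n", out_buf_420(N, M, comps, pix_bytes));
        return 0;
    }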

Sub-sampled reconstructed approximations of a JPEG image require a buffer which is at most the size of the output buffer. The intermediate buffer required for storing frequency coefficients for decoding sub-sampled approximations of a JPEG image varies from "Intermediate_sample_bit_depth_in_bytes x N x M x number_of_color_components x ¼" to "¼ x Intermediate_sample_bit_depth_in_bytes x (N x M + N x M x (number_of_color_components - 1) x ¼)", which is clearly less than the total output buffer memory available.

"Intermediate_sample_bit_depth_in_bytes" can be 1 byte or 2 bytes depending on the dynamic range of entropy decoded DCT coefficient.

Sub-sampled reconstructed approximations of a JPEG image and the corresponding entropy decoded DCT coefficients can be easily stored in the output buffer. In the worst case only ¾ of the output buffer (¼ for storing the sub-sampled reconstruction and ½ for storing the corresponding DCT coefficients) will be used for simultaneously storing frequency coefficients and reconstructed outputs. The worst case is defined as each frequency coefficient of each color component requiring 16 bits of storage and each reconstructed sample of each color component requiring 8 bits of storage.

Consequently, sub-sampled approximations of a JPEG image can be reconstructed and sent to the display without overwriting decoded frequency coefficients, while ¼ of the output buffer memory is still available for decoding the progressive JPEG bitstream being received.
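The worst-case partition described above can be illustrated numerically; the dimensions and component count below are example assumptions.

    #include <stdio.h>

    int main(void) {
        long N = 1920, M = 1080;                         /* example image size   */
        int comps = 3;
        long out_buf   = N * M * comps;                  /* 8 bpp, 4:4:4 output  */
        long recon     = (N / 2) * (M / 2) * comps * 1;  /* 8-bit reconstruction */
        long coeffs    = (N / 2) * (M / 2) * comps * 2;  /* 16-bit coefficients  */
        long remaining = out_buf - recon - coeffs;

        printf("reconstruction : %ld bytes (%.0f%%)\n", recon,     100.0 * recon     / out_buf);
        printf("coefficients   : %ld bytes (%.0f%%)\n", coeffs,    100.0 * coeffs    / out_buf);
        printf("still free     : %ld bytes (%.0f%%)\n", remaining, 100.0 * remaining / out_buf);
        return 0;
    }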

[00123] After sub-sampled approximations of a JPEG image have been reconstructed and sent to display, the next step is to reconstruct the final JPEG image to be sent to display.

[00124] Display of the final image is not required until the complete image has been decoded. For reconstruction of a final JPEG image, a small buffer along with the output buffer can be used to temporarily store the DCT coefficients of an image component. The following is the sequence of operations performed for decoding a progressively encoded JPEG file (a sketch of this sequence follows the list).

1. At one time only one image component is decoded.

2. Depending on the color format and other attributes of the JPEG bitstream, a decoding order of image components is decided such that this decoding order requires a minimal amount of intermediate memory.

3. Decoding of the next image component is started only after decoding of the current image component is finished.

4. Entropy decode the remaining scans of an image component in a segment-wise manner. The remaining scans are the scans whose DCT coefficients were not stored during reconstruction of approximations of the JPEG image.

5. A small buffer along with the output buffer is big enough to store all the required DCT coefficients of a segment of one image component even if most of the frequency coefficients require 16 bits.

6. Store the quantized frequency coefficients, i.e., DCT coefficients, in the output buffer. Almost all quantized DCT coefficients of 8 bit sample precision can be stored in 8 bits due to quantization. If some frequency coefficients require 16 bits, then this can be known from (i) the input sample precision and (ii) the quantization table used for a particular component.

7. Defer performing the rest of the decoding processes, such as inverse quantization, IDCT, color format conversion, etc., until the quantized DCT coefficients from all scans of a MCU/data unit segment of a particular image component become available.

8. Decode each MCU/data unit of a segment of an image component and store the decoded MCU in the output buffer.

9. Repeat steps 1 to 8 until all MCU/data unit segments of an image component are decoded.

10. Repeat steps 1 to 9 until all image components are decoded.
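The sketch below shows the nesting of this sequence as it might look in code; the loop structure mirrors steps 1-10 above, while everything else (function names, segment and scan counts, the printed trace) is a stubbed assumption.

    #include <stdio.h>

    /* Stubs standing in for the real decoder stages; the counts are examples. */
    static int  segments_in_component(int comp)  { (void)comp; return 4; }
    static int  remaining_scans(int comp)        { (void)comp; return 3; }
    static void entropy_decode_segment(int comp, int scan, int seg) {
        printf("  entropy decode: comp %d, scan %d, segment %d\n", comp, scan, seg);
    }
    static void reconstruct_mcu_segment(int comp, int seg) {
        printf("  reconstruct  : comp %d, segment %d\n", comp, seg);
    }

    int main(void) {
        int n_components = 3;
        /* One image component at a time (steps 1-3); order chosen elsewhere. */
        for (int comp = 0; comp < n_components; comp++) {
            printf("component %d\n", comp);
            for (int seg = 0; seg < segments_in_component(comp); seg++) {
                /* Entropy decode this segment of every remaining scan before
                 * any inverse quantization / IDCT is performed (steps 4-7).  */
                for (int scan = 0; scan < remaining_scans(comp); scan++)
                    entropy_decode_segment(comp, scan, seg);
                /* All coefficients of this MCU/data unit segment are now
                 * available: finish decoding and store the MCUs (step 8).    */
                reconstruct_mcu_segment(comp, seg);
            }
            /* Steps 9-10: next segment / next component only when done.      */
        }
        return 0;
    }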

[00125] Instead of using an IDCT computation to upscale approximations of a JPEG image, the preferred embodiment uses a software/hardware up-scaling operation. Fig. 17 shows that a JPEG image of N x M size is reconstructed by using only N/(2^n) x M/(2^m) frequency coefficients. This division of the work flow is efficient compared with a core operation that inherently performs the up-scaling in addition to the data processing operation, such as decoding via the prior art progressive JPEG decoder. Here, up-scaling operation 46 is performed by a separate module; hence JPEG decoder 50 requires less computation to decode progressive JPEG bitstream 22, resulting in a faster progressive JPEG decode in real time.

[00126] Sub-sampled approximations of a JPEG image are up-scaled using software/hardware up-scaling operation 46. A software up-scaling operation, if used, preferably runs on a separate processor. Optionally, image enhancement operations can be performed before or after up-scaling. These operations can be de-ringing, de-blocking, color correction, white balance, etc.

[00127] The preferred embodiment proposes to decode one image component in its entirety before it progresses to decode the other image components. The JPEG standard mandates that all AC DCT coefficients of an image component of a progressive JPEG image be encoded in a non-interleaved mode, i.e., the AC DCT coefficients of each image component shall be present as a separate scan in the JPEG bitstream. For example, suppose N image components are present in a progressively encoded JPEG bitstream. The preferred embodiment chooses to decode all DCT coefficients of one image component before it progresses to decode the DCT coefficients of the next image component.

[00128] The strategy of decoding one image component at a time reduces the intermediate memory required to decode a progressively coded JPEG bitstream by a factor of N, because only the frequency coefficients, i.e., DCT coefficients, of one component have to be stored. The progressive mode of JPEG allows only the DC coefficients of all image components to be encoded in interleaved mode. Storage of the DC coefficients of the other image components typically requires (N-1)*W*H*2/64 bytes of memory. The number of image components in an image is usually in single digits. Since only one entropy coded segment of an image component is decoded before proceeding to the next entropy coded segment of this or another image component, the memory required to store DCT coefficients is proportional to the number of data units contained between 2 synchronization points.
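A numeric illustration of the DC coefficient storage estimate quoted above is given below; the image size and component count are example assumptions.

    #include <stdio.h>

    int main(void) {
        long W = 1920, H = 1080;   /* image dimensions           */
        int  N = 3;                /* number of image components */
        /* (N-1) components, one 16-bit DC value per 8x8 data unit. */
        long dc_bytes = (long)(N - 1) * W * H * 2 / 64;
        printf("DC storage for the other components: %ld bytes\n", dc_bytes);
        return 0;
    }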

[00129] If the system is decoding a JPEG bitstream contained in a file, then all scans of an image component are entropy decoded up to a synchronization point and are scheduled for the rest of the decoding processes, i.e., the decoding processes performed after entropy decoding, once all quantized frequency/DCT coefficients of all the data units of the image component between the two synchronization points become available, the number of data units being equal to the minimum restart interval among all the current restart intervals of all the scans. This helps in reducing the computation resources (IDCT, memory bandwidth, data copy, color format conversion, etc.) required to decode and display an image.

[00130] The preferred embodiment decodes a streaming progressive JPEG bitstream, which is being delivered over the internet, as discussed below.

In an all component sub-sampled progressive JPEG decode, sub-sampled approximations of the JPEG image are decoded, reconstructed and sent for display. Sub-sampled approximations of the JPEG image are decoded in a component-wise manner, i.e., a single component is decoded at a time and the approximation of the next component is scheduled for decoding only after the approximation of the previous component has been decoded. The sub-sampled approximations are upscaled 46. Once approximations of all image components become available, they are sent for display as an all component approximation of the JPEG image. After the N/2 x M/2 approximation of the JPEG image has been decoded and displayed, the entire N x M JPEG image is decoded and displayed next. Decode of the complete JPEG bitstream is again performed in a component-wise manner. An image component is scheduled for decoding in its entirety. The complete data units of the current image component present between two synchronization points are scheduled for the rest of the decoding processes once all DCT coefficients of these MCU/data units, i.e., the number of data units which correspond to the minimum restart interval among all the current restart intervals of the current image component, become available. This process is repeated until all MCU/data unit segments of the image component are reconstructed. Decoding of the next image component is not started until the current image component has been decoded in its entirety. This process is repeated for the rest of the image components until all image components are reconstructed. The image for display can be updated when the next image component becomes available for display, i.e., all image components are displayed in succession one after the other, or all image components can be displayed at one time. As a result, the user may experience a gradual build-up of the all component JPEG image, i.e., display of a coarse approximation of the image followed by display of the final image.

In a single component sub-sampled progressive decode, the DC coefficients of all image components are reconstructed, and the image reconstructed using the DC coefficients is sent to the display. Luminance scans (if the image was coded in the YCbCr color format) are then scheduled for decoding and display. Sub-sampled approximations of the luminance component of the JPEG image are then decoded 50, reconstructed and sent for display. Next, the luminance image component is scheduled for decoding in its entirety. The complete data units of the current image component present between two synchronization points are scheduled for the rest of the decoding processes once all DCT coefficients of these MCU/data units, i.e., the number of data units which correspond to the minimum restart interval among all the current restart intervals of the current image component, become available. This process is repeated until all MCU/data unit segments of the image component are reconstructed. Decoding of the next image component is not started until the current image component, i.e., all and complete MCU/data units of the current image component, has been decoded in its entirety. This process is repeated for the rest of the image components until all image components are reconstructed. The image can be updated when the next image component becomes available for display, i.e., all image components are displayed in succession one after the other, or all the image components are displayed at one go together with the already displayed luminance component. As a result, the user may experience a gradual build-up of one component, i.e., the luminance component, as an approximation of the JPEG image, followed by a component by component build-up of the JPEG image.

In a decode of a successive approximation encoded progressive bitstream, the most significant bits of the frequency coefficients are decoded in a component-wise manner, i.e., a single component is decoded at a time and the approximation of the next component is scheduled for decoding only after the approximation of the previous component has been decoded. If memory falls short during reconstruction of approximations of the JPEG image, then the preferred embodiment scheme of reconstructing an image in multiple steps, i.e., reconstruction of the available frequency coefficients of the data units of the current image component present between two synchronization points, is employed. Once approximations of all image components become available, they are sent for display as an all component approximation of the JPEG image. After four to five most significant bits (MSBs) of the frequency coefficients have been decoded and displayed for an 8 bpp image, all the bits of the frequency coefficients of the complete N x M JPEG image are decoded and displayed. Decode of the complete JPEG bitstream is again performed in a component-wise manner. An image component is scheduled for decoding in its entirety. The complete data units of the current image component present between two synchronization points are scheduled for the rest of the decoding processes once all DCT coefficients of these MCU/data units, i.e., the number of data units which correspond to the minimum restart interval among all the current restart intervals of the current image component, become available. This process is repeated until all MCU/data unit segments of the image component are reconstructed. Decoding of the next image component is not started until the current image component, i.e., all and complete MCU/data units of the current image component, has been decoded in its entirety. This process is repeated for the rest of the image components until all image components are reconstructed. The image for display can be updated when the next image component becomes available for display, i.e., all image components are displayed in succession one after the other, or all image components can be displayed at one go. As a result, the user may experience a gradual build-up of the all component JPEG image, i.e., display of a coarse approximation of the image followed by display of the final image.

In an all component progressive JPEG decode, the available frequency coefficients, which may be contained in multiple scans, are decoded in a component-wise manner, i.e., a single component is decoded at a time. The data units of the current image component present between two synchronization points are scheduled for the rest of the decoding processes once the available DCT coefficients of these MCU/data units, i.e., the number of data units which correspond to the minimum restart interval among all the current restart intervals of the current image component, become available. This process is repeated until all MCU/data unit segments of the image component are reconstructed. Decoding of the next image component is not started until the current image component, i.e., all MCU/data units of the current image component, has been decoded in its entirety. This process is repeated for the rest of the image components until approximations of all image components are reconstructed. Once approximations of all image components become available, they are sent for display as an all component approximation of the JPEG image. After the N/2 x M/2 approximation of the JPEG image has been decoded and displayed, the complete N x M JPEG image is decoded and displayed next. Decode of the complete JPEG bitstream is again performed in a component-wise manner. An image component is scheduled for decoding in its entirety. The complete data units of the current image component present between two synchronization points are scheduled for the rest of the decoding processes once all DCT coefficients of these MCU/data units, i.e., the number of data units which correspond to the minimum restart interval among all the current restart intervals of the current image component, become available. This process is repeated until all MCU/data unit segments of the image component are reconstructed. Decoding of the next image component is not started until the current image component has been decoded in its entirety. This process is repeated for the rest of the image components until all image components are reconstructed. The image for display can be updated when the next image component becomes available for display, i.e., all image components are displayed in succession one after the other, or all image components can be displayed at one time. As a result, the user may experience a gradual build-up of the all component JPEG image, i.e., display of a coarse approximation of the image followed by display of the final image.

[00135] As previously indicated, the prior art teaches reconstructing all the components of an image for every approximation of the image that is sent to the display, and simultaneously storing the DCT coefficients of all the image components before starting the reconstruction of each MCU/data unit. As described above, the presently claimed invention reconstructs an image in a significantly different and efficient manner.

[00136] There are various scenarios which require STB, DTV and other multimedia products to support decoding of a progressively encoded JPEG bitstream. The claimed invention is equally applicable in such scenarios. Multimedia content players are required to support decoding of sequentially and progressively encoded JPEG bitstreams. Progressively encoded JPEG content is becoming available on the internet because of the disparity between the growth of bandwidth and the available computation resources. Progressive JPEG is useful when the JPEG decoder is fast relative to the available network bandwidth, so that the complete JPEG bitstream may not be available in real time.

[00137] The embodiments can be used for internet browsing on an STB; decode and display of JPEG images downloaded from the internet/intranet; decode and display of JPEG images shared via storage devices such as a USB stick, a USB hard disk, a SATA hard disk, Flash, etc.; and decode and display of JPEG images shared over a relatively short distance wired or wireless channel such as Bluetooth, Wi-Fi, etc., as STBs and DTVs are getting connected to a home network, an STB/DTV network and the internet.

[00138] The STB solution typically uses a JPEG library for decoding a progressive JPEG bitstream. The JPEG library uses W*H*N*2 bytes of intermediate memory to store DCT coefficients for a progressive JPEG bitstream. In comparison, the preferred embodiment reduces the memory requirement to approximately

• 2*Bytes_per_DCT_coefficient*MCU_Data_Unit_size bytes to 2*Bytes_per_DCT_coefficient*MCU_Data_Unit_size*Max_Number_of_Non_Zero_Data_Units_between_2_Synchronization_Points bytes
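For a sense of scale, the two figures can be compared as below; the image dimensions, component count and the number of non-zero data units between synchronization points are example assumptions.

    #include <stdio.h>

    int main(void) {
        long W = 1920, H = 1080;                 /* example image size          */
        int  N = 3;                              /* image components            */
        int  bytes_per_coeff    = 2;             /* 16-bit DCT coefficients     */
        int  mcu_data_unit_size = 64;            /* coefficients per data unit  */
        int  max_nonzero_units  = 8;             /* between two sync points     */

        long full   = W * H * N * bytes_per_coeff;
        long seg_lo = 2L * bytes_per_coeff * mcu_data_unit_size;
        long seg_hi = 2L * bytes_per_coeff * mcu_data_unit_size * max_nonzero_units;

        printf("full-image coefficient buffer: %ld bytes\n", full);
        printf("segment-based buffer         : %ld .. %ld bytes\n", seg_lo, seg_hi);
        return 0;
    }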

[00139] While various embodiments of the disclosed method and apparatus have been described above, it should be understood that they have been presented by way of example only, and should not limit the claimed invention. Likewise, the various diagrams may depict an example architectural or other configuration for the disclosed method and apparatus. This is done to aid in understanding the features and functionality that can be included in the disclosed method and apparatus. The claimed invention is not restricted to the illustrated example architectures or configurations; rather, the desired features can be implemented using a variety of alternative architectures and configurations. Indeed, it will be apparent to one of skill in the art how alternative functional, logical or physical partitioning and configurations can be used to implement the desired features of the disclosed method and apparatus. Also, a multitude of different constituent module names other than those depicted herein can be applied to the various partitions. Additionally, with regard to flow diagrams, operational descriptions and method claims, the order in which the steps are presented herein shall not mandate that various embodiments be implemented to perform the recited functionality in the same order unless the context dictates otherwise.

[00140] Although the disclosed method and apparatus is described above in terms of various exemplary embodiments and implementations, it should be understood that the various features, aspects and functionality described in one or more of the individual embodiments are not limited in their applicability to the particular embodiment with which they are described. Thus, the breadth and scope of the claimed invention should not be limited by any of the above-described exemplary embodiments.

[00141] Terms and phrases used in this document, and variations thereof, unless otherwise expressly stated, should be construed as open ended as opposed to limiting. As examples of the foregoing: the term "including" should be read as meaning "including, without limitation" or the like; the term "example" is used to provide exemplary instances of the item in discussion, not an exhaustive or limiting list thereof; the terms "a" or "an" should be read as meaning "at least one," "one or more" or the like; and adjectives such as "conventional," "traditional," "normal," "standard," "known" and terms of similar meaning should not be construed as limiting the item described to a given time period or to an item available as of a given time, but instead should be read to encompass conventional, traditional, normal, or standard technologies that may be available or known now or at any time in the future.

Likewise, where this document refers to technologies that would be apparent or known to one of ordinary skill in the art, such technologies encompass those apparent or known to the skilled artisan now or at any time in the future.

[00142] A group of items linked with the conjunction "and" should not be read as requiring each and every one of those items be present in the grouping, but rather should be read as "and/or" unless expressly stated otherwise. Similarly, a group of items linked with the conjunction "or" should not be read as requiring mutual exclusivity among that group, but rather should also be read as "and/or" unless expressly stated otherwise. Furthermore, although items, elements or components of the disclosed method and apparatus may be described or claimed in the singular, the plural is contemplated to be within the scope thereof unless limitation to the singular is explicitly stated.

[00143] The presence of broadening words and phrases such as "one or more," "at least," "but not limited to" or other like phrases in some instances shall not be read to mean that the narrower case is intended or required in instances where such broadening phrases may be absent. The use of the term "module" does not imply that the components or functionality described or claimed as part of the module are all configured in a common package. Indeed, any or all of the various components of a module, whether control logic or other components, can be combined in a single package or separately maintained and can further be distributed in multiple groupings or packages or across multiple locations.

Additionally, the various embodiments set forth herein are described in terms of exemplary block diagrams, flow charts and other illustrations. As will become apparent to one of ordinary skill in the art after reading this document, the illustrated embodiments and their various alternatives can be implemented without confinement to the illustrated examples. For example, block diagrams and their accompanying description should not be construed as mandating a particular architecture or configuration.