Inventors: WANG WEI (US); LIU SHAN (US)
WHAT IS CLAIMED IS:

1. A method of multi-rate neural image compression with stackable nested model structures, the method being performed by at least one processor, and the method comprising: iteratively stacking, on a first prior set of weights of a first neural network corresponding to a prior hyperparameter, a first plurality of sets of weights of a first plurality of stackable neural networks corresponding to a current hyperparameter, wherein the first prior set of weights of the first neural network remains unchanged; encoding an input image to obtain an encoded representation, using the first prior set of weights of the first neural network on which the first plurality of sets of weights of the first plurality of stackable neural networks is stacked; and encoding the obtained encoded representation to determine a compressed representation.

2. The method of claim 1, further comprising: iteratively stacking, on a second prior set of weights of a second neural network corresponding to the prior hyperparameter, a second plurality of sets of weights of a second plurality of stackable neural networks corresponding to the current hyperparameter, wherein the second prior set of weights of the second neural network remains unchanged; decoding the determined compressed representation to determine a recovered representation; and decoding the determined recovered representation to reconstruct an output image, using the second prior set of weights of the second neural network on which the second plurality of sets of weights of the second plurality of stackable neural networks is stacked.

3. The method of claim 2, wherein the first neural network and the second neural network are trained by updating a first initial set of weights of the first neural network and a second initial set of weights of the second neural network, to optimize a rate-distortion loss that is determined based on the input image, the output image and the compressed representation.

4. The method of claim 3, wherein the first neural network and the second neural network are further trained by iteratively stacking, on the first prior set of weights of the first neural network, the first plurality of sets of weights of the first plurality of stackable neural networks corresponding to the current hyperparameter, wherein the first prior set of weights of the first neural network remains unchanged.

5. The method of claim 4, wherein the first neural network and the second neural network are further trained by iteratively stacking, on the second prior set of weights of the second neural network, the second plurality of sets of weights of the second plurality of stackable neural networks corresponding to the current hyperparameter, wherein the second prior set of weights of the second neural network remains unchanged.

6. The method of claim 5, wherein the first neural network and the second neural network are further trained by updating the stacked first plurality of sets of weights of the first plurality of stackable neural networks, and the stacked second plurality of sets of weights of the second plurality of stackable neural networks, to optimize the rate-distortion loss.

7. The method of claim 2, wherein one or more of the first plurality of sets of weights of the first plurality of stackable neural networks and the second plurality of sets of weights of the second plurality of stackable neural networks do not correspond to the current hyperparameter.
8. An apparatus for multi-rate neural image compression with stackable nested model structures, the apparatus comprising: at least one memory configured to store program code; and at least one processor configured to read the program code and operate as instructed by the program code, the program code comprising: first stacking code configured to cause the at least one processor to iteratively stack, on a first prior set of weights of a first neural network corresponding to a prior hyperparameter, a first plurality of sets of weights of a first plurality of stackable neural networks corresponding to a current hyperparameter, wherein the first prior set of weights of the first neural network remains unchanged; first encoding code configured to cause the at least one processor to encode an input image to obtain an encoded representation, using the first prior set of weights of the first neural network on which the first plurality of sets of weights of the first plurality of stackable neural networks is stacked; and second encoding code configured to cause the at least one processor to encode the obtained encoded representation to determine a compressed representation.

9. The apparatus of claim 8, wherein the program code further comprises: second stacking code configured to cause the at least one processor to iteratively stack, on a second prior set of weights of a second neural network corresponding to the prior hyperparameter, a second plurality of sets of weights of a second plurality of stackable neural networks corresponding to the current hyperparameter, wherein the second prior set of weights of the second neural network remains unchanged; first decoding code configured to cause the at least one processor to decode the determined compressed representation to determine a recovered representation; and second decoding code configured to cause the at least one processor to decode the determined recovered representation to reconstruct an output image, using the second prior set of weights of the second neural network on which the second plurality of sets of weights of the second plurality of stackable neural networks is stacked.

10. The apparatus of claim 9, wherein the first neural network and the second neural network are trained by updating a first initial set of weights of the first neural network and a second initial set of weights of the second neural network, to optimize a rate-distortion loss that is determined based on the input image, the output image and the compressed representation.

11. The apparatus of claim 10, wherein the first neural network and the second neural network are further trained by iteratively stacking, on the first prior set of weights of the first neural network, the first plurality of sets of weights of the first plurality of stackable neural networks corresponding to the current hyperparameter, wherein the first prior set of weights of the first neural network remains unchanged.

12. The apparatus of claim 11, wherein the first neural network and the second neural network are further trained by iteratively stacking, on the second prior set of weights of the second neural network, the second plurality of sets of weights of the second plurality of stackable neural networks corresponding to the current hyperparameter, wherein the second prior set of weights of the second neural network remains unchanged.
13. The apparatus of claim 12, wherein the first neural network and the second neural network are further trained by updating the stacked first plurality of sets of weights of the first plurality of stackable neural networks, and the stacked second plurality of sets of weights of the second plurality of stackable neural networks, to optimize the rate-distortion loss.

14. The apparatus of claim 9, wherein one or more of the first plurality of sets of weights of the first plurality of stackable neural networks and the second plurality of sets of weights of the second plurality of stackable neural networks do not correspond to the current hyperparameter.

15. A non-transitory computer-readable medium storing instructions that, when executed by at least one processor for multi-rate neural image compression with stackable nested model structures, cause the at least one processor to: iteratively stack, on a first prior set of weights of a first neural network corresponding to a prior hyperparameter, a first plurality of sets of weights of a first plurality of stackable neural networks corresponding to a current hyperparameter, wherein the first prior set of weights of the first neural network remains unchanged; encode an input image to obtain an encoded representation, using the first prior set of weights of the first neural network on which the first plurality of sets of weights of the first plurality of stackable neural networks is stacked; and encode the obtained encoded representation to determine a compressed representation.

16. The non-transitory computer-readable medium of claim 15, wherein the instructions, when executed by the at least one processor, further cause the at least one processor to: iteratively stack, on a second prior set of weights of a second neural network corresponding to the prior hyperparameter, a second plurality of sets of weights of a second plurality of stackable neural networks corresponding to the current hyperparameter, wherein the second prior set of weights of the second neural network remains unchanged; decode the determined compressed representation to determine a recovered representation; and decode the determined recovered representation to reconstruct an output image, using the second prior set of weights of the second neural network on which the second plurality of sets of weights of the second plurality of stackable neural networks is stacked.

17. The non-transitory computer-readable medium of claim 16, wherein the first neural network and the second neural network are trained by updating a first initial set of weights of the first neural network and a second initial set of weights of the second neural network, to optimize a rate-distortion loss that is determined based on the input image, the output image and the compressed representation.

18. The non-transitory computer-readable medium of claim 17, wherein the first neural network and the second neural network are further trained by iteratively stacking, on the first prior set of weights of the first neural network, the first plurality of sets of weights of the first plurality of stackable neural networks corresponding to the current hyperparameter, wherein the first prior set of weights of the first neural network remains unchanged.
19. The non-transitory computer-readable medium of claim 18, wherein the first neural network and the second neural network are further trained by iteratively stacking, on the second prior set of weights of the second neural network, the second plurality of sets of weights of the second plurality of stackable neural networks corresponding to the current hyperparameter, wherein the second prior set of weights of the second neural network remains unchanged.

20. The non-transitory computer-readable medium of claim 19, wherein the first neural network and the second neural network are further trained by updating the stacked first plurality of sets of weights of the first plurality of stackable neural networks, and the stacked second plurality of sets of weights of the second plurality of stackable neural networks, to optimize the rate-distortion loss.
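Purely as an illustration, and not as part of the claims or of the disclosed embodiments, the following is a minimal sketch of one way the stackable nested weight structure recited in claims 1, 8 and 15 could be organized. PyTorch is assumed, and every class, function and argument name below is hypothetical: the sketch simply keeps one set of convolution filters per hyperparameter level, freezes the sets belonging to prior levels, and lets a higher level reuse the lower-level computation by concatenating only its extra channels.

import torch
import torch.nn as nn

class StackableConv(nn.Module):
    # One convolution layer whose filters are stacked level by level
    # (hypothetical realization of a stackable nested model structure).
    def __init__(self, in_ch, out_ch_per_level, kernel_size=3):
        super().__init__()
        self.levels = nn.ModuleList()   # one set of weights per hyperparameter
        self.in_ch = in_ch
        self.out_ch_per_level = out_ch_per_level
        self.kernel_size = kernel_size

    def stack_level(self):
        # Freeze every previously stacked set of weights (the prior sets),
        # then append a new trainable set for the current hyperparameter.
        for conv in self.levels:
            for p in conv.parameters():
                p.requires_grad = False
        self.levels.append(nn.Conv2d(self.in_ch, self.out_ch_per_level,
                                     self.kernel_size,
                                     padding=self.kernel_size // 2))

    def forward(self, x, level):
        # Higher-rate compression reuses the lower-rate computation and only
        # adds the channels contributed by the newly stacked weight sets.
        outs = [self.levels[i](x) for i in range(level + 1)]
        return torch.cat(outs, dim=1)

layer = StackableConv(in_ch=3, out_ch_per_level=16)
layer.stack_level()                                    # prior hyperparameter
layer.stack_level()                                    # current hyperparameter
features = layer(torch.rand(1, 3, 32, 32), level=1)    # 32 channels in total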
Typically, multiple epochs of iterations are taken to optimize the R-D loss in this weight update process, e.g., until a maximum iteration number is reached or until the loss converges.

[0056] Compared with previous end-to-end (E2E) image compression methods, the embodiments of FIGS. 3 and 4 may use only one model instance, together with stackable nested model structures and a training framework to learn that model instance, to achieve a multi-rate compression effect. Accordingly, the embodiments may largely reduce the deployment storage needed for multi-rate compression and provide a flexible framework that accommodates various types of NIC models. Further, because of the nested network structure, higher-bitrate compression can reuse the computation performed for lower-bitrate compression, which saves computation in multi-rate compression.

[0057] FIG. 5 is a flowchart of a method 500 of multi-rate neural image compression with stackable nested model structures, according to embodiments.

[0058] In some implementations, one or more process blocks of FIG. 5 may be performed by the platform 120. In some implementations, one or more process blocks of FIG. 5 may be performed by another device or a group of devices separate from or including the platform 120, such as the user device 110.

[0059] As shown in FIG. 5, in operation 510, the method 500 includes iteratively stacking, on a first prior set of weights of a first neural network corresponding to a prior hyperparameter, a first plurality of sets of weights of a first plurality of stackable neural networks corresponding to a current hyperparameter. The first prior set of weights of the first neural network remains unchanged.

[0060] In operation 520, the method 500 includes encoding an input image to obtain an encoded representation, using the first prior set of weights of the first neural network on which the first plurality of sets of weights of the first plurality of stackable neural networks is stacked.

[0061] In operation 530, the method 500 includes encoding the obtained encoded representation to determine a compressed representation.

[0062] Although FIG. 5 shows example blocks of the method 500, in some implementations, the method 500 may include additional blocks, fewer blocks, different blocks, or differently arranged blocks than those depicted in FIG. 5. Additionally, or alternatively, two or more of the blocks of the method 500 may be performed in parallel.

[0063] FIG. 6 is a block diagram of an apparatus 600 for multi-rate neural image compression with stackable nested model structures, according to embodiments.

[0064] As shown in FIG. 6, the apparatus 600 includes first stacking code 610, first encoding code 620 and second encoding code 630.

[0065] The first stacking code 610 is configured to cause at least one processor to iteratively stack, on a first prior set of weights of a first neural network corresponding to a prior hyperparameter, a first plurality of sets of weights of a first plurality of stackable neural networks corresponding to a current hyperparameter, wherein the first prior set of weights of the first neural network remains unchanged.

[0066] The first encoding code 620 is configured to cause the at least one processor to encode an input image to obtain an encoded representation, using the first prior set of weights of the first neural network on which the first plurality of sets of weights of the first plurality of stackable neural networks is stacked.
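Before turning to the second encoding code 630, the following is a rough, non-normative sketch of operations 510 to 530. PyTorch is again assumed; the function names, the residual way in which each stackable network is attached to the frozen prior encoder, and the use of rounding as a stand-in for quantization and entropy coding are assumptions made for brevity, not details taken from the disclosure.

import torch
import torch.nn as nn

def stack_weight_sets(prior_encoder, stackable_nets):
    # Operation 510: keep the prior set of weights unchanged (frozen) and
    # iteratively stack the weight sets of the stackable networks on it.
    for p in prior_encoder.parameters():
        p.requires_grad = False
    return nn.ModuleList([prior_encoder, *stackable_nets])

def encode(image, stacked):
    # Operation 520: encode the input image with the prior weights plus the
    # stacked weight sets to obtain the encoded representation y.
    y = stacked[0](image)
    for extra in stacked[1:]:
        y = y + extra(y)                 # each stacked set refines y in turn
    # Operation 530: encode y into a compressed representation (rounding
    # stands in here for quantization followed by entropy coding).
    return torch.round(y)

prior_encoder = nn.Sequential(
    nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ReLU(),
    nn.Conv2d(32, 8, 5, stride=2, padding=2))
stacked = stack_weight_sets(
    prior_encoder, [nn.Conv2d(8, 8, 3, padding=1) for _ in range(2)])
compressed = encode(torch.rand(1, 3, 64, 64), stacked)   # 1 x 8 x 16 x 16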
[0067] The second encoding code 630 is configured to cause the at least one processor to encode the obtained encoded representation to determine a compressed representation.

[0068] FIG. 7 is a flowchart of a method 700 of multi-rate neural image decompression with stackable nested model structures, according to embodiments.

[0069] In some implementations, one or more process blocks of FIG. 7 may be performed by the platform 120. In some implementations, one or more process blocks of FIG. 7 may be performed by another device or a group of devices separate from or including the platform 120, such as the user device 110.

[0070] As shown in FIG. 7, in operation 710, the method 700 includes iteratively stacking, on a second prior set of weights of a second neural network corresponding to the prior hyperparameter, a second plurality of sets of weights of a second plurality of stackable neural networks corresponding to the current hyperparameter. The second prior set of weights of the second neural network remains unchanged.

[0071] In operation 720, the method 700 includes decoding the determined compressed representation to determine a recovered representation.

[0072] In operation 730, the method 700 includes decoding the determined recovered representation to reconstruct an output image, using the second prior set of weights of the second neural network on which the second plurality of sets of weights of the second plurality of stackable neural networks is stacked.

[0073] The first neural network and the second neural network may be trained by: updating a first initial set of weights of the first neural network and a second initial set of weights of the second neural network, to optimize a rate-distortion loss that is determined based on the input image, the output image and the compressed representation; iteratively stacking, on the first prior set of weights of the first neural network, the first plurality of sets of weights of the first plurality of stackable neural networks corresponding to the current hyperparameter, wherein the first prior set of weights of the first neural network remains unchanged; iteratively stacking, on the second prior set of weights of the second neural network, the second plurality of sets of weights of the second plurality of stackable neural networks corresponding to the current hyperparameter, wherein the second prior set of weights of the second neural network remains unchanged; and updating the stacked first plurality of sets of weights of the first plurality of stackable neural networks, and the stacked second plurality of sets of weights of the second plurality of stackable neural networks, to optimize the rate-distortion loss. Additional neural networks can be stacked iteratively in the same manner. The first prior set of weights of the first neural network remains unchanged throughout the iterative stacking process.

[0074] One or more of the first plurality of sets of weights of the first plurality of stackable neural networks and the second plurality of sets of weights of the second plurality of stackable neural networks may not correspond to the current hyperparameter.

[0075] Although FIG. 7 shows example blocks of the method 700, in some implementations, the method 700 may include additional blocks, fewer blocks, different blocks, or differently arranged blocks than those depicted in FIG. 7. Additionally, or alternatively, two or more of the blocks of the method 700 may be performed in parallel.
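The decoder-side operations 710 to 730 and the training update of paragraph [0073] can be sketched in the same spirit. PyTorch is assumed, and the MSE distortion, the crude rate proxy, and the fixed trade-off weight below are placeholders, since the disclosure only states that the rate-distortion loss depends on the input image, the output image and the compressed representation; consistent with the description, only the newly stacked weight sets receive gradient updates while the prior decoder weights stay frozen.

import torch
import torch.nn as nn

# Frozen prior decoder (corresponding to the prior hyperparameter).
prior_decoder = nn.Sequential(
    nn.ConvTranspose2d(8, 32, 5, stride=2, padding=2, output_padding=1),
    nn.ReLU(),
    nn.ConvTranspose2d(32, 3, 5, stride=2, padding=2, output_padding=1))
for p in prior_decoder.parameters():
    p.requires_grad = False        # operation 710: prior weights stay unchanged

# Stackable weight sets for the current hyperparameter (trainable).
stacked_dec = nn.ModuleList([nn.Conv2d(8, 8, 3, padding=1) for _ in range(2)])

def decode(compressed):
    # Operation 720: an entropy decoder would recover the representation here
    # (identity in this sketch).
    recovered = compressed
    # Operation 730: apply the stacked weight sets, then the frozen prior decoder.
    for extra in stacked_dec:
        recovered = recovered + extra(recovered)
    return prior_decoder(recovered)

# One training step in the spirit of paragraph [0073]: only the stacked
# weight sets are updated to optimize the rate-distortion loss.
optimizer = torch.optim.Adam(stacked_dec.parameters(), lr=1e-4)
x = torch.rand(1, 3, 64, 64)             # input image
compressed = torch.rand(1, 8, 16, 16)    # stand-in for the encoder output
x_hat = decode(compressed)               # reconstructed output image
distortion = nn.functional.mse_loss(x_hat, x)
rate = compressed.abs().mean()           # placeholder for the bitrate term
loss = distortion + 0.01 * rate          # assumed fixed trade-off weighting
optimizer.zero_grad()
loss.backward()
optimizer.step()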
[0076] FIG. 8 is a block diagram of an apparatus 800 for multi-rate neural image decompression with stackable nested model structures, according to embodiments.

[0077] As shown in FIG. 8, the apparatus 800 includes second stacking code 810, first decoding code 820 and second decoding code 830.

[0078] The second stacking code 810 is configured to cause the at least one processor to iteratively stack, on a second prior set of weights of a second neural network corresponding to the prior hyperparameter, a second plurality of sets of weights of a second plurality of stackable neural networks corresponding to the current hyperparameter, wherein the second prior set of weights of the second neural network remains unchanged.

[0079] The first decoding code 820 is configured to cause the at least one processor to decode the determined compressed representation to determine a recovered representation.

[0080] The second decoding code 830 is configured to cause the at least one processor to decode the determined recovered representation to reconstruct an output image, using the second prior set of weights of the second neural network on which the second plurality of sets of weights of the second plurality of stackable neural networks is stacked.

[0081] The first neural network and the second neural network may be trained by: updating a first initial set of weights of the first neural network and a second initial set of weights of the second neural network, to optimize a rate-distortion loss that is determined based on the input image, the output image and the compressed representation; iteratively stacking, on the first prior set of weights of the first neural network, the first plurality of sets of weights of the first plurality of stackable neural networks corresponding to the current hyperparameter, wherein the first prior set of weights of the first neural network remains unchanged; iteratively stacking, on the second prior set of weights of the second neural network, the second plurality of sets of weights of the second plurality of stackable neural networks corresponding to the current hyperparameter, wherein the second prior set of weights of the second neural network remains unchanged; and updating the stacked first plurality of sets of weights of the first plurality of stackable neural networks, and the stacked second plurality of sets of weights of the second plurality of stackable neural networks, to optimize the rate-distortion loss.

[0082] One or more of the first plurality of sets of weights of the first plurality of stackable neural networks and the second plurality of sets of weights of the second plurality of stackable neural networks may not correspond to the current hyperparameter.

[0083] The methods may be used separately or combined in any order. Further, each of the methods (or embodiments), encoder, and decoder may be implemented by processing circuitry (e.g., one or more processors or one or more integrated circuits). In one example, the one or more processors execute a program that is stored in a non-transitory computer-readable medium.

[0084] The foregoing disclosure provides illustration and description, but is not intended to be exhaustive or to limit the implementations to the precise form disclosed. Modifications and variations are possible in light of the above disclosure or may be acquired from practice of the implementations.

[0085] As used herein, the term component is intended to be broadly construed as hardware, firmware, or a combination of hardware and software.
[0086] It will be apparent that systems and/or methods, described herein, may be implemented in different forms of hardware, firmware, or a combination of hardware and software. The actual specialized control hardware or software code used to implement these systems and/or methods is not limiting of the implementations. Thus, the operation and behavior of the systems and/or methods were described herein without reference to specific software code, it being understood that software and hardware may be designed to implement the systems and/or methods based on the description herein.

[0087] Even though combinations of features are recited in the claims and/or disclosed in the specification, these combinations are not intended to limit the disclosure of possible implementations. In fact, many of these features may be combined in ways not specifically recited in the claims and/or disclosed in the specification. Although each dependent claim listed below may directly depend on only one claim, the disclosure of possible implementations includes each dependent claim in combination with every other claim in the claim set.

[0088] No element, act, or instruction used herein may be construed as critical or essential unless explicitly described as such. Also, as used herein, the articles “a” and “an” are intended to include one or more items, and may be used interchangeably with “one or more.” Furthermore, as used herein, the term “set” is intended to include one or more items (e.g., related items, unrelated items, a combination of related and unrelated items, etc.), and may be used interchangeably with “one or more.” Where only one item is intended, the term “one” or similar language is used. Also, as used herein, the terms “has,” “have,” “having,” or the like are intended to be open-ended terms. Further, the phrase “based on” is intended to mean “based, at least in part, on” unless explicitly stated otherwise.