

Title:
APPARATUS AND METHOD FOR IMPROVING PERFORMANCE OF SUPER RESOLUTION FOR PREVIOUSLY UNSEEN IMAGES
Document Type and Number:
WIPO Patent Application WO/2022/197199
Kind Code:
A1
Abstract:
Described is an apparatus (500) for training an image cleaning model, the apparatus (500) comprising one or more processors (501) configured to: receive a first clean low resolution image (102); stochastically corrupt the first clean low resolution image (102) to generate a corrupted low resolution image (103); clean the corrupted low resolution image (103) to generate a second clean low resolution image (105); compare the first clean low resolution image (102) with the second clean low resolution image (105); and adapt the image cleaning model in dependence on the comparison between the first clean low resolution image (102) and the second clean low resolution image (105). By stochastically corrupting the first clean low resolution image (102), the image cleaning model may be able to learn how to clean the corrupted low resolution image (103).

Inventors:
FILIPPOV ALEXANDER NIKOLAEVICH (CN)
ROMERO VERGARA ANDRES FELIPE (CH)
TIMOFTE RADU (CH)
VAN GOOL LUC (CH)
Application Number:
PCT/RU2021/000108
Publication Date:
September 22, 2022
Filing Date:
March 16, 2021
Assignee:
HUAWEI TECH CO LTD (CN)
FILIPPOV ALEXANDER NIKOLAEVICH (CN)
ETH ZURICH EIDGENOESSISCHE TECHNISCHE HOCHSCHULE ZUERICH (CH)
International Classes:
G06T3/40; G06T5/00
Other References:
KIM GWANTAE ET AL: "Unsupervised Real-World Super Resolution with Cycle Generative Adversarial Network and Domain Discriminator", 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW), IEEE, 14 June 2020 (2020-06-14), pages 1862 - 1871, XP033798943, DOI: 10.1109/CVPRW50498.2020.00236
JI XIAOZHONG ET AL: "Real-World Super-Resolution via Kernel Estimation and Noise Injection", 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW), IEEE, 14 June 2020 (2020-06-14), pages 1914 - 1923, XP033798841, DOI: 10.1109/CVPRW50498.2020.00241
ANDREAS LUGMAYR ET AL: "Unsupervised Learning for Real-World Super-Resolution", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 20 September 2019 (2019-09-20), XP081479840
TIAN CHUNWEI ET AL: "Deep learning on image denoising: An overview", NEURAL NETWORKS, ELSEVIER SCIENCE PUBLISHERS, BARKING, GB, vol. 131, 6 August 2020 (2020-08-06), pages 251 - 275, XP086281962, ISSN: 0893-6080, [retrieved on 20200806], DOI: 10.1016/J.NEUNET.2020.07.025
Attorney, Agent or Firm:
LAW FIRM "GORODISSKY & PARTNERS" LTD. et al. (RU)
Claims:
CLAIMS

1. An apparatus (500) for training an image cleaning model, the apparatus (500) comprising one or more processors (501) configured to: receive a first clean low resolution image (102); stochastically corrupt the first clean low resolution image (102) to generate a corrupted low resolution image (103); clean the corrupted low resolution image (103) to generate a second clean low resolution image (105); compare the first clean low resolution image (102) with the second clean low resolution image (105); and adapt the image cleaning model in dependence on the comparison between the first clean low resolution image (102) and the second clean low resolution image (105).

2. The apparatus (500) according to claim 1, wherein the one or more processors (501) are configured to: before receiving the first clean low resolution image (102), receive a first high resolution image (101); and deterministically downsample the first high resolution image (101) to generate the first clean low resolution image (102).

3. The apparatus (500) according to claim 1 or 2, wherein the one or more processors (501) are configured to: after generating the second clean low resolution image (105), upsample the second clean low resolution image (105) using an image super resolution model to generate a second high resolution image (106).

4. The apparatus (500) according to claim 3, wherein the one or more processors (501) are configured to: after generating the second high resolution image (106), compare the first high resolution image (101) with the second high resolution image (106); and adapt the image super resolution model in dependence on the comparison between the first high resolution image (101) and the second high resolution image (106).

5. The apparatus (500) according to claim 4, wherein the one or more processors (501) are configured to: after comparing the first high resolution image (101) with the second high resolution image (106), adapt the image cleaning model in dependence on the comparison between the first high resolution image (101) and the second high resolution image (106).

6. The apparatus (500) according to any preceding claim, wherein the one or more processors (501) are configured to: generate the corrupted low resolution image (103) by encoding the first clean low resolution image (102) with a corruption distribution (112).

7. The apparatus (500) according to claim 6, wherein the one or more processors (501) are configured to: encode the first clean low resolution image (102) with a corruption distribution (112) comprising at least one of corruptions characteristic of corruptions imposed by sensors and/or corruptions characteristic of corruptions imposed by compression models.

8. The apparatus (500) according to any of claims 2 to 7, wherein the one or more processors (501) are configured to: deterministically downsample the first high resolution image (101) by bicubic interpolation.

9. The apparatus (500) according to any of claims 3 to 8, wherein the one or more processors (501) are configured to: generate the second high resolution image (106) by using a super resolution model trained to upsample deterministically downsampled images.

10. The apparatus (500) according to any of the preceding claims, wherein the one or more processors (501) are configured to: stochastically corrupt the first clean low resolution image (102) to generate a plurality of corrupted low resolution images (103); clean the plurality of corrupted low resolution images (103) to generate a corresponding plurality of second clean low resolution images (105); compare the first clean low resolution image (102) with the plurality of second clean low resolution images (105); and adapt the image cleaning model in dependence on the comparison between the first clean low resolution image (102) and the plurality of second clean low resolution images (105).

11. The apparatus (500) according to claim 10 as dependent on claims 6 to 9, wherein the one or more processors (501) are configured to: generate the plurality of corrupted low resolution images (103) by encoding the first clean low resolution image (102) with a different corruption distribution (112) for each of the corrupted low resolution images (103).

12. The apparatus (500) according to claim 10 or 11, wherein the one or more processors (501) are configured to: after generating the plurality of second clean low resolution images (105), upsample the second clean low resolution images (105) using an image super resolution model to generate a corresponding plurality of second high resolution images (106).

13. The apparatus (500) according to claim 12, wherein the one or more processors (501) are configured to: after generating the plurality of second high resolution images (106), compare the first high resolution image (101) with the plurality of second high resolution images (106); and adapt the image super resolution model in dependence on the comparison between the first high resolution image (101) and the plurality of second high resolution images (106).

14. The apparatus (500) according to claim 13, wherein the one or more processors (501) are configured to: after comparing the first high resolution image (101) with the plurality of second high resolution images (106), adapt the image cleaning model in dependence on the comparison between the first high resolution image (101) and the plurality of second high resolution images (106).

15. The apparatus (500) according to any preceding claim, wherein the one or more processors (501) are configured to: carry out the steps of any preceding claim for one or more subsequent first high resolution image(s) (101), the one or more subsequent first high resolution image(s) (101) providing subsequent training iterations.

16. An image super resolution apparatus (500), the apparatus comprising one or more processors (501) and a memory (502) storing in non-transient form data defining program code executable by the processor(s) (501) to implement an image cleaning model trained by the apparatus of any of claims 1 to 15, the apparatus being configured to: receive a raw corrupted low resolution image (301); and apply super resolution to the raw corrupted low resolution image (301) by means of the image cleaning model.

17. An image super resolution apparatus (500), the apparatus comprising one or more processors (501) and a memory (502) storing in non-transient form data defining program code executable by the processor(s) (501) to implement an image cleaning model, the apparatus being configured to: receive a raw corrupted low resolution image (301); clean the raw corrupted low resolution image (301) to generate a clean low resolution image (302); and upsample the clean low resolution image (302) using an image super resolution model to generate a high resolution image (303).

18. An apparatus (500) according to claim 17, wherein the one or more processors (501) are configured to: generate the high resolution image (303) by using a super resolution model trained to upsample deterministically downsampled images.

19. A method (400) for training an image cleaning model, the method (400) comprising: receiving (401) a first clean low resolution image (102); stochastically corrupting (402) the first clean low resolution image (102) to generate a corrupted low resolution image (103); cleaning (403) the corrupted low resolution image (103) to generate a second clean low resolution image (105); comparing (404) the first clean low resolution image (102) with the second clean low resolution image (105); and adapting (405) the image cleaning model in dependence on the comparison between the first clean low resolution image (102) and the second clean low resolution image (105).

20. A method for image super resolution, the method comprising: receiving a raw corrupted low resolution image (301); cleaning the raw corrupted low resolution image (301) to generate a clean low resolution image (302); and upsampling the clean low resolution image (302) using an image super resolution model to generate a high resolution image (303).

Description:
APPARATUS AND METHOD FOR IMPROVING PERFORMANCE OF SUPER RESOLUTION

FOR PREVIOUSLY UNSEEN IMAGES

FIELD OF THE INVENTION

This invention relates to image super resolution and cleaning, for example, for Real World Super Resolution (RWSR).

BACKGROUND

Super resolution refers to the operation of increasing the resolution of an image from a low resolution image to a high resolution image.

Most super resolution methods are trained in a paired fashion by generating downsampled low resolution images from their high resolution counterparts, and learning what the high resolution image should look like. This approach may break down when the low resolution image has also been corrupted. In this situation, the super resolution apparatus may be unable to perform accurate super resolution, as the input low resolution image is not what the apparatus is expecting. The corruptions may include noise, sensor corruptions, compression artifacts and so on.

RWSR differs from traditional super resolution approaches in a subtle yet fundamental way. RWSR may still perform accurate super resolution even with a corrupted low resolution input image, as the apparatus is trained to account for corruptions in the low resolution image. Training a RWSR model can be difficult as the corruptions may be unknown.

A common trend in RWSR techniques is to fine-tune existing traditional super resolution methods for the real-world setting. However, approaches that perform well on traditional super resolution may fail on RWSR, and vice versa.

Moreover, existing RWSR techniques rely on a two-stage approach. First, they learn to degrade the high resolution images so that these resemble the noise and corruptions of the low resolution (LR) domain. Second, they artificially generate LR-HR pairs in order to train a new super resolution (SR) method with such paired supervision.

However, these methods learn to generate pairs of low and high resolution images and train new models in full supervision.

It is desirable to develop a method that overcomes the above problems.

SUMMARY

According to a first aspect there is provided an apparatus for training an image cleaning model, the apparatus comprising one or more processors configured to: receive a first clean low resolution image; stochastically corrupt the first clean low resolution image to generate a corrupted low resolution image; clean the corrupted low resolution image to generate a second clean low resolution image; compare the first clean low resolution image with the second clean low resolution image; and adapt the image cleaning model in dependence on the comparison between the first clean low resolution image and the second clean low resolution image.

By adapting the image cleaning model in dependence on the comparison between the first clean low resolution image and the second clean low resolution image, this may enable the image cleaning model to learn from the differences in the low resolution domain.

Corresponding clean and corrupted images may be difficult and expensive to source. By stochastically corrupting the first clean low resolution image, this may enable the apparatus to generate corrupted images itself. This way the image cleaning model may be able to learn how to clean corrupted low resolution images without the difficulty and expense. In particular, the stochastic corruption may provide a range of different corruptions and enable the image cleaning model to learn how to clean the range of different corruptions.
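Purely as an illustration (and not as a description of the patented implementation), the train-by-self-corruption loop described above can be sketched in Python with a toy scalar "cleaning model". The corruption distribution, the model form, and the learning rate below are all assumptions chosen only to make the sketch runnable:

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastically_corrupt(img, rng):
    # Toy corruption distribution: additive Gaussian noise whose strength
    # is itself sampled, so every draw yields a different corruption.
    sigma = rng.uniform(0.05, 0.2)
    return img + rng.normal(0.0, sigma, img.shape)

# Toy "image cleaning model": a single learned weight w. A real model
# would be a neural network; w merely stands in to show the adaptation step.
w = 0.5
first_clean_lr = rng.random((16, 16))                 # first clean LR image

for step in range(200):
    corrupted_lr = stochastically_corrupt(first_clean_lr, rng)  # corrupt
    second_clean_lr = w * corrupted_lr                          # clean
    diff = second_clean_lr - first_clean_lr           # compare the two
    w -= 0.5 * np.mean(diff * corrupted_lr)           # adapt via MSE gradient

# After training, the toy model approximately reproduces the clean image.
assert 0.7 < w < 1.1
```

Because fresh corruptions are drawn at every iteration, the model is trained on a range of degradations generated from a single clean image, mirroring the motivation stated above.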

In some implementations, the apparatus may be configured to, before receiving the first clean low resolution image, receive a first high resolution image; and deterministically downsample the first high resolution image to generate the first clean low resolution image.

Generating the first clean low resolution image from a first high resolution image may enable the image cleaning algorithm to also learn how to receive a high resolution image as an input as well as low resolution images.

Additionally, as the first high resolution image is deterministically downsampled, the first clean low resolution image output will always be the same. This may enable the image cleaning model to more easily learn how to receive the high resolution input.

In some implementations, the apparatus may be configured to, after generating the second clean low resolution image, upsample the second clean low resolution image using an image super resolution model to generate a second high resolution image.

By upsampling the second clean low resolution image to a second clean high resolution image, this may enable the image cleaning model to be used in combination with an image super resolution model. As the image super resolution model may receive a clean low resolution image, the image super resolution model may run more effectively as there may be less or no corruption in the low resolution input image. This may result in better quality high resolution images being output from the super resolution model.

In some implementations, the apparatus may be configured to, after generating the second high resolution image, compare the first high resolution image with the second high resolution image; and adapt the image super resolution model in dependence on the comparison between the first high resolution image and the second high resolution image.

By adapting the image super resolution model using the comparison between the first high resolution image and the second high resolution image, the image super resolution model may learn from the comparison. This may enable the image super resolution model to improve depending on the quality of the output high resolution images.

In some implementations, the apparatus may be configured to, after comparing the first high resolution image with the second high resolution image, adapt the image cleaning model in dependence on the comparison between the first high resolution image and the second high resolution image.

By adapting the image cleaning model using the comparison between the first high resolution image and the second high resolution image, the image cleaning model may learn from the comparison. This means that the comparison in the high resolution domain may back propagate into the low resolution domain to adapt the image cleaning model in the low resolution domain. This may enable the whole network of domains to learn from different domains. This may result in an improvement in the image output quality of the whole network.
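As an illustrative sketch only (scalar stand-in models, identity "downsampling", and the learning rate are all assumptions, not the disclosed implementation), the back-propagation of the high resolution comparison into the cleaning model can be shown with a toy two-parameter chain:

```python
import numpy as np

rng = np.random.default_rng(1)

w = 0.5   # toy cleaning-model parameter (low resolution domain)
v = 0.5   # toy super-resolution-model parameter (high resolution domain)

first_hr = rng.random(64)       # first high resolution "image" (flattened)
first_clean_lr = first_hr       # identity "downsampling" keeps the toy simple

for step in range(300):
    corrupted = first_clean_lr + rng.normal(0.0, 0.1, first_clean_lr.shape)
    cleaned = w * corrupted     # cleaning model output
    second_hr = v * cleaned     # super resolution model output
    diff = second_hr - first_hr                 # HR-domain comparison
    # Chain rule: the HR loss gradient flows through v (the SR model) and
    # onward into w, i.e. the high resolution comparison adapts the
    # cleaning model in the low resolution domain.
    grad_v = np.mean(diff * cleaned)
    grad_w = np.mean(diff * v * corrupted)
    v -= 0.3 * grad_v
    w -= 0.3 * grad_w

# Jointly, the two toy models learn to approximately invert the corruption.
assert abs(w * v - 1.0) < 0.25
```

The point of the sketch is the `grad_w` line: the loss is computed only in the high resolution domain, yet it updates the low resolution cleaning parameter, so the whole chain learns together.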

In some implementations, the apparatus may be configured to generate the corrupted low resolution image by encoding the first clean low resolution image with a corruption distribution.

By encoding the first clean low resolution image with a corruption distribution, this may enable the low resolution corrupted image to vary. As mentioned above, this may improve the learning of the image cleaning model, as it may learn to clean a range of corrupted low resolution images. In particular, using a corruption distribution may enable the corruption to be controlled across the distribution. For example, different frequencies within the image may be affected, or corrupted, differently.

In some implementations, the apparatus may be configured to encode the first clean low resolution image with a corruption distribution comprising at least one of corruptions characteristic of corruptions imposed by sensors and/or corruptions characteristic of corruptions imposed by compression models.

By encoding the first clean low resolution image with a corruption characteristic of corruptions imposed by sensors, this may artificially teach the image cleaning algorithm to clean images with real corruptions from sensors. This may be useful in fields where sensor corruption is common.

By encoding the first clean low resolution image with a corruption characteristic of corruptions imposed by compression models, this may artificially teach the image cleaning algorithm to clean images with real corruptions from compression models. This may be useful in fields where compression model corruption is common.

In some implementations, the apparatus may be configured to deterministically downsample the first high resolution image by bicubic interpolation.

By downsampling the first high resolution image by bicubic interpolation, the data set output may be smoother, and more accurate, than if other interpolations were used, such as bilinear interpolation or nearest-neighbour interpolation. Improving the quality of the first clean low resolution image may enable the image super resolution model and image cleaning model to have better inputs and consequently may produce better outputs.
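The defining property of deterministic downsampling is that the same high resolution input always yields the same low resolution output. A minimal sketch of this property, using average pooling as a simple stand-in for the bicubic interpolation named in the text:

```python
import numpy as np

def deterministic_downsample(hr, factor=2):
    # Average pooling as a stand-in for bicubic interpolation: like
    # bicubic, it contains no randomness, so repeated runs on the same
    # high resolution input always produce the identical LR output.
    h, w = hr.shape
    return hr.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

hr = np.random.default_rng(2).random((8, 8))
lr_a = deterministic_downsample(hr)
lr_b = deterministic_downsample(hr)
assert np.array_equal(lr_a, lr_b)   # reproducible: no stochastic component
assert lr_a.shape == (4, 4)
```

In practice a bicubic resampler (e.g. from an image library) would replace the pooling, but the determinism, which is what the training scheme relies on, is the same.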

In some implementations, the apparatus may be configured to generate the second high resolution image by using a super resolution model trained to upsample deterministically downsampled images.

By using a super resolution model trained to upsample deterministically downsampled images, the upsampling of the second clean low resolution image may produce a better high resolution output. This is because the super resolution model may be specifically trained, or designed, to work on deterministically downsampled images.

In some implementations, the apparatus may be configured to stochastically corrupt the first clean low resolution image to generate a plurality of corrupted low resolution images; clean the plurality of corrupted low resolution images to generate a corresponding plurality of second clean low resolution images; compare the first clean low resolution image with the plurality of second clean low resolution images; and adapt the image cleaning model in dependence on the comparison between the first clean low resolution image and the plurality of second clean low resolution images.

By generating a plurality of corrupted low resolution images from the first clean low resolution image, this may enable the training apparatus to run a number of iterations in the low resolution domain, and train the image cleaning model, from a single clean low resolution image. As mentioned above, corresponding clean and corrupted images may be difficult and expensive to source. Generating a plurality of corrupted images from a single clean image may reduce this difficulty and cost.
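To illustrate the one-to-many expansion described above (the Gaussian corruption here is an assumption standing in for the patent's corruption distribution):

```python
import numpy as np

rng = np.random.default_rng(3)
first_clean_lr = rng.random((16, 16))   # a single clean source image

# One clean image yields many training samples: each draw from the (toy)
# corruption distribution produces a different corrupted counterpart.
corrupted_batch = [
    first_clean_lr + rng.normal(0.0, rng.uniform(0.05, 0.2), first_clean_lr.shape)
    for _ in range(8)
]

assert len(corrupted_batch) == 8
assert not np.allclose(corrupted_batch[0], corrupted_batch[1])  # draws differ
assert not np.allclose(corrupted_batch[0], first_clean_lr)      # and are corrupted
```

Each element of the batch can then be cleaned and compared back against the single clean source, giving eight training iterations for the price of one sourced image.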

In some implementations, the apparatus may be configured to generate the plurality of corrupted low resolution images by encoding the first clean low resolution image with a different corruption distribution for each of the corrupted low resolution images.

By generating the plurality of corrupted low resolution images with a different corruption distribution for each, this may enable the training apparatus to produce a range of different corrupted images. As mentioned above, corresponding clean and corrupted images may be difficult and expensive to source. Generating a range of different corrupted images may, as mentioned above, allow for better training of the image cleaning model, as it may learn to clean a range of different corrupted images derived from the same clean image.

In some implementations, the apparatus may be configured to, after generating the plurality of second clean low resolution images, upsample the second clean low resolution images using an image super resolution model to generate a corresponding plurality of second high resolution images.

In some implementations, the apparatus may be configured to, after generating the plurality of second high resolution images, compare the first high resolution image with the plurality of second high resolution images; and adapt the image super resolution model in dependence on the comparison between the first high resolution image and the plurality of second high resolution images.

By generating a plurality of second high resolution images, this may enable the image super resolution model to learn from a plurality of second high resolution images derived from a single clean image. As mentioned above, corresponding clean and corrupted images may be difficult and expensive to source. Generating a range of different corrupted images may, as mentioned above, allow for better training of the image super resolution model, as it may learn from a range of different corrupted images derived from the same clean image.

In some implementations, the apparatus may be configured to, after comparing the first high resolution image with the plurality of second high resolution images, adapt the image cleaning model in dependence on the comparison between the first high resolution image and the plurality of second high resolution images.

By adapting the image cleaning model using the comparison between the first high resolution image and the plurality of second high resolution images, the image cleaning model may learn from the comparison. This means that the comparisons in the high resolution domain may back propagate into the low resolution domain to adapt the image cleaning model in the low resolution domain. This may enable the whole network of domains to learn from different domains. This may result in an improvement of the image output quality of the whole network.

In some implementations, the apparatus may be configured to carry out the steps of any preceding claim for one or more subsequent first high resolution image(s), the one or more subsequent first high resolution image(s) providing subsequent training iterations.

By providing one or more subsequent first high resolution image(s), the training apparatus may run one or more iterations. Running more iterations on the training apparatus may enable the image super resolution model and image cleaning model to better learn from the input images and may converge on an optimum model.

According to a second aspect there is provided an image super resolution apparatus, the apparatus comprising one or more processors and a memory storing in non-transient form data defining program code executable by the processor(s) to implement an image cleaning model trained by the apparatus of any of the steps above, the apparatus being configured to: receive a raw corrupted low resolution image; and apply super resolution to the raw corrupted low resolution image by means of the image cleaning model.

By applying super resolution to a raw corrupted low resolution image by means of the image cleaning model in the steps above, the image resolution apparatus may be able to use the cleaning model to produce a better high resolution output. This is because the image cleaning model may be used to clean the raw corrupted low resolution image before super resolution is applied which may improve the high resolution output of the super resolution apparatus.

According to a third aspect there is provided an image super resolution apparatus, the apparatus comprising one or more processors and a memory storing in non-transient form data defining program code executable by the processor(s) to implement an image cleaning model, the apparatus being configured to: receive a raw corrupted low resolution image; clean the raw corrupted low resolution image to generate a clean low resolution image; and upsample the clean low resolution image using an image super resolution model to generate a high resolution image.

By applying super resolution to a raw corrupted low resolution image by means of an image cleaning model, the image resolution apparatus may be able to use the cleaning model to produce a better high resolution output. This is because the image cleaning model may be used to clean the raw corrupted low resolution image before super resolution is applied which may improve the high resolution output of the super resolution apparatus.
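The clean-then-upsample pipeline of the third aspect can be sketched as follows. Both stages below are placeholders, a fixed smoothing filter and nearest-neighbour repetition, chosen only so the sketch runs; in the apparatus, the stages would be the trained cleaning model and super resolution model:

```python
import numpy as np

def clean(img):
    # Stand-in for the trained image cleaning model: a fixed 5-point
    # smoothing filter. A real deployment would run the learned network.
    p = np.pad(img, 1, mode="edge")
    return (p[1:-1, 1:-1] + p[:-2, 1:-1] + p[2:, 1:-1]
            + p[1:-1, :-2] + p[1:-1, 2:]) / 5

def upsample(img, factor=2):
    # Stand-in for the image super resolution model: nearest-neighbour
    # repetition, chosen only to make the pipeline runnable end to end.
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)

raw_corrupted_lr = np.random.default_rng(4).random((16, 16))   # raw input
clean_lr = clean(raw_corrupted_lr)         # cleaned low resolution image
hr = upsample(clean_lr)                    # high resolution output
assert hr.shape == (32, 32)
```

The design point is the ordering: cleaning runs first, so the super resolution stage only ever sees a clean low resolution input.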

In some implementations, the apparatus may be configured to generate the high resolution image by using a super resolution model trained to upsample deterministically downsampled images.

By using a super resolution model trained to upsample deterministically downsampled images, the upsampling of the clean low resolution image may produce a better high resolution output. This is because the super resolution model may be specifically trained, or designed, to work on deterministically downsampled images.

According to a fourth aspect there is provided a method for training an image cleaning model, the method comprising: receiving a first clean low resolution image; stochastically corrupting the first clean low resolution image to generate a corrupted low resolution image; cleaning the corrupted low resolution image to generate a second clean low resolution image; comparing the first clean low resolution image with the second clean low resolution image; and adapting the image cleaning model in dependence on the comparison between the first clean low resolution image and the second clean low resolution image.

By adapting the image cleaning model in dependence on the comparison between the first clean low resolution image and the second clean low resolution image, this may enable the image cleaning model to learn from the differences in the low resolution domain. Corresponding clean and corrupted images may be difficult and expensive to source. By stochastically corrupting the first clean low resolution image, this may enable the apparatus to generate corrupted images itself. This way, the image cleaning model may be able to learn how to clean corrupted low resolution images without the difficulty and expense. In particular, the stochastic corruption may provide a range of different corruptions and enable the image cleaning model to learn how to clean the range of different corruptions.

According to a fifth aspect there is provided a method for image super resolution, the method comprising: receiving a raw corrupted low resolution image; cleaning the raw corrupted low resolution image to generate a clean low resolution image; and upsampling the clean low resolution image using an image super resolution model to generate a high resolution image.

By applying super resolution to a raw corrupted low resolution image by means of an image cleaning model, the image resolution apparatus may be able to use the cleaning model to produce a better high resolution output. This is because the image cleaning model may be used to clean the raw corrupted low resolution image before super resolution is applied which may improve the high resolution output of the super resolution apparatus.

BRIEF DESCRIPTION OF THE FIGURES

The present invention will now be described by way of example with reference to the accompanying drawings. In the drawings:

Figure 1 schematically illustrates an exemplary network architecture used in the image cleaning training apparatus.

Figure 2 schematically illustrates an exemplary network architecture used in the image cleaning training apparatus including example images.

Figure 3 schematically illustrates an exemplary network architecture used in the image super resolution apparatus.

Figure 4 shows an example of a method for training an image cleaning model.

Figure 5 shows an example of an apparatus configured to perform the methods described herein.

DETAILED DESCRIPTION

The apparatuses and methods described herein concern training an image cleaning model and using said model to apply super resolution to raw corrupted low resolution images.

Embodiments of the present invention tackle one or more of the problems previously mentioned by stochastically corrupting the first clean low resolution image to generate a corrupted low resolution image. In this way, it is possible to enable the cleaning model to learn from corrupted low resolution images and consequently enable the model to improve the super resolution of the image.

The input may also take a range of forms, including images and videos in relation to computer vision, corpora of text in relation to natural language processing, or gene expressions in relation to bioinformatics.

Figure 1 schematically illustrates an exemplary network architecture 100 used in an image cleaning training apparatus.

The network 100 may receive a first high resolution image 101 or a first clean low resolution image 102. It will be appreciated that the terms low and high resolution are relative to one another and do not limit the images to specific bands of resolution.

If the first high resolution image 101 is provided as an input to the network 100, then the first high resolution image 101 may be received in a high resolution domain 113. The first high resolution image 101 may be deterministically downsampled, as shown at 107, to generate the first clean low resolution image 102. Deterministic downsampling means that the low resolution output is not random and will be the same each time the downsampling is run. This may enable the image cleaning model to more easily learn how to receive the high resolution input.

Preferably, the deterministic downsampling may be carried out by bicubic interpolation, as shown at 107. By downsampling the first high resolution image by bicubic interpolation, the data set output may be smoother, and more accurate, than if other interpolations were used, such as bilinear interpolation or nearest-neighbour interpolation. Improving the quality of the first clean low resolution image 102 may enable the image super resolution model and image cleaning model to have better inputs and consequently may produce better outputs.

Alternatively, the first low resolution image 102 may be provided as an input to the network 100. The first low resolution image 102 may be provided directly without, or in addition to, the first high resolution image 101 being provided. Providing the first low resolution image 102 without the first high resolution image 101 may enable the network 100 to reduce the number of steps, as the first high resolution image 101 may not be required for training the image cleaning model.

The first low resolution image 102 may be stochastically corrupted, as shown at 108, to generate a corrupted low resolution image 103. Corresponding clean and corrupted images may be difficult and expensive to source. By stochastically corrupting the first clean low resolution image, the apparatus may generate corrupted images itself. In this way, the image cleaning model may be able to learn how to clean corrupted low resolution images without that difficulty and expense. In particular, the stochastic corruption may provide a range of different corruptions and enable the image cleaning model to learn how to clean that range of different corruptions.

The first low resolution image 102 may be corrupted by encoding 111 the first low resolution image 102 with a corruption distribution 112. The corruption distribution 112 may comprise corruptions characteristic of corruptions imposed by sensors and/or corruptions characteristic of corruptions imposed by compression models. By encoding 111 the first clean low resolution image 102 with corruptions characteristic of corruptions imposed by sensors, the image cleaning algorithm may artificially be taught to clean images with real corruptions from sensors. This may be useful in fields where sensor corruption is common. By encoding 111 the first clean low resolution image 102 with corruptions characteristic of corruptions imposed by compression models, the image cleaning algorithm may artificially be taught to clean images with real corruptions from compression models. This may be useful in fields where compression model corruption is common.

The first low resolution image 102 may be stochastically corrupted 108 to generate a plurality of corrupted low resolution images 103. This may enable the training apparatus to run a number of iterations in the low resolution domain 114, and train the image cleaning model, from a single clean low resolution image 102. As mentioned above, corresponding clean and corrupted images may be difficult and expensive to source. Generating a plurality of corrupted images 103 from a single first clean image 102 may reduce this difficulty and cost.

The first low resolution image 102 may be stochastically corrupted 108 to generate a plurality of corrupted low resolution images by encoding 111 the first clean low resolution image 102 with a different corruption distribution 112 for each of the corrupted low resolution images 103. This may enable the training apparatus to produce a range of different corrupted images. As mentioned above, corresponding clean and corrupted images may be difficult and expensive to source. Generating a range of different corrupted low resolution images 103 may, as mentioned above, allow for better training of the image cleaning model, as it may learn to clean a range of different corrupted low resolution images 103 generated from the same first clean low resolution image 102.
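The one-to-many behaviour described above can be sketched as follows. Additive Gaussian noise is only a hypothetical stand-in for the learned corruption generator 108 and noise encoder 111; the essential property illustrated is that a fresh corruption is sampled on every call, so a single clean image 102 yields many different corrupted images 103.

```python
import numpy as np

_rng = np.random.default_rng()

def stochastically_corrupt(clean_lr: np.ndarray, noise_std: float = 0.1) -> np.ndarray:
    """Corrupt a clean low resolution image with freshly sampled noise.

    Additive Gaussian noise stands in for the learned corruption
    generator 108; a new noise map 112 is drawn on every call, so
    repeated calls on the same clean image 102 produce a plurality of
    different corrupted images 103.
    """
    noise = _rng.normal(0.0, noise_std, size=clean_lr.shape)
    return np.clip(clean_lr + noise, 0.0, 1.0)
```

Calling the function twice on the same clean image gives two different corrupted images, mirroring the diversity of corruptions the text describes.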

The following stages may apply for a single corrupted low resolution image 103 or a plurality of corrupted low resolution images.

The corrupted low resolution image 104 may be cleaned, as shown at 109, to generate a second clean low resolution image 105. The corrupted low resolution image 104 may be cleaned using a cleaning network 109.

The first clean low resolution image 102 may be compared with the second clean low resolution image 105. The comparison may be based on a difference between the first clean low resolution image 102 and the second clean low resolution image 105.

The image cleaning model may be adapted in dependence on the comparison between the first clean low resolution image 102 and the second clean low resolution image 105. This may enable the image cleaning model to learn from the differences in the low resolution clean domain 114. The image cleaning model may be provided by the low resolution clean domain 114 and/or the cleaning network 109.

The image cleaning model may also be adapted in dependence on the comparison between the first clean low resolution image 102 and the plurality of second clean low resolution images 105. The additional clean low resolution images 105 may further train the image cleaning model.

The second clean low resolution image 105 may be upsampled, as shown at 110, to generate a second high resolution image 106. This may enable the image cleaning model to be used in combination with an image super resolution model. As the image super resolution model may receive a second clean low resolution image 105, the image super resolution model may run more effectively as there may be less or no corruption in the second clean low resolution image 105. This may result in better quality second high resolution images 106 output from the super resolution model.

The second clean low resolution image 105 may be upsampled 110 to generate a second high resolution image 106 using a super resolution model trained to upsample 110 deterministically downsampled 107 images. This may enable the upsampling 110 of the second clean low resolution image 105 to produce a better quality second high resolution image 106. This is because the super resolution model may be specifically trained, or designed, to work on deterministically downsampled images, such as the first clean low resolution image 102 which has been converted into a second clean low resolution image 105 through the low resolution corrupted domain 115. The super resolution model trained to upsample 110 deterministically downsampled 107 images may be an “off the shelf” super resolution network. In other words, the super resolution network may be known from the prior art.
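The upsampling stage can be sketched as below. Nearest-neighbour pixel replication via a Kronecker product is a trivial stand-in for the off-the-shelf super resolution network 110; a real model would also synthesise high frequency detail, but the shape contract is the same.

```python
import numpy as np

def upsample(img: np.ndarray, n: int) -> np.ndarray:
    """Upsample an h x w x C image by factor n (nearest neighbour).

    Pixel replication stands in for the off-the-shelf super resolution
    network 110; the spatial contract matches the patent's notation:
    (h, w) -> (h * n, w * n) for a predefined upscaling factor n.
    """
    return np.kron(img, np.ones((n, n, 1)))
```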

The first high resolution image 101 and the second clean high resolution image 106 may be compared. The comparison may be based on a difference between the first high resolution image 101 and the second clean high resolution image 106.

The image super resolution model may be adapted in dependence on the comparison between the first high resolution image 101 and the second clean high resolution image 106. This may enable the image super resolution model to learn from the comparison. This may enable the image super resolution model to improve depending on the quality of the output second high resolution images 106.

Additionally, the image cleaning model may be adapted in dependence on the comparison between the first high resolution image 101 and the second clean high resolution image 106. This may enable the image cleaning model to learn from the comparison. This means that the comparison in the high resolution domain 113 may back propagate into the low resolution domain 114 to adapt the image cleaning model in the low resolution domain 114. This may enable the whole network 100 to learn across its different domains. This may result in an improvement of the image output quality of the whole network 100.

The image super resolution model and the image cleaning model may also be adapted in dependence on the comparison between the first high resolution image 101 and the plurality of second high resolution images 106. The additional second high resolution images 106 may further train the image super resolution model and the image cleaning model.

The image super resolution model may be provided by the high resolution domain 113 and/or the upsampling method 110.

The network 100 may be configured to carry out the above stages for one or more subsequent first high resolution images 101 or subsequent first clean low resolution images 102. This may enable the training apparatus to run one or more iterations. Running more iterations on the training apparatus may enable the image super resolution model and image cleaning model to better learn from the input images and may converge on an optimum model.

Figure 2 schematically illustrates an exemplary network architecture used in the image cleaning training apparatus including example images. Figure 2 may include the same components as Figure 1. Figure 2 demonstrates the closed-loop nature of the training framework. In this example, the first high resolution image 201, the first clean low resolution image 202 and the corrupted low resolution image 203 represent a bird, and the corrupted low resolution image 204, the second clean low resolution image 205 and the second high resolution image 206 represent a plant. Figure 2 may vary the corrupted low resolution image from a first corrupted low resolution image 203, which in this example represents a bird, to a second corrupted low resolution image 204, which in this example represents a plant. This may better train the image cleaning model by including more variation in the training inputs.

Figure 3 schematically illustrates an exemplary network architecture 300 used in the image super resolution apparatus described herein.

The network 300 may receive a raw corrupted low resolution image 301. The raw corrupted low resolution image 301 may be corrupted with noise, sensor corruptions, compression artifacts and so on.

The network 300 may clean the raw corrupted low resolution image 301 using a cleaning network 304. The cleaning network 304 may use an image cleaning model. The image cleaning model may have been trained by the image cleaning training apparatus described above.

Cleaning the raw corrupted low resolution image 301 may generate a clean low resolution image 302. This cleaning may be carried out in the domain adaptation space 306.

The clean low resolution image 302 may be upsampled, as shown at 305, to generate a high resolution image 303. The upsampling may be carried out by an image super resolution model. The image super resolution model may be trained to upsample deterministically downsampled images. The image super resolution model may be an “off the shelf” super resolution network. In other words, the super resolution network may be known from the prior art. This upsampling may be carried out in the super resolution space 307. An exemplary embodiment of the image cleaning training apparatus and image super resolution apparatus will now be described in more detail.

An exemplary training network, as illustrated generally in Figure 1, may comprise three generators, three discriminators, and one noise encoder. The three generators are provided by the Gcorrupted 108, the Gclean 109 and the Gup 110 generators. The three discriminators are provided by the high resolution domain 113, the low resolution clean domain 114 and the low resolution corrupted domain 115. The noise encoder is provided by Snoise 111.

The clean generator (Gclean) 109 and the corrupted generator (Gcorrupted) 108 may be based on an RRDB architecture without upsampling layers. The main difference between the clean generator 109 and the corrupted generator 108 is that the corrupted generator 108 takes as input the concatenation of an image and randomly sampled noise 112 with the same spatial size as the image.

For the two low resolution discriminators, Dclean 114 and Dcorrupted 115, and the noise encoder Snoise 111, the network may use three convolutional layers with 5x5 filters and no stride, followed by Batch Normalization and LeakyReLU, and a last convolutional layer to produce the adversarial output or the extracted noise, for the discriminators 114, 115 or the noise encoder 111 respectively. Preferably, since the corruptions and artifacts are high frequency perturbations, the noise encoder 111 applies a high-pass filter prior to forwarding. As for the high resolution discriminator (Dup) 113, the network may use a light version of a UNet discriminator.
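The high-pass filtering applied before the noise encoder 111 can be sketched as follows. The patent does not specify which high-pass filter is used; subtracting a box-blurred copy of the image is one common choice and is shown here purely as an assumed illustration of removing low frequency content so that only the high frequency perturbations remain.

```python
import numpy as np

def box_blur3(img: np.ndarray) -> np.ndarray:
    """3 x 3 box blur with edge padding (a simple low-pass filter)."""
    H, W, _ = img.shape
    p = np.pad(img, ((1, 1), (1, 1), (0, 0)), mode="edge")
    return sum(p[i:i + H, j:j + W] for i in range(3) for j in range(3)) / 9.0

def high_pass(img: np.ndarray) -> np.ndarray:
    """Keep only high frequency content before forwarding to Snoise 111.

    Subtracting a low-pass (blurred) copy is an assumed, minimal
    high-pass filter; the corruptions of interest live in the residual.
    """
    return img - box_blur3(img)
```

A flat (zero frequency) image passes through with an all-zero residual, confirming the filter suppresses low frequency content.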

A goal is to learn a mapping function G that reconstructs a high resolution image 106, represented by Y_hr ∈ ℝ^(H×W×3), by using its real-world low resolution counterpart, the corrupted low resolution image 104, represented by X_lr ∈ ℝ^(h×w×3), without having access to the joint distribution (X_lr, Y_hr), where (h, w) = (H/n, W/n) and n is a predefined upscaling factor.

Using real world low resolution images implies that traditional methods for super resolution do not perform well at this task, due to inherent artifacts and noise that are absent from such training frameworks. With this in mind, a preferred embodiment of the invention may use an unsupervised generative adversarial approach in which the network 100 learns to degrade and clean high resolution images 101 and low resolution images 102, respectively.

As mentioned above, the super resolution model trained to upsample 110 deterministically downsampled 107 images may be an “off the shelf” super resolution network. In other words, the super resolution network may be known from the prior art. For the image cleaning training apparatus the network 100, as illustrated in Figure 1, may run through iterations by means of gradient leveraging. In other words, each iteration may adjust the models based on the gradients from the preceding iteration so as to optimise the output.

As mentioned above, the corruption distribution 112 encoded by the noise encoder network 111 may be stochastic. Preferably, the corruption distribution is selected by randomly sampling from a Gaussian distribution. Preferably, the corruptions have a high diversity in that, as described above, the corruption distribution encoded for each corrupted low resolution image 103 is different.

In contrast to traditional plug and play methods that only learn the general mapping between the real world and clean domains, the exemplary embodiment of the invention may rely on a joint training and end-to-end strategy using an off-the-shelf specialized super resolution method. The exemplary embodiment may leverage the fixed super resolution weights in order to adapt the real world low resolution image to the low resolution image expected by the specialized super resolution network.

The inventors aimed to introduce the specialized super resolution system into the training scheme and leverage the gradient flow from the low resolution domains 114, 115 to the high resolution domain 113 to learn a better domain adaptation component. Moreover, this solution may be used in a plug and play manner in which the image super resolution model is off the shelf. Alternatively, the image super resolution model may be used in a pseudo plug and play manner in which the domain adaptation component specializes to the specialized super resolution model.

For the training apparatus to function appropriately, the apparatus may rely on (i) pushing gradients between domains and effectively jointly learning the domain adaptation from the low resolution corrupted domain 115 to the low resolution clean domain 114, while aiming to maximise the quality of the high resolution results obtained by an off-the-shelf fixed specialized super resolution model; and (ii) a carefully designed combination of losses which stabilizes the learning and secures convergence to optimal performance.

During training, and by sampling non-aligned low and high resolution images, the exemplary embodiment of the invention aims to perform a cycle-reconstruction of the high resolution image, that is, degrading it, cleaning it, and super-resolving it using an off-the-shelf super resolution method. The low resolution image is preferably only cleaned and super-resolved. The framework may be based on a Generative Adversarial Network (GAN) and each domain may be trained adversarially.

In one particular example, the framework may be trained using the NTIRE 2020 RWSR Challenge, Tracks 1 and 2, and the AIM19 Challenge Track 2.

NTIRE 2020 Track 1 may use the DIV2K dataset, where the artifacts and corruptions are artificially created. This may mean that the network may have access to paired data during the validation stage that may allow the network to compute perceptual (LPIPS) and fidelity (PSNR, SSIM, RMSE) metrics.

NTIRE 2020 Track 2 may use the DPED dataset, where images are extracted from a cellphone camera, where inherent artifacts and corruptions are present, so no ground-truth is available for the evaluation. The network may use a non-reference perceptual quality metrics (BRISQUE, PIQE, NIQE) to perform the evaluation.

AIM19 Track 2 may use the DIV2K dataset, where the artifacts and corruptions are artificially created. This may mean that the network may have access to paired data during the validation stage that allows the network to compute perceptual (LPIPS) and fidelity (PSNR, SSIM, RMSE) metrics.

The same network may be used for each of the above training methods.

For the image super resolution apparatus, the network 300, as illustrated in Figure 3, may comprise a trainable entity. The trainable entity may comprise a neural network machine-learning entity. The trainable entity may be trained by using a plurality of images obtained under image deterioration. The deterioration may comprise noise, sensor corruptions, compression artifacts and so on. The image deterioration may be learned based on cyclic consistency considerations. Learning based on cyclic consistency considerations may comprise minimizing loss functions. The image deterioration may be learned from imagery configured as adversarially learned imagery.

Figure 4 summarises an example of a method for training an image cleaning model. At step 401, the method comprises receiving a first clean low resolution image. At step 402, the method comprises stochastically corrupting the first clean low resolution image to generate a corrupted low resolution image. At step 403, the method comprises cleaning the corrupted low resolution image to generate a second clean low resolution image. At step 404, the method comprises comparing the first clean low resolution image with the second clean low resolution image. At step 405, the method comprises adapting the image cleaning model in dependence on the comparison between the first clean low resolution image and the second clean low resolution image.
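The steps of Figure 4 can be sketched as a single training iteration. The cleaning model is passed in as a plain callable and additive Gaussian noise stands in for the learned stochastic corruption; both are assumptions for illustration, as in the real apparatus step 405 would update the weights of a neural network by gradient descent on the returned loss.

```python
import numpy as np

def train_step(first_clean_lr, cleaning_model, rng, noise_std=0.1):
    """One iteration of the Figure 4 method, as a sketch.

    `cleaning_model` is any callable mapping an image to an image; in
    the real apparatus it is a neural network whose parameters step 405
    adapts in dependence on the returned comparison loss.
    """
    # Step 402: stochastically corrupt the first clean LR image.
    corrupted = np.clip(
        first_clean_lr + rng.normal(0.0, noise_std, first_clean_lr.shape), 0.0, 1.0)
    # Step 403: clean the corrupted image with the current model.
    second_clean_lr = cleaning_model(corrupted)
    # Step 404: compare the two clean images (mean absolute difference).
    loss = float(np.mean(np.abs(first_clean_lr - second_clean_lr)))
    # Step 405: `loss` would drive the adaptation of `cleaning_model`.
    return loss
```

An identity "cleaner" leaves the injected corruption in place and incurs a positive loss, while an oracle that returns the original clean image incurs zero loss, which is the signal the adaptation step exploits.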

A method for image super resolution that may use an image cleaning model formed by the method above may comprise the steps of receiving a raw corrupted low resolution image; cleaning the raw corrupted low resolution image to generate a clean low resolution image; and upsampling the clean low resolution image using an image super resolution model to generate a high resolution image.
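The inference-time pipeline just described can be sketched as a composition of the two models. Both models are passed in as plain callables and are assumptions for illustration; in the described apparatus the cleaning model is the trained cleaning network and the super resolution model is an off-the-shelf network trained on deterministically downsampled images.

```python
import numpy as np

def super_resolve(raw_corrupted_lr, cleaning_model, sr_model):
    """Sketch of the image super resolution method: clean, then upsample.

    Mirrors the Figure 3 pipeline: cleaning in the domain adaptation
    space, followed by upsampling in the super resolution space.
    """
    clean_lr = cleaning_model(raw_corrupted_lr)  # domain adaptation space 306
    return sr_model(clean_lr)                    # super resolution space 307
```

With a toy identity cleaner and a 2x nearest-neighbour upsampler standing in for the real networks, a 4 x 6 input yields an 8 x 12 output.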

An example of an apparatus 500 configured to implement the method is schematically illustrated in Figure 5. The apparatus 500 may be implemented on an electronic device, such as a laptop, tablet, smart phone or TV.

The apparatus 500 comprises a processor 501 configured to process the datasets in the manner described herein. For example, the processor 501 may be implemented as a computer program running on a programmable device such as a Central Processing Unit (CPU). The apparatus 500 comprises a memory 502 which is arranged to communicate with the processor 501. Memory 502 may be a non-volatile memory. The processor 501 may also comprise a cache (not shown in Figure 5), which may be used to temporarily store data from memory 502. The apparatus may comprise more than one processor and more than one memory. The memory may store data that is executable by the processor. The processor may be configured to operate in accordance with a computer program stored in non-transitory form on a machine readable storage medium. The computer program may store instructions for causing the processor to perform its methods in the manner described herein.

The apparatus 500 may also be used to produce super resolution images using the trained cleaning model described above. The image super resolution apparatus may comprise one or more processors, such as processor 501, and a memory 502 storing in non-transient form data defining program code executable by the processor(s) to implement the image cleaning model formed by the image cleaning training apparatus. The image restoration apparatus may be configured to receive a raw corrupted low resolution image 301. The raw corrupted low resolution image 301 may be restored by means of the image cleaning model formed by the image cleaning training apparatus.

The apparatus and method described herein may be practically applied to other data inputs in other fields such as images and videos in relation to computer vision, to corpora of text in relation to natural language processing or to gene expressions in relation to bioinformatics. The applicant hereby discloses in isolation each individual feature described herein and any combination of two or more such features, to the extent that such features or combinations are capable of being carried out based on the present specification as a whole in the light of the common general knowledge of a person skilled in the art, irrespective of whether such features or combinations of features solve any problems disclosed herein, and without limitation to the scope of the claims. The applicant indicates that aspects of the present invention may consist of any such individual feature or combination of features. In view of the foregoing description it will be evident to a person skilled in the art that various modifications may be made within the scope of the invention.