

Title:
A SOFT SHADOW ALGORITHM WITH CONTACT HARDENING EFFECT FOR MOBILE GPU
Document Type and Number:
WIPO Patent Application WO/2023/208385
Kind Code:
A1
Abstract:
Described is an image processing apparatus (900) for generating a shadow effect, the apparatus (900) comprising one or more processors (901) and a memory (902) storing in non-transient form data defining program code executable by the one or more processors (901), wherein the program code, when executed by the one or more processors (901), causes the image processing apparatus (900) to: obtain an input scene (701); generate a shadow map image (702) from the input scene (701); generate one or more shadow map downsampled images (703, 704) by downsampling the shadow map image (702); and generate an output shadow effect image (705) in dependence on two or more images selected from (a) the shadow map image (702) and (b) the one or more shadow map downsampled images (703, 704). By generating the output shadow effect image (705) in dependence on two or more images selected from (a) the shadow map image (702) and (b) the one or more shadow map downsampled images (703, 704), this may enable the shadow in the output shadow effect image (705) to be a combination of the shadows in the two or more images selected from (a) the shadow map image (702) and (b) the one or more shadow map downsampled images (703, 704). In this way, the shadow in the output shadow effect image (705) may be more realistic.

Inventors:
LIU BAOQUAN (DE)
LIN SENLING (DE)
Application Number:
PCT/EP2022/061616
Publication Date:
November 02, 2023
Filing Date:
April 29, 2022
Assignee:
HUAWEI TECH CO LTD (CN)
LIU BAOQUAN (DE)
International Classes:
G06T15/60
Domestic Patent References:
WO2018227100A12018-12-13
Foreign References:
US7589724B12009-09-15
Other References:
TOM LOKOVIC ET AL: "Deep shadow maps", COMPUTER GRAPHICS. SIGGRAPH 2000 CONFERENCE PROCEEDINGS. NEW ORLEANS, LA, JULY 23 - 28, 2000; [COMPUTER GRAPHICS PROCEEDINGS. SIGGRAPH], NEW YORK, NY : ACM, US, July 2000 (2000-07-01), pages 385 - 392, XP058374799, ISBN: 978-1-58113-208-3, DOI: 10.1145/344779.344958
ELMAR EISEMANN ET AL: "Casting Shadows in Real Time", 20091216; 1077952576 - 1077952576, 16 December 2009 (2009-12-16), XP058135527, DOI: 10.1145/1665817.1722963
RANDIMA FERNANDO: "Percentage-closer soft shadows", 20050731; 1077952576 - 1077952576, 31 July 2005 (2005-07-31), pages 35 - es, XP058302207, DOI: 10.1145/1187112.1187153
YU LI ED - WAEWSAK JOMPOB ET AL: "Rendering Research of Variance Shadow Map by Gaussian Filter", ENERGY PROCEDIA, vol. 13, 9 December 2011 (2011-12-09) - 10 December 2011 (2011-12-10), pages 7835 - 7840, XP028470317, ISSN: 1876-6102, [retrieved on 20120315], DOI: 10.1016/J.EGYPRO.2011.12.527
THOMAS ANNEN ET AL: "Exponential shadow maps", GRAPHICS INTERFACE 2005 : PROCEEDINGS ; VICTORIA, BRITISH COLUMBIA, 9 - 11 MAY 2005, CANADIAN INFORMATION PROCESSING SOCIETY, 403 KING STREET WEST, SUITE 205 TORONTO, ONT. M5U 1LS CANADA, 28 May 2008 (2008-05-28), pages 155 - 161, XP058289287, ISSN: 0713-5424, ISBN: 978-1-56881-337-0
LUNG-JEN WANG ET AL: "An Improved Non-Linear Image Enhancement Method for Video Coding", COMPLEX, INTELLIGENT AND SOFTWARE INTENSIVE SYSTEMS, 2008. CISIS 2008. INTERNATIONAL CONFERENCE ON, IEEE, PISCATAWAY, NJ, USA, 4 March 2008 (2008-03-04), pages 79 - 84, XP031312135, ISBN: 978-0-7695-3109-0
AMOAKO-YIRENKYI PETER ET AL: "Performance Analysis of Image Smoothing Techniques on a New Fractional Convolution Mask for Image Edge Detection", OPEN JOURNAL OF APPLIED SCIENCES, vol. 06, no. 07, 2016, pages 478 - 488, XP093006746, ISSN: 2165-3917, DOI: 10.4236/ojapps.2016.67048
LAURITZEN ANDREW: "GPU Gems 3 Chapter 8. Summed-Area Variance Shadow Maps", GPU GEMS 3, 27 December 2019 (2019-12-27), pages 1 - 14, XP093007367, Retrieved from the Internet [retrieved on 20221212]
Attorney, Agent or Firm:
KREUZ, Georg M. (DE)
Claims:
CLAIMS

1. An image processing apparatus (900) for generating a shadow effect, the apparatus (900) comprising one or more processors (901) and a memory (902) storing in non-transient form data defining program code executable by the one or more processors (901), wherein the program code, when executed by the one or more processors (901), causes the image processing apparatus (900) to: obtain an input scene (701); generate a shadow map image (702) from the input scene (701); generate one or more shadow map downsampled images (703, 704) by downsampling the shadow map image (702); and generate an output shadow effect image (705) in dependence on two or more images selected from (a) the shadow map image (702) and (b) the one or more shadow map downsampled images (703, 704).

2. The apparatus (900) according to claim 1, wherein the apparatus (900) is configured to generate the output shadow effect image (705) by sampling the two or more images selected from (a) the shadow map image (702) and (b) the one or more shadow map downsampled images (703, 704).

3. The apparatus (900) according to claim 2, wherein the apparatus (900) is configured to generate the output shadow effect image (705) by performing a linear interpolation between the samples of the two or more images selected from (a) the shadow map image (702) and (b) the one or more shadow map downsampled images (703, 704).

4. The apparatus (900) according to any preceding claim, wherein the apparatus (900) is configured to generate a further shadow map downsampled image (704) by downsampling the shadow map downsampled image (703).

5. The apparatus (900) according to claim 4, wherein the apparatus (900) is configured to generate the output shadow effect image (705) in dependence on the shadow map downsampled image (703) and the further shadow map downsampled image (704).

6. The apparatus (900) according to any preceding claim, wherein the apparatus (900) is configured to generate the one or more shadow map downsampled images (703, 704) as at least part of a mipmap.

7. The apparatus (900) according to any preceding claim, wherein the apparatus (900) is configured to generate the shadow map image (702) from the input scene (701), the input scene (701) comprising one or more objects (1402) and one or more light sources (1401), by: calculating a blocked area (1403) of the input scene (701) in which the one or more light sources (1401) are blocked by the one or more objects (1402); estimating an average blocker depth; and calculating a penumbra size (1404) in dependence on the blocked area (1403) and the average blocker depth.

8. A computer implemented method (800) for generating a shadow effect, the method (800) comprising: obtaining an input scene (801); generating a shadow map image from the input scene (802); generating one or more shadow map downsampled images by downsampling the shadow map image (803); and generating an output shadow effect image in dependence on two or more images selected from (a) the shadow map image and (b) the one or more shadow map downsampled images (804).

9. An image processing apparatus (900) for generating a shadow effect, the apparatus (900) comprising one or more processors (901) and a memory (902) storing in non-transient form data defining program code executable by the one or more processors (901), wherein the program code, when executed by the one or more processors (901), causes the image processing apparatus (900) to: obtain an input scene (701); generate a shadow map image (702) from the input scene (701); generate a filtered shadow map image (1001) from the shadow map image (702) by means of a basis-spline filter algorithm (1101); and generate an output shadow effect image (705) in dependence on the filtered shadow map image (1001).

10. The apparatus (900) according to claim 9, wherein the apparatus (900) is configured to filter the shadow map image (702) by means of a 4-tap bicubic basis-spline filter algorithm (1101).

11. The apparatus (900) according to claim 9 or claim 10, wherein the apparatus (900) is configured to generate the output shadow effect image (705) by estimating the probability of a pixel being blocked in a neighbouring area from the filtered shadow map image (1001).

12. A computer implemented method (1200) for generating a shadow effect, the method (1200) comprising: obtaining an input scene (1201); generating a shadow map image from the input scene (1202); generating a filtered shadow map image from the shadow map image by means of a basis-spline filter algorithm (1203); and generating an output shadow effect image in dependence on the filtered shadow map image (1204).

13. An image processing apparatus (900) for generating a shadow effect, the apparatus (900) comprising one or more processors (901) and a memory (902) storing in non-transient form data defining program code executable by the one or more processors (901), wherein the program code, when executed by the one or more processors (901), causes the image processing apparatus (900) to: obtain an input scene (701); generate a shadow map image (702) from the input scene (701); generate one or more shadow map downsampled images (703, 704) by downsampling the shadow map image (702); generate a filtered shadow map image (1001) from the shadow map image (702) by means of a basis-spline filter algorithm (1101); generate a filtered shadow map downsampled image (1301, 1302) from each of the one or more shadow map downsampled images (703, 704) by means of a basis-spline filter algorithm (1101); and generate an output shadow effect image (705) in dependence on two or more images selected from (a) the filtered shadow map image (1001) and (b) the one or more filtered shadow map downsampled images (1301, 1302).

14. A computer implemented method (1500) for generating a shadow effect, the method (1500) comprising: obtaining an input scene (1501); generating a shadow map image from the input scene (1502); generating one or more shadow map downsampled images by downsampling the shadow map image (1503); generating a filtered shadow map image from the shadow map image by means of a basis-spline filter algorithm (1504); generating a filtered shadow map downsampled image from each of the one or more shadow map downsampled images by means of a basis-spline filter algorithm (1505); and generating an output shadow effect image in dependence on two or more images selected from (a) the filtered shadow map image and (b) the one or more filtered shadow map downsampled images (1506).

15. A computing apparatus (1600) configured to carry out the method of any of claims 8, 12 and 14.

16. The computing apparatus (1600) according to claim 15, wherein the computing apparatus comprises a graphical processing unit (1601) and a main memory (1606), the computing apparatus (1600) being configured to: on the graphical processing unit (1601): obtain the input scene (701); generate the shadow map image (702); generate the filtered (1001, 1301, 1302) and/or the downsampled (703, 704, 1301, 1302) shadow map image(s); and generate the output downsampled image (705); and on the main memory (1606): obtain the output downsampled image (705) from the graphical processing unit (1601); and output the output downsampled image (705).

17. The computing apparatus (1600) according to claim 15 or claim 16, wherein the graphical processing unit (1601) comprises: a renderpass (1602) for obtaining the input scene (701) and generating the shadow map image (702); a first subpass (1604) for generating the filtered (1001, 1301, 1302) and/or the downsampled (703, 704, 1301, 1302) shadow map image(s); and a second subpass (1605) for generating the output downsampled image (705).

18. A mobile communication device comprising the computing apparatus (1600) of any of claims 15 to 17.

19. A computer program comprising executable instructions which, when executed by a computer, cause the computer to carry out the method of any of claims 8, 12 and 14.

Description:
A SOFT SHADOW ALGORITHM WITH CONTACT HARDENING EFFECT FOR MOBILE GPU

FIELD OF THE INVENTION

This invention relates to image processing apparatuses and methods for generating a shadow effect, for example, for video games.

BACKGROUND

Shadows may be a result of the absence of light due to occlusion. When a light source's light rays do not hit an object because it is occluded by some other object, the object is in shadow. Shadows may add a great deal of realism to a lit scene and make it easier for a viewer to observe spatial relationships between objects. Shadows may also give a greater sense of depth to a scene and its objects. Rendering shadows may be very important for video games. This is because shadows may not only increase the realism of a rendered image, but may also indicate inter-object distances and positions for the viewer.

One of the fundamental problems in computer graphics is generating accurate soft shadows from area light sources. Soft shadows may provide valuable cues about the relationships between objects. The shadows may become sharper as objects contact, or are close to, each other, and may become blurrier, or softer, the further apart they are.

Shadow rendering may be a fundamental and important feature for many applications. Applications, such as video games, may require the shadow rendering to be very efficient. Preferably, the rendering is less than 10ms per frame.

However, rendering high quality soft shadows using filtering algorithms, such as using percentage closer filtering (PCF), which involve a large kernel size with a large number of texture sampling taps may be very expensive and demanding, especially for modern mobile GPUs.

A first option for rendering shadow images is shadow mapping and percentage closer filtering (PCF).

Figure 1 schematically illustrates an exemplary depth map 100 for a shadow. The light source 101 may be blocked by the occluder 102. The shadow 103 may be formed behind the occluder 102. The occluder 102 may be Z distance 104 from the light source 101. The shadow 103 may be d distance 105 from the light source 101.

The idea behind shadow mapping is to render the scene from the light's 101 point of view: everything that can be seen from the light's 101 perspective is lit, and everything that cannot be seen is in the shadow 103. Shadow mapping may therefore consist of two passes. In the first pass, the depth map 100 is rendered. In the second pass, the scene is rendered as normal and the generated depth map is used to calculate whether fragments are in shadow.

Because the depth map 100 may have a fixed resolution, a single texel may frequently span more than one fragment. As a result, multiple fragments sample the same depth value from the depth map and come to the same shadow conclusion. This may produce jagged, blocky shadow edges.

Based on shadow mapping [Williams 1978], percentage closer filtering (PCF) [Reeves et al. 1987] attempts to solve these jagged edges by using filtering functions that produce softer shadows, making them appear less blocky or hard. The idea is to sample more than once from the depth map, each time with slightly different texture coordinates. For each individual sample it may be checked whether it is in shadow or not. All the sub-results may then be combined and averaged to get a soft-looking shadow.

Figure 2 schematically illustrates an exemplary NxN texel region of a depth map. One exemplary implementation of PCF may be to simply sample 201 the surrounding NxN texels of the depth map 200 and average the comparison results, as shown in Figure 2.
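The NxN sampling-and-averaging scheme above can be sketched as follows. This is a minimal CPU-side illustration rather than GPU shader code; the `pcf_shadow` name, the kernel size and the depth bias are illustrative assumptions:

```python
def pcf_shadow(depth_map, x, y, receiver_depth, n=3, bias=1e-3):
    """Percentage closer filtering: sample the surrounding n x n texels
    of the depth map, compare each against the receiver depth, and
    average the binary results to obtain a soft visibility factor."""
    h, w = len(depth_map), len(depth_map[0])
    half = n // 2
    lit = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            # Clamp sample coordinates to the map borders.
            sx = min(max(x + dx, 0), w - 1)
            sy = min(max(y + dy, 0), h - 1)
            # The texel is lit if the stored occluder depth is not
            # closer to the light than the receiver (plus a small bias).
            if depth_map[sy][sx] + bias >= receiver_depth:
                lit += 1
    return lit / (n * n)
```

A pixel near a shadow edge then receives a fractional visibility (e.g. 6/9 for a 3x3 kernel straddling the boundary), which is what produces the softened edge.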

Figures 3A and 3B illustrate exemplary shadow effect images. Figure 3A illustrates an exemplary hard shadow image 300. Figure 3B illustrates an exemplary soft shadow image 302. Many modern games may be developed using the Unreal and Unity engines, often employing a 3x3 bilinear PCF filter. This may fetch 9 texture samples in a 3x3 grid of the neighborhood area via 9 texture fetching instructions, and then perform a 3x3 triangle-shaped bilinear weighted calculation on the 9 samples in a fragment shader in order to get a smoothly filtered shadow. The shadow 301 in the hard shadow image is shown in Figure 3A. In comparison, the shadow 303 in Figure 3B, which uses PCF filtering, has smoother edges.

Although the quality of PCF may be very good, achieving such high quality may require many samples. As with standard texture filtering, surfaces at shallow angles may require huge anisotropic filter regions. In the case of PCF, it may not be possible to use prefiltered mipmaps to accelerate the process, because of the per-sample depth comparison. Consequently, in the worst case, sampling and comparison of every individual texel in the shadow map may be required in order to compute the light attenuation for a single frame-buffer pixel. As expected, this process may be slow. The situation may deteriorate when PCF is used to achieve edge softening, because this approach may be considered equivalent to placing a lower bound on the size of the filter region and, consequently, on the cost of shading a pixel.

Another disadvantage of PCF is that it may only support fixed-size penumbra. This may be due to the uniform filtering kernel used for all pixels, which cannot produce a contact hardening effect.

A second option for rendering shadow images is percentage closer soft shadows (PCSS).

While the original PCF method may only deal with a uniform filtering kernel producing shadows with a fixed-size penumbra, a variant called PCSS has been proposed for rendering soft shadows with a variable-size penumbra. This is also known as contact hardening shadows. Contact hardening shadows are a graphics effect that may more accurately simulate how shadows behave in real life: a shadow is sharper when the object casting it is close to the surface receiving it, and becomes blurrier as that object gets further away.

Figures 4A, 4B and 4C illustrate exemplary shadow effect images. Figure 4A illustrates an exemplary shadow effect image 400 generated with a single-tap percentage closer filtering (PCF) kernel. Figure 4B illustrates an exemplary shadow effect image 401 generated with a 9x9-tap PCF kernel. Figure 4C illustrates an exemplary shadow effect image 402 generated with a percentage closer soft shadow (PCSS) kernel, which is shown to produce a blurrier shadow than the PCF kernels.

The PCSS method may comprise the following three key steps.

Firstly, the average blocker depth for the current pixel is computed by averaging all depth values within an initial filter kernel that are smaller than the current pixel's depth.

Secondly, the average blocker depth is used to compute the penumbra size. Note that PCSS assumes that the blockers/receivers are all planar and parallel. Using this assumption, the penumbra may be easily computed based on similar triangles, as shown in Figure 5. Figure 5 schematically illustrates an exemplary depth map for a shadow. The light source 501 may be blocked by the blocker 502. The shadow 503 may be formed behind the blocker 502.

Thirdly, a loop over all sampling points in the penumbra 504 is performed using PCF to do shadow comparisons and to sum up the visibility to get soft shadows. Since the PCSS implementation may only involve shader modification, it may be easy to integrate into existing rendering systems.
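The first two steps can be sketched as follows, under PCSS's planar, parallel blocker/receiver assumption. The similar-triangles relation used here, penumbra = (d_receiver − d_blocker) · w_light / d_blocker, is the standard PCSS estimate; the function name and the flat list of kernel samples are simplifications:

```python
def estimate_penumbra(samples, receiver_depth, light_size):
    """PCSS steps 1-2: average the depths of the samples that block the
    receiver, then size the penumbra by similar triangles."""
    blockers = [d for d in samples if d < receiver_depth]
    if not blockers:
        return 0.0  # fully lit: no penumbra to filter
    avg_blocker = sum(blockers) / len(blockers)
    # Similar triangles: the penumbra grows with the blocker-receiver
    # gap and with the area light size (contact hardening).
    return (receiver_depth - avg_blocker) * light_size / avg_blocker
```

Note the contact hardening behaviour: a blocker close to the receiver yields a small penumbra (sharp shadow), while a distant blocker yields a large one.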

However, while PCSS may achieve visually plausible quality and real-time performance, it may do so only for very small area light sources. For large area light sources, PCSS may suffer in performance due to the large PCF filtering kernels required.

A key limitation of PCSS is that its steps are based on brute-force point sampling of the depth map. When the area light size becomes large, many sampling points (e.g., 30x30) are required to avoid banding artifacts.

Increasing the kernel size of PCF may increase the softness of the shadows but, for large kernels, many texture sampling operations are involved. As a result, it may be costly in memory bandwidth and in rendering performance.

A third option for rendering shadow images is summed-area table (SAT) based variance shadow maps.

Based on PCSS, more efficient prefiltering-based methods have been proposed, such as the SAT-based variance shadow map (SAVSM). This method may support pre-filtering based on a one-tailed version of Chebyshev's inequality and requires a much lower amount of texture memory.

The algorithm is similar to the algorithm for standard shadow maps, except that instead of simply writing depth to the shadow map, it may write depth and depth squared to a two-component variance shadow map. Filtering over some region recovers the first two moments M1 and M2 of the depth distribution in that region. From these, the mean and variance of the distribution can be calculated. Using the variance, Chebyshev's inequality can be applied, as shown in Equation 1 and Equation 2, to compute an upper bound on the probability that the currently shaded surface (at depth t) is occluded.
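Equations 1 and 2 are not reproduced in this excerpt, so the following sketch uses the standard one-tailed Chebyshev bound from the variance shadow map literature, p_max(t) = σ² / (σ² + (t − μ)²) with μ = M1 and σ² = M2 − M1²; the variance clamp is an illustrative safeguard against numerical issues:

```python
def chebyshev_visibility(m1, m2, t):
    """Variance shadow maps: from the filtered moments M1 = E[z] and
    M2 = E[z^2], bound the fraction of the filter region NOT occluding
    depth t via the one-tailed Chebyshev inequality."""
    if t <= m1:
        return 1.0  # receiver lies in front of the mean occluder depth
    mean = m1
    variance = max(m2 - m1 * m1, 1e-6)  # clamp to avoid division issues
    d = t - mean
    return variance / (variance + d * d)
```

Because only two moments need to be filtered, standard prefiltering (mipmaps, separable blurs, or a SAT) can be applied to the moment textures, unlike the per-sample comparison of PCF.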

In particular, for a single occluder and a single receiver, there may be a small neighborhood in which they are locally planar, and thus the probability computed over this region may be a good approximation of the true probability. As shown in Figure 4, PCSS may involve large filtering kernels for pixels with large penumbrae, which could be very costly. However, this algorithm, as shown in Equation 3, needs a compute shader to calculate the SAT, which can itself be very costly due to the integration over a large image region for every single pixel.
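The SAT itself can be sketched as follows. This is a minimal CPU-side illustration of table generation and a constant-time box average; on a GPU, the build step corresponds to the costly compute-shader pass described above:

```python
def build_sat(img):
    """Summed-area table: sat[y][x] holds the sum of all pixels in the
    rectangle from (0, 0) to (x-1, y-1), built in a single pass."""
    h, w = len(img), len(img[0])
    sat = [[0.0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            # Inclusion-exclusion over the three previously filled cells.
            sat[y + 1][x + 1] = (img[y][x] + sat[y][x + 1]
                                 + sat[y + 1][x] - sat[y][x])
    return sat

def box_mean(sat, x0, y0, x1, y1):
    """Mean over the inclusive rectangle [x0, x1] x [y0, y1] using four
    table lookups, independent of the kernel size."""
    total = (sat[y1 + 1][x1 + 1] - sat[y0][x1 + 1]
             - sat[y1 + 1][x0] + sat[y0][x0])
    return total / ((x1 - x0 + 1) * (y1 - y0 + 1))
```

The four-lookup query is what makes arbitrary-size box filtering cheap at shading time; the precision issue mentioned below arises because the table entries grow with the accumulated sum.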

In addition, it is well known that SAT may suffer from numerical precision loss for large filter kernels. In particular, a 32-bit integer format is used for SAT generation to achieve stable shadow quality. However, in contact shadow areas, where the blocker and receiver are placed closely, the precision of an integer SAT may still not be enough and may introduce small errors.

Furthermore, the SAT is a fast box filter, which uses the same weights for all texels within the kernel. This can produce blocky artifacts in soft shadows.

In summary, the defects of the prior art mainly include:

• A large number of sampling operations which consume a large amount of GPU memory bandwidth and may increase the rendering latency. This may be found in PCF and PCSS based algorithms for larger filtering kernels.

• The issues of numerical precision and the costly compute shader for SAT generation, found in SAT-based variance shadow map algorithms.

• Prior art systems often use a box filter, which has the same weights for all texels within the kernel; this may produce undesirable blocky artifacts in soft shadows.

It is desirable to develop an apparatus and method that overcomes the above problems.

SUMMARY

According to a first aspect, there is provided an image processing apparatus for generating a shadow effect, the apparatus comprising one or more processors and a memory storing in non-transient form data defining program code executable by the one or more processors, wherein the program code, when executed by the one or more processors, causes the image processing apparatus to: obtain an input scene; generate a shadow map image from the input scene; generate one or more shadow map downsampled images by downsampling the shadow map image; and generate an output shadow effect image in dependence on two or more images selected from (a) the shadow map image and (b) the one or more shadow map downsampled images.

By generating the output shadow effect image in dependence on two or more images selected from (a) the shadow map image and (b) the one or more shadow map downsampled images, this may enable the shadow in the output shadow effect image to be a combination of the shadows in the two or more images selected from (a) the shadow map image and (b) the one or more shadow map downsampled images. In this way, the shadow in the output shadow effect image may be more realistic.

In some implementations, the apparatus may be configured to generate the output shadow effect image by sampling the two or more images selected from (a) the shadow map image and (b) the one or more shadow map downsampled images.

By generating the output shadow effect image by sampling the two or more images, this may enable the apparatus to check whether the pixel is in the shadow of the object.

In some implementations, the apparatus may be configured to generate the output shadow effect image by performing a linear interpolation between the samples of the two or more images selected from (a) the shadow map image and (b) the one or more shadow map downsampled images.

By generating the output shadow effect image by performing a linear interpolation between the samples of the two or more images, this enables the generation of the output shadow effect image to be based on a combination, or average, of the two or more images.
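One way the interpolation might look is sketched below. The source does not specify how the blend weight between levels is derived, so mapping it linearly from an estimated penumbra size (and the `max_penumbra` normalisation) is purely an illustrative assumption:

```python
def blend_shadow_levels(sample_fine, sample_coarse, penumbra, max_penumbra):
    """Linearly interpolate between a sample from the full-resolution
    shadow map and one from a downsampled level: a small penumbra
    favours the sharp level, a large penumbra the blurry one."""
    # Clamp the blend factor to [0, 1].
    t = min(max(penumbra / max_penumbra, 0.0), 1.0)
    return (1.0 - t) * sample_fine + t * sample_coarse
```

Varying the blend factor per pixel is what lets a single output image combine sharp contact shadows with soft distant ones.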

In some implementations, the apparatus may be configured to generate a further shadow map downsampled image by downsampling the shadow map downsampled image. In some implementations, the apparatus may be configured to generate the output shadow effect image in dependence on the shadow map downsampled image and the further shadow map downsampled image.

In some implementations, the apparatus may be configured to generate the one or more shadow map downsampled images as at least part of a mipmap.

By generating a further shadow map downsampled image, this may enable the apparatus to provide a further level of downsampling, which may provide an increased range of levels to be combined to form a softer shadow.
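The chain of downsampled images can be sketched as follows. This is a minimal illustration that assumes simple 2x2 averaging and even image dimensions; the actual downsampling filter used is not specified in this excerpt:

```python
def downsample_2x2(img):
    """One mipmap level: average each 2x2 block of the parent image
    (assumes even dimensions for brevity)."""
    h, w = len(img), len(img[0])
    return [[(img[2 * y][2 * x] + img[2 * y][2 * x + 1]
              + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(w // 2)]
            for y in range(h // 2)]

def build_mip_chain(img, levels):
    """The shadow map plus successively downsampled images, i.e. at
    least part of a mipmap, with each level half the size of the last."""
    chain = [img]
    for _ in range(levels):
        chain.append(downsample_2x2(chain[-1]))
    return chain
```

Each additional level is a blurrier, lower-resolution version of the shadow map, widening the range of shadow softness available for combination.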

In some implementations, the apparatus may be configured to generate the shadow map image from the input scene, the input scene comprising one or more objects and one or more light sources, by calculating a blocked area of the input scene in which the one or more light sources are blocked by the one or more objects; estimating an average blocker depth; and calculate a penumbra size in dependence on the blocked area and the average blocker depth.

By generating the shadow map image through calculating the blocked area, estimating the blocker depth, and calculating the penumbra size, this may enable the size and location of the shadow to be determined.

According to a second aspect, there is provided a computer implemented method for generating a shadow effect, the method comprising: obtaining an input scene; generating a shadow map image from the input scene; generating one or more shadow map downsampled images by downsampling the shadow map image; and generating an output shadow effect image in dependence on two or more images selected from (a) the shadow map image and (b) the one or more shadow map downsampled images.

By generating the output shadow effect image in dependence on two or more images selected from (a) the shadow map image and (b) the one or more shadow map downsampled images, this may enable the shadow in the output shadow effect image to be a combination of the shadows in the two or more images selected from (a) the shadow map image and (b) the one or more shadow map downsampled images. In this way, the shadow in the output shadow effect image may be more realistic.

According to a third aspect, there is provided an image processing apparatus for generating a shadow effect, the apparatus comprising one or more processors and a memory storing in non-transient form data defining program code executable by the one or more processors, wherein the program code, when executed by the one or more processors, causes the image processing apparatus to: obtain an input scene; generate a shadow map image from the input scene; generate a filtered shadow map image from the shadow map image by means of a basis-spline filter algorithm; and generate an output shadow effect image in dependence on the filtered shadow map image.

By generating an output shadow effect image in dependence on the filtered shadow map image, this may reduce the number of sampling points. In this way, this may reduce the computational load reducing the power consumption and rendering latency.

In some implementations, the apparatus may be configured to filter the shadow map image by means of a 4-tap bicubic basis-spline filter algorithm.

By using a 4-tap bicubic basis-spline filter algorithm, this may provide a realistic shadow effect while also reducing the number of sampling points.
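A sketch of the cubic B-spline weights, together with the standard trick of folding pairs of weights into single hardware bilinear fetches (which is how a 16-texel bicubic filter can be reduced to 4 taps per pixel), is given below; the helper names are illustrative:

```python
def bspline_weights(f):
    """Cubic B-spline weights for fractional texel position f in [0, 1).
    The four weights always sum to one."""
    f2, f3 = f * f, f * f * f
    w0 = (1.0 - 3.0 * f + 3.0 * f2 - f3) / 6.0
    w1 = (4.0 - 6.0 * f2 + 3.0 * f3) / 6.0
    w2 = (1.0 + 3.0 * f + 3.0 * f2 - 3.0 * f3) / 6.0
    w3 = f3 / 6.0
    return w0, w1, w2, w3

def bilinear_tap(w_left, w_right):
    """Fold two neighbouring weights into one bilinear fetch: returns
    the combined weight and the fractional offset at which to sample,
    so the GPU's bilinear unit applies the weight ratio for free."""
    g = w_left + w_right
    return g, w_right / g
```

Applying the fold per axis gives 2 taps horizontally and 2 vertically, i.e. 4 bilinear texture fetches in 2D instead of 16 nearest-neighbour fetches, which is the bandwidth saving the 4-tap filter relies on.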

In some implementations, the apparatus may be configured to generate the output shadow effect image by estimating the probability of a pixel being blocked in a neighbouring area from the filtered shadow map image.

By generating the output shadow effect image by estimating the probability of a pixel being blocked in a neighbouring area, this can enable the size and location of the shadow to be estimated.

According to a fourth aspect, there is provided a computer implemented method for generating a shadow effect, the method comprising: obtaining an input scene; generating a shadow map image from the input scene; generating a filtered shadow map image from the shadow map image by means of a basis-spline filter algorithm; and generating an output shadow effect image in dependence on the filtered shadow map image.

By generating an output shadow effect image in dependence on the filtered shadow map image, this may reduce the number of sampling points. In this way, this may reduce the computational load reducing the power consumption and rendering latency.

According to a fifth aspect, there is provided an image processing apparatus for generating a shadow effect, the apparatus comprising one or more processors and a memory storing in non-transient form data defining program code executable by the one or more processors, wherein the program code, when executed by the one or more processors, causes the image processing apparatus to: obtain an input scene; generate a shadow map image from the input scene; generate one or more shadow map downsampled images by downsampling the shadow map image; generate a filtered shadow map image from the shadow map image by means of a basis-spline filter algorithm; generate a filtered shadow map downsampled image from each of the one or more shadow map downsampled images by means of a basis-spline filter algorithm; and generate an output shadow effect image in dependence on two or more images selected from (a) the filtered shadow map image and (b) the one or more filtered shadow map downsampled images.

By combining the use of downsampling with the basis-spline filter algorithm, this may enable the output shadow effect image to be more realistic while also reducing the number of sampling points, and therefore the computational load.

According to a sixth aspect, there is provided a computer implemented method for generating a shadow effect, the method comprising: obtaining an input scene; generating a shadow map image from the input scene; generating one or more shadow map downsampled images by downsampling the shadow map image; generating a filtered shadow map image from the shadow map image by means of a basis-spline filter algorithm; generating a filtered shadow map downsampled image from each of the one or more shadow map downsampled images by means of a basis-spline filter algorithm; and generating an output shadow effect image in dependence on two or more images selected from (a) the filtered shadow map image and (b) the one or more filtered shadow map downsampled images.

By combining the use of downsampling with the basis-spline filter algorithm, this may enable the output shadow effect image to be more realistic while also reducing the number of sampling points, and therefore the computational load.

According to a seventh aspect, there is provided a computing apparatus configured to carry out the method as described herein.

In some implementations, the computing apparatus may comprise a graphical processing unit and a main memory, the computing apparatus being configured to: on the graphical processing unit: obtain the input scene; generate the shadow map image; generate the filtered and/or the downsampled shadow map image(s); and generate the output downsampled image; and on the main memory: obtain the output downsampled image from the graphical processing unit; and output the output downsampled image.

In some implementations, the computing apparatus may be configured wherein the graphical processing unit comprises: a renderpass for obtaining the input scene and generating the shadow map image; a first subpass for generating the filtered and/or the downsampled shadow map image(s); and a second subpass for generating the output downsampled image.

By configuring the computing apparatus to carry out the bulk of the processing in the GPU, instead of the main memory, this may reduce the computational loading on the main memory of the computing apparatus.

According to an eighth aspect, there is provided a mobile communication device comprising the computing apparatus as described herein.

According to a ninth aspect, there is provided a computer program comprising executable instructions which, when executed by a computer, cause the computer to carry out the method as described herein.

BRIEF DESCRIPTION OF THE FIGURES

The present invention will now be described by way of example with reference to the accompanying drawings. In the drawings:

Figure 1 schematically illustrates an exemplary depth map for a shadow.

Figure 2 schematically illustrates an exemplary NxN texel of a depth map.

Figures 3A and 3B illustrate exemplary shadow effect images. Figure 3A illustrates an exemplary hard shadow image. Figure 3B illustrates an exemplary soft shadow image.

Figures 4A, 4B and 4C illustrate exemplary shadow effect images. Figure 4A illustrates an exemplary shadow effect image generated with a 1 percentage closer filtering (PCF) tap kernel. Figure 4B illustrates an exemplary shadow effect image generated with a 9x9 PCF taps kernel. Figure 4C illustrates an exemplary shadow effect image generated with a percentage closer soft shadow (PCSS) kernel.

Figure 5 schematically illustrates an exemplary depth map for a shadow.

Figure 6 schematically illustrates the stages of a first exemplary embodiment of an image processing apparatus for generating a shadow effect.

Figure 7 shows an example of a first computer implemented method for generating a shadow effect.

Figure 8 shows an example of an apparatus configured to perform the methods described herein.

Figure 9 schematically illustrates the stages of a second exemplary embodiment of an image processing apparatus for generating a shadow effect.

Figure 10A graphically illustrates an exemplary comparison between a Gaussian filter kernel and a bicubic B-spline filter kernel. Figure 10B graphically illustrates an exemplary weighted sum distribution for a cubic B-spline filtering algorithm. Figure 10C schematically illustrates an exemplary texel-area.

Figure 11 shows an example of a second computer implemented method for generating a shadow effect.

Figure 12 schematically illustrates the stages of a third exemplary embodiment of an image processing apparatus for generating a shadow effect.

Figure 13 schematically illustrates an exemplary depth map for a shadow.

Figure 14 shows an example of a third computer implemented method for generating a shadow effect.

Figure 15 schematically illustrates an exemplary computing apparatus configured to perform the methods described herein.

Figures 16A and 16B illustrate an output image of the apparatus for generating a shadow effect.

Figures 17A and 17B illustrate an output image of the apparatus for generating a shadow effect compared to the prior art. Figure 17A illustrates the output image of the prior art. Figure 17B illustrates the output image of the apparatus for generating a shadow effect.

Figures 18A and 18B illustrate an output image of the apparatus for generating a shadow effect compared to the prior art. Figure 18A illustrates the output image of the prior art. Figure 18B illustrates the output image of the apparatus for generating a shadow effect.

Figures 19A and 19B illustrate an output image of the apparatus for generating a shadow effect compared to the prior art. Figure 19A illustrates the output image of the prior art. Figure 19B illustrates the output image of the apparatus for generating a shadow effect.

Figures 20A and 20B illustrate an output image of the apparatus for generating a shadow effect compared to the prior art. Figure 20A illustrates the output image of the prior art. Figure 20B illustrates the output image of the apparatus for generating a shadow effect.

Figure 21A illustrates a PCF shadow effect transition area. Figure 21B illustrates a PCT shadow effect transition area.

DETAILED DESCRIPTION

The apparatuses and methods described herein concern generating a shadow effect on an input scene.

Embodiments of the present invention may tackle one or more of the problems previously mentioned by generating an output shadow effect image in dependence on two or more images selected from (a) the shadow map image and (b) the one or more shadow map downsampled images. This may enable the shadow in the output shadow effect image to be a combination of the shadows in the two or more images selected from (a) the shadow map image and (b) the one or more shadow map downsampled images. In this way, the shadow in the output shadow effect image may be more realistic.

Additionally, embodiments of the present invention may tackle one or more of the previously mentioned problems by generating an output shadow effect image in dependence on the filtered shadow map image. This may reduce the number of sampling points. In this way, the computational load may be reduced, lowering the power consumption and rendering latency.

Motivated by the problems found in the prior art, the aim of the apparatus may be to introduce a new soft shadow algorithm to enable real-time, high-quality soft shadow rendering with low-memory cost for mobile GPUs.

Optimization objectives may include reducing the number of sampling points for soft shadow filtering by using an easy-to-compute filtering kernel which has a similar kernel shape to a Gaussian blur filter kernel.

It has been found that the shapes of a Gaussian blur filter kernel and a cubic B-spline filter kernel may be similar. However, the B-spline filter algorithm may be easier to calculate than a Gaussian blur algorithm. In this way, the apparatus may take advantage of the GPU feature of hardware bilinear texture filter and use a cubic B-spline kernel to replace the Gaussian blur filter kernel to obtain a very similar blurring result, which may be much smoother than a box filtering.

Accordingly, the new algorithms proposed herein use this cubic B-spline filter kernel to replace all the previous filter kernels involved in the shading passes of the soft shadow algorithm.

Figure 6 schematically illustrates the stages 700 of a first exemplary embodiment of an image processing apparatus 900 for generating a shadow effect.

The aim of the apparatus 900 may be to render the scene from the light's point of view and everything that can be seen from the light's perspective will be lit and everything else that cannot be seen must be in shadow.

The apparatus 900 may be configured to obtain an input scene 701. In other words, the apparatus 900 may be configured to generate or receive the input scene.

The input scene 701 may be an input image. The input scene 701 may comprise one or more objects. The input scene 701 may comprise one or more light sources. The apparatus 900 may be configured to generate, or render, a shadow which is formed on the other side of the object to that of the light source.

The input scene 701 may comprise 3D mesh scene representation. The 3D mesh scene representation may comprise light source information. For example, the light source information may comprise the location and orientation of the light source relative to the object. In this way, there may be sufficient information for the location and orientation of a shadow to be generated.

The apparatus 900 may be configured to generate a shadow map image 702 from the input scene 701. In other words, the scene representation, including the light source information, in the input scene 701 may be used to generate the shadow map image 702.

The apparatus 900 may be configured to generate the shadow map image 702 from the input scene 701 by writing depth and depth squared, instead of simply depth, to a two-component variance shadow map, which has two channels. In particular, a Vulkan format of VK_FORMAT_R16G16_UNORM may be used.
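As an illustrative sketch (pure Python; the function name and the list-of-rows layout are assumptions for illustration, not from the source), the two-channel map can be built as:

```python
def variance_shadow_map(depth_rows):
    """Store depth and depth squared per texel, mirroring the two-channel
    variance shadow map (e.g. VK_FORMAT_R16G16_UNORM) described above.

    `depth_rows` is assumed to be a list of rows of normalized depths in [0, 1].
    """
    return [[(d, d * d) for d in row] for row in depth_rows]
```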

Generation of the shadow map image 702 is shown in more detail in Figure 13. Figure 13 schematically illustrates an exemplary depth map for a shadow.

To generate the shadow map image 702, the apparatus 900 may be configured to calculate a blocked area 1403 of the input scene 701. In the blocked area 1403, the one or more light sources 1401 may be blocked by the one or more objects 1402. The size of the blocked area 1403 may depend on the size of the light source 1401, the receiver's 1403 distance from the light source 1401, and the distance of the blocker 1402 between the light source 1401 and the blocked area 1403. The blocked area 1403 may be calculated by intersecting the shadow map plane with the frustum formed by P 1403 and the light source 1401.

To generate the shadow map image 702, the apparatus 900 may be configured to estimate the average blocker depth. According to the blocked area 1403, the average depth value may be sampled from a mipmapped texture. This may be carried out using a customized higher order filtering algorithm. The average blocker depth may be estimated using a heuristic formula.

To generate the shadow map image 702, the apparatus 900 may be configured to calculate a penumbra size 1404 in dependence on the blocked area 1403 and the average blocker depth. After generating the average blocker depth, the penumbra size may be estimated based on the light source 1401 size and blocker/receiver distances from the light source 1401. This may be done using similar triangles as shown in Figure 13 and Equation 4.
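Equation 4 itself is not reproduced above; a hypothetical sketch assuming the standard similar-triangles penumbra estimate (function and parameter names are illustrative):

```python
def penumbra_size(light_size, d_receiver, d_blocker_avg):
    """Estimate the penumbra width by similar triangles between the light
    source, the average blocker depth and the receiver, i.e.
    w_penumbra = (d_receiver - d_blocker) * light_size / d_blocker
    (a standard PCSS-style formula, assumed equivalent to Equation 4)."""
    return (d_receiver - d_blocker_avg) * light_size / d_blocker_avg
```

Under this formula, a larger light source or a larger blocker-to-receiver distance yields a wider penumbra, matching the behaviour described above.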

In some implementations, after generating the shadow map image 702, for the purpose of prefiltering, the apparatus 900 may be optimised by removing the costly SAT computation stage, which is found in the prior art. Instead, a mipmapping sum may be applied to each level, from the bottom level to the top, as shown in Figure 7, and described herein.

The apparatus 900 may be configured to generate one or more shadow map downsampled images 703, 704 by downsampling the shadow map image 702. The apparatus may be configured to generate one shadow map downsampled image 703 from the shadow map image 702. Alternatively, the apparatus may be configured to generate more than one, such as two, shadow map downsampled images 703, 704 from the shadow map image 702.

As shown in Figure 7, the apparatus 900 may be configured to generate a further shadow map downsampled image 704 from the shadow map downsampled image 703. In this way, two shadow map downsampled images 703, 704 may be generated, and the further shadow map downsampled image 704 comprises a further level of downsampling. This is shown in Figure 7 by the further shadow map downsampled image 704 comprising a smaller pixel area than the shadow map downsampled image 703. This step may be repeated to generate additional further shadow map downsampled images 704 with further levels of downsampling. In the example shown in Figure 7, at each downsampling pass, the output texture size decreases by a factor of two along both the width and height dimensions. The last downsampling pass has been found to generate the texture with the minimum size that may produce a realistic shadow.

The apparatus 900 may be configured to generate the one or more shadow map downsampled images 703, 704 as at least part of a mipmap. In other words, the different levels of downsampling in the shadow map image 702, the shadow map downsampled image 703 and the further shadow map downsampled image 704 may provide different mipmap levels. The mipmap levels to this shadow map texture may be used for the purpose of pre-filtering. It has been found that the four bottom pyramid levels of downsampling may provide a sufficient level in tested scenes.
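The iterative downsampling passes described above can be sketched in Python (pure Python; the list-of-rows layout and function names are illustrative assumptions):

```python
def downsample2x(img):
    """One downsampling pass: average each 2x2 texel block, halving both
    the width and the height (one further mipmap level).
    `img` is a list of rows; each texel is a tuple of channel values."""
    h, w = len(img) // 2, len(img[0]) // 2
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            block = [img[2 * y + dy][2 * x + dx] for dy in (0, 1) for dx in (0, 1)]
            row.append(tuple(sum(t[k] for t in block) / 4.0
                             for k in range(len(block[0]))))
        out.append(row)
    return out

def build_mips(shadow_map, levels=4):
    """Repeat the pass to build the mip pyramid; four levels were found
    sufficient in the tested scenes."""
    mips = [shadow_map]
    for _ in range(levels):
        mips.append(downsample2x(mips[-1]))
    return mips
```

Each upper-level texel is thus the average of the 2x2 texels below it, which provides the summing effect described for the dual-filtering-style passes.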

For each visible scene point P, based on the fixed filter kernel size wi (which is given by the user according to the light source size), a corresponding mip level can be calculated, for which the per-pixel projected area in the shadow map space is the same as the kernel size wi. This area can be the penumbra size for shadow filtering. The soft shadow value of this penumbra kernel may be evaluated by sampling into this level of the shadow map. The filtering may use a default bilinear, or trilinear, filtering. Alternatively, the filtering may use a customized high order filtering algorithm as described herein.
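The mapping from the kernel size wi to a mip level is not spelled out above; assuming the usual log2 relationship between texel footprint and mip level, a hypothetical sketch:

```python
import math

def mip_level_for_kernel(kernel_size_texels):
    """Select the (possibly fractional) mip level whose per-texel footprint
    in shadow map space matches the kernel width, assuming the standard
    log2 relationship (the texel footprint doubles per level)."""
    return max(0.0, math.log2(max(kernel_size_texels, 1.0)))
```

A fractional result is what later motivates interpolating between the two neighbouring integer levels.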

The apparatus 900 may be configured to generate an output shadow effect image 705. In particular, the apparatus 900 may be configured to generate the output shadow effect image 705 in dependence on two or more images selected from (a) the shadow map image 702 and (b) the one or more shadow map downsampled images 703, 704. In other words, two or more images selected from (a) the shadow map image 702 and (b) the one or more shadow map downsampled images 703, 704 may be used to generate the output shadow effect image 705.

The apparatus 900 may use two of (a) the shadow map image 702 and (b) the one or more shadow map downsampled images 703, 704 to generate the output shadow effect image 705. In other words, the apparatus 900 may use the shadow map image 702 and one shadow map downsampled image 703, 704 to generate the output shadow effect image 705. Alternatively, the apparatus 900 may use two of the shadow map downsampled images 703, 704 to generate the output shadow effect image 705. For example, the apparatus may use the shadow map downsampled image 703 and the further shadow map downsampled image 704 to generate the output shadow effect image 705. Preferably, the apparatus 900 may use two images which comprise different downsampling, or mipmap, levels. In this way, the generation of the output shadow effect image 705 may be based on a combination, or average, of the images of different downsampling, or mipmap, levels. This may enable the shadow to be more realistic.

Alternatively, the apparatus 900 may use more than two of (a) the shadow map image 702 and (b) the one or more shadow map downsampled images 703, 704 to generate the output shadow effect image 705. In this example, the generation of the output shadow effect image 705 may be based on a combination, or average, of the more than two images.

The apparatus 900 may be configured to sample the shadow map image 702 and the one or more shadow map downsampled images 703, 704.

The apparatus 900 may be configured to take the samples of the shadow map image 702 and the one or more shadow map downsampled images 703, 704 and perform a linear interpolation between the samples. The linear interpolation may be used to generate the output shadow effect image 705. By using a linear interpolation, the generation of the output shadow effect image 705 may be based on a combination, or average, of the samples of the two or more images.

In this way, each upper level pixel may provide the average value of the lower level pixels. In a similar way to a dual filtering algorithm, the series of downsampling iterative passes, as shown in Figure 7, may provide a summing effect for each lower level pixel.

Increasing the number of downsampling levels can increase the level of blur in the shadow of the output shadow effect image. However, increasing the number of downsampling levels can also increase the computational loading. A balance between realistic shadows and computational loading has been found at four downsampling levels.

The apparatus 900 may output a shadowing factor (within a floating-point range of [0,1]) calculated for each pixel. The shadowing factor may be used for the lighting calculation for each pixel’s final colour in the output shadow effect image 705.

The main parameters for the apparatus 900 may include: (i) the resolution of the shadow map buffer, which may be set as 1024x1024 with no MSAA for the PC platform, and 512x512 with 4x MSAA for the mobile platform; and (ii) the number of down-sampling passes used to generate the mipmap levels, which controls the scope and softness of the soft shadow; four passes have been found to be a good level for all tested scenes.

Figure 7 summarises an example of a method 800 for generating a shadow effect. At step 801, the method 800 comprises obtaining an input scene. At step 802, the method 800 comprises generating a shadow map image from the input scene. At step 803, the method 800 comprises generating one or more shadow map downsampled images by downsampling the shadow map image. At step 804, the method 800 comprises generating an output shadow effect image in dependence on two or more images selected from (a) the shadow map image and (b) the one or more shadow map downsampled images.

An example of an apparatus 900 configured to implement the methods 800, 1200 and 1500 is schematically illustrated in Figure 8. The apparatus 900 may be implemented on an electronic device, such as a laptop, tablet, smart phone or TV. In particular, the apparatus 900 may be implemented using Vulkan, OpenGL, or other Application Programming Interfaces (APIs) and may be used on both PC and mobile devices.

The apparatus 900 comprises a processor 901 configured to process the datasets in the manner described herein. For example, the processor 901 may be implemented as a computer program running on a programmable device such as a Central Processing Unit (CPU). The apparatus 900 comprises a memory 902 which is arranged to communicate with the processor 901. Memory 902 may be a non-volatile memory. The processor 901 may also comprise a cache (not shown in Figure 8), which may be used to temporarily store data from memory 902. The apparatus may comprise more than one processor and more than one memory. The memory may store data that is executable by the processor. The processor may be configured to operate in accordance with a computer program stored in non-transitory form on a machine-readable storage medium. The computer program may store instructions for causing the processor to perform its methods in the manner described herein.

Figure 9 schematically illustrates the stages 1000 of a second exemplary embodiment of an image processing apparatus 900 for generating a shadow effect.

The apparatus 900 of the second embodiment may be configured to obtain an input scene 701 as described herein with regards to the first exemplary embodiment. The apparatus 900 of the second embodiment may be configured to generate a shadow map image 702 from the input scene 701 as described herein with regards to the first exemplary embodiment.

The apparatus 900 of the second embodiment may be configured not to generate one or more shadow map downsampled images 703, 704. Instead, the apparatus 900 may be configured to generate a filtered shadow map image 1001 from the shadow map image 702. However, the features of the first embodiment and the second embodiment of the apparatus 900 may be used interchangeably.

The apparatus 900 may be configured to generate the filtered shadow map image 1001 by means of a bilinear or trilinear filter. Preferably, however, the apparatus 900 may be configured to generate the filtered shadow map image 1001 by means of a basis-spline filter algorithm 1101. In particular, the basis-spline filter algorithm 1101 may be a 4-tap bicubic basis-spline filter algorithm. The basis-spline filter algorithm 1101 may produce more realistic results with fewer blocky artifacts in the output shadow effect image 705. This is because the basis-spline filter algorithm 1101 may remove the square shaped pixel sums used in the bilinear or trilinear filters. The basis-spline filter algorithm 1101 may provide a higher order filtering which may overcome the blocky effects produced by bilinear or trilinear filters.

Figure 10A graphically illustrates an exemplary comparison 1100 between a Gaussian filter kernel 1102 and a bicubic B-spline filter kernel 1101. It has been found that the shapes of a Gaussian blur filter kernel 1102 and a cubic B-spline filter kernel 1101 are very similar. Additionally, both the Gaussian blur filter kernel 1102 and the cubic B-spline filter kernel 1101 may produce a smooth filtering result, as shown in Figure 10A. Figure 10A shows the similarity of the two filter kernels' shapes in 1D.

The B-spline filter algorithm 1101 may be much easier to calculate than a Gaussian blur algorithm 1102. In other words, the B-spline filter algorithm 1101 may require significantly less computing power than the Gaussian blur algorithm 1102. The apparatus 900 may take advantage of the GPU feature of hardware bilinear texture filtering and use a cubic B-spline kernel to replace the costly Gaussian blur filter kernel to obtain a similar blurring result at a lower cost.

The bicubic B-spline filter 1101 may be introduced into the shadow filtering algorithm of the apparatus 900. This may reduce the number of sampling points while at the same time producing a smooth transition from the hard to the soft shadow area.

Figure 10B graphically illustrates an exemplary weighted sum distribution 1103 for a cubic B-spline filtering algorithm 1104. A type of filter that is known in image processing is commonly known as the cubic B-spline filter. When the filter is applied in both x and y directions, it is known as a bicubic B-spline filter. A cubic B-spline is a cubic filter kernel with cubic polynomial weights, as shown in Figure 10B. Figure 10B also shows the weighted sum-based filtering algorithm.

In Figure 10B, the y value of the function is the relative weight to be assigned to the texels that are distant from the centre of a given texture sampling coordinate x. Texels more than two texels from that centre may be essentially ignored due to a zero weight value, whereas texels at the centre are given the highest weight. In particular, for this cubic B-spline filter, all weights are positive, i.e., greater than zero.

In 1D the cubic filtering algorithm can be expressed as in Equation 5.

f(x) = w0(x) * f(i-1) + w1(x) * f(i) + w2(x) * f(i+1) + w3(x) * f(i+2)    (5)

In Equation 5, f(i) denotes the indexed neighbouring texel values at 4 taps of integer sampling locations, which are multiplied by the corresponding cubic polynomial weights wi(x) from the convolution kernel. The weighted sum is the final result of the filtering.

The general 1D cubic interpolation is a method for estimating a specific function value f(x) at an arbitrary continuous sampling point x by calculating the weighted sum of 4 taps of known functional values f(i-1), f(i), f(i+1) and f(i+2) at 4 integer grid locations (from i-1 to i+2), where x = i + a, a ∈ [0, 1), i.e., 0 ≤ a < 1, with i ∈ ℤ, i and a being the integer and fractional parts of x, respectively.

The formula to calculate the B-spline weights using a third-order polynomial is as in Equation 6.

w0(a) = (1 - a)^3 / 6
w1(a) = (3a^3 - 6a^2 + 4) / 6
w2(a) = (-3a^3 + 3a^2 + 3a + 1) / 6
w3(a) = a^3 / 6    (6)

In Equation 6, the 4 weights are determined by the fractional amount a of the present sampling coordinate x. The bicubic filter is a 2D extension of the 1D cubic filtering for interpolating data points on a two-dimensional regular grid. The 2D interpolation function is a separable extension of the 1D interpolation function. The 2D interpolation can be accomplished by two 1D interpolations with respect to each coordinate direction. So, it may be possible to feed a into the above formula to get the filter weights along both the X and Y directions.
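A sketch in Python of the weight evaluation and the 1D filtering of Equation 5, assuming the standard uniform cubic B-spline basis polynomials (which Equation 6 is taken to state):

```python
import math

def bspline_weights(a):
    """Cubic B-spline weights for fractional offset a in [0, 1)
    (standard uniform cubic B-spline basis; all weights are positive
    and sum to 1)."""
    w0 = (1.0 - a) ** 3 / 6.0
    w1 = (3.0 * a**3 - 6.0 * a**2 + 4.0) / 6.0
    w2 = (-3.0 * a**3 + 3.0 * a**2 + 3.0 * a + 1.0) / 6.0
    w3 = a**3 / 6.0
    return w0, w1, w2, w3

def cubic_filter_1d(f, x):
    """Equation 5: weighted sum of the 4 neighbouring samples f(i-1)..f(i+2),
    where i and a are the integer and fractional parts of x."""
    i = int(math.floor(x))
    w0, w1, w2, w3 = bspline_weights(x - i)
    return w0 * f[i - 1] + w1 * f[i] + w2 * f[i + 1] + w3 * f[i + 2]
```

By separability, the same weights are applied once along X and once along Y to obtain the bicubic filter.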

Figure 10C schematically illustrates an exemplary texel-area 1105. This bicubic filter in 2D needs to filter a 4x4 texel-area with 16 texture fetches, and with third-order weighted filtering calculation along X and Y direction in a 4x4 local neighbourhood, which involves 32 multiplications for a single pixel shading, as shown in Figure 10C. The filter samples a 4x4 grid of texels surrounding the target UV coordinate 1106.

Fortunately, the method may be simplified by taking advantage of a GPU hardware bilinear sampling feature which needs only 4 taps of texture fetches to achieve the same filtering result for a 4x4 texel-area. This may effectively merge the sixteen sampling taps down to only four.

This is achieved by carefully tweaking the sampling coordinates. For 1D cubic filtering, in order to merge the 4 taps down to 2, the apparatus must combine the first pair of taps into a linear tap and the second pair into a further linear tap. Thus, rather than performing two texture lookups at f(i) and f(i+1) and then computing a linear combination a * f(i) + b * f(i+1) with general a and b, the apparatus can perform a single texture lookup at i + b/(a + b) and simply multiply the result by (a + b). In particular, a 2:1 reduction becomes a 4:1 reduction in 2D, thus effectively taking the 16 taps down to 4 in 2D. This is done through Equations 7 to 9.
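The merging identity can be checked numerically; here `lerp` stands in for the GPU's hardware linear texture fetch (the function names are illustrative, not from the source):

```python
def lerp(f0, f1, t):
    """Model of a hardware linear texture fetch between two texels."""
    return (1.0 - t) * f0 + t * f1

def merged_tap(f0, f1, a, b):
    """One fetch at fractional offset b/(a+b), scaled by (a+b), reproduces
    the two-tap weighted sum a*f0 + b*f1 (assumed to correspond to
    Equations 7 to 9)."""
    return (a + b) * lerp(f0, f1, b / (a + b))
```

Applied independently along each axis, this 2:1 reduction becomes the 4:1 reduction in 2D described above.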

Using this 4-tap bicubic B-spline kernel to replace the bilinear sampling may require only four taps of texture fetches in the pixel neighbourhood to approximate the weighted sum of the sampling points for the sampling passes. This can reduce the GPU memory bandwidth for texture sampling.

The apparatus 900 may be configured to generate an output shadow effect image 705. In particular, the apparatus 900 may be configured to generate the output shadow effect image 705 in dependence on the filtered shadow map image 1001. In other words, the filtered shadow map image 1001 may be used to generate the output shadow effect image 705.

By filtering over a fixed region, this may provide the first two moments M1 and M2 of the depth distribution in that region. From these, the mean and variance of the distribution can be computed through Equations 10 and 11. Using the variance, Chebyshev's inequality can be applied to compute an upper bound on the probability that the currently shaded surface is occluded, which may be used as the final shadow factor for the pixel shading.
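Equations 10 and 11 are not reproduced above; assuming the standard variance shadow mapping forms, mean = M1 and variance = M2 - M1^2, with Chebyshev's one-sided bound, a sketch (the `min_variance` floor is an illustrative numerical-stability detail, not from the source):

```python
def chebyshev_shadow_factor(m1, m2, t, min_variance=1e-6):
    """Mean and variance from the two moments (assumed Equations 10 and 11),
    then Chebyshev's one-sided inequality bounds the probability that a
    depth sample exceeds the shaded depth t; the bound is used as the
    shadow factor in [0, 1] for shading."""
    mean = m1
    variance = max(m2 - m1 * m1, min_variance)
    if t <= mean:
        return 1.0  # shaded point lies in front of the mean occluder depth
    d = t - mean
    return variance / (variance + d * d)
```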

In other words, the apparatus 900 may be configured to generate the output shadow effect image 705 by estimating the probability of a pixel being blocked in neighbouring area from the filtered shadow map image.

Figure 11 summarises an example of a method 1200 for generating a shadow effect. At step 1201, the method 1200 comprises obtaining an input scene. At step 1202, the method 1200 comprises generating a shadow map image from the input scene. At step 1203, the method 1200 comprises generating a filtered shadow map image from the shadow map image by means of a basis-spline filter algorithm. At step 1204, the method 1200 comprises generating an output shadow effect image in dependence on the filtered shadow map image.

Figure 12 schematically illustrates the stages 1300 of a third exemplary embodiment of an image processing apparatus 900 for generating a shadow effect.

The apparatus 900 of the third embodiment may be configured to obtain an input scene 701 as described herein with regards to the first and second exemplary embodiments.

When shading each pixel in the eye view, the apparatus 900 should return a floating-point value that indicates the amount of shadowing at each shaded point P 1403. The apparatus 900 of the third embodiment may be configured to generate a shadow map image 702 from the input scene 701 as described herein with regards to the first and second exemplary embodiments.

The apparatus 900 of the third embodiment may be configured to generate one or more shadow map downsampled images 703, 704 as described herein with regards to the first embodiment. The apparatus 900 of the third embodiment may be configured to generate a filtered shadow map image 1001 from the shadow map image 702 as described herein with regards to the second embodiment. In other words, the features of the first and second embodiments of the apparatus 900 may be combined and/or used interchangeably.

Combining different levels of the mipmapping may result in the pixel sums only covering square regions. This may result in blocky filtering results during the final shading. Therefore, the apparatus 900 of the third embodiment may aim to reduce the blocky filtering results by including a higher order filtering algorithm.

After generating the shadow map with two channels and building a mipmap on it, for each visible scene point P 1403, the initial blocker search area A is computed by intersecting the shadow map plane with the frustum formed by P 1403 and the light source 1401. According to this area size A, the average depth value zAvg in A is sampled from the mipmapped texture by using the customized higher order filtering algorithm, as described herein. The average blocker depth may be estimated using a heuristic formula. After obtaining the average blocker depth, the actual penumbra kernel wp may be computed. Finally, the soft shadow value of the penumbra kernel wp can be evaluated directly as described with regards to the second embodiment of the apparatus 900; however, instead of sampling one mip level, the apparatus 900 may be configured to sample two, or more, mip levels, since the mip level is a floating-point value.

The apparatus 900 may be configured to generate a filtered shadow map downsampled image 1301, 1302 from each of the one or more shadow map downsampled images 703, 704. In other words, as shown in Figure 12, if the apparatus 900 has generated two shadow map downsampled images 703, 704, a shadow map downsampled image 703 and a further shadow map downsampled image 704, the apparatus 900 may be configured to generate two corresponding filtered shadow map downsampled images 1301, 1302, a filtered shadow map downsampled image 1301 and a further filtered shadow map downsampled image 1302.

The apparatus 900 may be configured to generate the one or more filtered shadow map downsampled images 1301, 1302 from each of the one or more shadow map downsampled images 703, 704 by means of a basis-spline filter algorithm 1101. The apparatus 900 may be configured to generate the one or more filtered shadow map downsampled images 1301, 1302 in the same way as described herein for the filtered shadow map image 1001 by means of the same basis-spline filter algorithm 1101.

The apparatus 900 may also be configured to generate the filtered shadow map image 1001 from the shadow map image 702. In this way, the apparatus 900 may generate a filtered image for each of the mipmap levels.

As shown in Figure 12, the filtered shadow map downsampled images 1301, 1302 may comprise the same mipmap level as the corresponding shadow map downsampled images 703, 704.

The higher order bicubic filtering from the basis-spline filter algorithm 1101 may be applied within each mipmap level. The apparatus 900 may be configured to perform a soft shadow filtering by sampling from the mipmapped texture. In particular, the apparatus 900 may apply a high order bicubic filtering using a kernel size proportional to this penumbra size, which is calculated per pixel according to Equation 4, as described herein with regards to the first embodiment of the apparatus. The soft shadow factor value of this penumbra kernel may be evaluated by a high order filtering algorithm in the same way as in the second embodiment of the apparatus 900; however, instead of sampling only one mip level, the apparatus 900 may be configured to sample two mip levels. Since a floating-point penumbra size is known, which corresponds to a level with a fractional part between the two neighbour levels (N-1 and N), the apparatus 900 may be configured to calculate the final shadow factor by performing a linear interpolation according to the fractional distance between the two neighbour levels. The final shadow factor may be calculated using Equation 12.

Shadow_factor = (1 - fractional) * Shadow_factor_(N-1) + (fractional) * Shadow_factor_N    (12)

The result of this calculation may provide a soft shadow that appears smooth, with seamless transitions from hard to soft shadows.
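The interpolation of Equation 12 can be sketched as follows. This is an illustrative sketch, not the claimed implementation; in particular, the log2 mapping from penumbra size to a fractional mip level is an assumption for illustration, as the actual penumbra size follows Equation 4 of the first embodiment.

```python
import math

def blend_shadow_factors(shadow_factor_lower, shadow_factor_upper, fractional):
    """Equation 12: linear interpolation between the shadow factors of the
    two neighbouring mip levels N-1 and N, by the fractional distance."""
    return (1.0 - fractional) * shadow_factor_lower + fractional * shadow_factor_upper

def mip_levels_for_penumbra(penumbra_size):
    """Map a floating-point penumbra size (in texels) to the two neighbouring
    integer mip levels and the fractional distance between them.
    The log2 mapping here is an illustrative assumption."""
    level = max(0.0, math.log2(max(penumbra_size, 1.0)))
    lower = int(math.floor(level))
    return lower, lower + 1, level - lower
```

A per-pixel filter would evaluate the bicubic B-spline kernel at both returned levels and blend the two results with `blend_shadow_factors`.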

The apparatus 900 may be configured to generate an output shadow effect image 705. In particular, the apparatus 900 may be configured to generate the output shadow effect image 705 in dependence on two or more images selected from (a) the filtered shadow map image 1001 and (b) the one or more filtered shadow map downsampled images 1301, 1302. In other words, two or more images selected from (a) the filtered shadow map image 1001 and (b) the one or more filtered shadow map downsampled images 1301, 1302 may be used to generate the output shadow effect image 705.

The apparatus 900 may use two of (a) the filtered shadow map image 1001 and (b) the one or more filtered shadow map downsampled images 1301, 1302 to generate the output shadow effect image 705. In other words, the apparatus 900 may use the filtered shadow map image 1001 and one filtered shadow map downsampled image 1301, 1302 to generate the output shadow effect image 705. Alternatively, the apparatus 900 may use two of the filtered shadow map downsampled images 1301, 1302 to generate the output shadow effect image 705. For example, the apparatus may use the filtered shadow map downsampled image 1301 and the further filtered shadow map downsampled image 1302 to generate the output shadow effect image 705. Preferably, the apparatus 900 may use two images which comprise different downsampling, or mipmap, levels. In this way, the generation of the output shadow effect image 705 may be based on a combination, or average, of the images of different downsampling, or mipmap, levels. This may enable the shadow to be more realistic.

Alternatively, the apparatus 900 may use more than two of (a) the filtered shadow map image 1001 and (b) the one or more filtered shadow map downsampled images 1301, 1302 to generate the output shadow effect image 705. In this example, the generation of the output shadow effect image 705 may be based on a combination, or average, of the more than two images.

Figure 14 summarises an example of a method 1500 for generating a shadow effect. At step 1501, the method 1500 comprises obtaining an input scene. At step 1502, the method 1500 comprises generating a shadow map image from the input scene. At step 1503, the method 1500 comprises generating one or more shadow map downsampled images by downsampling the shadow map image. At step 1504, the method 1500 comprises generating a filtered shadow map image from the shadow map image by means of a basis-spline filter algorithm. At step 1505, the method 1500 comprises generating a filtered shadow map downsampled image from each of the one or more shadow map downsampled images by means of a basis-spline filter algorithm. At step 1506, the method 1500 comprises generating an output shadow effect image in dependence on two or more images selected from (a) the filtered shadow map image and (b) the one or more filtered shadow map downsampled images.
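Steps 1502 and 1503 can be sketched minimally as follows, assuming a simple 2x2 box average as the downsampling operator; the operator actually used by the apparatus may differ, and the function names are illustrative.

```python
def downsample_2x2(image):
    """Halve a square depth image by averaging each 2x2 block
    (one illustrative choice of downsampling operator)."""
    n = len(image)
    return [[(image[2 * y][2 * x] + image[2 * y][2 * x + 1] +
              image[2 * y + 1][2 * x] + image[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(n // 2)]
            for y in range(n // 2)]

def build_mip_chain(shadow_map, levels):
    """Step 1503: generate `levels` downsampled images from the shadow map,
    each half the resolution of the previous one."""
    chain = [shadow_map]
    for _ in range(levels):
        chain.append(downsample_2x2(chain[-1]))
    return chain
```

Each image in the returned chain would then be filtered with the basis-spline filter algorithm (steps 1504 and 1505) before the output shadow effect image is generated.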

Figure 15 schematically illustrates an exemplary computing apparatus 1600 configured to perform the methods described herein. Mobile graphical processing units (GPUs) often take a design approach, commonly called tile-based rendering, in order to minimize the amount of power-hungry external memory accesses which are needed during rendering. The Vulkan APIs may provide explicit subpass mechanisms for fetching framebuffer attachments in the on-chip tile memory, which is a special feature of tile-based GPUs that enables fast access to on-chip memory.

The computing apparatus 1600 may explicitly take advantage of this feature by exploiting the on-chip tile memory on the GPU in order to save memory bandwidth when building the first two mipmap levels for our shadow map texture. In particular, the computing apparatus 1600 may use two subpasses 1604, 1605 inside a single rendering-pass 1602 for generating the shadow map texture and building a mip level on it, as shown in Figure 15.

In particular, the computing apparatus 1600 may comprise a graphical processing unit (GPU) 1601. The GPU 1601 may comprise a renderpass 1602 further comprising transient attachments. The GPU 1601 may comprise a first subpass 1604 further comprising 4X depth 1608. The GPU 1601 may comprise a second subpass 1605 further comprising 1X colour 1609.

The computing apparatus 1600 may comprise a main memory 1606. The main memory 1606 may comprise a colour write back 1610.

The computing apparatus 1600 may be configured to obtain the input scene 701. In particular, the renderpass 1602 of the GPU 1601 may be configured to obtain the input scene 701, as described herein.

The computing apparatus 1600 may be configured to generate the shadow map image 702. In particular, the renderpass 1602 of the GPU 1601 may be configured to generate the shadow map image 702, as described herein.

The computing apparatus 1600 may be configured to generate the filtered 1001, 1301, 1302 and/or the downsampled 703, 704, 1301, 1302 shadow map image(s). In particular, the first subpass 1604 of the GPU 1601 may be configured to generate the filtered 1001, 1301, 1302 and/or the downsampled 703, 704, 1301, 1302 shadow map image(s), as described herein.

In the first subpass 1604, the computing apparatus 1600 may rasterize the whole scene's triangles from the light's point of view into a depth buffer with 4X MSAA enabled. In particular, the first subpass 1604 may not need any colour render target. The computing apparatus 1600 may be configured to generate the output shadow effect image 705. In particular, the second subpass 1605 of the GPU 1601 may be configured to generate the output shadow effect image 705, as described herein.

In the second subpass 1605, the computing apparatus 1600 may perform a customized resolve operation by a pixel shader, which reads the 4 MSAA depth samples of the current pixel, calculates an average depth and an average depth squared, and writes them into a two-component variance shadow map in the format of VK_FORMAT_R16G16_UNORM. Here the computing apparatus 1600 may use a subpass that reads the MSAA texture via an input attachment. As a result, the MSAA texture may only reside in the on-chip tile memory, and may not need to go to the system memory. In this way, this may save the memory bandwidth needed to store and read this texture. For a 512x512 depth texture with 4X MSAA, the saved read-and-write bandwidth is found to be 1024x1024x4x2 = 8 MB when rendering each frame.
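The customized resolve described above can be sketched as follows. This is an illustrative sketch of the per-pixel arithmetic only, not the claimed shader; the helper names are assumptions, and the quantisation mirrors the two 16-bit unsigned normalised channels of the VK_FORMAT_R16G16_UNORM target.

```python
def resolve_variance_sample(depth_samples):
    """Average the 4 MSAA depth samples of one pixel into the two
    components of a variance shadow map: mean depth and mean squared depth."""
    mean = sum(depth_samples) / len(depth_samples)
    mean_sq = sum(d * d for d in depth_samples) / len(depth_samples)
    return mean, mean_sq

def to_r16g16_unorm(mean, mean_sq):
    """Quantise both components to 16-bit UNORM integers, as would be
    stored in a VK_FORMAT_R16G16_UNORM render target."""
    def q(v):
        return round(min(max(v, 0.0), 1.0) * 65535)
    return q(mean), q(mean_sq)
```

Because the MSAA samples are consumed in-tile via an input attachment, only the two resolved components ever reach main memory.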

For the computing apparatus 1600 to work, the first subpass 1604 that renders to the MSAA texture has to specify the MSAA texture via pColorAttachments, with VK_ATTACHMENT_STORE_OP_DONT_CARE as the store op. The second subpass 1605 that performs the customized resolve operation needs to specify the MSAA texture via pInputAttachments and the resolve target via pColorAttachments. The second subpass 1605 then needs to render a full-screen quad with a shader that uses a subpassInputMS resource to read this MSAA data. Additionally, the application needs to specify a dependency between the two subpasses 1604, 1605 that indicates the stage/access masks, similar to pipeline barriers, and the dependency flag VK_DEPENDENCY_BY_REGION_BIT. With this, the driver should have enough information to arrange the execution such that, on tiled GPUs 1601, the MSAA contents never leave the tile memory and are instead resolved in-tile, with the resolve result being written to main memory.

When accessing the shadow map texture at rendering time, the computing apparatus 1600 may access the resolved texture as the base level, together with all of its upper levels, for shadowing factor filtering. The original 4X MSAA texture may not be accessed. This method has been found to work well in testing.

The computing apparatus 1600 may be configured to output the output shadow effect image 705. In particular, the main memory 1606 may be configured to obtain the output shadow effect image 705 from the GPU 1601 and output the output shadow effect image 705. A mobile communications device may be configured to comprise the computing apparatus 1600. The mobile communications device may be a personal computer, laptop, smartphone or similar mobile communications device.

The proposed algorithms may be implemented using graphics APIs, such as DirectX, Vulkan or OpenGL, on both PCs and Android devices.

For testing, the algorithm was implemented in a prototype app using Vulkan, so that the source code could be compiled cross-platform on both PC (Windows) and mobile phone (Android) platforms.

For uniform filtering kernel size, a performance comparison was made with a prior art PCF algorithm. For per pixel variable filtering kernel size, a performance comparison was made with a classic PCSS algorithm.

The algorithm was tested on a mobile phone. The algorithm was implemented on a Huawei P30 Pro phone at the rendering resolution of 2340x1080p, using a 512x512 shadow depth texture with a 4X MSAA enabled. The results of the frame rate in FPS are shown in Table 1.

Table 1

Figures 17A and 17B illustrate an output image of the apparatus for generating a shadow effect compared to the prior art, when tested on the Huawei P30 Pro. Figure 17A illustrates the output image of the prior art. Figure 17B illustrates the output image of the apparatus for generating a shadow effect.

The algorithm was tested on a laptop. The algorithm was implemented on a Huawei MateBook laptop with an Intel integrated GPU at the rendering resolution of 1920x1080p, using a 1024x1024 shadow depth texture with no MSAA. The results of the frame rate in FPS are shown in Table 2.

Table 2

Figures 18A and 18B illustrate an output image of the apparatus for generating a shadow effect compared to the prior art, when tested on the Huawei MateBook laptop. Figure 18A illustrates the output image of the prior art. Figure 18B illustrates the output image of the apparatus for generating a shadow effect.

Figures 19A and 19B illustrate a further output image of the apparatus for generating a shadow effect compared to the prior art, when tested on the Huawei MateBook laptop. Figure 19A illustrates the output image of the prior art. Figure 19B illustrates the output image of the apparatus for generating a shadow effect.

Figures 20A and 20B illustrate a further output image of the apparatus for generating a shadow effect compared to the prior art, when tested on the Huawei MateBook laptop. Figure 20A illustrates the output image of the prior art. Figure 20B illustrates the output image of the apparatus for generating a shadow effect.

From the performance tables of testing results, it is clear that the proposed image processing apparatus is much faster than the prior art PCF and PCSS. The speedup factors may be in the region of 10x.

In the case of PCF, it is not possible to use prefiltered mipmaps to accelerate the process, because of the per sample depth comparison involved. Consequently, in the worst case, the apparatus must sample and compare every individual texel in a neighbouring region of NxN size of the shadow map in order to compute the light attenuation for every single pixel. As expected, this process can be slow due to the O(NxN) complexity. The situation deteriorates when PCF is used to achieve edge softening with a large filtering kernel size N, because this approach may be considered equivalent to placing a lower bound on the size of the filter region and, consequently, on the cost of shading a pixel.

In the case of PCSS, the same situation applies, because for a particular pixel, once the filtering kernel size N is found, if this N is large then the same type of costly PCF filtering operation is performed in a neighbouring region of NxN size. PCSS may only have good performance when N is very small. Therefore, in a whole frame, some pixels may have good performance, while other pixels may not. This also illustrates the reason why PCSS is usually faster than PCF in our tests.

In the proposed algorithm, the runtime shadow filtering is simple and has a fixed cost. Only a 4-tap bicubic B-spline kernel is sampled to shade each pixel, which involves a fixed cost of 4 texel sampling operations for uniform filtering. This is in contrast to the NxN sampling operations of the PCF algorithm. That is why, when N is large, the proposed algorithm may still have a better speedup factor than both PCF and PCSS.
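The fixed 4-tap cost can be illustrated by the standard trick of folding the 4x4 bicubic B-spline footprint into bilinear fetches: per axis, the four cubic B-spline weights are combined pairwise into two weights and two sampling offsets, so 2x2 = 4 hardware bilinear taps cover the full 16-texel footprint. The sketch below shows the per-axis arithmetic; it is an illustration of this well-known folding technique, not necessarily the exact kernel of the claimed apparatus, and the function names are assumptions.

```python
def bspline_weights(t):
    """Cubic B-spline weights for the four texels around a sample position
    with fractional part t in [0, 1). The weights always sum to 1."""
    a = 1.0 - t
    w0 = a * a * a / 6.0
    w1 = (3.0 * t ** 3 - 6.0 * t ** 2 + 4.0) / 6.0
    w2 = (-3.0 * t ** 3 + 3.0 * t ** 2 + 3.0 * t + 1.0) / 6.0
    w3 = t ** 3 / 6.0
    return w0, w1, w2, w3

def fold_to_two_taps(t):
    """Fold the four weights into two bilinear taps per axis. Each tap has a
    combined weight and a fractional offset that the hardware bilinear unit
    resolves, so bicubic B-spline filtering needs only 2 taps per axis
    (4 in 2D) instead of 16 nearest fetches."""
    w0, w1, w2, w3 = bspline_weights(t)
    g0, g1 = w0 + w1, w2 + w3
    # Offsets are measured in texels from the texel left of the sample.
    off0 = -1.0 + w1 / g0
    off1 = 1.0 + w3 / g1
    return (g0, off0), (g1, off1)
```

Because the two tap weights per axis sum to 1, the folded kernel preserves the partition-of-unity property of the original B-spline weights.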

Figure 21A illustrates a PCF shadow effect transition area. Figure 21B illustrates the shadow effect transition area of the proposed algorithm.

From the testing images shown above, it can also be seen that the proposed algorithm produces better quality output images than PCF and PCSS. This is because PCF usually performs a box-filtering in an NxN region 2200, as shown in Figure 21A. This often results in blocky features in the shadow transition areas of the output images.

The proposed algorithm uses a bicubic B-spline filtering kernel, which has a shape very similar to the disc-like Gaussian filtering kernel. As a result, the output images comprise very smooth transition areas from hard shadow to soft shadow, which resemble a Gaussian blurring.

The applicant hereby discloses in isolation each individual feature described herein and any combination of two or more such features, to the extent that such features or combinations are capable of being carried out based on the present specification as a whole in the light of the common general knowledge of a person skilled in the art, irrespective of whether such features or combinations of features solve any problems disclosed herein, and without limitation to the scope of the claims. The applicant indicates that aspects of the present invention may consist of any such individual feature or combination of features. In view of the foregoing description it will be evident to a person skilled in the art that various modifications may be made within the scope of the invention.