

Title:
METHODS, SERVERS AND DEVICES FOR TRANSMITTING AND RENDERING MULTIPLE VIEWS COMPRISING NON-DIFFUSE OBJECTS
Document Type and Number:
WIPO Patent Application WO/2023/218219
Kind Code:
A1
Abstract:
A method implemented by a server (1SVR) for transmitting data used by a device (1CTL) for rendering multiple views (Vref, Vi) of a single scene, said multiple views comprising at least one basic view (Vref) and at least one additional view (Vi), said method comprising steps of: - generating (SS10) a stream (STR) by encoding said views (Vref, Vi); - selecting (1SS30) non-diffuse pixels (SNDP) among detected non-diffuse pixels (NDP); - generating (1SS40) data (1DND) representative of the selected non-diffuse pixels (SNDP); and - transmitting (SS50) said stream (STR) and said data (1DND) to said device, wherein a size of said data is a function of a pixel rate (PR) of said device and/or of a transmission rate (TR) between said server and said device.

Inventors:
JUNG JOËL (FR)
Application Number:
PCT/IB2022/000293
Publication Date:
November 16, 2023
Filing Date:
May 12, 2022
Assignee:
TENCENT CLOUD EUROPE FRANCE SAS (FR)
International Classes:
H04N21/218; H04N21/234; H04N21/235; H04N21/238; H04N21/435; H04N21/81; H04N21/854
Foreign References:
US20210006834A12021-01-07
Other References:
VINOD MV (OFINNO): "Draft use cases and requirements for MIV - edition-2", no. m59085, 16 January 2022 (2022-01-16), XP030299927, Retrieved from the Internet [retrieved on 20220116]
FACHADA SARAH ET AL: "Depth Image-Based Rendering of Non-Lambertian Content in MPEG Immersive Video", 2021 INTERNATIONAL CONFERENCE ON 3D IMMERSION (IC3D), IEEE, 8 December 2021 (2021-12-08), pages 1 - 6, XP034047845, DOI: 10.1109/IC3D53758.2021.9687263
BOYCE JILL M ET AL: "MPEG Immersive Video Coding Standard", PROCEEDINGS OF THE IEEE, IEEE. NEW YORK, US, vol. 109, no. 9, 10 March 2021 (2021-03-10), pages 1521 - 1536, XP011873492, ISSN: 0018-9219, [retrieved on 20210818], DOI: 10.1109/JPROC.2021.3062590
Attorney, Agent or Firm:
DELUMEAU, François (FR)
Claims

[Claim 1] A method implemented by a server (1SVR) for transmitting data used by a device (1CTL) for rendering multiple views (Vref, Vi) of a single scene, said multiple views comprising at least one basic view (Vref) and at least one alternate view (Vi), said method comprising steps of:

- generating (SS10) a stream (STR) by encoding said views (Vref, Vi);

- detecting (1SS20) non-diffuse pixels (NDP) in said at least one alternate view (Vi);

- selecting (1SS30) non-diffuse pixels (SNDP) among said detected non-diffuse pixels (NDP);

- generating (1SS40) data (1DND) representative of the selected non-diffuse pixels (SNDP), said data (1DND) comprising information enabling said device (1CTL) to render, for each said alternate view (Vi), the selected non-diffuse pixels (SNDP) with reflectance corresponding to this alternate view (Vi); and

- transmitting (SS50) said stream (STR) and said data (1DND) to said device (1CTL), wherein a size of said data is a function of a pixel rate (PR) of said device (1CTL) and/or of a transmission rate (TR) between said server (1SVR) and said device (1CTL).

[Claim 2] The method according to claim 1, wherein said non-diffuse pixels (SNDP) are selected (1SS30) based on differences between:

- textures of pixels of said at least one alternate view (Vi); and

- textures of pixels of said at least one basic view (Vref) matching said pixels of said at least one alternate view (Vi).

[Claim 3] The method according to claim 2, wherein said selected non-diffuse pixels (SNDP) are among those maximizing said differences.

[Claim 4] The method according to claim 2 or 3, wherein said non-diffuse pixels (SNDP) are selected (1SS30) based on a comparison between said differences and at least one threshold depending on said pixel rate and/or on said transmission rate.

[Claim 5] The method according to any of claims 1 to 4, wherein said selecting step (1SS30) comprises substeps of:

- associating (1SS301) the pixels of said views (Vref, Vi) with epipolar plane image lines;

- determining (1SS302), for each epipolar plane image line, at least one value representative of the form of this epipolar plane image line; and said non-diffuse pixels (SNDP) are selected (S30) based on said values.

[Claim 6] A method implemented by a device (1CLT) for rendering multiple views (Vref, Vi) of a single scene, said multiple views comprising at least one basic view (Vref) and at least one alternate view (Vi), said method comprising steps of: - receiving (SD60), from a server (1SVR), a stream (STR) of encoded said views (Vref, Vi) and data (1DND) representative of selected non-diffuse pixels (SNDP) of said at least one alternate view (Vi), wherein a size of said data is a function of a pixel rate (PR) of said device (1CTL) and/or of a transmission rate (TR) between said server (1SVR) and said device (1CTL), said data (1DND) comprising information enabling said device (1CTL) to render, for each said alternate view (Vi), said non-diffuse pixels (SNDP) with reflectance corresponding to this alternate view (Vi);

- rendering (1SD70) said views (Vref, Vi) based on said stream (STR) and said data (1DND), said non-diffuse pixels (SNDP) of said alternate views being rendered with their corresponding reflectance.

[Claim 7] A method implemented by a server (2SVR, 3SVR) for transmitting data used by a device (2CTL, 3CTL) for rendering multiple views (Vref, Vi) of a single scene, said multiple views comprising at least one basic view (Vref) and at least one alternate view (Vi), said method comprising steps of:

- generating (SS10) a stream (STR) by encoding said views (Vref, Vi);

- identifying (2SS20, 3SS20) at least one non-diffuse object (O1, O2) in said views (Vref, Vi);

- generating (2SS40, 3SS40) data (2DND, 3DND) representative of said at least one non-diffuse object (O1, O2), said data (2DND, 3DND) comprising information enabling said device (2CTL, 3CTL) to render, for each alternate view (Vi), said at least one non-diffuse object (O1, O2) with a reflectance corresponding to this view; and

- transmitting (SS50) said stream (STR) and said data (2DND, 3DND) to said device (2CTL, 3CTL).

[Claim 8] The method according to claim 7, wherein said data (2DND) comprise, for each said alternate view (Vi) and for at least one said non-diffuse object (O1, O2), a texture difference (ΔT_1(Vref, Vi), ΔT_2(Vref, Vi)) between this object (O1, O2) displayed in this alternate view (Vi) and the same object (O1, O2) displayed in said at least one basic view (Vref).

[Claim 9] The method according to claim 7, wherein said data (2DND) comprise, for said at least one non-diffuse object (O1, O2), a set of parameters (NDPar1, NDPar2) of a reflectance model of this object.

[Claim 10] The method according to any of claims 7 to 9, comprising a step of generating (3SS30) a generated view (Vi*) corresponding to a position (IPos) of a given point of view shared between said device (3CLT) and said server (3SVR), said generating step (3SS30) being based on said at least one alternate view (Vi) and said at least one basic view (Vref), said generated view (Vi*) comprising a texture image (Ti*) of said at least one non-diffuse object, wherein said data (3DND) comprise information enabling said device (3CTL) to render said at least one non-diffuse object (O1, O2) with a reflectance corresponding to the generated view (Vi*).

[Claim 11] The method according to claim 10, wherein said data (3DND) comprise, for at least one said non-diffuse object (O1, O2): (i) a difference (ΔT_1(Vref, Vi*), ΔT_2(Vref, Vi*)) between:

- a texture of this object (O1, O2) in said generated view (Vi*) and

- a texture of this object (O1, O2) in said at least one basic view (Vref); and

(ii) a difference (ΔT_1(Vi, Vi*), ΔT_2(Vi, Vi*)) between:

- a texture of this object (O1, O2) in said generated view (Vi*) and

- a texture of this object (O1, O2) in said at least one alternate view (Vi) from which the generated view (Vi*) is generated.

[Claim 12] The method according to claim 10 or 11, wherein the step of generating (3SS30) the generated view (Vi*) is performed by computing the texture image (Ti*) of a said generated view (Vi*) as a weighted average of:

- textures (Tref) of a said basic view (Vref); and

- textures (Ti) of a said alternate view (Vi).

[Claim 13] The method according to claim 10 or 11, wherein said at least one basic view (Vref) and said at least one alternate view (Vi) comprise a depth map (Dref, Di), and said generating step (3SS30) of a generated view (Vi*) comprises substeps of:

- generating a three-dimensional point cloud from textures (Tref, Ti) and depth map (Dref, Di) of at least one of said basic and alternate views (Vref, Vi);

- generating the texture image (Ti*) of said generated view (Vi*) based on a projection of said point cloud in a two-dimensional space.

[Claim 14] A method implemented by a device (2CLT, 3CLT) for rendering multiple views (Vref, Vi) of a single scene, said multiple views comprising at least one basic view (Vref) and at least one alternate view (Vi), said method comprising steps of:

- receiving (SD60), from a server (2SVR, 3SVR), a stream (STR) of encoded said views (Vref, Vi) and data (2DND, 3DND) representative of at least one non-diffuse object (O1, O2) in said views, said data (2DND, 3DND) comprising information enabling said device (2CLT, 3CLT) to render, for each view, said at least one non-diffuse object (O1, O2) with the reflectance corresponding to this view (Vref, Vi, Vi*);

- rendering (2SD70, 3SD70) said views (Vref, Vi) based on said stream (STR) and said data (2DND, 3DND), said at least one non-diffuse object (O1, O2) being rendered with its corresponding reflectance.

[Claim 15] The method according to any of claims 1 to 14, wherein said data (1DND, 2DND, 3DND) are transmitted in a Supplemental Enhancement Information (SEI) message.

[Claim 16] A server (1SVR) for transmitting data used by a device (1CTL) for rendering multiple views (Vref, Vi) of a single scene, said multiple views comprising at least one basic view (Vref) and at least one alternate view (Vi), said server (1SVR) comprising: - a module (MS10) of generating a stream (STR) by encoding said views (Vref, Vi);

- a module (1MS20) of detecting non-diffuse pixels (NDP) in said at least one alternate view (Vi);

- a module (1MS30) of selecting non-diffuse pixels (SNDP) among said detected non-diffuse pixels (NDP);

- a module (1MS40) of generating data (1DND) representative of the selected non-diffuse pixels (SNDP), said data (1DND) comprising information enabling said device (1CTL) to render, for each said alternate view (Vi), the selected non-diffuse pixels (SNDP) with reflectance corresponding to this alternate view (Vi); and

- a module (MS50) of transmitting said stream (STR) and said data (1DND) to said device (1CTL), wherein a size of said data is a function of a pixel rate (PR) of said device (1CTL) and/or of a transmission rate (TR) between said server (1SVR) and said device (1CTL).

[Claim 17] A device (1CLT) for rendering multiple views (Vref, Vi) of a single scene, said multiple views comprising at least one basic view (Vref) and at least one alternate view (Vi), said device (1CLT) comprising:

- a module (1MD60) of receiving, from a server (1SVR), a stream (STR) of encoded said views (Vref, Vi) and data (1DND) representative of selected non-diffuse pixels (SNDP) of said at least one alternate view (Vi), wherein a size of said data is a function of a pixel rate (PR) of said device (1CTL) and/or of a transmission rate (TR) between said server (1SVR) and said device (1CTL), said data (1DND) comprising information enabling said device (1CTL) to render, for each said alternate view (Vi), said non-diffuse pixels (SNDP) with reflectance corresponding to this alternate view (Vi);

- a module (1MD70) of rendering said views (Vref, Vi) based on said stream (STR) and said data (1DND), said non-diffuse pixels (SNDP) of said alternate views being rendered with their corresponding reflectance.

[Claim 18] A server (2SVR, 3SVR) for transmitting data used by a device (2CTL, 3CTL) for rendering multiple views (Vref, Vi) of a single scene, said multiple views comprising at least one basic view (Vref) and at least one alternate view (Vi), said server (2SVR, 3SVR) comprising:

- a module (MS10) of generating a stream (STR) by encoding said views (Vref, Vi);

- a module (2MS20, 3MS20) of identifying at least one non-diffuse object (O1, O2) in said views (Vref, Vi);

- a module (2MS40, 3MS40) of generating data (2DND, 3DND) representative of said at least one non-diffuse object (O1, O2), said data (2DND, 3DND) comprising information enabling said device (2CTL, 3CTL) to render, for each said alternate view (Vi), said at least one non-diffuse object (O1, O2) with a reflectance corresponding to this view; and

- a module (MS50) of transmitting said stream (STR) and said data (2DND, 3DND) to said device (2CTL, 3CTL).

[Claim 19] A device (2CLT, 3CLT) for rendering multiple views (Vref, Vi) of a single scene, said multiple views comprising at least one basic view (Vref) and at least one alternate view (Vi), said device (2CLT, 3CLT) comprising:

- a module (MD60) of receiving, from a server (2SVR, 3SVR), a stream (STR) of encoded said views (Vref, Vi) and data (2DND, 3DND) representative of at least one non-diffuse object (O1, O2) in said views, said data (2DND, 3DND) comprising information enabling said device (2CLT, 3CLT) to render, for each view, said at least one non-diffuse object (O1, O2) with the reflectance corresponding to this view (Vref, Vi, Vi*);

- a module (2MD70, 3MD70) of rendering said views (Vref, Vi) based on said stream (STR) and said data (2DND, 3DND), said at least one non-diffuse object (O1, O2) being rendered with its corresponding reflectance.

[Claim 20] A computer program (PGCLT, PGSRV) comprising instructions configured to implement the steps of a method according to any one of claims 1 to 15 when executed by a computer.

[Claim 21] A readable medium (1SRV, 1CLT) comprising a computer program (PGCLT, PGSRV) according to Claim 20.

Description
METHODS, SERVERS AND DEVICES FOR TRANSMITTING AND RENDERING MULTIPLE VIEWS COMPRISING NON-DIFFUSE OBJECTS

Field of the invention

[0001] The invention relates to the field of computer graphics. It relates more particularly to a method for encoding and transmitting data for rendering non-diffuse pixels of multiple views of a same scene. The invention may be used in the context of immersive videos.

[0002] Diffuse surfaces have an apparent brightness that is the same regardless of the observer's point of view. They produce a diffuse reflectance of the light: the light is absorbed and re-emitted in all directions. Concrete, wood and wool are examples of such surfaces.

[0003] In contrast, non-diffuse surfaces have the property that the visible texture depends on the point of view. Almost all natural and high-quality synthetic scenes exhibit non-diffuse reflections. Typical examples of non-diffuse surfaces are mirrors, windows, and glossy surfaces.

[0004] The reflectance of a surface in a particular direction is the fraction of incident light which is reflected by this surface in this particular direction. Hereafter, the reflectance of a surface corresponding to a point of view refers to the reflectance of the surface of this object in the direction of this point of view. For a non-diffuse surface, the reflectance varies with the point of view.

[0005] Pixels representing a non-diffuse surface are called non-diffuse pixels.

Background of the invention

[0006] In an immersive video, a user can navigate in a scene through different views. Because of memory and/or computational limitations of the client device, these multiple views are received in a compressed form from a server.

[0007] The transmission rate between the server and the client device is limited, such that efficient compression schemes are needed to reduce the amount of data to transmit.

[0008] Current methods of compression and transmission of multiple views are not satisfactory for rendering the non-diffuse pixels of these views.

[0009] In particular, MPEG Immersive Video (MIV), a recent codec designed for immersive video, fails to handle view-dependent effects such as non-diffuse reflections.

[0010] The MIV method consists in fully transmitting so-called "basic" views and pruning alternate views such that the client receiver can render them, while reducing the amount of data to be transmitted.

[0011] Because of the pruning process, the reflectance of the non-diffuse pixels in the alternate views rendered by the client is not correct.

[0012] An extension of the standard MIV method enables transmitting additional data for rendering the texture of non-diffuse pixels of the alternate views. Nevertheless, this method requires a large quantity of transmitted data in order to render the views with a suitable quality.

[0013] There exists a need for a solution that enables transmitting data for rendering non-diffuse pixels in multiple views.

Summary of the disclosure

[0014] A purpose of the present invention is to overcome all or some of the limitations of the prior art solutions, particularly those outlined above.

[0015] To this end, and according to a first aspect, the invention relates to a method implemented by a server for transmitting data used by a device for rendering multiple views of a single scene, said multiple views comprising at least one basic view and at least one alternate view, said method comprising steps of:

- generating a stream by encoding said views;

- detecting non-diffuse pixels in said at least one alternate view;

- selecting non-diffuse pixels among said detected non-diffuse pixels;

- generating data representative of the selected non-diffuse pixels, said data comprising information enabling said device to render, for each said alternate view, the selected non-diffuse pixels with reflectance corresponding to this alternate view; and

- transmitting said stream and said data to said device, wherein a size of said data is a function of a pixel rate of said device and/or of a transmission rate between said server and said device.

[0016] Correlatively, the invention proposes a server for transmitting data used by a device for rendering multiple views of a single scene, said multiple views comprising at least one basic view and at least one alternate view, said server comprising :

- a module of generating a stream by encoding said views;

- a module of detecting non-diffuse pixels in said at least one alternate view;

- a module of selecting non-diffuse pixels among said detected non-diffuse pixels;

- a module of generating data representative of the selected non-diffuse pixels, said data comprising information enabling said device to render, for each said alternate view, the selected non-diffuse pixels with reflectance corresponding to this alternate view; and

- a module of transmitting said stream and said data to said device, wherein a size of said data is a function of a pixel rate of said device and/or of a transmission rate between said server and said device.

[0017] In one embodiment, the non-diffuse pixels are detected in the alternate views based on a basic view. In another embodiment, detected non-diffuse pixels could be pixels of objects displayed in the alternate views and labelled as being non-diffuse objects.

[0018] The module for detecting the non-diffuse pixels needs to know which views are the alternate views so that the non-diffuse pixels are detected among the alternate views.

[0019] In one embodiment, the views received by the module of detecting the non-diffuse pixels are labelled as basic or alternate views.

[0020] In another embodiment, the module of generating the stream notifies the module of detecting non-diffuse pixels of the alternate views.

[0021] The invention also proposes a method implemented by a device for rendering multiple views of a single scene, said multiple views comprising at least one basic view and at least one alternate view, said method comprising steps of:

- receiving, from a server, a stream of encoded said views and data representative of selected non-diffuse pixels of said at least one alternate view, wherein a size of said data is a function of a pixel rate of said device and/or of a transmission rate between said server and said device, said data comprising information enabling said device to render, for each said alternate view, said non-diffuse pixels with reflectance corresponding to this alternate view;

- rendering said views based on said stream and said data, said non-diffuse pixels of said alternate views being rendered with their corresponding reflectance.

[0022] Correlatively, the invention proposes a device for rendering multiple views of a single scene, said multiple views comprising at least one basic view and at least one alternate view, said device comprising:

- a module of receiving, from a server, a stream of encoded said views and data representative of selected non-diffuse pixels of said at least one alternate view, wherein a size of said data is a function of a pixel rate of said device and/or of a transmission rate between said server and said device, said data comprising information enabling said device to render, for each said alternate view, said non-diffuse pixels with reflectance corresponding to this alternate view;

- a module of rendering said views based on said stream and said data, said non-diffuse pixels of said alternate views being rendered with their corresponding reflectance.

[0023] In contrast to the prior art, this first aspect of the disclosure takes into account the transmission rate and the pixel rate when transmitting data for rendering non-diffuse pixels.

[0024] In particular, this first aspect of the disclosure proposes to generate data to be transmitted for a client to render non-diffuse pixels, such that the size of these data is a function of the pixel rate and/or the transmission rate.

[0025] The transmission rate corresponds to the quantity of data which can be transmitted from the server to the client per unit of time. The pixel rate corresponds to the quantity of received data which can be decoded and rendered on the client side per unit of time.
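
By way of illustration only, the following Python sketch derives a per-frame byte budget for the non-diffuse data from these two rates. The function name, the stream share, the frame duration and the budget formula are assumptions made for this example; the application does not specify them.

```python
# Hypothetical sketch: deriving a byte budget for the non-diffuse data from
# the negotiated transmission rate and pixel rate. The 0.9 stream share and
# the frame duration are illustrative assumptions.

def non_diffuse_data_budget(transmission_rate_bps: float,
                            pixel_rate_pps: float,
                            bits_per_pixel: int = 8,
                            frame_duration_s: float = 1 / 30,
                            stream_share: float = 0.9) -> int:
    """Return the number of bytes available per frame for non-diffuse data.

    The encoded views are assumed to consume `stream_share` of both the
    link capacity and the client's decoding capacity; the remainder is
    left for the non-diffuse data 1DND.
    """
    link_budget = (1 - stream_share) * transmission_rate_bps * frame_duration_s / 8
    decode_budget = (1 - stream_share) * pixel_rate_pps * frame_duration_s * bits_per_pixel / 8
    return int(min(link_budget, decode_budget))

# Example: 20 Mbit/s link, client able to process 100 Mpixel/s.
print(non_diffuse_data_budget(20e6, 100e6))
```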

[0026] In one embodiment, the pixel rate and the transmission rate are negotiated during an initialisation phase of a communication session between the server and the client device. In another embodiment the negotiation occurs just before the step of selecting the non-diffuse pixels.

[0027] Advantageously, generating a quantity of transmitted data according to the transmission rate and the pixel rate prevents these data from being transmitted very slowly or from being arbitrarily truncated during their transmission or during their decoding and rendering on the client side, which would degrade the user experience.

[0028] The first aspect of the disclosure thus enables, for a same amount of transmitted data (this amount including the stream of encoded views and the data representative of the non-diffuse pixels), to improve the quality of the rendered views on the client side. As a corollary, it enables transmitting a limited amount of data while achieving a given quality of the rendered views.

[0029] The data representative of non-diffuse pixels are hereafter called "non-diffuse data".

[0030] Transmitting all non-diffuse pixels, under a constrained transmission rate, could force the other pixels or the views encoded in the stream to be compressed more heavily, or could slow down their transmission. Both consequences would have a negative impact on all rendered surfaces, not only the non-diffuse ones.

[0031] The first aspect of the disclosure improves the rendering of non-diffuse surfaces without significantly penalizing the transmission of the basic and alternate views. In other words, this first aspect of the disclosure proposes a way to adapt the non-diffuse data to pixel rate and/or transmission rate limitations, such that the overall quality of the rendered views is not significantly impacted.

[0032] In the following embodiments, the transmission of non-diffuse data may take the form of a supplemental enhancement information message. The client that uses this message may accelerate its processing of non-diffuse pixels; the client that does not use it applies a conventional process.

[0033] An example of such a conventional process by the client in the prior art is represented in figure 1A which shows a server transmitting, to a client device, a stream for rendering multiple views of a scene according to an extension of the MIV method. As shown in figure 1B, the stream comprises the basic view, the pruned alternate views, metadata to reconstruct the alternate views, as well as information on the non-diffuse pixels in the alternate views.

[0034] Hereafter, the pruned alternate views comprised in the stream are called "additional" views.

[0035] Adjusting the size of the non-diffuse data according to the pixel and/or transmission rates may be done by choosing an appropriate encoding scheme. For instance, if non-diffuse pixels are encoded on 16 bits and the non-diffuse data are too large regarding a transmission rate limitation, the non-diffuse pixels are converted to 8 bits such that the quantity of transmitted data does not saturate their transmission or their decoding.

[0036] Regardless of whether the encoding scheme is modified or not, entropy coding (e.g. Huffman coding, arithmetic coding...) can also be applied to the non-diffuse pixels to reduce the amount of data to be transmitted according to pixel rate and/or transmission rate limitations.
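
As a purely illustrative Python sketch of the two size-control levers mentioned above, the following converts 16-bit texture samples to 8 bits and estimates the gain of entropy coding via the Shannon entropy of the sample distribution; the function names and sample values are assumptions for this example.

```python
# Illustrative sketch: bit-depth reduction plus an entropy-coding estimate.
import math
from collections import Counter

def to_8bit(samples_16bit):
    """Requantize 16-bit samples to 8 bits (halves the raw payload size)."""
    return [s >> 8 for s in samples_16bit]

def entropy_bits_per_symbol(samples):
    """Shannon entropy: a lower bound on bits/symbol for an entropy coder."""
    counts = Counter(samples)
    n = len(samples)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

samples = [40000, 40001, 40000, 39998, 12, 40000]  # toy 16-bit texture values
reduced = to_8bit(samples)
print(reduced)                           # [156, 156, 156, 156, 0, 156]
print(entropy_bits_per_symbol(reduced))  # well below 8 bits/symbol
```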

[0037] Controlling the size of transmitted data can also be done during the step of selecting the non-diffuse pixels, as allowed by the following embodiment.

[0038] According to another embodiment of the method, said non-diffuse pixels are selected based on differences between:

- textures of pixels of the at least one alternate view, and

- textures of pixels of the at least one basic view matching said pixels of the alternate view.

[0039] Pixels of two different views are considered as "matching" if they represent the same or nearby positions in the scene. In other words, a pixel from the alternate view and a pixel from the basic view match if they represent the same area in the three-dimensional space where the scene takes place.

[0040] In this embodiment, the selection of non-diffuse pixels in the alternate views is based on a comparison between the pixels of the basic view and the pixels of the alternate views. Such a comparison advantageously allows selecting non-diffuse pixels, and thus controlling the size of transmitted data, according to pixel rate or transmission rate limitations.

[0041] According to a particular embodiment, said selected non-diffuse pixels are among those maximizing said difference.

[0042] In particular, difference of texture between pixels of the basic view and an alternate view characterises a degree of non-diffusivity of pixels. For instance, matching pixels are:

- non-diffuse if they have very different textures; or

- diffuse if they share similar textures.

[0043] For a non-diffuse pixel, the higher the texture difference, the more the quality of its rendering is improved by taking the non-diffuseness of this pixel into account.

[0044] Accordingly, selecting the non-diffuse pixels with the highest texture differences enables significantly reducing the quantity of non-diffuse data to be transmitted while maximizing the quality of the rendered non-diffuse surfaces.

[0045] In this embodiment, the differences between pixels can be computed during the step of detecting non-diffuse pixels in the alternate views. The position of each pixel in the scene is computed, for instance by projecting each view in the same three-dimensional space. Pixels of the basic and alternate views respectively which have close positions in this three-dimensional space are compared with respect to their textures. The difference value thus obtained for each pixel of the alternate view can be saved for performing the selecting step as described above.

[0046] According to a particular embodiment of the method above, said non-diffuse pixels are selected based on a comparison between said differences and at least one threshold depending on said pixel rate and/or said transmission rate.

[0047] This embodiment enables to control the number of selected non-diffuse pixels by modifying the threshold according to the pixel rate or transmission rate limitations.
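
Purely as an illustration, the following Python sketch implements such a rate-dependent threshold; the linear mapping from the bottleneck rate to the threshold, and all names, are assumptions made for this example.

```python
# A minimal sketch of threshold-based selection: the threshold rises when
# the pixel rate or transmission rate shrinks, so fewer non-diffuse pixels
# are kept. The mapping below is an assumed example policy.

def select_by_threshold(texture_differences, pixel_rate, transmission_rate,
                        base_threshold=10.0, reference_rate=1e6):
    """Keep pixels whose texture difference exceeds a rate-dependent threshold."""
    bottleneck = min(pixel_rate, transmission_rate)
    threshold = base_threshold * reference_rate / max(bottleneck, 1.0)
    return [i for i, d in enumerate(texture_differences) if d > threshold]

diffs = [3.0, 25.0, 11.0, 40.0, 7.0]
print(select_by_threshold(diffs, pixel_rate=2e6, transmission_rate=1e6))  # [1, 2, 3]
```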

[0048] According to a particular embodiment of the method above, said non-diffuse pixels are selected based on probabilities that the detected non-diffuse pixels are indeed non-diffuse, said probabilities being computed based on said differences.

[0049] Such an embodiment advantageously allows selecting non-diffuse pixels such that:

- the size of transmitted data per unit of time exceeds neither the transmission rate nor the pixel rate; and

- the non-diffuse pixels which have the largest impact on quality of rendered views are transmitted.

[0050] According to a particular embodiment, said selecting step comprises substeps of:

- associating the pixels of said views with epipolar plane image lines;

- determining, for each epipolar plane image line, at least one value representative of the form of this epipolar plane image line; and said non-diffuse pixels are selected based on said values.

[0051] A slice of stacked views of a same scene presents lines called epipolar plane image (EPI) lines. Each line corresponds to a same object displayed in consecutive views. If the consecutive views are spaced at the same angle, the geometry of these lines reflects the diffusivity of the corresponding objects. Namely, a diffuse object yields a straight EPI line whereas a non-diffuse object yields a curved EPI line.

[0052] Hence, determining the form of an EPI line, for instance by computing its curvature, enables to determine a degree of non-diffusivity of all pixels of this EPI line. This is particularly advantageous when the basic view and the alternate views correspond to consecutive views spaced at the same angle, since it permits the accurate selection or rejection of all pixels of an EPI line with respect to the form of their corresponding EPI line.
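
As an illustrative sketch, the curvature of an EPI line can be scored as the residual of a straight-line fit: a straight trace (diffuse surface) scores near zero while a curved trace (non-diffuse surface) scores higher. The fitting approach and all names below are assumptions for this example.

```python
# Hedged sketch: measuring how far an EPI line deviates from a straight
# line. Each EPI line is a sequence of (view_index, pixel_column) points.
import numpy as np

def epi_curvature_score(view_indices, pixel_columns):
    """Mean absolute residual of a straight-line fit to an EPI trace."""
    slope, intercept = np.polyfit(view_indices, pixel_columns, deg=1)
    residuals = pixel_columns - (slope * np.asarray(view_indices) + intercept)
    return float(np.mean(np.abs(residuals)))

views = np.arange(5)
straight = 2.0 * views + 3.0                 # diffuse object: straight EPI line
curved = 2.0 * views + 3.0 + 0.5 * views**2  # non-diffuse: curved EPI line
print(epi_curvature_score(views, straight))  # ~0.0
print(epi_curvature_score(views, curved))    # clearly > 0
```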

[0053] According to another aspect, the invention proposes a method implemented by a server for transmitting data used by a device for rendering multiple views of a single scene, said multiple views comprising at least one basic view and at least one alternate view, said method comprising steps of:

- generating a stream by encoding said views;

- identifying at least one non-diffuse object in said views;

- generating data representative of said at least one non-diffuse object, said data comprising information enabling said device to render, for each said alternate view, said at least one non-diffuse object with a reflectance corresponding to this view; and

- transmitting said stream and said data to said device.

[0054] Correlatively, the invention proposes a server for transmitting data used by a device for rendering multiple views of a single scene, said multiple views comprising at least one basic view and at least one alternate view, said server comprising:

- a module of generating a stream by encoding said views;

- a module of identifying at least one non-diffuse object in said views;

- a module of generating data representative of said at least one non-diffuse object, said data comprising information enabling said device to render, for each said alternate view, said at least one non-diffuse object with a reflectance corresponding to this view; and

- a module of transmitting said stream and said data to said device.

[0055] The invention also proposes a method implemented by a device for rendering multiple views of a single scene, said multiple views comprising at least one basic view and at least one alternate view, a said view comprising a texture image, said method comprising steps of:

- receiving, from a server, a stream of encoded said views and data representative of at least one non-diffuse object in said views, said data comprising information enabling said device to render, for each view, said at least one non-diffuse object with the reflectance corresponding to this view;

- rendering said views based on said stream and said data, said at least one non-diffuse object being rendered with its corresponding reflectance.

[0056] Correlatively, the invention proposes a device for rendering multiple views of a single scene, said multiple views comprising at least one basic view and at least one alternate view, a said view comprising a texture image, said device comprising:

- a module of receiving, from a server, a stream of encoded said views and data representative of at least one non-diffuse object in said views, said data comprising information enabling said device to render, for each view, said at least one non-diffuse object with the reflectance corresponding to this view;

- a module of rendering said views based on said stream and said data, said at least one non-diffuse object being rendered with its corresponding reflectance.

[0057] This aspect of the disclosure proposes to transmit non-diffuse data which represent the non-diffuse objects in the views. It enables rendering non-diffuse surfaces while reducing the quantity of transmitted data. Indeed, not all detected non-diffuse pixels have to be individually represented in the transmitted data.

[0058] The identification of non-diffuse objects may be based on detection of non-diffuse pixels as described previously. Non-diffuse objects may also be objects labelled as such. This identification may include an additional step of selection among the detected non-diffuse pixels according to embodiments described above. In this case, a non-diffuse object can be detected by grouping adjacent detected non-diffuse pixels.

[0059] The detection of non-diffuse pixels can also be based on a machine learning method, for instance using a convolutional neural network which takes each view as input and outputs the position of each non-diffuse object in each view.

[0060] According to a particular embodiment of the method described above, said data comprise, for each said alternate view and for at least one said non-diffuse object, a texture difference between this object displayed in this alternate view and the same object displayed in a said basic view.

[0061] Instead of transmitting the texture of non-diffuse objects, this embodiment proposes that the transmitted data comprise differences of texture between the basic view and the alternate views. Transmitting differences of texture further reduces the size of transmitted data.

[0062] Since the textures of both diffuse and non-diffuse objects of the basic view are transmitted, the difference between textures of an alternate view and the basic view is sufficient for the client to recover the texture of non-diffuse pixels of the alternate view. Advantageously, recovering texture of non-diffuse objects in the alternate views from textures of the basic view and differences of textures may be performed with less computer resources.

[0063] This embodiment hence allows reducing the amount of data transmitted and accurately rendering non-diffuse objects in the alternate views with fewer computer resources.

[0064] According to another embodiment, said data comprise, for said at least one non-diffuse object, a set of parameters of a reflectance model of this object.

[0065] Reflectance models such as the Phong model are used to render the reflectance of an object based on parameters such as light source angles, the surface material of the object and the surface reflectance.

[0066] Advantageously, the parameters of the reflectance model represent a small amount of data to be transmitted. Furthermore, their size is independent of the size of the object. This embodiment hence enables to greatly reduce the amount of non-diffuse data to be transmitted to the client.
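
As an illustration of how few parameters such a model requires, the following Python sketch evaluates the Phong model at one surface point; the single-light setup, the function signature and the constants are assumptions made for this example.

```python
# Minimal Phong evaluation: the transmitted parameters (ambient/diffuse/
# specular coefficients and the shininess exponent) are tiny compared with
# per-pixel textures, and the result varies with the view direction.
import numpy as np

def phong_intensity(normal, light_dir, view_dir,
                    k_ambient, k_diffuse, k_specular, shininess,
                    light_intensity=1.0, ambient_intensity=1.0):
    """Phong illumination at one surface point."""
    n = np.asarray(normal, float); n /= np.linalg.norm(n)
    light = np.asarray(light_dir, float); light /= np.linalg.norm(light)
    view = np.asarray(view_dir, float); view /= np.linalg.norm(view)
    reflect = 2.0 * np.dot(n, light) * n - light   # mirrored light direction
    diffuse = max(np.dot(n, light), 0.0)
    specular = max(np.dot(reflect, view), 0.0) ** shininess
    return (k_ambient * ambient_intensity
            + k_diffuse * diffuse * light_intensity
            + k_specular * specular * light_intensity)

# The specular term, hence the rendered texture, changes with the viewing
# direction: exactly the view-dependent effect discussed here.
print(phong_intensity([0, 0, 1], [0, 0, 1], [0, 0, 1], 0.1, 0.6, 0.3, 32))  # ~1.0
print(phong_intensity([0, 0, 1], [0, 0, 1], [1, 0, 1], 0.1, 0.6, 0.3, 32))  # ~0.7
```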

[0067] According to a particular embodiment, the method comprises a step of generating a generated view corresponding to a position of a given point of view shared between said device and said server, said generating step being based on said at least one alternate view and said at least one basic view, said generated view comprising a texture image of said at least one non-diffuse object, wherein said data comprise information enabling said device to render said at least one non-diffuse object with a reflectance corresponding to the generated view.

[0068] In this embodiment, the generated view may not be coded within the stream.

[0069] This embodiment enables the client to render non-diffuse objects with reflectance corresponding to the generated views. This embodiment reduces the computer resources needed for rendering generated intermediate views on the client side.

[0070] The client may generate, from (i) the decoded stream which contains information on the basic view and alternate views and (ii) the shared position of the given point of view, an intermediate view for this point of view. Yet the information in the stream alone does not enable the client to render the non-diffuse objects in the intermediate view adequately. To render non-diffuse objects adequately in the intermediate view without significantly increasing the client resources, the client uses the textures of non-diffuse objects generated by the server.

[0071] Furthermore, only the information on the non-diffuse objects has to be transmitted, which limits the amount of data to be transmitted.

[0072] For the views generated by the server and the view generated by the client to be identical, a message comprising the position may be transmitted from the client to the server.

[0073] Thus, in one particular embodiment, the method comprises a step of receiving said position from said device.

[0074] According to a particular embodiment, said data comprise, for at least one said non-diffuse object:

(i) a difference between:

- a texture of this object in said generated view and

- a texture of this object in said at least one basic view; and

(ii) a difference between:

- a texture of this object in said generated view and

- a texture of this object in said at least one alternate view from which the generated view is generated.

[0075] This embodiment limits the amount of transmitted data which enables rendering the non-diffuse objects in both the generated views and the alternate views.

[0076] According to a particular embodiment, the step of generating the generated view is performed by computing the texture image of a said generated view as a weighted average of:

- textures of a said basic view; and

- textures of a said alternate view.

[0077] This embodiment provides a simple method in terms of computer resources needed for generating intermediate views. For instance, in order to generate a view with the same angle offset with the basic view as with an alternate view, the average of these basic and alternate views is computed, with equal weights for both views. The weights of the basic and alternate views can also be modified in order to change the angle offset with one view or the other.
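
A minimal Python sketch of this weighted average follows, assuming the textures are aligned numpy arrays; the function name and the sample values are illustrative assumptions.

```python
# Weighted-average view generation: equal weights yield a view "halfway"
# between the basic and alternate views; shifting the weights moves the
# generated view toward one or the other.
import numpy as np

def generate_texture(t_ref: np.ndarray, t_i: np.ndarray,
                     w_ref: float = 0.5, w_i: float = 0.5) -> np.ndarray:
    """Texture Ti* of the generated view Vi* as a weighted average."""
    return (w_ref * t_ref + w_i * t_i) / (w_ref + w_i)

t_ref = np.array([[100.0, 120.0], [80.0, 90.0]])
t_i = np.array([[110.0, 100.0], [84.0, 98.0]])
print(generate_texture(t_ref, t_i))            # midway view
print(generate_texture(t_ref, t_i, 0.8, 0.2))  # closer to the basic view
```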

[0078] According to a particular embodiment, said at least one basic view and said at least one alternate view comprise a depth map, and said generating step of a generated view comprises substeps of:

- generating a three-dimensional point cloud from textures and depth map of at least one of said basic and alternate views;

- generating the texture image of said generated view based on a projection of said point cloud in a two-dimensional space.

[0079] In this embodiment, the pixels of the basic and alternate views are mapped into the same three-dimensional space. The resulting point cloud is then projected in a two-dimensional space to form the generated texture image. Generally, this projection is incomplete and the generated texture image has missing pixels. The projected point cloud can be processed in order to determine the texture of the missing pixels.

[0080] The more views are used to generate the three-dimensional point cloud, the less texture will be missing in the projected point cloud, and the more accurate the generated texture image will be.
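
For illustration, the following Python sketch back-projects a texture and depth map into a point cloud with an assumed pinhole camera and then projects the cloud into a target image. The intrinsics, the identity extrinsics and the nearest-pixel splatting are simplifying assumptions, and no hole filling is performed.

```python
# Hedged sketch of the point-cloud pipeline: texture + depth -> 3D points
# -> reprojection into the (here identical) target view.
import numpy as np

def view_to_point_cloud(texture, depth, fx, fy, cx, cy):
    """Back-project every pixel (u, v) with depth d to a 3D point."""
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    points = np.stack([x, y, depth], axis=-1).reshape(-1, 3)
    return points, texture.reshape(-1)

def project_point_cloud(points, colors, fx, fy, cx, cy, h, w):
    """Splat 3D points into the target image plane; holes stay at -1."""
    image = np.full((h, w), -1.0)
    z = points[:, 2]
    u = np.round(points[:, 0] * fx / z + cx).astype(int)
    v = np.round(points[:, 1] * fy / z + cy).astype(int)
    ok = (u >= 0) & (u < w) & (v >= 0) & (v < h) & (z > 0)
    image[v[ok], u[ok]] = colors[ok]
    return image

tex = np.arange(16, dtype=float).reshape(4, 4)
dep = np.full((4, 4), 2.0)
pts, cols = view_to_point_cloud(tex, dep, fx=4, fy=4, cx=2, cy=2)
print(project_point_cloud(pts, cols, fx=4, fy=4, cx=2, cy=2, h=4, w=4))
```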

[0081] The invention also provides a computer program comprising instructions for performing the steps of the method according to any of the above-described embodiments, when said program is executed by a computer or a processor.

[0082] It should be noted that the computer programs referred to in this application may use any programming language, and be in the form of source code, object code, or code intermediate between source code and object code, such as in a partially compiled form, or in any other desirable form.

[0083] The invention also provides a storage medium, readable by computer equipment, of a computer program comprising instructions for executing the steps of the method according to one of the embodiments described above, when said program is executed by a computer or a processor.

[0084] The storage medium referred to in this statement may be any entity or device capable of storing the program and being read by any computer equipment, including a computer. For example, the medium may include a storage medium, or a magnetic storage medium, such as a hard drive.

[0085] Alternatively, the storage medium may correspond to a computer integrated circuit in which the program is incorporated, and adapted to execute a method as described above or to be used in the execution of such method.

Brief description of the drawings

[Fig. 1A] Figure 1A shows a server and a device according to prior art;

[Fig. 1B] figure 1B shows a stream of encoded views according to prior art;

[Fig. 2] figure 2 shows a server and a device according to a first embodiment of the invention;

[Fig. 3] figure 3 represents in a flowchart the main steps of a method implemented by the server and the main steps of a method implemented by the device in the first embodiment of the invention;

[Fig. 4] figure 4 shows a server and a device according to a second embodiment of the invention;

[Fig. 5] figure 5 represents in a flowchart the main steps of a method implemented by the server and the main steps of a method implemented by the device in the second embodiment of the invention;

[Fig. 6] figure 6 shows a server and a device according to a third embodiment of the invention;

[Fig. 7] figure 7 represents in a flowchart the main steps of a method implemented by the server and the main steps of a method implemented by the device in the third embodiment of the invention;

[Fig. 8A] figure 8A shows transmitted data according to a particular embodiment;

[Fig. 8B] figure 8B shows transmitted data according to a particular embodiment;

[Fig. 8C] figure 8C shows transmitted data according to a particular embodiment;

[Fig. 8D] figure 8D shows transmitted data according to a particular embodiment;

[Fig.9A] figure 9A represents the textures of objects displayed in two views;

[Fig.9B] figure 9B represents texture differences of objects displayed in two views, according to a particular embodiment;

[Fig.9C] figure 9C represents texture differences of objects displayed in two views, according to a particular embodiment.

[Fig 10] figure 10 illustrates the hardware architecture of a server according to a particular embodiment of the invention;

[Fig 11] figure 11 illustrates the hardware architecture of a device according to a particular embodiment of the invention.

Description of embodiments

[0086] Several embodiments of the invention will now be described. Generally speaking and as previously mentioned, the invention provides a method implemented by a server for generating and transmitting data used by a client device to render non-diffuse pixels of multiple views of a same scene.

First embodiment

[0087] The first embodiment is in the context where the size of the transmitted data depends on the transmission rate from the server to the device and/or on the pixel rate for decoding and rendering by the client device.

[0088] Figure 2A shows a server 1SVR and a device 1CTL according to this first embodiment. The device is a client which receives a stream STR of encoded views and additional data 1DND for rendering these views. Figure 2B represents a stream STR according to the first embodiment.

[0089] The server 1SVR comprises a module MS0 of obtaining multiple views of a same scene S, namely basic views Vref and alternate views Vi. This module MS0 can for instance be an acquisition module connected to cameras configured to capture the multiple views. In a variant, the module MS0 is a module of synthesizing the multiple views.

[0090] Each view Vref or Vi comprises a texture image Tref or Ti. In this embodiment, each texture image is associated with a depth map Dref or Di. Hence each pixel has an attribute of texture, which corresponds to a color, and an attribute of depth, which corresponds to a distance between the pixel and a point of view of the scene S.

[0091] The server 1SVR comprises an encoder module MS10 of generating a stream STR which contains information for partially rendering the multiple views Vref and Vi. The data 1DND comprise information on selected non-diffuse pixels of the alternate views Vi, and thus enable to refine the rendering of the views partially rendered from the stream STR.

[0092] As mentioned before, the person skilled in the art understands that, in the rendered multiple views:

- the texture of the selected non-diffuse pixels of alternate views would display the appropriate reflectance (i.e. the reflectance corresponding to the position of the viewer with respect to the object and the direction of the light in the scene S); and

- the texture of the non-selected non-diffuse pixels of alternate views would not display the appropriate reflectance.

[0093] In this embodiment, this encoding is done according to the standard MIV method. Accordingly, the module MS10 performs a pruning of the alternate views Vi. The stream STR generated by the module MS10 contains pixels from the basic view Vref and the pruned alternate views (also called the additional views) as well as metadata necessary for recovering the alternate views Vi from the basic view Vref and the additional views, but this recovering from the stream alone does not include the rendering of the adequate reflectance of the non-diffuse pixels.

[0094] The data 1DND are generated by the modules 1MS20, 1MS30 and 1MS40.

[0095] First, the module 1MS20 detects the non-diffuse pixels NDP in the alternate views Vi. This detection can be done by comparing, for each alternate view Vi, each pixel of this view with pixels of the basic view, based on their positions and their textures.

[0096] For instance, the pixels of the different views are mapped into a common three-dimensional space. This mapping can be based on the depth maps associated with each view. In each alternate view, the textures of pixels are compared with the textures of the pixels of the basic view which have close positions in the three-dimensional space. Then, if a difference of textures associated with a pair of pixels exceeds a predefined threshold, these pixels are detected as non-diffuse.
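
Assuming the pixel correspondences have already been established from the depth maps, the detection itself reduces to a thresholded texture comparison; the following Python sketch (all names and the threshold value are illustrative assumptions) shows this last step.

```python
# Illustrative detection step: flag as non-diffuse every matched pair whose
# texture difference exceeds a predefined threshold.
import numpy as np

def detect_non_diffuse(tex_alternate, tex_basic_matched, threshold=15.0):
    """Boolean mask of non-diffuse pixels in the alternate view."""
    diff = np.abs(tex_alternate.astype(float) - tex_basic_matched.astype(float))
    return diff > threshold

tex_vi = np.array([200.0, 130.0, 90.0, 50.0])    # alternate-view textures
tex_vref = np.array([120.0, 128.0, 92.0, 10.0])  # matched basic-view textures
print(detect_non_diffuse(tex_vi, tex_vref))      # [ True False False  True]
```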

[0097] Second, the module 1MS30 uses the differences of textures for each detected non-diffuse pixel to select the non-diffuse pixels SNDP. This selection enables reducing the quantity of information to transmit in case it exceeds transmission rate or pixel rate limitations.

[0098] In some cases, the pixel or transmission rate can change depending on the client device 1CLT. In this case, it is necessary for the client to inform the server of these changes.

[0099] To this end, in the embodiment represented in figure 2, a module MD0 of the client device 1CLT sends the pixel rate PR and transmission rate TR to the server 1SVR, which receives them via a module MS01.

[0100] The module 1MS30 obtains the pixel rate PR and transmission rate TR from the module MS01.

[0101] Third, the module 1MS40 generates the data 1DND representative of the non-diffuse pixels SNDP selected by the module 1MS30. These data 1DND will be decoded and used by the client device 1CLT to render the texture of the non-diffuse pixels of the alternate views.

[0102] For instance, the 1DND data comprise the textures of the selected non-diffuse pixels of the alternate views Vi.

[0103] As represented in figure 8A, the 1DND data may also comprise the differences of texture ΔT(Vref, Vi) between the non-diffuse pixels of the alternate views Vi and the pixels of the basic view which correspond to the same positions in the scene S.

[0104] In this embodiment, the size of the data 1DND is controlled during the selection of non-diffuse pixels by the module 1MS30. In particular, the module 1MS30 selects a number of non-diffuse pixels and/or modifies the coding scheme of at least some of the non-diffuse pixels as a function of the pixel rate and/or the transmission rate.

[0105] For instance, among the detected non-diffuse pixels NDP, in the case where differences of texture have been calculated during the detection of non-diffuse pixels by the module 1MS20, the pixels which have the highest differences are selected. The number of pixels SNDP so selected increases with the pixel rate and/or the transmission rate.
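
A minimal Python sketch of this policy follows: the detected pixels are ranked by texture difference and the count of kept pixels grows with the available rates. The budget formula and all names are assumptions for this example, not the application's.

```python
# Top-k selection: keep the non-diffuse pixels with the largest texture
# differences, with a pixel count limited by the bottleneck rate.
import numpy as np

def select_highest_differences(differences, pixel_rate, transmission_rate,
                               bytes_per_pixel=4, frame_duration_s=1 / 30):
    """Indices of the selected non-diffuse pixels SNDP, best first."""
    budget_pixels = int(min(pixel_rate, transmission_rate / 8 / bytes_per_pixel)
                        * frame_duration_s)
    order = np.argsort(differences)[::-1]  # largest differences first
    return order[:max(budget_pixels, 0)]

diffs = np.array([3.0, 25.0, 11.0, 40.0, 7.0, 18.0])
print(select_highest_differences(diffs, pixel_rate=120.0,
                                 transmission_rate=4000.0))  # [3 1 5 2]
```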

[0106] Without departing from the scope of the invention, the methods used for detecting or selecting the non-diffuse pixels SNDP can be different from the ones described above.

[0107] For instance, in one embodiment, epipolar plane image (EPI) lines are constructed from the views Vref and Vi. Each EPI line is formed with pixels of multiple views. The form of each EPI line, namely its curvature, depends on the reflectance of the surface this EPI line is associated with.

[0108] In particular, an EPI line and all the pixels forming it can be detected as non-diffuse if the curvature of the EPI line is non-zero.

[0109] Thereafter, the detected non-diffuse pixels which are associated with the most curved EPI lines are selected, the quantity of non-diffuse pixels SNDP so selected being a function of the pixel rate and/or the transmission rate.

[0110] A module MS45 encodes the stream STR and the non-diffuse data 1DND into messages to be transmitted. For instance, the basic and additional views in the stream are encoded with a standard format (such as MPEG-2, H.264, HEVC, VVC, VP8, VP9, AV1...) and the non-diffuse data 1DND are encoded into a separate message, for instance a supplemental enhancement information message.
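
Purely as an illustration of carrying the non-diffuse data in a side message, the following Python sketch packs per-pixel entries into an SEI-like payload. The payload layout, the payload type value and all names are invented for this example and are NOT the standardized SEI syntax.

```python
# Hedged sketch of an SEI-like wrapper: a hypothetical payload type, a
# pixel count, then per-pixel position and texture delta.
import struct

HYPOTHETICAL_SEI_PAYLOAD_TYPE = 0x7F  # assumed user-data type, not standard

def pack_non_diffuse_sei(pixels):
    """pixels: iterable of (x, y, texture_delta) tuples, delta in [-128, 127]."""
    body = struct.pack("<H", len(pixels))
    for x, y, delta in pixels:
        body += struct.pack("<HHb", x, y, delta)
    return struct.pack("<BH", HYPOTHETICAL_SEI_PAYLOAD_TYPE, len(body)) + body

def unpack_non_diffuse_sei(payload):
    _ptype, _size = struct.unpack_from("<BH", payload, 0)
    (count,) = struct.unpack_from("<H", payload, 3)
    return [struct.unpack_from("<HHb", payload, 5 + 5 * i) for i in range(count)]

msg = pack_non_diffuse_sei([(10, 20, -7), (11, 20, 12)])
print(unpack_non_diffuse_sei(msg))  # [(10, 20, -7), (11, 20, 12)]
```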

[0111] A module MS50 transmits these encoded stream STR and data 1DND to the client device 1CLT.

[0112] These encoded stream STR and data 1DND are received on the client side by a module MD60, and then decoded by a module MD61.

[0113] In this embodiment, the data 1DND representative of non-diffuse pixels and the stream STR are processed independently by the client 1CLT.

[0114] In this embodiment, a module MD65 reconstructs the alternate views Vi according to the MIV method. In particular, the alternate views Vi are reconstructed based on the basic view Vref, the additional views and the metadata comprised in the stream STR. This reconstruction of the alternate views Vi is partial since it does not include the rendering of the adequate texture of the non-diffuse surfaces. In particular, this partial rendering does not enable to display the adequate reflectance of the non-diffuse surfaces.

[0115] To complete the rendering, a module 1MD70 combines the partially rendered alternate views with the data 1DND to render the selected non-diffuse pixels SNDP with reflectance corresponding to their respective alternate view, and outputs the reconstructed basic view Vref~ and the reconstructed alternate views Vi~.

[0116] Figure 3 represents in a flowchart the main steps of the methods respectively implemented by the server and the device in this first embodiment.

[0117] On the client 1CLT side, the first embodiment of the method for rendering multiple views comprises the following steps:

• a step SD0, performed by the module MD0, of sending the pixel rate PR and transmission rate TR to the server 1SVR;

• a step SD60, performed by the module MD60, for receiving the stream STR and the non-diffuse data 1DND transmitted by the server 1SVR;

• a step SD61, performed by the module MD61 of decoding the transmitted stream STR and non-diffuse data 1DND;

• a step SD65, performed by the module MD65, of reconstructing the alternate views Vi from the stream STR;

• a step 1SD70, performed by the module 1MD70 of rendering the views Vref and Vi.

[0118] On the server 1SVR side, the first embodiment of the method for transmitting data used by the device 1CTL for rendering the views comprises the following steps:

• a step SS01, performed by the module MS01, for receiving from the device 1CLT the pixel and transmission rates PR and TR;

• a step SS0, performed by the module MS0, for obtaining the basic view Vref and the alternate views Vi;

• a step SS10, performed by the module MS10 for encoding the stream STR;

• a step 1SS20, performed by the module 1MS20 for detecting non-diffuse pixels;

• a step 1SS30, performed by the module 1MS30 for selecting the non-diffuse pixels SNDP;

• a step 1SS40, performed by the module 1MS40 for generating the non-diffuse data 1DND;

• a step SS45, performed by the module MS45 for encoding the stream STR and the non-diffuse data;

• a step SS50, performed by the module MS50 for sending the encoded stream STR and non-diffuse data 1DND.

Second embodiment

[0119] In this second embodiment, the stream STR of encoded views, shown in figure 2B, is identical or similar to that of the first embodiment.

[0120] This second embodiment is in the context where non-diffuse objects are detected and where the transmitted data comprise information enabling the client to render these non-diffuse objects with the reflectance corresponding to each view in which these objects appear.

[0121] In an example shown by figure 9, two non-diffuse objects O1 and O2 are identified.

[0122] Figure 9A shows the texture image Tref of a basic view Vref and the texture image Ti of an alternate view Vi. Two objects O1 and O2 are identified in these texture images. Only the pixels representing these objects are shown.

[0123] In figure 9A, each texture of a pixel is represented by a number. Note that this example is purely illustrative. A pixel on a texture image can be represented by several numbers. For instance, with an RGB representation, a pixel is represented by three values, each one indicating respectively the levels of red, green and blue of the texture image at the position of this pixel.

[0124] Due to the different points of view, the objects O1 and O2 do not have the same appearance in the different views Vref and Vi, hence the pixels representing the objects O1 and O2 have different textures depending on the view Vref or Vi in which they appear.

[0125] Figure 9B shows the pixelwise differences ΔT_1(Vref, Vi) between the texture of object O1 in the alternate view Vi and the texture of the same object in the basic view Vref. Figure 9B also shows the pixelwise differences of texture ΔT_2(Vref, Vi) for object O2.

[0126] In the case where the texture of a pixel corresponds to several numbers, a difference of texture between two pixels can be the difference between the sums of the numbers representing the textures. For instance, in an RGB representation, the difference between a texture (20,100,0) and a texture (10,105,2) is defined as 20 + 100 + 0 - 10 - 105 - 2 = 3. In RGB representation, such a difference corresponds to a difference of luminance between the two pixels.
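
As a one-line check of this arithmetic, in Python (the function name is an illustrative assumption):

```python
# Luminance-style difference between two RGB pixels, as defined above.
def texture_difference(rgb_a, rgb_b):
    """Difference between the sums of the RGB components of two pixels."""
    return sum(rgb_a) - sum(rgb_b)

print(texture_difference((20, 100, 0), (10, 105, 2)))  # 3
```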

[0127] Figure 9C shows the average difference AΔT_1(Vref, Vi) between the texture of object O1 in the alternate view Vi and the texture of the same object in the basic view Vref. The average AΔT_1(Vref, Vi) shown in figure 9C corresponds to the average of the differences ΔT_1(Vref, Vi) shown in figure 9B. Figure 9C also shows the average difference of texture AΔT_2(Vref, Vi) for object O2.

[0128] Figure 4 shows a server 2SVR and a device 2CLT according to the second embodiment of the invention.

[0129] In this embodiment, the modules MS0, MS10, MS45, MS50, MD60, MD61, MD65 of the server 2SVR are identical to the modules of the server 1SVR with the same reference.

[0130] In this embodiment, the data 2DND representative of non-diffuse surfaces are generated by two modules 2MS20 and 2MS40.

[0131] The module 2MS20 identifies the non-diffuse objects in each view. The corresponding identification step 2SS20 may consist of detecting non-diffuse pixels in each view, as described previously for the first embodiment, and applying mathematical morphology operations such as opening and closing in order to associate groups of non-diffuse pixels with distinct objects, as sketched below.
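The following sketch illustrates one possible implementation of this grouping, given only as an illustration: it assumes a boolean map non_diffuse marking the detected non-diffuse pixels, and the structuring element and function names are assumptions, not elements of the patent.

import numpy as np
from scipy import ndimage

def group_non_diffuse_pixels(non_diffuse):
    # Opening removes isolated false detections, closing fills small holes.
    cleaned = ndimage.binary_opening(non_diffuse, structure=np.ones((3, 3)))
    cleaned = ndimage.binary_closing(cleaned, structure=np.ones((3, 3)))
    # Connected-component labelling assigns one integer id per group of
    # non-diffuse pixels, i.e. one id per candidate object.
    labels, n_objects = ndimage.label(cleaned)
    return labels, n_objects  # labels[y, x] == k for pixels of object k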

[0132] The identification of non-diffuse objects in each view can also be performed with machine learning techniques. For instance, the non-diffuse objects can be identified in each view with a convolutional neural network trained to detect objects in texture images and to classify their type of surface (such as diffuse or non-diffuse).
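A hypothetical sketch of such a network is given below; the architecture, layer sizes and names are purely illustrative and not taken from the patent. It classifies fixed-size texture patches as diffuse or non-diffuse.

import torch
import torch.nn as nn

class SurfaceClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, 2)  # logits: diffuse vs non-diffuse

    def forward(self, x):
        # x: (N, 3, H, W) RGB texture patches.
        return self.head(self.features(x).flatten(1))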

[0133] Each object in the scene S can be associated with a label. In each view, the non-diffuse pixels belonging to an object are associated with the label of this object. These labels make it possible to compare a group of pixels of an alternate view Vi with the group of pixels of the basic view Vref that corresponds to the same object.

[0134] The module 2MS40 generates the data 2DND representative of the non-diffuse objects identified by the module 2MS20.

[0135] In one embodiment, the non-diffuse data 2DND generated by the module 2MS40 comprise pixelwise texture differences such as those illustrated in figure 9B.

[0136] In another embodiment, the non-diffuse data 2DND generated by the module 2MS40 comprise average texture differences such as those illustrated in figure 9C.

[0137] The non-diffuse data 2DND may also comprise labels Tags(Vi) of each non-diffuse object in the alternate views Vi, as represented in figure 8B.

[0138] The non-diffuse data 2DND may also comprise positions Pos(Vi) of each non-diffuse object in the alternate views Vi, as represented in figure 8B. In particular, Pos(Vi) can be the position of each object in the texture image Ti of the alternate view Vi relative to its position in the texture image Tref of the basic view Vref.

[0139] On the client device 2CLT represented in figure 4, the module 2MD70 uses the non-diffuse data 2DND to combine the texture differences ΔT_1(Vref, Vi) and ΔT_2(Vref, Vi) with the texture images of the pruned alternate views. In this embodiment, these texture differences are combined with the pruned views respectively corresponding to the alternate views Vi, at the adequate positions on the images Ti, thanks to the labels Tags(Vi) and the positions Pos(Vi); a sketch of this combination is given below.
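This illustrative sketch assumes the pixelwise differences are channel sums as in paragraph [0126], that Pos(Vi) gives the top-left offset of the object in the image, and that the scalar difference is spread evenly over the three channels; this spreading rule and all names are assumptions of the sketch, not elements of the patent.

import numpy as np

def apply_texture_difference(texture, diff, pos):
    # Add a per-pixel channel-sum difference back onto an H x W x 3
    # texture at the object's position (y, x) given by Pos(Vi).
    y, x = pos
    h, w = diff.shape
    patch = texture[y:y + h, x:x + w].astype(np.int32)
    # Assumption: spread the scalar difference evenly over the channels.
    patch += diff[..., None] // 3
    texture[y:y + h, x:x + w] = np.clip(patch, 0, 255).astype(np.uint8)
    return texture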

[0140] In another embodiment, instead of comprising texture differences, the non-diffuse data 2DND comprise parameters NDPar1, NDPar2 of a reflectance model for each non-diffuse object according to each point of view. The parameters of the reflectance model of each non-diffuse object are used on the client side by the module 2MD70 to compute the appropriate texture of this object.

[0141] An example of such a reflectance model is the Phong model. The Phong model provides an equation for computing the illumination of each surface point of an object.

[0142] Information on light source angles, surface materials and surface reflectance is needed for rendering the non-diffuse objects. This information can be known either because the scene S is synthetic and was generated by rendering software (such as Blender or Unity), or because the natural scene has been manually labelled.

[0143] For a given alternate view Vi, the parameters of the reflectance model may comprise the relative angle or position between the observer and each object.
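By way of illustration only, the Phong equation mentioned in paragraph [0141] can be sketched as follows; the coefficients ka, kd, ks, the shininess exponent and the vector names are assumptions of this sketch, not parameters defined by the patent.

import numpy as np

def phong_intensity(n, l, v, ka=0.1, kd=0.7, ks=0.5, shininess=32.0,
                    ambient=1.0, light=1.0):
    # Illumination of one surface point as the sum of an ambient, a diffuse
    # and a specular (non-diffuse) term, following the classical Phong model.
    # n: surface normal, l: direction to the light, v: direction to the observer.
    n, l, v = (u / np.linalg.norm(u) for u in (n, l, v))  # unit vectors
    diffuse = max(np.dot(n, l), 0.0)
    r = 2.0 * np.dot(n, l) * n - l  # reflection of l about the normal
    specular = max(np.dot(r, v), 0.0) ** shininess if diffuse > 0.0 else 0.0
    return ka * ambient + light * (kd * diffuse + ks * specular)

# Example: light and observer aligned with the normal give full reflection.
print(phong_intensity(n=np.array([0.0, 0.0, 1.0]),
                      l=np.array([0.0, 0.0, 1.0]),
                      v=np.array([0.0, 0.0, 1.0])))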

[0144] Figure 5 is a flowchart representing the main steps of the methods respectively implemented by the server and the device in this second embodiment.

[0145] On the client 2CLT side, the second embodiment of the method for rendering multiple views comprises the following steps:

• a step SD60, performed by the module MD60, for receiving the stream STR and the non-diffuse data 2DND transmitted by the server 2SVR;

• a step SD61, performed by the module MD61 of decoding the transmitted stream STR and non-diffuse data 2DND;

• a step SD65, performed by the module MD65 of decoding the stream STR;

• a step 2SD70, performed by the module 2MD70 of rendering the views Vref and Vi.

[0146] On the server 2SVR side, the second embodiment of the method for transmitting data used by the device 2CLT for rendering the views comprises the following steps:

• a step SS0, performed by the module MS0, for obtaining the basic view Vref and the alternate views Vi;

• a step SS10, performed by the module MS10 for encoding the stream STR;

• a step 2SS20, performed by the module 2MS20 for identifying non-diffuse objects;

• a step 2SS40, performed by the module 2MS40 for generating the non-diffuse data 2DND;

• a step SS45, performed by the module MS45 for encoding the stream STR and the non-diffuse data 2DND;

• a step SS50, performed by the module MS50 for sending the encoded stream STR and non-diffuse data 2DND.

Third embodiment

[0147] In this third embodiment, the stream STR of encoded views, shown in figure 2B, is identical or similar to that of the first and second embodiments.

[0148] The third embodiment applies in the context where views Vi* are generated by the server 3SVR. These generated views Vi* comprise texture images Ti* in which the non-diffuse objects are represented from new points of view. They are used by the client device 3CLT to render the non-diffuse objects according to these new points of view.

[0149] Figure 6 shows a server 3SVR and a device 3CLT according to the third embodiment.

[0150] Compared to the second embodiment, this third embodiment comprises an additional module 3MS30 for generating the views Vi*.

[0151] Also in this embodiment, a module MD24 of the device 3CLT sends information IPos representative of the new points of view. For example, this information IPos may comprise, for each view to be generated, the position of the observer of the scene corresponding to this view. In another example, this information IPos comprises the relative positions of the observer with respect to each non-diffuse object.

[0152] The information IPos is received by the module MS25 of the server 3SVR, and then used by the module 3MS30 for generating the views Vi*.

[0153] In this embodiment, the views Vi* are generated based on the basic and alternate views Vref and Vi. In particular, the depth map of each of these views Vref and Vi is used to map each pixel to a point of a three-dimensional point cloud representing the scene S. Then, each point of this three-dimensional representation of the scene S is projected onto a texture image representing a view of the scene S according to the information IPos on the observer's position.
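A minimal sketch of this depth-based generation is given below; it assumes a pinhole camera model with an intrinsic matrix K and a rigid transform (R, t) from the decoded view to the new point of view. The model and all names are assumptions of the sketch, not elements of the patent.

import numpy as np

def reproject(depth, K, R, t):
    # Unproject every pixel of a view to a 3D point cloud using its depth
    # map, then project each point into the new view given by (R, t).
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T
    pts = np.linalg.inv(K) @ pix * depth.reshape(-1)  # 3 x N point cloud
    pts_new = R @ pts + t.reshape(3, 1)               # into the new view
    proj = K @ pts_new
    return (proj[:2] / proj[2]).T.reshape(h, w, 2)    # (x, y) per source pixel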

[0154] In another embodiment, a texture image Ti* of a view Vi* is computed as a weighted average of the texture image Ti of an alternate view Vi and the texture image Tref of the basic view Vref. For instance, if the view Vi* corresponds to a point of view placed halfway between the point of view of the view Vi and the point of view of the view Vref, the texture image Ti* is computed as the average of the texture images Ti and Tref.

[0155] The average of two texture images is a texture image formed by averaging each pair of corresponding pixels of the two images.
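A minimal sketch of this weighted average, assuming a weight w encoding where the new point of view lies between Vi and Vref (w = 0.5 for the halfway case of paragraph [0154]); the names are assumptions of the sketch:

import numpy as np

def blend_textures(t_i, t_ref, w=0.5):
    # Pixelwise weighted average of two equally-sized texture images.
    return (w * t_i.astype(np.float32)
            + (1.0 - w) * t_ref.astype(np.float32)).astype(np.uint8)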

[0156] In the third embodiment, for each non-diffuse object O1 and O2, each generated view Vi* and each alternate view Vi, the module 3MS40 computes:

• the differences ΔT_1(Vref, Vi*) and ΔT_2(Vref, Vi*) between the texture of the object in the generated view Vi* and the texture of this object in the basic view Vref; and

• the differences ΔT_1(Vi, Vi*) and ΔT_2(Vi, Vi*) between the texture of the object in the generated view Vi* and the texture of this object in the alternate view Vi.

[0157] These texture differences are included in the non-diffuse data 3DND used by the client device 3CLT to render the non-diffuse objects in the alternate views Vi as well as in the generated views Vi*.

[0158] In this embodiment, a module MD66 of the device 3CLT generates views from the basic view Vref and the pruned alternate views Vi (i.e. the additional views) decoded by the module MD65. The module MD66 also uses the information IPos to generate the views according to the observer's position given in the information IPos.

[0159] The module MD66 may use the same method as the module 3MS30 for generating the views.

[0160] It is important to note that the views generated by the module MD66 do not comprise information about the appropriate reflectance of the non-diffuse objects.

[0161] The module 3MD70 then combines the non-diffuse data 3DND with these views generated by the module MD66 to render the generated views Vi*, as well as the alternate views Vi, with the adequate textures of non-diffuse objects.

[0162] In another embodiment, the complete views Vi* generated on the server side are transmitted to the client 3CLT. In this case, a larger amount of data 3DND has to be transmitted, but the client does not have to compute generated views. This can be advantageous when the pixel rate PR of the client is low.

[0163] In the third embodiment, the non-diffuse data 3DND also comprise labels Tags(Vi) and Tags(Vi*) of each non-diffuse object appearing in the alternate and generated views Vi and Vi*, as represented in figure 8D.

[0164] The non-diffuse data 3DND can also comprise positions Pos(Vi) and Pos(Vi*) of each non-diffuse object appearing in the views Vi and Vi*, as represented in figure 8D.

[0165] Figure 7 is a flowchart representing the main steps of the methods respectively implemented by the server and the device in this third embodiment.

[0166] On the client 3CLT side, the third embodiment of the method for rendering multiple views comprises the following steps:

• a step SD24, performed by the module MD24, of sending the information IPos to the server 3SVR;

• a step SD60, performed by the module MD60, for receiving the stream STR and the non-diffuse data 3DND transmitted by the server 3SVR;

• a step SD61, performed by the module MD61 of decoding the transmitted stream STR and non-diffuse data 3DND;

• a step SD65, performed by the module MD65 of decoding the stream STR;

• a step SD66, performed by the module MD66 of generating views;

• a step 3SD70, performed by the module 3MD70 of rendering the views Vref and Vi.

[0167] On the server 3SVR side, the third embodiment of the method for transmitting data used by the device 3CLT for rendering the views comprises the following steps:

• a step SS0, performed by the module MS0, for obtaining the basic view Vref and the alternate views Vi;

• a step SS10, performed by the module MS10 for encoding the stream STR;

• a step 3SS20, performed by the module 3MS20 for detecting non-diffuse pixels;

• a step SS25, performed by the module MS25 for receiving the information IPos;

• a step 3SS30, performed by the module 3MS30 for generating the views Vi*;

• a step 3SS40, performed by the module 3MS40 for generating the non-diffuse data 3DND;

• a step SS45, performed by the module MS45 for encoding the stream STR and the non-diffuse data 3DND;

• a step SS50, performed by the module MS50 for sending the encoded stream STR and non-diffuse data 3DND.

[0168] As illustrated in figure 10, the server 3SVR comprises in particular a processor 1_SRV, a random access memory 3_SRV, a read-only memory 2_SRV and a non-volatile flash memory 4_SRV.

[0169] The read-only memory 2_SRV constitutes a recording medium according to the invention, readable by the processor 1_SRV and on which a computer program PG_SRV according to the invention is recorded.

[0170] The computer program PG_SRV defines the functional (and here software) modules of the server.

[0171] As illustrated in figure 11, the device CLT comprises in particular a processor 1_CLT, a random access memory 3_CLT, a read-only memory 2_CLT and a non-volatile flash memory 4_CLT.

[0172] The read-only memory 2_CLT constitutes a recording medium according to the invention, readable by the processor 1_CLT and on which a computer program PG_CLT according to the invention is recorded.

[0173] The computer program PG_CLT defines the functional (and here software) modules of the device CLT.