Title:
METHOD, APPARATUS AND COMPUTER PROGRAM PRODUCT PROVIDING FOR SIGNALING OF VIEWPORT ORIENTATION TIMING IN PANORAMIC VIDEO DELIVERY
Document Type and Number:
WIPO Patent Application WO/2021/144139
Kind Code:
A1
Abstract:
A method, apparatus, and computer program product provide for signaling of viewport orientation timing, such as in panoramic video delivery. In the context of a method, the method receives a first stream comprising panoramic video content based on a first viewport orientation. The method generates a feedback message comprising one or more updated parameters of a second viewport orientation and causes transmission of the feedback message. The method also receives a second stream comprising panoramic video content based on the second viewport orientation of the client device. The method further comprises determining a motion to high-quality delay value based on timestamps associated with the first and second streams and causing transmission of the motion to high-quality delay value.

Inventors:
AHSAN SABA (FI)
CURCIO IGOR DANILO DIEGO (FI)
Application Number:
PCT/EP2020/088035
Publication Date:
July 22, 2021
Filing Date:
December 30, 2020
Assignee:
NOKIA TECHNOLOGIES OY (FI)
International Classes:
H04N21/218; H04N21/242; H04N21/43; H04N21/6587; H04N21/81
Domestic Patent References:
WO2019120638A12019-06-27
WO2018049221A12018-03-15
Other References:
YONG HE (INTERDIGITAL) ET AL: "MPEG-I: VR Experience Metrics", no. m41120, 11 July 2017 (2017-07-11), XP030069463, Retrieved from the Internet [retrieved on 20170711]
Attorney, Agent or Firm:
NOKIA EPO REPRESENTATIVES (FI)
Claims:
THAT WHICH IS CLAIMED:

1. A method comprising: receiving, at a client device, a first stream comprising panoramic video content based on a first viewport orientation; detecting an event comprising a change in the first viewport orientation to a second viewport orientation; in response to the event, generating a feedback message comprising one or more updated parameters of the second viewport orientation; causing transmission of the feedback message to a source device; receiving, in response to the transmission, a second stream comprising panoramic video content based on the second viewport orientation; determining a first timestamp value associated with a frame of the first stream; and determining a second timestamp value associated with a frame of the second stream.

2. The method according to claim 1, further comprising: determining, based at least on a difference between the first timestamp value and the second timestamp value, a motion to high-quality delay value; and causing transmission of the motion to high-quality delay value to the source device.

3. The method according to claim 2, further comprising: storing the motion to high-quality delay value in a data structure comprising a plurality of previously determined motion to high-quality delay values.

4. The method according to any one of claims 1-3, further comprising: determining a viewport-change timestamp value based on a time at which the change in the first viewport orientation to the second viewport orientation occurred; determining a viewport-delivered timestamp value based on a time at which the second stream is first received at the client device; and causing transmission of one or more of the first timestamp value, the second timestamp value, the viewport-change timestamp value, and the viewport-delivered timestamp value to the source device.

5. The method according to any one of claims 1-4, wherein the frame of the first stream comprises a last frame rendered at the client device in association with the first stream.

6. The method according to any one of claims 1-5, wherein the frame of the second stream comprises a first frame rendered at the client device in association with the second stream.

7. The method according to any one of claims 1-6, wherein the first timestamp value and the second timestamp value are Real Time Protocol (RTP) timestamp values.

8. A computer program product comprising a non-transitory computer readable storage medium having program code portions stored thereon, the program code portions configured, upon execution, to: receive a first stream comprising panoramic video content based on a first viewport orientation; detect an event comprising a change in the first viewport orientation to a second viewport orientation; in response to the event, generate a feedback message comprising one or more updated parameters of the second viewport orientation; cause transmission of the feedback message; receive, in response to the transmission, a second stream comprising panoramic video content based on the second viewport orientation; determine a first timestamp value associated with a frame of the first stream; and determine a second timestamp value associated with a frame of the second stream.

9. The computer program product according to claim 8, wherein the program code portions are further configured, upon execution, to: determine, based at least on a difference between the first timestamp value and the second timestamp value, a motion to high-quality delay value; and cause transmission of the motion to high-quality delay value.

10. The computer program product according to claim 9, wherein the program code portions are further configured, upon execution, to: store the motion to high-quality delay value in a data structure comprising a plurality of previously determined motion to high-quality delay values.

11. The computer program product according to any one of claims 8-10, wherein the program code portions are further configured, upon execution, to: determine a viewport-change timestamp value based on a time at which the change in the first viewport orientation to the second viewport orientation occurred; determine a viewport-delivered timestamp value based on a time at which the second stream is first received; and cause transmission of one or more of the first timestamp value, the second timestamp value, the viewport-change timestamp value, and the viewport-delivered timestamp value.

12. The computer program product according to any one of claims 8-11, wherein the frame of the first stream comprises a last frame rendered in association with the first stream.

13. The computer program product according to any one of claims 8-12, wherein the frame of the second stream comprises a first frame rendered in association with the second stream.

14. The computer program product according to any one of claims 8-13, wherein the first timestamp value and the second timestamp value are Real Time Protocol (RTP) timestamp values.

15. An apparatus comprising: means for receiving, at a client device, a first stream comprising panoramic video content based on a first viewport orientation; means for detecting an event comprising a change in the first viewport orientation to a second viewport orientation; means for, in response to the event, generating a feedback message comprising one or more updated parameters of the second viewport orientation; means for causing transmission of the feedback message to a source device; means for receiving, in response to the transmission, a second stream comprising panoramic video content based on the second viewport orientation; means for determining a first timestamp value associated with a frame of the first stream; and means for determining a second timestamp value associated with a frame of the second stream.

16. The apparatus according to claim 15, further comprising: means for determining, based at least on a difference between the first timestamp value and the second timestamp value, a motion to high-quality delay value; and means for causing transmission of the motion to high-quality delay value to the source device.

17. The apparatus according to claim 16, further comprising: means for storing the motion to high-quality delay value in a data structure comprising a plurality of previously determined motion to high-quality delay values.

18. The apparatus according to any one of claims 15-17, further comprising: means for determining a viewport-change timestamp value based on a time at which the change in the first viewport orientation to the second viewport orientation occurred; means for determining a viewport-delivered timestamp value based on a time at which the second stream is first received at the client device; and means for causing transmission of one or more of the first timestamp value, the second timestamp value, the viewport-change timestamp value, and the viewport-delivered timestamp value to the source device.

19. The apparatus according to any one of claims 15-18, wherein the frame of the first stream comprises a last frame rendered at the client device in association with the first stream.

20. The apparatus according to any one of claims 15-19, wherein the frame of the second stream comprises a first frame rendered at the client device in association with the second stream.

21. The apparatus according to any one of claims 15-20, wherein the first timestamp value and the second timestamp value are Real Time Protocol (RTP) timestamp values.

22. A method comprising: causing transmission of, to a client device, a first stream comprising panoramic video content based on a first viewport orientation; receiving a feedback message comprising one or more updated parameters of a second viewport orientation; generating, in response to the feedback message, a second stream comprising panoramic video content based on the second viewport orientation; causing transmission of the second stream to the client device; and in response to the transmission of the second stream, receiving one or more determined parameters from the client device.

23. The method according to claim 22, further comprising: updating the second stream based on the received one or more determined parameters; and causing transmission of the updated second stream to the client device.

24. The method according to any of claims 22-23, wherein the one or more determined parameters comprises a motion to high-quality delay value.

25. The method according to any of claims 22-24, wherein the one or more determined parameters comprises a first timestamp value, a second timestamp value, a viewport-change timestamp value, and a viewport-delivered timestamp value.

26. The method according to claim 25, wherein the first timestamp value and the second timestamp value are Real Time Protocol (RTP) timestamp values.

27. An apparatus comprising: processing circuitry; and at least one memory including computer program code, the at least one memory and the computer program code configured to, with the processing circuitry, cause the apparatus at least to: cause transmission of, to a client device, a first stream comprising panoramic video content based on a first viewport orientation; receive a feedback message comprising one or more updated parameters of a second viewport orientation; generate, in response to the feedback message, a second stream comprising panoramic video content based on the second viewport orientation; cause transmission of the second stream to the client device; and in response to the transmission of the second stream, receive one or more determined parameters from the client device.

28. The apparatus according to claim 27, wherein the at least one memory and the computer program code are further configured to, with the processing circuitry, cause the apparatus to: update the second stream based on the received one or more determined parameters; and cause transmission of the updated second stream to the client device.

29. The apparatus according to any of claims 27-28, wherein the one or more determined parameters comprises a motion to high-quality delay value.

30. The apparatus according to any of claims 27-29, wherein the one or more determined parameters comprises a first timestamp value, a second timestamp value, a viewport-change timestamp value, and a viewport-delivered timestamp value.

31. The apparatus according to claim 30, wherein the first timestamp value and the second timestamp value are Real Time Protocol (RTP) timestamp values.

32. A computer program product comprising a non-transitory computer readable storage medium having program code portions stored thereon, the program code portions configured, upon execution, to perform the method of any of claims 22 to 26.

Description:
METHOD, APPARATUS AND COMPUTER PROGRAM PRODUCT PROVIDING FOR SIGNALING OF VIEWPORT ORIENTATION TIMING IN PANORAMIC VIDEO DELIVERY

TECHNOLOGICAL FIELD

[0001] An example embodiment relates generally to immersive content consumption, and, more particularly, to techniques for signaling of viewport orientation timing in panoramic video delivery.

BACKGROUND

[0002] An increasing amount of video content is captured and delivered for a variety of different applications. For example, video content may be delivered via streaming, such as for virtual reality applications or other types of applications. The video content that is captured and delivered may be expansive and may provide a panoramic view (e.g., a 180° view, 360° view, omnidirectional view, and/or the like). In contrast with traditional video, omnidirectional video enables a spherical viewing direction with support for head-mounted displays, providing an interactive and immersive experience for users. As such, users may have only a limited field of view at any one instant and may change their viewing direction, such as by rotating their head when wearing a head-mounted display (HMD) while continuing to view the panoramic video content. In conjunction with virtual reality content, the entire content of a panoramic video may be streamed to a player. In this regard, users of a virtual reality application generally have a limited field of view such that at any point in time the user views only a portion of the panoramic video content.

[0003] Viewport-dependent streaming is based upon the viewing direction of the user equipped with the HMD such that the video content located in the viewing direction is delivered at a high quality (e.g., comprising a high resolution, a high framerate, and/or the like), while all other video content is delivered at a lower quality. As the users alter their viewing direction, the video content that is delivered at the high quality is correspondingly altered to correspond to the updated viewing direction, while all other video content is delivered at a lower quality. In an instance of head or body motion, or more generally, a change of user gaze direction, new viewport orientation information may be sent to a streaming source using a feedback message. The streaming source may then update the transmitted video stream based on the new viewport orientation information. However, current viewport-dependent streaming methods have limitations, including, for example, inaccurate calculation of important quality parameters.

BRIEF SUMMARY

[0004] A method, apparatus, and computer program product are disclosed for providing for signaling of viewport orientation timing in panoramic video delivery. In one embodiment, the method, apparatus and computer program product are configured to receive a first stream based on a first viewport orientation and receive a second stream based on a second viewport orientation. The method, apparatus and computer program product are further configured to determine a first timestamp value associated with a frame of the first stream, determine a second timestamp value associated with a frame of the second stream, and determine, based at least on a difference between the first timestamp value and the second timestamp value, a motion to high-quality delay value, which may then be signaled to a source device. By providing for signaling of viewport orientation timing in panoramic (e.g., 360°, omnidirectional, and/or the like) video delivery, user experience during immersive content consumption may be improved. By receiving the timing information made available through this disclosure, a source device may make more informed decisions on how to use available bandwidth. Further benefits include improved session monitoring, higher-level metrics in Real Time Control Protocol (RTCP) reports, and lower overhead, as sending motion to high-quality delay values or, in some embodiments, only relevant timing information, provides more accurate information in an efficient manner. Additionally, aggregated and/or periodic transmission of motion to high-quality delay values may further reduce overhead.
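The timestamp-difference calculation described above can be illustrated with a short sketch (illustrative only, not part of the claimed subject matter). It assumes the two timestamps are RTP timestamps on the usual 90 kHz video media clock; the modular subtraction accounts for the 32-bit wrap-around of RTP timestamps.

```python
# Illustrative computation of a motion to high-quality delay value from the
# RTP timestamp of the last frame of the first stream and the RTP timestamp
# of the first frame of the second stream.

RTP_CLOCK_HZ = 90_000     # typical RTP media clock rate for video payloads
RTP_TS_MODULUS = 2 ** 32  # RTP timestamps are 32-bit and wrap around


def motion_to_high_quality_delay_ms(first_ts, second_ts, clock_hz=RTP_CLOCK_HZ):
    """Delay, in milliseconds, between the last old-viewport frame and the
    first new-viewport frame, derived from their RTP timestamps."""
    ticks = (second_ts - first_ts) % RTP_TS_MODULUS  # handles wrap-around
    return ticks * 1000.0 / clock_hz


# Example: 18,000 ticks at 90 kHz corresponds to a 200 ms delay.
delay = motion_to_high_quality_delay_ms(1_000_000, 1_018_000)
```

Because the difference is taken modulo 2^32, the calculation remains correct even when the second timestamp has wrapped past zero.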

[0005] In one aspect, a method is provided comprising receiving, at a client device, a first stream comprising panoramic video content based on a first viewport orientation. The method further comprises detecting an event comprising a change in the first viewport orientation to a second viewport orientation. The method further comprises, in response to the event, generating a feedback message comprising one or more updated parameters of the second viewport orientation. The method further comprises causing transmission of the feedback message to a source device. The method further comprises receiving, in response to the transmission, a second stream comprising panoramic video content based on the second viewport orientation. The method further comprises determining a first timestamp value associated with a corresponding frame of the first stream. The method further comprises determining a second timestamp value associated with a corresponding frame of the second stream.
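The event-detection and feedback-message steps above can be sketched as follows. Every name here is hypothetical (the threshold, the JSON payload, and the field names are illustrative choices, not the claimed signaling, which in practice might be carried in an RTCP feedback packet):

```python
import json
import time


def orientation_changed(old, new, threshold_deg=5.0):
    """Treat the viewport as changed when any angle moves past a threshold."""
    return any(abs(old[k] - new[k]) > threshold_deg for k in ("yaw", "pitch", "roll"))


def make_feedback_message(orientation):
    """Serialize the updated viewport parameters for transmission."""
    return json.dumps({
        "type": "viewport_update",
        "orientation": orientation,          # yaw/pitch/roll in degrees
        "viewport_change_ts": time.time(),   # wall-clock time of the change
    })


current = {"yaw": 0.0, "pitch": 0.0, "roll": 0.0}   # first viewport orientation
sensed = {"yaw": 30.0, "pitch": -5.0, "roll": 0.0}  # second viewport orientation

if orientation_changed(current, sensed):
    msg = make_feedback_message(sensed)
    # send(msg) to the source device over the feedback channel
```

The threshold keeps sensor jitter from triggering a stream update on every frame; an actual client would tune it to the HMD's tracking noise.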

[0006] In some embodiments, the method further comprises determining, based at least on a difference between the first timestamp value and the second timestamp value, a motion to high-quality delay value and causing transmission of the motion to high-quality delay value to the source device. In some embodiments, the method further comprises storing the motion to high-quality delay value in a data structure comprising a plurality of previously determined motion to high-quality delay values. In some embodiments, the method further comprises determining a viewport-change timestamp value based on a time at which the change in the first viewport orientation to the second viewport orientation occurred. In some embodiments, the method further comprises determining a viewport-delivered timestamp value based on a time at which the second stream is first received at the client device and causing transmission of one or more of the first timestamp value, the second timestamp value, the viewport-change timestamp value, and the viewport-delivered timestamp value to the source device.
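The storage and aggregated transmission described above can be sketched with a minimal data structure. The report fields and the reporting period are illustrative assumptions, not taken from the claims:

```python
from statistics import mean


class DelayHistory:
    """Stores previously determined motion to high-quality delay values and
    emits an aggregated report every `report_every` measurements, so that
    values need not be signaled one at a time."""

    def __init__(self, report_every=4):
        self.values_ms = []
        self.report_every = report_every

    def add(self, delay_ms):
        self.values_ms.append(delay_ms)
        if len(self.values_ms) % self.report_every == 0:
            return self.report()  # aggregated report ready for transmission
        return None

    def report(self):
        recent = self.values_ms[-self.report_every:]
        return {"count": len(recent), "mean_ms": mean(recent), "max_ms": max(recent)}


history = DelayHistory(report_every=4)
reports = [r for d in (120, 95, 210, 150) if (r := history.add(d))]
```

Batching the values this way is one concrete realization of the overhead reduction mentioned in paragraph [0004]: one small aggregate replaces several individual feedback transmissions.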

[0007] In some embodiments of the method, the frame of the first stream comprises a last frame rendered at the client device in association with the first stream. In some embodiments of the method, the frame of the second stream comprises a first frame rendered at the client device in association with the second stream. In some embodiments of the method, the first timestamp value and the second timestamp value are Real Time Protocol (RTP) timestamp values.

[0008] In another aspect, an apparatus is provided comprising processing circuitry and at least one memory including computer program code, the at least one memory and the computer program code configured to, with the processing circuitry, cause the apparatus at least to receive a first stream comprising panoramic video content based on a first viewport orientation, to detect an event comprising a change in the first viewport orientation to a second viewport orientation, to, in response to the event, generate a feedback message comprising one or more updated parameters of the second viewport orientation, to cause transmission of the feedback message to a source device, to receive, in response to the transmission, a second stream comprising panoramic video content based on the second viewport orientation, to determine a first timestamp value associated with a frame of the first stream, and to determine a second timestamp value associated with a frame of the second stream.

[0009] In some embodiments, the at least one memory and computer program code are further configured, with the processing circuitry, to cause the apparatus at least to further determine, based at least on a difference between the first timestamp value and the second timestamp value, a motion to high-quality delay value and cause transmission of the motion to high-quality delay value to the source device. In some embodiments, the at least one memory and computer program code are further configured, with the processing circuitry, to cause the apparatus at least to further store the motion to high-quality delay value in a data structure comprising a plurality of previously determined motion to high-quality delay values. In some embodiments, the at least one memory and computer program code are further configured, with the processing circuitry, to cause the apparatus at least to further determine a viewport-change timestamp value based on a time at which the change in the first viewport orientation to the second viewport orientation occurred. In some embodiments, the at least one memory and computer program code are further configured, with the processing circuitry, to cause the apparatus at least to further determine a viewport-delivered timestamp value based on a time at which the second stream is first received and cause transmission of one or more of the first timestamp value, the second timestamp value, the viewport-change timestamp value, and the viewport-delivered timestamp value to the source device.

[0010] In some embodiments of the apparatus, the frame of the first stream comprises a last frame rendered in association with the first stream. In some embodiments of the apparatus, the frame of the second stream comprises a first frame rendered in association with the second stream. In some embodiments of the apparatus, the first timestamp value and the second timestamp value are Real Time Protocol (RTP) timestamp values.

[0011] In a further aspect, a computer program product is provided comprising a non-transitory computer readable storage medium having program code portions stored thereon, the program code portions configured, upon execution, to receive a first stream comprising panoramic video content based on a first viewport orientation. The program code portions are further configured to, upon execution, detect an event comprising a change in the first viewport orientation to a second viewport orientation. The program code portions are further configured to, upon execution, in response to the event, generate a feedback message comprising one or more updated parameters of the second viewport orientation. The program code portions are further configured to, upon execution, cause transmission of the feedback message. The program code portions are further configured to, upon execution, receive, in response to the transmission, a second stream comprising panoramic video content based on the second viewport orientation. The program code portions are further configured to, upon execution, determine a first timestamp value associated with a corresponding frame of the first stream. The program code portions are further configured to, upon execution, determine a second timestamp value associated with a corresponding frame of the second stream.

[0012] In some embodiments, the program code portions are further configured to, upon execution, determine, based at least on a difference between the first timestamp value and the second timestamp value, a motion to high-quality delay value and cause transmission of the motion to high-quality delay value. In some embodiments, the program code portions are further configured to, upon execution, store the motion to high-quality delay value in a data structure comprising a plurality of previously determined motion to high-quality delay values. In some embodiments, the program code portions are further configured to, upon execution, determine a viewport-change timestamp value based on a time at which the change in the first viewport orientation to the second viewport orientation occurred. In some embodiments, the program code portions are further configured to, upon execution, determine a viewport-delivered timestamp value based on a time at which the second stream is first received and cause transmission of one or more of the first timestamp value, the second timestamp value, the viewport-change timestamp value, and the viewport-delivered timestamp value.

[0013] In some embodiments of the computer program product, the frame of the first stream comprises a last frame rendered in association with the first stream. In some embodiments of the computer program product, the frame of the second stream comprises a first frame rendered in association with the second stream. In some embodiments of the computer program product, the first timestamp value and the second timestamp value are Real Time Protocol (RTP) timestamp values.

[0014] In a further aspect, an apparatus is provided comprising means for receiving, at a client device, a first stream comprising panoramic video content based on a first viewport orientation. The apparatus further comprises means for detecting an event comprising a change in the first viewport orientation to a second viewport orientation. The apparatus further comprises means for, in response to the event, generating a feedback message comprising one or more updated parameters of the second viewport orientation. The apparatus further comprises means for causing transmission of the feedback message to a source device. The apparatus further comprises means for receiving, in response to the transmission, a second stream comprising panoramic video content based on the second viewport orientation. The apparatus further comprises means for determining a first timestamp value associated with a corresponding frame of the first stream. The apparatus further comprises means for determining a second timestamp value associated with a corresponding frame of the second stream.

[0015] In some embodiments, the apparatus further comprises means for determining, based at least on a difference between the first timestamp value and the second timestamp value, a motion to high-quality delay value and causing transmission of the motion to high-quality delay value to the source device. In some embodiments, the apparatus further comprises means for storing the motion to high-quality delay value in a data structure comprising a plurality of previously determined motion to high-quality delay values. In some embodiments, the apparatus further comprises means for determining a viewport-change timestamp value based on a time at which the change in the first viewport orientation to the second viewport orientation occurred. In some embodiments, the apparatus further comprises means for determining a viewport-delivered timestamp value based on a time at which the second stream is first received at the client device and causing transmission of one or more of the first timestamp value, the second timestamp value, the viewport-change timestamp value, and the viewport-delivered timestamp value to the source device.

[0016] In some embodiments of the apparatus, the frame of the first stream comprises a last frame rendered at the client device in association with the first stream. In some embodiments of the apparatus, the frame of the second stream comprises a first frame rendered at the client device in association with the second stream. In some embodiments of the apparatus, the first timestamp value and the second timestamp value are Real Time Protocol (RTP) timestamp values.

[0017] In another aspect, a method is provided comprising causing transmission of, to a client device, a first stream comprising panoramic video content based on a first viewport orientation. The method further comprises receiving a feedback message comprising one or more updated parameters of a second viewport orientation. The method further comprises generating, in response to the feedback message, a second stream comprising panoramic video content based on the second viewport orientation. The method further comprises causing transmission of the second stream to the client device. The method further comprises, in response to the transmission of the second stream, receiving one or more determined parameters from the client device.

[0018] In some embodiments, the method further comprises updating the second stream based on the received one or more determined parameters and causing transmission of the updated second stream to the client device. In some embodiments of the method, the one or more determined parameters comprises a motion to high-quality delay value. In some embodiments of the method, the one or more determined parameters comprises a first timestamp value, a second timestamp value, a viewport-change timestamp value, and a viewport-delivered timestamp value. In some embodiments of the method, the first timestamp value and the second timestamp value are Real Time Protocol (RTP) timestamp values.
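One way the source device might act on the reported parameters, as described above, is sketched below. The threshold and the chosen adaptation knob (halving the keyframe interval to make viewport switches cheaper to serve) are hypothetical illustrations, not the patented method:

```python
def adapt_stream(config, reported_delay_ms, target_ms=150.0):
    """Return an updated stream configuration based on the client-reported
    motion to high-quality delay value."""
    updated = dict(config)
    if reported_delay_ms > target_ms:
        # More frequent random-access points let the source switch the
        # high-quality region to a new viewport sooner.
        updated["keyframe_interval_frames"] = max(
            1, config["keyframe_interval_frames"] // 2
        )
    return updated


config = {"keyframe_interval_frames": 60}
updated = adapt_stream(config, reported_delay_ms=240.0)  # delay too high
```

A real source could equally re-budget bandwidth between the viewport and background regions; the point is only that the reported delay gives it a concrete signal to react to.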

[0019] In another aspect, an apparatus is provided comprising processing circuitry and at least one memory including computer program code, the at least one memory and the computer program code configured to, with the processing circuitry, cause the apparatus at least to cause transmission of, to a client device, a first stream comprising panoramic video content based on a first viewport orientation. The at least one memory and computer program code are further configured, with the processing circuitry, to cause the apparatus at least to further receive a feedback message comprising one or more updated parameters of a second viewport orientation. The at least one memory and computer program code are further configured, with the processing circuitry, to cause the apparatus at least to further generate, in response to the feedback message, a second stream comprising panoramic video content based on the second viewport orientation and cause transmission of the second stream to the client device. The at least one memory and computer program code are further configured, with the processing circuitry, to cause the apparatus at least to, in response to the transmission of the second stream, receive one or more determined parameters from the client device.

[0020] In some embodiments, the at least one memory and computer program code are further configured, with the processing circuitry, to cause the apparatus at least to update the second stream based on the received one or more determined parameters and cause transmission of the updated second stream to the client device. In some embodiments of the apparatus, the one or more determined parameters comprises a motion to high-quality delay value. In some embodiments of the apparatus, the one or more determined parameters comprises a first timestamp value, a second timestamp value, a viewport-change timestamp value, and a viewport-delivered timestamp value. In some embodiments of the apparatus, the first timestamp value and the second timestamp value are Real Time Protocol (RTP) timestamp values.

[0021] In another aspect, a computer program product is provided comprising a non-transitory computer readable storage medium having program code portions stored thereon, the program code portions configured, upon execution, to cause transmission of, to a client device, a first stream comprising panoramic video content based on a first viewport orientation. The program code portions are further configured to, upon execution, receive a feedback message comprising one or more updated parameters of a second viewport orientation. The program code portions are further configured to, upon execution, generate, in response to the feedback message, a second stream comprising panoramic video content based on the second viewport orientation. The program code portions are further configured to, upon execution, cause transmission of the second stream to the client device. The program code portions are further configured to, upon execution, in response to the transmission of the second stream, receive one or more determined parameters from the client device.

[0022] In some embodiments, the program code portions are further configured to, upon execution, update the second stream based on the received one or more determined parameters and cause transmission of the updated second stream to the client device. In some embodiments of the computer program product, the one or more determined parameters comprise a motion to high-quality delay value. In some embodiments of the computer program product, the one or more determined parameters comprise a first timestamp value, a second timestamp value, a viewport-change timestamp value, and a viewport-delivered timestamp value. In some embodiments of the computer program product, the first timestamp value and the second timestamp value are Real-Time Transport Protocol (RTP) timestamp values.

[0023] In another aspect, an apparatus is provided comprising means for causing transmission of, to a client device, a first stream comprising panoramic video content based on a first viewport orientation. The apparatus further comprises means for receiving a feedback message comprising one or more updated parameters of a second viewport orientation. The apparatus further comprises means for generating, in response to the feedback message, a second stream comprising panoramic video content based on the second viewport orientation. The apparatus further comprises means for causing transmission of the second stream to the client device. The apparatus further comprises means for, in response to the transmission of the second stream, receiving one or more determined parameters from the client device.

[0024] In some embodiments, the apparatus further comprises means for updating the second stream based on the received one or more determined parameters and causing transmission of the updated second stream to the client device. In some embodiments of the apparatus, the one or more determined parameters comprise a motion to high-quality delay value. In some embodiments of the apparatus, the one or more determined parameters comprise a first timestamp value, a second timestamp value, a viewport-change timestamp value, and a viewport-delivered timestamp value. In some embodiments of the apparatus, the first timestamp value and the second timestamp value are Real-Time Transport Protocol (RTP) timestamp values.

BRIEF DESCRIPTION OF THE DRAWINGS

[0025] Having thus described certain example embodiments of the present disclosure in general terms, reference will hereinafter be made to the accompanying drawings, which are not necessarily drawn to scale, and wherein:

[0026] FIG. 1 is a block diagram of a system including a source device and a client device configured to communicate via a network in accordance with an example embodiment;

[0027] FIG. 2 is a block diagram of an apparatus that may be specifically configured in accordance with an example embodiment of the present disclosure;

[0028] FIG. 3A illustrates a view of an example 360-degree video conference in accordance with some embodiments;

[0029] FIG. 3B illustrates a view of an example 360-degree video conference in accordance with some embodiments;

[0030] FIG. 4 is a signal diagram of an example data flow in accordance with an example embodiment;

[0031] FIG. 5A is a flow chart illustrating the operations performed, such as by the apparatus of FIG. 2 in accordance with an example embodiment;

[0032] FIG. 5B is a flow chart illustrating the operations performed, such as by the apparatus of FIG. 2 in accordance with an example embodiment;

[0033] FIG. 5C is a flow chart illustrating the operations performed, such as by the apparatus of FIG. 2 in accordance with an example embodiment;

[0034] FIG. 6 is a signal diagram of an example data flow in accordance with an example embodiment; and

[0035] FIG. 7 is a flow chart illustrating the operations performed, such as by the apparatus of FIG. 2 when embodied by a source device in accordance with an example embodiment.

DETAILED DESCRIPTION

[0036] Some embodiments of the present invention will now be described more fully hereinafter with reference to the accompanying drawings, in which some, but not all, embodiments of the invention are shown. Indeed, various embodiments of the invention may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will satisfy applicable legal requirements. Like reference numerals refer to like elements throughout. As used herein, the terms “data,” “content,” “information,” and similar terms may be used interchangeably to refer to data capable of being transmitted, received and/or stored in accordance with embodiments of the present invention. Thus, use of any such terms should not be taken to limit the spirit and scope of embodiments of the present invention.

[0037] Additionally, as used herein, the term ‘circuitry’ refers to (a) hardware-only circuit implementations (e.g., implementations in analog circuitry and/or digital circuitry); (b) combinations of circuits and computer program product(s) comprising software and/or firmware instructions stored on one or more computer readable memories that work together to cause an apparatus to perform one or more functions described herein; and (c) circuits, such as, for example, a microprocessor(s) or a portion of a microprocessor(s), that require software or firmware for operation even if the software or firmware is not physically present. This definition of ‘circuitry’ applies to all uses of this term herein, including in any claims. As a further example, as used herein, the term ‘circuitry’ also includes an implementation comprising one or more processors and/or portion(s) thereof and accompanying software and/or firmware. As another example, the term ‘circuitry’ as used herein also includes, for example, a baseband integrated circuit or applications processor integrated circuit for a mobile phone or a similar integrated circuit in a server, a cellular network device, other network device (such as a core network apparatus), field programmable gate array, and/or other computing device.

[0038] Referring now to FIG. 1, a block diagram of a system 100 is illustrated for providing for signaling of viewport orientation timing in panoramic video delivery, according to an example embodiment. It will be appreciated that the system 100 of FIG. 1 as well as the illustrations in other figures are each provided as an example of some embodiments and should not be construed to narrow the scope or spirit of the disclosure in any way. In this regard, the scope of the disclosure encompasses many potential embodiments in addition to those illustrated and described herein. As such, while FIG. 1 illustrates one example of a configuration of a system providing for signaling of viewport orientation timing, numerous other configurations may also be employed. As described herein, panoramic video content may refer to any type of immersive video content, such as 360-degree video, 180-degree video content, omnidirectional (e.g., spherical) video content, and/or the like.

[0039] The system 100 may include one or more first devices, e.g., client devices 104 (examples of which include but are not limited to a head-mounted display, camera, panoramic video device, virtual reality system, augmented reality system, video playback device, mobile phones, smartphones, tablets, and/or the like), and/or one or more second devices, e.g., source devices 102 (also known as a server, remote computing device, remote server, streaming server, and/or the like). Although the disclosure describes an example embodiment comprising a client-server architecture, the disclosure is not limited to a client-server architecture, but is also applicable to other architectures, such as peer-to-peer, broadcast/multicast, distributed, multi-party, point-to-point or point-to-multipoint, etc. In this regard, the first device can encompass other types of computing and/or communication devices other than a client device 104 and the second device can encompass other types of computing and/or communication devices other than a source device 102. For example, in a point-to-point architecture, the first device and the second device may be of the same type of device, such as mobile phones or tablets with 360-degree content display and/or streaming capabilities in a stand-alone mode or tethered with another virtual reality or augmented reality display device.

[0040] The system 100 may further comprise a network 106. The network 106 may comprise one or more wireline networks, one or more wireless networks, or some combination thereof. The network 106 may, for example, comprise a serving network (e.g., a serving cellular network) for one or more source devices 102. The network 106 may comprise, in certain embodiments, one or more source devices 102 and/or one or more client devices 104. According to an example embodiment, the network 106 may comprise the Internet. In various embodiments, the network 106 may comprise a wired access link connecting one or more source devices 102 to the rest of the network 106 using, for example, Digital Subscriber Line (DSL), optical and/or coaxial cable technology. In some embodiments, the network 106 may comprise a public land mobile network (for example, a cellular network), such as a network that may be implemented by a network operator (for example, a cellular access provider). The network 106 may operate in accordance with universal terrestrial radio access network (UTRAN) standards, evolved UTRAN (E-UTRAN) standards, current and future implementations of Third Generation Partnership Project (3GPP) long term evolution (LTE) (also referred to as LTE-A) standards, third generation (3G), fourth generation (4G), and fifth generation (5G) standards, current and future implementations of International Telecommunications Union (ITU) International Mobile Telecommunications Advanced (IMT-A) system standards, and/or the like. It will be appreciated, however, that where references herein are made to a network standard and/or terminology particular to a network standard, the references are provided merely by way of example and not by way of limitation.

[0041] According to various embodiments, one or more source devices 102 may be configured to connect directly with one or more client devices 104 via, for example, an air interface without routing communications via one or more elements of the network 106. Additionally, or alternatively, one or more of the source devices 102 may be configured to communicate with one or more of the client devices 104 over the network 106. In this regard, the client devices 104 may comprise one or more nodes of the network 106.

[0042] According to various embodiments, the system 100 may be configured according to an architecture for providing for panoramic video streaming. For example, the system 100 may be configured to provide for immersive, panoramic video streaming and techniques to support a wide variety of applications including virtual reality and augmented reality applications.

[0043] One example of an apparatus 200 that may be configured to function as the source device 102 and/or client device 104 is depicted in FIG. 2. As shown in FIG. 2, the apparatus 200 includes, is associated with or is in communication with processing circuitry 22, a memory 24 and a communication interface 26. The processing circuitry 22 may be in communication with the memory 24 via a bus for passing information among components of the apparatus. The memory device may be non-transitory and may include, for example, one or more volatile and/or non-volatile memories. In other words, for example, the memory 24 may be an electronic storage device (e.g., a computer readable storage medium) comprising gates configured to store data (e.g., bits) that may be retrievable by a machine (e.g., a computing device like the processing circuitry). The memory 24 may be configured to store information, data, content, applications, instructions, or the like for enabling the apparatus to carry out various functions in accordance with an example embodiment of the present disclosure. For example, the memory 24 could be configured to buffer input data for processing by the processing circuitry 22. Additionally, or alternatively, the memory device 24 could be configured to store instructions for execution by the processing circuitry 22.

[0044] The apparatus 200 may, in some embodiments, be embodied in various computing devices as described above. However, in some embodiments, the apparatus may be embodied as an integrated circuit, also denoted as a chip or chip set. In other words, the apparatus may comprise one or more physical packages (e.g., a packaged integrated circuit and/or chips) including materials, components and/or wires on a structural assembly (e.g., a baseboard). The structural assembly may provide physical strength, conservation of size, and/or limitation of electrical interaction for component circuitry included thereon. The apparatus 200 may therefore, in some cases, be configured to implement an embodiment of the present invention on a single chip or as a single “system on a chip.” As such, in some cases, a chip or chipset may constitute means for performing one or more operations for providing the functionalities described herein.

[0045] The processing circuitry 22 may be embodied in a number of different ways. For example, the processing circuitry may be embodied as one or more of various hardware processing means such as a coprocessor, a microprocessor, a controller, a digital signal processor (DSP), a processing element with or without an accompanying DSP, or various other circuitry including integrated circuits such as, for example, an ASIC (application specific integrated circuit), an FPGA (field programmable gate array), a microcontroller unit (MCU), a hardware accelerator, a special-purpose computer chip, or the like. As such, in some embodiments, the processing circuitry 22 may include one or more processing cores configured to perform independently. A multi-core processing circuitry may enable multiprocessing within a single physical package. Additionally, or alternatively, the processing circuitry 22 may include one or more processors configured in tandem via the bus to enable independent execution of instructions, pipelining and/or multithreading.

[0046] In an example embodiment, the processing circuitry 22 may be configured to execute instructions stored in the memory 24 or otherwise accessible to the processing circuitry. Alternatively, or additionally, the processing circuitry 22 may be configured to execute hard coded functionality. As such, whether configured by hardware or software methods, or by a combination thereof, the processing circuitry may represent an entity (e.g., physically embodied in circuitry) capable of performing operations according to an embodiment of the present disclosure while configured accordingly. Thus, for example, when the processing circuitry is embodied as an ASIC, FPGA or the like, the processing circuitry may be specifically configured hardware for conducting the operations described herein. Alternatively, as another example, when the processing circuitry is embodied as an executor of instructions, the instructions may specifically configure the processor to perform the algorithms and/or operations described herein when the instructions are executed. However, in some cases, the processing circuitry 22 may be a general-purpose processor or a processor of a specific device (e.g., an image or video processing system) configured to employ an embodiment of the present invention by further configuration of the processing circuitry by instructions for performing the algorithms and/or operations described herein. The processing circuitry 22 may include, among other things, a clock, an arithmetic logic unit (ALU) and logic gates configured to support operation of the processing circuitry.

[0047] The communication interface 26 may be any means such as a device or circuitry embodied in either hardware or a combination of hardware and software that is configured to receive and/or transmit data, including media content in the form of video or image files, one or more audio tracks or the like. In this regard, the communication interface 26 may include, for example, an antenna (or multiple antennas) and supporting hardware and/or software for enabling communications with a wireless communication network. Additionally, or alternatively, the communication interface 26 may include the circuitry for interacting with the antenna(s) to cause transmission of signals via the antenna(s) or to handle receipt of signals received via the antenna(s). In some environments, the communication interface 26 may alternatively or also support wired communication. As such, for example, the communication interface may include a communication modem and/or other hardware/software for supporting communication via cable, digital subscriber line (DSL), universal serial bus (USB) or other mechanisms.

[0048] It is to be understood that signaling between the source device 102 and the client device 104 (e.g., from the source device 102 to the client device 104 or from the client device 104 to the source device 102) may be carried out over any protocol at any layer of the International Organization for Standardization (ISO) Open Systems Interconnection (OSI) protocol stack (e.g., Session Description Protocol (SDP), Session Initiation Protocol (SIP), Real Time Streaming Protocol (RTSP), Real-Time Transport Protocol (RTP), Real-Time Transport Control Protocol (RTCP), Moving Picture Experts Group Dynamic Adaptive Streaming over Hypertext Transfer Protocol (HTTP) (MPEG DASH) and the like). It is to be understood that reference to ‘downloading’ and ‘streaming’ data herein, while perhaps implemented through different transport protocol mechanisms, relate to similar functional concepts in this context.

[0049] The 3rd Generation Partnership Project (3GPP) is a standards organization which develops protocols and is known for the development and maintenance of various standards. 3GPP is presently developing virtual reality (VR) conferencing solutions using the Multimedia Telephony Service for the Internet Protocol (IP) Multimedia Subsystem (IMS) (MTSI) and Telepresence standards, which enable multi-party video conferences over mobile networks. A typical VR conference comprises one or more parties that are sending immersive content (e.g., via one or more source devices 102) to one or more parties equipped with devices capable of viewing the immersive content (e.g., a client device 104, such as a head-mounted display (HMD) and/or the like).

[0050] Referring to FIG. 3A, an example VR conference is depicted wherein a source device located in a conference room A is sending viewport-dependent 360-degree video to two remote participants B and C, each equipped with a client device: participant B viewing the 360-degree video via an HMD and participant C viewing the 360-degree video via a mobile phone. Viewport information is sent from the remote participants (e.g., the client devices) to the 360-degree content source device in order to receive updated viewport-dependent video.

[0051] Viewport-dependent delivery is often utilized for VR content in order to deliver improved quality for video frames within the viewport (e.g., the video frames which the user is presently viewing) in comparison to video content outside of the viewport. This is beneficial for bandwidth saving and optimized delivery. For example, as a user receives (e.g., via streaming and/or downloading) and consumes panoramic content (referenced hereinafter as 360-degree content by way of example but not of limitation) on a client device 104 (e.g., a virtual reality device such as a head-mounted display) of FIG. 1, content within a viewport area is visible to the user. In other words, content that the user is viewing at any given time is viewport area content. For content delivered over a network 106, the entirety of the 360-degree content may be received or downloaded, with viewport area content being received or downloaded at a higher quality than content not currently being viewed by the user. Herein, content not currently being viewed by a user may be referred to as a background area. For example, background area content may be received or downloaded in a lower quality than the viewport area content in order to preserve bandwidth.

[0052] In addition to a viewport area and background area, a margin area (sometimes referred to as guard band(s)) may be extended around the viewport area. The margin area of an example embodiment may comprise a left margin area, right margin area, top margin area, and/or bottom margin area. In some embodiments, the margin area may be received or downloaded with an intermediate quality between the viewport area quality (higher quality) and the background area quality (lower quality). In some embodiments, the margin area may also be received in the same quality as a viewport area (e.g., higher quality). Margin areas may be useful during rendering. In this regard, a margin area may be extended, fully or partially, around a viewport area to compensate for any deviation of the actual viewport at the time of viewing from the predicted viewport at the time of rendering. In some embodiments, margins may be symmetrical on opposite sides of the viewport, regardless of the state of head motion (e.g., a user turning their head while wearing a head-mounted display). In some embodiments, a margin area may be extended asymmetrically in accordance with a direction of a head motion and/or the speed of the head motion. For example, a sender (e.g., source device 102) may utilize margin areas as spatial buffers at a high quality to accommodate head turns in any direction and/or at any speed.
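For illustration only, the asymmetric margin behavior described above can be sketched in Python. The function name, the prediction horizon `scale_s`, and the cap value are hypothetical tuning parameters, not values specified by this disclosure:

```python
def margin_extents(base_deg, yaw_rate_deg_s, scale_s=0.25, cap_deg=30.0):
    """Extend the left/right margin areas asymmetrically with head motion.

    base_deg       -- symmetric margin (degrees) used when the head is still
    yaw_rate_deg_s -- signed yaw velocity; positive means turning right
    scale_s        -- assumed prediction horizon in seconds
    cap_deg        -- assumed upper bound on the extra margin
    """
    extra = min(abs(yaw_rate_deg_s) * scale_s, cap_deg)
    # Widen only the margin on the side the head is turning toward.
    left = base_deg + (extra if yaw_rate_deg_s < 0 else 0.0)
    right = base_deg + (extra if yaw_rate_deg_s > 0 else 0.0)
    return left, right

print(margin_extents(5.0, 0.0))   # symmetric margins: (5.0, 5.0)
print(margin_extents(5.0, 60.0))  # turning right: (5.0, 20.0)
```

A sender could feed a reported head-motion speed into such a function before deciding which tiles to encode at the intermediate margin quality.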

[0053] FIG. 3B depicts another example VR conference, wherein a source device located in a conference room A is sending viewport-independent panoramic video to two remote participants B and C, each equipped with a client device. In this alternate scenario, there may be an intermediary device, such as a Media Resource Function (MRF) and/or a Media Control Unit (MCU), as shown in FIG. 3B. The MRF and/or MCU may receive the viewport-independent video from a video source and deliver viewport-dependent video to the remote participant(s). In this regard, viewport information may be sent from the remote participants (e.g., the client devices) to the MRF and/or MCU in order to receive updated viewport-dependent video.

[0054] Conventionally, the transport protocol used for media delivery in MTSI and Telepresence is the Real-Time Transport Protocol (RTP). Additionally, the Real-Time Transport Control Protocol (RTCP) may be used as the control protocol for sending control information during the session. The Session Initiation Protocol (SIP) and/or the Session Description Protocol (SDP) may be used for session establishment and session management; session parameters that influence the media session, including device capabilities, available media streams, media stream characteristics, and/or the like, may be exchanged and, in some cases, negotiated using SIP and/or SDP. RTCP may be used for sending real-time control information such as Quality of Service (QoS) parameters from a receiving device (e.g., a client device 104) to the streaming source (e.g., source device 102). QoS parameters may include parameters related to jitter, packets dropped, and/or the like. Other real-time control parameters, such as viewport information, may also be sent using RTCP feedback messages. In this regard, the source device 102 may use adaptive mechanisms during media delivery to optimize the session quality based on transport characteristics.
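As an illustration of carrying viewport orientation in a feedback payload, the following Python sketch packs yaw, pitch, roll, and an event timestamp into a fixed binary layout. The field layout is hypothetical: an actual RTCP feedback message must carry the standardized RTCP headers (version, packet type, FMT, SSRCs) and follow the negotiated format, which this sketch omits:

```python
import struct

def pack_viewport_feedback(yaw, pitch, roll, event_ts):
    """Pack a simplified viewport-orientation feedback payload.

    Angles are sent in centidegrees as signed 32-bit integers and the
    event timestamp as an unsigned 32-bit RTP-style value, all in
    network byte order. Illustrative only; not a real RTCP packet.
    """
    return struct.pack("!iiiI",
                       int(yaw * 100), int(pitch * 100),
                       int(roll * 100), event_ts & 0xFFFFFFFF)

def unpack_viewport_feedback(payload):
    """Inverse of pack_viewport_feedback."""
    yaw_c, pitch_c, roll_c, ts = struct.unpack("!iiiI", payload)
    return yaw_c / 100.0, pitch_c / 100.0, roll_c / 100.0, ts

msg = pack_viewport_feedback(-45.5, 10.0, 0.0, 123456)
print(len(msg))                       # 16 bytes
print(unpack_viewport_feedback(msg))  # (-45.5, 10.0, 0.0, 123456)
```

Fixed-point centidegrees keep the payload compact while preserving sub-degree precision, which is one plausible trade-off for frequent orientation reports.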

[0055] As described above, during viewport-dependent media delivery (e.g., from a source device 102 to a client device 104), the area outside the viewport may be provided at a lower quality than the viewport area. In an instance of head or body motion or, more generally, a change of the user's gaze, new viewport orientation information may be sent to the source device using a feedback message (e.g., an RTCP feedback message). The source device may then update the transmitted video stream based on the new viewport orientation information received. In this respect, the sender (e.g., source device) may further determine appropriate margins around the viewport area and determine the quality at which these margin areas, the viewport area, and/or the background are to be provided to the client device.

[0056] At least one Round Trip Time (RTT), however, elapses before the updated viewport area is visible to the user at the client device 104, and the actual time taken may be longer depending on other parameters such as processing delays at the sender and/or receiver, as shown in FIG. 4. FIG. 4 depicts an example signal diagram of a sender of panoramic video content (e.g., a source device 102) transmitting a viewport-dependent video stream over RTP to a receiver (e.g., a client device 104). The viewport orientation is then changed at the receiver due to an event, such as head and/or body motion or a change in the gaze of the viewer. The receiver may be configured to then send an RTCP feedback message with the new viewport orientation information to the sender at the earliest possible time. In some embodiments, this time may be dependent on a timing rule associated with the RTCP protocol and/or the particular message. The sender may then update the RTP media stream according to the new viewport orientation information. However, additional sender-side delays may be involved, such as delays incurred due to encoding the additional content and/or the like. The receiver may then begin receiving the updated content and rendering the new viewport area at high quality.

[0057] Motion to high-quality delay may refer to the time taken after the viewport change event (e.g., the receiver’s head motion) for the new viewport to be updated to high quality and rendered at the client device 104. Motion to high-quality delay is an important QoS parameter not only for 360-degree video conferencing but for any panoramic video session in general, including VR streaming. An RTP sender (e.g., source device 102) may be responsible for adapting the media delivery according to transport characteristics. In a panoramic video streaming session, this adaptation includes adapting viewport-dependent delivery, e.g., delivering viewports with high-quality margins when the motion to high-quality delay is large.

[0058] However, the RTP sender (e.g., source device 102) presently cannot accurately calculate the motion to high-quality delay based on the information received from RTCP feedback messages. The RTCP feedback message, which carries real time control and feedback information during an RTP session, has a structured format with fields that follow a predefined specification to which both source devices and client devices must adhere.

Further, while the source device and client device may be able to estimate RTT from their own perspectives, these values do not truly capture the motion to high-quality delay as depicted in FIG. 4 and as described above. Additionally, RTP and RTCP traffic may not be treated the same way in a network (e.g., the traffic may be carried over channels with differing Quality of Service parameters). Because motion to high-quality delay is affected by the delays experienced by both types of messages and on both the sending and receiving sides, any RTT values calculated using only RTCP may not accurately represent and/or include the network delays involved in the RTCP reporting of the viewport change (e.g., head motion) and/or the updated RTP delivery of the new viewport area.

[0059] An example embodiment is herein described providing for signaling of viewport orientation timing, such as a motion to high-quality delay value, in panoramic video delivery. In embodiments detailed herein, the motion to high-quality delay value may be representative of a total amount of time taken from when a motion (e.g., change in viewport orientation due to head and/or body motion) occurs at the client device 104 to when a first frame associated with an updated viewport orientation is rendered in high-quality (e.g., within a viewport area relative to a quality level of a background area) at the client device 104. In various embodiments, signaling may be used between a source device 102 and a client device 104. FIGs. 5A, 5B, and 5C depict operations for two example embodiments. In both cases, viewport-dependent 360-degree content is fetched from a source device 102 by a client device 104 over the network 106. The fetched content is played at the client device 104 using a VR display (e.g., HMD).
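Conceptually, the motion to high-quality delay value reduces to the difference between two client-side timestamps. A minimal sketch, assuming both timestamps are captured against the same client clock (the function and parameter names are illustrative, not terms defined by the disclosure):

```python
def motion_to_high_quality_delay_ms(viewport_change_ts_ms,
                                    viewport_delivered_ts_ms):
    """Time from the viewport-change event (e.g., head motion) to the
    first frame of the updated viewport being rendered in high quality.

    Both arguments are milliseconds on the client's local clock.
    """
    delay = viewport_delivered_ts_ms - viewport_change_ts_ms
    if delay < 0:
        raise ValueError("delivered timestamp precedes the change event")
    return delay

# Head motion at t = 1000 ms; first high-quality frame rendered at t = 1240 ms.
print(motion_to_high_quality_delay_ms(1000, 1240))  # 240
```

Because both timestamps are local to the client, no clock synchronization with the sender is required, which is one motivation for having the client compute and report this value.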

[0060] FIG. 5A illustrates a method 500 that may be performed by the apparatus 200 when embodied by a client device 104. At operation 501, the client device 104 includes means, such as the processing circuitry 22, communication interface 26, or the like, for receiving a first stream comprising panoramic video content based on a first viewport orientation. The stream may comprise video frames which comprise packets of data used for rendering the video frames. In certain embodiments, the client device 104 may receive the panoramic video stream from the source device 102 and/or via the network 106. For example, a user may stream panoramic (e.g., 360-degree, omnidirectional, and/or the like) video content at a client device 104, such as a head-mounted display. The panoramic video content may originate and be streamed from a source device 102, such as a server. For example, the source device 102 may be the original source of the panoramic video content. In some embodiments, the source device 102 may be one or more intermediary devices (e.g., an MRF and/or MCU as described above) that receive the panoramic video content directly or indirectly from an original source and relay the panoramic video content to one or more client devices 104. In some embodiments, the intermediary devices (e.g., the MRF and/or MCU as described above) may receive the video content from multiple original sources, which may then be stitched together to create the panoramic video content. In some embodiments, the intermediary devices may further process the video content prior to transmission of the video content to a client device. In some embodiments, the panoramic video content may be received after a request by the client device 104 is provided to the source device 102.

[0061] The first viewport orientation, Vi, may comprise an orientation at which the client device 104 is presently positioned. In some embodiments, the panoramic video content that is received by the client device 104 may be received using viewport-dependent delivery and comprise viewport and non-viewport content. For example, as described above, viewport content may be defined as a portion of the panoramic video content, such as one or more tiles and/or video frames, which a user is currently viewing at the client device 104. Non-viewport content may be a portion of the panoramic video content which is not being viewed by a user at the client device 104. In this regard, the panoramic video content based on a first viewport orientation received at the client device 104 may be viewport-dependent content comprising a viewport area (e.g., an area currently viewed by a user at the client device 104), a margin area as described above, and a background area (e.g., an area currently not being viewed by a user at the client device 104). In some embodiments, the client device 104 may receive panoramic video content for all of these areas (the viewport area, margin area, and background area) in varying quality levels. For example, in some embodiments, the viewport area may be delivered at a higher quality than the background area and/or margin area. In this regard, non-viewport area content (e.g., a background area and/or margin area) may be handled differently than viewport area content in viewport-dependent delivery. For example, non-viewport area content may be downloaded at a lower quality than viewport content. In some embodiments, portions of the non-viewport content may not be downloaded at all, or may be delivered at a higher quality (e.g., margin areas).
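The viewport/margin/background partitioning can be illustrated with a simple yaw-distance classifier. The half field-of-view and margin widths below are assumed example values, and a real sender would classify tiles in both yaw and pitch rather than yaw alone:

```python
def classify_tile(tile_center_yaw, viewport_yaw, half_fov=45.0, margin=15.0):
    """Classify a tile by its yaw distance from the viewport centre.

    half_fov -- assumed half-width of the viewport area, in degrees
    margin   -- assumed width of the margin (guard band) area, in degrees
    """
    # Shortest angular distance on the 360-degree circle.
    d = abs((tile_center_yaw - viewport_yaw + 180.0) % 360.0 - 180.0)
    if d <= half_fov:
        return "viewport"      # delivered in high quality
    if d <= half_fov + margin:
        return "margin"        # intermediate (or high) quality
    return "background"        # low quality, to preserve bandwidth

print(classify_tile(10.0, 0.0))    # viewport
print(classify_tile(55.0, 0.0))    # margin
print(classify_tile(180.0, 0.0))   # background
```

The modular-arithmetic distance handles wraparound at the 0/360-degree seam, so a tile at 350 degrees is correctly treated as 10 degrees from a viewport centred at 0.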

[0062] At operation 502, the client device 104 includes means, such as the processing circuitry 22, communication interface 26, or the like, for detecting an event comprising a change in the first viewport orientation of the client device to a second viewport orientation. In one embodiment, the client device 104, such as via processing circuitry 22 and/or a component such as an accelerometer and/or the like of the client device, may detect the change in viewport orientation by measuring a motion of the client device. For example, in some embodiments, the client device 104 may comprise circuitry for determining motion of the client device (e.g., HMD) and generating data associated with an updated viewport orientation, such as a second viewport orientation, Vi+1, which may be provided by the circuitry to the client device 104 for processing. In this regard, the second viewport orientation may occur at a later time than the first viewport orientation.

[0063] At operation 503, the client device 104 may generate a feedback message. In this regard, the client device 104 includes means, such as the processing circuitry 22, memory 24, or the like, for generating a feedback message comprising one or more updated parameters of the second viewport orientation. For example, the feedback message may be generated in response to the detected event comprising the change in the first viewport orientation, Vi, to the second viewport orientation, Vi+1. In some embodiments, the feedback message may only be generated in an instance in which the degree of change in viewport orientation is above a predefined threshold. In this regard, the client device 104 includes means, such as the processing circuitry 22, memory 24, or the like, for determining whether to generate a feedback message based on a comparison of a degree of change between the first viewport orientation and second viewport orientation with a predefined threshold. In this regard, small motions (e.g., head or body motions) at the client device 104 may not trigger generation of a feedback message. In some embodiments, the feedback message may comprise an RTCP feedback message. It is to be appreciated that while embodiments herein describe use of RTCP, other protocols may be employed as described above.
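
By way of a non-limiting sketch, the threshold comparison described above may be implemented as follows (the function names, the (yaw, pitch) orientation representation, and the 5-degree threshold are all illustrative assumptions, not part of the disclosure):

```python
# Illustrative sketch: decide whether a viewport change warrants a feedback
# message. Orientations are (yaw, pitch) in degrees; the threshold gates out
# small head motions. All names and values here are assumptions.
FEEDBACK_THRESHOLD_DEG = 5.0

def angular_change(v1, v2):
    """Largest absolute per-axis change between two (yaw, pitch) orientations."""
    dyaw = abs((v2[0] - v1[0] + 180.0) % 360.0 - 180.0)  # wrap yaw to [-180, 180)
    dpitch = abs(v2[1] - v1[1])
    return max(dyaw, dpitch)

def maybe_build_feedback(v1, v2):
    """Return a feedback payload for the new orientation, or None for small motions."""
    if angular_change(v1, v2) < FEEDBACK_THRESHOLD_DEG:
        return None  # small motion: no feedback message is generated
    return {"yaw": v2[0], "pitch": v2[1]}
```

In practice, the payload would be carried in an RTCP feedback message (or another protocol) rather than returned as a dictionary.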

[0064] The feedback message may be generated to comprise one or more updated parameters of the second viewport orientation, Vi+1. At operation 504, the client device may cause transmission of the feedback message, e.g., to a source device 102. In this regard, the client device 104 includes means, such as the processing circuitry 22, communication interface 26, or the like, for causing transmission of the feedback message.

[0065] At operation 505, the client device 104 includes means, such as the processing circuitry 22, communication interface 26, or the like, for receiving, in response to the transmission, a second stream comprising panoramic video content based on the second viewport orientation of the client device. For example, the client device 104 may receive the second stream from the source device 102 after transmission of the feedback message to the source device.

[0066] At operation 506, the client device 104 includes means, such as the processing circuitry 22, memory 24, and/or the like, for determining a first timestamp value associated with a corresponding frame of the first stream. For example, the client device may determine the RTP timestamp value of any packet belonging to the video frame that was rendered last at the client device 104 during the first stream based on the first viewport orientation, Vi, prior to the gaze change event (e.g., head or eye pupil motion); this value may be referred to herein as Vi_Highest_TS. As described above, the source device 102 may continue to send viewport-dependent content based on Vi until the source device receives the feedback message comprising a new viewport orientation Vi+1. In some embodiments, transmission of the feedback message from the client device 104 to the source device 102 comprising new viewport information may be carried over an RTP or an RTCP message, which may be transmitted at a different time compared to the message comprising Vi_Highest_TS. The source device 102 may then update the viewport-dependent stream to use the new viewport orientation, Vi+1.

[0067] At operation 507, the client device 104 includes means, such as the processing circuitry 22, memory 24, and/or the like, for determining a second timestamp value associated with a corresponding frame of the second stream. The RTP timestamp value of any of the packets belonging to the first rendered video frame using viewport-dependent delivery with the second viewport orientation, Vi+1, may be referred to herein as Vi+1_Lowest_TS. It is to be appreciated that while FIG. 5A depicts operations performed in a particular numerical order, alternative orders of the operations are possible. For example, in some embodiments, the determination of a first timestamp value associated with a frame of the first stream (e.g., step 506) may occur prior to transmission of the feedback message to the source device 102 (e.g., step 504).

[0068] In some embodiments, after the client device 104 has determined the first timestamp value and the second timestamp value as described above, the client device may continue to carry out operations depicted for example in FIG. 5B.

[0069] At operation 508, the client device 104 includes means, such as the processing circuitry 22, memory 24, and/or the like, for determining, based at least on a difference between the first timestamp value and the second timestamp value, a motion to high-quality delay value. The motion to high-quality delay value may be referred to herein as Motion_to_HQ and may be determined by the client device 104 using the following formula:

Motion_to_HQ = Vi+1_Lowest_TS - Vi_Highest_TS

[0070] In some embodiments, the motion to high-quality delay value (e.g., Motion_to_HQ) may be expressed and/or stored in RTP timestamp units. However, the motion to high-quality delay value may be expressed in other units as well, including but not limited to seconds, milliseconds, and/or the like. In some embodiments, such as when using motion to high-quality delay values across multiple sessions and/or streams, the motion to high-quality delay value may be normalized and/or expressed as a range without any units. For example, a zero (0) value may be assigned as the motion to high-quality delay value in an instance in which the viewport orientation updates at a speed so fast that the first rendered frame after the event (e.g., head motion) is already at the new, updated viewport quality, while higher motion to high-quality delay values may indicate proportionally longer delays and a correspondingly degraded user experience.
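
A minimal sketch of this computation follows, assuming 32-bit RTP timestamps (which wrap around) and the common 90 kHz RTP video clock; the function names are illustrative, not part of the disclosure:

```python
# Illustrative sketch of the Motion_to_HQ computation. RTP timestamps are
# 32-bit and wrap around, so the difference is taken modulo 2**32; the clock
# rate (commonly 90 kHz for video) converts timestamp units to milliseconds.
RTP_CLOCK_HZ = 90_000  # typical RTP clock rate for video

def motion_to_hq(vi_highest_ts, vi1_lowest_ts):
    """Motion_to_HQ = Vi+1_Lowest_TS - Vi_Highest_TS, in RTP timestamp units."""
    return (vi1_lowest_ts - vi_highest_ts) % (1 << 32)  # handle 32-bit wrap

def to_milliseconds(rtp_units, clock_hz=RTP_CLOCK_HZ):
    """Convert an RTP timestamp difference to milliseconds."""
    return rtp_units * 1000.0 / clock_hz

# Example: 27,000 RTP units at 90 kHz corresponds to 300 ms
delay_units = motion_to_hq(vi_highest_ts=900_000, vi1_lowest_ts=927_000)
```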

[0071] In some embodiments, as shown in operation 509, the client device 104 may optionally store the motion to high-quality delay value, e.g., in memory 24. In this regard, the client device 104 includes means, such as the processing circuitry 22, memory 24, and/or the like, for storing the motion to high-quality delay value. For example, in some embodiments, the client device 104 may store the determined motion to high-quality delay value in a data structure comprising a plurality of previously determined motion to high-quality delay values.

[0072] For example, in some embodiments, the motion to high-quality delay value may be sent as a single, unaltered value, representing only one viewport orientation change event resulting in a viewport orientation update. However, in certain embodiments, the motion to high-quality delay value may also be sent as a result of a statistical function using the motion to high-quality delay value and/or one or more previously determined motion to high-quality delay values as inputs, e.g., as a representation of the maximum, average, minimum, median value, and/or the like. This statistical aggregation may be for the entire streaming session, for multiple streaming sessions, or, in some embodiments, for a particular time window. As such, the client device 104 may store (e.g., in memory 24) an array, list, or other data structure for storing previously determined motion to high-quality delay values for the purpose of sending to the source device 102, regardless of whether statistical aggregation is performed. In some embodiments, the client device 104 may cause transmission of more than one motion to high-quality delay value to the source device 102, such as a list, array, or other data structure that comprises multiple values together.

[0073] As shown in operation 509, the client device 104 may cause transmission of the motion to high-quality delay value, e.g., to the source device 102.
In this regard, the client device 104 includes means, such as the communication interface 26 and/or the like, for causing transmission of the motion to high-quality delay value.
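
The statistical aggregation described above may be sketched as follows (the function name and reduction modes are illustrative assumptions; the disclosure permits reporting a maximum, average, minimum, median, and/or the like):

```python
import statistics

# Illustrative sketch of aggregating stored Motion_to_HQ values before
# reporting, e.g., over a session or a sliding time window. Names are
# assumptions, not part of the disclosure.
def aggregate_motion_to_hq(values, mode="median"):
    """Reduce a list of Motion_to_HQ samples to one reported value."""
    reducers = {
        "max": max,
        "min": min,
        "average": statistics.fmean,
        "median": statistics.median,
    }
    return reducers[mode](values)
```

The client device may instead transmit the stored list of values unaltered, leaving aggregation to the source device.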

[0074] In other embodiments, after the client device 104 has determined the first timestamp value and the second timestamp value as described above with reference to operations 506 and 507 of FIG. 5A, the client device may continue to carry out operations depicted for example in FIG. 5C. For example, rather than calculating a motion to high-quality delay value at the client device, the client device 104 may cause transmission of timing information to the source device 102 for processing.

[0075] As shown in operation 520, the client device 104 includes means, such as the processing circuitry 22, memory 24, and/or the like, for determining a viewport-change timestamp value based on a time at which the change in the first viewport orientation of the client device to the second viewport orientation occurred. For example, during an event in which a change in viewport orientation occurs at the client device 104, the client device may determine a viewport-change (VC) timestamp value (e.g., an RTP timestamp value) and optionally store the VC timestamp value (e.g., in memory 24).

[0076] As shown in operation 521, the client device 104 includes means, such as the processing circuitry 22, memory 24, and/or the like, for determining a viewport-delivered timestamp value based on a time at which the second stream is first received at the client device. For example, after the second stream is received at the client device 104 as described above, the client device may determine a viewport-delivered (VD) timestamp value (e.g., an RTP timestamp value) and optionally store the VD timestamp value (e.g., in memory 24).

[0077] As shown in operation 522, and as depicted in FIG. 6, the client device 104 includes means, such as the processing circuitry 22, memory 24, and/or the like, for causing transmission of the first timestamp value, the second timestamp value, the viewport-change timestamp value, and the viewport-delivered timestamp value to the source device. The values may all comprise RTP timestamp units or, in other embodiments, any other time unit and/or format, as long as the client device and source device are aware of the units and/or formats used. In some embodiments, the values (e.g., timestamp values) may comprise either absolute or relative times.
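
As an illustration, the four timing values of FIG. 5C could be gathered into a single report structure before transmission (the field names are assumptions; in practice the values would be carried in an RTCP message rather than a dictionary):

```python
# Illustrative payload layout for reporting the four timing values to the
# source device. Field names are assumptions, not part of the disclosure.
def build_timing_report(vi_highest_ts, vi1_lowest_ts, vc_ts, vd_ts):
    """Collect the four timestamp values of FIG. 5C into one report."""
    return {
        "Vi_Highest_TS": vi_highest_ts,   # last frame rendered at orientation Vi
        "Vi+1_Lowest_TS": vi1_lowest_ts,  # first frame rendered at orientation Vi+1
        "VC_TS": vc_ts,                   # when the viewport change occurred
        "VD_TS": vd_ts,                   # when the second stream first arrived
    }
```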

[0078] In some embodiments, during the event comprising a change in the first viewport orientation of the client device to a second viewport orientation (e.g., head motion), a strategy associated with the viewport-dependent delivery may change; for example, the viewport area's level of quality may increase or decrease. Thus, it is not mandatory that the quality of the frames used to determine the VC timestamp value, VD timestamp value, Vi+1_Lowest_TS, and Vi_Highest_TS be the same; however, each may be the quality assigned to the viewport at the time. In another embodiment, the quality of the frames used to determine the VC timestamp value, VD timestamp value, Vi+1_Lowest_TS, and Vi_Highest_TS must be the same or comparable.

[0079] For example, a source device 102 may temporarily change the strategy of the viewport-dependent delivery during head motion; in that case, the timing is associated with when the viewport quality after motion is equivalent or comparable to the viewport quality before motion.

FIG. 7 illustrates a method 700 that may be performed by the apparatus 200 when embodied by a source device 102. At operation 701, the source device 102 includes means, such as the processing circuitry 22, communication interface 26, or the like, for causing transmission of a first stream comprising panoramic video content based on a first viewport orientation of the client device. As described above, the source device 102 may cause transmission of panoramic video content to one or more client devices 104.

[0080] At operation 702, the source device 102 includes means, such as the processing circuitry 22, communication interface 26, or the like, for receiving a feedback message comprising one or more updated parameters of a second viewport orientation. For example, and as described above, during transmission of a first video stream to a client device 104, the client device 104 may alter a viewport orientation (e.g., through head and/or body motion or the like) such that the client device 104 generates a feedback message comprising one or more updated parameters of a second viewport orientation and causes transmission of the feedback message to the source device 102. In this regard, the source device 102 may be configured to receive the feedback message.

[0081] At operation 703, the source device 102 includes means, such as the processing circuitry 22, memory 24, and/or the like, for generating, in response to the feedback message, a second stream comprising panoramic video content based on the second viewport orientation of the client device. For example, in some embodiments, the source device 102 may update the first stream and/or generate a new stream in accordance with the one or more updated parameters of the feedback message received from the client device 104.

[0082] At operation 704, the source device 102 includes means, such as the processing circuitry 22, communication interface 26, or the like, for causing transmission of the second stream to the client device. In this regard, as described above, the client device 104 may receive the second stream. At operation 705, the source device 102 includes means, such as the processing circuitry 22, communication interface 26, or the like, for receiving one or more determined parameters from the client device 104. In some embodiments, the one or more determined parameters may be received in response to transmission of the second stream. As described above, the client device 104 may determine one or more parameters, such as timestamp values, a motion to high-quality delay value, and/or the like. In some embodiments, the one or more determined parameters may comprise a motion to high-quality delay value determined at the client device 104 (e.g., as described above with reference to FIG. 5B). In some embodiments, the one or more determined parameters may comprise a first timestamp value (e.g., an RTP timestamp value of any of the packets belonging to a video frame which was rendered last at the client device 104 during the first stream based on the first viewport orientation), a second timestamp value (e.g., an RTP timestamp value of any of the packets belonging to the first rendered video frame using viewport-dependent delivery with the second viewport orientation), a viewport-change timestamp value (e.g., a timestamp value based on a time at which the change in the first viewport orientation of the client device to the second viewport orientation occurred), and a viewport-delivered timestamp value (e.g., a timestamp value based on a time at which the second stream is first received at the client device), as described above with reference to FIG. 5C.

[0083] Based on the received one or more parameters, the source device 102 includes means, such as the processing circuitry 22, memory 24, or the like, for updating the second stream based on the received one or more determined parameters and causing transmission of the updated second stream to the client device 104. In this regard, the received one or more determined parameters (e.g., a motion to high-quality delay value and/or the timestamp values described above) may inform and enable the source device 102 to optimize panoramic video transport and/or delivery.

[0084] For example, in an instance in which a received motion to high-quality delay value from the client device 104 is determined to be below a predefined threshold (e.g., a low numerical value), the source device 102 may adjust and/or alter a panoramic video stream to the client device 104 by not providing margin area extension. In another instance in which a received motion to high-quality delay value from the client device 104 is determined to be above a predefined threshold (e.g., a high numerical value), the source device 102 may adjust and/or alter a panoramic video stream to the client device 104 by extending the margin area in order to improve user experience at the client device 104. Using the timing information described herein (e.g., the motion to high-quality delay value, VC timestamp value, VD timestamp value, Vi+1_Lowest_TS, and/or Vi_Highest_TS), the source device 102 may be enabled to make more informed decisions and determinations, for example, on how to use available bandwidth. When motion to high-quality delay is high, the source device 102 may prioritize transmission of viewport areas with margin areas when bandwidth is available. In an alternative example, in an instance in which motion to high-quality delay is low (e.g., low latency), the source device 102 may reduce and/or eliminate extension of margin areas and instead, for example, upgrade a quality level of a viewport area.
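
The bandwidth-allocation choice described above may be sketched as follows (the function name and the two threshold values are illustrative assumptions; the disclosure only requires predefined thresholds, not these particular numbers):

```python
# Illustrative sketch of the source-side strategy decision: a high
# Motion_to_HQ favours extended margins, a low one favours upgrading
# viewport quality instead. Threshold values are assumptions.
LOW_DELAY_MS = 100
HIGH_DELAY_MS = 500

def choose_delivery_strategy(motion_to_hq_ms):
    """Select a viewport-dependent delivery strategy from the reported delay."""
    if motion_to_hq_ms <= LOW_DELAY_MS:
        return "upgrade_viewport_quality"  # drop margin extension
    if motion_to_hq_ms >= HIGH_DELAY_MS:
        return "extend_margins"            # prioritize margins when bandwidth allows
    return "keep_current_strategy"
```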

[0085] As described above, a method, apparatus, and computer program product are disclosed for providing for signaling of viewport orientation timing, such as in panoramic video delivery. By providing for signaling of viewport orientation timing in, e.g., panoramic video delivery, user experience during immersive content consumption may be improved while avoiding unnecessary increases in latency and bandwidth consumption.

[0086] For example, the determined values (e.g., the determined motion to high-quality delay value, VC timestamp value, VD timestamp value, Vi+1_Lowest_TS, and Vi_Highest_TS) described above may be used directly by the source device for improved network adaptation. For example, consider a source device 102 that extends margins around the viewport during VR content delivery. The margins may be delivered at the same or comparable quality as the viewport area in order to reduce quality degradation during head motion. However, when a motion to high-quality delay value is low, the margin extensions may be unnecessary, whereas when motion to high-quality delay is high, extending the margins may significantly improve the user experience. Using the timing information made available through this disclosure as described above, the source device 102 can make more informed decisions on how to use available bandwidth. When the motion to high-quality delay value is high, the source device may prioritize sending viewport area content with margins when bandwidth is available. On the other hand, if the motion to high-quality delay value is low, the source device 102 may reduce margin extensions and upgrade the quality of the viewport area instead.

[0087] Additionally, current RTCP messages provide low-level metrics, such as, for example, jitter and RTT. Higher level metrics that directly impact user experience, such as the motion to high-quality delay value discussed herein, are more suited for use as indicators for bitrate adaptation as they incorporate buffering and application-level delays in addition to network delays.

[0088] The present disclosure additionally reduces overhead for source devices 102. Conventional methods of using RTCP XR reports (e.g., as described in Internet Engineering Task Force (IETF) Request For Comments 3611) for sending a Packet Receipt Times Report Block may be used by a sender to get receipt times of RTP packets. However, using the report block for calculating motion to high-quality delay at the source device 102 requires that the source device is aware of the sequence numbers associated with viewport orientation changes and, additionally, requires the source device 102 to keep track of these sequence numbers at least until the relevant reports are received. Furthermore, since packet receipt times are expressed in RTP timestamp units for a block of sequence numbers, this manner of expression creates significant signaling overhead. Sending motion to high-quality delay values, or even only the relevant timing information (e.g., VC timestamp value, VD timestamp value, Vi+1_Lowest_TS, and Vi_Highest_TS) as described in this disclosure, provides more accurate information at a much lower overhead. Aggregated and/or periodically received motion to high-quality delay values can further reduce the signaling overhead.

[0089] Additionally, the signaled values of the present disclosure enable improved session monitoring capabilities at the source device and other network elements. The metrics collected using the signaled values may be used for live monitoring as well as to enable engineers to further optimize the system 100.

[0090] FIGS. 5A, 5B, 5C, and 7 illustrate flowcharts depicting methods according to an example embodiment of the present invention. It will be understood that each block of the flowcharts and combinations of blocks in the flowcharts may be implemented by various means, such as hardware, firmware, processor, circuitry, and/or other communication devices associated with execution of software including one or more computer program instructions. For example, one or more of the procedures described above may be embodied by computer program instructions. In this regard, the computer program instructions which embody the procedures described above may be stored by a memory device 24 of an apparatus employing an embodiment of the present invention and executed by a processor 22. As will be appreciated, any such computer program instructions may be loaded onto a computer or other programmable apparatus (for example, hardware) to produce a machine, such that the resulting computer or other programmable apparatus implements the functions specified in the flowchart blocks. These computer program instructions may also be stored in a computer-readable memory that may direct a computer or other programmable apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture the execution of which implements the function specified in the flowchart blocks. The computer program instructions may also be loaded onto a computer or other programmable apparatus to cause a series of operations to be performed on the computer or other programmable apparatus to produce a computer-implemented process such that the instructions which execute on the computer or other programmable apparatus provide operations for implementing the functions specified in the flowchart blocks.

[0091] Accordingly, blocks of the flowcharts support combinations of means for performing the specified functions and combinations of operations for performing the specified functions. It will also be understood that one or more blocks of the flowcharts, and combinations of blocks in the flowcharts, can be implemented by special purpose hardware-based computer systems which perform the specified functions, or combinations of special purpose hardware and computer instructions.

[0092] Many modifications and other embodiments of the inventions set forth herein will come to mind to one skilled in the art to which these inventions pertain having the benefit of the teachings presented in the foregoing descriptions and the associated drawings. Therefore, it is to be understood that the inventions are not to be limited to the specific embodiments disclosed and that modifications and other embodiments are intended to be included within the scope of the appended claims.

[0093] Moreover, although the foregoing descriptions and the associated drawings describe example embodiments in the context of certain example combinations of elements and/or functions, it should be appreciated that different combinations of elements and/or functions may be provided by alternative embodiments without departing from the scope of the appended claims. In this regard, for example, different combinations of elements and/or functions than those explicitly described above are also contemplated as may be set forth in some of the appended claims. Although specific terms are employed herein, they are used in a generic and descriptive sense only and not for purposes of limitation.