

Title:
METHOD TO TRANSMIT INTERACTIVE GRAPHICAL DATA BETWEEN A DEVICE AND SERVER AND SYSTEM THEREOF
Document Type and Number:
WIPO Patent Application WO/2019/183664
Kind Code:
A1
Abstract:
A method of reducing the perceived latency between a server and a user device over a network, the method comprising the steps of generating a scene data render on the server, generating a scene meta-data render on the server to describe how the scene data render should behave on the user device, streaming the scene data render and the scene meta-data render from the server to the user device over the network to display the data as it was created on the server, receiving a user input on the scene meta-data on the user device and performing an action on the user device as a result of the user input on the scene meta-data.

Inventors:
YAO CHANG-YI (AU)
ZELLER MARCUS (AU)
Application Number:
PCT/AU2019/050230
Publication Date:
October 03, 2019
Filing Date:
March 14, 2019
Assignee:
YAO CHANG YI (AU)
International Classes:
A63F13/30; G06T19/00
Foreign References:
US20170366838A12017-12-21
US20130268583A12013-10-10
US20170274284A12017-09-28
US20110107220A12011-05-05
Attorney, Agent or Firm:
SPRUSON & FERGUSON (AU)
Claims:
CLAIMS

1. A method of reducing the perceived latency between a server and a user device over a network, the method comprising the steps: a. generating a scene data render on the server; b. generating a scene meta-data render on the server to describe how the scene data render should behave on the user device; c. streaming the scene data render and the scene meta-data render from the server to the user device over the network to display the data as it was created on the server in step a; d. receiving a user input on the scene meta-data on the user device; and e. performing an action on the user device as a result of the user input on the scene meta-data in step d.

2. The method of claim 1, wherein the scene meta-data render from method step b is in a lower resolution than the scene data render from method step a.

3. The method of claim 1 or claim 2, wherein the method also comprises an optional step f. of transmitting the user input on the scene meta-data render with the scene data render from the user device to the server.

4. A system for reducing the perceived latency between a server and a user device over a network, the system configured to perform the method as claimed in claim 1.

5. A system as claimed in claim 4 wherein the user device is a personal computing device with a touchscreen for receiving the user input.

6. A system as claimed in claim 4 or claim 5 wherein the performing of the action includes making changes to the scene meta-data render on the user device.

7. A system as claimed in claim 6 wherein the changes are made by a software application operating on the user device to create an amended scene meta-data render which, when implemented, changes the way in which the scene data render behaves on the user device.

8. A server having a processor and configured to communicate with a user device over a network, the processor is configured to:

a. generate a scene data render and a scene meta-data render, wherein the scene meta-data render describes how the scene data render should behave on the user device; and b. transmit the scene data render and the scene meta-data render to the user device over the network to display the data on the user device, as it was created on the server.

9. A user device having a processor and configured to communicate with a server over a network, the processor is configured to:

a. receive a scene data render and a scene meta-data render streamed from the server,

wherein the scene meta-data render describes how the scene data render should behave on the user device;

b. reproduce the scene data render and the scene meta-data render;

c. accept a user input on the scene meta-data; and

d. perform an action on the user device as a result of the user input on the scene meta-data.

10. A user device as claimed in claim 9 wherein the user device has a display and at least one input mechanism to accept the user input on the scene meta-data.

11. A user device as claimed in claim 9 or claim 10 wherein the user device is a personal computing device with a touchscreen for receiving the user input and at least one processor for performing the action.

12. A computer program that, when executed on a computer having a processor and configured to communicate with a user device over a network, causes the processor to perform a method comprising:

a. generating a scene data render and a scene meta-data render, wherein the scene meta-data render describes how the scene data render should behave on the user device; and b. transmitting the scene data render and the scene meta-data render to the user device over the network to display the data on the user device as it was created on the server.

13. A computer program that, when executed on a user device having a processor and configured to communicate with a server over a network, causes the processor to perform a method comprising:

a. receiving a scene data render and a scene meta-data render streamed from the server, wherein the scene meta-data render describes how the scene data render should behave on the user device;

b. reproducing the scene data render and the scene meta-data render;

c. accepting a user input on the scene meta-data; and

d. performing an action on the user device as a result of the user input on the scene meta-data.

14. A computer program as claimed in claim 13 wherein the action causes changes to be made in the scene meta-data render, wherein the scene meta-data render describes how the scene data render should behave on the user device.

Description:
METHOD TO TRANSMIT INTERACTIVE GRAPHICAL DATA BETWEEN A DEVICE

AND SERVER AND SYSTEM THEREOF

TECHNICAL FIELD

[0001] The present invention relates to a method to transmit interactive graphical data between a device and a server and a system thereof.

BACKGROUND ART

[0002] As electronic communications devices have become more advanced, interactive media such as virtual reality gaming has become popular.

[0003] A problem with interactive computer media of the prior art is that a delay (or latency) between a user input on the communication device and the resulting response can reduce the user's enjoyment of the interactive media, because the system reacts incorrectly due to delayed processing of the input.

If the virtual reality has more motion latency than a real-world situation, then the user's eyes will perceive a less than realistic experience.

[0004] Additionally, complex processing may not be possible on low end devices in a manner that achieves suitable results. Even on high end devices where such processing is possible, other computationally complex calculations may not be able to be performed simultaneously. As a result, complex applications that utilise this motion tracking may not be possible.

OBJECT OF THE INVENTION

[0005] There is a need for an improved method to transmit interactive graphical data between a device and a server, or one that at least ameliorates the aforementioned problems.

[0006] Accordingly, the object of the present invention is to provide a method to transmit interactive graphical data between a device and a server which may at least partially overcome at least one of the above-mentioned disadvantages or provide the consumer with a useful or commercial choice.

SUMMARY OF INVENTION

[0007] With the foregoing in view, the present invention in one form resides broadly in a system to transmit interactive graphical data between a device and a server, and a method thereof, which allows for an improved experience for a user of interactive computer media.

[0008] The present invention has application to applications which are sensitive to latency and to systems which have a higher latency than others. Applications which are sensitive to latency include mobile streamed virtual reality (VR) applications, where streaming a depth buffer would allow for three-dimensional rendering adjustments on the mobile device in real time; streaming a vector field, which could allow the communication device to adjust a server-generated scene to account for the latency of the device-server connection depending upon how the scene is moving; or eye and/or face tracking of a user of a computing device, which may be transmitted by the computing device as tracking data to the server and which can dictate a response effect by the server. Systems which have higher latency include wireless streaming of data between a communication device and a server, or a virtual computer network interface (VIF) which may or may not correspond directly to a network interface controller.

[0009] With the foregoing in view, the present invention in one form resides broadly in a method of reducing the perceived latency between a server and a user device over a network, the method comprising the steps: a. generating a scene data render on the server; b. generating a scene meta-data render on the server to describe how the scene data render should behave on the user device; c. streaming the scene data render and the scene meta-data render from the server to the user device over the network to display the data as it was created on the server in step a; d. receiving a user input on the scene meta-data on the user device; and e. performing an action on the user device as a result of the user input on the scene meta-data in step d.
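The five steps of this method can be sketched as a minimal simulation. Every name, data shape and value below is an illustrative assumption and not taken from the specification:

```python
# Minimal sketch of steps a-e; all names and data shapes are
# illustrative assumptions, not taken from the specification.

def generate_scene_render():
    # Step a: the high-resolution scene data render (stand-in: pixels).
    return [[(200, 200, 200)] * 4 for _ in range(4)]

def generate_scene_metadata():
    # Step b: the meta-data render describing behaviour, here an
    # object-id map covering the same scene area.
    return [[1, 1, 0, 0],
            [1, 1, 0, 0],
            [0, 0, 2, 2],
            [0, 0, 2, 2]]

def stream_to_device(scene, metadata):
    # Step c: both renders are streamed together over the network.
    return {"scene": scene, "metadata": metadata}

def handle_input(payload, x, y):
    # Steps d and e: the device reads the meta-data at the touch point
    # and performs an action locally, without a server round trip.
    object_id = payload["metadata"][y][x]
    return "local action on object %d" % object_id

payload = stream_to_device(generate_scene_render(), generate_scene_metadata())
result = handle_input(payload, 3, 3)   # touch lands on object 2
```

The key point the sketch makes is that step e completes using only data already on the device, which is what removes the perceived round-trip latency.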

[0010] The "action" which is performed on the user device is normally in relation to the scene data render; that is, the user wishes to adjust or interact with the scene data render that is displayed.

[0011] The "action" which is performed on the user device may occur through changes to the scene meta-data render which are made at the user device, typically by a software application operating on the user device, that then subsequently change the way in which the scene data render behaves on the user device.

[0012] In the present specification and claims (if any), the word "comprising" and its derivatives, including "comprises" and "comprise", include each of the stated integers but do not exclude the inclusion of one or more further integers.

[0013] For the purposes of the specification, the term "action" means a processing task performed by the user device independently of the server as a reaction to the user input on the scene meta-data.

[0014] Preferably, the scene meta-data render from method step b is in a lower resolution than the scene data render from method step a.

[0015] Preferably, the method also comprises an optional step f. of transmitting the user input on the scene meta-data render with the scene data render from the user device to the server.

[0016] In a second form of the present invention, there is provided a system for reducing the perceived latency between a server and a user device over a network configured to utilise the method as described above.

[0017] In another aspect, the present invention resides in a server having a processor and configured to communicate with a user device over a network, the processor is configured to generate a scene data render and a scene meta-data render, wherein the scene meta-data render describes how the scene data render should behave on the user device and transmit the scene data render and the scene meta-data render to the user device over the network to display the data as it was created on the server.

[0018] In still another aspect, the present invention resides in a user device having a processor and configured to communicate with a server over a network, the processor is configured to receive a scene data render and a scene meta-data render streamed from the server, wherein the scene meta-data render describes how the scene data render should behave on the user device, reproduce the scene data render and the scene meta-data render, accept a user input on the scene meta-data, and perform an action on the user device as a result of the user input on the scene meta-data.

[0019] In yet another aspect, the present invention resides in a computer program that, when executed on a computer having a processor and configured to communicate with a user device over a network, causes the processor to perform a method comprising generating a scene data render and a scene meta-data render, wherein the scene meta-data render describes how the scene data render should behave on the user device and transmitting the scene data render and the scene meta-data render to the user device over the network to display the data as it was created on the server.

[0020] In still a further aspect, the present invention resides in a computer program that, when executed on a computer having a processor and configured to communicate with a server over a network, causes the processor to perform a method comprising receiving a scene data render and a scene meta-data render streamed from the server, wherein the scene meta-data render describes how the scene data render should behave on the user device, reproducing the scene data render and the scene meta-data render, accepting a user input on the scene meta-data, and performing an action on the user device as a result of the user input on the scene meta-data.

[0021] Any of the features described herein can be combined in any combination with any one or more of the other features described herein within the scope of the invention.

[0022] The reference to any prior art in this specification is not, and should not be taken as an acknowledgement or any form of suggestion that the prior art forms part of the common general knowledge.

BRIEF DESCRIPTION OF DRAWINGS

[0023] Preferred features, embodiments and variations of the invention may be discerned from the following Detailed Description which provides sufficient information for those skilled in the art to perform the invention. The Detailed Description is not to be regarded as limiting the scope of the preceding Summary of Invention in any way. The Detailed Description will make reference to a number of drawings as follows:

[0024] Figure 1 illustrates a system to transmit interactive graphical data between a device and a server, according to an embodiment of the present invention;

[0025] Figure 2 illustrates a schematic process chart of a method of streaming a touch map to allow a user device to interact with a graphical scene, interacting with a graphical element which has moved in the time the graphical data took to transmit to the device. The user is interacting with the system as shown in Figure 1 above;

[0026] Figure 3 illustrates a schematic process chart of a method of streaming a depth buffer to allow for three-dimensional rendering adjustment on a user device in real time which is to be operated on the system as shown in Figure 1 above; and

[0027] Figure 4 illustrates a schematic process chart of a method of streaming a vibration map to allow for touch interaction responses to the vibration map locally on the user device thereby triggering the user device to respond without having to wait for the server to respond to touch which is to be operated on the system as shown in Figure 1 above.


DESCRIPTION OF EMBODIMENTS

[0029] Figure 1 illustrates a system 100 for reducing the perceived latency between a server and a user device over a network. The system 100 provides low-complexity tracking of smart devices in 3D space, which in turn enables efficient display of virtual worlds, 3D models or animations on a smartphone or similar portable computing device.

[0030] The system 100 includes a user communications device in the form of a smartphone 105 optionally coupled to a three-dimensional (3D) tracking device 110. The smartphone 105 may include a touch screen display, with which the user may interact. The computing device may transmit details of such interaction to the server. The interaction may comprise interaction with an element of media that has been provided by the server. The smartphone 105 is coupled to the 3D tracking device 110 in a fixed relationship, such that when a user 115 moves the smartphone 105, the smartphone 105 and the 3D tracking device 110 move together. As a result, the 3D tracking device 110 is able to track movement of the smartphone 105, including a location and direction or orientation thereof. The smartphone 105 comprises one or more processors configured to control at least: display on the screen display device with which the user may interact; reception and reproduction of media that has been provided by the server; transmission of the user interaction to the server; and communication with the 3D tracking device 110. The smartphone 105 also comprises a computer-readable storage medium such as RAM, ROM or EPROM on which a computer program is stored to cause the processor to carry out the processes as described above, for example.

[0031] The system 100 also comprises a computing device in the form of a remote server 120 in communication with the smartphone 105. In particular, the server 120 comprises one or more processors configured to generate media for transmission to the smartphone 105 by selecting a scene portion of a scene based upon the location and direction of the handheld computing device. As such, when the user moves the smartphone 105, the view is updated to provide a "moving window" into a three-dimensional world. The server 120 also comprises a computer-readable storage medium such as RAM, ROM or EPROM on which a computer program is stored to cause the processor to carry out the processes as described above, for example.
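The "moving window" selection performed by the server can be illustrated with a toy crop over a 2D world grid. The world layout, viewport size and function name below are hypothetical, chosen only to show the idea of selecting a scene portion from the tracked device pose:

```python
# Hypothetical sketch of selecting a scene portion from the tracked
# device pose: the viewport follows the device like a moving window.

WORLD = [["tile%d,%d" % (r, c) for c in range(8)] for r in range(8)]

def select_scene_portion(world, x, y, width=2, height=2):
    # Crop only the rows/columns the device is currently "looking at";
    # only this portion is sent, rather than the whole world.
    return [row[x:x + width] for row in world[y:y + height]]

view = select_scene_portion(WORLD, 3, 2)
```

Transmitting only the selected portion, rather than every possible view, is what allows the lower bandwidth usage discussed in paragraph [0035].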

[0032] Scene portions may be selected to simulate navigation in a three-dimensional environment. The three-dimensional environment may comprise a virtual reality environment, or may comprise scene portions to be overlaid over image data captured by the smartphone 105 to provide mixed reality.

[0033] The process is performed continuously so that the media is provided to the user based upon their actual movements. As such, a continuous and immersive experience is provided to the user.

[0034] The media may be provided sequentially, such as in the form of video, and updated such that each frame of the video is generated according to the current location and direction of the phone. This enables the server 120 and the smartphone 105 to utilise existing protocols and methods for transporting the media to the phone.

[0035] As the media may be generated (or modified) at the server 120, lower bandwidth between the smartphone 105 and server 120 may be utilised, as only the view to be displayed to the user need be sent (rather than all views). As a result, perceived latency may be reduced, and the overall user experience may be increased.

[0036] According to certain embodiments, the smartphone 105 is further configured to track a face and/or eyes of a user 135, and report same to the server 120. As such, the server 120 is able to generate media (such as video) for the user based not only on the location and direction of the smartphone 105, but also the relative angle the user is viewing the smartphone (parallax).

[0037] The server 120 is coupled to a data store 125, which includes content. The content may comprise three-dimensional models, from which video data is generated for transmission to the smartphone 105. The three-dimensional model may comprise part of a game, with which the user interacts through the smartphone 105.

[0038] Advantageously, the system 100 may enable virtual reality to be provided in a manner that is not hindered by use of a headset, and enables the user to use his or her phone, which they are generally familiar with and comfortable with.

[0039] The system 100 is low complexity as it utilises tracking from a device that is separate to (but attached to) the smartphone 105. Such configuration not only increases battery life in that processing on the smartphone 105 is reduced, but also enables the smartphone 105 to perform other computationally complex activities, which in turn enables more advanced applications and games.

[0040] The problem to be addressed by the present invention is that sending interactive graphical data between a client communications device and a server can result in perceived latency (or lag) to the user. The points of latency in this process are:

1. Rendering a graphical scene on the server;

2. Encoding the scene data on the server and decoding the scene data on the client when sending the render from the server to the client; and

3. The communications link between the device and server.

[0041] The present invention will now be described with reference to the above points of latency, Figures 2 to 4 and by way of the following examples.

[0042] Figure 2 shows the use of the present invention to perform a method of streaming a touch map to allow a user device to adjust a graphical scene depending upon how the user interacts with the scene. The server may generate a scene data render in high resolution (such as a 1280x720p resolution in the form of a 150kb jpeg image) and a scene meta-data render in the form of a touch map in a lower resolution (such as a 64x36p resolution in the form of a 4.6kb bitmap where each pixel is 2x2mm).

[0043] Both the scene data and the scene meta-data renders are streamed from the server 120 to the smartphone 105 over the network 130 to display the scene as it was created on the server 120. The user input, in the form of a touch interaction 130 received on the meta-data scene render, is transmitted to the server 120 from the user device 105, which results in a change of the scene on the server 120. The touch icon indicates where the touch would have been registered if sent naively. By using a touch map, the user input touch has successfully interacted with the correct object in the scene without suffering the effects of latency, even though the scene has changed on the server 120.
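A touch-map lookup of this kind can be sketched as follows. The resolutions match the example above (a 1280x720 scene and a 64x36 touch map); the function name and object ids are illustrative assumptions:

```python
# Sketch of resolving a touch through a low-resolution touch map.
# Resolutions follow the example in the text (1280x720 scene data
# render, 64x36 touch map); names and object ids are illustrative.

SCENE_W, SCENE_H = 1280, 720
MAP_W, MAP_H = 64, 36

def touch_to_object(touch_map, touch_x, touch_y):
    # Scale the full-resolution touch coordinate down to a map cell,
    # then read the object id stored there.
    cell_x = touch_x * MAP_W // SCENE_W
    cell_y = touch_y * MAP_H // SCENE_H
    return touch_map[cell_y][cell_x]

# Touch map where object id 7 occupies the top-left quarter.
touch_map = [[7 if x < MAP_W // 2 and y < MAP_H // 2 else 0
              for x in range(MAP_W)]
             for y in range(MAP_H)]

hit = touch_to_object(touch_map, 100, 100)   # resolves to object 7
```

Because the map is a 4.6 kB bitmap rather than the full 150 kB frame, it is cheap to stream alongside every scene data render.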

[0044] Figure 3 shows the use of the present invention to stream a depth buffer to allow for three-dimensional rendering adjustments on the user device 105 in real time. The server may generate a scene data render in high resolution and a scene meta-data render in the form of a depth buffer in a lower resolution.

[0045] Both the scene data and the scene meta-data renders are streamed from the server 120 to the smartphone 105 over the network 130 to display the scene as it was created on the server 120. The smartphone 105 calculates the scene by compositing the graphical scene data render and the depth buffer meta-data render, and by doing so compensates for the perceived motion delay through local calculation alone.
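A depth-based adjustment of this kind can be illustrated with a one-row parallax shift: near pixels move more than far ones when the camera moves. The shift model and all names below are assumptions made only for illustration, not the specification's method:

```python
# Illustrative sketch of compensating for camera motion locally with a
# streamed depth buffer: near pixels shift more than far ones (simple
# one-row parallax). The shift model and names are assumptions.

def reproject_row(colors, depths, camera_dx):
    # Shift each pixel left by an amount inversely proportional to its
    # depth; holes (None) appear where near content uncovered the
    # background, a typical artefact of depth-based reprojection.
    width = len(colors)
    out = [None] * width
    for x in range(width):
        shift = int(camera_dx / depths[x])
        nx = x - shift
        if 0 <= nx < width:
            out[nx] = colors[x]
    return out

row = ["a", "b", "c", "d"]
depths = [1.0, 2.0, 1e6, 1e6]   # "a" is nearest and shifts the most
shifted = reproject_row(row, depths, camera_dx=2.0)
```

Running this locally for each new pose reading is what lets the device present a motion-correct frame without waiting for a fresh render from the server.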

[0046] Figure 4 shows the use of the present invention to stream a vibration map to allow for touch interaction responses to the vibration map locally on the user device. The server 120 may generate a scene data render in high resolution and a scene meta-data render in the form of a vibration map in a lower resolution.

[0047] Both the scene data and the scene meta-data renders are streamed from the server 120 to the smartphone 105 over the network 130 to allow for touch interaction responses to the vibration map locally on the user device. In this way the user device is triggered to respond without having to wait for the server to respond to touch.
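The vibration-map lookup can be sketched as below. The map layout and all names are illustrative assumptions; the point is that the haptic response is decided entirely on the device:

```python
# Hedged sketch of the vibration-map idea: the device answers a touch
# with haptic feedback looked up locally, instead of waiting for a
# server round trip. The map layout and names are illustrative.

def on_touch(vibration_map, x, y, haptic_log):
    # Look up the vibration intensity (0 = none) at the touched cell
    # and trigger it immediately on the device.
    intensity = vibration_map[y][x]
    if intensity > 0:
        haptic_log.append(intensity)   # stand-in for driving the motor
    return intensity

vibration_map = [[0, 0, 3],
                 [0, 5, 0],
                 [0, 0, 0]]
haptic_log = []
on_touch(vibration_map, 1, 1, haptic_log)   # vibrates locally at 5
```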

Example 1: Displaying a virtual reality (VR) Scene

[0048] The simplest (or naive) solution to displaying virtual reality scenes on a client communications device is to render the scene on a server, send the server render to the device and display on the device. However, data displayed by this solution is always perceivably delayed.

[0049] A more current solution to this problem is that the device sends motion data to the server, a render of a graphical scene is created on the server while predicting future motion data and sent to the device for display. This solution reduces perceived latency by using motion data. However, if the motion data is out of date due to the transfer from the device to the server and back again, an incorrect prediction occurs which creates a worse experience for the user.

[0050] An embodiment of the present invention is to render a graphical scene on the server, send the scene render and a depth buffer to the device, and then adjust the render based on local motion data for display on the device. Motion adjustments are applied on the device when displayed.

[0051] This solution reduces the perceived latency by using motion and position data. It requires the device to be "smart", i.e. to contain a processor and a graphics processing unit (GPU). While the data to be sent between the server and device is slightly larger, the render delay is not affected. In this way the perceived delay to the user is substantially reduced and nearly eliminated.

Example 2: Graphical Interaction

[0052] The simplest (or naive) solution to the problem of graphical interaction between a server and a client communications device is to send graphical data to the device which then displays the graphical data. The user then inputs a response on the device (such as a touch response) and the naive input is sent to the server. The server then responds to the delayed user input by sending new graphical data to the device which then shows the new graphical data. The problem with this solution is that inputs between the server and the device are delayed which can result in perceived latency to the user.

[0053] A more current solution to this problem is that the server sends the graphical data to the device and saves a copy of the graphical data locally on the server. The device displays the graphical data for the user to input a response on the device. The naive user input is then sent to the server, which responds to the input based off the previously saved graphical data. The server then sends new graphical data to the device for display. In this way, inputs on specific virtual objects are confirmed on the server, which can result in the perceived delay to the user being reduced. However, the relatively large amount of data stored on the server increases as latency between the device and the server increases.

[0054] An embodiment of the present invention is to send graphical data and graphical metadata from the server to the device. The device displays the graphical data and the user inputs on the device. In light of the input, a calculation is performed with the metadata on the user device, which responds by performing an action, for example by updating the scene according to the object interacted with rather than the position the user interacted with, which may or may not contain the same object after the latency. In this way, inputs on specific virtual objects are confirmed on the device to synchronise the user input with the display, reducing the perceived delay to the user by acting independently of the server. Optionally, the user input on the scene meta-data render is transmitted with the scene data render from the user device to the server, which then responds to the delayed user input after performing a processing step by sending a response to the user device, which in turn responds to the server response.
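The difference between the naive position-based input and the metadata object-based input can be sketched as follows. All names are illustrative; the point is that the object map travels with the displayed frame, so the device resolves the touch to the object the user actually saw, even though the object has since moved on the server:

```python
# Sketch contrasting naive position-based input resolution with the
# meta-data object-based resolution described above (names are
# illustrative assumptions, not from the specification).

def naive_hit(server_positions, x):
    # Server resolves against its *current* (already moved) layout.
    return server_positions.get(x)

def metadata_hit(frame_object_map, x):
    # Device resolves against the object map of the frame it displayed.
    return frame_object_map.get(x)

displayed_frame_map = {10: "button"}   # what the user saw and touched
server_now = {12: "button"}            # the object moved during latency

miss = naive_hit(server_now, 10)             # the naive input misses
hit = metadata_hit(displayed_frame_map, 10)  # the meta-data input lands
```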

Example 3: Graphical Input Response

[0055] The simplest (or naive) solution to the problem of responding between a server and device to graphical input is to send graphical data to the device which then displays the graphical data. The user then inputs a response on the device (such as a touch response) and the naive input is sent to the server. The server then responds to the delayed user input by sending a response to the device which then responds to the server response. The problem with this solution is that inputs between the server and the device are delayed which can result in perceived latency to the user.

[0056] An embodiment of the present invention is to send graphical data and graphical metadata from the server to the device. The device displays the graphical data and the user inputs on the device. In light of the input, the device responds in light of the metadata, and an input is sent to the server, which responds by sending new graphical data and new graphical metadata to the device for display. In this way, inputs on specific virtual objects are confirmed on the device to synchronise the user input with the display to reduce the perceived delay to the user.

CONCLUDING STATEMENTS

[0057] Therefore, the present invention has the advantage over the prior art of reducing the latency perceived by a user of a user device as a result of communication between a server and the user device.

[0058] Reference throughout this specification to 'one embodiment' or 'an embodiment' means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, appearances of the phrases 'in one embodiment' or 'in an embodiment' in various places throughout this specification are not necessarily all referring to the same embodiment.

Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more combinations.