


Title:
TRACKING OF MULTIPLE EXTENDED REALITY DEVICES
Document Type and Number:
WIPO Patent Application WO/2023/219615
Kind Code:
A1
Abstract:
This application is directed to device synchronization and alignment in extended reality. Two electronic devices create two maps of a scene according to two distinct coordinate systems. A first electronic device determines a device pose of a second electronic device in a first coordinate system of the first electronic device. The device pose is used to determine a transformation relationship between the two coordinate systems. The first electronic device obtains a second object pose that is measured in a second coordinate system of the second electronic device and used to render an object in a second map of the second electronic device. The second object pose is converted to a first object pose in the first coordinate system based on the transformation relationship. The object is rendered concurrently in the first and second maps of the first and second electronic devices based on the first and second object poses, respectively.

Inventors:
XU YI (US)
Application Number:
PCT/US2022/028803
Publication Date:
November 16, 2023
Filing Date:
May 11, 2022
Assignee:
INNOPEAK TECH INC (US)
International Classes:
G06T19/00; G06T17/05; G02B27/01; G06V20/20
Foreign References:
US20210110615A12021-04-15
US20220092860A12022-03-24
US20210064774A12021-03-04
Attorney, Agent or Firm:
WANG, Jianbai et al. (US)
Claims:
What is claimed is:

1. A method for rendering a virtual object in multiple electronic devices, implemented at a first electronic device, the method comprising: executing a session of an extended reality application on the first electronic device, creating a first map of a scene by the first electronic device, the first map having a first coordinate system, wherein a second electronic device is configured to execute the extended reality application and create a second map of the scene having a second coordinate system; determining a second device pose of the second electronic device by the first electronic device in the first coordinate system; based on the second device pose of the second electronic device, determining a transformation relationship between the first coordinate system of the first electronic device and the second coordinate system of the second electronic device; obtaining a second object pose of an object in the second coordinate system of the second map where the object is rendered; converting the second object pose to a first object pose in the first coordinate system based on the transformation relationship; and while the object is rendered in the second map for the second electronic device, concurrently rendering the object in the first map for the first electronic device, the object having the first object pose in the first coordinate system.

2. The method of claim 1, further comprising: aligning the first electronic device with the second electronic device; capturing an image of the second electronic device by the first electronic device; and based on the captured image of the second electronic device, determining the second device pose of the second electronic device in the first coordinate system.

3. The method of claim 1 or 2, further comprising: using a plurality of cameras of the first electronic device to identify a second position of the second electronic device, the plurality of cameras distanced apart by a first set of distances; and triangulating a 3D location of the second electronic device within the first coordinate system based on the first set of distances and images captured by the plurality of cameras.

4. The method of claim 1 or 2, further comprising: using one or more sensors to determine the second device pose of the second electronic device relative to the first electronic device; and generating a 3D location of the second electronic device within the first coordinate system.

5. The method of claim 1 or 2, further comprising: determining an orientation of the second electronic device using one or more cameras coupled to or integrated in the first electronic device.

6. The method of any of the preceding claims, wherein the first electronic device is configured to be worn on a head of a first user and the second electronic device is configured to be worn on a head of a second user.

7. The method of any of the preceding claims, wherein the first electronic device or the second electronic device is a virtual reality headset.

8. The method of any of claims 1-6, wherein the first electronic device or the second electronic device is an augmented reality headset.

9. The method of any of the preceding claims, wherein the second electronic device includes second features configured to be detected by the first electronic device.

10. The method of any of the preceding claims, wherein the first coordinate system and the second coordinate system are both coordinate systems in three-dimensional space.

11. An electronic system, comprising: one or more processors; and memory having instructions stored thereon, which when executed by the one or more processors cause the processors to perform a method of any of claims 1-10.

12. A non-transitory computer-readable medium, having instructions stored thereon, which when executed by one or more processors cause the processors to perform a method of any of claims 1-10.

Description:
Tracking of Multiple Extended Reality Devices

TECHNICAL FIELD

[0001] This application relates generally to device tracking in an extended reality (XR) system including, but not limited to, systems and methods for tracking multiple devices for a multi-user XR system.

BACKGROUND

[0002] Different forms of XR, such as augmented reality (AR), virtual reality (VR), and mixed reality (MR), have become increasingly common in industrial settings and everyday life. XR systems are being used by multiple users interacting with each other in the same virtual and physical environment. An XR system may need to track the location of each user within the virtual environment and the location of each user relative to each other.

Traditional XR systems require the use of external cameras and sensors to track the movement of each user within the virtual environment and relative to each other. Some traditional XR systems may track multiple users using a fiducial marker or known object in the real environment. These traditional XR systems may require additional processing or components.

SUMMARY

[0003] Various embodiments of this application are directed to simultaneous localization and mapping (SLAM) techniques that track multiple electronic devices with respect to the same coordinate system without using any fiducial markers and/or known objects in the real environment. This is beneficial for implementing multi-user XR experiences for AR glasses, VR headsets, and MR devices. When an electronic system including two electronic devices starts up, device tracking starts. Each of a first electronic device and a second electronic device attempts to build a respective map of the scene and keeps track of a respective device pose (i.e., position and orientation) at the same time. The first and second electronic devices run independently of each other. This can be achieved by SLAM. After a SLAM module successfully tracks the position and orientation of each electronic device, the first and second electronic devices have their own world coordinate systems, which are different from each other (i.e., in which origins or directions of the axes are different). This necessitates alignment between the two coordinate systems of the first and second electronic devices. In some embodiments of this application, a rigid transformation (translation and rotation) is determined for transforming a device pose tracked in one of the two coordinate systems to the other one of the two coordinate systems.
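For illustration only, the following minimal sketch shows one common way to represent such a rigid transformation, assuming poses are expressed as 4x4 homogeneous matrices; the function names and the numeric rotation and translation values are assumptions for this example rather than anything specified by the disclosure.

```python
# Minimal sketch (not the disclosed implementation): compose a rotation and a
# translation into a 4x4 homogeneous transform and use it to map a pose
# expressed in the second device's coordinate system into the first.
import numpy as np

def make_transform(rotation: np.ndarray, translation: np.ndarray) -> np.ndarray:
    """Compose a 3x3 rotation and a 3-vector translation into a 4x4 matrix."""
    T = np.eye(4)
    T[:3, :3] = rotation
    T[:3, 3] = translation
    return T

# Hypothetical alignment result: 90-degree rotation about the vertical axis
# plus a 1 m offset along x between the two world coordinate systems.
theta = np.pi / 2
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
T_align = make_transform(R, np.array([1.0, 0.0, 0.0]))

# A pose tracked in the second coordinate system, re-expressed in the first.
pose_in_second = make_transform(np.eye(3), np.array([0.0, 2.0, 0.5]))
pose_in_first = T_align @ pose_in_second
```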

[0004] In one aspect, a method is implemented at a first electronic device for rendering a virtual object in multiple electronic devices. The method includes executing a session of an extended reality application on the first electronic device and creating a first map of a scene by the first electronic device. The first map has a first coordinate system. A second electronic device is configured to execute the extended reality application and create a second map of the scene having a second coordinate system. The method further includes determining a second device pose of the second electronic device by the first electronic device in the first coordinate system. The method further includes, based on the second device pose of the second electronic device, determining a transformation relationship between the first coordinate system of the first electronic device and the second coordinate system of the second electronic device. The method further includes obtaining a second object pose of an object in the second coordinate system of the second map where the object is rendered, converting the second object pose to a first object pose in the first coordinate system based on the transformation relationship, and while the object is rendered in the second map for the second electronic device, concurrently rendering the object in the first map for the first electronic device, the object having the first object pose in the first coordinate system.
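The outline below ties the summarized steps together as a runnable sketch; the function and argument names are hypothetical (the patent does not define this API), and the identity matrices in the usage line merely stand in for tracked poses.

```python
# Hypothetical outline of the summarized method at the first device. The
# argument names follow the transformation-matrix notation used later in the
# description (T21, T22); none of this is an API defined by the disclosure.
import numpy as np

def first_object_pose(T_21: np.ndarray,            # second device pose, first system
                      T_22: np.ndarray,            # second device pose, own system
                      object_pose_second: np.ndarray) -> np.ndarray:
    """Convert an object pose from the second coordinate system to the first."""
    T_a = T_21 @ np.linalg.inv(T_22)               # transformation relationship
    return T_a @ object_pose_second                # pose used for concurrent rendering

# Usage sketch with identity matrices standing in for tracked 4x4 poses.
identity = np.eye(4)
pose_first = first_object_pose(identity, identity, identity)
```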

[0005] In another aspect, some implementations include an electronic system that includes one or more processors and memory having instructions stored thereon, which when executed by the one or more processors cause the processors to perform any of the above methods.

[0006] In yet another aspect, some implementations include a non-transitory computer-readable medium, having instructions stored thereon, which when executed by one or more processors cause the processors to perform any of the above methods.

[0007] These illustrative embodiments and implementations are mentioned not to limit or define the disclosure, but to provide examples to aid understanding thereof. Additional embodiments are discussed in the Detailed Description, and further description is provided there.

BRIEF DESCRIPTION OF THE DRAWINGS

[0008] For a better understanding of the various described embodiments, reference should be made to the Detailed Description below, in conjunction with the following drawings in which like reference numerals refer to corresponding parts throughout the figures.

[0009] Figure 1 is an example data processing environment having one or more servers communicatively coupled to one or more client devices, in accordance with some embodiments.

[0010] Figure 2 is a block diagram illustrating an electronic device, in accordance with some embodiments.

[0011] Figure 3 is a flowchart of a process for processing inertial sensor data and image data of an electronic system using a SLAM module, in accordance with some embodiments.

[0012] Figures 4A and 4B are two example XR environments in each of which two electronic devices are separated in a scene and calibrated in space, in accordance with some embodiments.

[0013] Figure 5 is a flow diagram of a method for rendering a virtual object in two electronic devices, in accordance with some embodiments.

[0014] Like reference numerals refer to corresponding parts throughout the several views of the drawings.

DETAILED DESCRIPTION

[0015] Reference will now be made in detail to specific embodiments, examples of which are illustrated in the accompanying drawings. In the following detailed description, numerous non-limiting specific details are set forth in order to assist in understanding the subject matter presented herein. But it will be apparent to one of ordinary skill in the art that various alternatives may be used without departing from the scope of claims and the subject matter may be practiced without these specific details. For example, it will be apparent to one of ordinary skill in the art that the subject matter presented herein can be implemented on many types of electronic systems with digital video capabilities.

[0016] Figure 1 is an example extended reality (XR) system 100 having one or more servers communicatively coupled to one or more client devices, imaging devices, and Internet-of-Things (IoT) sensors, in accordance with some embodiments. XR system 100 tracks multiple devices with respect to the same coordinate system without using any markers and/or known objects in the real environment. Each of the one or more client devices 104 is, for example, a desktop computer 104A, a tablet computer 104B, a mobile phone 104C, or an intelligent, multi-sensing, network-connected home device (e.g., a surveillance camera). In some embodiments, the one or more client devices 104 are coupled to one or more XR devices 150. For example, one or more client devices 104 are coupled to two, three, four, five, six, or more than six XR devices 150. XR devices 150 include one or more of VR glasses/headsets, AR devices/headsets, or MR glasses/headsets (also called head-mounted displays (HMDs)). In some embodiments, XR device 150 is an optical see-through head-mounted display (OST-HMD), in which virtual objects are rendered as if they are on top of the real world. Alternatively, in some embodiments, XR device 150 is a VR headset or a combination of a VR headset and an OST-HMD. Each client device 104 is configured to collect data or user inputs, execute user applications, and present outputs on its user interface.

[0017] In some embodiments, XR devices 150 include one or more of an image sensor, a microphone, a speaker, one or more inertial sensors (e.g., gyroscope, accelerometer), and a display. The image sensor and microphone are configured to capture video and audio data from a scene of XR devices 150, while the one or more inertial sensors are configured to capture inertial sensor data. In some situations, the image sensor captures hand gestures of a user wearing XR devices 150. In some situations, the microphone records ambient sound, including user’s voice commands. In some situations, both video or static visual data captured by the image sensor and the inertial sensor data measured by the one or more inertial sensors are applied to determine and predict device poses (e.g., device positions and orientations). The video, static image, audio, or inertial sensor data captured by XR devices 150 are optionally processed by XR devices 150, server(s) 102, or both to recognize the device poses of XR devices 150.

[0018] The device poses are used by XR devices 150 to interact with an application (e.g., a gaming application, a conferencing application, etc.) executed by XR devices 150. In some embodiments, XR devices 150 display a user interface, and the recognized or predicted device poses are used to render or interact with user-selectable display items on the user interface. In some embodiments, deep learning techniques are applied in XR system 100 to process video data, static image data, or inertial sensor data captured by XR devices 150. Device poses are recognized and predicted based on such video, static image, and/or inertial sensor data using a data processing model. Training of the data processing model is optionally implemented by server 102 or XR devices 150. Inference of the device poses is implemented by each of the server 102 and XR devices 150 independently or by both of the server 102 and XR devices 150 jointly. This is probably the case for video pass-through AR, but not for other types of XR devices.

[0019] The collected data or user inputs from the client device 104, XR devices 150, or a combination thereof can be processed locally (e.g., for training and/or for prediction) at each device and/or remotely by the server(s) 102. The one or more servers 102 provide system data (e.g., boot files, operating system images, and user applications) to the client devices 104 and/or XR devices 150 and, in some embodiments, process the data and user inputs received from the client device(s) 104 and/or XR devices 150 when the user applications are executed on the client devices 104 and/or XR devices 150. In some embodiments, XR system 100 further includes storage 106 for storing data related to the servers 102, client devices 104, and/or XR devices 150 and applications executed on the devices. For example, storage 106 stores video content (including visual and audio content), static visual content, and/or inertial sensor data for training a machine learning model (e.g., deep learning network). Alternatively, storage 106 also stores video content, static visual content, and/or inertial sensor data obtained by a client device 104 and/or XR devices 150 to which a trained machine learning model can be applied to determine one or more poses associated with the video content, static visual content, and/or inertial sensor data.

[0020] The one or more servers 102 can enable real-time data communication with client devices 104 and/or XR devices 150 that are remote from each other or from the one or more servers 102. Further, in some embodiments, the one or more servers 102 can implement data processing tasks that cannot be or are preferably not completed locally by client devices 104 and/or XR devices 150. For example, the client devices 104 include a game console (e.g., the head-mounted display 150) that executes an interactive online gaming application. The game console receives a user instruction and sends it to game server 102 with user data. Game server 102 generates a stream of video data based on the user instruction and user data and provides the stream of video data for display on the game console and other client devices that are engaged with the game console. In another example, client devices 104 include a networked surveillance camera and mobile device 104C (e.g., a mobile phone). The networked surveillance camera collects video data and streams the video data to surveillance camera server 102 in real time. While the video data is optionally pre-processed on the surveillance camera, surveillance camera server 102 processes the video data to identify motion or audio events in the video data and share information of these events with mobile phone 104C, thereby allowing a user of mobile phone 104C to monitor the events occurring near the networked surveillance camera in real time and remotely.

[0021] Servers 102, one or more client devices 104 and/or XR devices 150, and storage 106 are communicatively coupled to each other via one or more communication networks 108, which are the medium used to provide communications links between these devices and computers connected together within XR system 100. Communication networks 108 include connections, such as wire, wireless communication links, or fiber optic cables. Examples of communication networks 108 include local area networks (LAN), wide area networks (WAN) such as the Internet, or a combination thereof. Communication networks 108 are, optionally, implemented using any known network protocol, including various wired or wireless protocols, such as Ethernet, Universal Serial Bus (USB), FIREWIRE, Long Term Evolution (LTE), Global System for Mobile Communications (GSM), Enhanced Data GSM Environment (EDGE), code division multiple access (CDMA), time division multiple access (TDMA), Bluetooth, Wi-Fi, voice over Internet Protocol (VoIP), Wi-MAX, or any other suitable communication protocol. A connection to communication networks 108 is established either directly (e.g., using 3G/4G/5G connectivity to a wireless carrier), or through network interface 110 (e.g., a router, switch, gateway, hub, or an intelligent, dedicated whole-home control node), or through any combination thereof. As such, communication networks 108 can represent the Internet, a worldwide collection of networks and gateways that use the Transmission Control Protocol/Internet Protocol (TCP/IP) suite of protocols to communicate with one another. At the heart of the Internet is a backbone of high-speed data communication lines between major nodes or host computers, consisting of thousands of commercial, governmental, educational, and other electronic systems that route data and messages.

[0022] In some embodiments, deep learning techniques are applied in XR system 100 to process content data (e.g., video data, visual data, audio data) obtained by an application executed at client device 104 to identify information contained in the content data, match the content data with other data, categorize the content data, or synthesize related content data. The content data may broadly include inertial sensor data captured by inertial sensor(s) of client device 104. In these deep learning techniques, data processing models are created based on one or more neural networks to process the content data. These data processing models are trained with training data before they are applied to process the content data.

[0023] In some embodiments, both model training and data processing are implemented locally at each individual client device 104 (e.g., client device 104C and XR devices 150). Client devices 104 and/or XR devices 150 obtain the training data from servers 102 or storage 106 and apply the training data to train the data processing models. Subsequently to model training, client devices 104 and/or XR devices 150 obtain the content data (e.g., capture video data via an internal and/or external image sensor, such as a camera) and process the content data using the trained data processing models locally.

[0024] Alternatively, in some embodiments, both model training and data processing are implemented remotely at server 102 (e.g., server 102A) associated with client device 104 (e.g., client device 104A and XR devices 150). Server 102A obtains the training data from itself, another server 102, or storage 106 and applies the training data to train the data processing models. Client devices 104 and/or XR devices 150 obtain the content data, send the content data to server 102A (e.g., in an application) for data processing using the trained data processing models, receive data processing results (e.g., recognized or predicted device poses) from server 102A, present the results on a user interface (e.g., associated with the application), render virtual objects in a field of view based on the poses, or implement some other functions based on the results.

[0025] Alternatively and additionally, in some embodiments, client devices 104 and/or XR devices 150 themselves implement no or little data processing on the content data prior to sending them to server 102A. Additionally, in some embodiments, data processing is implemented locally at client devices 104 and/or XR devices 150, while model training is implemented remotely at server 102 (e.g., server 102B) associated with client devices 104 and/or XR devices 150. Server 102B obtains the training data from itself, another server 102, or storage 106 and applies the training data to train the data processing models. The trained data processing models are optionally stored in server 102B or storage 106. Client devices 104 and/or XR devices 150 import the trained data processing models from the server 102B or storage 106, process the content data using the data processing models, and generate data processing results to be presented on a user interface or used to initiate some functions (e.g., rendering virtual objects based on device poses) locally.

[0026] In various embodiments of this application, two or more XR devices 150 co-exist in a scene associated with an extended reality application, and content rendered in these two or more XR devices 150 is synchronized in time and aligned in space. Two XR devices 150 create two maps of a scene according to two distinct coordinate systems. A first XR device 152 determines a device pose of a second XR device 154 in a first coordinate system of the first XR device 152. The device pose is used to determine a transformation relationship between the two coordinate systems of the first and second XR devices 152 and 154. The first XR device 152 obtains a second object pose associated with the second XR device 154. The second object pose is measured in a second coordinate system of the second XR device 154 and used to render an object in a second map of the second XR device 154. The second object pose is converted to a first object pose in the first coordinate system based on the transformation relationship. The object is rendered concurrently in the first and second maps of the first and second XR devices 152 and 154 based on the first and second object poses, respectively.

[0027] Figure 2 is a block diagram of an electronic system 200, in accordance with some embodiments. Electronic system 200 optionally includes an XR system. Electronic system 200 includes at least one of a client device 104, a server 102, or a combination thereof. An example of the client device 104 is an XR device 150. Electronic system 200 is optionally used as XR system 100. Electronic system 200, typically, includes one or more processing units (CPUs) 202, one or more network interfaces 204, memory 206, and one or more communication buses 208 for interconnecting these components (sometimes called a chipset). Electronic system 200 includes one or more input devices 210 that facilitate user input, such as a keyboard, a mouse, a voice-command input unit or microphone, a touch screen display, a touch-sensitive input pad, a gesture capturing camera, or other input buttons or controls. Electronic system 200 also includes one or more output devices 212 that enable presentation of user interfaces and display content, including one or more speakers and/or one or more visual displays. As described above in Figure 1, in some embodiments, devices of electronic system 200 are communicatively coupled to each other via the one or more network interfaces 204 or communication buses 208.

[0028] In some embodiments, electronic system 200 includes XR devices 150 having one or more imaging devices 260 (e.g., tracking cameras, infrared sensors, CMOS sensors, etc.), scanners, or photo sensor units for capturing images or video, detecting users or interesting objects, and/or detecting environmental conditions (e.g., background scenery or objects). XR device 150 also includes one or more output devices that enable presentation of user interfaces and display content, including one or more speakers and/or one or more visual displays. Optionally, XR device 150 includes a location detection device, such as a GPS (global positioning satellite) or other geo-location receiver, for determining the location of XR device 150. Optionally, XR device 150 includes an inertial measurement unit (IMU) 280 integrating multi-axis inertial sensors to provide estimation of a location and an orientation of XR device 150 in space. Examples of the one or more inertial sensors include, but are not limited to, a gyroscope, an accelerometer, a magnetometer, and an inclinometer.

[0029] Memory 206 includes high-speed random access memory, such as DRAM, SRAM, DDR RAM, or other random access solid state memory devices; and, optionally, includes non-volatile memory, such as one or more magnetic disk storage devices, one or more optical disk storage devices, one or more flash memory devices, or one or more other non-volatile solid state storage devices. Memory 206, optionally, includes one or more storage devices remotely located from one or more processing units 202. Memory 206, or alternatively the non-volatile memory within memory 206, includes a non-transitory computer readable storage medium. In some embodiments, memory 206, or the non-transitory computer readable storage medium of memory 206, stores the following programs, modules, and data structures, or a subset or superset thereof:

* Operating system 214 including procedures for handling various basic system services and for performing hardware dependent tasks;

* Network communication module 216 for connecting the server 102 and other devices (e.g., client devices 104, XR devices 150 (e.g., head-mounted displays), imaging devices, IoT sensors, and/or storage 106) via one or more network interfaces 204 (wired or wireless) and one or more communication networks 108, such as the Internet, other wide area networks, local area networks, metropolitan area networks, and so on;

* User interface module 218 for enabling presentation of information (e.g., a graphical user interface for application(s) 226, widgets, websites and web pages thereof, and/or games, audio and/or video content, text, etc.) at each client device 104 and/or XR device 150 via their respective output devices (e.g., displays, speakers, etc.);

* Input processing module 220 for detecting one or more user inputs or interactions from one of the one or more input devices 210 and interpreting the detected input or interaction;

* Web browser module 222 for navigating, requesting (e.g., via HTTP), and displaying websites and web pages thereof, including a web interface for logging into a user account associated with a client device 104, XR device 150, or another electronic device, controlling the client, XR or electronic device if associated with the user account, and editing and reviewing settings and data that are associated with the user account;

* One or more user applications 226 for execution by electronic system 200 (e.g., games, social network applications, smart home applications, and/or other web or non-web based applications for controlling another electronic device and reviewing data captured by such devices);

* Data processing module 228 for processing content data, e.g., identifying information contained in the content data, matching the content data with other data, categorizing the content data, or synthesizing related content data, where in some embodiments, the data processing module 228 is associated with one of the user applications 226 to process the content data in response to a user instruction received from the user application 226;

* Pose determination and prediction module 230 for determining and predicting a pose (position and orientation) of the client device 104, XR device 150, and/or other devices, which further includes a SLAM (Simultaneous Localization and Mapping) module 232 for mapping a scene where a client device 104 and/or other devices are located and identifying a location of the client device 104 and/or other devices within the scene;

* Content rendering module 234 for generating virtual content based on content data (e.g., video data, visual data, audio data, sensor data) collected or obtained by one or more of a client device 104, XR device 150, other imaging devices, and IoT sensors, and rendering the virtual content on top of a field of view of one or more of the client device 104 or XR device 150 based on the pose of the client device 104 or XR device 150; and

* One or more databases 238 for storing at least data including one or more of:

o Device settings 240 including common device settings (e.g., service tier, device model, storage capacity, processing capabilities, communication capabilities, etc.) of the one or more servers 102, client devices 104, XR devices 150, imaging devices, and IoT sensors;

o User account information 242 for the one or more user applications 226, e.g., user names, security questions, account history data, user preferences, and predefined account settings;

o Network parameters 244 for the one or more communication networks 108, e.g., IP address, subnet mask, default gateway, DNS server, and host name;

o Data processing model(s) 246 for processing content data (e.g., video data, visual data, audio data, sensor data) using deep learning techniques;

o Content data and results 248 that are obtained by and outputted to the client devices 104 and XR devices 150 of the XR system 100, where the content data is processed by the data processing models 246 locally at the respective devices or remotely at the server 102 to provide the associated results to be presented on the client devices 104, XR devices 150, and/or other devices; and

o SLAM data 250 including mapping information of a scene, device pose data of XR devices 150, and one or more transformation relationships among the XR devices 150.

[0030] Optionally, the one or more databases 238 are stored in one of the server 102, client device 104, imaging devices, and storage 106 of the XR system 100. Optionally, the one or more databases 238 are distributed in more than one of the server 102, client device 104, and storage 106 of the XR system 100. In some embodiments, more than one copy of the above data is stored at distinct devices, e.g., two copies of the data processing models 246 are stored at the server 102 and storage 106, respectively.

[0031] Each of the above identified elements may be stored in one or more of the previously mentioned memory devices, and corresponds to a set of instructions for performing a function described above. The above identified modules or programs (i.e., sets of instructions) need not be implemented as separate software programs, procedures, modules, or data structures, and thus various subsets of these modules may be combined or otherwise re-arranged in various embodiments. In some embodiments, memory 206 stores a subset of the modules and data structures identified above. Furthermore, memory 206 stores additional modules and data structures not described above.

[0032] Figure 3 is a flowchart of a process 300 for processing inertial sensor data and image data of an electronic system (e.g., a server 102, a client device 104, or a combination of both) using a visual-inertial SLAM module 232, in accordance with some embodiments. The process 300 includes measurement preprocessing 302, initialization 304, local visual-inertial odometry (VIO) with relocation 306, and global pose graph optimization 308. In measurement preprocessing 302, an RGB camera 260 captures image data of a scene at an image rate (e.g., 30 FPS), and features are detected and tracked (310) from the image data.

An IMU 280 measures inertial sensor data at a sampling frequency (e.g., 1000 Hz) concurrently with the RGB camera 260 capturing the image data, and the inertial sensor data are pre-integrated (312) to provide data of a variation of device poses 340. In initialization 304, the image data captured by the RGB camera 260 and the inertial sensor data measured by the IMU 280 are temporally aligned (314). Vision-only structure from motion (SfM) techniques are applied (316) to couple the image data and inertial sensor data, estimate three-dimensional structures, and map the scene of the RGB camera 260.

[0033] After initialization 304 and during relocation 306, a sliding window 318 and associated states from a loop closure 320 are used to optimize (322) a VIO. When the VIO corresponds (324) to a keyframe of a smooth video transition and a corresponding loop is detected (326), features are retrieved (328) and used to generate the associated states from the loop closure 320. In global pose graph optimization 308, a multi-degree-of-freedom (multi-DOF) pose graph is optimized (330) based on the states from the loop closure 320, and a keyframe database 332 is updated with the keyframe associated with the VIO.

[0034] Additionally, the features that are detected and tracked (310) are used to monitor (334) motion of an object in the image data and estimate image-based poses 336, e.g., according to the image rate. In some embodiments, the inertial sensor data that are pre-integrated (312) may be propagated (338) based on the motion of the object and used to estimate inertial-based poses 340, e.g., according to a sampling frequency of the IMU 280. The image-based poses 336 and the inertial-based poses 340 are stored in the database 240 and used by the module 230 to estimate and predict poses that are used by the real-time video rendering system 234. Alternatively, in some embodiments, the module 232 receives the inertial sensor data measured by the IMU 280 and obtains image-based poses 336 to estimate and predict more poses 340 that are further used by the real-time video rendering system 234.

[0035] In SLAM, high frequency pose estimation is enabled by sensor fusion, which relies on data synchronization between imaging sensors and the IMU 280. The imaging sensors (e.g., the RGB camera 260, a LiDAR scanner) provide image data desirable for pose estimation, and oftentimes operate at a lower frequency (e.g., 30 frames per second) and with a larger latency (e.g., 30 milliseconds) than the IMU 280. Conversely, the IMU 280 can measure inertial sensor data and operate at a very high frequency (e.g., 1000 samples per second) and with a negligible latency (e.g., < 0.1 millisecond). Asynchronous time warping (ATW) is often applied in an AR system to warp an image before it is sent to a display to correct for head movement and pose variation that occurs after the image is rendered. ATW algorithms reduce a latency of the image, increase or maintain a frame rate, or reduce judders caused by missing images. In both SLAM and ATW, relevant image data and inertial sensor data are stored locally, such that they can be synchronized and used for pose estimation/prediction. In some embodiments, the image and inertial sensor data are stored in one of multiple Standard Template Library (STL) containers, e.g., std::vector, std::queue, std::list, etc., or other self-defined containers. These containers are generally convenient to use. The image and inertial sensor data are stored in the STL containers with their timestamps, and the timestamps are used for data search, data insertion, and data organization.
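As a point of comparison only, the sketch below mirrors the timestamp-keyed storage described above in Python (the passage itself refers to C++ STL containers such as std::vector and std::queue); the class name and the sample data are assumptions for illustration.

```python
# Illustrative stand-in for a timestamped sensor buffer used to synchronize
# image and inertial data; timestamps are assumed to be unique floats.
from bisect import bisect_left, insort
from typing import Any, List, Tuple

class TimestampedBuffer:
    def __init__(self) -> None:
        self._samples: List[Tuple[float, Any]] = []   # kept sorted by timestamp

    def insert(self, timestamp: float, sample: Any) -> None:
        """Insert a sample while keeping the buffer ordered by timestamp."""
        insort(self._samples, (timestamp, sample))

    def nearest(self, timestamp: float) -> Tuple[float, Any]:
        """Return the stored sample whose timestamp is closest to the query."""
        i = bisect_left(self._samples, (timestamp,))
        candidates = self._samples[max(0, i - 1):i + 1]
        return min(candidates, key=lambda entry: abs(entry[0] - timestamp))

# Example: match a ~30 FPS camera frame to 1000 Hz IMU samples by timestamp.
imu = TimestampedBuffer()
for k in range(100):
    imu.insert(k / 1000.0, ("gyro_sample", k))
frame_timestamp = 1.0 / 30.0
ts, sample = imu.nearest(frame_timestamp)
```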

[0036] Figures 4A and 4B are two example XR systems 400 (400A and 400B) in each of which two XR devices 152 and 154 (e.g., two HMDs) are separated in a scene and calibrated in space, in accordance with some embodiments. The two XR devices 152 and 154 co-exist in the same scene, and correspond to two coordinate systems 402 and 404, respectively. The first XR device 152 is configured to determine a device pose 406 of the second XR device 154 in a first coordinate system 402 of the first XR device 152, and derive a transformation relationship 408 between the two coordinate systems 402 and 404 of the two XR devices 152 and 154. When an object needs to be rendered in both of the first and second XR devices 152 and 154, the first XR device 152 obtains a second object pose 412 associated with the second XR device 154. The second object pose 412 is measured in a second coordinate system 404 of the second XR device 154 and used to render an object in a second map of the second XR device 154. The second object pose 412 is converted to a first object pose 410 in the first coordinate system 402 based on the transformation relationship 408. The object is rendered concurrently in the first and second maps of the first and second XR devices 152 and 154 based on the first and second object poses 410 and 412, respectively.

[0037] Referring to Figure 4A, in some embodiments, both of the first and second XR devices 152 and 154 are communicatively coupled to a server 102 (e.g., a game server, an extended reality server) configured to execute a user application (e.g., an XR application) including the scene. The second object pose 412 of the second XR device 154 is provided to the first XR device 152 via the server 102, allowing the first XR device 152 to use the second object pose 412 to determine the first object pose 410. Referring to Figure 4B, in some embodiments, both of the first and second XR devices 152 and 154 are coupled to each other, and configured to communicate data directly, e.g., using a Bluetooth communication link or a wire, without involving the server 102. In some situations, the second object pose 412 of the second XR device 154 is provided directly to the first XR device 152 to determine the first object pose 410.

[0038] Each extended reality system 400 allows multiple XR devices 150 to interact with each other in a virtual environment (e.g., an XR environment). In some embodiments, an XR system 400 uses a camera external to each XR device 152 or 154 carried by a user for determining a relative location of an object and the user. Such an XR system 400 determines object locations on a single map/scene having a single coordinate system, and stores the object locations in a server remote to the XR devices 152 and 154. Alternatively, in some embodiments, the XR system 400 uses markers placed in a scene. The markers are 2D markers arranged in a predefined pattern having a predefined size, thereby serving as real-world anchors of location, orientation, and scale. These markers are optionally placed and moved by the user, and help an XR application to recognize the scene or objects within the scene. Locations of other users and objects are determined relative to the anchors as represented by the markers. By these means, the XR devices 152 and 154 are configured to determine the location of objects and other users relative to each other and within a single map.

[0039] Additionally, in some embodiments of this application, each XR device 152 or 154 is integrated with a respective camera 422 or 424, respectively, and does not use any external marker. Specifically, in some embodiments, each XR device 150 includes one or more of an image sensor (e.g., in a camera 260), a microphone, a speaker, one or more inertial sensors (e.g., gyroscope, accelerometer), and a display. The image sensor and microphone are configured to capture video and audio data from a scene of XR device 150, while the one or more inertial sensors 280 are configured to capture inertial sensor data. In some embodiments, the image sensor captures hand gestures or movements of a user wearing XR device 150. In some embodiments, the microphone records ambient sound, including user’s voice commands. In some embodiments, both video or static visual data captured by the image sensor and the inertial sensor data measured by the one or more inertial sensors are applied to determine and predict device poses (e.g., device positions and orientations). The video, static image, audio, or inertial sensor data captured by XR device 150 are optionally processed by XR device 150, server(s) 102, or both to recognize the poses of one or more XR devices 150.

[0040] In some embodiments, one or more XR devices 150 include first XR device 152 and second XR device 154. First XR device 152 and second XR device 154 are configured to communicate with each other and/or with one or more servers 102 and client device 104C. Each of first XR device 152 and second XR device 154 is an electronic device configured to allow a user to interact with an XR environment (e.g., VR, AR, MR environments). In some embodiments, XR system 400 includes first XR device 152 and second XR device 154. For example, XR system 400 is configured to render an XR environment, and first XR device 152 and second XR device 154 are configured to allow a user to interact with the XR environment.

[0041] In some embodiments, virtual content is rendered on a display of XR devices 150, e.g., on top of a first field of view of the display of first XR device 152 and/or second XR device 154. For example, if first XR device 152 is a pair of XR glasses, the virtual content is rendered on top of a first field of view of the display of first XR device 152.

[0042] In some embodiments, first XR device 152 and second XR device 154 are configured to allow users to view and interact with the same XR environment. First XR device 152 and second XR device 154 allow users to interact with each other in the same XR environment, or interact with objects in the same XR environment. For example, a first user uses first XR device 152, and a second user uses second XR device 154. The first user interacts with the second user within the XR environment. In some embodiments, the first user interacts with a first object and the second user interacts with the same first object in the XR environment. A first user of first XR device 152 and a second user of second XR device 154 interact with each other and/or objects in the same environment without the need for cameras external to first XR device 152 and second XR device 154.

[0043] In some embodiments, first XR device 152 and second XR device 154 are configured to allow for interaction and collaboration of the same set of virtual objects in the same XR environment. For example, first XR device 152 allows for the interaction of a first object within an XR environment, and second XR device 154 allows for interaction of the same first object in the same XR environment.

[0044] XR system 400 allows users associated with XR devices 150 to interact with each other. In particular, XR system 400 allows users to audibly and/or visually interact with one another (e.g., via directional sounds, avatars, shared image data, shared video data, and/or other interactions). In some embodiments, the users interact with one another while being in physical proximity to one another. Physical proximity, for purposes of this disclosure, means within eyesight or earshot. In some embodiments, XR system 400 facilitates interactions among users located in the same room or common area. In some embodiments, a single communication session involves more than two users of XR system 400.

[0045] In some embodiments, first XR device 152 is configured to execute a session of an XR application. For example, first XR device 152 includes a processor and is communicatively coupled to a server and/or storage device. First XR device 152 is configured to receive instructions to load and execute an application allowing first XR device 152 to display an XR environment to the user of first XR device 152. The XR application provides an XR environment that allows one or more users to interact with each other and/or objects in virtual space. In some embodiments, second XR device 154 is configured to execute a session of an XR application.

[0046] In some embodiments, upon execution of the XR application on first XR device 152, a first map is created of a first XR scene. The first map allows users to interact with each other and/or objects. In some embodiments, first XR device 152 generates the first map upon execution of the XR application. The first map is associated with the first XR device 152. In some embodiments, the first map has a first coordinate system 402. The first coordinate system 402 is associated with first XR device 152. The first map provides locations of objects and persons relative to the location of first XR device 152 within the first map.

[0047] In some embodiments, second XR device 154 is configured to execute a session of an XR application. For example, second XR device 154 includes a processor and is communicatively coupled to a server and/or storage device. Second XR device 154 is configured to receive instructions to load and execute an application allowing second XR device 154 to display an XR environment to the user of second XR device 154. The XR application provides an XR environment that allows one or more users to interact with each other and/or objects in virtual space. In some embodiments, second XR device 154, using the XR application, creates a second map having a second coordinate system 404. The second map is different from the first map created by first XR device 152. For example, first XR device 152 generates a first map having a first coordinate system 402 during a session of an XR application. Second XR device 154 generates a second map having a second coordinate system 404 different from the first coordinate system 402 during the same session of the XR application. In some embodiments, during a communication session, a device pose of a first XR device 152 (e.g., an XR device 150 worn by a first user) is determined.

[0048] First XR device 152 includes one or more cameras and is configured to determine a pose of second XR device 154. For example, first XR device 152 has a pose associated with the position and/or location of first XR device 152. Second XR device 154 has a pose associated with the position and/or location of second XR device 154. In some embodiments, first XR device 152 is configured to determine a pose of second XR device 154 within the first coordinate system 402 associated with the first map of first XR device 152. Second XR device 154 is configured to determine a pose of first XR device 152 within the second coordinate system 404 associated with the second map of second XR device 154. Upon determining the pose of second XR device 154 within the first coordinate system 402 of the first map, first XR device 152 is configured to determine a transformation relationship 408 between the first coordinate system 402 of the first map associated with first XR device 152 and the second coordinate system 404 of the second map associated with second XR device 154.

[0049] In practice, upon initiation (e.g., powering up) of XR system 400, each of first XR device 152 and second XR device 154 executes a session of an XR application. The session of the XR application includes tracking of objects, persons, and/or devices. For example, first XR device 152 and/or second XR device 154 are configured to track objects, persons, and/or devices within the session. First XR device 152 constructs a first map around the user (e.g., a first user) of first XR device 152, and keeps track of a position of first XR device 152 within the first map. Second XR device 154 constructs a second map around the user (e.g., a second user different from the first user) of second XR device 154 and keeps track of the position of second XR device 154 within the second map. Each of first XR device 152 and second XR device 154 is configured to keep track of their respective poses (i.e., position and orientation) at the same time. For example, first XR device 152 includes a first camera 422, and second XR device 154 includes a second camera 424. The first camera 422 is configured to keep track of the pose of first XR device 152 within a first map having a first coordinate system 402. The second camera 424 is configured to keep track of the pose of second XR device 154 within a second map having a second coordinate system 404. In some embodiments, first XR device 152 and second XR device 154 operate simultaneously and independently of one another. This can be achieved by using a SLAM method (e.g., utilizing a SLAM module). First XR device 152 and second XR device 154 are used in the same room or common area, such that second XR device 154 is within a line of sight of first XR device 152, and first XR device 152 is in the line of sight of second XR device 154. First XR device 152 locates second XR device 154 within the first map, and second XR device 154 locates first XR device 152 within the second map.

[0050] In some embodiments, upon determination of the poses of first XR device 152 and second XR device 154, XR system 400 initiates an alignment process. In some embodiments, the alignment process is used by XR system 400 to determine an algorithm or mathematical relationship for transforming a pose within the first coordinate system 402 of first XR device 152 to a pose within the second coordinate system 404 of second XR device 154 and vice versa. For example, in accordance with alignment of first XR device 152 and second XR device 154, first XR device 152 determines the pose of second XR device 154 within the first map having the first coordinate system 402, and second XR device 154 determines the pose of first XR device 152 within the second map having the second coordinate system 404. XR system 400 generates a transformation relationship 408 to determine the relationship between the first coordinate system 402 and the second coordinate system 404. The alignment process results in XR system 400 determining a transformation relationship 408 between the first coordinate system 402 and the second coordinate system 404. By these means, objects viewed by first XR device 152 are positioned within the second coordinate system 404, and objects viewed by second XR device 154 are likewise positioned within the first coordinate system 402 by XR system 400.

[0051] In some embodiments, while continuously tracking first XR device 152 and second XR device 154, first XR device 152 is physically aligned with second XR device 154. The alignment occurs, automatically or upon a request, when second XR device 154 enters a field of view of first XR device 152. For example, XR system 400 determines that the first camera 422 of first XR device 152 is in line with the second camera 424 of second XR device 154, and determines the pose of first XR device 152 within the second coordinate system 404 and the pose of second XR device 154 within the first coordinate system 402. In some embodiments, each of first XR device 152 and second XR device 154 includes one or more sensors (e.g., optical sensors), and alignment of one or more sensors of first XR device 152 with one or more sensors of second XR device 154 results in alignment of first XR device 152 with second XR device 154. In an example, XR system 400 instructs a first user of first XR device 152 to look at the second XR device 154, allowing the first XR device 152 to capture an image of the second XR device 154. This image is used to determine a second device pose of the second XR device 154, from which a transformation relationship 408 is further derived. In another example, XR system 400 instructs a second user of second XR device 154 to look at the first XR device 152, allowing the second XR device 154 to capture an image of the first XR device 152. This image is used to determine a first device pose of the first XR device 152, from which the transformation relationship 408 is further derived. Upon alignment of the cameras 422 and 424, XR system 400 aligns the first and second XR devices 152 and 154, and determines the transformation relationship 408 of the first coordinate system 402 to the second coordinate system 404.
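One conventional way to recover a device pose from such a captured image is a perspective-n-point (PnP) solve against known 3D feature points on the other headset; the patent does not specify this particular algorithm, and the function below is only a sketch under that assumption (the caller must supply detected 2D points, the matching 3D model points, and the camera intrinsics).

```python
# Sketch: estimate the second device's pose in the first camera's frame from
# 2D detections of known 3D points on the second device (PnP). This is an
# assumed approach for illustration, not the method mandated by the patent.
import cv2
import numpy as np

def estimate_device_pose(model_points_3d: np.ndarray,   # (N, 3) points on the device
                         image_points_2d: np.ndarray,   # (N, 2) detected pixels
                         camera_matrix: np.ndarray) -> np.ndarray:
    """Return a 4x4 pose of the observed device in the observer's camera frame."""
    ok, rvec, tvec = cv2.solvePnP(model_points_3d.astype(np.float64),
                                  image_points_2d.astype(np.float64),
                                  camera_matrix.astype(np.float64), None)
    if not ok:
        raise RuntimeError("PnP failed; at least 4 well-spread points are needed")
    R, _ = cv2.Rodrigues(rvec)           # rotation vector -> 3x3 rotation matrix
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = tvec.ravel()
    return T
```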

[0052] In some embodiments, a user may initiate the alignment process by pressing a button on XR devices 150. For example, a first user may press a button disposed on first XR device 152, which results in XR system 400 providing instructions to the first user via first XR device 152 to align with second XR device 154. In some embodiments, one or more sensors on first XR device 152 and/or second XR device 154 assist with aligning first XR device 152 and second XR device 154. For example, first XR device 152 and second XR device 154 include infrared (IR) sensors. The first camera 422 of first XR device 152 is aligned with the front of second XR device 154, and the second camera 424 of second XR device 154 is aligned with the front of first XR device 152.

[0053] In some embodiments, a time-of-flight camera or other 3D sensing cameras are used to align first XR device 152 with second XR device 154. A machine learning model may also assist with finding the 3D location of first XR device 152 relative to second XR device 154 and/or the 3D location of second XR device 154 relative to first XR device 152. In some embodiments, the 2D location of first XR device 152 on the image captured by the camera on the second XR device 154 and/or the 2D location of second XR device 154 on the image captured by the camera on the first XR device 152 is determined, and a depth map is generated. The depth map is used to convert the 2D location of each of first XR device 152 and second XR device 154 to a 3D location.
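The conversion from a 2D detection plus depth to a 3D location can be illustrated with a standard pinhole back-projection; the intrinsic parameters (fx, fy, cx, cy) and the example pixel/depth values below are hypothetical, and the patent does not tie the depth-map step to this specific formula.

```python
# Pinhole back-projection sketch: pixel (u, v) with a depth value becomes a
# 3D point in the observing camera's coordinate system.
import numpy as np

def backproject(u: float, v: float, depth_m: float,
                fx: float, fy: float, cx: float, cy: float) -> np.ndarray:
    """Convert a pixel location and depth (meters) into a camera-frame 3D point."""
    x = (u - cx) / fx * depth_m
    y = (v - cy) / fy * depth_m
    return np.array([x, y, depth_m])

# Example: headset detected at pixel (700, 365), 1.8 m away, assumed intrinsics.
location_3d = backproject(700.0, 365.0, 1.8, fx=600.0, fy=600.0, cx=640.0, cy=360.0)
```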

[0054] In some embodiments, first XR device 152 and second XR device 154 include a pair of stereo cameras to determine the 3D locations of second XR device 154 and first XR device 152, respectively. For example, first XR device 152 includes two or more stereo cameras configured to take images of second XR device 154. Using two or more stereo cameras allows for use of triangulation to determine the 3D location of second XR device 154. The center of second XR device 154 is localized from two or more input images of the stereo cameras of first XR device 152. In some embodiments, a deep learning-based machine model is used by first XR device 152 to estimate the 3D location of second XR device 154 based on continuous tracking of second XR device 154 by first XR device 152.
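A compact sketch of the triangulation step is shown below; the camera intrinsics, the 10 cm baseline, and the pixel coordinates are invented example values, and the use of OpenCV's triangulatePoints is an assumption rather than the patent's prescribed implementation.

```python
# Stereo triangulation sketch: two cameras a known baseline apart observe the
# other headset's center, and its 3D location is recovered in the left
# camera's coordinate system. All numbers here are illustrative assumptions.
import cv2
import numpy as np

K = np.array([[600.0, 0.0, 640.0],
              [0.0, 600.0, 360.0],
              [0.0, 0.0, 1.0]])
baseline_m = 0.10                                    # assumed camera separation
P_left = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P_right = K @ np.hstack([np.eye(3), np.array([[-baseline_m], [0.0], [0.0]])])

# Pixel coordinates of the tracked headset center in the left/right images.
pt_left = np.array([[700.0], [365.0]])
pt_right = np.array([[660.0], [365.0]])

point_h = cv2.triangulatePoints(P_left, P_right, pt_left, pt_right)  # 4x1 homogeneous
location_3d = (point_h[:3] / point_h[3]).ravel()     # ~[0.15, 0.0125, 1.5] meters
```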

[0055] In some embodiments, first XR device 152 is configured to track the face of second user of second XR device 154. For example, instead of tracking second XR device 154, first XR device 152 tracks features of the face of second user using second XR device 154. In some embodiments, first XR device 152 is configured to track facial features (e.g., lips, nose, cheeks, chin) of a user instead of or in addition to tracking second XR device 154. In some embodiments, first XR device 152 is configured to determine the position of second XR device 154 based on facial features. For example, first XR device 152 determines the location of the nose of second user and thus identifies the location of second XR device 154, based on an assumption that second XR device 154 is proximate to a nose of the second user (e.g., second XR device 154 sits on the nose of second user).

[0056] In some embodiments, first XR device 152 is configured to determine the orientation of second XR device 154. For example, first XR device 152 uses time-of-flight cameras or other 3D sensing cameras to estimate the orientation of second XR device 154 from one or more depth images taken by one or more cameras of first XR device 152. In some embodiments, first XR device 152 includes one or more stereo cameras that are used to determine and/or estimate the orientation of second XR device 154. First XR device 152 uses a deep learning-based model to estimate the orientation directly from an input image (e.g., an RGB image or a depth image) of second XR device 154 captured by first XR device 152. In some embodiments, first XR device 152 faces opposite to second XR device 154. A first plane extending through the front surface of first XR device 152 is substantially parallel to a second plane extending through the front surface of second XR device 154. When the first plane of first XR device 152 is substantially parallel to the second plane of second XR device 154, the orientation of second XR device 154 is the direct opposite of the orientation of first XR device 152.
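
For the opposite-facing case described above, one coarse way to estimate the second device's orientation is to flip the first device's orientation 180 degrees about its own up axis, as sketched below. This assumes the two headsets face each other directly and share an up direction, and that orientations are represented as 3x3 rotation matrices whose columns are the device's right, up, and forward axes; none of these conventions are mandated by the disclosure.

```python
import numpy as np

def opposite_facing_orientation(R_first):
    """Coarse estimate of the second device's orientation when the two
    devices are assumed to face each other directly: rotate the first
    device's orientation 180 degrees about its local up (y) axis, which
    reverses the forward and right axes while keeping up unchanged."""
    R_flip_y = np.array([[-1.0, 0.0,  0.0],
                         [ 0.0, 1.0,  0.0],
                         [ 0.0, 0.0, -1.0]])  # 180-degree rotation about y
    return R_first @ R_flip_y
```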

[0057] In some embodiments, second XR device 154 is configured to determine a pose of an object that is rendered within the second map. The pose of the object rendered within the second map is within the second coordinate system 404. Using the transformation relationship 408, first XR device 152 is configured to convert the pose of the object within the second map from the second coordinate system 404 to the first coordinate system 402 associated with first XR device 152. In some embodiments, while the object is rendered in the second map for second XR device 154, the object is concurrently rendered in the first map for first XR device 152. The object rendered in the first map has a pose in the first coordinate system 402.
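
A minimal sketch of this conversion follows, assuming the transformation relationship 408 is available as a 4x4 homogeneous matrix Ta (see the computation in the next paragraph) and that the object pose arrives as a rotation matrix and translation vector in the second coordinate system; the function and argument names are illustrative only.

```python
import numpy as np

def object_pose_in_first_frame(Ta, R_obj_second, t_obj_second):
    """Re-express an object pose given in the second coordinate system
    (rotation R, translation t) in the first coordinate system, using the
    4x4 transformation relationship Ta that maps second-system poses into
    the first system."""
    T_obj_second = np.eye(4)
    T_obj_second[:3, :3] = R_obj_second
    T_obj_second[:3, 3] = t_obj_second
    T_obj_first = Ta @ T_obj_second   # pose of the object in the first coordinate system
    return T_obj_first[:3, :3], T_obj_first[:3, 3]
```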

[0058] In some embodiments, XR system 400 is configured to calculate a transformation relationship 408 that transforms poses between the first coordinate system 402 and the second coordinate system 404. A first pose tracked by a first camera 422 of first XR device 152 is represented by a rotation vector “r” and a translation vector “t”. The vectors of the first pose (e.g., vectors r and t) are composed into a matrix T. Matrix T is a transformation matrix. In some embodiments, the transformation matrix T of a first pose of first XR device 152 at the time of alignment with second XR device 154 is referred to as T11, and the transformation matrix T of a second device pose of second XR device 154 at the time of alignment with first XR device 152 is referred to as T22. In some embodiments, the transformation matrix of the second device pose of second XR device 154 with respect to the first coordinate system 402 of first XR device 152 at the time of alignment of first XR device 152 with second XR device 154 is referred to as T21, and the transformation matrix of the first pose of first XR device 152 with respect to the second coordinate system 404 of second XR device 154 at the time of alignment of second XR device 154 with first XR device 152 is referred to as T12. The transformation matrix that converts a pose in the second coordinate system 404 to a pose in the first coordinate system 402 is referred to as Ta. Transformation matrix Ta is computed based on T21 and T22. For example, since T21 = Ta * T22, Ta is computed as Ta = T21 * T22^-1, where T22^-1 is the inverse of T22. Based on the computation of Ta, one or more subsequent tracked poses of second XR device 154 within the second coordinate system 404 are transformed to poses within the first coordinate system 402. This allows for the alignment of objects viewed by one or more cameras of first XR device 152 with the display of second XR device 154. In other words, this allows XR system 400 to determine the pose of an object within one coordinate system using one or more XR devices 150 (e.g., first XR device 152 and second XR device 154).
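
The computation above can be sketched as follows: a pose given as a rotation vector r and translation vector t is composed into a 4x4 matrix via Rodrigues' formula, and Ta is obtained from T21 and T22. This is an illustrative numpy rendering of the stated relationship, not code from the disclosure.

```python
import numpy as np

def pose_matrix(rvec, tvec):
    """Compose a 4x4 transformation matrix T from a rotation vector r and a
    translation vector t, using Rodrigues' formula for the rotation part."""
    rvec = np.asarray(rvec, dtype=float)
    theta = np.linalg.norm(rvec)
    if theta < 1e-12:
        R = np.eye(3)
    else:
        k = rvec / theta
        K = np.array([[0.0, -k[2], k[1]],
                      [k[2], 0.0, -k[0]],
                      [-k[1], k[0], 0.0]])
        R = np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = np.asarray(tvec, dtype=float)
    return T

def compute_Ta(T21, T22):
    """Compute Ta from T21 (the second device's pose expressed in the first
    coordinate system) and T22 (the second device's pose in its own
    coordinate system), using T21 = Ta @ T22, hence Ta = T21 @ inv(T22)."""
    return T21 @ np.linalg.inv(T22)

# Any subsequent pose tracked in the second coordinate system can then be
# mapped into the first coordinate system as: T_in_first = Ta @ T_in_second.
```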

[0059] Additionally, referring to Figure 4A, in some embodiments, the device pose 406 is provided to the server 102 or the second XR device 154 to determine the transformation relationship 408 by the server 102 or the second XR device 154, respectively. In some embodiments, the first object pose 410 is measured in the first coordinate system 402 of the first XR device 152, and provided to the second XR device 154 via the server 102 to determine the second object pose 412 based on the transformation relationship 408 held by the second XR device 154. In some embodiments, the transformation relationship 408 is determined at, or submitted to, the server 102, and used to determine at least one of the first and second object poses 410 and 412 at the server 102. Alternatively, referring to Figure 4B, in some embodiments, the device pose 406 is provided to the second XR device 154 to determine the transformation relationship 408. In some embodiments, the first object pose 410 is measured in the first coordinate system 402 of the first XR device 152, and provided to the second XR device 154 to determine the second object pose 412 based on the transformation relationship 408 held by the second XR device 154. The server 102 is not involved in this data exchange process that facilitates space alignment of the first and second XR devices 152 and 154.

[0060] Figure 5 is a flow diagram of a method 500 for rendering a virtual object in two electronic devices (e.g., two XR devices 152 and 154 in Figure 1), in accordance with some embodiments. The virtual object is rendered in the two electronic devices in a synchronous manner, and matched and aligned in space to a field of view of each electronic device. In some embodiments, the method is applied in AR glasses, robotic systems, vehicles, or mobile phones. For convenience, the method 500 is described as being implemented by an electronic device. In an example, the method 500 is applied to determine and predict poses, map a scene, and render virtual content in extended reality (e.g., VR, AR) for each of the two electronic devices. Method 500 is, optionally, governed by instructions that are stored in a non-transitory computer readable storage medium and that are executed by one or more processors of the electronic system. Each of the operations shown in Figure 5 may correspond to instructions stored in a computer memory or non-transitory computer readable storage medium (e.g., memory 206 of the electronic system 200 in Figure 2). The computer readable storage medium may include a magnetic or optical disk storage device, solid state storage devices such as Flash memory, or other non-volatile memory device or devices. The instructions stored on the computer readable storage medium may include one or more of: source code, assembly language code, object code, or other instruction format that is interpreted by one or more processors. Some operations in method 500 may be combined and/or the order of some operations may be changed.

[0061] A first electronic device (e.g., an XR device 152) executes (502) a session of an extended reality application on the first electronic device and creates (504) a first map of a scene by the first electronic device. The first map has a first coordinate system. A second electronic device is configured to execute (506) the extended reality application and create a second map of the scene having a second coordinate system. The first electronic device determines (508) a second device pose of the second electronic device by the first electronic device in the first coordinate system. Based on the second device pose of the second electronic device, the first electronic device determines (510) a transformation relationship 408 between the first coordinate system of the first electronic device and the second coordinate system of the second electronic device. The first electronic device obtains (512) a second object pose of an object in the second coordinate system of the second map where the object is rendered, and converts (514) the second object pose to a first object pose in the first coordinate system based on the transformation relationship 408. While the object is rendered in the second map for the second electronic device, the first electronic device concurrently renders (516) the object in the first map for the first electronic device. The object has the first object pose in the first coordinate system.

[0062] In some embodiments, the first electronic device is physically aligned with the second electronic device. The first electronic device captures an image of the second electronic device. Based on the captured image of the second electronic device, the first electronic device determines the second device pose of the second electronic device in the first coordinate system.

[0063] In some embodiments, a plurality of cameras of the first electronic device are applied to identify a second position of the second electronic device. The plurality of cameras are distanced apart by a set of first distances. A 3D location of the second electronic device is triangulated within the first coordinate system based on the set of first distances and images captured by the plurality of cameras.

[0064] In some embodiments, the first electronic device uses one or more sensors to determine the second device pose of the second electronic device relative to the first electronic device and generates a 3D location of the second electronic device within the first coordinate system.

[0065] In some embodiments, the first electronic device determines an orientation of the second electronic device using one or more cameras coupled to or integrated in the first electronic device.

[0066] In some embodiments, the first electronic device is configured to be worn on a head of a first user and the second electronic device is configured to be worn on a head of a second user.

[0067] In some embodiments, the first electronic device or the second electronic device is a virtual reality headset. In some embodiments, the first electronic device or the second electronic device is an augmented reality headset.

[0068] In some embodiments, the second electronic device includes second features configured to be detected by the first electronic device.

[0069] In some embodiments, the first coordinate system and the second coordinate system are both coordinate systems in a three-dimensional (3D) space. Each of the first and second coordinate systems includes a respective 3D coordinate.

[0070] It should be understood that the particular order in which the operations in Figure 5 have been described is merely exemplary and is not intended to indicate that the described order is the only order in which the operations could be performed. One of ordinary skill in the art would recognize various ways to calibrate two XR devices 152 and 154 in space as described herein. Additionally, it should be noted that details of other processes described above with respect to Figures 1-4B are also applicable in an analogous manner to method 500 described above with respect to Figure 5. For brevity, these details are not repeated here.

[0071] The terminology used in the description of the various described embodiments herein is for the purpose of describing particular embodiments only and is not intended to be limiting. As used in the description of the various described embodiments and the appended claims, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will also be understood that the term “and/or” as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. It will be further understood that the terms “includes,” “including,” “comprises,” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. Additionally, it will be understood that, although the terms “first,” “second,” etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another.

[0072] As used herein, the term “if” is, optionally, construed to mean “when” or “upon” or “in response to determining” or “in response to detecting” or “in accordance with a determination that,” depending on the context. Similarly, the phrase “if it is determined” or “if [a stated condition or event] is detected” is, optionally, construed to mean “upon determining” or “in response to determining” or “upon detecting [the stated condition or event]” or “in response to detecting [the stated condition or event]” or “in accordance with a determination that [a stated condition or event] is detected,” depending on the context.

[0073] The foregoing description, for purpose of explanation, has been described with reference to specific embodiments. However, the illustrative discussions above are not intended to be exhaustive or to limit the claims to the precise forms disclosed. Many modifications and variations are possible in view of the above teachings. The embodiments were chosen and described in order to best explain principles of operation and practical applications, to thereby enable others skilled in the art.

[0074] Although various drawings illustrate a number of logical stages in a particular order, stages that are not order dependent may be reordered and other stages may be combined or broken out. While some reordering or other groupings are specifically mentioned, others will be obvious to those of ordinary skill in the art, so the ordering and groupings presented herein are not an exhaustive list of alternatives. Moreover, it should be recognized that the stages can be implemented in hardware, firmware, software or any combination thereof.