


Title:
DATA VISUALIZATION IN EXTENDED REALITY
Document Type and Number:
WIPO Patent Application WO/2023/086102
Kind Code:
A1
Abstract:
This application is directed to creating a data item in extended reality. An electronic device displays a user interface of a user application on top of a scene where the electronic device is disposed. A first user selection of a first visual object is received among one or more visual objects of the user interface. The first visual object includes predefined content information. A sticker data item associated with the first visual object is created to include at least the predefined content information. In response to a second user selection of a visual area of the scene, the electronic device displays a user-actionable sticker affordance, which represents the sticker data item associated with the first visual object, on top of the visual area of the scene. The visual area of the scene is optionally identified by a sticker-related device pose and a relative sticker location.

Inventors:
HUANG JINBIN (US)
MEI CHAO (US)
XU YI (US)
XIONG QI (US)
LIANG SHUANG (US)
GAO YU (US)
Application Number:
PCT/US2021/059224
Publication Date:
May 19, 2023
Filing Date:
November 12, 2021
Assignee:
INNOPEAK TECH INC (US)
International Classes:
G06F3/04815; G02B27/00; G06F3/01; G06F3/048; G06T7/20; G06T19/00
Foreign References:
US20190098291A1 (2019-03-28)
US20200090338A1 (2020-03-19)
US20190295326A1 (2019-09-26)
US20180350056A1 (2018-12-06)
Attorney, Agent or Firm:
WANG, Jianbai et al. (US)
Claims:
What is claimed is:

1. A method for creating a data item in extended reality, comprising: displaying, on a display of an electronic device, a user interface of a user application on top of a scene where the electronic device is disposed, the user interface including one or more visual objects; receiving a first user selection of a first visual object among the one or more visual objects of the user interface, the first visual object including predefined content information; creating a sticker data item associated with the first visual object, the sticker data item including at least the predefined content information; and in response to a second user selection of a visual area of the scene, displaying a user-actionable sticker affordance on top of the visual area of the scene, the user-actionable sticker affordance representing the sticker data item associated with the first visual object.

2. The method of claim 1, in response to the second user selection, displaying the user-actionable sticker affordance further comprising: in response to the second user selection of the visual area of the scene, identifying a sticker-related device pose and a relative sticker location, wherein the visual area of the scene is selected at the relative sticker location in a field of view of the electronic device when a camera of the electronic device is positioned according to the sticker-related device pose.

3. The method of claim 2, further comprising: storing the user-actionable sticker affordance, the sticker data item, and information of the user application in association with the sticker-related device pose and the relative sticker location; determining that the visual area appears in the field of view of the electronic device based on the sticker-related device pose and the relative sticker location; and in accordance with a determination that the visual area appears in the field of view of the electronic device and that the user application is executed, re-displaying the user-actionable sticker affordance on top of the visual area of the scene.

4. The method of any of claims 1-3, wherein the user interface of the user application is displayed on a fixed interface location of the display, independently of a variation of a device pose of the electronic device.


5. The method of claim 4, wherein in accordance with a determination that the device pose of the electronic device is within a sticker pose range associated with the visual area of the scene, a portion or all of the user-actionable sticker affordance is displayed on top of the visual area of the scene.

6. The method of any of claims 1-5, displaying the user-actionable sticker affordance further comprising: in accordance with a determination that a device pose of the electronic device is within a first pose range associated with the visual area of the scene, displaying a portion of the user-actionable sticker affordance on the display, the portion less than all of the user-actionable sticker affordance; and in accordance with a determination that the device pose of the electronic device is within a second pose range associated with the visual area of the scene, displaying all of the user-actionable sticker affordance on the display.

7. The method of claim 6, further comprising: in accordance with a determination that the device pose of the electronic device exceeds the first and second pose ranges associated with the visual area of the scene, aborting displaying all of the user-actionable sticker affordance on the display.

8. The method of any of claims 1-7, wherein the user interface of the user application is displayed on top of a fixed interface location of the scene, further comprising: in accordance with a determination that a device pose of the electronic device is within a third pose range associated with the fixed interface location of the scene, displaying a portion of the user interface on the display, the portion less than all of the user interface; and in accordance with a determination that a device pose is within a fourth pose range associated with the fixed interface location of the scene, displaying all of the user interface on the display, the third pose range distinct from the fourth pose range.

9. The method of claim 8, wherein the fixed interface location of the scene and the visual area of the scene have a distance, such that the user-actionable sticker affordance is not displayed concurrently with the user interface on the display.

10. The method of any of the preceding claims, wherein the user-actionable sticker affordance is displayed concurrently with the user interface on the display.


11. The method of any of the preceding claims, wherein displaying the user-actionable sticker affordance includes displaying a portion of the sticker data item.

12. The method of any of the preceding claims, further comprising: receiving user inputs of user-defined content data related to the first visual object, the sticker data item including the user-defined content data related to the first visual object.

13. The method of claim 12, further comprising: in response to a user action on the user-actionable sticker affordance, displaying the content information and the user-defined content data on the display of the electronic device.

14. The method of any of the preceding claims, wherein the electronic device is coupled to a mobile device, and the first user selection and the second user selection are received from the mobile device.

15. The method of claim 14, wherein the mobile device is electrically coupled to the electronic device via a wire or via one or more wireless communication networks, and includes at least one of a trackpad for receiving the first and second user selections and a signal emitter for providing the first and second user selections.

16. The method of any of the preceding claims, wherein the one or more visual objects of the user interface includes a second visual object, and the user-actionable sticker affordance includes a first user-actionable sticker affordance representing a first sticker data item and associated with a first visual area, the method further comprising: displaying a second user-actionable sticker affordance on top of a second visual area of the scene, the second user-actionable sticker affordance representing a second sticker data item associated with the second visual object of the user interface.

17. The method of any of claims 1-15, the scene including a first scene, the visual area including a first visual area, the method further comprising: in accordance with a determination that the electronic device is located in a second scene and that the user application is executed, displaying the user-actionable sticker affordance on top of a second visual area of the second scene, the second visual area substantially similar to the first visual area.

18. An electronic device, comprising: one or more processors; and memory having instructions stored thereon, which when executed by the one or more processors cause the processors to perform a method of any of claims 1-17.

19. A non-transitory computer-readable medium, having instructions stored thereon, which when executed by one or more processors cause the processors to perform a method of any of claims 1-17.

Description:
Data Visualization in Extended Reality

TECHNICAL FIELD

[0001] This application relates generally to display technology including, but not limited to, methods, systems, and non-transitory computer-readable media for displaying visual objects and supplemental information of a user interface of a user application in an extended reality environment.

BACKGROUND

[0002] Visual data analytic applications are often executed on computers to analyze and visualize data. A single display device having a limited size is not efficient in supporting visualization of big data, because it does not provide efficient interfaces to compare derived data views due to restricted display real estate. It is often difficult to navigate through the data using a small display device, thereby resulting in inefficient data provenance systems. It would be beneficial to have an efficient data visualization mechanism, particularly when non-conventional computer devices are involved.

SUMMARY

[0003] Various implementations of this application are directed to display technology and, more particularly, to systems and methods for visualizing visual objects and supplemental information of a user interface of a user application in a field of view of an electronic device that renders content of extended reality. Extended reality includes augmented reality (AR) in which virtual objects are overlaid on a view of a real physical world, virtual reality (VR) that includes only virtual content including a virtual background view, and mixed reality (MR) that combines both AR and VR and in which real world and digital objects interact. In some embodiments, the user application is a data visualization application that visualizes data in an AR environment, e.g., on a virtual wall facing a user. The visualized data are represented in one or more user-selectable visual objects on the user interface. A user is allowed to select a visual object and create a sticker data item including data information or supplemental information of the selected visual object. In some embodiments, the user is allowed to enter user-defined information related to the selected visual object onto the sticker data item. When a user selects a visual area in extended reality (e.g., an area on a virtual wall), a user-actionable sticker affordance representing the sticker data item is displayed on top of the visual area of the scene. In some implementations, a mobile device is used as a trackpad or a signal emitter for enabling user selections. By these means, an immersive visual data analytics system is created to take advantage of an extended reality device's spatialization ability and the mobile device's user interaction capability to provide an efficient data analysis experience, e.g., chart comparison and analysis reviewing.

[0004] In this application, a data visualization mechanism allows the user to place virtual objects of a data visualization application and supplemental information associated with the virtual objects in a three-dimensional (3D) space. The supplemental information is associated with a sticker affordance that is linked to a specific visual area in the 3D space, which allows a user to rely on his or her own personal memory of the visual area in the 3D space to quickly locate the sticker affordance. For example, the user places a sticker affordance with the supplemental information on a back of a real chair. When the user comes back to retrieve the supplemental information, a memory picture having the corresponding sticker affordance attached to the back of the chair helps the user navigate the 3D space and retrieve the supplemental information (i.e., the associated sticker data item) promptly.

[0005] In one aspect, a method is implemented to create a data item in extended reality. The method includes displaying, on a display of an electronic device, a user interface of a user application on top of a scene where the electronic device is disposed. The user interface includes one or more visual objects. The method further includes receiving a first user selection of a first visual object among the one or more visual objects of the user interface. The first visual object includes predefined content information. The method further includes creating a sticker data item associated with the first visual object, and the sticker data item includes at least the predefined content information. The method further includes, in response to a second user selection of a visual area of the scene, displaying a user-actionable sticker affordance on top of the visual area of the scene. The user-actionable sticker affordance represents the sticker data item associated with the first visual object.

[0006] In some embodiments, in response to the second user selection, displaying the user-actionable sticker affordance further includes, in response to the second user selection of the visual area of the scene, identifying a sticker-related device pose and a relative sticker location, wherein the visual area of the scene is selected at the relative sticker location in a field of view of the electronic device when a camera of the electronic device is positioned according to the sticker-related device pose. Further, in some embodiments, the method includes storing the user-actionable sticker affordance, the sticker data item, and information of the user application in association with the sticker-related device pose and the relative sticker location, and determining that the visual area appears in the field of view of the electronic device based on the sticker-related device pose and the relative sticker location. The method further includes, in accordance with a determination that the visual area appears in the field of view of the electronic device and that the user application is loaded and executed, re-displaying the user-actionable sticker affordance on top of the visual area of the scene.
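
The storage and re-display logic in this paragraph can be sketched in C++ as follows. This is a minimal illustration, not the claimed method: the pose representation (position plus yaw/pitch), the angular tolerance, and the names SavedSticker and shouldRedisplay are assumptions introduced here for clarity.

#include <cmath>
#include <string>

// Sketch only: a saved sticker pairs its sticker data item with the device pose and
// the relative location captured when the anchoring visual area was selected.
struct DevicePose { float x, y, z; float yawDeg, pitchDeg; };  // simplified position + orientation
struct RelativeLocation { float u, v; };                       // normalized location in the field of view

struct SavedSticker {
    std::string appId;           // information of the user application
    std::string stickerData;     // predefined content information (and any user-defined data)
    DevicePose anchorPose;       // sticker-related device pose captured at creation time
    RelativeLocation anchorLoc;  // relative sticker location within the field of view
};

// Re-display the affordance when the current pose is close enough to the anchor pose
// that the anchored visual area appears in the field of view, and the app is running.
bool shouldRedisplay(const SavedSticker& s, const DevicePose& current,
                     bool appIsRunning, float angleTolDeg = 30.0f) {
    if (!appIsRunning) return false;
    float dYaw   = std::fabs(current.yawDeg   - s.anchorPose.yawDeg);
    float dPitch = std::fabs(current.pitchDeg - s.anchorPose.pitchDeg);
    return dYaw <= angleTolDeg && dPitch <= angleTolDeg;
}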

[0007] In another aspect, some implementations include an electronic device including one or more processors and memory having instructions stored thereon, which when executed by the one or more processors cause the processors to perform any of the above methods.

[0008] In yet another aspect, some implementations include a non-transitory computer-readable medium, having instructions stored thereon, which when executed by one or more processors cause the processors to perform any of the above methods.

[0009] These illustrative embodiments and implementations are mentioned not to limit or define the disclosure, but to provide examples to aid understanding thereof.

Additional embodiments are discussed in the Detailed Description, and further description is provided there.

BRIEF DESCRIPTION OF THE DRAWINGS

[0010] For a better understanding of the various described implementations, reference should be made to the Detailed Description below, in conjunction with the following drawings in which like reference numerals refer to corresponding parts throughout the figures.

[0011] Figure 1A is an example data processing environment having one or more servers communicatively coupled to one or more client devices, in accordance with some embodiments, and Figure 1B is an example local electronic system that includes a head-mounted display (HMD) and a mobile device, in accordance with some embodiments.

[0012] Figure 2 is a flowchart of a process for processing inertial sensor data and image data of an electronic system using a SLAM module, in accordance with some embodiments.

[0013] Figure 3 is a graphic user interface (GUI) displayed in an extended reality environment, in accordance with some embodiments.

[0014] Figures 4A-4C illustrate an example scene where an HMD is disposed and displays different fields of view, in accordance with some embodiments.

[0015] Figures 5A-5E are diagrams illustrating five distinct user actions applied on a touchscreen of a mobile device to control content display by an HMD, in accordance with some embodiments.

[0016] Figure 6 is a block diagram illustrating an electronic device, in accordance with some embodiments.

[0017] Figure 7 is a block diagram illustrating a mobile device that is coupled to the electronic device shown in Figure 6, in accordance with some embodiments.

[0018] Figure 8 is a flow diagram of an example method of creating a data item in extended reality, in accordance with some embodiments.

[0019] Like reference numerals refer to corresponding parts throughout the several views of the drawings.

DETAILED DESCRIPTION

[0020] Reference will now be made in detail to specific embodiments, examples of which are illustrated in the accompanying drawings. In the following detailed description, numerous non-limiting specific details are set forth in order to assist in understanding the subject matter presented herein. But it will be apparent to one of ordinary skill in the art that various alternatives may be used without departing from the scope of claims and the subject matter may be practiced without these specific details. For example, it will be apparent to one of ordinary skill in the art that the subject matter presented herein can be implemented on many types of electronic devices with digital video capabilities.

[0021] In various embodiments of this application, an electronic system displays a user interface of a user application and one or more user-actionable sticker affordances on a background view that is associated with a 3D virtual space. This 3D virtual space corresponds to a real physical world in mixed and augmented reality, or is rendered in virtual reality by the electronic system. The one or more user-actionable sticker affordances relate to visual objects of the user interface, and provide a user with shortcuts to access excerpted or user-created information associated with the visual objects promptly, without going through a full path of the user application leading to the visual objects. Specifically, each sticker affordance is linked to a sticker data item, and the sticker data item includes predefined content information and/or user-defined content data of a respective visual object of the user interface. Each sticker affordance is disposed on a visual area to be associated with the visual area on the background view. When displayed on the respective sticker affordance, a corresponding sticker data item is optionally collapsed and hidden partially or entirely within the respective sticker affordance. Stored in the sticker affordances, the sticker data items associated with visual objects of the user interface are conveniently bookmarked and loaded to facilitate quick access. In some embodiments, a set of sticker affordances are spatially grouped on the background view, allowing a user to compare corresponding sticker data items. When a sticker affordance is selected, the corresponding sticker data is displayed on a foreground for review. Additionally, association of the sticker affordances with the visual areas in the scene allows a user to remember the locations of the sticker affordances in the virtual 3D space, thereby facilitating a reload of the corresponding sticker data items.
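
The data model implied by this paragraph can be illustrated with a short C++ sketch: a sticker data item carries excerpted (predefined) content plus optional user-defined notes, and its affordance renders either a collapsed preview or the full item. The type names, the preview length, and the example strings are illustrative assumptions, not part of the disclosure.

#include <cstddef>
#include <iostream>
#include <string>

// Sketch only: a sticker data item excerpted from a visual object of the user interface.
struct StickerDataItem {
    std::string predefinedContent;  // content information copied from the visual object
    std::string userDefinedNotes;   // optional annotation entered by the user
};

struct StickerAffordance {
    StickerDataItem item;
    bool expanded = false;          // collapsed by default when pinned to a visual area

    // When collapsed, only a short preview of the data item is rendered on the affordance.
    std::string displayText(std::size_t previewChars = 24) const {
        if (expanded) return item.predefinedContent + "\n" + item.userDefinedNotes;
        return item.predefinedContent.substr(0, previewChars) + "...";
    }
};

int main() {
    StickerAffordance sticker;
    sticker.item = {"Gross GDP by state (bar chart excerpt)", "Compare two states next week"};
    std::cout << sticker.displayText() << "\n";  // collapsed preview on the affordance
    sticker.expanded = true;                     // e.g., after the user selects the affordance
    std::cout << sticker.displayText() << "\n";  // full sticker data item in the foreground
}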

[0022] In some embodiments, sticker affordances are controlled (created, zoomed, panned, sorted, selected, annotated) using different user interactions enabled by a mobile device. Optionally, the mobile device is held in a portrait mode to act as a virtual laser pointer (i.e., a signal emitter). The sticker affordances are controlled with the virtual laser pointer formed by the mobile device. Optionally, the mobile device is rotated and held in a landscape mode. A touchscreen of the mobile device becomes a trackpad to allow the user to manipulate content of a sticker affordance.
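
A brief C++ sketch of the orientation-based mode switch described above; the enum, the function name, and the use of screen dimensions to detect orientation are assumptions made for illustration.

// Sketch only: choose the interaction mode from the mobile device's current orientation.
enum class InputMode { LaserPointer, Trackpad };

// Assumed convention: width/height of the screen in its current orientation, in pixels.
InputMode selectInputMode(int screenWidthPx, int screenHeightPx) {
    // Portrait (taller than wide): the device acts as a virtual laser pointer (signal emitter).
    // Landscape (wider than tall): the touchscreen acts as a trackpad.
    return (screenHeightPx >= screenWidthPx) ? InputMode::LaserPointer : InputMode::Trackpad;
}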

[0023] Figure 1A is an example data processing environment 100 having one or more servers 102 communicatively coupled to one or more client devices 104, in accordance with some embodiments. The one or more client devices 104 may be, for example, desktop computers 104A, tablet computers 104B, mobile phones 104C, augmented reality (AR) glasses 104D, or intelligent, multi-sensing, network-connected home devices (e.g., a camera). Each client device 104 can collect data or user inputs, execute user applications, and present outputs on its user interface. The collected data or user inputs can be processed locally at the client device 104 and/or remotely by the server(s) 102. The one or more servers 102 provide system data (e.g., boot files, operating system images, and user applications) to the client devices 104, and in some embodiments, process the data and user inputs received from the client device(s) 104 when the user applications are executed on the client devices 104. In some embodiments, the data processing environment 100 further includes a storage 106 for storing data related to the servers 102, client devices 104, and applications executed on the client devices 104.

[0024] The one or more servers 102 can enable real-time data communication with the client devices 104 that are remote from the one or more servers 102. Further, in some embodiments, the one or more servers 102 can implement data processing tasks that cannot be or are preferably not completed locally by the client devices 104. For example, the client devices 104 include a game console that executes an interactive online gaming application. The game console receives a user instruction and sends it to a game server 102 with user data. The game server 102 generates a stream of video data based on the user instruction and user data and provides the stream of video data for display on the game console and other client devices that are engaged in the same game session with the game console. In another example, the client devices 104 include a networked surveillance camera and a mobile phone 104C. The networked surveillance camera collects video data and streams the video data to a surveillance camera server 102 in real time. While the video data is optionally pre-processed on the surveillance camera, the surveillance camera server 102 processes the video data to identify motion or audio events in the video data and share information of these events with the mobile phone 104C, thereby allowing a user of the mobile phone 104C to monitor the events occurring near the networked surveillance camera in real time and remotely.

[0025] The one or more servers 102, one or more client devices 104, and storage 106 are communicatively coupled to each other via one or more communication networks 108, which are the medium used to provide communications links between these devices and computers connected together within the data processing environment 100. The one or more communication networks 108 may include connections, such as wire, wireless communication links, or fiber optic cables. Examples of the one or more communication networks 108 include local area networks (LAN), wide area networks (WAN) such as the Internet, or a combination thereof. The one or more communication networks 108 are, optionally, implemented using any known network protocol, including various wired or wireless protocols, such as Ethernet, Universal Serial Bus (USB), FIREWIRE, Long Term Evolution (LTE), Global System for Mobile Communications (GSM), Enhanced Data GSM Environment (EDGE), code division multiple access (CDMA), time division multiple access (TDMA), Bluetooth, Wi-Fi, voice over Internet Protocol (VoIP), Wi-MAX, or any other suitable communication protocol. A connection to the one or more communication networks 108 may be established either directly (e.g., using 3G/4G connectivity to a wireless carrier), or through a network interface 110 (e.g., a router, switch, gateway, hub, or an intelligent, dedicated whole-home control node), or through any combination thereof. As such, the one or more communication networks 108 can represent the Internet, a worldwide collection of networks and gateways that use the Transmission Control Protocol/Internet Protocol (TCP/IP) suite of protocols to communicate with one another. At the heart of the Internet is a backbone of high-speed data communication lines between major nodes or host computers, consisting of thousands of commercial, governmental, educational, and other computer systems that route data and messages.

[0026] Deep learning techniques are applied in the data processing environment 100 to process content data (e.g., video, image, audio, or textual data) obtained by an application executed at a client device 104 to identify information contained in the content data, match the content data with other data, categorize the content data, or synthesize related content data. In these deep learning techniques, data processing models are created based on one or more neural networks to process the content data. These data processing models are trained with training data before they are applied to process the content data. In some embodiments, both model training and data processing are implemented locally at each individual client device 104 (e.g., the client device 104C). The client device 104C obtains the training data from the one or more servers 102 or storage 106 and applies the training data to train the data processing models. Subsequently to model training, the client device 104C obtains the content data (e.g., captures video data via an internal camera) and processes the content data using the trained data processing models locally. Alternatively, in some embodiments, both model training and data processing are implemented remotely at a server 102 (e.g., the server 102A) associated with a client device 104 (e.g., the client device 104A). The server 102A obtains the training data from itself, another server 102, or the storage 106 and applies the training data to train the data processing models. The client device 104A obtains the content data, sends the content data to the server 102A (e.g., in an application) for data processing using the trained data processing models, receives data processing results from the server 102A, and presents the results on a user interface (e.g., associated with the application). The client device 104A itself implements no or little data processing on the content data prior to sending them to the server 102A.

[0027] Additionally, in some embodiments, data processing is implemented locally at a client device 104 (e.g., the AR glasses 104D or a mobile device 104C coupled to the AR glasses 104D), while model training is implemented remotely at a server 102 (e.g., the server 102B) associated with the client device 104. The server 102B obtains the training data from itself, another server 102 or the storage 106 and applies the training data to train the data processing models. The trained data processing models are optionally stored in the server 102B or storage 106. The AR glasses 104D or mobile device 104C imports the trained data processing models from the server 102B or storage 106, processes the content data using the data processing models, and generates data processing results to be presented on a user interface locally.

[0028] In some embodiments, the AR glasses 104D (also called a head-mounted display (HMD)) are communicatively coupled in a data processing environment 100, and include a camera 112, a microphone, a speaker, and a display. The camera 112 and microphone are configured to capture video and audio data from a scene of the AR glasses 104D. In some situations, the camera 112 captures images including hand gestures of a user wearing the AR glasses 104D. In some situations, the microphone records ambient sound, including the user's voice commands. The video or audio data captured by the camera 112 or microphone is processed by the AR glasses 104D, server(s) 102, or both to recognize the hand gestures and related user instructions. Optionally, deep learning techniques are applied by the server(s) 102, AR glasses 104D, or both to recognize the hand gestures and user instructions. The user instructions are used to control the AR glasses 104D itself or interact with an application (e.g., a gaming application) executed by the AR glasses 104D. In some embodiments, the display of the AR glasses 104D displays a user interface, and the recognized user instructions are used to interact with user-selectable display items on the user interface, thereby enabling predefined operations to be performed on objects of the application executed by the AR glasses 104D.

[0029] Alternatively, in some embodiments, the AR glasses 104D are tethered to a mobile device 104C via a wire or communicatively coupled to the mobile device 104C via one or more communication networks 108 that optionally include the local network 110. The mobile device 104C is used as a trackpad or a signal emitter of the AR glasses 104D to facilitate user interaction with a user interface that is displayed with AR content. In some embodiments, the user interface is associated with a data visualization application and includes one or more virtual objects. A sticker affordance is generated based on one of the virtual objects in the user interface, and displayed with the AR content. The sticker affordance is displayed external to or overlaid on top of the user interface.

[0030] Figure 1B is an example local electronic system 150 that includes a head-mounted display (HMD) 104D and a mobile device 104C, in accordance with some embodiments. In some embodiments, the mobile device 104C is coupled to the HMD 104D via one or more communication networks 108 (e.g., a wire, a WAN, a LAN 110, a combination of the WAN and LAN). The mobile device 104C is fixed in a scene or held by hand while the HMD 104D is worn by a user. Alternatively, in some embodiments, the mobile device 104C is connected to the HMD 104D via a wire 152, and the user may hold the mobile device 104C by hand while wearing the HMD 104D.

[0031] In some embodiments, the HMD 104D can be Augmented Reality (AR) glasses, a Virtual Reality (VR) headset, or smart glasses without a full 3D display. In some embodiments, the mobile device 104C is a mobile phone, a tablet computer, or a laptop computer. The HMD 104D is coupled to the mobile device 104C via a Universal Serial Bus (USB) cable, or wirelessly via Wi-Fi, Bluetooth, or the Internet. In an example, AR glasses 104D work in tandem with a mobile phone 104C and implement a multi-modal interaction paradigm. The mobile device 104C is used as an input device for user interaction with objects displayed in the AR glasses. A user uses the mobile device 104C as a pointing device and uses a touchscreen of the mobile phone for various input tasks (e.g., swipe, button click, text input, etc.). Alternatively, in some embodiments, hand gesture estimation is directly used for user interaction. The HMD 104D uses a hand gesture estimation output to recognize an air gesture. The HMD 104D performs predefined operations (e.g., launching an application, exiting from an application, dismissing a message, generating a sticker data item, posting a sticker affordance) based on user actions on the mobile device 104C or the air gesture. Such a multi-modal interaction paradigm enables convenient, prompt, and accurate user interaction with the AR glasses 104D.

[0032] Figure 2 is a flowchart of a process 200 for processing inertial sensor data and image data of an electronic system (e.g., a server 102, a client device 104, or a combination of both) using a SLAM module (e.g., 632 in Figure 6), in accordance with some embodiments. The process 200 includes measurement preprocessing 202, initialization 204, local visual-inertial odometry (VIO) with relocation 206, and global pose graph optimization 208. In measurement preprocessing 202, a camera 112 captures image data of a scene at an image frame rate (e.g., 30 FPS), and features are detected and tracked (210) from the image data. An inertial measurement unit (IMU) 212 measures inertial sensor data at a sampling frequency (e.g., 1000 Hz) concurrently with the camera 112 capturing the image data, and the inertial sensor data are pre-integrated (213) to provide pose data. In initialization 204, the image data captured by the camera 112 and the inertial sensor data measured by the IMU 212 are temporally aligned (214). Vision-only structure from motion (SfM) techniques 214 are applied (216) to couple the image data and inertial sensor data, estimate three-dimensional structures, and map the scene of the camera 112.

[0033] After initialization 204 and in relocation 206, a sliding window 218 and associated states from a loop closure 220 are used to optimize (222) a VIO. When the VIO corresponds (224) to a keyframe of a smooth video transition and a corresponding loop is detected (226), features are retrieved (228) and used to generate the associated states from the loop closure 220. In global pose graph optimization 208, a multi-degree-of-freedom (multi-DOF) pose graph is optimized (230) based on the states from the loop closure 220, and a keyframe database 232 is updated with the keyframe associated with the VIO.

[0034] Additionally, the features that are detected and tracked (210) are used to monitor (234) motion of an object in the image data and estimate image-based poses 236, e.g., according to the image frame rate. In some embodiments, the inertial sensor data that are pre-integrated (213) may be propagated (238) based on the motion of the object and used to estimate inertial-based poses 240, e.g., according to the sampling frequency of the IMU 212. The image-based poses 236 and the inertial-based poses 240 are stored in the pose data buffer 236 and used by the SLAM module 632 to estimate and predict poses that are used by the pose-based rendering module 634 or 734. Alternatively, in some embodiments, the SLAM module 632 receives the inertial sensor data measured by the IMU 212 and obtains image-based poses 236 to estimate and predict more poses that are further used by the pose-based rendering module 634 or 734.
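
The propagation step described in paragraph [0034] can be sketched in simplified C++: the most recent image-based orientation is advanced with gyroscope increments at the IMU rate until the next camera-based estimate arrives. The quaternion representation, the body-frame convention, and the neglect of accelerometer and position terms are simplifying assumptions for illustration only, not the patented algorithm.

#include <cmath>

// Sketch only: propagate the latest image-based orientation with one IMU gyro sample.
struct Quat { float w, x, y, z; };

// Hamilton product of two quaternions.
Quat multiply(const Quat& a, const Quat& b) {
    return { a.w*b.w - a.x*b.x - a.y*b.y - a.z*b.z,
             a.w*b.x + a.x*b.w + a.y*b.z - a.z*b.y,
             a.w*b.y - a.x*b.z + a.y*b.w + a.z*b.x,
             a.w*b.z + a.x*b.y - a.y*b.x + a.z*b.w };
}

// Integrate one gyroscope sample (rad/s) over dt seconds into the current orientation.
Quat propagate(const Quat& q, float wx, float wy, float wz, float dt) {
    float omega = std::sqrt(wx*wx + wy*wy + wz*wz);  // angular speed
    float theta = omega * dt;                        // rotation angle over this sample
    if (theta < 1e-8f) return q;                     // negligible rotation
    float s = std::sin(0.5f * theta) / omega;        // scales the rotation axis
    Quat dq{ std::cos(0.5f * theta), wx * s, wy * s, wz * s };
    return multiply(q, dq);                          // apply the body-frame increment
}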

[0035] In SLAM, high-frequency pose estimation is enabled by sensor fusion, which relies on data synchronization between imaging sensors and the IMU 212. The imaging sensors (e.g., cameras, lidars) provide image data desirable for pose estimation, and oftentimes operate at a low frequency (e.g., 30 frames per second) and with a large latency (e.g., 30 milliseconds). Conversely, the IMU 212 can measure inertial sensor data and operate at a very high frequency (e.g., 1000 samples per second) and with a negligible latency (e.g., < 0.1 millisecond). Asynchronous time warping (ATW) is often applied in an AR system to warp an image before it is sent to a display to correct for head movement that occurs after the image is rendered. ATW algorithms reduce a latency of the image, increase or maintain a frame rate, or reduce judders caused by missing image frames. In both SLAM and ATW, relevant image data and inertial sensor data are stored locally such that they can be synchronized and used for pose estimation/prediction. In some embodiments, the image and inertial sensor data are stored in one of multiple STL containers, e.g., std::vector, std::queue, std::list, etc., or other self-defined containers. These containers are generally very convenient to use. The image and inertial sensor data are stored in the STL containers with their timestamps, and the timestamps are used for data search, data insertion, and data organization.
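
As one concrete reading of the timestamp-keyed buffering described here, the C++ sketch below keeps IMU samples in an STL container and uses a binary search over timestamps to pair a camera frame with its nearest inertial sample. The sample layout and the microsecond convention are assumptions.

#include <algorithm>
#include <cstdint>
#include <iterator>
#include <vector>

// Sketch only: timestamped IMU samples kept in an STL container and searched by time.
struct ImuSample { int64_t tUs; float gx, gy, gz, ax, ay, az; };  // timestamp in microseconds

// Samples arrive in time order, so appending keeps the vector sorted by timestamp.
void insertSample(std::vector<ImuSample>& buf, const ImuSample& s) { buf.push_back(s); }

// Find the IMU sample whose timestamp is closest to an image frame's timestamp.
const ImuSample* nearestSample(const std::vector<ImuSample>& buf, int64_t frameTUs) {
    if (buf.empty()) return nullptr;
    auto it = std::lower_bound(buf.begin(), buf.end(), frameTUs,
                               [](const ImuSample& s, int64_t t) { return s.tUs < t; });
    if (it == buf.begin()) return &*it;
    if (it == buf.end()) return &buf.back();
    auto prev = std::prev(it);
    return (frameTUs - prev->tUs <= it->tUs - frameTUs) ? &*prev : &*it;
}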

[0036] Figure 3 is a graphic user interface (GUI) 300 displayed in an extended reality environment, in accordance with some embodiments. The extended reality environment is enabled by an electronic system having a display. In an example, the electronic system includes an HMD 104D having the display and a mobile device 104C that is coupled to the HMD 104D and configured to enable user interaction for the HMD 104D. The GUI 300 corresponds to a field of view of the electronic device, and includes part of the extended reality environment (e.g., a background view 302). The background view 302 optionally corresponds to a view of a real physical world or a virtual world. The background view 302 varies with a device pose of the electronic device that includes a device position and a device orientation. A user application is executed by the electronic system and enables a user interface 304 on the display. The user interface 304 is overlaid on the background view 302, covering part of the extended reality environment. Referring to Figure 3, the user application is a data analysis and visualization application, and data are visualized as five charts 306 on the user interface 304. The five charts 306 include at least a bar chart 306A, a map chart 306B, a circular connector chart 306C, and a line chart 306D. A user selects a region of interest (A) within the bar chart 306A on the user interface 304. In some embodiments, the region A encompasses the entire bar chart 306A. A sticker affordance 308A is created for the selected region (A) based on the bar chart 306A and disposed on another region (B) on the background view 302.

[0037] The bar chart 306A includes predefined content information describing data illustrated on the bar chart 306A, and the predefined content information is optionally displayed or not displayed on the bar chart 306A. A sticker data item is created for the selected region of interest from the bar chart 306A, and includes at least part of the predefined content information of the bar chart 306A. Additionally, in some embodiments, the electronic system receives user inputs of user-defined content data related to the bar chart 306A, and the sticker data item further includes the user-defined content data related to the bar chart 306A. The user selects a visual area (B) of the background view 302, and a user-actionable sticker affordance 308A is displayed on top of the selected visual area (B) of the background view 302, the user-actionable sticker affordance 308A representing the sticker data item associated with a region of interest of the bar chart 306A. In some embodiments, the sticker affordance 308A is displayed external to the user interface 304. Alternatively, in some embodiments not shown, the sticker affordance 308A is displayed on top of the user interface 304.

[0038] In some embodiments, when the sticker affordance 308A is displayed, the content information and data are collapsed into the sticker affordance 308A, and are not displayed or are only partially displayed. In response to a user selection of the sticker affordance 308A, the sticker affordance 308A opens on the foreground, e.g., in front of the user interface 304, to present the predefined content information and/or user-defined content data.

[0039] In some embodiments, the selection of the visual area (B) of the background view 302 is implemented when the HMD 104D is moved to a sticker-related device pose. In response to the selection of the visual area (B) of the background view 302, the electronic system identifies the sticker-related device pose and a relative sticker location. The visual area (B) of the background view 302 is selected at the relative sticker location in a field of view of the electronic system when a camera of the electronic system (i.e., the HMD 104D) is positioned according to the sticker-related device pose. At this sticker-related device pose, the user interface 304 is entirely displayed, partially displayed, or not displayed in the field of view of the camera of the electronic system. Additionally, in some embodiments, the user-actionable sticker affordance 308A is stored with the sticker data item associated with the selected bar chart 306A, information of the user application, the sticker-related device pose, and the relative sticker location. Subsequently, when the user application is being executed and the visual area (B) appears in the field of view of the HMD 104D, the user-actionable sticker affordance 308A is automatically re-displayed on top of the visual area (B) of the background view 302.

[0040] Alternatively, in some embodiments, the background view 302 is associated with a 3D virtual space including a plurality of feature points. In response to the selection of the visual area (B) of the background view 302, a set of adjacent feature points are identified and associated with the visual area (B) of the background view 302. Further, in some embodiments, the user-actionable sticker affordance 308A is stored with the sticker data item associated with the selected bar chart 306A, information of the user application, and the set of feature points. Subsequently, when the user application is being executed and the set of feature points appear in the field of view of the HMD 104D, the visual area (B) appears in the field of view of the HMD 104D, and the user-actionable sticker affordance 308A is automatically re-displayed on top of the visual area (B) of the background view 302.
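
One way to realize the feature-point anchoring in this paragraph is sketched below in C++: the k map points nearest to the selected visual area are stored with the sticker, and the sticker is considered re-anchored when enough of those points are visible again. The value of k, the distance tolerance, and the "at least half visible" rule are assumptions, not the claimed criteria.

#include <algorithm>
#include <cstddef>
#include <vector>

// Sketch only: anchor a sticker to the map feature points closest to the selected area.
struct Point3 { float x, y, z; };

static float dist2(const Point3& a, const Point3& b) {
    float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return dx*dx + dy*dy + dz*dz;
}

// Pick the k feature points nearest to the 3D center of the selected visual area.
std::vector<Point3> anchorPoints(std::vector<Point3> mapPoints, const Point3& areaCenter,
                                 std::size_t k = 8) {
    std::sort(mapPoints.begin(), mapPoints.end(),
              [&](const Point3& a, const Point3& b) {
                  return dist2(a, areaCenter) < dist2(b, areaCenter);
              });
    if (mapPoints.size() > k) mapPoints.resize(k);
    return mapPoints;
}

// Re-display the sticker when at least half of its anchor points are tracked again.
bool anchorVisible(const std::vector<Point3>& anchors,
                   const std::vector<Point3>& visiblePoints, float tol = 0.05f) {
    std::size_t hits = 0;
    for (const auto& a : anchors)
        for (const auto& v : visiblePoints)
            if (dist2(a, v) <= tol * tol) { ++hits; break; }
    return !anchors.empty() && hits * 2 >= anchors.size();
}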

[0041] In some embodiments, the electronic system further displays a second user-actionable sticker affordance 308B on top of a second visual area of the background view 302. The second user-actionable sticker affordance 308B represents a second sticker data item associated with the same or a distinct visual object (e.g., the chart 306A, 306B, 306C, or 306D) of the user interface 304. In some embodiments, a plurality of (e.g., more than two) sticker affordances 308A, 308B, 308C (also collectively or individually referred to as 308) are created and displayed on the background view 302. In some embodiments, the plurality of sticker affordances 308 are displayed adjacent to each other, thereby facilitating comparison of the predefined content information of the corresponding virtual objects (e.g., charts 306).

[0042] In some embodiments, the user interface 304 is fixed on the background view 302 and is displayed at least partially on the display of the electronic system when the HMD 104D corresponds to a device pose in an interface pose range 310. The interface pose range refers to the range of device poses in which the user can see the user interface. Conversely, the sticker affordance 308A is fixed on the visual area (B) on the background view 302, and is displayed at least partially on the display when the HMD 104D corresponds to a device pose in a sticker pose range 312. The sticker pose range refers to the range of device poses in which the user can see the sticker. In some embodiments, the interface pose range 310 and the sticker pose range 312 partially overlap. When the device pose of the HMD 104D is in a corresponding overlap range 320, both the user interface 304 and the sticker affordance 308A are displayed at least partially on the display. In contrast, when the device pose of the HMD 104D is out of the corresponding overlap range 320, at most one of the user interface 304 and the sticker affordance 308A is displayed, partially or entirely, on the display.
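
A minimal C++ sketch of these range tests, under the simplifying assumption that the interface pose range 310 and the sticker pose range 312 can each be approximated by a yaw interval; a device pose that satisfies both tests lies in the overlap range 320, where the user interface and the sticker affordance are both at least partially visible.

// Sketch only: a pose range approximated as a closed yaw interval in degrees.
struct PoseRange { float minYawDeg, maxYawDeg; };

bool inRange(float yawDeg, const PoseRange& r) {
    return yawDeg >= r.minYawDeg && yawDeg <= r.maxYawDeg;
}

struct Visibility { bool interfaceVisible; bool stickerVisible; };

// Both flags are true only when the device yaw falls in the overlap of the two ranges.
Visibility visibleAt(float yawDeg, const PoseRange& interfaceRange, const PoseRange& stickerRange) {
    return { inRange(yawDeg, interfaceRange), inRange(yawDeg, stickerRange) };
}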

[0043] In some embodiments, the user wearing the HMD 104D enters a second scene that is different from the scene where a sticker affordance 308A is created and attached to a first visual area. A distinct background view is loaded for the HMD 104D in the second scene if the HMD 104D is a VR device. A distinct background view naturally occurs for the HMD 104D in the second scene if the HMD 104D is an AR device. In accordance with a determination that the HMD 104D is located in the second scene and that the user application is executed, the HMD 104D displays the user-actionable sticker affordance 308A on top of a second visual area of the second scene. The second visual area is substantially similar to the first visual area. For example, the second visual area is a wall on a left side of the second background view.

[0044] Figure 4A is an example scene 400 where an HMD 104D is disposed, in accordance with some embodiments. The HMD 104D has a field of view 402 that does not cover the entire scene 400. The field of view 402 moves in the scene 400 to include different portions of the scene 400 in response to a variation of a device pose of the HMD 104D. Referring to Figure 4A, the HMD 104D has a first device pose including a first device position and a first device orientation. The field of view 402 corresponding to the first device pose includes a user interface 304 of a user application. The scene 400 is associated with a plurality of user-actionable sticker affordances 308A-308E that are associated with different visual areas of the scene 400. For example, the sticker affordances 308A and 308B are associated with a left wall. The sticker affordances 308C, 308D, and 308E are associated with a floor, a window, and a right wall of the scene 400, respectively. At the first device pose, the field of view 402 includes a lower half portion of the sticker affordance 308D and a right top corner of the sticker affordance 308C, while the other sticker affordances 308A, 308B, and 308E cannot be seen by the HMD 104D without varying the first device pose.

[0045] In some embodiments, a user uses a touchscreen of a mobile device 104C as a trackpad to control a virtual cursor and manipulate visual objects displayed on the user interface 304 via one or more touchscreen gestures. In this example, the user interface 304 only includes a visual object 306 (e.g., a bar chart 306A) which is selected and expanded from the plurality of visual objects 306 in Figure 3. The visual object 306A further includes four types of visual elements 404: data item 404A, axis 404B, canvas 404C, and tool option 404D. A data item 404A is a data sample visualized on the user interface 304. For example, a bar is a data item in a bar chart and a dot is a data item in a scatter plot. A set of data items 404A collectively form visualization content. The axis 404B is optionally present, resides in the margin of the visual object 306, and represents an attribute of each data item 404A, e.g., population of a country, horsepower of a vehicle. In this example, an x-axis represents a nominal attribute - “state”, and a y-axis 404B represents a numerical attribute - “gross GDP”. The canvas 404C is any blank area of the visualization, and a tool option 404D is activated optionally when the user applies a user action on the canvas 404C (e.g., a long tap on the canvas 404C). For example, the tool option 404D allows the user to sort the data items 404A (bars) by GDP or start a sticker creation process. In some embodiments, the tool option 404D triggers a sticker creation window 408E. Alternatively, in some embodiments, the user interface 304 includes a sticker creating affordance 404F, and the sticker creation window 408E is activated in response to a user action on the sticker creating affordance 404F. On the tool option 404D, predefined content information 410 is automatically displayed for a sticker affordance 308 to be created, and the user optionally chooses to enter user-defined content data 412 to be associated with the sticker affordance 308A to be created for the instant visual object 306A. Further, in some embodiments, the user interface 304 displays a cursor that hovers on the canvas 404C and is movable under control of user inputs received on the touchscreen of the mobile device 104C.

[0046] In an example, the user applies an extended press action on a touchscreen of the mobile device 104C coupled to the HMD 104D to launch a plurality of tool options 404D, and selects a “start brushing” affordance to enter a group selection state. In the group selection state, the user drags and draws a rectangular region to select the bar chart 306A. After the user finishes a corresponding brushing procedure, a prompt window 404E pops up, allowing the user to choose to generate a sticker affordance 308A of the selected region. When the user hits “DONE” 406, the sticker affordance 308A of the selected region is created based on the selected bar chart 306A, and the sticker creation window 408E shrinks to the sticker affordance 308A, displaying part of a corresponding sticker data item, and is attached to a visual area of the scene.

[0047] After the user-actionable sticker affordances 308A-308E are created and attached to corresponding visual areas of the background view 302, the user of the HMD 104D selects one of the sticker affordances 308A-308E by a user action (e.g., a click). In response to the user action, the predefined content information 410 corresponding to the selected one of the sticker affordances 308A-308E is displayed on the display of the HMD 104D, e.g., as a foreground object 404E on the field of view 402 of the HMD 104D. The foreground object 404E is optionally the same as the sticker creation window 408E. The selected one of the sticker affordances 308A-308E is expanded to include the predefined content information 410 and displayed on top of the user interface 304. In some embodiments, in response to the user action of selecting the one of the sticker affordances 308A-308E, the user-defined content data 412 corresponding to the selected sticker affordance 308 is also displayed with the predefined content information 410 as the foreground object.

[0048] In some embodiments, the user interface 304 of the user application is displayed at a fixed interface location of the background view 302 associated with a scene where the HMD 104D is disposed. As the HMD 104D moves, the field of view 402 moves with respect to both the background view 302 and the user interface 304. Referring to Figures 4A and 4B, a field of view 422 of the HMD 104D in Figure 4B moves to the left and upward compared with the field of view 402 of the HMD 104D in Figure 4A. The HMD 104D has a second device pose in Figure 4B. The user interface 304 is only partially displayed on the HMD 104D, while the sticker affordances 308B and 308D are entirely enclosed in the field of view 422. In some embodiments not shown, the fixed locations of the user interface 304 and a certain sticker affordance 308 have a distance that is larger than a threshold distance, such that the sticker affordance 308 is never displayed concurrently with the user interface 304 on the display of the HMD 104D.

[0049] Specifically, in accordance with a determination that a device pose of the HMD 104D is within a third pose range 310A associated with the fixed interface location of the scene, the HMD 104D displays a portion of the user interface 304 on the display, the portion less than all of the user interface. The third pose range 310A includes the second device pose of the HMD 104D in Figure 4B. In accordance with a determination that a device pose of the HMD 104D is within a fourth pose range 310B associated with the fixed interface location of the scene, the HMD 104D displays all of the user interface 304 on the display. The fourth pose range 310B includes the first device pose associated with Figure 4A. The third pose range and the fourth pose range are exclusive to each other, and collectively form the interface pose range 310.

[0050] Referring to Figure 4B, in some embodiments, in accordance with a determination that a device pose of the HMD 104D is within a first pose range 312A associated with the visual area of the scene of the sticker affordance 308A, the HMD 104D displays a portion of the user-actionable sticker affordance 308A on the display of the HMD 104D. The portion is less than all of the sticker affordance 308A. In accordance with a determination that the device pose of the electronic device is within a second pose range 312B associated with the visual area of the scene of the sticker affordance 308A, the HMD 104D displays all of the user-actionable sticker affordance 308A on the display. Referring to Figure 4C, in some embodiments, in accordance with a determination that the device pose of the electronic device exceeds the first and second pose ranges 312A and 312B associated with the visual area of the scene, the HMD 104D aborts displaying all of the user-actionable sticker affordance 308A on the display. The first and second pose ranges 312A and 312B collectively form the sticker pose range 312 in Figure 3.

[0051] In some embodiments, the user interface 304 of the user application is displayed at a fixed interface location of the display of the HMD 104D, independently of a variation of a device pose of the HMD 104D. The user interface 304 is fixed with respect to the field of view 402 of the HMD 104D, while the field of view 402 of the HMD 104D moves with respect to the background view 302. Referring to Figures 4A and 4C, a field of view 442 of the HMD 104D in Figure 4C moves to the left and upward compared with the field of view 402 of the HMD 104D in Figure 4A. The HMD 104D has a third device pose in Figure 4C. In accordance with a determination that the third device pose of the HMD 104D is within a sticker pose range associated with the visual area of the scene corresponding to a sticker affordance 308, a portion or all of the user-actionable sticker affordance 308 is displayed on top of the visual area of the scene. That is, the sticker affordance 308D that is only partially displayed in the field of view 402 is entirely displayed within the field of view 442 of the HMD 104D, overlaid on the user interface 304 of the user application, and the sticker affordance 308E that is not displayed in the field of view 402 is partially displayed in the field of view 442 of the HMD 104D.

[0052] Figures 5A-5E are diagrams 510, 520, 530, 540, and 550 illustrating five distinct user actions applied on a touchscreen of a mobile device 104C to control content display by an HMD 104D, in accordance with some embodiments. The HMD 104D enables an extended reality environment (e.g., 400 in Figure 4A) in which a user interface 304 of a user application and related user-actionable sticker affordances 308 are presented on a background view. The HMD 104D is coupled to the mobile device 104C, and the mobile device 104C is electrically coupled to the HMD 104D via a wire 152 or via one or more wireless communication networks 108. The mobile device 104C includes at least one of a trackpad and a signal emitter for providing user selections that select a visual object 306 displayed on the user interface 304, a region of the user interface 304, and a visual area of a background view 302 to which each sticker affordance 308 is attached. In some embodiments, the touchscreen of the mobile device 104C is used as a trackpad when the mobile device 104C is held in a landscape mode. Alternatively, in some embodiments, the mobile device 104C is used as a signal emitter (e.g., a laser pointer) when the mobile device 104C is held in a portrait mode.

[0053] The distinct user actions are applied to facilitate data visualization and analysis using the user application. Referring to Figure 5A, a first user action includes a drag action 502 in which a user finger touches the touchscreen of the mobile device 104C and moves on the touchscreen for a distance before being lifted up. The drag action 502 changes a position of a cursor in the field of view of the HMD 104D. The cursor travels across different visual areas before it hits a boundary of a scene (i.e., a background view) where the HMD 104D is disposed.

[0054] Referring to Figure 5B, a second user action includes a tap action 504 in which a user finger touches the touchscreen of the mobile device 104C briefly (e.g., for a short period of time that is less than a threshold duration, such as 0.5 second). In some embodiments, the cursor hovers on a data item 404A of a visual object 306, and a tool tip shows up, displaying un-displayed information of the data item 404A in the user interface 304. In some embodiments, the data item 404A is associated with more than one (e.g., 2) sticker data items and more than one sticker affordances 308, and the more than one sticker affordances 308 are linked to the data item 404A and to each other, i.e., each sticker affordance 308 corresponding to the data item 404A includes a reference to one or more other sticker affordances 308 corresponding to the same data item 404A. In some situations, if the cursor hovers on the data item 404A, the sticker affordance(s) 308 associated with the data item 404A are highlighted. In some situations, if the tap action 504 is applied on the tool tip, the tool tip is displayed until unselected. In some situations, if the cursor does not hover above the data item 404A, the tool tip disappears and the associated sticker affordance(s) 308 are not highlighted when the cursor exits hovering. In some embodiments, a cursor hovers on an axis 404B, and a description of the axis appears on the user interface 304. The description disappears when the cursor moves away. If the tap action 504 is applied when a cursor hovers above the axis, the description remains even after the cursor moves away. Alternatively, in some embodiments, the tap action 504 is applied to sort data items 404A by the x-axis or transition to a group selection mode.

[0055] Referring to Figure 5C, a third user action includes an extended press action 506 in which a user finger touches the touchscreen of the mobile device 104C for an extended period of time that is greater than a threshold duration, such as 1 second. In some embodiments, the extended press action 506 is configured to add annotation to a data item 404A. In some embodiments, the extended press action 506 is configured to activate a tool list.
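
The tap action 504 and the extended press action 506 differ only in touch duration; a simple, illustrative classifier under the example thresholds of 0.5 second and 1 second (the function name and the handling of the ambiguous range are assumptions) could look like this:

```python
TAP_MAX_SECONDS = 0.5     # example tap threshold for the tap action 504
PRESS_MIN_SECONDS = 1.0   # example threshold for the extended press action 506

def classify_touch(duration_s: float, moved: bool) -> str:
    """Classify a single-finger touch by its duration; a moving touch is a drag action 502."""
    if moved:
        return "drag"
    if duration_s < TAP_MAX_SECONDS:
        return "tap"
    if duration_s >= PRESS_MIN_SECONDS:
        return "extended_press"
    return "ignored"  # between thresholds: treated as ambiguous in this sketch
```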

[0056] Referring to Figure 5D, a fourth user action includes a pinch action 508 in which a first finger slides away from a second finger, or the first and second fingers slide away from each other, while the first and second fingers touch the touchscreen of the mobile device 104C. The pinch action 508 also corresponds to the first finger sliding toward the second finger, or the first and second fingers sliding toward each other, while the first and second fingers touch the touchscreen of the mobile device 104C. The pinch action 508 is configured to zoom in and zoom out on what is displayed on the HMD 104D.

[0057] Referring to Figure 5E, a fifth user action includes a two-finger drag action 512 in which two user fingers move in the same direction while touching the touchscreen of the mobile device 104C. The two-finger drag action 512 controls a panning view of the HMD 104D on the user interface 304.
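
For illustration (the threshold, function name, and return labels are assumptions, not details from the application), the pinch action 508 and the two-finger drag action 512 can be distinguished by whether the spacing between the two touch points changes:

```python
import math

def classify_two_finger_gesture(p0_prev, p1_prev, p0, p1, pinch_eps=2.0):
    """Distinguish a pinch (zoom) from a two-finger drag (pan) and return the update.

    Points are (x, y) touch coordinates; pinch_eps is an illustrative pixel threshold.
    """
    dist_prev = math.dist(p0_prev, p1_prev)
    dist_now = math.dist(p0, p1)
    if abs(dist_now - dist_prev) > pinch_eps:
        # finger spacing changed: zoom in (ratio > 1) or out (ratio < 1)
        return ("zoom", dist_now / dist_prev)
    # spacing roughly constant: pan by the average finger motion
    pan = ((p0[0] - p0_prev[0] + p1[0] - p1_prev[0]) / 2.0,
           (p0[1] - p0_prev[1] + p1[1] - p1_prev[1]) / 2.0)
    return ("pan", pan)
```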

[0058] In some embodiments not shown in Figures 5A-5E, the mobile device 104C has a remote control mode in which the mobile device 104C emits a signal (e.g., a virtual laser pointer) to control the visual objects 306 and the sticker affordances 308. For example, referring to Figure 3, a sticker affordance 308A is enclosed by a bounding box configured to be pulled by an edge item 314 to enable rotation of the sticker affordance 308A. The bounding box of the sticker affordance 308A further includes a corner item 316, allowing the user to shrink or enlarge the sticker affordance 308A. In some embodiments, the sticker affordance 308A further includes a close button 318. In response to a tap action 504 on the close button 318, the sticker affordance 308A is minimized to an icon on the background view 302.
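
A minimal sketch of the bounding-box manipulations described above (rotation via the edge item 314, resizing via the corner item 316, and minimization via the close button 318) follows; the StickerBox class and the scale limits are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class StickerBox:
    angle_deg: float = 0.0    # rotation applied by pulling the edge item 314
    scale: float = 1.0        # resizing applied by dragging the corner item 316
    minimized: bool = False   # state toggled by the close button 318

def rotate_by_edge(box: StickerBox, delta_deg: float) -> None:
    box.angle_deg = (box.angle_deg + delta_deg) % 360.0

def scale_by_corner(box: StickerBox, factor: float,
                    min_scale: float = 0.25, max_scale: float = 4.0) -> None:
    box.scale = min(max(box.scale * factor, min_scale), max_scale)

def tap_close_button(box: StickerBox) -> None:
    # a tap on the close button minimizes the affordance to an icon on the background view 302
    box.minimized = True
```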

[0059] The mobile device 104C and the HMD 104D collaborate to create the sticker affordance 308A associated with the visual area of the scene where the HMD 104D is disposed. The mobile device 104C is configured to enable interaction with the HMD 104D, while the display of the HMD 104D is applied to display the user interface 304 and the associated sticker affordances 308. In some embodiments, the mobile device 104C functions either as a controller for manipulating the sticker affordances 308 or as a trackpad for interacting with visual objects 306 or elements 404 within the user interface 304. On the display of the HMD 104D, the user can perform selection, sorting, zooming/panning, and other techniques on the visual objects 306 and/or elements 404 displayed on the user interface 304. In some embodiments, the sticker affordances 308 do not allow zooming or panning, but can be selected, sorted, and/or annotated. For example, a user draws on one of the sticker affordances 308 (i.e., adds additional textual, image, video, or audio data to a corresponding sticker data item).

[0060] Figure 6 is a block diagram illustrating an electronic device 600 (e.g., an HMD 104D in Figures 1A and 1B), in accordance with some embodiments. The electronic device 600, typically, includes one or more processing units (CPUs) 602, one or more network interfaces 604, memory 606, and one or more communication buses 608 for interconnecting these components (sometimes called a chipset). The electronic device 600 includes one or more input devices 610 that facilitate user input, such as a keyboard, a mouse, a voice-command input unit or microphone, a touch screen display, a touch-sensitive input pad, a gesture capturing camera, or other input buttons or controls. Furthermore, in some embodiments, the electronic device 600 uses a microphone and voice recognition or a camera and gesture recognition to supplement or replace the keyboard. In some embodiments, the electronic device 600 includes one or more cameras (e.g., a camera 112 in Figure 1B), scanners, or photo sensor units for capturing images, for example, of graphic serial codes printed on the electronic devices. The electronic device 600 also includes one or more output devices 612 that enable presentation of user interfaces and display content, including one or more speakers and/or one or more visual displays. In some embodiments, the HMD 104D includes an IMU 212 that works with the camera 112 to enable simultaneous localization and mapping of a scene where the HMD 104D is disposed, thereby creating a background view 302 for a user interface 304 of a user application displayed on the output devices 612.

[0061] Memory 606 includes high-speed random access memory, such as DRAM, SRAM, DDR RAM, or other random access solid state memory devices; and, optionally, includes non-volatile memory, such as one or more magnetic disk storage devices, one or more optical disk storage devices, one or more flash memory devices, or one or more other non-volatile solid state storage devices. Memory 606, optionally, includes one or more storage devices remotely located from one or more processing units 602. Memory 606, or alternatively the non-volatile memory within memory 606, includes a non-transitory computer readable storage medium. In some embodiments, memory 606, or the non-transitory computer readable storage medium of memory 606, stores the following programs, modules, and data structures, or a subset or superset thereof:

• Operating system 614 including procedures for handling various basic system services and for performing hardware dependent tasks;

• Network communication module 616 for connecting each electronic device 600 to other devices (e.g., server 102, client device 104, or storage 106) via one or more network interfaces 604 (wired or wireless) and one or more communication networks 108, such as the Internet, other wide area networks, local area networks, metropolitan area networks, and so on;

• User interface module 618 for enabling presentation of information (e.g., a graphical user interface for user application(s) 626, widgets, websites and web pages thereof, and/or games, audio and/or video content, text, etc.) at each client device 104 via one or more output devices 612 (e.g., displays, speakers, etc.);

• Input processing module 620 for detecting one or more user inputs or interactions from one of the one or more input devices 610 and interpreting the detected input or interaction, where in some implementations, the input processing module 620 includes a mobile controller module 622 for detecting user inputs or interactions from a mobile device 104;

• Web browser module 624 for navigating, requesting (e.g., via HTTP), and displaying websites and web pages thereof, including a web interface for logging into a user account associated with a client device 104 or another electronic device, controlling the client or electronic device if associated with the user account, and editing and reviewing settings and data that are associated with the user account;

• One or more user applications 626 for execution by the electronic device 600 (e.g., games, social network applications, smart home applications, data analysis and visualization applications, and/or other web or non-web based applications);

• Data processing module 628 for processing data using data processing models 648 that are optionally based on deep learning, thereby identifying information contained in the data, matching the data with other data, categorizing the data, or synthesizing related data, where in some embodiments, the data processing module 628 is associated with one of the user applications 626, and configured to generate sticker data items 658 and related sticker affordances 308 in association with visual objects of the user application 626;

• Pose determination and prediction module 630 for determining and predicting a device pose of the electronic device 600 based on images captured by the camera 112 and sensor data captured by the IMU 212, where in some embodiments, the device pose is determined and predicted jointly by the pose determination and prediction module 630 and data processing module 628, and the module 630 further includes an SLAM module 632 for mapping a scene where the electronic device 600 is located and identifying a device pose of the electronic device 600 within the scene;

• Pose-based rendering module 634 for rendering virtual objects and sticker affordances on top of a field of view of a camera of the electronic device 600 or creating extended reality content using images captured by the camera;

• Pose data buffer 636 for storing pose data optionally with inertial sensor data and images for the purposes of determining recent device poses and predicting subsequent device poses; and

• One or more databases 638 for storing at least data including one or more of:
o Device settings 640 including common device settings (e.g., service tier, device model, storage capacity, processing capabilities, communication capabilities, etc.) of the electronic device 600;
o User account information 642 for the one or more user applications 626, e.g., user names, security questions, account history data, user preferences, and predefined account settings;
o Network parameters 644 for the one or more communication networks 108, e.g., IP address, subnet mask, default gateway, DNS server and host name;
o Data processing model(s) 648 for processing content data (e.g., video, image, audio, or textual data) using deep learning techniques; and
o Data and results 650 that are obtained by and outputted to the electronic device 600, respectively, where the data includes one or more of historic inertial sensor data 652, historic image data 654, historic pose data 656, and sticker data items 658, and is processed by the HMD 104D, a mobile device 104C, or a remote server 102 to provide the associated results to be presented on the HMD 104D.

[0062] Each of the above identified elements may be stored in one or more of the previously mentioned memory devices, and corresponds to a set of instructions for performing a function described above. The above identified modules or programs (i.e., sets of instructions) need not be implemented as separate software programs, procedures, modules or data structures, and thus various subsets of these modules may be combined or otherwise re-arranged in various embodiments. In some embodiments, memory 606, optionally, stores a subset of the modules and data structures identified above. Furthermore, memory 606, optionally, stores additional modules and data structures not described above.

[0063] Figure 7 is a block diagram illustrating a mobile device 104C that is coupled to the electronic device 600, in accordance with some embodiments. The mobile device 104C shares some data processing functions of the electronic device 600, e.g., for enabling extended reality content, when computational and storage resources of the electronic device 600 are limited. Particularly, in some embodiments, the mobile device 104C acts as a trackpad or a signal emitter for enabling user interactions with a user interface displayed on the electronic device 600. The mobile device 104C, typically, includes one or more processing units (CPUs) 702, one or more network interfaces 704, memory 706, and one or more communication buses 708 for interconnecting these components (sometimes called a chipset). The mobile device 104C includes one or more input devices 710 that facilitate user input, such as a keyboard, a mouse, a voice-command input unit or microphone, a touch screen display, a touch-sensitive input pad 710A, a gesture capturing camera, or other input buttons or controls. In some embodiments, the mobile device 104C includes a signal emitter 710B (e.g., an infrared emitter) to interact with the electronic device 600 to control visual objects displayed on a user interface. Furthermore, in some embodiments, the mobile device 104C uses a microphone and voice recognition or a camera and gesture recognition to supplement or replace the keyboard. In some embodiments, the mobile device 104C includes one or more cameras, scanners, or photo sensor units for capturing images, for example, of graphic serial codes printed on the electronic devices. The mobile device 104C also includes one or more output devices 712 that enable presentation of user interfaces and display content, including one or more speakers and/or one or more visual displays.

[0064] Memory 706 includes high-speed random access memory, such as DRAM, SRAM, DDR RAM, or other random access solid state memory devices; and, optionally, includes non-volatile memory, such as one or more magnetic disk storage devices, one or more optical disk storage devices, one or more flash memory devices, or one or more other non-volatile solid state storage devices. Memory 706, optionally, includes one or more storage devices remotely located from one or more processing units 702. Memory 706, or alternatively the non-volatile memory within memory 706, includes a non-transitory computer readable storage medium. In some embodiments, memory 706, or the non-transitory computer readable storage medium of memory 706, stores the following programs, modules, and data structures, or a subset or superset thereof:

• Operating system 714 including procedures for handling various basic system services and for performing hardware dependent tasks;

• Network communication module 716 for connecting the mobile device 104C to other devices (e.g., server 102, storage 106, or electronic device 600) via one or more network interfaces 704 (wired or wireless) and one or more communication networks 108, such as the Internet, other wide area networks, local area networks, metropolitan area networks, and so on;

• User interface module 718 for enabling presentation of information (e.g., a graphical user interface for application(s) 724, widgets, websites and web pages thereof, and/or games, audio and/or video content, text, etc.) at the mobile device 104C via one or more output devices 712 (e.g., displays, speakers, etc.);

• Input processing module 720 for detecting one or more user inputs or interactions from one of the one or more input devices 710 and interpreting the detected input or interaction, where in some implementations, the input processing module 720 includes a mobile controller module 722 for detecting user inputs or interactions for the electronic device 600;

• Web browser module 724 for navigating, requesting (e.g., via HTTP), and displaying websites and web pages thereof, including a web interface for logging into a user account associated with the mobile device 104C, controlling the mobile device 104C if associated with the user account, and editing and reviewing settings and data that are associated with the user account;

• One or more user applications 725 for execution by the mobile device 104C (e.g., games, social network applications, smart home applications, data analysis and visualization applications and/or other web or non-web based applications for controlling another electronic device and reviewing data captured by such devices);

• Model training module 726 for receiving training data and establishing a data processing model 748 for processing content data (e.g., video, image, audio, or textual data) to be collected or obtained by the mobile device 104C;

• Data processing module 728 for processing content data using data processing models 748, thereby identifying information contained in the content data, matching the content data with other data, categorizing the content data, or synthesizing related content data, where in some embodiments, the data processing module 728 is associated with one of the user applications 725, and configured to generate sticker data items 758 and related sticker affordances 308 in association with visual objects of the user application 725;

• Pose determination and prediction module 730 for determining and predicting a device pose of an electronic device 600 coupled to the mobile device 104C based on images captured by a camera 112 and sensor data captured by the IMU 212, where in some embodiments, the device pose is determined and predicted jointly by the pose determination and prediction module 730 and data processing module 728, and the module 730 further includes an SLAM module 732 for mapping a scene where the electronic device 600 is located and identifying the device pose of the electronic device 600 within the scene;

• Pose-based rendering module 734 for rendering virtual objects and sticker affordances on top of a field of view of a camera of the electronic device 600 or creating extended reality content using images captured by the camera;

• Pose data buffer 736 for storing pose data of the electronic device 600 optionally with inertial sensor data and images for the purposes of determining recent device poses and predicting subsequent device poses; and

• One or more databases 738 for storing at least data including one or more of:
o Device settings 740 including common device settings (e.g., service tier, device model, storage capacity, processing capabilities, communication capabilities, etc.) of the mobile device 104C;
o User account information 742 for the one or more user applications 725, e.g., user names, security questions, account history data, user preferences, and predefined account settings;
o Network parameters 744 for the one or more communication networks 108, e.g., IP address, subnet mask, default gateway, DNS server and host name;
o Training data 746 applied to train the data processing model 748 in the model training module 726;
o Data processing model(s) 748 for processing content data (e.g., video, image, audio, or textual data) using deep learning techniques; and
o Data and results 750 that are obtained by and outputted to the electronic device 600, respectively, where the data includes one or more of historic inertial sensor data 652, historic image data 654, historic pose data 656, and sticker data items 658, and is processed by the HMD 104D, the mobile device 104C, or a remote server 102 to provide the associated results to be presented on the HMD 104D.

[0065] Each of the above identified elements may be stored in one or more of the previously mentioned memory devices, and corresponds to a set of instructions for performing a function described above. The above identified modules or programs (i.e., sets of instructions) need not be implemented as separate software programs, procedures, modules or data structures, and thus various subsets of these modules may be combined or otherwise re-arranged in various embodiments. In some embodiments, memory 706, optionally, stores a subset of the modules and data structures identified above. Furthermore, memory 706, optionally, stores additional modules and data structures not described above.

[0066] Figure 8 is a flow diagram of an example method 800 of creating a data item in extended reality, in accordance with some embodiments. For convenience, the method 800 is described as being implemented by an electronic system (e.g., an electronic device 600, a mobile device 104C, or a combination thereof). The method 800 is, optionally, governed by instructions that are stored in a non-transitory computer readable storage medium and that are executed by one or more processors of the electronic system. Each of the operations shown in Figure 8 may correspond to instructions stored in a computer memory or non-transitory computer readable storage medium (e.g., memory 606 of the electronic device 600 in Figure 6, memory 706 of the mobile device 104C in Figure 7). The computer readable storage medium may include a magnetic or optical disk storage device, solid state storage devices such as Flash memory, or other non-volatile memory device or devices. The instructions stored on the computer readable storage medium may include one or more of: source code, assembly language code, object code, or other instruction format that is interpreted by one or more processors. Some operations in the method 800 may be combined and/or the order of some operations may be changed.

[0067] The electronic system displays (802), on a display of the electronic device 600 (e.g., an HMD 104D), a user interface 304 of a user application on top of a scene where the electronic device is disposed. The user interface 304 includes one or more visual objects 306, e.g., each having one or more visual elements 404. The electronic system receives (804) a first user selection of a first visual object 306A among the one or more visual objects 306 of the user interface 304. The first visual object 306A includes predefined content information 410. The electronic system creates (806) a sticker data item associated with the first visual object 306A, and the sticker data item includes at least the predefined content information 410. In response to a second user selection of a visual area of the scene, the electronic system displays (808) a user-actionable sticker affordance 308A on top of the visual area of the scene, and the user-actionable sticker affordance 308A represents the sticker data item associated with the first visual object 306A. In some embodiments, the user-actionable sticker affordance 308A is displayed concurrently with the user interface 304 on the display. In some embodiments, the scene corresponds to a real physical world viewed through the electronic device 600 in an AR or MR environment. Alternatively, in some embodiments, the scene corresponds to a virtual world created by the electronic device 600 in a VR environment.

[0068] In some embodiments, in response to the second user selection of the visual area of the scene, the electronic system identifies (810) a sticker-related device pose and a relative sticker location. The visual area of the scene is selected at the relative sticker location in a field of view of the electronic device, when the camera is positioned according to the sticker-related device pose. Further, in some embodiments, the electronic system stores (812) the user-actionable sticker affordance 308A, the sticker data item, and information of the user application in association with the sticker-related device pose and the relative sticker location, and determines (814) that the visual area appears in the field of view 402 of the electronic device based on the sticker-related device pose and the relative sticker location. In accordance with a determination that the visual area appears in the field of view of the electronic device and that the user application is executed, the electronic device 600 re-displays (816) the user-actionable sticker affordance 308A on top of the visual area of the scene.
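
The following Python sketch (illustrative only; the record layout, pose representation, and tolerance are assumptions rather than details from the application) shows one way the stored association of operations 812-816 could be represented and checked before re-display:

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class StickerRecord:
    sticker_id: str
    content: dict                           # at least the predefined content information 410
    app_id: str                             # information of the user application
    device_pose: Tuple[float, ...]          # sticker-related device pose (position and orientation)
    relative_location: Tuple[float, float]  # location of the visual area in the field of view

def should_redisplay(record: StickerRecord, current_pose: Tuple[float, ...],
                     app_running: bool, pose_tolerance: float = 0.2) -> bool:
    """Re-display the sticker affordance when the visual area re-enters the field of view
    and the originating user application is executing."""
    if not app_running:
        return False
    # placeholder visibility test: a real system would project the stored relative
    # location through the current camera pose rather than compare poses directly
    drift = sum(abs(a - b) for a, b in zip(record.device_pose, current_pose))
    return drift <= pose_tolerance
```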

[0069] Alternatively, in some embodiments, in response to the second user selection, the electronic system identifies a set of adjacent feature points associated with the visual area in the scene, and associates the user-actionable sticker affordance 308A with the set of adjacent feature points of the scene. Further, in some embodiments, the electronic system stores the user-actionable sticker affordance 308A, the sticker data item, and information of the user application in association with the set of adjacent feature points. In accordance with a determination that the set of adjacent feature points appears in a field of view 402 of the electronic device, the electronic system determines that the visual area appears in the field of view 402 of the electronic device 600. In accordance with a determination that the visual area appears in the field of view of the electronic device 600 and that the user application is executed, the electronic system re-displays the user-actionable sticker affordance 308A on top of the visual area of the scene.

[0070] In some embodiments, the user interface 304 of the user application is displayed on a fixed interface location of the display, independently of a variation of a device pose of the electronic device 600. For example, Figures 4A and 4C show that the user interface 304 is displayed on two fields of view 402 and 442 that shift with respect to each other, while covering the same portion of the display of the electronic device 600. Further, in some embodiments, in accordance with a determination that the device pose of the electronic device is within a sticker pose range 312 associated with the visual area of the scene, a portion or all of the user-actionable sticker affordance 308A is displayed on top of the visual area of the scene.
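
For the feature-point alternative of paragraph [0069], a minimal sketch (the functions, the use of feature-point identifiers, and the match count are illustrative assumptions, not details from the application) might anchor the affordance to the selected feature points and treat the visual area as visible once enough of them reappear in the field of view:

```python
from typing import Iterable, Set

def anchor_to_feature_points(sticker_id: str, area_point_ids: Iterable[int]) -> dict:
    """Associate a sticker affordance with the set of adjacent feature points
    of the selected visual area."""
    return {"sticker_id": sticker_id, "anchor_points": set(area_point_ids)}

def visual_area_in_view(anchor: dict, visible_point_ids: Set[int], min_matches: int = 4) -> bool:
    # the visual area counts as visible when enough anchor feature points
    # reappear in the current field of view; min_matches is an illustrative choice
    matched = len(anchor["anchor_points"] & visible_point_ids)
    return matched >= min_matches
```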

[0071] In some embodiments, in accordance with a determination that a device pose of the electronic device is within a first pose range 312A associated with the visual area of the scene, the electronic device 600 displays a portion of the user-actionable sticker affordance 308A. The portion is less than all of the sticker affordance 308A. In accordance with a determination that the device pose of the electronic device is within a second pose range 312B associated with the visual area of the scene, the electronic device 600 displays all of the user-actionable sticker affordance 308A on the display. Additionally, in some embodiments, in accordance with a determination that the device pose of the electronic device exceeds the first and second pose ranges 312A and 312B associated with the visual area of the scene, the electronic device 600 aborts displaying the user-actionable sticker affordance 308A on the display.
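
A compact sketch of this pose-range gating follows; the pose ranges are represented as hypothetical callables standing in for pose ranges 312A and 312B, and the return labels are illustrative.

```python
from typing import Callable, Tuple

PoseRange = Callable[[Tuple[float, ...]], bool]  # predicate: is the pose inside the range?

def sticker_visibility(device_pose: Tuple[float, ...],
                       first_range: PoseRange, second_range: PoseRange) -> str:
    """Decide how much of sticker affordance 308A to draw for the current device pose."""
    if second_range(device_pose):
        return "full"      # within the second pose range: the entire affordance is shown
    if first_range(device_pose):
        return "partial"   # within the first pose range: only a portion is shown
    return "hidden"        # pose exceeds both ranges: the affordance is not drawn
```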

[0072] In some embodiments, the user interface 304 of the user application is displayed on top of a fixed interface location of the scene. In accordance with a determination that a device pose of the electronic device 600 is within a third pose range 310A associated with the fixed interface location of the scene, the electronic device 600 displays a portion of the user interface on the display, and the portion is less than all of the user interface 304. In accordance with a determination that a device pose is within a fourth pose range 310B associated with the fixed interface location of the scene, the electronic device 600 displays all of the user interface 304 on the display, and the third pose range is distinct from the fourth pose range. Additionally, in some embodiments, the fixed interface location of the scene and the visual area of the scene are separated by a distance such that the user-actionable sticker affordance 308A is not displayed concurrently with the user interface 304 on the display. There is no device pose allowing both the user-actionable sticker affordance 308A and the user interface 304 to be displayed concurrently on the display of the electronic device 600. A device pose of the electronic device 600 has to be adjusted to see the user interface 304 and the user-actionable sticker affordance 308A sequentially.

[0073] In some embodiments, the user-actionable sticker affordance 308A is displayed with a first portion of the sticker data item, and a second portion of the sticker data item is collapsed and not visible concurrently with the user-actionable sticker affordance 308A. In response to a user action on the user-actionable sticker affordance 308A, the sticker affordance 308A is expanded to display the entire sticker data item.

[0074] In some embodiments, the electronic system receives (818) user inputs of user-defined content data related to the first visual object 306A, e.g., via a touchscreen of the mobile device 104C, and the sticker data item includes the user-defined content data related to the first visual object 306A. Further, in some embodiments, in response to a user action on the user-actionable sticker affordance 308A, the electronic device 600 displays the predefined content information and the user-defined content data on the display of the electronic device 600.
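
As an illustrative sketch (the dict-based sticker representation and function names are assumptions), adding user-defined content data to a sticker data item and surfacing both kinds of content in response to a user action could be modeled as:

```python
def annotate_sticker(sticker: dict, user_data: dict) -> None:
    """Append user-defined content data (e.g., a text note or a reference to an
    image, video, or audio clip) to a sticker data item, kept as a plain dict here."""
    sticker.setdefault("user_defined", []).append(user_data)

def on_sticker_action(sticker: dict) -> dict:
    # a user action on the affordance surfaces both predefined and user-defined content
    return {"predefined": sticker.get("predefined", {}),
            "user_defined": sticker.get("user_defined", [])}

# usage (hypothetical values):
# sticker = {"predefined": {"label": "sales by region", "value": 1.2e6}}
# annotate_sticker(sticker, {"type": "text", "note": "compare with last quarter"})
# on_sticker_action(sticker)
```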

[0075] In some embodiments, the electronic device 600 is coupled to a mobile device 104C, and the first user selection and the second user selection are received from the mobile device 104C, e.g., via the touchscreen of the mobile device 104C. Further, in some embodiments, the mobile device 104C is electrically coupled to the electronic device 600 via a wire or via one or more wireless communication networks 108, and includes at least one of a trackpad for receiving the first and second user selections and a signal emitter for providing the first and second user selections.

[0076] In some embodiments, the one or more visual objects 306 of the user interface 304 include a second visual object 306B. The user-actionable sticker affordance includes a first user-actionable sticker affordance 308A representing a first sticker data item and associated with a first visual area. The electronic device 600 displays a second user-actionable sticker affordance 308B on top of a second visual area of the scene, and the second user-actionable sticker affordance 308B represents a second sticker data item associated with the second visual object 306B of the user interface 304.

[0077] In some embodiments, the scene includes a first scene, and the visual area includes a first visual area. In accordance with a determination that the electronic device 600 is located in a second scene and that the user application is executed, the electronic device displays the user-actionable sticker affordance 308A on top of a second visual area of the second scene. The second visual area is substantially similar to the first visual area.

[0078] The method 800 enables a mobile data visualization application in which multiple sticker affordances 308 can be generated and attached to an extended reality environment of the HMD 104D. The extended reality environment is significantly larger than a conventional screen. These sticker affordances 308 enable efficient multi-view comparison during visual data analysis. Each individual sticker affordance 308 is conveniently associated with a virtual or physical object in the scene of the HMD 104D, which is associated with a 3D virtual space. This helps a user recall each sticker affordance 308 based on the corresponding virtual or physical object, so that the respective sticker data item can be efficiently identified and retrieved. Additionally, the mobile device 104C enables convenient user interactions with the virtual objects and elements displayed on the user interface 304 of a user application displayed on the HMD 104D.

[0079] Alternatively, in some embodiments, an HMD 104D is coupled to a desktop computer having one or more of a display device, a mouse, and a keyboard. These input devices of the desktop computer are applied to control the user interface 304 and to create and control sticker affordances 308. For example, a display device of the desktop computer is surrounded by eight display devices, and a plurality of sticker affordances 308 are displayed on these nine display devices. This setup offers sufficient display space to which the sticker affordances are attached to provide shortcuts to interesting analysis states, and supports efficient interaction enabled by the mouse and keyboard of the desktop computer. The desktop computer provides a virtual 2D interface.

[0080] In various embodiments of this application, an interactive visual data analytics system is created to facilitate data analysis by creating sticker affordances 308 for visual objects 306 displayed on a user interface 304 of a user application. The sticker affordances 308 are displayed in a background view 302 associated with extended reality. The background view 302 is optionally a view of a real physical world in mixed or augmented reality or a view of a virtual world in virtual reality. Extended reality provides 3D display spaces, and associates the sticker affordances 308 with related visual areas. Specifically, a user can attach the sticker affordances 308 to anchor objects in a physical environment in AR and use spatial cues for subsequent sticker retrieval. The anchor objects help users exploit their spatial memory while creating and storing the sticker affordances 308 (e.g., by saying “Let’s put this bar chart on the chair”) and while navigating the created sticker affordances 308 (e.g., by recalling “I remember the bar chart was on the chair...”).

[0081] In some embodiments, an extended reality device (e.g., the HMD 104D) includes a camera configured to capture hand gestures, and the hand gestures are applied to interact with user applications executed by the extended reality device. Conversely, in various embodiments of this application, the extended reality device (e.g., the HMD 104D) is coupled to a mobile device 104C, which provides a conventional interaction pattern, enables ease of learning, and introduces a low physical and cognitive load during use. The mobile device 104C is optionally used as a virtual laser pointer when manipulating visual objects of the user interface and/or as a trackpad when manipulating sticker affordances. This allows eyes-free interaction with the mobile device 104C, as well as precise selection and efficient input via the mobile device 104C. It also takes advantage of the rich and familiar interaction vocabulary of the mobile device 104C to make learning and using the interactive visual data analytics system easy. It is noted that a combination of a laser pointer, a trackpad controller, and a virtual screen can simulate and improve most WIMP (windows, icons, menus, and pointers) interactions that a desktop computer provides.

[0082] It should be understood that the particular order in which the operations in Figure 8 have been described is merely exemplary and is not intended to indicate that the described order is the only order in which the operations could be performed. One of ordinary skill in the art would recognize various ways to create and manage data items in extended reality as described herein. Additionally, it should be noted that details of other processes described above with respect to Figures 3-5E are also applicable in an analogous manner to the method 800 described above with respect to Figure 8.

[0083] The terminology used in the description of the various described implementations herein is for the purpose of describing particular implementations only and is not intended to be limiting. As used in the description of the various described implementations and the appended claims, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will also be understood that the term “and/or” as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. It will be further understood that the terms “includes,” “including,” “comprises,” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. Additionally, it will be understood that, although the terms “first,” “second,” etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another.

[0084] As used herein, the term “if” is, optionally, construed to mean “when” or “upon” or “in response to determining” or “in response to detecting” or “in accordance with a determination that,” depending on the context. Similarly, the phrase “if it is determined” or “if [a stated condition or event] is detected” is, optionally, construed to mean “upon determining” or “in response to determining” or “upon detecting [the stated condition or event]” or “in response to detecting [the stated condition or event]” or “in accordance with a determination that [a stated condition or event] is detected,” depending on the context.

[0085] The foregoing description, for purpose of explanation, has been described with reference to specific embodiments. However, the illustrative discussions above are not intended to be exhaustive or to limit the claims to the precise forms disclosed. Many modifications and variations are possible in view of the above teachings. The embodiments were chosen and described in order to best explain principles of operation and practical applications, to thereby enable others skilled in the art.

[0086] Although various drawings illustrate a number of logical stages in a particular order, stages that are not order dependent may be reordered and other stages may be combined or broken out. While some reordering or other groupings are specifically mentioned, others will be obvious to those of ordinary skill in the art, so the ordering and groupings presented herein are not an exhaustive list of alternatives. Moreover, it should be recognized that the stages can be implemented in hardware, firmware, software or any combination thereof.