

Title:
TELEMEDICINE OR TELEHEALTH ASSISTING DEVICE AND METHOD
Document Type and Number:
WIPO Patent Application WO/2023/220005
Kind Code:
A1
Abstract:
A method to assist a therapist or medical professional to conduct a session including initializing a video conference session on a therapist or medical professional computing device; receiving a response message from a client or patient computing device confirming the client or patient is engaging in the video conference session; selecting messages to be communicated to the client or patient, the messages including words and expressions; and communicating the selected messages to a robot, wherein an audio output device of the robot speaks the words in the selected messages and a video output device displays the expressions in the selected messages. The method further includes receiving transcriptions of the client or patient's verbal responses to the selected messages; displaying the transcriptions of the client or patient's verbal responses; and repeating the selecting, communicating, receiving and displaying steps until a client or patient session has been completed.

Inventors:
KARAPETYAN VAZGEN (US)
MUNICH MARIO (US)
PIRJANIAN PAOLO (US)
SCHERER STEFAN (US)
LANG TIM (US)
Application Number:
PCT/US2023/021456
Publication Date:
November 16, 2023
Filing Date:
May 09, 2023
Assignee:
EMBODIED INC (US)
International Classes:
G16H80/00; G06N20/00; A61B5/00; G06F16/9032
Foreign References:
US20220047223A12022-02-17
US10943407B12021-03-09
US20180311812A12018-11-01
Attorney, Agent or Firm:
KENDRICK, Mark (US)
Claims:
CLAIMS

1. A method to assist a therapist or medical professional to conduct a client or patient session, comprising: one or more processors; one or more memory devices; computer-readable instructions, accessible from the one or more memory devices, and executable by the one or more processors to: initialize a video conference session on a therapist or medical professional computing device; receive a response message from a client or patient computing device confirming the client or patient is engaging in the video conference session; select messages to be communicated to the client or patient, the messages including words and expressions; communicate the selected messages to a robot computing device, wherein an audio output device of the robot computing device speaks the words in the selected messages and a video output device displays the expressions in the selected messages; receive transcriptions of the client or patient’s verbal responses to the selected messages, spoken by the patient or user; display the transcriptions of the client or patient’s verbal responses to the therapist or medical professional; and repeat the selecting, communicating, receiving and displaying steps until a client or patient session has been completed.

2. The method of claim 1, the computer-readable instructions executable by the one or more processors to: receive session audio files and video files from a video conference computing device for the client or patient’s session.

3. The method of claim 2, the computer-readable instructions executable by the one or more processors to: receive audio files from the robot computing device of the client or patient’s session.

4. The method of claim 3, the computer-readable instructions executable by the one or more processors to: receive video files from the robot computing device of the client or patient’s session.

5. The method of claim 4, the computer-readable instructions executable by the one or more processors to: receive video datapoint files from the robot computing device, the video datapoint files generated by the robot computing device based at least in part on the video files of the session captured by a camera of the robot computing device.

6. The method of claim 5, the computer-readable instructions executable by the one or more processors to: store the received transcription files, the received audio files, the received video files and the received video datapoint files in one or more memory devices.

7. A method to assist a therapist or medical professional in analyzing a client or patient session, comprising: one or more processors; one or more memory devices; computer-readable instructions, accessible from the one or more memory devices, and executable by the one or more processors to: receive video conference files of the client or patient session from a video conferencing cloud computing device; receive video files of the client or patient session from a robot computing device; receive audio files of the client or patient session from the robot computing device; receive audio transcription files associated with the client or patient session from the robot computing device or an audio-to-text cloud computing device; receive video datapoint files associated with the client or patient session from the robot computing device; automatically analyze the received video files to generate session video parameters or metrics; automatically analyze the received audio files to generate session audio parameters or metrics; automatically analyze the received audio transcription files to generate session audio transcription parameters or metrics; automatically analyze the received video datapoint files to generate session video datapoint parameters or metrics; and store the audio files, audio transcription files, the video files, video datapoint files, the session video parameters or metrics, session audio parameters or metrics, session audio transcription parameters or metrics, and session video datapoint parameters or metrics in the one or more memory devices.

8. The method of claim 7, the computer-readable instructions executable by the one or more processors to: automatically analyze the session video parameters or metrics; session audio parameters or metrics; session audio transcription parameters or metrics; and/or session video datapoint parameters or metrics to generate aggregated session parameters or metrics.

9. The method of claim 8, wherein the aggregated session parameters or metrics include a user engagement score, a sentimental score, an overall conversation score, or a vocabulary level score.

10. The method of claim 8, the computer-readable instructions executable by the one or more processors to: generate therapy diagnostic assistance information or medical diagnostic assistance information based at least in part on the aggregated session parameters or metrics.

11. The method of claim 8, the computer-readable instructions executable by the one or more processors to: generate therapy diagnostic assistance information or medical diagnostic assistance information based at least in part on the session audio parameters or metrics, session audio transcription parameters or metrics, session video parameters or metrics, or session video datapoint parameters or metrics.

12. The method of claim 7, the computer-readable instructions executable by the one or more processors to: repeat the receiving and analyzing steps of claim 7 for a plurality of patients or clients.

13. The method of claim 7, the computer-readable instructions executable by the one or more processors to: repeat the receiving and analyzing steps of claim 8 for a plurality of sessions of the client or patient.

13. The method of claim 7, the computer-readable instructions executable by the one or more processors to: identify, by the therapist or medical professional, a selected client or patient and a selected session to review; and retrieve a video conference file associated with the selected client or patient and the selected session from the one or more memory devices.

14. The method of claim 13, the computer-readable instructions executable by the one or more processors to: synchronize the retrieved video conference file for the selected client or patient and the selected session and the received audio files and the received video files for the selected client or patient and the selected session; and display the synchronized video conference file for the selected client or patient session with the video files and audio files for the selected client or patient session.

15. The method of claim 14, the computer-readable instructions executable by the one or more processors to: review the synchronized video conference file and robot video and audio files for the selected client or patient session; and generate client or patient session notes for the selected client or patient session based on the review of the synchronized video conference file and robot video and audio files for the selected client or patient session; and store the generated client or patient session notes for the selected client or patient session.

Description:
UTILITY PATENT APPLICATION

TELEMEDICINE OR TELEHEALTH ASSISTING DEVICE AND METHOD

Inventor(s): Vazgen Karapetyan

Tim Lang

Paolo Pirjanian

Mario Munich

Stefan Scherer

Assignee: Embodied, Inc. a Delaware Corporation


RELATED APPLICATIONS

[0001] This application is related to and claims priority to U.S. provisional patent application serial No. 63/339,968, filed May 9, 2022, entitled “Telemedicine or Telehealth Assisting Device and Method,” and to U.S. provisional patent application serial No. 63/464,916, filed May 8, 2023, entitled “Teletherapy or Telehealth Assisting Device and Method,” the disclosures of which are both hereby incorporated by reference.

BACKGROUND

[0002] It is difficult for many therapists or medical professionals to connect with children during a therapy session. In fact, being able to make a connection with the patient is a key element for a successful delivery of therapy. This situation has been further exacerbated by the Covid-19 pandemic, where many therapists have had to move towards telehealth sessions with patients (which are remote sessions via video calls). It is also very difficult to keep a child’s attention during the telemedicine or telehealth therapy session (and given the fact that the therapist is not physically present in the same space where the patient is, many children walk away or just close the video and abort the session, completely disengaging from the therapist).

BRIEF DESCRIPTION OF THE DRAWINGS

[0003] A better understanding of the features, advantages and principles of the present disclosure will be obtained by reference to the following detailed description that sets forth illustrative embodiments, and the accompanying drawings of which:

[0004] FIG. 1A illustrates a left and center portion of a screen shot of a teletherapy or telehealth assisting software in session mode, in accordance with some embodiments;

[0005] FIG. 1B illustrates a right portion of a screen shot of a teletherapy or telehealth assisting software in session mode, in accordance with some embodiments;

[0006] FIG. 2A shows a left and center portion of a screen shot of a teletherapy or telehealth assisting software in a review or analyzation mode, in accordance with some embodiments;

[0007] FIG. 2B shows a right portion of a screen shot of a teletherapy or telehealth assisting software in review or analyzation mode, in accordance with some embodiments;

[0008] FIG. 3A illustrates a block diagram of a telemedicine system interacting with a user or patient, in accordance with some embodiments;

[0009] FIG. 3B illustrates a flowchart of a telemedicine system and method interacting with a user or patient, in accordance with some embodiments;

[0010] FIG. 4 A illustrates a block diagram of a telemedicine system utilized by a medical professional, in accordance with some embodiments;

[0011] FIG. 4B illustrates a flowchart of a telemedicine system and method interacting with a medical professional, in accordance with some embodiments;

[0012] Figure 5A illustrates a new patient input screen according to some implementations;

[0013] Figure 5B illustrates a patient screen for a therapist or medical professional in a teletherapy or telehealth system according to some embodiments;

[0014] Figure 5C illustrates a client or patient category screen or menu with a phrases submenu selected according to some embodiments;

[0015] Figure 5D illustrates a client or patient category screen or menu with a submenu selected according to some embodiments;

[0016] Figure 5E illustrates a settings menu of a teletherapy or telehealth system according to some embodiments; and

[0017] Figure 6 illustrates a logical overview of a database in the teletherapy or telehealth assisting system according to some embodiments.

DETAILED DESCRIPTION

[0018] The following detailed description provides a better understanding of the features and advantages of the inventions described in the present disclosure in accordance with the embodiments disclosed herein. Although the detailed description includes many specific embodiments, these are provided by way of example only and should not be construed as limiting the scope of the inventions disclosed herein.

[0019] Described herein is a teletherapist or telehealth assisting system and method that utilizes a robot computing device to assist a therapist in talking to and evaluating a user or child in a therapy session. Since children may be more open to speaking with a robot computing device rather than to an adult medical professional or therapist, the robot computing device may have controlled conversations with the user or child. This is found to increase engagement with the user or child during telehealth sessions. In these embodiments, the therapist may control actions and/or the speech of the robot computing device. Although the description contained herein may mention a therapist in order to streamline the specification, the methods and descriptions described herein apply to all medical professionals (and even educators). Further, although the specification may refer to telehealth, the methods and systems described herein may also apply to teleconference or other remote interactions between users (including, but not limited to, children) and/or other professionals. In this patent application, the terms teletherapy, telehealth or telemedicine may be utilized interchangeably. The subject matter described herein allows a therapist or a medical professional to remotely control a robot computing device (or a digital companion) when the robot computing device is interacting with a user or a child. In some implementations, the robot computing device’s conversational software or engine may be disabled because the robot computing device is not determining what to say to the user or child (i.e., the therapist or medical professional is making that determination).

[0020] FIG. 1A illustrates a left and center screen shot of a teletherapy or telehealth assisting software in session mode, in accordance with some embodiments. FIG. 1B illustrates a right screen shot of a teletherapy or telehealth assisting software in session mode according to some embodiments. FIGS. 1A, 1B, 2A and 2B are mere representations of user interface screens and many elements may change in future iterations and/or embodiments. The data and files collected and/or presented in FIGS. 1A, 1B, 2A and 2B may be collected in many different ways, and each different way may be covered by the subject matter described herein. In some embodiments, the teletherapy or telemedicine assisting software and system includes a session mode (where conversations and interactions take place and/or are monitored) and a review or analysis mode (where recordings and/or transcriptions of the conversations and interactions are analyzed based on a number of factors and/or attributes). FIG. 1A and FIG. 1B illustrate a main screen of a session mode of the telemedicine assisting software according to some embodiments. In this patent application, client, patient, child and/or user may at times be used interchangeably. In these embodiments, the session mode involves a user or child, a robot computing device and/or a therapist having a conversation, with the robot computing device and a video call session capturing the conversation interaction (both audio and video) with the user or child. In some embodiments, the session mode main screen of the telemedicine software includes a plurality of windows or submenus and input areas. In the implementation illustrated in FIGS. 1A and 1B, the session mode main screen 100 may include a therapist prepared messages screen or window 105, an emotion determination and presentation screen or input area 120, a patient session transcription screen or window 115, a session video screen or submenu 110, a prepared messages screen area or window 135, a session parameter screen or window 130 and/or a user or patient identification screen or window 125.
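The session-mode flow the claims and the paragraphs above describe (the therapist selects prepared messages, the robot speaks them, and the client's transcribed replies are displayed) can be sketched as a simple control loop. This is an illustrative sketch only; the `Message`, `SessionLog`, and callback names are assumptions, not the disclosed implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Message:
    words: str          # text the robot will speak aloud
    expression: str     # expression shown on the robot's display

@dataclass
class SessionLog:
    transcripts: list = field(default_factory=list)

def run_session(selected_messages, speak_fn, transcribe_fn):
    """Drive one therapist-controlled session: for each selected message,
    have the robot speak it, then collect the client's transcribed reply."""
    log = SessionLog()
    for msg in selected_messages:
        speak_fn(msg)            # robot speaks the words and shows the expression
        reply = transcribe_fn()  # ASR transcription of the client's verbal response
        log.transcripts.append(reply)
    return log

# Stubbed robot and ASR endpoints, for illustration only
spoken = []
replies = iter(["hi Moxie", "I feel happy today"])
log = run_session(
    [Message("Hello!", "happy"), Message("How do you feel?", "neutral")],
    speak_fn=lambda m: spoken.append((m.words, m.expression)),
    transcribe_fn=lambda: next(replies),
)
```

In a real deployment the stubs would be replaced by calls to the robot computing device and the automatic speech recognition service; the loop structure mirrors the "selecting, communicating, receiving and displaying" steps of claim 1.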

[0021] In some embodiments, the user or patient identification window or field 125 may identify a user or patient that is participating in the telemedicine session with the therapist. As is illustrated in Figure 1A, a name of the user or patient may be Tim. In some implementations, the user or patient identification field or window 125 retrieves user or patient information from the teletherapy or telemedicine cloud computing device (which received the input via the new patient input screen 500). In some embodiments, the session parameter screen or window 130 may provide information about the teletherapy or telehealth session. The session parameter screen or window 130 may further present or display a button or icon to turn on or off (or start or end) the teletherapy or telehealth session and/or software. In some embodiments, as illustrated in Figure 1A, the session parameters may include a session time or length and/or the time that has elapsed so far in the session.

[0022] In some embodiments, the session video screen 110 may display video and present audio transmitted from a computing device that is running Zoom software (and/or another video chat or video conferencing software) at the patient or user’s location. In some embodiments, the video chat or video conferencing software may be initiated at the beginning of the session with the patient or user by a parent or guardian who is in the same physical location as the patient or user (sometimes the parent or guardian may be in the same room as the patient or user). In these embodiments, a camera or imaging device on the computing device may be focused on the face and/or upper torso of the user or patient. Although Zoom and other third-party video conference or video chat software may be mentioned in this patent application, Embodied, the assignee of the instant application, may also provide proprietary video conference or video chat software, which may also be utilized in the teletherapy or telemedicine assistive software and/or system.

Accordingly, the subject matter claimed herein is not limited to systems running Zoom or other third-party video conferencing or video chat software applications.

[0023] In some embodiments, the patient session transcription screen or chat screen 115 may display automatically generated transcriptions of communication between the therapist and the user or patient. In some embodiments, the transcriptions may be displayed in real-time or in close to real-time and may be generated by an automatic speech recognition software module. In some embodiments, the emotion determination screen 120 may display an emotion associated with the patient or user. In some embodiments, the emotion determination and presentation screen 120 may be automatically generated based on the speech patterns and/or facial expressions of the patient and/or user. In addition or alternatively, the emotion determination screen 120 may be generated based on input from the therapist and the therapist’s determination of the emotion based on review of the video and/or audio that the therapist is hearing and seeing. In some embodiments, an emotion determination and presentation screen 120 may include three parts. In some embodiments, a top part of the emotion determination and presentation screen 120 may show a determined emotion for the user. In Figure 1, the determined emotions for the user display may be happy, sad, neutral and/or normal. In some embodiments, a middle part of the emotion determination and presentation screen may show what is being communicated to the robot computing device to audibly reproduce. In some embodiments, a bottom part of the emotion determination and presentation screen 120 may show a robot computing device emotion (or the emotion that the robot computing device is supposed to be using when communicating to the user). In Figure 1, the robot computing device’s emotion may be displayed on the emotion determination and presentation screen 120. In Figure 1A and Figure 1B, the robot computing device’s emotion may be happy, sad, mad, surprised, neutral and/or apprehensive.
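The three-part emotion panel described above (detected user emotion on top, the outgoing message in the middle, the robot's display emotion on the bottom) might be modeled as a small validated structure. A minimal sketch, assuming the emotion vocabularies listed in the paragraph; the function name is illustrative.

```python
# Emotion vocabularies taken from the description of FIGS. 1A/1B
USER_EMOTIONS = {"happy", "sad", "neutral", "normal"}
ROBOT_EMOTIONS = {"happy", "sad", "mad", "surprised", "neutral", "apprehensive"}

def make_emotion_panel(user_emotion, outgoing_text, robot_emotion):
    """Build the three-part panel state: detected user emotion (top),
    message being sent to the robot (middle), robot display emotion (bottom)."""
    if user_emotion not in USER_EMOTIONS:
        raise ValueError(f"unknown user emotion: {user_emotion}")
    if robot_emotion not in ROBOT_EMOTIONS:
        raise ValueError(f"unknown robot emotion: {robot_emotion}")
    return {"user": user_emotion, "message": outgoing_text, "robot": robot_emotion}

panel = make_emotion_panel("happy", "Great job today!", "happy")
```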

[0024] In some embodiments, the therapist may have previously input, into the teletherapy and telemedicine assisting software 100, messages and/or themes that may be utilized in the session with the patient. In some embodiments, the telemedicine software 100 may display these prepared messages in the prepared messages screen or window 135. In some embodiments, these messages may include a prepared script that the therapist would like to engage in with the user or patient. In some embodiments, these messages may include other themes that the therapist would like to discuss with the user or patient. In some embodiments, the telemedicine assisting software 100 may have some preexisting themes and/or messages that the therapist may select from during an evaluation session. In some embodiments, these may be activities selected from the activities submenu or screen 552 and/or phrases from the phrases submenu or screen 551. In some embodiments, the teletherapy and telehealth software 326 in the Embodied cloud server 325 may include a number of session themes and/or other session overall transcripts that a therapist or medical professional may utilize in evaluation sessions with the users or children. This may be referred to as a library of session theme transcriptions and/or other session overall transcripts, or a library of session themes or existing sessions. In some embodiments, the “library” of themes or sessions may include an ASD session, an ADHD session, and/or an anxiety session. In some embodiments, the library of themes or sessions may include a cognitive theme session, a behavioral theme session, a visual theme session, an auditory theme session, and/or some other stimuli theme sessions. In some embodiments, a therapist can select from this pre-existing library of sessions and use the session as is, and/or modify the session to personalize or slightly change the existing session (which may be available to select from the activities submenu or screen 552). In some embodiments, a therapist or medical professional may create a new library element or session. In some embodiments, the therapist or medical professional may also establish new libraries for their own work. In some embodiments, a session or library marketplace may be generated and/or established so that therapists or medical professionals may exchange their newly created sessions, libraries or library components. These newly created elements may include transcripts of activities that a user or patient can engage in, emotions shown by the robot computing device during these activities and/or images shown on a display of the robot computing device during these activities or sessions.

[0025] In some embodiments, the telehealth system or software 326 may also include a session generation system or module that may automatically create new sessions by analyzing prior sessions of one or more therapists and identifying successful themes and/or conversations. In some embodiments, for example, the session generation system or module may automatically generate an ADHD session by analyzing a number of therapists’ ADHD sessions that have been uploaded to the telehealth software 326 on the Embodied cloud server. In these embodiments, the preexisting themes and/or messages may be displayed in the backlog or existing message screen and/or window 135. In some embodiments, the text messages displayed in the backlog or existing message screen 135 may include greeting messages, goodbye messages, conversation filler messages, question messages, social messages, cognitive messages, and/or emotional messages, as is illustrated in FIG. 1A and FIG. 1B. These newly created sessions may include transcripts of activities that a user or patient can engage in, emotions shown by the robot computing device during these activities and/or images shown on a display of the robot computing device during these activities or sessions.
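The prepared-message screen above groups messages by category (greetings, goodbyes, fillers, questions, social, cognitive, emotional). One plausible way to organize such a library is a category-keyed store; a minimal sketch, with class and category identifiers chosen for illustration only.

```python
from collections import defaultdict

class MessageLibrary:
    """Minimal prepared-message library grouped by the message
    categories shown in the backlog or existing message screen."""
    CATEGORIES = ("greeting", "goodbye", "filler", "question",
                  "social", "cognitive", "emotional")

    def __init__(self):
        self._messages = defaultdict(list)

    def add(self, category, text):
        # Reject categories outside the screen's fixed set
        if category not in self.CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        self._messages[category].append(text)

    def get(self, category):
        # Return a copy so callers cannot mutate the stored list
        return list(self._messages[category])

lib = MessageLibrary()
lib.add("greeting", "Hi! I'm so glad to see you.")
lib.add("question", "What did you do at school today?")
```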

[0026] FIG. 2A shows a left and center screen shot of a teletherapy or telehealth assisting software according to some embodiments. FIG. 2B shows a right screen shot of a teletherapy assisting software in a review or analysis mode, in accordance with some embodiments. In some embodiments, the telehealth or teletherapy assisting software in review or analyzation mode 200 may include a number of windows or screens. In the review or analysis mode, the telehealth or teletherapy assisting software and/or system may allow a therapist to review a prior session’s Zoom/video chat audio and video, the robot computing device audio and video and/or the transcription of the conversation during the session. In some embodiments, the telehealth assisting software and system 200 may include a button or menu 220 that identifies a user or child (e.g., in FIG. 2A, the user is Paolo), a home icon 221 to return to a telehealth assisting software home screen and/or a layouts icon (where different layouts of the telehealth assisting software may be selected). In some embodiments, the telehealth assisting software 200 may include a chat screen or conversation screen 205. In some embodiments, the chat screen or conversation screen 205 may include transcriptions of the conversation between the therapist and user or child. In these embodiments, the chat or conversation screen 205 may display what the child says in one color or font and what Moxie or the robot computing device is being instructed to say in another color or font, in order to distinguish between the child or user’s conversation and the therapist or medical professional’s (and thus the robot computing device’s) conversation. In some embodiments, the chat or conversation screen 205 may include a comment button (at the bottom of the chat or conversation screen) that allows a therapist or medical professional to input comments with regard to the conversation that occurred between the robot computing device and/or the user or child.

[0027] In some embodiments, the telehealth or teletherapy assisting software 200 may include a Zoom video, videoconferencing or videochat window 215. In some embodiments, the Zoom video or videoconferencing window 215 may display a video and audio of the user or child during the therapist video session. In this application, the terms video conferencing, video call and/or video chat may be utilized interchangeably. In some embodiments, the telehealth assisting software 200 may include a Moxie or robot computing device video window or menu 210. In some embodiments, the Moxie or robot computing device video window or menu 210 may present a video and/or audio that Moxie or the robot computing device captured during the session with the user or child. In some embodiments, the telehealth assisting software 200 may include a timeline window or menu 225. In some embodiments, the timeline window or menu 225 may identify a time or temporal point in the session. In response, the Zoom video window 215, the Moxie or robot computing device video window 210 and/or the chat or conversation window 205 may present or display the corresponding audio and video and speech transcriptions. In other words, the video chat or video conferencing video window 215, the Moxie or robot computing device video window 210 and/or the chat or conversation window may be synchronized with each other, or at least be approximately near the same point in the session. In some embodiments, the timeline window or screen 225 may allow a therapist to determine which audio (e.g., the Zoom audio or the robot computing device audio) may be presented or reproduced in the telehealth or teletherapy software or system (or else the audio captured during the Zoom session and the audio captured by the robot computing device may be playing at the same time). In some embodiments, the timeline window or screen 225 may display a volume and/or other characteristics of the user or child’s speech over time in the session, or may display different time portions of the timeline, a speaker icon or button, a volume icon or button, a play icon or button, and/or a stop icon or button.
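The timeline-driven synchronization described above (keeping the video-chat feed, the robot's feed, and the transcript at roughly the same session point) can be sketched as mapping a session timestamp into each source's local playback position via per-source start offsets. The source names and offsets below are illustrative assumptions, not the disclosed mechanism.

```python
def synchronize(clip_offsets, t):
    """Given each source's recording start offset (seconds after the
    session clock started), return the playback position within each
    source at session time t, so all windows show the same moment."""
    positions = {}
    for source, start_offset in clip_offsets.items():
        local = t - start_offset
        # None signals that this source had not started recording yet
        positions[source] = local if local >= 0 else None
    return positions

# Example: the robot's recording began 2.5 s after the video-conference one
pos = synchronize({"video_chat": 0.0, "robot": 2.5, "transcript": 0.0}, t=10.0)
```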

[0028] In some embodiments, a therapist or medical professional may select a portion of the timeline and this selected portion of the timeline may be enlarged to show only the selected portion of the timeline. Alternatively, the selected portion of the timeline may be presented in a new window. In these embodiments, the therapist may then be able to see the emotion of the child in more detail. This allows a therapist or medical professional to provide a more detailed and/or accurate evaluation.

[0029] As an example, the therapist or medical professional may include notes such as “the client was very engaged” and/or “we’ve seen improvement managing the child’s ADHD.” In some embodiments, the telehealth software may automatically analyze conversation transcriptions, the video chat audio and video, and/or the Moxie-captured audio and video, and then evaluate biomarkers, emotion and/or activity detection in these three inputs. In some embodiments, based on this analysis, the teletherapy or telehealth software may automatically highlight potentially important timeframes in the session that may be presented in the review screen or window so that the therapist or medical professional may review this specific session timeframe. In some embodiments, the telehealth software may refer to this module or code as an automatic highlight module. In some embodiments, the automatic highlight module may analyze a plurality of sessions and develop an AI-enhanced automatic highlight module that identifies important moments and/or timeframes in the analyzed sessions.
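One simple heuristic an automatic highlight module could use is to flag session timestamps where the detected emotion changes, merging changes that occur close together. This is a sketch of that one heuristic under stated assumptions (a pre-computed per-timestamp emotion timeline); the disclosed module may use richer biomarker and activity signals.

```python
def find_highlights(emotion_timeline, min_gap=5.0):
    """Flag timestamps where the detected emotion changes, merging
    changes closer together than min_gap seconds into one highlight.

    emotion_timeline: list of (seconds_from_start, emotion_label) pairs,
    assumed sorted by time.
    """
    highlights = []
    prev_emotion = None
    for t, emotion in emotion_timeline:
        if prev_emotion is not None and emotion != prev_emotion:
            if not highlights or t - highlights[-1] >= min_gap:
                highlights.append(t)
        prev_emotion = emotion
    return highlights

timeline = [(0, "neutral"), (12, "happy"), (14, "sad"), (40, "sad"), (55, "happy")]
marks = find_highlights(timeline)
```

The emotion shift at 14 s is absorbed into the highlight at 12 s because it falls inside the 5-second merge window, so only two highlight markers are produced.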

[0030] In some embodiments, the user or child metrics submenu or window 235 may include session metrics for the user and/or child. In some embodiments, a scoring module may generate the session metrics and/or parameters of the session. In some embodiments, the scoring module of the telehealth or teletherapy software may analyze the audio and video from the video conference session, the audio and video from the robot computing device or Moxie, the audio transcriptions and/or the facial datapoints to generate the session metrics and/or parameters. In some embodiments, the session metrics and/or parameters may be directed to specific actions (e.g., number of smiles or frowns, response time in conversations, number of words spoken, speed of speech) or to overall session measurements (e.g., an engagement score, an overall score, and/or a sentimental score). In other cases, other metrics displayed may be reading or vocabulary level, speech speed, or speech clarity. In some cases, a scoring module of the telehealth software may generate metrics and/or parameters including an overall emotion score, a number of breakdown episodes, an eye contact score, and/or exercise scores for different exercises. In some embodiments, the telehealth software’s scoring module may analyze the audio and video from the video conference or video chat session, the audio and video from the robot computing device or Moxie, the audio transcriptions and/or the facial datapoints to generate the session metrics and/or parameters for a plurality of sessions. In some embodiments, the scoring module may create an AI-enhanced scoring module based at least in part on the plurality of session scores. In some embodiments, the AI-enhanced scoring module may then be utilized to generate session metrics and/or parameters to provide a therapist or medical professional with suggestions on different scores for the session.
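To make the scoring idea concrete, a rough engagement score might combine a few of the per-session measurements named above (words spoken, response time, smiles). The weights and normalization constants below are placeholder assumptions for illustration, not the patent's scoring formula.

```python
def score_session(word_count, mean_response_time, smile_count, session_minutes):
    """Combine per-session measurements into a rough 0-100 engagement
    score. Weights and caps are illustrative placeholders."""
    words_per_min = word_count / session_minutes
    smiles_per_min = smile_count / session_minutes
    # Faster replies score higher; a 10 s+ average response scores 0
    responsiveness = max(0.0, 1.0 - mean_response_time / 10.0)
    raw = (0.5 * min(words_per_min / 50.0, 1.0)    # talkativeness, capped
           + 0.3 * responsiveness
           + 0.2 * min(smiles_per_min / 2.0, 1.0)) # smiling, capped
    return round(100 * raw, 1)

score = score_session(word_count=600, mean_response_time=2.0,
                      smile_count=12, session_minutes=20)
```

With these inputs (30 words/min, 2 s mean response, 0.6 smiles/min over a 20-minute session) the sketch yields a mid-range score, which a therapist could compare across sessions of the same client.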

[0031] Figure 3A illustrates a block diagram of a telehealth or teletherapy assisting system in a user interaction mode according to some embodiments. In some embodiments, the teletherapy or telehealth assisting system 300 may include a user computing device 305, a robot computing device 315, an automatic speech recognition (ASR) server and/or computing device 320, a zoom or videoconference server and/or computing device 330, a cloud server and/or computing device 325 (which may be an Embodied cloud computing device) including telehealth or teletherapy software 326, and/or a therapist computing device 335. In some embodiments, a therapist 337 may utilize a therapist computing device 335 and a user computing device 305 to initially communicate with the user or child 310 (or the parent) via a zoom or videoconference session. In some embodiments, the zoom or videoconference session on the zoom server 330 may also capture video and/or audio of the session with the user or child. In some embodiments, the user or patient 310 may interact and/or communicate with the robot computing device 315 in conversations that are captured by one or more cameras 316 and/or one or more microphones 317 of the robot computing device when the robot computing device 315 is under control of the therapist 337. In some embodiments, the robot computing device 315 may capture conversations (e.g., audio files) with the user or patient 310. In some embodiments, the robot computing device 315 may communicate the captured raw audio files to the cloud server and/or computing device 325 for later processing and/or analysis after a teletherapy or telehealth session may be completed. In some embodiments, the robot computing device 315 may also communicate the raw audio files to the automatic speech recognition server and/or computing device 320 (which may occur in real-time or close to real-time to generate transcripts).
In some embodiments, the automatic speech recognition server and/or computing device 320 may generate audio transcripts from the captured audio files and may communicate the generated audio transcript files to the cloud server and/or computing device 325. In some embodiments, the robot computing device may generate and communicate the raw video files from the conversations to the cloud server and/or computing device 325 for later processing by the teletherapy or telehealth assisting software 326. In some embodiments, the robot computing device 315 may communicate the raw video files to the cloud computing device 325 after a session has been completed. In some embodiments, the raw audio files and the raw video files from the robot computing device 315 may be combined with each other.
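The routing described above, where raw audio is sent to the ASR service in near real time while raw audio and video are batched for upload to the cloud server after the session, could be sketched as follows. The class and method names are assumptions for illustration, not an API from the patent.

```python
class SessionRouter:
    """Hypothetical sketch of the robot's file routing during a session."""

    def __init__(self, asr_service):
        self.asr = asr_service          # called in (near) real time
        self.pending_upload = []        # batched until session completion
        self.transcripts = []

    def on_audio_chunk(self, chunk):
        # Raw audio is both queued for the cloud server and transcribed now.
        self.pending_upload.append(("audio", chunk))
        self.transcripts.append(self.asr(chunk))

    def on_video_chunk(self, chunk):
        # Raw video is only queued; analysis happens after the session.
        self.pending_upload.append(("video", chunk))

    def end_session(self):
        # Everything batched during the session is released for upload here.
        uploaded, self.pending_upload = self.pending_upload, []
        return uploaded

# A stand-in ASR service; a real one would return transcript files.
router = SessionRouter(asr_service=lambda chunk: chunk.upper())
router.on_audio_chunk("hello moxie")
router.on_video_chunk("frame-0")
```

Combining the raw audio and video files, as the paragraph notes, would happen on the batched `pending_upload` list before or after transfer.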

In some embodiments, the cloud server and/or computing device 325 may include telehealth or teletherapy software 326. In some embodiments, the telehealth software 326 may be computer-readable instructions stored in one or more memory devices 327 of the cloud server and/or computing device 325 and/or may be executable by one or more processors of the cloud server and/or computing devices 325. In some embodiments, the therapist 337 may utilize the therapist computing device 335 to communicate with the cloud server and/or computing device 325 to interface with the teletherapy or telehealth software 326. In some embodiments, the therapist 337 may utilize the therapist computing device 335 to communicate with the user 310 via a zoom session by communicating with a zoom or video conferencing server and/or computing device 330. In these embodiments, the zoom or video conferencing server and/or computing device 330 may communicate with the user computing device 305. In some embodiments, the zoom or video conferencing server computing device 330 may initially store video and/or audio of the Zoom or videoconference session. In these embodiments, the zoom or video conferencing server computing device 330 may communicate the captured session video and/or audio to the Embodied server computing device 325 where the captured session video and/or audio may be stored in one or more memory devices. In some embodiments, the Zoom server or computing device 330 may communicate the captured session video and/or audio files to the cloud computing device or server 325 after completion of the user or patient telehealth or teletherapy session. In these embodiments, the Zoom or video conferencing server computing device 330 may delete the session’s video and/or audio after transferring the captured session to the Embodied server and/or computing device.

[0032] In some embodiments, the user computing device 305 may be a laptop computing device, a tablet computing device, a desktop computing device and/or other portable computing devices. In some embodiments, the therapist computing device 335 may be a laptop computing device, a tablet computing device, a desktop computing device and/or other portable computing devices. In some embodiments, the user computing device 305 and the therapist computing device 335 may be computing devices that have large enough screens to show videos from the robot computing device 315 and/or the zoom computing device or server 330 in order for the therapist or medical professional to view the videos. In some embodiments, the telehealth or teletherapy cloud computing device 325 may communicate specific files, data, parameters and/or metrics to a medical facility, hospital, therapy practice, or other medical related computing device 338. This may include an electronic health record (EHR) system.

[0033] Figure 3B illustrates a flowchart of a teletherapy or telehealth assisting system and software in a user interaction or session mode according to some embodiments. In some embodiments, in step 340, in order to initiate the telehealth session, a user or parent may initialize a robot computing device (or Moxie) 315 and/or may also initialize a user computing device 305. In some embodiments, the parent or guardian on the user computing device 305 may also initialize Zoom software and/or videochat software in order to be ready when the therapist may initiate the Zoom or videochat session on the therapist computing device 335.

[0034] In some embodiments, in step 345, a therapist may initialize teletherapy or telehealth software on a therapist computing device 335. As part of initializing the telehealth software, in step 350, the therapist or medical professional may initialize or communicate with a zoom or videochat recording session with the patient or user on the user’s computing device 305. In some embodiments, the therapist may have previously sent an email and/or invitation to the parent of the user inviting them to a videochat or Zoom session at a specific time and day. In some embodiments, in step 355, a parent of the patient may either login to the zoom or videochat session at the scheduled date and/or time via the supplied link and/or may respond to a request from the therapist for a zoom or videochat recording session utilizing the user computing device 305.

[0035] In some embodiments, in step 357, the robot computing device 315 may communicate the raw audio and/or raw video to the cloud server computing device 325. In some embodiments, the robot computing device 315 may communicate the raw audio and/or raw video files to the cloud computing device 325 after a teletherapy or telehealth session is over. In some embodiments, the therapist 337 may utilize the therapist computing device 335 to interface and/or interact with the cloud server computing device 325 and run the teletherapy or telehealth software 326. In one embodiment, the therapist may see the audio and video from the Zoom or videoconference call. In an alternative embodiment, the therapist 337 may be able to see both the 1) zoom or videoconference video and/or audio and/or 2) the robot computing device video and/or audio, but they may not be fully synchronized because both video streams may be live streamed and may not be transported across the same networks. In some embodiments, the therapist might be able to select which video stream to be shown on window 110 of Figure 1A. In some embodiments, the therapist 337 might be able to switch between video streams to better understand the emotional state or the behavior of the patient.

[0036] In some embodiments, in step 360, the therapist may begin to engage the child and/or user in a conversation. As an example, in step 360, the therapist may utilize existing messages and/or prepared messages (as discussed with respect to Figures 1A and 1B) to communicate with the user or child. In these embodiments, the therapist may select an existing message from the list in the backlog message or existing screen and/or window 135 (e.g., greetings message or a social message) or the prepared messages screen and/or window 105 (e.g., “That’s right” or “Very Good”). In other embodiments, in step 360, the therapist may type in a message to the chat submenu or window 120 and this input may be communicated to the user via the robot computing device 315. In some embodiments, the therapist 337 may enter the desired phrase into the teletherapy or telehealth software on the therapist computing device 335, which communicates the input or selected phrase to the cloud server computing device 325, which in turn communicates the selected phrase to the robot computing device or Moxie 315. In some embodiments, the therapist might select a particular emotion to be shown when presenting the input or selected phrase through Moxie, and a file including a particular emotion selection may be communicated to the robot computing device 315. In some embodiments, the therapist or medical professional may also input a request for a patient or client to engage in a specific activity (e.g., exercise, animal breathing, meditation) and these may be semi-autonomous activities.
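A selected phrase with an optional emotion or activity could plausibly travel from the therapist console to the robot as a small structured message, as sketched below. The JSON field names are illustrative assumptions, not a message format taken from the patent.

```python
import json

def build_robot_message(phrase, emotion=None, activity=None):
    """Hypothetical payload the therapist console might send toward the robot."""
    msg = {"type": "speak", "phrase": phrase}
    if emotion is not None:
        msg["emotion"] = emotion     # facial expression to display while speaking
    if activity is not None:
        # A semi-autonomous activity request, e.g. "animal_breathing".
        msg["type"] = "activity"
        msg["activity"] = activity
    return json.dumps(msg)

payload = build_robot_message("Very Good", emotion="happy")
```

In the flow described above, the console would send this payload to the cloud server 325, which would relay it to the robot computing device 315.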

[0037] In some embodiments, in step 365, the robot computing device or Moxie 315 may speak or audibly reproduce the input or selected phrase along with associated facial expressions and/or gestures based on the received input messages from step 360. In some embodiments, the robot computing device 315 may begin to engage in the selected activity or semi-autonomous activity. In some embodiments, in step 367, the user or child may respond to the robot computing device’s speech and/or gestures by speaking, making gestures, engaging in the activity, and/or expressing emotions. In some embodiments, in step 370, the robot computing device may capture raw audio files based on the user’s speech utilizing one or more microphones 317. In some embodiments, in step 370, the robot computing device may capture raw video files that capture gestures and/or facial expressions made by the user during the response utilizing the one or more imaging devices 316 of the robot computing device. In some embodiments, in step 375, the robot computing device may communicate the raw audio files in real-time to an automatic speech recognition system 320 which may generate audio transcription files. In some embodiments, the automatic speech recognition system 320 may communicate the audio transcription files to the cloud server computing device 325. In some embodiments, the automatic speech recognition system 320 may communicate the audio transcription files back to the robot computing device 315, which will then communicate the audio transcription files to the cloud server computing device 325. In alternative embodiments, the automatic speech recognition system 320 may be located on a same computing device as the cloud server computing device 325 and/or owned by the same company. For example, Embodied may develop their own automatic speech recognition engine and it may be located on the same cloud server computing device as other Embodied software.
In some embodiments, in step 380, the robot computing device 315 may generate facial datapoints from the captured raw video files. In some embodiments, the facial datapoint generation may occur on the cloud server computing device 325. In some embodiments, in step 385, the robot computing device may communicate the raw video files and the raw audio files to the cloud server computing device 325 (which may occur after the teletherapy or telehealth session has ended). In some embodiments, in step 385, if the facial datapoint files and/or the audio transcript files are generated and/or stored on the robot computing device 315, the robot computing device may communicate the facial datapoint files and/or the audio transcript files to the cloud server computing device 325 (which may occur after the teletherapy or telehealth session has ended). In some embodiments, the raw video files, the raw audio files, the facial datapoint files and/or the audio transcript files may be sent together after a teletherapy or telehealth session has ended. The data files described directly above may then be analyzed as discussed in detail below. In some embodiments, in step 390, the zoom computing device 330 may communicate and/or transmit the zoom audio and video files to the teletherapy or telehealth cloud computing device 325.

[0038] In some embodiments, the parent of the user or child may also have a parent software application stored on a parent mobile computing device (e.g., a mobile communication device, a smart phone, or a tablet computing device). In these embodiments, the parent may have initially set up Moxie or the robot computing device 315. In some embodiments, the parent software application may store specific user information. In some embodiments, a parent (and thus a user and child) may have a user account on the Embodied cloud server device 325 that may also store facial datapoint information (from non-telehealth sessions), audio transcript information (from non-telehealth sessions), activity data (e.g., how long children are using Moxie, whether children read a book with the robot), insight data (such as improvements in language skills, how long the child was engaged with the robot, number of words read per minute), and/or telemetry/sensor data (for the robot computing device 315). In some embodiments, this user account (and the information identified above) may be stored separately (in a different logical memory location) from the teletherapy or telehealth session data. In some embodiments, the teletherapy or telehealth session data may be stored in a different cloud server computing device 325 than a cloud server computing device where regular robot computing device user account parameters and/or user account data (insight data, sensor/telemetry data, facial datapoint information, audio transcript information, and activity data) are being stored.

[0039] In some embodiments, the information generated during a telehealth session may be stored in a user telehealth account in the Embodied cloud server computing device 325. This teletherapy or telehealth session information may be labeled with and/or associated with a unique session ID number. In other embodiments, the teletherapy or telehealth session information may be labeled with and/or associated with a unique therapist ID number and/or a unique patient ID number.
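The indexing described above, where session data can be looked up by session ID, therapist ID, or patient ID, might be sketched as follows. This is an in-memory stand-in for the database; the identifiers and structure are illustrative, not the patent's schema.

```python
class SessionStore:
    """Sketch of indexing session data by session, therapist, and patient IDs."""

    def __init__(self):
        self.by_session = {}      # unique session ID -> record
        self.by_therapist = {}    # therapist ID -> list of records
        self.by_patient = {}      # patient ID -> list of records

    def add(self, session_id, therapist_id, patient_id, data):
        record = {
            "session_id": session_id,
            "therapist_id": therapist_id,
            "patient_id": patient_id,
            "data": data,
        }
        # One record, reachable through all three indexes.
        self.by_session[session_id] = record
        self.by_therapist.setdefault(therapist_id, []).append(record)
        self.by_patient.setdefault(patient_id, []).append(record)
        return record

store = SessionStore()
store.add("session-1", "therapist-A", "patient-1", {"notes": "first session"})
store.add("session-2", "therapist-A", "patient-1", {"notes": "second session"})
```

A production system would use database indexes rather than dictionaries, but the lookup paths are the same.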

[0040] In these embodiments, the therapist and/or medical professional may engage in an extended conversation with the user and/or the child. In these embodiments, the therapist and/or medical professional may select different phrases to speak to the user from the prepared phrases, the different system phrases and/or phrases generated based on the responses the user has already provided in the session (as well as to engage in specific activities). In response, the user may respond to the spoken phrases and the robot computing device may capture the raw audio files and/or the raw video files as is discussed above. In addition, facial datapoint files and/or speech transcription files may be generated from the raw video files and/or the raw audio files. This sequence of steps 360 to 390 may be repeated until the session and/or the conversation between the therapist and the user or child ends.

[0041] An example of a conversation that may take place between the medical professional or therapist and the user is as follows. In some embodiments, the therapist may ask some questions through the robot computing device or Moxie. Therapist: How have you been doing? Child: I am doing good. Therapist: Anything exciting going on at school? Child: We are learning about art and I am really excited. Therapist: That is great, do you like to draw? Child: Yes, and also to paint. Therapist: Any questions you want to ask me? Child: Not right now. In some embodiments, the therapist may then engage in an activity, such as a regulation activity. In some embodiments, an illustrative regulation activity may include the robot computing device giving a command with a deeper instruction from the therapist. For example, the therapist may utilize Moxie to give commands to walk the child through a yoga activity, a movement activity, and/or a sensory activity to settle the child down (e.g., which may be referred to as regulating the child). As another example, the therapist may utilize the robot computing device or Moxie to work on meeting a child’s gross motor goals (e.g., utilizing strengthening exercises). In these examples, a parent or guardian may assist the user or child in performing the action and/or activity, if necessary. In some embodiments, the therapist may then utilize Moxie or the robot computing device to walk the child through a cognitive behavioral therapy (CBT) activity. For example, this may be a zones of regulation activity, where the robot computing device may ask questions and lead the activity with the child (with some therapist assistance if needed). As an illustrative example, a child may have a regulation bingo card. In some embodiments, the robot computing device may walk the child through the different zones and emotions of the regulation bingo card.
For example, the robot computing device may ask the user what the emotions were in the blue zone, red zone, green zone and/or yellow zone. In some embodiments, the robot computing device may engage in a game of regulation bingo with the child. As an example, the robot computing device may call out a name of an emotion (e.g., sadness). The child may then find the picture, place the picture on the associated picture on the bingo card and then tell the robot computing device what zone the picture and/or image is in. In some embodiments, the bingo game between the user and the robot computing device continues until the user has bingo. In these embodiments, once the child has bingo, the child may name off the emotions on the cards and squares to the robot computing device. In some embodiments, if the child needs a break during the CBT activity in order to regulate, the robot computing device can perform their favorite regulation activity (or the child can engage in the regulation activity themselves). In some embodiments, the regulation activity may be a movement break, a music break, a dance break and/or a sensory break. In some embodiments, a therapist may determine an end of session co-regulation activity. In some embodiments, for example, the robot computing device may lead the child through a breathing exercise as a regulation activity. In some embodiments, the robot computing device may end the session with an affirmation for the session (e.g., I am in control of my emotions) and may request that the user repeat and/or agree with the affirmation.

[0042] FIG. 4A illustrates a block diagram of a teletherapy or telehealth assisting system utilized by a therapist or medical professional, in accordance with some embodiments. In some embodiments, a therapist or a medical professional may be utilizing a teletherapy or telehealth assisting system along with a robot computing device in order to analyze a child’s interaction with the robot. In some embodiments, the teletherapy or telehealth assisting system 400 may be operating in an analysis or reviewing mode after a session or multiple sessions have occurred with the user or child. In some embodiments, as illustrated in Figure 4A, telemedicine or telehealth software 410 may be installed on a cloud server computing device 405 (or Embodied cloud server 405) in order to analyze prior teletherapy or telehealth sessions (e.g., it may be stored in one or more memory devices 327 of the cloud computing device 405). In some embodiments, as illustrated in Figure 4A, the teletherapy or telehealth software 410 may receive one or more inputs from the therapy sessions via a robot computing device (not shown in Figure 4A), or from multiple robot computing devices, as well as from a zoom or video conference server computing device 435. In some embodiments, the one or more inputs may be raw audio files and video files 406 of a session from the zoom or video conference server computing device 435, raw audio input files 401 from a Moxie or robot computing device; and/or audio transcript files 402 from an automatic speech recognition server or a robot computing device. In some embodiments, the one or more inputs may be video input files 404 from a robot computing device and/or one or more video datapoint files 403 from a robot computing device.

[0043] In some embodiments, the zoom or video conferencing session files 406 from the zoom or video conferencing server computing device 435 may be stored in the zoom or video conferencing video storage 420 of the cloud server computing device 405. The zoom or video conferencing session files 406 may include audio files and/or video files. In some embodiments, the raw video files 404 captured by the robot computing device may be stored in the robot video storage 422 of the cloud server computing device 405. In some embodiments, the transcript files 402 may be stored in the transcript storage 425 of the cloud server computing device 405. In some embodiments, the video datapoint files 403 may be stored in the datapoint storage 426 of the cloud server computing device 405. In some embodiments, the raw audio files 401 from the robot computing device may be stored in the robot audio storage 421 of the Embodied cloud server computing device 405. Although zoom storage 420, robot video storage 422, robot audio storage 421, transcript storage 425, and datapoint storage 426 are shown as separate blocks in Figure 4A, all or much of the session-related data in these storage modules may be located in a database in a cloud computing device 405 and may be indexed or associated with a unique session identifier. In some embodiments, a storage area in a cloud storage device may contain or include most if not all session data (e.g., audio input files 401, transcript input files 402, video input files 404, video datapoint input files 403, audio expression datapoints, and/or audio expression parameters). Other identifiers such as therapist identifiers and/or patient identifiers may also index the files listed above (e.g., audio input files 401, transcript input files 402, video input files 404 and datapoint input files 403).

[0044] Figure 6 illustrates a logical structure of a database in a teletherapy or telehealth system according to some embodiments. In some embodiments, the database may be stored in one or more memory devices 327 of the teletherapy or telehealth cloud computing device 325. If a session is completed with a therapist or a medical professional utilizing the telehealth or teletherapy assisting system, patient or client and session data may be stored and/or indexed according to the therapist or medical professional and/or the unique session. This includes some or all of the files described above. As an example, as shown in Figure 6, the teletherapy or telehealth system may include a database having a logical database structure 600. Although only two therapists are shown in Figure 6, the database may be scaled and/or expanded to accommodate a large number of therapists or medical professionals, e.g., three to 1,000,000. In Figure 6, the database illustrates two therapists’ (i.e., therapists 610 and 630) sessions and/or associated data, information, parameters and/or metrics. In some embodiments, a therapist A 610 may have two patients that have session data, information, parameters and/or metrics included in the database (e.g., patient 1 615 and patient 2 620). In these embodiments, patient 1 615 may have participated in two sessions (e.g., sessions 1 and 2) and patient 2 620 may have participated in two sessions (e.g., sessions 1 and 2). In these embodiments, patient 1’s data for session 1 may be stored in data storage or bucket 616 and patient 1’s data for session 2 may be stored in data storage or bucket 617. Similarly, in these embodiments, patient 2’s data for session 1 may be stored in data storage or bucket 621 and patient 2’s data for session 2 may be stored in data storage or bucket 622. The session data, information, parameters and/or metrics may be indexed via any of the indexing identifiers described above or below.
Each of the data storage areas or buckets may have a separate or unique identifier. In some embodiments, therapist B 630 may have two patients (e.g., patient 1 of therapist B 635 and patient 2 of therapist B 640). In these embodiments, patient 1 of therapist B 635 may participate in two sessions with the teletherapy or telehealth system and patient 2 of therapist B may participate in three sessions with the teletherapy or telehealth assisting system. In these embodiments, patient 1’s data of therapist B for session 1 may be stored in data storage or bucket 636 and patient 1’s data of therapist B for session 2 may be stored in data storage or bucket 637. Similarly, in these embodiments, patient 2’s data of therapist B for session 1 may be stored in data storage or bucket 641, patient 2’s data of therapist B for session 2 may be stored in data storage or bucket 642 and patient 2’s data of therapist B for session 3 may be stored in data storage or bucket 643. The session data, information, parameters and/or metrics may be indexed via any of the indexing identifiers described above or below.

Each of the data storage areas or buckets may have a separate or unique identifier. Again, these are logical representations of how the data is stored in the database; the physical structure of the database may be presented or structured differently.

[0045] In some embodiments, the raw video files 404 may be input to a video analysis module 411 of the telehealth software 410 and the video analysis module 411 may generate video parameters and/or metrics. In some embodiments, the video parameters and/or metrics may be input into an automatic assessment module 414. In some embodiments, the raw audio files 401 may be input into an audio analysis module 413 and the audio analysis module 413 may generate audio parameters and/or metrics. In these embodiments, the generated audio parameters and/or metrics may be input into an automatic assessment module 414. In some embodiments, the facial datapoint files 403 may be input into a video datapoint module 409 and the video datapoint module 409 may generate datapoint parameters and/or metrics. In some embodiments, the video analysis module 411 might create additional datapoints that might be shared with the datapoint module 409 to be integrated with the datapoint files 403 to generate datapoint parameters and/or metrics. In some cases, the video analysis module 411 may also create biomarkers and/or biometrics from the raw video files. In some embodiments, the generated datapoint parameters and/or metrics may be transferred and/or input into the automatic assessment module 414. In some embodiments, the audio transcript files may be input into a text analysis module 412 and transcript parameters and/or metrics may be generated. In some embodiments, the text analysis module 412 may communicate the generated transcript parameters and/or metrics to the automatic assessment module 414. In some embodiments, the audio transcript files may be stored in the transcript storage 425.
In some embodiments, the facial datapoint files may be stored in the facial datapoint storage 426. In some embodiments, the automatic assessment module 414 may analyze the generated audio parameters and/or metrics; the generated datapoint parameters and/or metrics; the generated video parameters and/or metrics and/or the generated audio transcript parameters and/or metrics and may generate overall session parameters and/or metrics for the user session.
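The fan-in described above, where per-module parameters and metrics feed the automatic assessment module, can be sketched as a merge plus an illustrative overall score. The averaging rule and key names are assumptions for illustration, not the patent's method.

```python
def automatic_assessment(video_metrics, audio_metrics,
                         datapoint_metrics, transcript_metrics):
    """Sketch of the assessment step: merge per-module outputs and derive
    an overall session score (simple averaging, illustrative only)."""
    combined = {}
    for metrics in (video_metrics, audio_metrics,
                    datapoint_metrics, transcript_metrics):
        combined.update(metrics)
    # Toy overall score: average of whatever normalized "*_score" values exist.
    scores = [v for k, v in combined.items() if k.endswith("_score")]
    combined["overall_score"] = sum(scores) / len(scores) if scores else None
    return combined

session = automatic_assessment(
    {"engagement_score": 0.8},        # from the video analysis module 411
    {"speech_speed_wpm": 120.0},      # from the audio analysis module 413
    {"mood_score": 0.6},              # from the video datapoint module 409
    {"vocabulary_level": 3},          # from the text analysis module 412
)
```

The inline comments map each argument to the module numbers in the paragraph above; the actual metric names each module emits are not specified in the patent.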

[0046] In some embodiments, the automatic assessment module 414 may combine all or some of the metrics described above in order to generate user engagement parameters and/or metrics, user eye contact and/or eye gaze parameters and/or metrics, user gesture identifiers, user gesture parameters and/or metrics, user expression identifiers or parameters and/or metrics, and user mood identifiers or parameters and/or metrics. In some embodiments, the automatic assessment module 414 may combine all or some of the above-identified metrics and/or parameters in order to generate user intonation parameters and/or metrics; and user pitch parameters and/or metrics. In some embodiments, the automatic assessment module 414 may also generate scores based on wording choice parameters and/or determinations of slight variations in interaction with the robot computing device and/or therapist. In some embodiments, the automatic assessment module 414 may analyze and/or combine some or all of the above-identified parameters and/or metrics in order to analyze and/or determine a user’s overall conversation judgment. In some embodiments, the automatic assessment module 414 may be continuously monitoring and/or analyzing the audio files, video files, transcript files and/or facial datapoint files and providing this information to enhance and aid a therapist in judging or analyzing a user session. In some embodiments, the automatic assessment module 414 may communicate and/or interface with a timeline module to provide the parameters and/or metrics described above for the one or more timepoints in the session. In some embodiments, an automatic marking module may mark a transcription of the session and/or also the timeline in the session to identify areas of interest and/or further study for the therapist. This is illustrated in Figure 2 by reference number 207 in the transcription and reference number 208 in the timeline.
In some embodiments, the therapist computing device 430 may also communicate with the automatic assessment module 414 with additional markings and/or parameters input by the therapist after the therapist participates in and/or reviews the user or patient session. Please note that everything described above with respect to the automatic assessment module 414 may be occurring automatically and in the background during a session with the child and/or after the child or user has completed the session. As previously discussed, the Al-enhanced automatic highlight module may be utilized to mark the session. There is no human intervention involved in this assessment.

[0047] In some embodiments, the automatic assessment module 414 may analyze the generated audio transcription files to determine expressions and feelings of the user and/or child during a teletherapy or telehealth session with the user or child. In doing this, the automatic assessment module 414 may generate audio expression datapoints from the audio transcription files. In some implementations, the audio expression datapoints may be stored in one or more memory devices or a database of the cloud server computing device 405 and the audio expression datapoints may be indexed or associated with the unique session identifier.

[0048] In some implementations, computer-readable instructions may be executable by one or more processors (e.g., the automatic assessment module 414) in the cloud server computing device 405 to combine facial expression datapoints or parameters, audio expression parameters, and/or audio expression datapoints to generate combined user expression parameters and/or datapoints. In these implementations, the combined user expression parameters and/or datapoints may be utilized by the automatic assessment module 414 of the cloud server computing device 405 to determine aggregated expression and/or feeling parameters of the user and/or child. In some implementations, the aggregated expression and/or feeling parameters may be stored in one or more memory devices or a database of the cloud server computing device 405 and may be indexed or associated with the unique session identifier.
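Combining facial and audio expression datapoints into aggregated expression or feeling parameters might look like the following sketch, which averages whatever modalities report each emotion. Equal weighting across modalities is an assumption; the patent does not specify a combination rule.

```python
def aggregate_expression(facial_datapoints, audio_datapoints):
    """Sketch: merge per-emotion scores from the facial (video) and audio
    modalities into aggregated feeling parameters."""
    emotions = set(facial_datapoints) | set(audio_datapoints)
    aggregated = {}
    for emotion in emotions:
        # Average only the modalities that actually reported this emotion.
        values = []
        if emotion in facial_datapoints:
            values.append(facial_datapoints[emotion])
        if emotion in audio_datapoints:
            values.append(audio_datapoints[emotion])
        aggregated[emotion] = sum(values) / len(values)
    return aggregated

feelings = aggregate_expression(
    facial_datapoints={"happy": 0.8, "sad": 0.2},
    audio_datapoints={"happy": 0.6},
)
```

The resulting dictionary would then be stored and indexed by the unique session identifier, as the paragraph above describes.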

[0049] In some embodiments, the therapist 441 may utilize a therapist computing device 440 to run or execute the telehealth or telemedicine assisting software and interface with the telehealth software 410 on the cloud server computing device 405. In Figure 4A, the therapist may be utilizing the telehealth software in an analysis and/or assessment mode. In some embodiments, the therapist may initiate the software screens or windows described in Figure 2 above. In some embodiments, the therapist may make comments or notes regarding the session reviewed in the telehealth software 410. In some embodiments, the comments and/or notes for the session may be stored in the therapist module 430 and may be associated with a specific user session and a unique session identifier. In some embodiments, the session comments and/or notes may be stored in a session storage area or bucket and may be indexed and/or identified by the unique session identifier. In some embodiments, the user engagement parameters and/or metrics, user eye contact and/or eye gaze parameters and/or metrics, user gesture identifiers, user gesture parameters and/or metrics, user expression identifiers or parameters and/or metrics, and user mood identifiers or parameters and/or metrics may also be stored in a session storage area or bucket and may be indexed and/or associated with the unique session identifier. In some embodiments, the same session storage area or bucket (as shown in Figure 6) that holds the above-identified metrics and parameters and the session comments or notes files may also include the raw audio and video teleconference files, the robot computing device raw audio files, the robot computing device raw video files, the audio transcript files, the video datapoint files, the audio expression datapoints, the audio expression parameters, and/or the aggregated expression and/or feeling parameters.
As discussed with respect to Figure 6, all or some of this data may be stored with a unique session ID and be associated with the patient or client and/or the therapist or medical professional. As noted with respect to Figure 3A, all or some of the above-identified data may also be transferred and/or communicated to a medical facility, hospital, or therapy practice computing device, including but not limited to electronic health record system 338.
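The session storage area or bucket described above can be modeled as a single keyed store in which every session artifact is indexed by the unique session identifier. The artifact names and the storage URI below are hypothetical examples, not taken from the patent.

```python
def store_session_bucket(db, session_id, **artifacts):
    """Place session artifacts into one storage area ("bucket") indexed
    by the unique session identifier. `db` stands in for the database
    of the cloud server computing device 405."""
    bucket = db.setdefault(session_id, {})
    bucket.update(artifacts)
    return bucket

db = {}
# Artifact names and the storage URI are hypothetical examples.
store_session_bucket(
    db, "session-0001",
    engagement_metrics={"eye_contact_pct": 62},
    therapist_notes="Client engaged well with the breathing activity.",
    raw_audio_uri="s3://sessions/session-0001/robot_audio.wav",
)
# Later additions land in the same bucket for the same session ID.
store_session_bucket(db, "session-0001",
                     transcript_uri="s3://sessions/session-0001/transcript.txt")
artifacts = db["session-0001"]
```

Keeping everything under one session key is what makes the later transfer of a whole session to another therapist or to an electronic health record system straightforward.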

[0050] FIG. 4B illustrates a flowchart of a teletherapy or telehealth assisting system and method interacting with a therapist or medical professional in an analysis and/or assessment mode, in accordance with some embodiments. In some embodiments, after the user session has been completed or as the user session is occurring, the teletherapy or telehealth software 410 stored on the cloud server or computing device may be executable by the one or more processors of the cloud server or computing device 405 and may receive raw audio data files, raw video data files, voice transcription files and/or video data point files from the robot computing device or Moxie. In some embodiments, in step 455, a text or transcription analysis module 412 may analyze the voice transcription files and generate transcription parameters or metrics (e.g., conversation parameters or metrics). As an example, the conversation parameters or metrics may include a response time for each conversation interaction between the user and/or the robot computing device. In some embodiments, in step 456, a video assessment module 411 may analyze the raw video files and/or a datapoint module 409 may analyze the video datapoints from the robot computing device and may generate video conversation parameters and/or metrics and/or video datapoints parameters and/or metrics. As an illustrative example, the video conversation metrics or parameters may include a number of smiles that the user had during the session, a user’s eye gaze during the session and/or any gestures a user made during the session conversation. As an example, the video datapoint metrics and/or parameters may include how many mouth changes or eye positions that a user had during a session. In some embodiments, in step 457, an audio assessment or analysis module 413 may analyze the raw audio files and may generate audio conversation parameters and/or metrics. 
As an example, the audio conversation metrics or parameters may be a response time between verbal exchanges, a speed of the user’s speech, and/or a pitch of the user’s speech. In some embodiments, the audio conversation parameters and/or metrics, the video conversation parameters and/or metrics, the video datapoint conversation parameters and/or metrics, and/or the transcript conversation parameters and/or metrics may be communicated and/or input to an automatic assessment module 414. In some embodiments, in step 460, the automatic assessment module 414 may analyze these parameters and/or metrics and may generate overall conversation parameters, metrics, identifiers, and/or scores. In these embodiments, the overall conversation parameters, metrics, identifiers and/or scores may be aggregated information that takes into consideration audio and/or video responses and may not just be focused on one particular area. As an example, the overall conversation parameters, metrics, identifiers, and/or scores may include a user engagement score, a sentiment score (e.g., analyzing the sentiment of the conversation), an overall conversation score, a vocabulary level score, a reading speed score, a number of activities completed, and/or other patient or user performance metrics.
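A minimal sketch of how the automatic assessment module 414 might aggregate per-modality metrics into overall scores. The weights, the 10-second response-time normalization, and the metric names are illustrative assumptions; the patent does not disclose a specific formula.

```python
def overall_conversation_scores(audio_metrics, video_metrics, transcript_metrics):
    """Aggregate per-modality conversation metrics into overall scores.
    Weights and the 10 s response-time cap are illustrative assumptions."""
    responsiveness = 1.0 - min(audio_metrics["mean_response_time_s"] / 10.0, 1.0)
    engagement = (
        0.4 * video_metrics["smile_count_norm"]      # normalized smile count
        + 0.3 * video_metrics["eye_gaze_norm"]       # normalized eye-gaze time
        + 0.3 * responsiveness                       # faster replies score higher
    )
    return {
        "engagement_score": round(engagement, 3),
        "vocabulary_level_score": transcript_metrics["vocab_level"],
        "activities_completed": transcript_metrics["activities_completed"],
    }

scores = overall_conversation_scores(
    audio_metrics={"mean_response_time_s": 2.0},
    video_metrics={"smile_count_norm": 0.5, "eye_gaze_norm": 0.8},
    transcript_metrics={"vocab_level": 3, "activities_completed": 4},
)
```

The point of the aggregation is that the engagement score draws on audio and video responses together rather than any single modality.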

[0051] In some embodiments, the automatic assessment software 414 may also generate common medical diagnoses associated with the calculated parameters, metrics and/or identifiers. This is not a standalone medical diagnosis and may be utilized only to assist a therapist and/or medical professional in analyzing a user’s behavior or performance. In some embodiments, the automatic assessment module 414 may also assist a therapist or medical professional in helping to determine a medical diagnosis or therapeutic diagnosis of the patient, client or user. In some embodiments, the automatic assessment module 414 may analyze the teletherapy or telehealth session parameters and metrics (e.g., the user engagement parameters and/or metrics, user response time, user eye contact and/or eye gaze parameters and/or metrics, user gesture identifiers, user gesture parameters and/or metrics, user expression identifiers or parameters and/or metrics, user mood identifiers or parameters and/or metrics) and compare the teletherapy or telehealth session parameters and metrics to therapeutic condition parameters or medical condition parameters to determine if the parameters are similar or indicate a high probability that a user or child may have a medical condition and/or mental condition. In these embodiments, the automatic assessment module 414 is only providing a potential medical condition and/or mental condition of the user or child to assist the therapist or medical professional in making a diagnosis. The automatic assessment module 414 is not attempting to take the place of a therapist or medical professional; it is just providing the therapist or medical professional with statistical information and parameters.
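The comparison of session metrics against stored condition parameter profiles could be sketched with a simple similarity measure. Cosine similarity, the 0.9 threshold, and the profile names below are illustrative stand-ins; as the text stresses, the output is only an assistive hint for the therapist, never a diagnosis.

```python
import math

def flag_similar_conditions(session_metrics, condition_profiles, threshold=0.9):
    """Return condition profiles highly similar to the session metrics.
    Cosine similarity and the threshold are illustrative stand-ins; the
    result is an assistive hint, never a standalone diagnosis."""
    keys = sorted(session_metrics)
    v = [session_metrics[k] for k in keys]
    flags = []
    for name, profile in condition_profiles.items():
        p = [profile.get(k, 0.0) for k in keys]
        dot = sum(a * b for a, b in zip(v, p))
        norm = math.hypot(*v) * math.hypot(*p)
        similarity = dot / norm if norm else 0.0
        if similarity >= threshold:
            flags.append((name, round(similarity, 3)))
    return flags

# Hypothetical normalized metrics and profiles, for illustration only.
session = {"eye_contact": 0.2, "response_time": 0.9}
profiles = {
    "profile_a": {"eye_contact": 0.2, "response_time": 0.9},
    "profile_b": {"eye_contact": 0.9, "response_time": 0.1},
}
flags = flag_similar_conditions(session, profiles)
```

Only profiles above the similarity threshold are surfaced, mirroring the "high probability" language of the paragraph above.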

[0052] In some embodiments, in step 465, the generated transcription conversation parameters and metrics, the generated video conversation parameters and metrics, the generated video datapoint conversation parameters and metrics, the generated audio conversation parameters and metrics, the overall conversation parameters, metrics, identifiers and/or scores, and/or any suggested diagnosis parameter or identifier may be stored in a database in one or more memory devices of the cloud server computing device. In some embodiments, the teletherapy or telehealth automatic assessment module 414 may generate a session conversation summary file. The session conversation summary file is automatically generated and summarizes what the assessment module 414 determines has occurred during the session. In some embodiments, the teletherapy or telehealth automatic assessment module 414 may generate a session conversation context file. The session conversation context file may be automatically generated and summarizes a conversation context of what the assessment module 414 determines has occurred during the session. In some embodiments, the session conversation summary file and/or the session conversation context file may be stored in one or more memory devices of the teletherapy or telehealth cloud computing device 325 (and/or in the therapist module 430). The actions in steps 455 - 465 may occur automatically and without any therapist interaction and/or may take place in the background during and/or after the conversation session between the user and the therapist and robot computing device. In other words, this portion of the teletherapy or telehealth assisting software may be executing or running at any time. In some embodiments, the generated parameters and/or metrics may be utilized in real time by the therapist and/or medical professional to assist in deciding a next step in the session.

[0053] In some embodiments, the teletherapy or telehealth software may time synchronize the captured zoom audio / video with the captured robot computing device audio / video. In some embodiments, this synchronization is important because the captured zoom audio / video and the robot computing device audio / video will be presented to the therapist or medical professional to review the session in the telehealth software.
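Time synchronization of the zoom capture and the robot computing device capture can be reduced to aligning the two recordings on a shared timeline. This offset-based sketch assumes each recording carries a wall-clock start timestamp, which the patent does not state explicitly.

```python
def synchronized_offsets(zoom_start_ts, robot_start_ts, session_time_s):
    """Map a position on the shared session timeline to seek offsets in
    the zoom recording and the robot computing device recording.
    Assumes each capture carries a wall-clock start timestamp (epoch s)."""
    # The later-starting stream defines time zero of the shared timeline.
    common_start = max(zoom_start_ts, robot_start_ts)
    zoom_offset = (common_start - zoom_start_ts) + session_time_s
    robot_offset = (common_start - robot_start_ts) + session_time_s
    return zoom_offset, robot_offset

# Zoom began recording 2.5 s before the robot capture; at 60 s into the
# shared timeline each file must be sought to a different position.
zoom_seek, robot_seek = synchronized_offsets(1000.0, 1002.5, 60.0)
```

With these offsets both recordings (and the transcript) can be presented side by side at the same moment of the session, which is what the review windows require.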

[0054] In some embodiments, a therapist or medical professional may determine that they want to analyze and/or assess a user’s session. In some embodiments, in step 470, the therapist may log in to the therapist computing device and may initiate the telehealth software on the therapist computing device, which interfaces with the telehealth software on the cloud server computing device. In some embodiments, in step 475, the therapist may select which patient or user and session they would like to review by selecting a button or icon in the telehealth or teletherapy software. In some embodiments, after the patient has been selected, a display screen and/or menu for an assessment or analysis mode may be displayed on a monitor or screen of the therapist computing device. Figure 2 illustrates a representative menu screen for the assessment or analysis mode of the telehealth or telemedicine software.

[0055] In some embodiments, in step 480, after a user or patient and/or session is selected, the teletherapy or telehealth software 410 may retrieve the zoom recording associated with the user and/or the session that the therapist selected. In some embodiments, the zoom or videochat recording may be retrieved from a zoom video recording storage module 420 in one or more memory devices of the cloud server computing device 405 and may be displayed in the zoom window of the telehealth or telemedicine software (reference number 205 in Figure 2).

[0056] In some embodiments, in step 482, the audio / video recorded by the robot computing device or Moxie may be retrieved from the robot video storage 422 of the cloud server computing device 405 and may be displayed in the robot computing device window of the telehealth or telemedicine software (reference number 215 in Figure 2).

[0057] In some embodiments, in step 482, the stored transcript data for the user’s session may be retrieved from the transcript storage 425 of the cloud server computing device, and time synchronized portions of the stored transcript data may be presented in the transcript window of the telehealth software (reference number 205 in Figure 2). In these embodiments, the captured zoom audio / video, the captured robot computing device audio / video and the transcript data may be time synchronized. In other words, each of these windows may be presenting audio / video files and/or transcript data for the same period of time of the selected user’s session.

[0058] In some embodiments, in step 485, the telehealth or teletherapy assisting software may generate a timeline display associated with the selected user’s session. For example, if there is a 20 minute session, a 20 minute timeline will be presented in the timeline menu or window (e.g., reference number 225 in Figure 2). In some embodiments, the timeline menu or window allows a therapist or medical professional to move to different parts of the session (e.g., select a specific time in the session). In these embodiments, the different windows (e.g., the zoom or videoconferencing video window 215, the robot computing device video window 210 and the transcript window 205) will also move to the selected specific time in the session. In some embodiments, the timeline window also allows a therapist to adjust the volume, to play the videos, to stop and/or pause the playing of the videos, and/or to focus on one of the robot computing device or zoom videos.
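The timeline window (225) behavior, one timeline spanning the session with a seek that all synchronized windows follow, might be modeled as below; the player state and the clamping behavior are assumptions for illustration.

```python
def build_timeline(session_duration_s):
    """Minimal model of the timeline window (225): one timeline spanning
    the session, with a seek that every synchronized window follows."""
    state = {"duration": session_duration_s, "position": 0, "playing": False}

    def seek(t):
        # Clamp to the session bounds; the zoom, robot, and transcript
        # windows would all be moved to this position.
        state["position"] = max(0, min(t, state["duration"]))
        return state["position"]

    return state, seek

# A 20 minute session yields a 20 minute (1200 s) timeline.
state, seek = build_timeline(20 * 60)
seek(900)   # jump to minute 15
```

Volume, play, and pause controls would act on the same shared state, which is what keeps the three windows presenting the same moment of the session.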

[0059] In these embodiments, in step 487, the therapist or medical professional may utilize the telehealth or teletherapy assisting software to review all or parts of the user or patient’s session. In some embodiments, in step 490, the therapist or medical professional may enter notes with respect to the user or patient’s session. In some embodiments, the notes may be synchronized with the time in the session and may appear in a chat window as therapist’s notes. In some embodiments, the therapist’s or medical professional’s notes may appear in the comments window 230. In some embodiments, in step 492, the therapist may write an assessment of the patient session based on the review of the robot computing device’s video / audio, the zoom video / audio and/or the transcript of the session of the robot computing device and user. In some embodiments, the therapist’s assessment for the session may be stored in the therapist module 430 of the cloud server computing device 405. In these embodiments, the therapist’s assessment file for the session may be stored in a session storage area (with the other session-related data) in a database in a cloud server computing device 405 and may be indexed or associated with a unique session identifier. In some embodiments, in step 495, the telehealth or teletherapy assisting software may compare the therapist’s or medical professional’s assessment for the session to the diagnosis or assessment automatically generated by the automatic assessment software 414 to see if the automatic assessment is similar to and/or aligned with the therapist’s diagnosis. In some embodiments, the telehealth or teletherapy software 410 may compare the therapist’s session assessment to the overall conversation parameters, metrics, identifiers and/or scores generated by the automatic assessment module 414 to see if there is alignment between the different assessments.
In some embodiments, the session data for a client or patient may be communicated or transmitted to another therapist or medical professional computing device for review by a supervisor and/or other therapist or medical professional. This would allow easier transfer of patient or client session data (and utilization of the teletherapy or telehealth assisting software and system) during movement of clients or patients within or between practice groups or similar medical facilities.
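Step 495's comparison of the therapist's assessment to the automatically generated assessment could use any text- or score-level alignment measure; a word-overlap (Jaccard) score is one hypothetical stand-in, not the method the patent specifies.

```python
def assessment_alignment(therapist_text, automatic_text):
    """Word-overlap (Jaccard) similarity between the therapist's written
    assessment and the automatically generated assessment. This measure
    is an illustrative stand-in for whatever comparison step 495 applies."""
    a = set(therapist_text.lower().split())
    b = set(automatic_text.lower().split())
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

score = assessment_alignment(
    "client showed strong engagement and frequent smiles",
    "strong engagement frequent smiles low response latency",
)
```

A low score would flag sessions where the automatic assessment and the therapist's view diverge and may merit a second look or a supervisor's review.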

[0060] Figure 5A illustrates an account setup screen for the telehealth or telemedicine assisting software system according to some embodiments. The user interface (UI) screens illustrated herein (in Figures 5A through 5E) are only representative screens and do not limit the subject matter described in any way. Alternative methods of inputting similar data and/or information and/or establishing accounts are also covered by the details presented with respect to Figures 5A - 5E. Initially, a therapist or medical professional may have to download the teletherapy or telehealth assisting software (e.g., computer-readable instructions) to a therapist’s computing device. Alternatively, a therapist or medical professional may log in to a cloud server computing device 405 (see Figure 4) which is executing the teletherapy or telehealth assisting software. In some implementations, a therapist or medical professional may enter their name, address, medical license type and number, types of services provided and/or other licensing documentation requirements. In these implementations, the therapist or medical professional may then be established within the teletherapy or telehealth assisting software and system as an established therapist or medical services provider. Figure 5A illustrates a new patient input screen according to some implementations. In some implementations, the new patient input screen 500 includes name entry fields 502 and 504, a date of birth entry field 506, a phone number entry field 508 for the medical professional or the parent/guardian, a parent/guardian email entry field 510, and an address entry field 512 for the therapist or medical professional or the parent/guardian 514. The teletherapy or telehealth assisting software or system also includes a home menu 520, a content menu 530 and/or a settings menu 540.
In some implementations, the therapist or medical professional may enter a client’s name into the name fields 502 and 504, a client or patient’s date of birth into the date of birth field 506, a client or patient’s parent or guardian’s phone number or a therapist or medical professional’s phone number into the phone field 508, a client or patient’s parent or guardian’s email into the email field 510, and a client or patient’s parent or guardian’s address or a therapist or medical professional’s address into the address entry field 512. In these implementations, the entered client or patient’s information and/or parameters may be stored as a new client account in a database of the teletherapy or telehealth cloud computing device (e.g., in one or more memory devices). In some implementations, although not shown in Figure 5A, a therapist may also identify therapist clinic systems and/or computing devices with which the teletherapy cloud computing device may interface and/or communicate. In some implementations, although not shown in Figure 5A, a medical professional may also identify medical facilities, hospitals, and/or other medical office computing systems and/or computing devices (or electronic health record systems) with which the telehealth cloud computing device may interface and/or communicate. Figure 5B illustrates a patient screen for a therapist or medical professional in a teletherapy or telehealth assisting software or system according to some embodiments.
In some implementations, after the client or patient data and/or parent or guardian data is entered into the new client or patient setup screen 500, the teletherapy or telehealth assisting software and/or system may execute computer-readable instructions to communicate with a parent or guardian’s parent software application for the robot computing device and send an invitation for their child (who is the patient or client) to engage in teletherapy or telemedicine sessions with the robot computing device. In some implementations, once the parent or guardian of the client accepts the invitation for their child to engage in teletherapy or telemedicine, the client or patient is entered and verified in the teletherapy or telehealth assisting software and/or system.
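A hedged sketch of the new client record built from screen 500's fields (name fields 502/504, date of birth 506, phone 508, guardian email 510, address 512). The field types, the validation rules, and the `invitation_accepted` flag (set once the parent or guardian accepts the invitation) are illustrative assumptions.

```python
from dataclasses import dataclass, asdict

@dataclass
class NewPatientRecord:
    """Mirrors the new patient input screen 500: name fields 502/504,
    date of birth 506, phone 508, guardian email 510, address 512.
    Types and validation rules are illustrative assumptions."""
    first_name: str
    last_name: str
    date_of_birth: str        # e.g. "2016-03-04"
    phone: str                # therapist or parent/guardian phone
    guardian_email: str
    address: str
    invitation_accepted: bool = False  # set once the guardian accepts

    def validate(self):
        if not self.first_name or not self.last_name:
            raise ValueError("name fields 502 and 504 are required")
        if "@" not in self.guardian_email:
            raise ValueError("guardian email field 510 looks invalid")
        return True

record = NewPatientRecord("Alex", "Rivera", "2016-03-04", "555-0100",
                          "guardian@example.com", "12 Oak St")
record.validate()
account = asdict(record)  # stored as a new client account in the database
```

The `invitation_accepted` default of `False` reflects the described flow: the client is only verified in the system after the parent or guardian accepts the invitation through the parent software application.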

[0061] In Figure 5B, a client or patient screen 545 may include a list of clients or patients or icons representing clients or patients. In some implementations, as illustrated by Figure 5B, one client or patient icon 546 is provided for the therapist or medical professional, who is identified in a therapist or medical professional field 547. In some implementations, a therapist or medical professional may wish to see different parameters, configurations, activities or sessions that are available before engaging with a client or patient. In these implementations, a therapist or medical professional may select the client or patient icon 546 and Figure 5C or a similar menu may appear. Figure 5C illustrates a client or patient category screen or menu with a phrases submenu selected according to some embodiments. In some implementations, the client or patient category screen or menu 550 may include a phrases submenu or screen 551, an activities submenu or screen 552, a sessions submenu or screen 553 and/or a session notes submenu or screen 554. In some implementations, the phrases submenu or screen 551 may include different types of phrases that a therapist or medical professional may utilize in a conversation through the robot computing device with the client or patient. As an illustrative example, during a session, a therapist or medical professional may select transitions, goodbyes, greetings, pauses, common phrases and/or deflections to identify and utilize different examples of these phrase categories during a conversation through the robot computing device with the client or patient. Figure 5D illustrates a client or patient category screen or menu with a submenu selected according to some embodiments. In some implementations, a therapist or medical professional may select an activities screen or submenu 552 that the therapist or medical professional can engage in with a client or patient through the robot computing device.
As an illustrative example, if the activities screen or submenu 552 is selected, a list of potential activities or icons of potential activities may be presented on a display. As illustrated in Figure 5D, the activities screen 556 may include activities such as a meditation journey, animal breathing, a body scan, affirmations, imagine a place, mindful status, a jukebox, a name that feeling, a drawing session, a dance party, a drawing, understanding emotions, a making friends or a stories activity. In some implementations, the therapist or medical professional may select an activity and the robot computing device may begin to engage in the selected activity.

[0062] Figure 5E illustrates a settings menu of a teletherapy or telehealth assisting system according to some embodiments. If a therapist or medical professional selects the settings menu 541, the screen illustrated in Figure 5E may be displayed. In some implementations, the settings menu 541 may include a client identification selection field 542, a zoom initiation and configuration field 543, a therapist ID field 544 (which is assigned by the teletherapy or telehealth assisting software or system), a reset settings field 545, and a sign out field 546. In some implementations, the client identification selection field 542 allows a user to determine how to identify the user (e.g., as a client or a patient). In some implementations, the zoom initiation and configuration field 543 may allow a therapist or medical professional to determine how to initiate the teletherapy or telehealth session and also what types of notifications a parent or guardian may receive (e.g., through the parent app and/or email). In some implementations, the therapist ID field 544 may display the therapist’s or medical professional’s ID code as assigned by the teletherapy or telehealth system.
In some implementations, the reset settings field 545 may allow a therapist or medical professional to reset any of the above-identified settings or other settings developed in the future. In some implementations, the sign out field 546 allows a therapist or medical professional to log out of the teletherapy or telemedicine assisting system or software.

[0063] As detailed above, the computing devices and systems described and/or illustrated herein broadly represent any type or form of computing device or system capable of executing computer-readable instructions, such as those contained within the modules described herein. In their most basic configuration, these computing device(s) may each comprise at least one memory device and at least one physical processor.

[0064] The term “memory” or “memory device,” as used herein, generally represents any type or form of volatile or non-volatile storage device or medium capable of storing data and/or computer-readable instructions. In one example, a memory device may store, load, and/or maintain one or more of the modules described herein. Examples of memory devices comprise, without limitation, Random Access Memory (RAM), Read Only Memory (ROM), flash memory, Hard Disk Drives (HDDs), Solid-State Drives (SSDs), optical disk drives, caches, variations or combinations of one or more of the same, or any other suitable storage memory.

[0065] In addition, the term “processor” or “physical processor,” as used herein, generally refers to any type or form of hardware-implemented processing unit capable of interpreting and/or executing computer-readable instructions. In one example, a physical processor may access and/or modify one or more modules stored in the above-described memory device. Examples of physical processors comprise, without limitation, microprocessors, microcontrollers, Central Processing Units (CPUs), Field-Programmable Gate Arrays (FPGAs) that implement softcore processors, Application-Specific Integrated Circuits (ASICs), portions of one or more of the same, variations or combinations of one or more of the same, or any other suitable physical processor.

[0066] Although illustrated as separate elements, the method steps described and/or illustrated herein may represent portions of a single application. In addition, in some embodiments one or more of these steps may represent or correspond to one or more software applications or programs that, when executed by a computing device, may cause the computing device to perform one or more tasks, such as the method step.

[0067] In addition, one or more of the devices described herein may transform data, physical devices, and/or representations of physical devices from one form to another. For example, one or more of the devices recited herein may receive image data of a sample to be transformed, transform the image data, output a result of the transformation to determine a 3D process, use the result of the transformation to perform the 3D process, and store the result of the transformation to produce an output image of the sample. Additionally or alternatively, one or more of the modules recited herein may transform a processor, volatile memory, non-volatile memory, and/or any other portion of a physical computing device from one form of computing device to another form of computing device by executing on the computing device, storing data on the computing device, and/or otherwise interacting with the computing device.

[0068] The term “computer-readable medium,” as used herein, generally refers to any form of device, carrier, or medium capable of storing or carrying computer-readable instructions. Examples of computer-readable media comprise, without limitation, transmission-type media, such as carrier waves, and non-transitory-type media, such as magnetic-storage media (e.g., hard disk drives, tape drives, and floppy disks), optical-storage media (e.g., Compact Disks (CDs), Digital Video Disks (DVDs), and BLU-RAY disks), electronic-storage media (e.g., solid-state drives and flash media), and other distribution systems.

[0069] A person of ordinary skill in the art will recognize that any process or method disclosed herein can be modified in many ways. The process parameters and sequence of the steps described and/or illustrated herein are given by way of example only and can be varied as desired. For example, while the steps illustrated and/or described herein may be shown or discussed in a particular order, these steps do not necessarily need to be performed in the order illustrated or discussed.

[0070] The various exemplary methods described and/or illustrated herein may also omit one or more of the steps described or illustrated herein or comprise additional steps in addition to those disclosed. Further, a step of any method as disclosed herein can be combined with any one or more steps of any other method as disclosed herein.

[0071] Unless otherwise noted, the terms “connected to” and “coupled to” (and their derivatives), as used in the specification and claims, are to be construed as permitting both direct and indirect (i.e., via other elements or components) connection. In addition, the terms “a” or “an,” as used in the specification and claims, are to be construed as meaning “at least one of.” Finally, for ease of use, the terms “including” and “having” (and their derivatives), as used in the specification and claims, are interchangeable with and shall have the same meaning as the word “comprising.”

[0072] The processor as disclosed herein can be configured with instructions to perform any one or more steps of any method as disclosed herein.

[0073] As used herein, the term “or” is used inclusively to refer to items in the alternative and in combination.

[0074] As used herein, characters such as numerals refer to like elements.

[0075] Embodiments of the present disclosure have been shown and described as set forth herein and are provided by way of example only. One of ordinary skill in the art will recognize numerous adaptations, changes, variations and substitutions without departing from the scope of the present disclosure. Several alternatives and combinations of the embodiments disclosed herein may be utilized without departing from the scope of the present disclosure and the inventions disclosed herein. Therefore, the scope of the presently disclosed inventions shall be defined solely by the scope of the appended claims and the equivalents thereof.