
Title:
TRANSFER LEARNING APPARATUS, TRANSFER LEARNING SYSTEM, METHOD OF TRANSFER LEARNING, AND STORAGE MEDIUM
Document Type and Number:
WIPO Patent Application WO/2020/250451
Kind Code:
A1
Abstract:
The present invention provides a transfer learning apparatus, system, method, and storage medium able to reduce the computational overhead incurred by having to frequently train new specialized inference models, by learning an inference model and a time series model continuously from time series data.

Inventors:
BEYE FLORIAN (JP)
Application Number:
PCT/JP2019/024618
Publication Date:
December 17, 2020
Filing Date:
June 14, 2019
Assignee:
NEC CORP (JP)
International Classes:
G06N3/08; G06K9/00; G06N3/04; G06N7/00
Foreign References:
US20190180469A12019-06-13
US20180005069A12018-01-04
Other References:
REN ZIHAN ET AL: "A Real-Time Suspicious Stay Detection System Based on Face Detection and Tracking in Monitor Videos", 2017 10TH INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND DESIGN (ISCID), IEEE, vol. 1, 9 December 2017 (2017-12-09), pages 264 - 267, XP033311979, DOI: 10.1109/ISCID.2017.150
TSUNG-YI LIN: "Focal Loss for Dense Object Detection", IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV, 2017
Attorney, Agent or Firm:
TANAI, Sumio et al. (JP)
Claims:
[CLAIMS]

[Claim 1]

A transfer learning apparatus, comprising:

an inference model parameter memory storing model parameter data associated with an inference model;

a time series model memory storing model parameter data associated with a time series model and a state probability distribution;

an inference unit configured to receive time slice data and configured to calculate an inference result vector from the time slice data and the parameter data stored in the inference model parameter memory;

a time series model update unit configured to receive the inference result vector from the inference unit and configured to update the parameter data and the state probability distribution stored in the time series model memory;

a gradient calculation unit configured to receive the inference result vector from the inference unit and parameter data from the time series model memory and calculate a gradient vector based on the inference result vector and the parameter data;

a magnitude metric calculation unit configured to receive the gradient vector and calculate a magnitude metric value; and

an inference model parameter update unit configured to update the inference model parameter data stored in the inference model parameter memory based on the gradient vector and the time slice data, if the magnitude metric value is higher than a magnitude metric threshold.

[Claim 2]

The transfer learning apparatus of claim 1, wherein the time series model update unit is further configured to calculate a loss value from the time series model parameter data and the inference result vector, and

the magnitude metric calculation unit calculates the magnitude metric value based on both the loss value and the gradient vector.

[Claim 3]

The transfer learning apparatus of claim 1 or claim 2, wherein

the time series model update unit updates the state probability distribution stored in the time series model memory from the inference result vector at a time before the inference model parameter update unit determines whether or not the magnitude metric value is higher than the magnitude metric threshold.

[Claim 4]

The transfer learning apparatus of claim 1 or claim 2, wherein

if the inference model parameter update unit determines that the magnitude metric value is higher than the magnitude metric threshold and updates the inference model parameter data stored in the inference model parameter memory based on the gradient vector and the time slice data, the inference unit recalculates the inference result vector and the time series model update unit updates the state probability distribution, and

if the inference model parameter update unit determines that the magnitude metric value is less than or equal to the magnitude metric threshold, the time series model update unit updates the state probability distribution.

[Claim 5]

A transfer learning system, comprising:

a communication network;

an inference model parameter memory storing model parameter data associated with an inference model;

a time series model memory storing model parameter data associated with a time series model and a state probability distribution;

an inference unit configured to receive time slice data and configured to calculate an inference result vector from the time slice data and the parameter data stored in the inference model parameter memory;

a time series model update unit configured to receive the inference result vector from the inference unit and configured to update the parameter data and the state probability distribution stored in the time series model memory;

a gradient calculation unit configured to receive the inference result vector from the inference unit and parameter data from the time series model memory and calculate a gradient vector based on the inference result vector and the parameter data;

a magnitude metric calculation unit configured to receive the gradient vector and calculate a magnitude metric value; and

an inference model parameter update unit configured to update the inference model parameter data stored in the inference model parameter memory based on the gradient vector and the time slice data, if the magnitude metric value is higher than a magnitude metric threshold, wherein

an edge device is configured to provide the time slice data through the communication network, the edge device decoding information from a sensor as the time slice data.

[Claim 6]

A method of transfer learning comprising, in order:

calculating an inference result vector from time slice data and inference model parameter data;

updating time series model parameter data from the inference result vector;

updating a state probability distribution from the inference result vector;

calculating a gradient vector from the time series model parameter data and the inference result vector;

calculating a magnitude metric from the gradient vector; and

updating the inference model parameter data from the gradient vector and the time slice data when the magnitude metric value is higher than a magnitude metric threshold.

[Claim 7]

A computer readable storage medium storing instructions for causing a computer to execute:

calculating an inference result vector from time slice data and inference model parameter data;

updating time series model parameter data from the inference result vector;

updating a state probability distribution from the inference result vector;

calculating a gradient vector from the time series model parameter data and the inference result vector;

calculating a magnitude metric from the gradient vector; and

updating the inference model parameter data from the gradient vector and the time slice data when the magnitude metric value is higher than a magnitude metric threshold.

Description:
[DESCRIPTION]

[Title of the Invention]

TRANSFER LEARNING APPARATUS, TRANSFER LEARNING SYSTEM,

METHOD OF TRANSFER LEARNING, AND STORAGE MEDIUM

[Technical Field]

[0001]

The present invention relates to a transfer learning apparatus, a transfer learning system, a computer readable storage medium, and a method for efficiently (machine) learning a joint model including an inference model and a time series model continuously from time series data.

[Background Art]

[0002]

Applications frequently combine an inference model with a time series model to analyze time series data. For example, when time series data consists of frames from a video stream, an inference model can be an object detection model used for detecting objects within an individual frame, and a time series model can be used for tracking object identities between frames. However, high-accuracy object detection models such as the one described in NPL 1 are complex and incur substantial computational costs and latencies. Less complex detection models can achieve similarly high accuracy under limited conditions (e.g. fixed background, fixed time-of-day, etc.) at lower computational cost and latency when they are trained specifically for these limited conditions, as described, for example, in PTL 1. However, when analyzing time series data such as frames from a video stream in which conditions such as the background are expected to be transient, the use of such specialized models creates the additional problems of having to frequently train new specialized models according to the changed conditions, and/or having to maintain and switch dynamically between a multitude of specialized models by detecting the current conditions and determining the specialized model best suited to those conditions.

[Citation List]

[Patent Literature]

[0003]

[PTL 1]

US Patent Application Publication No. US20180005069A1

[Non Patent Literature]

[0004]

[NPL 1]

“Focal Loss for Dense Object Detection”, Tsung-Yi Lin et al., 2017 IEEE International Conference on Computer Vision (ICCV).

[Summary of Invention]

[Technical Problem]

[0005]

The present disclosure aims to solve the problem of the incurred computational overhead of having to frequently train new specialized inference models according to changed conditions, and/or having to maintain and switch dynamically between a multitude of specialized inference models. One of the objectives of this invention is to provide a method for efficiently learning an inference model continuously from time series data, to the effect that the inference model dynamically adapts to changes of external conditions, such as background objects, lighting, etc.

[Solution to Problem]

[0006]

A time series model is used to provide a means for estimating the magnitude by which the parameters of the inference model would change according to input time slice data, i.e. the magnitude of the potential learning effect. Furthermore, the computationally intensive parameter update, i.e. the learning operation, is performed selectively according to the estimated magnitude of change and a threshold magnitude value, i.e. only when the anticipated learning effect is considered high enough.

[0007]

A first example aspect of the present disclosure provides a transfer learning apparatus, including: an inference model parameter memory storing model parameter data associated with an inference model; a time series model memory storing model parameter data associated with a time series model and a state probability distribution; an inference unit configured to receive time slice data and configured to calculate an inference result vector from the time slice data and the parameter data stored in the inference model parameter memory; a time series model update unit configured to receive the inference result vector from the inference unit and configured to update the parameter data and the state probability distribution stored in the time series model memory; a gradient calculation unit configured to receive the inference result vector from the inference unit and parameter data from the time series model memory and calculate a gradient vector based on the inference result vector and the parameter data; a magnitude metric calculation unit configured to receive the gradient vector and calculate a magnitude metric value; and an inference model parameter update unit configured to update the inference model parameter data stored in the inference model parameter memory based on the gradient vector and the time slice data, if the magnitude metric value is higher than a magnitude metric threshold.

[0008] A second example aspect of the present disclosure provides a transfer learning system, including a communication network; an inference model parameter memory storing model parameter data associated with an inference model; a time series model memory storing model parameter data associated with a time series model and a state probability distribution; an inference unit configured to receive time slice data and configured to calculate an inference result vector from the time slice data and the parameter data stored in the inference model parameter memory; a time series model update unit configured to receive the inference result vector from the inference unit and configured to update the parameter data and the state probability distribution stored in the time series model memory; a gradient calculation unit configured to receive the inference result vector from the inference unit and parameter data from the time series model memory and calculate a gradient vector based on the inference result vector and the parameter data; a magnitude metric calculation unit configured to receive the gradient vector and calculate a magnitude metric value; and an inference model parameter update unit configured to update the inference model parameter data stored in the inference model parameter memory based on the gradient vector and the time slice data, if the magnitude metric value is higher than a magnitude metric threshold, wherein an edge device is configured to provide the time slice data through the communication network, the edge device decoding information from a sensor as the time slice data.

[0009]

A third example aspect of the present disclosure provides a method of transfer learning including: calculating an inference result vector from time slice data and inference model parameter data; updating time series model parameter data from the inference result vector; updating a state probability distribution from the inference result vector; calculating a gradient vector from the time series model parameter data and the inference result vector; calculating a magnitude metric from the gradient vector; and updating the inference model parameter data from the gradient vector and the time slice data when the magnitude metric value is higher than a magnitude metric threshold.

[0010]

A fourth example aspect of the present disclosure provides a computer readable storage medium storing instructions to cause a computer to execute: calculating an inference result vector from time slice data and inference model parameter data; updating time series model parameter data from the inference result vector; updating a state probability distribution from the inference result vector; calculating a gradient vector from the time series model parameter data and the inference result vector; calculating a magnitude metric from the gradient vector; and updating the inference model parameter data from the gradient vector and the time slice data when the magnitude metric value is higher than a magnitude metric threshold.

[Advantageous Effects of the Invention]

[0011]

When compared to using a single static but complex general inference model of high accuracy, a single less complex inference model dynamically adapting to limited conditions by use of the present invention can achieve similar accuracy at substantially lower computational cost, because the learning operation is performed only selectively, when the anticipated learning effect is considered high enough, i.e., greater than a predetermined threshold.

[Brief Description of Drawings]

[0012]

[Fig. 1]

Figure 1 is a block diagram showing the structure of the first and second embodiments of the present disclosure.

[Fig. 2]

Figure 2 is a block diagram showing the structure of the third and fourth embodiments of the present disclosure.

[Fig. 3]

Figure 3 is a flow diagram showing the operations of the first embodiment of the present disclosure.

[Fig. 4]

Figure 4 is a flow diagram showing the operation of the second embodiment of the present disclosure.

[Fig. 5]

Figure 5 is a block diagram showing the structure of the fifth embodiment of the present disclosure.

[Fig. 6]

Figure 6 is an overview of the structure in which transfer learning is provided at multiple locations over a communication network.

[Fig. 7]

Figure 7 is a block diagram showing the structure of an edge device.

[Fig. 8]

Figure 8 is a block diagram showing the transfer learning apparatus according to the third embodiment of the present disclosure.

[EXAMPLE EMBODIMENTS]

[0013]

Example embodiments of the present invention are described in detail below with reference to the accompanying drawings. In the drawings, the same elements are denoted by the same reference numerals, and thus redundant descriptions are omitted as needed.

[0014]

Reference throughout this specification to “one embodiment”, “an embodiment”, “one example” or “an example” means that a particular feature, structure or characteristic described in connection with the embodiment or example is included in at least one embodiment of the present embodiments. Thus, appearances of the phrases “in one embodiment”, “in an embodiment”, “one example” or “an example” in various places throughout this specification are not necessarily all referring to the same embodiment or example. Furthermore, the particular features, structures or characteristics may be combined in any suitable combinations and/or sub-combinations in one or more embodiments or examples.

(First Example Embodiment)

Before explaining the structure and operation of the first example embodiment, some terms will be defined and some assumptions will be provided.

[0015]

In the following descriptions time is broken down into slices indexed by t (time slices).

[0016]

Time slice data d_t is data corresponding to a time slice t. The time slice data d_t may be image frames from a surveillance video camera installed, for example, in a retail shop at a fixed angle recording customers. The time slice data d_t may go through background changes, such as changes in the lighting, or in the positions of fixed objects like shelf products and boxes.

[0017]

The present embodiment uses an inference model y = f(d; φ), where d corresponds to input data, φ to the model parameters, and y to the corresponding inference result vector.

[0018]

The inference model may have the structure of any linear or non-linear classification or regression model, including a convolutional neural network (CNN) such as MobileNets and its variants. For the surveillance camera embodiment, the inference model may be, for example, MobileNet 224 (https://arxiv.org/abs/1704.04861) having predetermined parameters.

[0019]

The initial model parameters of the inference model may be pre-trained using a conventional method such as supervised or unsupervised training using a training dataset designed for the inference task, such as object detection, image captioning, natural language processing, or scene recognition, for example. The model structure and the initial model parameters may also be adopted from available public repositories of trained networks. When the inference model structure is a lightweight network such as MobileNets, in order for the network output inferences to be sufficiently accurate, the network should be re-trained, either online or offline, using time series data collected in the context of the specific deployment of the model at the time of the initial installation. This is to adapt the parameter values from those found, e.g., in the public repositories to values suitable to the deployment (background) during initial installation. For example, such a context could correspond to a specific surveillance camera for an object detection task in a surveillance application. Furthermore, in order for the network output inferences to be sufficiently accurate even after background changes, the network should be re-trained either online or offline during normal operation. This is to adapt to background changes after initial installation, i.e. during normal operation.

[0020]

A probabilistic time series model P(Y_{1:t}, Z_{1:t} | θ), modeling inference observations Y_{1:t} and states Z_{1:t}, is given model parameters θ. The time series model may be any state-based probabilistic model. The time series model may have a structure such as a hidden Markov model, a linear dynamic system state-space model, or a random finite set state-space model. Alternatively, recurrent neural networks can be used if their prediction output is interpreted as a probability distribution.

[0021]

For this example embodiment, description will be given as applied to an object tracking surveillance camera, although the present invention is not limited thereto. The time series model may be pre-trained using a publicly available dataset so that the trained model can predict the locations of detection targets, such as humans, within the image in the next time frame, when the model is provided with the locations of the detection targets in the current time frame.

[0022]

The time series model can be defined by a function g(y, z'|z, θ), representing the joint probability of observing inference y and a state transition to state z' at time t, given that the time series state at time t-1 is z, under the time series model parameters θ.

[0023]

For example, when the time series model is a hidden Markov model, a linear dynamic system state-space model or a random finite set state-space model, g(y, z'|z, θ) can be written as the product of state transition probabilities P(Z_t | Z_{t-1}, θ) and observation probabilities P(Y_t | Z_t, θ):

g(y, z'|z, θ) = P(Y_t = y | Z_t = z', θ) P(Z_t = z' | Z_{t-1} = z, θ).

For the surveillance-camera embodiment using a random finite set state-space model as the time series model, a state z represents the locations and velocities of the tracked objects, and an inference observation y represents the detected object locations. The function g represents the modeled chance of objects moving to different locations and different velocities and of detecting those locations. In particular, g may model motion noise, appearance or disappearance of objects, as well as detection noise and probabilities of false positive/false negative detections.

[0024]

Given a state probability distribution p(z) and observation data y, the filtered state probability distribution p'(z') is calculated by Bayesian inference as

p'(z') = Σ_z g(y, z'|z, θ) p(z) / Σ_{z''} Σ_z g(y, z''|z, θ) p(z).

In the present embodiment, p'(z') represents the posterior probability distribution of the object locations and velocities in the image, given the prior probability distribution p(z) and the observation y at time frame t.

[0025]

A loss function L is defined below:

L(y | p, θ) = −log Σ_{z'} Σ_z g(y, z'|z, θ) p(z).

In the present embodiment, this loss function L represents the difference between (a) the object locations inferred by the image inference model from the video image at the current time frame and (b) the object locations inferred by the time series model based on the estimated probability distribution of the object locations and velocities in the previous time frame. In other words, the loss represents the unlikelihood of detecting locations y given the estimated distribution p of locations and velocities in the previous time frame.
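As a minimal illustration of the filtering and loss computations of paragraphs [0024] and [0025], the following sketch assumes a finite (discrete) state space, a transition matrix A giving P(z'|z, θ), and an observation-likelihood callable obs_lik giving P(y|z', θ). These names and the NumPy implementation are illustrative assumptions, not part of the specification.

```python
import numpy as np

def bayes_filter_and_loss(p, y, A, obs_lik, theta):
    """Return the filtered distribution p'(z') and the loss L(y | p, theta).

    p       : prior state distribution p(z), shape (n_states,)
    y       : current inference observation (e.g. detected object locations)
    A       : transition probabilities A[z, z'] = P(z' | z, theta), shape (n_states, n_states)
    obs_lik : callable returning P(y | z', theta) for every z', shape (n_states,)
    """
    joint = obs_lik(y, theta) * (A.T @ p)   # sum_z g(y, z'|z, theta) p(z), per z'
    evidence = joint.sum()                  # likelihood of observing y given p and theta
    p_filtered = joint / evidence           # Bayesian update p'(z')
    loss = -np.log(evidence)                # L(y | p, theta): "unlikelihood" of y
    return p_filtered, loss
```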

[0026]

The structure of the first embodiment of this invention, a transfer learning apparatus, is displayed in the block diagram in Fig. 1, and constitutes the basic structure of this invention. In the following, the responsibilities of each unit contained in this embodiment are described.

[0027]

The data d = d_t that corresponds to time slice t is received from the outside as input or from a memory, and is read or received in succession. Each time slice data d_t triggers a new operation of the transfer learning apparatus. For the present embodiment, for example in the surveillance application, time slice data can be image data. In order for the online/offline training to reasonably converge, it is preferable that the camera be stationary, so that the video background change occurs only gradually.

[0028]

An inference unit 101 calculates an inference result vector y = f(d; φ) according to the inference model, using the time slice data d and the model parameter data φ stored in the inference model parameter memory 102. In the surveillance camera embodiment, the inference result may represent the locations and classes of the detected objects,

y = ((y_1, q_1), ..., (y_n, q_n)),

where y_i and q_i respectively denote the detected location and class (such as a person, a vehicle, etc.) of the i-th detected object.

[0029]

The inference model parameter memory 102 stores parameter data φ associated with an inference model. The parameter data persists between updates and may also be updated between individual operations of the transfer learning apparatus according to the rules governing whether or not to update the model (such rules to be discussed later). For typical models for object detection in image data, the number of parameters is large.

[0030]

The transfer learning apparatus has an inference result vector 103 representing the inference result as, for example, numbers and locations of detected objects.

[0031]

A time series model update unit 104 (sometimes referred to as “time series model parameter/state update unit”) retrieves the state probability distribution p(z) and the parameters θ stored in the time series model memory 105 (sometimes referred to as “time series model parameter/state memory”), updates the parameters θ stored in the time series model memory 105 as

θ_k ← θ_k − η_k ∂L(y | p, θ)/∂θ_k,

and updates the state probability distribution by Bayesian inference as

p(z') ← Σ_z g(y, z'|z, θ) p(z) / Σ_{z''} Σ_z g(y, z''|z, θ) p(z),

where η_k are fixed parameters controlling the learning speed and y is the inference result vector 103. Given y, the detected object locations inferred from the new video image, the parameters θ and the estimated distribution of locations and velocities p(z) are updated using these equations.
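A sketch of the parameter update performed by the time series model update unit 104 is given below. It assumes the loss of paragraph [0025] is available as a callable and uses a single scalar learning rate eta with a finite-difference gradient; both are illustrative choices, since the specification only states that fixed parameters control the learning speed.

```python
import numpy as np

def update_time_series_model(theta, p, y, loss_fn, eta=1e-3, eps=1e-6):
    """One gradient step theta_k <- theta_k - eta * dL/dtheta_k.

    loss_fn(y, p, theta) must return the scalar loss L(y | p, theta);
    the state update p(z) <- p'(z') is performed separately (see the
    filtering sketch above).
    """
    theta = np.asarray(theta, dtype=float)
    grad = np.zeros_like(theta)
    base = loss_fn(y, p, theta)
    for k in range(theta.size):            # finite-difference gradient w.r.t. theta_k
        bumped = theta.copy()
        bumped[k] += eps
        grad[k] = (loss_fn(y, p, bumped) - base) / eps
    return theta - eta * grad
```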

[0032]

A time series model memory 105 stores parameter data θ for the time series model and a state distribution p(z) associated with the time series model. The parameter data persists between arrivals of time series data slices 100.

[0033]

A gradient calculation unit 106 retrieves the state probability distribution p(z) and the parameters θ stored in the time series model memory 105 and calculates a gradient vector

w = ∇_y L(y | p, θ),

where y is the inference result vector 103. This gradient vector corresponds to the gradient (i.e., the partial derivatives) of the loss L with respect to each component of the inference vector y. In the surveillance camera embodiment, when the observed change of the inferred object locations in the current video frame is totally unexpected based on the prediction of the time series model, this gradient vector tends to yield larger elements.

[0034]

The gradient vector 107 consists of the partial derivatives of the loss L with respect to all components of the inference vector y.

[0035]

The device of the present embodiment determines whether or not to update the model parameters depending on the significance or magnitude of the update that is about to be made using the current time series data slice 100. This determination is made, for example, based on the gradient vector 107. In the present embodiment, this determination may be performed, as explained below, by calculating a magnitude metric from the gradient vector 107, comparing the gradient magnitude with a threshold, and performing an update of the model when the magnitude is larger than the threshold.

[0036]

A magnitude metric calculation unit 108 calculates the magnitude metric value 109, m = h(w), where w is the gradient vector 107 and h(w) is a magnitude metric function that calculates the magnitude of the gradient vector 107. The magnitude metric function h(w) may be chosen from any vector magnitude metric function, for example, but not necessarily, an L1, L2, or Max function. If the metric function h(w) is L2, then

h(w) = (Σ_j w_j^2)^(1/2).
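For concreteness, the three magnitude metric functions named above could be realized as follows; which one is used is a design choice left open by the text.

```python
import numpy as np

def h_l1(w):   # L1 norm of the gradient vector
    return np.abs(w).sum()

def h_l2(w):   # L2 norm, as in the example above
    return np.sqrt(np.square(w).sum())

def h_max(w):  # maximum absolute component
    return np.abs(w).max()
```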

[0037]

A magnitude metric value 109 represents the magnitude of the gradient, which represents the unexpectedness, i.e., significance, of the update being made based on the current time frame data. In the surveillance camera scenario, if the pre-trained model produces a high-magnitude gradient of the loss, it is likely that this is caused by some misdetection or detection noise due to a background change, such as a recent lighting change. In this case, the gradient from the current time frame should be used for updating the model, in order to reduce similar misdetections or noise in the future. On the other hand, if the magnitude is low, the video frame probably did not experience any background change. In this case, running a model parameter update (which consumes significant computational resources) would not significantly improve the accuracy of the model and should therefore be avoided.

[0038]

A magnitude metric threshold value 110 may be determined empirically.

[0039]

For an inference model parameter update unit 111: if the magnitude metric value 109 is above the magnitude metric threshold value 110, the inference model parameter update unit 111 updates the parameters φ stored in the inference model parameter memory 102 as

φ_k ← φ_k − τ_k Σ_j w_j ∂f_j(d; φ)/∂φ_k,

where τ_k are fixed parameters controlling the learning speed, d is the time slice data 100 and w is the gradient vector 107.
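The selective update of [0039] can be sketched as below, assuming a differentiable inference model f(d, phi) and reusing the gradient vector w. The finite-difference Jacobian is purely illustrative (a practical implementation would use automatic differentiation), and the chain-rule form of the update is an assumption consistent with the claim that the update is based on the gradient vector and the time slice data.

```python
import numpy as np

def update_inference_model(phi, d, w, f, tau=1e-4, eps=1e-6):
    """Update phi_k <- phi_k - tau * sum_j w_j * d f_j(d; phi) / d phi_k."""
    phi = np.asarray(phi, dtype=float)
    y0 = np.asarray(f(d, phi), dtype=float)
    grad_phi = np.zeros_like(phi)
    for k in range(phi.size):                       # finite-difference Jacobian column
        bumped = phi.copy()
        bumped[k] += eps
        grad_phi[k] = w @ (np.asarray(f(d, bumped)) - y0) / eps
    return phi - tau * grad_phi
```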

[0040]

In the following, the operation of the apparatus depicted in Fig. 1 is explained according to the flow diagram in Fig. 3 as a series of steps.

[0041]

In step S200, the time series data slice 100, d = d_t, for some time slice t is received.

[0042]

In step S201, the inference unit 101 calculates the inference result vector y = f(d; φ) from the time slice data 100 and the model parameter data φ stored in the inference model parameter memory 102.

[0043]

In step S202a, the time series model update unit 104 retrieves the state probability distribution p(z) and the parameters θ stored in the time series model memory 105, and then updates the parameters θ stored in the time series model memory 105 as

θ_k ← θ_k − η_k ∂L(y | p, θ)/∂θ_k,

where y is the inference result vector 103.

[0044]

In step S202b, the time series model update unit 104 retrieves the state probability distribution p(z) and the parameters θ stored in the time series model memory 105, and then updates the state probability distribution by Bayesian inference as

p(z') ← Σ_z g(y, z'|z, θ) p(z) / Σ_{z''} Σ_z g(y, z''|z, θ) p(z),

where y is the inference result vector 103.

[0045]

In step S203, the gradient calculation unit 106 retrieves the state probability distribution p(z) and the parameters θ stored in the time series model memory 105 and calculates the gradient vector w 107 as

w_j = ∂L(y | p, θ)/∂y_j,

where y is the inference result vector 103.

[0046]

In step S204, the magnitude metric calculation unit 108 calculates the magnitude metric value m 109, m = h(w), where w is the gradient vector 107.

[0047]

In step S205, if the magnitude metric value 109 is above the magnitude metric threshold value 110, execution proceeds to step S206; otherwise it proceeds to step S207.

[0048]

In step S206, the inference model parameter update unit 111 updates the parameters φ stored in the inference model parameter memory 102 as

φ_k ← φ_k − τ_k Σ_j w_j ∂f_j(d; φ)/∂φ_k,

where d is the time slice data 100 and w is the gradient vector 107.

In step S207, processing for time slice t is finished, and execution stops until another time series data slice 100, d = d_{t+1}, for time slice t+1 is received.
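Tying steps S200 to S207 together, one pass over a single time slice could look like the following sketch, composed from the illustrative helpers above. The callables, threshold, and learning rates are assumptions for illustration only; filter_and_loss is assumed to already close over the model structure (e.g. a partial application of bayes_filter_and_loss over A and obs_lik).

```python
def process_time_slice(d, phi, theta, p, f, grad_loss_y, filter_and_loss, h, threshold):
    """One pass of Fig. 3 for a single time slice d.

    f(d, phi)                 -> inference result vector y            (S201)
    filter_and_loss(p, y, th) -> (filtered state p', loss)            (S202b)
    grad_loss_y(y, p, th)     -> gradient vector w = dL/dy            (S203)
    h(w)                      -> magnitude metric value m             (S204)
    """
    y = f(d, phi)                                                     # S201
    theta = update_time_series_model(                                 # S202a
        theta, p, y, lambda yy, pp, th: filter_and_loss(pp, yy, th)[1])
    p, _ = filter_and_loss(p, y, theta)                               # S202b
    w = grad_loss_y(y, p, theta)                                      # S203
    m = h(w)                                                          # S204
    if m > threshold:                                                 # S205
        phi = update_inference_model(phi, d, w, f)                    # S206
    return phi, theta, p                                              # S207
```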

(Second Example Embodiment)

[0049]

The apparatus of the second example embodiment, shown in Fig. 2, corresponds to the apparatus of the first example embodiment, amended as follows:

[0050]

The time series model update unit 104 additionally calculates the loss value 111 as l = L(y | p, θ), where y is the inference result vector 103, and p(z) and θ are the state probability distribution and the parameters retrieved from the time series model memory 105, respectively.

[0051]

The magnitude metric calculation unit 108 calculates the magnitude metric value 109, m = h'(w, l), where h'(w, l) is a function of the gradient vector 107 and the loss value 111.

[0052]

The loss value 111 is the value l = L(y | p, θ).

[0053]

The flow of operation follows the sequence of Fig. 3, altered as follows:

[0054]

In step S202a, the time series model update unit 104 additionally calculates the loss value 111 as l = L(y | p, θ), where y is the inference result vector 103, and p(z) and θ are the state probability distribution and the parameters retrieved from the time series model memory 105, respectively.

[0055]

In step S204, the magnitude metric calculation unit 108 calculates the magnitude metric value 109, m = h'(w, l), where h'(w, l) is a function of the gradient vector 107 and the loss value 111.

(Third Example Embodiment)

[0056]

In this third example embodiment, a description will be provided in accordance with either of the first and second example embodiments with the following additions and modifications with reference to Figs. 6-8. Redundant descriptions of components previously described in the first and second example embodiments will be omitted.

[0057]

Fig. 6 shows an example system diagram in which the transfer learning apparatus of the present disclosure may be applied for time series data analysis at, for example, a plurality of locations (e.g., supermarkets, convenience stores, stadiums, warehouses, etc.) with a plurality of sensors 305, such as cameras, audio recording devices, etc. In this example, the transfer learning apparatus is part of a cloud computing environment 310 and is able to perform processing of time slice data 100 for each of the locations, which are equipped with an edge device 300 and one or more sensors 305 as shown, for example, in Fig. 7.

[0058]

In addition to the features of either the first or second example embodiments, a tracking data generation unit 112 is provided, as shown in Fig. 8, in order to output object tracking data, for example, back to the respective edge devices of the respective locations.

As shown in Fig. 7, the exemplary embodiments may include a central processing unit (CPU); a random access memory (RAM) may be used as the memory, and a hard disk drive (HDD), a solid state drive (SSD), etc. may be used as the storage device.

[0059]

With reference to Fig. 7, an exemplary structure of the edge device 300 will now be described. The edge device may include, for example, a communication I/F 301 (interface), a controller 302, storage 303, and a sensor I/F 304. The controller includes a CPU and memory. The storage 303 may be a storage medium such as an HDD or an SSD. The communication I/F 301 has general functions for communicating with the cloud computing environment 310 via the communication network. The sensor I/F 304 has general functions for instructing operations to the sensor 305 and retrieving detected (sensed) information from the sensor 305. In other words, the edge device 300 has at least a computing function, a communication gateway function, and a storage function. However, it may be assumed that these functions of the edge device are relatively less performance intensive as compared with those of a high-end personal computer and also those of the cloud computing environment due to, for example, commercial reasons (i.e. cost) with regard to the edge device 300.

[0060]

It should be noted that the edge device 300 may be merely part of a POS (point of sale) system.

(Other Modifications)

[0061]

While the preferred example embodiments of the present invention have been described above, it is to be understood that the present invention is not limited to the example embodiments above and that further modifications, replacements, and adjustments may be added without departing from the basic technical concept of the present invention.

[0062] In the first and second example embodiments, descriptions are given in accordance with the flow chart shown in Fig. 3. However, the present invention is not limited to this sequence of operations and may instead operate, for example, in accordance with the flow chart shown in Fig. 4.

[0063]

In the present disclosure, the embodiments are intended to be used with training being performed online. However, batch training is also possible depending on design specifications.

[0064]

One example of an object to be tracked could be human beings, and the objective may be to track the number of individuals in a store at any given time.

[Industrial Applicability]

The disclosed invention can be applied to the computer vision task of tracking objects from video data.

[Reference Signs List]

[0065]

100 Image Data

101 Inference Unit

102 Inference Model Parameter Memory

103 Inference Result Vector

104 Time Series Model Update Unit

105 Time series model memory

106 Gradient Calculation Unit

107 Gradient Vector

108 Magnitude Metric Calculation Unit

109 Magnitude Metric Value

110 Magnitude Metric Threshold Value

111 Inference Model Parameter Update Unit

112 Tracking Data Generation Unit

150 Object Detection Unit

151 Object Tracking Unit

300 Edge Device

301 Communication I/F

302 Controller

303 Storage

304 Sensor I/F

305 Sensor

310 Cloud Computing Environment