A SIMULATOR INCLUDING A METHOD AND APPARATUS FOR DETERMINING THE CO-ORDINATES OF AN OBJECT IN TWO DIMENSIONS

Document Type and Number:

WIPO Patent Application WO/2012/063068

Abstract:

The invention relates to a simulator. The simulator includes a method and apparatus for determining the coordinates of an object in two dimensions. A camera and infrared camera are provided, for capturing images of the controller. The infrared image is processed using an active contour model to produce a training image from an image from the camera. An adaptive correlation filter is constructed from the training image, which is correlated with images from the camera to measure the position of the controller.

Inventors:

TELENSKY JAN (GB)
KAMAT PRAJAY (GB)

Application Number:

PCT/GB2011/052187

Publication Date:

May 18, 2012

Filing Date:

November 10, 2011

Assignee:

JT CONSULTANCY LTD
TELENSKY JAN (GB)
KAMAT PRAJAY (GB)

International Classes:

G06T7/00

Foreign References:

US6757422B1	2004-06-29
US6529614B1	2003-03-04

Other References:

SHAH S ET AL: "Iris Segmentation Using Geodesic Active Contours", IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, IEEE, PISCATAWAY, NJ, US, vol. 4, no. 4, 1 December 2009 (2009-12-01), pages 824 - 836, XP011328925, ISSN: 1556-6013, DOI: 10.1109/TIFS.2009.2033225
OLIVER C. JOHNSON ET AL: "Optimization of OT-MACH filter generation for target recognition", PROCEEDINGS OF SPIE, vol. 7340, 1 January 2009 (2009-01-01), pages 734008 - 734008-9, XP055022088, ISSN: 0277-786X, DOI: 10.1117/12.820950

Attorney, Agent or Firm:

MACKENZIE, Andrew (St Albans, Hertfordshire AL1 3AW, GB)

Claims:

CLAIMS

1. Apparatus, for determining co-ordinates of an object in two dimensions, comprising:

a first camera, for producing an infrared image of the object;

a second camera, for producing a calibration image of the object and a stream of images of the object;

a first computational module, for producing a training image, configured to extract a vector, corresponding to an edge of the object, from the infrared image using an active contour model, and to apply the vector to the calibration image to produce the training image;

an adaptive correlation filter, constructed from the training image; and

a second computational module, configured to correlate at least one image of the stream of images from the second camera to the adaptive correlation filter for determining the x-y coordinates of the object by comparing the amplitude of the maximum peak in the adaptive correlation filter correlation plane to a detection threshold.

2. Apparatus as claimed in Claim 1, wherein a plurality of rotated training images are produced, the rotated training images constructed by rotating the training image, and the adaptive correlation filter being constructed from the rotated training images.

3. Apparatus as claimed in Claim 1, wherein the training image is produced periodically to update the adaptive correlation filter.

4. Apparatus as claimed in Claim 2, wherein the rotated training images are produced periodically to update the adaptive correlation filter.

5. Apparatus as claimed in either Claim 3 or Claim 4, wherein the adaptive correlation filter is updated between every 0.5 and 1.5 seconds.

6. Apparatus as claimed in any one of the preceding claims, wherein the adaptive correlation filter is an OT-MACH filter.

7. A method, for determining the co-ordinates of an object in two dimensions, the method comprising the steps of:

acquiring an infrared image including the object from an infrared camera;

acquiring a stream of images, including the object, and a calibration image, including the object, from a camera;

producing a vector, corresponding to an edge of the object, from the infrared image, using an active contour model;

producing a training image, by extracting the object from the calibration image using the vector;

constructing an adaptive correlation filter using the training image; and

determining the x-y co-ordinates of the object in a correlation plot, by correlating at least one image from the stream of images to the adaptive correlation filter.

8. The method as claimed in Claim 7, further comprising the step of producing a plurality of rotated training images, the rotated training images constructed by rotating the training image, and constructing the adaptive correlation filter from the rotated training images.

9. The method as claimed in Claim 7, wherein the training image is produced periodically to update the adaptive correlation filter.

10. The method as claimed in Claim 8, wherein the rotated training images are produced periodically to update the adaptive correlation filter.

11. The method as claimed in any one of Claims 7 to 10, wherein the adaptive correlation filter is updated between every 0.5 and 1.5 seconds.

12. The method as claimed in any one of Claims 7 to 11, wherein the adaptive correlation filter is an OT-MACH filter.

13. A computer program, embodied on a computer readable medium, configured to execute the method according to any one of Claims 7 to 12.

14. Apparatus substantially as herein described with reference to and as shown in any one of the accompanying drawings.

15. A method substantially as herein described with reference to and as shown in any one of the accompanying drawings.

Description:

A SIMULATOR INCLUDING A METHOD AND APPARATUS FOR DETERMINING THE CO-ORDINATES OF AN OBJECT IN TWO DIMENSIONS

This invention relates to a simulator, including a method and apparatus for determining the co-ordinates of an object in two dimensions. More specifically, but not exclusively, this invention relates to a simulator for the training of a tradesman.

Traditionally, tradesmen, such as plumbers, have learned their trade on the job as an apprentice. An apprentice learns the various requisite skills by attempting to replicate their master's work. Apprenticeships provide a focussed, personal training experience. However, this form of training is not scalable, as the master's time is dissipated over many pupils. Furthermore, at least in the early stages of training, the apprentice will make mistakes, which will cost the master and deter him/her from hiring apprentices in the future.

Vocational training courses were developed to give pupils the initial experience needed to start work as a tradesman, in an attempt to reduce the initial, costly, period of the apprenticeship. However, these courses are subject to relatively high fees due to the number of mistakes the pupils make in the early stages.

According to a first aspect of the invention, there is provided apparatus, for determining coordinates of an object in two dimensions, comprising a first camera, for producing an infrared image of the object; a second camera, for producing a calibration image of the object and a stream of images of the object; a first computational module, for producing a training image, configured to extract a vector, corresponding to an edge of the object, from the infrared image using an active contour model, and to apply the vector to the calibration image to produce the training image; an adaptive correlation filter, constructed from the training image; and a second computational module, configured to correlate at least one image of the stream of images from the second camera to the adaptive correlation filter for determining the x-y coordinates of the object by comparing the amplitude of the maximum peak in the adaptive correlation filter correlation plane to a detection threshold.

By applying an active contour model to the image from the infrared camera, which is in turn used to create the training image for the adaptive correlation filter, the apparatus may accurately track the object with only one infrared camera. In the prior art, multiple infrared cameras are used to track the object. The present invention therefore reduces the cost of tracking an object, by alleviating the need for multiple, expensive, infrared cameras. The active contour model works synergistically with the infrared camera, as the contrast between the object and a human being allows an accurate vector corresponding to an edge of the object to be produced. The apparatus therefore works well in cluttered environments, such as a living-room.

The present invention can therefore use a standard camera, e.g. a colour camera such as a VGA camera, which provides images from the visible spectrum which may be used by the apparatus.

Preferably, a plurality of rotated training images are produced, the rotated training images constructed by rotating the training image, and the adaptive correlation filter being constructed from the rotated training images. By rotating the training images, the apparatus may maintain a track on the object, that is, may accurately and consistently measure the position of the object, for a longer period without updating the adaptive correlation filter with a new training image. The rotated training images increase the tolerance of the apparatus to changes in orientation, scale changes and position.

Preferably, the training image, or rotated training images, are produced periodically to update the adaptive correlation filter. The adaptive correlation filter may be updated between every 0.5 and 1.5 seconds, or more preferably, every second.

By updating the adaptive correlation filter with a new training image, and using a set of rotated training images, the adaptive correlation filter is 'retrained'. This increases the accuracy of the adaptive correlation filter for a subsequent image from the second camera. The adaptive correlation filter may be a MACH filter, an OT-MACH filter, or the like.

According to a second aspect of the invention, there is provided a method, for determining the co-ordinates of an object in two dimensions, the method comprising the steps of acquiring an infrared image including the object from an infrared camera; acquiring a stream of images, including the object, and a calibration image, including the object, from a camera; producing a vector, corresponding to an edge of the object, from the infrared image, using an active contour model; producing a training image, by extracting the object from the calibration image using the vector; constructing an adaptive correlation filter using the training image; and determining the x-y co-ordinates of the object in a correlation plot, by correlating at least one image from the stream of images to the adaptive correlation filter.

The method may further comprise the step of producing a plurality of rotated training images, the rotated training images being constructed by rotating the training image, and constructing the adaptive correlation filter from the rotated training images.

The training image, or plurality of rotated training images, may periodically update the adaptive correlation filter. The adaptive correlation filter may be updated every 0.5 to 1.5 seconds, or more preferably, every second.

A computer program, embodied on a computer-readable medium, may be configured to execute the method according to the second aspect of the invention.

Embodiments of the invention will now be described, by way of example, and with reference to the drawings in which:

Figure 1 illustrates a simulator, including a computer, head mounted display, controller and camera unit;

Figure 2 is a flow diagram illustrating a method, of an embodiment of the present invention, of measuring the x-axis and y-axis co-ordinates of the controller;

Figure 3 illustrates an OT-MACH filter of the embodiment of Figure 2; and

Figure 4 illustrates, for reference only, a method of measuring the z-axis of the controller.

Figure 1 illustrates an overview of a simulator 1. The simulator 1 includes a controller 100, a computer 200, a camera unit 300 and a head mounted display 400. For the purposes of this description, the computer 200 is configured to run a computer program which simulates a training scenario, such as using a blowtorch, or bending a pipe.

The computer 200 receives data from the controller 100 and the camera unit 300. The controller 100 includes various sensors to measure spatial properties, such as acceleration and orientation, and to measure user input. The controller 100 outputs the data from the sensors to the computer 200. The camera unit 300 includes a first camera 310 and a second, infrared, camera 320, for image acquisition. The camera unit 300 outputs the image data to the computer 200.

The computer 200 is configured to process the data from the controller 100 and camera unit 300 as input variables in the computer program. The controller 100 provides spatial data, such as acceleration and orientation, and user inputs, and the camera unit 300 provides images which may be processed by a method according to the present invention to track the controller 100. The computer program, which may simulate a training scenario, can therefore give the user an immersive and accurate simulation of a real-life skill, such as using a blowtorch or bending a pipe. The method of tracking the controller 100 is described in more detail below.

Tracking

In normal use, the simulator 1 is set up in a room, with the camera unit 300 facing the controller 100. Generally, the camera unit 300 will be positioned against a wall, and face the controller 100 in the centre of the room. The controller 100 is held by a user. For the purpose of this description, three dimensions are denoted along the Cartesian x, y and z axes, wherein the z-axis is in the direction from the camera unit 300 to the controller 100 (that is, the axis parallel to the floor). The x-axis and y-axis are both orthogonal to the z-axis and to each other. The computer 200 is configured to calculate the x-axis and y-axis co-ordinates of the controller via a first method, and calculate the z-axis co-ordinate via a second method.

The first method, that is, the method of calculating the x-axis and y-axis co-ordinates of the controller 100, will now be described with reference to Figures 2 to 3. The method is performed on the computer 200, using the image data from the camera unit 300. The camera unit 300 acquires a calibration image via the first camera 310, and acquires an infrared image via the second, infrared, camera 320. As mentioned above, the camera unit 300 faces the controller 100, which is held by the user. Therefore the calibration image and the infrared image include the controller 100 and the user.

An overview of the first method is illustrated in Figure 2. As a preliminary step, background subtraction of the infrared image, via temporal differencing, differentiates the controller 100 and the user from the constant background. This produces a processed infrared image, including only the controller and the user, suitable for the subsequent steps.
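The preliminary step above can be sketched as follows. This is a minimal illustration only, assuming greyscale frames held as lists of rows and a fixed intensity threshold; the function names and the threshold value are hypothetical and do not come from the embodiment.

```python
def temporal_difference_mask(previous_frame, current_frame, threshold=20):
    """Mark pixels whose intensity changed by more than `threshold` between frames."""
    return [
        [1 if abs(cur - prev) > threshold else 0
         for prev, cur in zip(prev_row, cur_row)]
        for prev_row, cur_row in zip(previous_frame, current_frame)
    ]


def apply_mask(frame, mask):
    """Keep only the masked pixels; everything else (the constant background) becomes 0."""
    return [
        [pixel if keep else 0 for pixel, keep in zip(frame_row, mask_row)]
        for frame_row, mask_row in zip(frame, mask)
    ]


# Hypothetical 3x4 infrared frames: a static background, then a bright object appears.
background = [[10, 10, 10, 10],
              [10, 10, 10, 10],
              [10, 10, 10, 10]]
frame = [[10, 10, 10, 10],
         [10, 200, 180, 10],
         [10, 10, 12, 10]]

mask = temporal_difference_mask(background, frame)
processed = apply_mask(frame, mask)
```

Small fluctuations, such as the pixel that changes from 10 to 12, fall below the assumed threshold and are treated as background.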

An active contour model is applied to the processed infrared image to produce an accurate vector contouring the edge of the controller 110. The controller 110 is readily distinguishable from the user in the processed infrared image due to the use of an IR reflectant coating on the controller 110. The active contour model works on the principle of energy minimization to ascertain the vector of the controller's 100 edge in the processed infrared image. The energy of each vector point is calculated based on its neighbouring pixels. A Difference of Gaussian (DoG) filtered image is computed for emphasizing the edges of the controller 100. The energy minimization is iterated until the vector of the edge of the controller 100 is accurately computed. The energy function computed and iterated for each vector point is described in the equation below, where i, the number of iterations, runs from 1 to n, n being the number of points on the vector, and $E^{*}_{\mathrm{vector}}$ is the calculated energy of the vector point.

The computer 200 includes a configuration file, for modifying the number of iterations, i, required to accurately compute the vector of the edge of the controller 100.
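The iterative energy minimization can be illustrated with a greedy active contour. The sketch below is not the energy function of the embodiment, which is not reproduced here; it assumes a simple energy made of a smoothness term (distance from the midpoint of the two neighbouring contour points) and an edge term taken from a precomputed edge-strength image such as a DoG-filtered image. The names `greedy_snake`, `alpha` and `gamma` are hypothetical, and `iterations` plays the role of the iteration count held in the configuration file.

```python
def greedy_snake(points, edge_strength, iterations, alpha=1.0, gamma=1.0):
    """Move each contour point to the 3x3 neighbour with the lowest energy."""
    height, width = len(edge_strength), len(edge_strength[0])
    pts = list(points)
    n = len(pts)

    def energy(i, x, y):
        # Internal term: squared distance from the midpoint of the two neighbours.
        (px, py), (qx, qy) = pts[i - 1], pts[(i + 1) % n]
        mx, my = (px + qx) / 2.0, (py + qy) / 2.0
        internal = (x - mx) ** 2 + (y - my) ** 2
        # External term: strong edges lower the energy.
        external = -edge_strength[y][x]
        return alpha * internal + gamma * external

    for _ in range(iterations):
        moved = False
        for i in range(n):
            x0, y0 = pts[i]
            candidates = [
                (x0 + dx, y0 + dy)
                for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                if 0 <= x0 + dx < width and 0 <= y0 + dy < height
            ]
            best = min(candidates, key=lambda p: energy(i, p[0], p[1]))
            if energy(i, best[0], best[1]) < energy(i, x0, y0):
                pts[i] = best
                moved = True
        if not moved:
            break  # converged before the configured iteration limit
    return pts


# Hypothetical 5x5 edge-strength image with a single strong response at (2, 2).
edge_map = [[0] * 5 for _ in range(5)]
edge_map[2][2] = 10
start = [(1, 1), (3, 1), (3, 3), (1, 3)]
contour = greedy_snake(start, edge_map, iterations=10)
```

Each pass only ever lowers the energy of the point being moved, so the loop stops either when no point can improve or when the configured number of iterations is reached.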

Once the vector of the edge of the controller 110 has been computed, the vector is then applied to the calibration image from the first camera, to extract the controller 110. The extracted controller 110 is then applied to the centre of a blank background, which forms a training image, suitable for an OT-MACH filter.
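A minimal sketch of the extraction step follows. It assumes the edge vector has already been converted into a binary mask of the same size as the calibration image, and that the extracted object fits inside the blank background; `make_training_image` and its parameters are hypothetical names used for illustration.

```python
def make_training_image(calibration, mask, size):
    """Copy the masked pixels of `calibration` onto the centre of a blank size x size image."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    top, left = rows[0], cols[0]
    height = rows[-1] - top + 1
    width = cols[-1] - left + 1

    # Offsets that centre the object's bounding box on the blank background.
    offset_r = (size - height) // 2
    offset_c = (size - width) // 2

    training = [[0] * size for _ in range(size)]
    for r in range(height):
        for c in range(width):
            if mask[top + r][left + c]:
                training[offset_r + r][offset_c + c] = calibration[top + r][left + c]
    return training


# Hypothetical calibration image with a 2x2 object and the mask derived from its contour.
calibration = [[5, 5, 5, 5, 5],
               [5, 90, 80, 5, 5],
               [5, 70, 60, 5, 5],
               [5, 5, 5, 5, 5]]
mask = [[0, 0, 0, 0, 0],
        [0, 1, 1, 0, 0],
        [0, 1, 1, 0, 0],
        [0, 0, 0, 0, 0]]

training_image = make_training_image(calibration, mask, size=4)
```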

In this embodiment, the training image is further processed to produce a plurality of rotated training images for the OT-MACH filter. For example, the training image is rotated by two-degree increments between -6 degrees and +6 degrees, thus obtaining 7 rotated training images. The rotated training images are multiplexed and input to the OT-MACH filter.

The operation of the OT-MACH filter will now be described in more detail, with reference to Figure 3. The OT-MACH (Optimal Trade-off Maximum Average Correlation Height) filter is implemented on the computer 200 using the FFTW ("Fastest Fourier Transform in the West") library. The FFTW library is a C subroutine library for computing discrete Fourier transforms in one or more dimensions. The FFTW library is interfaced with Intel's (RTM) OpenCV library for computer vision, making the OT-MACH filter efficient with respect to processing time and frequency.

As shown in the left-hand column of Figure 3, the OT-MACH filter receives the set of rotated training images $t_i$, $i = 1, \dots, N$, where N is the number of rotated training images. Each rotated training image is Fourier transformed, $FT(t_i)$. The output of the FFTW is not a shifted FFT. Shifting of the zero component of the FFT to the centre of the spectrum is performed using the following function, designed in C.

cvFFTWShift

The function has the effect of swapping the upper-left quadrant with the lower-right quadrant, and swapping the upper-right quadrant with the lower-left quadrant.
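The quadrant swap can be sketched as below. This is an illustration in Python rather than the C function of the embodiment, and it assumes a spectrum with an even number of rows and columns; `shift_quadrants` is a hypothetical name.

```python
def shift_quadrants(spectrum):
    """Swap diagonal quadrants so the zero-frequency term moves to the centre."""
    rows, cols = len(spectrum), len(spectrum[0])
    half_r, half_c = rows // 2, cols // 2
    return [
        [spectrum[(r + half_r) % rows][(c + half_c) % cols] for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical 4x4 spectrum whose quadrants are labelled 1 to 4.
spectrum = [[1, 1, 2, 2],
            [1, 1, 2, 2],
            [3, 3, 4, 4],
            [3, 3, 4, 4]]

shifted = shift_quadrants(spectrum)
```

For even dimensions the operation is its own inverse, so applying it twice restores the unshifted layout.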

The OT-MACH filter is expressed in the equation below, where $m_x$ is an average of the rotated training image vectors $x_1, \dots, x_N$ in the frequency domain, $C$ is a diagonal power spectral density matrix of any chosen noise model, $D_x$ is a diagonal average power spectral density of the rotated training images, and $S_x$ denotes the similarity matrix of the rotated training image set. These parameters are derivable from the training image. Alpha, beta and gamma are non-negative optimal trade-off parameters, which allow the OT-MACH filter to be tailored for external conditions, such as light levels. $\alpha$, $\beta$ and $\gamma$ can be modified in the configuration file.

$$h = \frac{m_x^{*}}{\alpha C + \beta D_x + \gamma S_x}$$
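Because $C$, $D_x$ and $S_x$ are diagonal, the filter can be evaluated one frequency bin at a time. The sketch below assumes the usual OT-MACH form, $h = m_x^{*} / (\alpha C + \beta D_x + \gamma S_x)$, with $D_x$ taken as the mean of $|x_i|^2$ and $S_x$ as the mean of $|x_i - m_x|^2$ at each bin; these definitions, the function name and the flattened-spectrum input are assumptions for illustration, not the implementation of the embodiment.

```python
def ot_mach_filter(spectra, noise_psd, alpha, beta, gamma):
    """Build an OT-MACH filter bin by bin from already Fourier-transformed training images.

    spectra   -- list of N training spectra, each a flat list of complex values
    noise_psd -- diagonal of C, one non-negative value per frequency bin
    """
    n = len(spectra)
    bins = len(spectra[0])
    h = []
    for k in range(bins):
        column = [spectrum[k] for spectrum in spectra]
        m = sum(column) / n                              # m_x: mean training spectrum
        d = sum(abs(x) ** 2 for x in column) / n         # D_x: average power spectral density
        s = sum(abs(x - m) ** 2 for x in column) / n     # S_x: spread around the mean
        h.append(complex(m).conjugate() / (alpha * noise_psd[k] + beta * d + gamma * s))
    return h


# Two hypothetical training spectra with two frequency bins each, and white noise.
spectra = [[2 + 0j, 1j], [4 + 0j, 1j]]
white_noise = [1.0, 1.0]
h = ot_mach_filter(spectra, white_noise, alpha=1.0, beta=1.0, gamma=1.0)
```

Raising `alpha` favours noise tolerance, `beta` favours sharp correlation peaks and `gamma` favours tolerance to distortion within the training set, which is the trade-off the three parameters are described as controlling.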

The computer 200 receives a stream of images from the first camera. As shown in the right-hand column of Figure 3, a set of sub-images $S_k$, $k = 1, \dots, N$, where N is the number of sub-images, is derived from one image from the stream. Each sub-image is Fourier transformed, $FT(S_k)$. The Fourier transformed sub-images are correlated with the OT-MACH filter, in the frequency domain, via the function below.

$$\operatorname{conj}\big(FT(h)\big)\, FT(S_k)$$

Each sub-image is then classified as in-class or out-of-class by comparing the amplitude of the maximum peak in the correlation plane to a detection threshold. The detection threshold is given in the equation below.

$$\text{Threshold} = [\ldots] \sum_{i=1}^{N} \mathrm{CentrePeak}\big(FT(h)^{*}\, FT(t_i)\big)$$

A correlation plot is produced for each in-class sub-image. The position of the controller 100 in the x and y directions corresponds to the highest value in the correlation plot. The OT-MACH filter is applied to every m-th image from the first camera to generate a correlation plot and to determine the position of the controller 110. The parameter m may be modified in the configuration file.

The OT-MACH filter may be updated, that is, a new set of rotated training images may be obtained and applied to the OT-MACH filter, either in real time or at a frequency determined by a parameter in the configuration file.
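The classification and localisation of a single sub-image can be sketched as follows. The example assumes the correlation plane is already available as a grid of magnitudes and that the detection threshold has been computed separately; `classify_and_locate` is a hypothetical name, and the returned co-ordinates are relative to the sub-image.

```python
def classify_and_locate(correlation_plane, threshold):
    """Return the (x, y) of the highest peak if it reaches the threshold, else None."""
    peak, peak_x, peak_y = correlation_plane[0][0], 0, 0
    for y, row in enumerate(correlation_plane):
        for x, value in enumerate(row):
            if value > peak:
                peak, peak_x, peak_y = value, x, y
    if peak < threshold:
        return None  # out-of-class: no object detected in this sub-image
    return (peak_x, peak_y)


# Hypothetical correlation planes: one with a clear peak, one containing only clutter.
object_plane = [[0.1, 0.2, 0.1],
                [0.2, 0.9, 0.3],
                [0.1, 0.2, 0.1]]
clutter_plane = [[0.1, 0.2],
                 [0.3, 0.2]]

position = classify_and_locate(object_plane, threshold=0.5)
no_detection = classify_and_locate(clutter_plane, threshold=0.5)
```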

The second method, that is, for calculating the z-axis co-ordinate of the controller 100, will now be described, for reference only, with reference to Figure 4. The z-axis co-ordinate is the distance from the first and second camera's centroid to the controller 100.

A half angle of view of the first camera, $\theta_1$, and of the second camera, $\theta_2$, is calculated using the following expression, where D is the first or second camera's field stop and f is the first or second camera's focal length.

$$\theta_{1,2} = \tan^{-1}\!\left(\frac{D}{2f}\right)$$
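As a worked illustration, the half angle can be computed as below, assuming the usual half-angle-of-view relation $\theta = \tan^{-1}(D / 2f)$ with D and f in the same units. The function name and the numeric values are hypothetical.

```python
import math


def half_angle_of_view(field_stop, focal_length):
    """Half angle of view in radians for a field stop D and focal length f."""
    return math.atan(field_stop / (2.0 * focal_length))


# Hypothetical camera: field stop twice the focal length gives a 45 degree half angle.
theta = half_angle_of_view(field_stop=2.0, focal_length=1.0)
theta_degrees = math.degrees(theta)
```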

With reference to Figure 4, the z-axis co-ordinate can be determined from the following expression, where $a_{1,2}$ can be measured using the half angle of view and the x-axis and y-axis position of the controller 100 calculated using the first method.

Alternatively, if the first and second camera are calibrated, the intrinsic and extrinsic camera parameters can be found using OpenCV functions.

The skilled reader will understand that the rotational multiplexing, that is, the rotation of the training image to produce a plurality of rotated training images, is a non-essential feature of the present invention. Rather, the OT-MACH filter may be constructed from the training image. The skilled reader will understand that constructing the OT-MACH filter from the plurality of rotated training images is preferable, as it provides a degree of tolerance to the OT-MACH filter between filter updates, such that the accuracy of position recognition is increased and the computer is less likely to lose tracking of the controller 100. The skilled reader will also understand that updating the OT-MACH filter, that is, producing a new set of rotated training images or training image, is a non-essential feature. Rather, the OT-MACH filter can be constructed from a first set of training images and not updated. Of course, the skilled reader will understand that updating the OT-MACH filter is highly preferable, as it provides for more accurate position recognition of the controller 100.

Furthermore, it is a non-essential feature for the OT-MACH filter to be updated once every 25 images from the stream of images from the first camera (that is, for a common camera, once every second where the camera captures 25 frames per second). The skilled reader will understand that the frequency of updating the OT-MACH filter may be changed, by modification of the configuration file.
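The relationship between frame rate, the correlation interval m and the filter update interval can be sketched as follows. The parameter names are hypothetical stand-ins for entries in the configuration file, and the defaults reflect the example of 25 frames per second with one update per second.

```python
def frame_actions(frame_index, m=1, fps=25, update_interval_s=1.0):
    """Decide what to do with a frame from the stream, given its index."""
    frames_per_update = max(1, round(fps * update_interval_s))
    return {
        # Correlate every m-th image with the filter to measure position.
        "correlate": frame_index % m == 0,
        # Rebuild the filter from a new training image once per update interval.
        "update_filter": frame_index % frames_per_update == 0,
    }


on_update_frame = frame_actions(25)
between_updates = frame_actions(10, m=5)
skipped_frame = frame_actions(24, m=5)
```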

The skilled reader will understand that it is not essential for the filter to be an OT-MACH filter. Rather, any form of adaptive correlation filter may be used, for example, a MACH filter, ASEF filter, UMACE filter or MOSSE filter may be used instead.

The skilled reader will also understand that the simulator 1 is not limited to the plumbing scenarios detailed above. Rather, the simulator 1 may be used for various forms of virtual reality situations, such as other training, recreational or industrial situations. In particular, the object tracking method outlined above will have uses in other situations, for example the manufacturing industry.

In the above embodiment, the computer 200 includes a computer program. The skilled reader will understand that the computer program may be embodied on a computer readable medium, such as a compact disc or USB flash drive, or may be downloadable through the internet. The computer program may also be stored on a server in a remote location, and the user's personal computer may send and receive data to the server through a network connection.

The skilled reader will also understand that the head mounted display is a non-essential feature. Rather, the computer 200 may output graphics to a computer monitor, projector, TV, HDTV, 3DTV, or the like. The head mounted display is a preferable feature, as it provides an immersive experience for the user, and may also provide data relating to the user's head orientation, which can then in turn be used by the simulator 1.

The skilled person will understand that any combination of features is possible without departing from the scope of the present invention, as claimed.
