
Title:
STEREOSCOPIC SYSTEM CALIBRATION AND METHOD
Document Type and Number:
WIPO Patent Application WO/2019/050417
Kind Code:
A1
Abstract:
A multi-camera calibration method is provided which includes identifying a plurality of calibration objects in each of a plurality of images, and calculating an error matrix comprising reprojection error values. The reprojection error values comprise a reprojection error for the calibration objects in each of the plurality of images and the error matrix comprises reprojection error values for the image acquisition devices.

Inventors:
NIELSEN POUL MICHAEL FONSS (NZ)
RASSOULIHA AMIR HAJI (NZ)
NASH MARTYN PETER (NZ)
TABERNER ANDREW JAMES (NZ)
Application Number:
PCT/NZ2018/050121
Publication Date:
March 14, 2019
Filing Date:
September 06, 2018
Assignee:
AUCKLAND UNISERVICES LTD (NZ)
International Classes:
G06T7/80; G01C11/06; G06F17/11; H04N13/204; H04N13/246
Foreign References:
US9734419B12017-08-15
EP2523163A12012-11-14
US20130176392A12013-07-11
EP2200311A12010-06-23
US8368762B12013-02-05
Attorney, Agent or Firm:
BALDWINS INTELLECTUAL PROPERTY (NZ)

Claims

1. A calibration method for a system comprising a plurality of image acquisition devices, the method comprising the steps of:

identifying a plurality of calibration objects in each of a plurality of images obtained by the image acquisition devices; and

calculating an error matrix comprising reprojection error values;

wherein the reprojection error values comprise a reprojection error for the calibration objects in each of the plurality of images and the error matrix comprises reprojection error values for the image acquisition devices.

2. A calibration method as claimed in claim 1 further comprising the step of minimising the error matrix.

3. A calibration method as claimed in claim 1 or claim 2 wherein the reprojection error for each calibration object is minimised.

4. A calibration method as claimed in any one of the preceding claims wherein the error matrix comprises length error values representing length errors in comparison to an expected length between calibration objects.

5. A calibration method as claimed in claim 4 wherein the length error values measure a 3D distance between calibration objects.

6. A calibration method as claimed in any one of the preceding claims wherein the error matrix comprises 3D shape error values representing a difference between a 3D reconstructed shape and an ideal shape of a calibration target comprising the calibration objects.

7. A calibration method as claimed in claim 6 wherein the shape error values measure a Euclidean distance between the reconstructed shape and a model of the calibration template or object.

8. A calibration method as claimed in any of the preceding claims wherein one or more reprojection error values are scaled when calculating the error matrix.

9. A calibration method as claimed in any one of claims 5 to 7 including approximating the average pixel size of one or more of the images in a length measurement, and using the approximation to convert the units of the 3D length and 3D shape error functions to pixel units.

10. A calibration method as claimed in any one of the preceding claims wherein one or more input parameters of each image acquisition device are divided into groups with the same units and similar magnitude, the parameters being selected from one or more of:

focal length,

principal point,

lens distortion coefficients,

rotation vector, and

translation vector.

11. A calibration method as claimed in any one of the preceding claims wherein one or more input parameters for each image acquisition device are scaled based on the sensitivity of an objective function to changes of that group of parameters, estimated using the Jacobian matrix (J) of the objective function.

12. A calibration method as claimed in any one of the preceding claims wherein the calibration objects are calibration points or control points in a calibration target.

13. A calibration system for an image acquisition system consisting of one or more image acquisition devices, the calibration system comprising:

an input means adapted to receive images from at least two image acquisition devices; and

a processing means adapted to calibrate the system based on the received images, the processing means configured to obtain an error matrix of the system.

14. A calibration system as claimed in claim 13 wherein the processing means is configured to perform the method of any of claims 1 to 12.

15. A calibration method for a system comprising at least one image acquisition device, the method comprising the steps of:

obtaining a series of images of a calibration target;

removing distortion from the series of images using a distortion model of the at least one image acquisition device to obtain a measured model of the target; using subpixel image registration to find discrepancies between the images of the calibration target and a model of the calibration target at a plurality of control points; and using the measured discrepancies to update the distortion model.

16. A calibration method as claimed in claim 15 comprising using a mapping function to update the distortion model and using the mapping function to remove the lens distortion using a forward image distortion model by mapping distorted locations to undistorted locations.

17. A calibration method as claimed in claim 16 wherein the mapping function comprises a function(s) that can model image distortion effects and/or has orthogonal basis functions.

18. A calibration method as claimed in claim 16 or claim 17 wherein the mapping function for mapping distorted images to distortion-corrected images comprises Zernike polynomials.

19. A calibration method as claimed in any one of claims 16 to 18 wherein image distortion effects are removed from the images using the mapping function, and the intrinsic and extrinsic parameters of the image acquisition device(s) are optimised in an optimisation process using the error matrix and the objective functions as claimed in any one of claims 1 to 12.

20. A calibration method for a system comprising one or more image acquisition device(s), the method comprising the steps of:

obtaining an error matrix;

minimising the error matrix, at least in part, to optimise the intrinsic and extrinsic parameters of the image acquisition devices.

Description:
STEREOSCOPIC SYSTEM CALIBRATION AND METHOD

Field of the Invention

The present invention relates to a calibration system and/or method. In particular it relates to calibration of imaging systems and provides an improved method of calibrating a system using a calibration template.

Background

Three-dimensional (3D) computer vision systems have many applications in robotics, shape reconstruction, quality control, and 3D measurements in experimental mechanics. The majority of 3D computer vision systems use an image acquisition device such as one or more cameras, or a stereoscopic camera, to capture images from various viewing angles to generate 3D models of objects. References throughout this document to "camera" or "cameras" include any device or devices that can acquire an image, i.e. any image acquisition device. The simultaneous calibration of multiple cameras is an important task in three-dimensional (3D) computer vision systems. The accuracy of stereoscopic systems in performing 3D measurements is often dependent upon the accuracy of camera calibration. However, the camera calibration process is challenging, particularly when using more than two cameras. Most of the existing multi-camera calibration methods have limitations that prevent them from achieving accurate, robust, and fast calibration even for two cameras. For a single camera, calibration involves identifying the camera's intrinsic parameters and the lens distortion parameters. The intrinsic camera parameters relate an object to its image in the camera image plane, and the lens distortion parameters characterise the distortion effects of the lens. Multiple camera systems include a further step to identify extrinsic camera parameters that specify the 3D positions of each camera in the world-coordinate system. The extrinsic parameters of cameras are required for estimating the 3D position of the object points from two-dimensional (2D) camera images.

In previous template-based methods, such as those of Tsai [R. Tsai, A versatile camera calibration technique for high-accuracy 3D machine vision metrology using off-the-shelf TV cameras and lenses, IEEE J. Robot. Autom. 3 (1987)] and Zhang [Z. Zhang, A flexible new technique for camera calibration, IEEE Trans. Pattern Anal. Mach. Intell. 22 (2000) 1330-1334], a set of 3D control points is extracted from images of a calibration template or target of a known shape and size. Checkerboard patterns with known square sizes are the most commonly used templates. In both of these methods, the extracted control points of the calibration template are used to estimate the intrinsic parameters of the cameras. The most common way of finding the extrinsic parameters of multi-camera systems is using a two-step method. In the first step, the camera's intrinsic parameters and the initial estimates of extrinsic parameters are identified, followed by an optimisation process to minimise a defined objective function to refine the extrinsic parameters.

A widely used objective function is the summation of reprojection errors of all the calibration images of all the cameras (the reprojection error of each calibration image is the sum of squared distances between the measured and the projected control points using the current estimates of the intrinsic and extrinsic parameters of the cameras). Therefore, the camera parameters are refined by minimising the reprojection error in a nonlinear least-squares optimisation. This approach is typically problematic because it is slow, or ineffective for some problems. The use of stereo-pairs was attempted but was found inadequate to address these limitations. For instance, Zhang describes a two-step process in which the initial values for the unknown parameters of the camera and the lens are found in the first step, and a nonlinear optimisation process refines the parameters in the second step. Zhang uses the reprojection error in the objective function of the optimisation process, since it relates the known coordinates of 3D control points to the unknown parameters of the camera and lens distortion coefficients. Zhang uses the knowledge of the configuration of the calibration target (the square size, and the number of rows and columns of the checkerboard template). Radial and tangential lens distortion coefficients can also be incorporated into the reprojection error to map the distorted pixel locations to the undistorted locations. The reprojected 3D points (M) can be a function of the camera intrinsic parameters, the camera 3D pose (R and T), and the lens distortion coefficients. The values of the parameters can be optimised by minimising the reprojection error in the optimisation process:

$$\arg\min_{K, R, T, k_1, k_2, k_3, p_1, p_2} \sum_{n=1}^{N} \sum_{l=1}^{L} \left\| m_{(n,l)} - \hat{m}_{(n,l)}\!\left(K, R, T, k_1, k_2, k_3, p_1, p_2\right) \right\|^2$$

where N is the number of calibration images, L is the number of control points per image, $m_{(n,l)}$ are the measured control points, and $\hat{m}_{(n,l)}(\cdot)$ are the corresponding reprojected points.

Zhang proposed using the Levenberg-Marquardt (LM) algorithm for this optimisation process.
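For orientation, the following is a minimal sketch (using OpenCV, not the patent's code) of the reprojection error just described; the intrinsic matrix, pose, and corner grid are illustrative placeholders. OpenCV's LM-based calibrateCamera performs this minimisation internally.

```python
import numpy as np
import cv2

def reprojection_error(obj_pts, img_pts, K, dist, rvec, tvec):
    # Project the 3D control points into the image plane using the current
    # estimates of the intrinsic, extrinsic, and lens distortion parameters.
    projected, _ = cv2.projectPoints(obj_pts, rvec, tvec, K, dist)
    diff = img_pts.reshape(-1, 2) - projected.reshape(-1, 2)
    return float(np.sum(diff ** 2))          # sum of squared distances

# Illustrative inputs: an 8 x 11 corner grid with 6 mm squares.
obj_pts = np.array([[c * 6.0, r * 6.0, 0.0]
                    for r in range(8) for c in range(11)], dtype=np.float32)
K = np.array([[800, 0, 640], [0, 800, 512], [0, 0, 1]], dtype=np.float64)
dist = np.zeros(5)                           # k1, k2, p1, p2, k3 (OpenCV order)
rvec, tvec = np.zeros(3), np.array([0.0, 0.0, 200.0])
img_pts, _ = cv2.projectPoints(obj_pts, rvec, tvec, K, dist)
print(reprojection_error(obj_pts, img_pts, K, dist, rvec, tvec))   # 0.0
```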

The method of Zhang has been extended to multiple cameras by adding the reprojection error in the images of each camera to the optimisation process, and selecting a world coordinate system and a common origin for 3D points, R, and T vectors of all the cameras. The R and T vectors of each camera locate the 3D position of a camera in the world coordinate system (i.e. relate the camera coordinate system to the world coordinate system). The process of finding the R and T vectors for each camera is called extrinsic camera calibration. The optimisation process is extended to minimise the summation of the reprojection errors of all cameras to find the refined parameters of each camera. The parameters should be optimised for all the cameras:

$$\arg\min_{[K],[R],[T],[k_1],[k_2],[k_3],[p_1],[p_2]} \sum_{c=1}^{C} \sum_{n=1}^{N} \sum_{l=1}^{L} \left\| m_{(c,n,l)} - \hat{m}_{(n,l)}\!\left(K_c, R_c, T_c, k_{(c,1)}, k_{(c,2)}, k_{(c,3)}, p_{(c,1)}, p_{(c,2)}\right) \right\|^2$$

where C is the number of cameras and the parameters inside brackets are arrays of the values for all the cameras. Each camera adds at least 15 parameters (3 for the rotation vector, 3 for the translation vector, 4 for the camera intrinsic parameters, and 5 for the lens distortion coefficients) and (N × L) 3D control points to the optimisation process.

Generally methods that use calibration targets (template-based) are the primary method of calibrating multi-camera systems, and checkerboard templates are the most commonly used calibration targets. However, localising the control points of checkerboard templates (i.e. their corner locations) may not be sufficiently accurate to provide a good estimate of the camera parameters. Even though template-based methods are more reliable for controlled environments than self-calibration methods, they have some limitations. For instance the corners of the checkerboard that are used as control points in traditional methods cannot be localised accurately in many applications. Also, perspective distortions in the calibration images and imperfections of the calibration target decrease the accuracy of localising the control points.

Datta et al. (A. Datta, J.-S. Kim, and T. Kanade, "Accurate camera calibration using iterative refinement of control points," 2009 IEEE 12th Int. Conf. Comput. Vis. Workshops (ICCV Workshops), pp. 1201-1208, Sep. 2009) used parameters obtained from the traditional camera calibration method to undistort and unproject calibration images to canonical fronto-parallel images (i.e. images that are parallel to the image plane of the camera). The fronto-parallel images were then used to localise control points, which were projected back to recompute the camera parameters, iteratively. However, this increases processing time, and unprojecting the calibration images introduces an ambiguity in the image scale.

Douxchamps et al. (D. Douxchamps and K. Chihara, "High-accuracy and robust localisation of large control markers for geometric camera calibration," IEEE Trans. Pattern Anal. Mach. Intell., vol. 31, no. 2, pp. 376-383, 2009) used ray tracing to build a synthetic image of the calibration template at the estimated location of the calibration target in calibration images. The ray-traced model of the calibration template and the image of the calibration target were matched using an optimisation process to maximise the match between their bright and dark areas, which correspond to high and low intensities, respectively. However, this required the assumptions that nonlinear distortions are negligible and that the surface's luminance is isotropic. This method cannot provide accurate estimates of the control points of the calibration target with which to optimise the parameters of the cameras. The current techniques do not allow for an effective and fast calibration of multi-camera systems. This problem has been addressed to some extent, for instance in the OpenCV library, by limiting the number of calibration images, or in other methods by limiting the number of cameras. However, these solutions are far from ideal.

Objects of the Invention

It is an object of the invention to provide a new calibration method and/or system which will improve the accuracy, robustness and/or speed of the calibration of an image capture system. In particular it is an object to improve the method and the process of measuring errors and/or optimising the parameters of a calibration system.

Alternatively it is an object of the invention to provide a calibration method and/or system which will at least go some way to overcoming disadvantages of existing systems, or which will at least provide a useful alternative to existing systems. Further objects of the invention will become apparent from the following description.

Summary of Invention

Accordingly in one aspect the invention may broadly be said to consist in a calibration method for a system comprising one or more image acquisition devices, the method comprising the steps of:

identifying a plurality of calibration objects in each of a plurality of images obtained by the image acquisition device(s); and

calculating an error matrix comprising reprojection error values and/or the discrepancy between acquired images and those reconstructed from a mathematical model of the calibration template; wherein the reprojection error values comprise a reprojection error for the calibration objects in each of the plurality of images and the error matrix comprises the reprojection error values.

The introduction of an error matrix for the reprojection values allows an improvement in the robustness and efficiency of the system. This is enabled because the error matrix representation provides possible separate entries for the reprojection error values for each of a number of calibration objects in each of the plurality of images if required, and has these values available for each image acquisition device (e.g. camera). This means that instead of attempting to optimise the calibration to a single scalar value, the system separately addresses each component or parameter (images/calibration objects/cameras) and can easily adjust to different cameras in the system. It can also allow a user to distinguish which parameters have influenced the error. In an embodiment the image acquisition device comprises a camera. In an embodiment the image acquisition device operates at visual or optical frequencies.

In an embodiment the error matrix comprises length error values. The length error values represent length errors in comparison to an expected length between calibration objects. In an embodiment the length error values measure a 3D distance between adjacent calibration objects.

In an embodiment the error matrix comprises shape error values. The shape error values represent a difference between the 3D reconstructed shape and an ideal template shape. In an embodiment the shape error values measure a Euclidean distance between calibration objects and model or template calibration objects.

In an embodiment any one or more of the types of error values are scaled when included in the error matrix. Scaling allows a fair comparison between the different error measurements. In an embodiment the method comprises the step of scaling the shape error values and/or the length error values between metres (length) and pixels. This allows the method to effectively combine them into the optimisation process.

In an embodiment the calibration objects are calibration points or control points in a calibration target. Control points are specific points in a calibration target used to optimise the camera parameters by minimising their reprojection errors. In an embodiment the plurality of images are of the calibration target. In an embodiment the calibration points are formed by concentric circles or crossing lines, such as on a checkerboard. In an embodiment the calibration objects are target image features or a selection thereof. In an embodiment the calibration objects comprise or form feature sizes with a range of spatial frequencies. In an embodiment the calibration method comprises the step of:

obtaining a plurality of images of a calibration target.

In an embodiment a calibration target has a plurality of known parameters. In an embodiment the known parameters are used to obtain length and/or shape error values.

In an embodiment the calibration method comprises the step of minimising the error matrix. The matrix can be a matrix of objective function values rather than a single objective function value. In an embodiment of the calibration method, the step of minimising the error matrix optimises the parameters of the cameras.

The optimisation process attempts to minimise the overall error in the error matrix by adjusting any one or more of the parameters in the system. Typically this is performed by finding the arguments of the minima (argmin), the point(s) at which the value of the matrix is minimised.

In an embodiment the step of optimising the error matrix comprises a trust-region method. In an embodiment the method comprises the trust-region-reflective algorithm.
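As a hedged illustration of this step: SciPy's least_squares with method="trf" implements a trust-region-reflective algorithm, and passing the flattened error matrix as the residual vector (rather than a pre-summed scalar) preserves one residual per calibration object, image, and camera. All names and data below are illustrative, not the patent's implementation.

```python
import numpy as np
from scipy.optimize import least_squares

def residuals(params, per_entry_targets):
    # One residual per entry of the error matrix; the solver minimises the
    # sum of squares without collapsing the matrix to a scalar beforehand.
    model = np.outer(params, np.ones(per_entry_targets.shape[1]))
    return (model - per_entry_targets).ravel()

rng = np.random.default_rng(0)
targets = rng.normal(size=(4, 100))   # stands in for a (C x L*N) error layout
x0 = np.zeros(4)                      # rough initial parameter estimates
fit = least_squares(residuals, x0, args=(targets,), method="trf")
print(fit.x)                          # converges to the row means of targets
```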

In an embodiment the calibration method comprises the step of:

obtaining or selecting initial values.

In an embodiment the initial values comprise previously used values; values calculated by another method; known, measured, or selected values for equipment; and/or randomised values. In an embodiment the initial values comprise intrinsic initial values and extrinsic initial values. Intrinsic initial values comprise camera or lens specific values. Extrinsic values comprise system, or multi-camera arrangement (3D pose) values.

In an embodiment the calibration method comprises the step of:

forming parameter groups; and scaling model parameters with respect to the largest group parameter. In an embodiment the forming of parameter groups comprises the step of grouping the parameters into groups representing the same physical concept.

In an embodiment of the calibration method the parameters include any one or more of: camera focal length; principal point; lens distortion coefficients; rotation vector and translation vector.

In an embodiment the calibration method comprises the step of:

scaling model parameters with respect to the sensitivity of the objective function or error matrix to that parameter.

In an embodiment the sensitivity of the error matrix or the objective function is measured using a partial derivative matrix, such as the Jacobian. In an embodiment the calibration method comprises the steps of:

detecting outliers in the error matrix; and

removing outliers from the error matrix.

In an embodiment the step of detecting outliers comprises comparison of an error value to an average error value. In an embodiment the step of detecting outliers comprises comparison of the difference between the error value and an average error value to a threshold level. In an embodiment the threshold is set at a factor of 10.
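A minimal numpy sketch of this outlier rule, assuming the error matrix is laid out with one column per camera; the data here is synthetic.

```python
import numpy as np

def remove_outliers(error_matrix, factor=10.0):
    # Entries exceeding `factor` times their camera's average error are
    # treated as corner-finding failures and dropped from the objective.
    avg = error_matrix.mean(axis=0)            # per-camera (column) average
    mask = error_matrix > factor * avg
    cleaned = np.where(mask, 0.0, error_matrix)
    return cleaned, mask

rng = np.random.default_rng(1)
E = np.abs(rng.normal(size=(7040, 4)))         # 88 corners x 80 images, 4 cameras
E[0, 0] = 100.0                                # simulated corner-finder failure
E_clean, mask = remove_outliers(E)
print(int(mask.sum()), "outlier(s) removed")
```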

In an embodiment the calibration method comprises any one or more of the steps of:

positioning the two or more cameras;

positioning a calibration target;

estimating initial parameters; and

configuring the image acquisition device(s) or calibration target dependent on the calibration error.

In an embodiment the calibration method comprises extending the error matrix to include one or more further image acquisition devices.

According to a second aspect, the invention may broadly be said to consist in a calibration method for a system comprising one or more image acquisition devices, the method comprising the step of:

obtaining an error matrix; and minimising the error matrix, at least in part, to optimise the parameters and/or calibrate the system.

The embodiments of at least the first aspect may be included with the second aspect.

In an embodiment the error matrix is obtained based on current estimates of the parameters.

According to a third aspect, the invention may broadly be said to consist in a calibration system for a multi-camera system, the calibration system comprising:

an input means adapted to receive images from at least two image acquisition devices (cameras) in the multi-camera system; and

a control means adapted to calibrate the system based on the received images, the control means configured to obtain an error matrix of the system; and

an output means to output images from the multi-camera system.

In an embodiment the calibration system is configured to use the methods of the first and/or second aspects.

According to a fourth aspect, the invention may broadly be said to consist in a method for configuring an additional camera to a calibrated multi-camera system; the method comprising the steps of:

obtaining reprojection error values for the additional camera; and

configuring the error matrix to include the reprojection error values.

In embodiments of the method the reprojection error values may be obtained by any one or more of:

Estimation of appropriate values;

Replicating values of a current camera; and/or

Manufacturer specification values.

According to a further aspect the invention may broadly be said to consist in a calibration method for a system comprising at least one image acquisition device, the method comprising the steps of:

obtaining a series of images of a target;

removing the lens distortion from the series of images, using a lens distortion model of the at least one image acquisition device, to obtain a measured model of the target; and updating the lens distortion model dependent on a comparison between the measured model of the target and a control model of the target.

The introduction of a model-based approach, in which an accurate model of the target is used in the method as a comparison for the outcome of the lens distortion on images, allows the system to have an exact reference for the expected geometry of the template.

Furthermore, by allowing a visual type comparison of the actual image (measured model) and the expected image (control model), a clear indication of accuracy is provided. This process takes advantage of having a substantially accurate control model. This method can be used with any calibration method, including that described in the above aspects. As part of a calibration method this method allows calculation of more accurate control objects, allowing for instance more accurate estimation of parameters.

In an embodiment of the method the comparison is a comparison of a fixed property of the target.

In an embodiment of the invention the comparison comprises a comparison of discrepancies between the measured model and the control model. In an embodiment the control model is synthetically generated.

In an embodiment of the method the comparison is a comparison of spatial features. In an embodiment the comparison uses an image registration algorithm. Preferably a sub-pixel image registration algorithm is used. Most preferably the P-SG-GC algorithm is used. In an embodiment the method comprises the step of selecting a subimage size based on any one or more of the image characteristics, including: resolution, texture, and noise.

In an embodiment the comparison comprises a projective transformation between the measured model and the control model. In an embodiment the orientation of the control model in the image is estimated by the orientation of the measured model.

In an embodiment the method comprises the step of:

locating a plurality of control objects in the measured model of the target;

wherein the comparison between the measured model of the target and the control model of the target comprises a comparison of the control objects in each model.

In an embodiment the control objects comprise control points such as line cross points, edges, concentric circles or checkerboards. In an embodiment the control objects comprise calibration target features. In an embodiment the target features could have a range of spatial frequencies to provide a good comparison at multiple scales.

In an embodiment the control objects of the measured model and the control objects of the control model are mapped to one another.

In an embodiment the series of images include images of the target throughout the field of view. In an embodiment the lens distortion model comprises a radial and tangential distortion model. In an embodiment the lens distortion model comprises a Taylor series expansion.

In an embodiment the lens distortion model comprises Zernike polynomials. In an embodiment the lens distortion model has inputs comprising any one or more of: coordinates of control points and/or image discrepancies. In an embodiment the inputs are normalised within the unit circle. In an embodiment the lens distortion model maps the shifts between distorted and undistorted images. In an embodiment the shifts are mapped in the x and y directions.

Some polynomials, including Zernike polynomials, are suitable for fitting to symmetric shapes, similar to lens distortions or aberrations. Therefore, using these polynomials to map the distorted locations to undistorted locations (instead of to the target location movement) can provide advantages.

According to a further aspect the invention may broadly be said to consist in a method for characterising lens distortion, the method comprising the steps of:

obtaining a plurality of images of a target;

removing the apparent lens distortion from the plurality of images, using a lens distortion model, to obtain a measured model of the target;

updating the lens distortion model dependent on a comparison between the measured model of the target and a control model of the target.

In an embodiment the method may use any one or more of the above embodiments.

The disclosed subject matter also provides a multi-camera system and a method for calibration which may broadly be said to consist in the parts, elements and features referred to or indicated in this specification, individually or collectively, in any or all combinations of two or more of those parts, elements or features. Where specific integers are mentioned in this specification which have known equivalents in the art to which the invention relates, such known equivalents are deemed to be incorporated in the specification.

Further aspects of the invention, which should be considered in all its novel aspects, will become apparent from the following description.

Drawing Description

A number of embodiments of the invention will now be described by way of example with reference to the drawings in which:

Fig. 1 shows flowcharts of embodiments of the calibration method, showing (a) calibration of an imaging system and (b) refinement of a lens distortion model.

Fig. 2 is a diagram of a multi-camera system which may be calibrated.

Fig. 3 is a diagram of a checkerboard type calibration target.

Fig. 4 is a diagram of the checkerboard type calibration target of Fig. 3 where control points have been identified.

Fig. 5 shows the change in different error measurements with iterations of the method of Figure 1a and a known method.

Fig. 6 shows the difference in a fitted line between a distorted and an undistorted image.

Fig. 7 shows a concentric circle type calibration target.

Fig. 8 shows a calibration target in (a) the measured model and (b) the control model.

Detailed Description of the Drawings

Throughout the description like reference numerals will be used to refer to like features in different embodiments.

The invention is applicable to a wide variety of different fields in which camera calibration is necessary or desirable. These include, without limitation: Agriculture (fruit and produce inline sorting for example); Surgery; Security; Hi-Tech (for example automotive, sailing, etc.); and Large mechanical structures (deformation of cranes, bridges, wind turbines, etc).

The proposed calibration method addresses the limitations of the traditional methods of calibrating multi-camera (at least two image acquisition devices) systems by altering the optimisation procedure and introducing a new objective function. This can provide a number of advantages including the ability to: optimise the intrinsic and extrinsic parameters of all the cameras in a single optimisation; estimate the parameters from imprecise initial values; calibrate many cameras simultaneously; and use a large set of calibration images. The methods described here can be used as the final step of any calibration procedure to refine the camera parameters. As mentioned earlier in this document, although referred to herein as cameras, image acquisition devices include, without limitation, cameras, microscopes, x-ray devices or other imaging means. The imaging device may operate at any suitable wavelength(s).

In a particular embodiment the calibration method introduces a trust-region-reflective optimisation algorithm with an error matrix. Further to this, at least two 3D error functions can be introduced into the error matrix, which then forms the objective function of the optimisation process. The objective function is the function that calculates the error matrix; it is what an optimisation algorithm tries to minimise. This optimisation process is able to simultaneously calibrate all the cameras of a stereo system using many calibration images, which can improve the accuracy of camera parameter estimations compared to methods that can only calibrate stereo-pairs or can use few images.

Figure 1a shows a flow chart of the overall method. Initial calibration values 10 are found or chosen for the intrinsic and extrinsic parameters. A plurality of images of a calibration target are taken 11 using a multi-camera system 42, for example. Alternatively, a single image acquisition device may acquire images from a plurality of different locations and/or dispositions. The method then identifies a number of calibration objects in each of the plurality of images, so as to be able to appropriately compare the images. The images obtained by each of the multiple cameras can now be compared to form an error matrix 18 in an objective function. The error matrix 18 is used to relate the reprojection error 15 (the distance between the coordinates of the reprojected 3D points (M) and the measured corresponding points in the camera image (m)). The reprojected 3D points (3D points that are reprojected to the camera image plane using the camera matrix) are based on reprojected control points of the model. In preferred embodiments the error matrix is appended with further spatial 16 or 3D shape error 17 values. The various intrinsic and extrinsic values can now be optimised by optimisation of this error matrix 19. This step preferably uses a trust-region algorithm to find the argmin of the error matrix across the parameters. However, alternative embodiments may use different forms of algorithm, or may minimise to a non-zero value or maximise an equivalent value, without departing from the method. The process can then be repeated using the newly calibrated intrinsic and extrinsic parameters.

Figure 1b shows a second embodiment of the system which calculates the lens distortion component of the reprojection values. In this embodiment a model-based technique is used to calculate and/or update the lens distortion model or camera parameters of a multi-camera system 42. This technique takes an initial lens distortion model 30 and uses it to remove lens distortion effects from a plurality of images obtained 32 from a camera or a multi-camera system 42. The calibration target in the undistorted image 33 is then compared 34 to the model of the calibration target 31. The calibration target model 31 is ideally a perfect replication of the calibration target and may be based on a 3D drawing template, a ray-traced model, or other means. The comparison 34 between the achieved image (measured, 33) and the ideal (control, 31) can be used to adjust the lens distortion model 35 and to optimise the camera parameters 36. This comparison 34 can be completed in a number of ways, such as finding the discrepancies between the calibration target in the image and the calibration target model. However, it is preferable that it operates on a fixed characteristic of the target. Fixed characteristics include spatial characteristics such as length or 3D shape. In a preferred embodiment the comparison uses subpixel image registration to find localised shifts between the calibration image and the model of the calibration target. Particular embodiments of subpixel image registration are described in NZ720269 (WO2017200395), incorporated herein by reference.

Broadly speaking, the multiple camera system 42 comprises at least two cameras 40 arranged above a field of view. Each of the plurality of cameras 40 is then able to image the object (e.g. hand 43), typically in a number of different positions. An example set-up is shown in Figure 2, where four cameras 40 are shown at the corners of a field of view; however the method is not limited to this arrangement, or to four cameras. In a particular embodiment, used for testing and experiment, a system comprises four monochrome USB 3 cameras (Point Grey FL3-U3-13Y3M-C), equipped with 6 mm focal length lenses (Fujinon DF6HA-1B). The image size of these cameras was 1280 pixel × 1024 pixel. The FOV of this stereoscopic system was approximately 200 mm × 200 mm, with an average distance of 200 mm to the cameras. The cameras can be focused in the FOV using a focusing pattern. In an example, 100 calibration images could be taken to cover the whole FOV at various distances and angles to the four cameras of the setup.

Figure 3 shows a calibration template or target 50 used to calibrate the system of Figure 2. The target 50 is not limited to a checkerboard pattern as shown. Some examples of calibration targets 50 are templates that comprise circular control points, two orthogonal 1D objects, or four collinear 1D markers. The template or target 50 comprises a plurality of calibration objects, such as square vertices 52. In further embodiments the calibration targets 50 may include objects with an array of patterns that have a range of spatial frequencies. The checkerboard template 50 was printed and attached to a 3 mm thick acrylic sheet using an adhesive spray, resulting in a flat 2D template. The checkerboard square 51 size and number of squares 51 were selected based on the size of the FOV, and the average distance of the FOV to the cameras. A checkerboard template of size 9 × 12 (i.e. 8 × 11 inner corners) with a square size of 6 mm was chosen as the calibration template.

Figure 7 shows an alternative calibration object 50 (also referred to as a template or target) consisting of groups of concentric circles 53 positioned on a 3 × 4 grid. The characteristics of the template, including the calibration template size, the radius of circles, and the distance between the concentric circles are generally selected based on the size of the FOV, and the average distance of the FOV to the cameras. In preferred embodiments an accurate image of the template 50 or an accurate reconstruction of the template is available. For instance an SVG (scalable vector graphics) image, or other image type produced by a 3D drawing program, may be used. This calibration target offers a number of calibration objects 53 (such as circle centres).

Figure 4 shows an identification of a plurality of calibration points 54 on the calibration target 50. In this case calibration points 54 are identified by the intersection points of the checkerboard. However, it should be understood that only a subset of such points needs to be used, or that calibration objects which have not been previously identified may be chosen based on the images received. In particular it may be advantageous to choose calibration points or objects 51, 53 with varying spatial frequency between them.

The pinhole camera model is a commonly used simple mathematical representation of a camera without a lens and with a very small aperture opening. The pinhole model is useful to solve the camera equations with geometric optics. In the pinhole model, the relation between the physical 3D position of a point ([x, y, z]) and its corresponding pixel position ([u, v]) in the camera image plane is found using:

$$s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} f_x & \gamma & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}$$

where $r_{nm}$ are the elements of the rotation matrix and $t_n$ are the elements of the translation vector.
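A minimal numpy sketch of this projection follows; the intrinsic values and pose are illustrative placeholders, not calibration results.

```python
import numpy as np

K = np.array([[800.0,   0.0, 640.0],      # fx, skew, cx
              [  0.0, 800.0, 512.0],      # fy, cy
              [  0.0,   0.0,   1.0]])     # intrinsic matrix
R = np.eye(3)                             # rotation matrix elements r_nm
t = np.array([0.0, 0.0, 200.0])           # translation vector elements t_n

def project(p_world):
    # [x, y, z] in world coordinates -> [u, v] pixel coordinates.
    p_cam = R @ p_world + t               # world -> camera frame
    uvw = K @ p_cam                       # homogeneous pixel coordinates
    return uvw[:2] / uvw[2]               # divide by the scale factor s

print(project(np.array([10.0, -5.0, 0.0])))   # -> [680., 492.]
```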

In practice, lenses are required to focus the light, distorting the camera images. Lens distortions are mathematically defined as displacements between the observed pixel positions of the image features and their calculated positions [u, v]. Radial and tangential distortions are the two most common mathematical models for lens distortions. Radial distortion is corrected in camera images using:

$$x_d = x_u (1 + k_1 r^2 + k_2 r^4 + k_3 r^6)$$
$$y_d = y_u (1 + k_1 r^2 + k_2 r^4 + k_3 r^6)$$

where $(x_d, y_d)$ are distorted locations, $(x_u, y_u)$ are undistorted locations, $k_i$ are the distortion coefficients, and $r$ is the distance of the distorted locations from the principal point. The radial distortion model is in the form of a Taylor series expansion around the principal point (or the image centre), and is symmetric about the centre. The tilting between the x and y axes is often modelled as a tangential distortion, with two more distortion coefficients, $p_1$ and $p_2$. Correcting radial and tangential distortions can be performed with Brown's distortion model:

$$x_d = x_u (1 + k_1 r^2 + k_2 r^4 + k_3 r^6) + \left[ 2 p_1 x_u y_u + p_2 (r^2 + 2 x_u^2) \right]$$
$$y_d = y_u (1 + k_1 r^2 + k_2 r^4 + k_3 r^6) + \left[ p_1 (r^2 + 2 y_u^2) + 2 p_2 x_u y_u \right]$$

Tangential distortions are asymmetric about the centre, and the corrected position of a point depends on both its current distorted x and y positions ($x_d$ and $y_d$).
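A direct transcription of Brown's model above into numpy; the coefficient values are arbitrary placeholders.

```python
import numpy as np

def brown_distort(xu, yu, k1, k2, k3, p1, p2):
    # Map undistorted normalised coordinates (xu, yu) to their distorted
    # locations (xd, yd) using radial (k) and tangential (p) terms.
    r2 = xu**2 + yu**2
    radial = 1.0 + k1 * r2 + k2 * r2**2 + k3 * r2**3
    xd = xu * radial + (2.0 * p1 * xu * yu + p2 * (r2 + 2.0 * xu**2))
    yd = yu * radial + (p1 * (r2 + 2.0 * yu**2) + 2.0 * p2 * xu * yu)
    return xd, yd

print(brown_distort(0.3, -0.2, k1=-0.25, k2=0.08, k3=0.0, p1=1e-3, p2=-5e-4))
```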

Figure 1a shows flow charts at different stages of the calibration method. The multiple camera calibration method is generally a two-step method. The initial values of the intrinsic and extrinsic parameters are found in the first step 10, and then used as the initial values in an optimisation process to find the optimised parameters in the second step 12.

The first step of the method is to initialise calibration values 10. The initial camera and lens parameters can be found based on the method of Zhang. However it may also be suitable to estimate values, use the values calculated by a prior calibration technique, use manufacturer values, or create values by some other method. In embodiments other methods for finding the initial values may be adopted. In one example the initial values for the extrinsic parameters of the cameras 40 (i.e. 3D positions of the cameras with respect to the target or field of view) were found using 3D pose estimation for a set of images taken at various positions of the checkerboard template. Although the described methods for finding the initial values are able to find relatively good initial values, in preferred embodiments the proposed method is able to estimate the parameters using significantly inaccurate initial values. For instance, the initial values for the camera parameters and their 3D positions could be chosen at step 10 purely based on the lens and camera specifications and the physical configuration of the cameras. In a particular embodiment a preprocessing step was used on checkerboard images 50 to improve the corner detection for the intrinsic initial values. The preprocessing step converted the colour images to greyscale images, and applied a 2D median filter with a neighbourhood size of 3 pixel × 3 pixel. The median filter improves the performance of the algorithm for finding the checkerboard corners or vertices 54 by removing some of the camera sensor noise and sudden changes in the illumination. However, median filters shift the borders of the image (i.e. the corner locations of the checkerboard), so were only used in the first step for finding the initial locations of the corners.

In a particular embodiment, for the initial values 10 of the extrinsic parameters, the tangential distortion coefficients, and the skew term in the camera intrinsic matrix, were assumed to be zero at this step to help the OpenCV optimisation process by reducing the number of parameters (the tangential distortion coefficients were later estimated in the proposed global optimisation process for estimating the parameters). The images of the same 3D position of the calibration template 50 differ across the cameras 40, since in each camera 40 the images are based on that camera's coordinate system. The perspective transformation (homography) between the corner locations 54 in the image and their corresponding metric locations in the ideal calibration template was used to estimate the 3D position of the checkerboard template 50 in each image. The random sample consensus (RANSAC) algorithm was used to find the perspective transform, assuming that the ideal calibration template was placed in the z = 0 plane of the camera coordinate system with its first corner located at [x y z] = [0 0 0]. With this assumption, the 3D position of the checkerboard template 50 in the camera coordinate system was found for each calibration image and in each camera. Next, the 3D positions of the cameras were estimated by knowing the 3D positions of the checkerboard template and choosing a world coordinate system (i.e. a common origin).
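A minimal OpenCV sketch of this homography step; the detected corners are synthetic placeholders standing in for real detections.

```python
import numpy as np
import cv2

square = 6.0                                  # mm, checkerboard square size
rows, cols = 8, 11                            # inner corners
# Ideal template corner grid, assumed to lie in the z = 0 plane with its
# first corner at the origin (metric coordinates).
template = np.array([[c * square, r * square]
                     for r in range(rows) for c in range(cols)],
                    dtype=np.float32)
detected = template * 2.5 + 100.0             # placeholder corner detections

# RANSAC-fitted perspective transform (homography) between the metric
# template and its image, used to estimate the template's 3D pose.
H, inliers = cv2.findHomography(template, detected, cv2.RANSAC, 3.0)
print(H)
```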

To make the process of adding extra cameras 40 to the stereoscopic system 42 simple, the coordinate system of one of the cameras 40 of the system was selected as the world coordinate system, and the coordinate systems of the other cameras were transformed to this common coordinate system. However, other coordinate systems are possible. Assuming that the 3D positions of the checkerboard template in the coordinate systems of both cameras 40 are known, and the world coordinate system is selected to be the coordinate system of camera 1, the 3D position of camera 2 in the world coordinate system could be found using:

$$R_r = R_2 (R_1)^T$$
$$T_r = T_2 - R_r T_1$$

where $(R_1, T_1)$ and $(R_2, T_2)$ are the vectors indicating the 3D position of the checkerboard template in the coordinate systems of camera 1 and camera 2, respectively, and $(R_1)^T$ denotes the transpose of $R_1$. The $(R_r, T_r)$ values are estimated for each calibration image and will vary across a set of images. However, a single set of values should be chosen for the 3D position of each camera 40. Selecting the average value of $(R_r, T_r)$ for the camera 40 would introduce error, since some of the estimated values are outliers. Therefore, to minimise the error, in this method the average of the three middle values of the median-sorted array was selected for $(R_r, T_r)$ of each camera 40. Other means of selecting an average or other appropriate value for $(R_r, T_r)$ are also suitable.
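A minimal numpy/OpenCV sketch of these relative-pose equations; the two template poses are illustrative placeholders.

```python
import numpy as np
import cv2

# Pose of the checkerboard template seen from camera 1 and camera 2
# (rotation vectors and translation vectors; placeholder values).
rvec1, T1 = np.array([0.0, 0.1, 0.0]), np.array([10.0, 0.0, 200.0])
rvec2, T2 = np.array([0.0, -0.2, 0.0]), np.array([-15.0, 5.0, 210.0])

R1, _ = cv2.Rodrigues(rvec1)          # rotation vector -> rotation matrix
R2, _ = cv2.Rodrigues(rvec2)

R_r = R2 @ R1.T                       # R_r = R2 (R1)^T
T_r = T2 - R_r @ T1                   # T_r = T2 - R_r T1
print(R_r)
print(T_r)                            # camera 2 pose in camera 1's frame
```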

For a system 42 with four cameras 40 and a dataset of eighty calibration images of a checkerboard of size 9 squares × 12 squares (8 × 11 inner corners as control points), sixty parameters (i.e. 4 × 15) should be optimised using 28160 (i.e. 8 × 11 × 4 × 80) control points. The present system and method addresses this problem by proposing an optimisation process using an error matrix 18.

The error functions 15, 16, 17 of this method are able to be summed. However, instead of using a summation of errors, the error is calculated in the form of a matrix. This is, in part, because a summation fails to properly represent the error being detected. For instance, a summation assumes that a single (scalar) value can be used to represent the error. This overlooks the fact that the reprojection errors are not distributed uniformly over the field of view (FOV) and across the cameras: due to the lens distortion effects, the reprojection errors are higher close to the peripheries of the image, and can vary over the FOV and across the cameras of a multi-camera system.

In a particular embodiment a form of the matrix 18 (or objective function) is chosen which includes the reprojection error values, RE:

$$E = \begin{bmatrix} RE_{(1,1)} & \cdots & RE_{(C,1)} \\ \vdots & \ddots & \vdots \\ RE_{(1,L \times N)} & \cdots & RE_{(C,L \times N)} \end{bmatrix}$$

The reprojection error values are measured at each checkerboard corner 54 location, where L is the number of checkerboard corners, N is the number of calibration images, and C is the number of cameras. This error matrix forms an objective: by minimising the error measured at each location (through modification of the parameters K, R, T, k and p), the multi-camera system can be calibrated. This provides a much clearer understanding of which calibration object, image, or camera, $RE_{(a,b)}$, is causing a problem.

A number of methods are capable of measuring the reprojection error; a particular, model-based method is described herein. In the error matrix preferably no assumption is made about the distribution of the errors. In a preferred embodiment comprehensive error information is provided for the optimisation algorithm 19 at each calibration object location of each of a plurality of images across all the cameras 40. The error matrix was estimated in an optimisation process after undistorting the images. Thus, in addition to the camera parameters, lens distortion parameters were taken into account. This is because the reprojection error, and therefore the error matrix, incorporates lens distortion error.
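A minimal sketch of assembling such an error matrix, with one entry per control point, per image, per camera and no summation; the measured and reprojected points are synthetic placeholders.

```python
import numpy as np

L, N, C = 88, 80, 4                   # corners per image, images, cameras

def error_matrix(measured, reprojected):
    # measured/reprojected: (C, N, L, 2) pixel coordinates.
    # Returns an (L*N, C) matrix of per-point reprojection errors.
    d = np.linalg.norm(measured - reprojected, axis=-1)   # (C, N, L)
    return d.reshape(C, N * L).T

rng = np.random.default_rng(2)
measured = rng.uniform(0, 1280, size=(C, N, L, 2))
reprojected = measured + rng.normal(scale=0.2, size=measured.shape)
E = error_matrix(measured, reprojected)
print(E.shape)                        # (7040, 4): one column per camera
```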

Embodiments of the method also use a different algorithm for calculating the optimised values 19 for the error matrix equation. However, in principle any one of the optimisation algorithms that can minimise an error matrix 18 may be used. In a particular embodiment trust-region algorithms have been used, because trust-region methods have very reliable convergence properties, particularly for solving problems with a sparse structure. Minimisation of the reprojection error for camera calibration is an example of a problem with a sparse structure. However, alternative optimisation methods such as gradient descent and steepest descent may also be used.

In preferred embodiments of the system and/or method the error matrix above is improved by the addition of further components which reflect, or reveal other errors present in the system. In a particular embodiment two further error functions or values are introduced. These are based on the 3D information of the reconstructed calibration template. Introducing these to the objective function, in combination with the reprojection error, can help to increase the accuracy and robustness of the optimisation process of calibrating multiple cameras.

A first error function used information about the size (e.g. physical dimensions) of the calibration target, preferably 3D length information 16. That is to say it forms a representation of a change in length between images. In a particular embodiment this can be calculated by subtracting the 3D distance of adjacent triangulated checkerboard corners (or alternative calibration objects) and the known square size of the checkerboard template (or calibration object).

A second error function used a measurement of 3D shape error 17. This measures variations between the actual shape of the calibration target and its image. These variations include rotations or skews or other variations across the calibration target. This may be stated as the difference between the 3D reconstructed template and the ideal template shape (i.e. 3D shape errors). In a particular embodiment based on a checkerboard target these errors are calculated by measuring the Euclidean distances between the triangulated corners and the expected ideal 3D locations of the checkerboard corners (which are estimated knowing the number of rows and columns of the checkerboard template 50 and its square size).

The expected ideal corners might be uniformly scaled due to an uncertainty in finding the 3D location of the calibration template. In other words, 3D shape errors only illustrate the geometric variation of the triangulated corners from an ideal checkerboard template, without taking into account the correct square size. The correct square size can be recovered by measuring 3D length errors.
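A minimal sketch of both 3D error functions for a checkerboard target; the triangulated corners are synthetic placeholders, and alignment of the two point sets is assumed.

```python
import numpy as np

square = 6.0                                   # mm, known square size
rows, cols = 8, 11                             # inner corners
ideal = np.array([[c * square, r * square, 0.0]
                  for r in range(rows) for c in range(cols)])
# Stand-in for corners triangulated from the calibrated cameras.
tri = ideal + np.random.default_rng(3).normal(scale=0.05, size=ideal.shape)

grid = tri.reshape(rows, cols, 3)
# Length errors: 3D distance between adjacent corners minus the square size.
length_err = np.concatenate([
    np.linalg.norm(np.diff(grid, axis=0), axis=-1).ravel() - square,
    np.linalg.norm(np.diff(grid, axis=1), axis=-1).ravel() - square])
# Shape errors: Euclidean distance of each triangulated corner from its
# expected ideal 3D location.
shape_err = np.linalg.norm(tri - ideal, axis=-1)
print(length_err.shape, shape_err.shape)       # (157,) (88,)
```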

In an embodiment where both length errors and 3D shape errors are introduced into the error matrix 18, the objective can be stated as:

$$\arg\min_{[K],[R],[T],[k_1],[k_2],[k_3],[p_1],[p_2]} \begin{bmatrix} RE_{(1,1)} & \cdots & RE_{(C,1)} & LE_{(1)} & SE_{(1)} \\ \vdots & \ddots & \vdots & \vdots & \vdots \\ RE_{(1,L \times N)} & \cdots & RE_{(C,L \times N)} & LE_{(L \times N)} & SE_{(L \times N)} \end{bmatrix}$$

where RE is the reprojection error, LE is the length error, and SE is the 3D shape error (note that LE and SE are 3D measurements, and are only defined once for a multiple-camera system). In other embodiments it may be useful to add further error measurements to the matrix, such as angular error measurements of the target or errors at different spatial frequencies. Furthermore, the above matrix form is not the only matrix form 18 of the system. Although the above form helps to separate the individual parts of the errors, it is possible to combine, or separate, parts of different errors to form a matrix having the same information in a different arrangement. One advantage of the proposed embodiment is that the arrangement has a relatively sparse structure, including the reprojection error values and the 3D length and 3D shape errors, which can improve the optimisation process.

The 3D length 16 and 3D shape 17 errors are measured in metres, but the reprojection errors are measured in pixel units. Different units are preferably not directly combined in a single optimisation process, since they generally have different scales and physical meaning. To overcome this issue the different error measurement values should be converted to a common unit scale. Any unit scale should be suitable. In the present embodiment units of pixels were chosen. The average pixel size of the FOV was approximated in metres, and was used to convert the units of the 3D length and 3D shape error functions to pixel units. An alternative length measure could also be used. This enables all the error functions of the objective function to have the same unit of pixels, which produces a consistent objective function. The errors associated with the lens distortion can be taken into account in the objective function by mapping the distorted checkerboard corners to undistorted locations, prior to estimating the error functions at each iteration 13 of the optimisation process.
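A minimal sketch of this unit conversion, using the FOV and sensor figures from the example system above (approximately 200 mm imaged across 1280 pixels).

```python
# Approximate average pixel size from the example setup: ~200 mm field of
# view imaged across 1280 pixels (values taken from the test system above).
fov_m = 0.200                         # field-of-view width in metres
image_width_px = 1280
pixel_size_m = fov_m / image_width_px # ~1.56e-4 m per pixel

def metres_to_pixels(error_m):
    # Convert a 3D length or shape error (metres) to pixel units so it
    # can sit in the same error matrix as the reprojection errors.
    return error_m / pixel_size_m

print(metres_to_pixels(0.0005))       # a 0.5 mm error -> ~3.2 pixels
```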

The process of corner 54 finding is error-prone, particularly when images are noisy, blurred, or have areas with specular reflections. As a result, the error measurements become inaccurate in such images. The quality of the optimisation process can be improved by removing the outliers caused by the failure of the corner finding algorithm. In embodiments of the method outliers are detected and removed from the defined error matrix (or objective function) based on a comparison with an average or expected value, such as the average error values in the whole dataset for each camera. In a particular embodiment the average values were calculated for the reprojection errors of each camera as:

$$\overline{RE}_c = \frac{1}{L \times N} \sum_{i=1}^{L \times N} RE_{(c,i)}$$

A threshold was used to test whether a measurement was an outlier. In a particular embodiment, reprojection error values greater than 10 times the average reprojection error estimated above were detected as outliers and were removed from the error matrix. This choice of threshold for detecting the outliers attempts to ensure that only outliers will be removed, not points that may have larger error due to lens distortions, such as images with checkerboard corners at the peripheries. However, larger or smaller thresholds may be used as required.

The input parameters of multiple camera calibrations have various ranges and units. For instance, the ranges of rotation vectors are between 0 and 2π radians, while the translation vectors can vary considerably. Furthermore, objective functions have different levels of sensitivity to the input parameters, which means that variations of input parameters could have different effects on the output error in optimisation processes. In embodiments of the invention the input parameters are scaled prior to using them as the inputs of the optimisation process.

The sensitivity of the objective function to each input parameter is often indicated based on its partial derivatives. In a particular embodiment the scaling of parameters can be performed in two steps. In the first step, the input parameters of each camera are divided into groups with the same units and similar magnitude, which were:

the camera focal length,

principal point,

lens distortion coefficients,

rotation vector, and

translation vector.

The parameters of each group are normalised with respect to the largest value of that group.

The sensitivity of the objective function to the changes of the input parameters is an important factor that affects the convergence rate and robustness of the optimisation process. In a particular embodiment this can be addressed by a second step where the input parameters (in the divided groups) are scaled based on the sensitivity of the objective function to the changes of that group of parameters, estimated using the Jacobian matrix (J) of the objective function. In other embodiments an alternative way of assessing the sensitivity of the optimisation process to input parameters could be used. The Jacobian matrix is an approximation of the partial derivative of the objective function (O) at the initial values:

$$J_{(i,j)} = \frac{\partial O_i}{\partial x_j}, \qquad i = 1, \ldots, NE$$

where O is the objective function, $x_j$ are the camera calibration parameters (input parameters), and NE is the number of elements of the error matrix (NE is equal to L × N × (C + 2)). The average values of the columns of the Jacobian are an approximation of the partial derivative of O with respect to each input parameter ($x_j$), and were thus used as the measure of the sensitivity of the objective function to that input parameter. In this step of the scaling process, the groups of input parameters were scaled according to the average values of the Jacobian matrix, so that more sensitive parameters become larger and vice versa. This is intended to result in a well-scaled problem in which any change in the input parameters will have a similar effect on the error functions (or the objective function). Note that preferably all the input parameters are scaled using this two-step method prior to using them as inputs of the objective function, whereas inside the objective function the input parameters are scaled back to their original values to estimate the error parameters. In other embodiments an alternative partial-difference method could be used.

Figure 5 shows RMS reprojection errors for two 58, 59 of four cameras 40, RMS 3D error 60, and RMS length error 61 for a calibration dataset using the method of Figure 1a 56 and a traditional method 57. This demonstrates that the new method converged quickly and resulted in small errors. The results show both a smaller overall error and a faster convergence rate.
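Returning to the two-step parameter scaling described above, the following is a minimal sketch with a toy objective; the grouping and the finite-difference Jacobian are illustrative, not the patent's exact procedure.

```python
import numpy as np

def objective(x):
    # Toy stand-in for the error-matrix objective (flattened residuals).
    return np.array([x[0] ** 2 + 10.0 * x[1], 0.01 * x[2] - x[0]])

def group_sensitivities(x0, groups, eps=1e-6):
    # Finite-difference Jacobian J[i, j] ~ dO_i/dx_j at the initial values.
    n_err, n_par = objective(x0).size, x0.size
    J = np.empty((n_err, n_par))
    for j in range(n_par):
        dx = np.zeros_like(x0)
        dx[j] = eps
        J[:, j] = (objective(x0 + dx) - objective(x0 - dx)) / (2.0 * eps)
    col_avg = np.abs(J).mean(axis=0)  # average column magnitude = sensitivity
    return {name: col_avg[idx].mean() for name, idx in groups.items()}

# Step 1 (normalisation within each group) is assumed already done; step 2
# scales each group by the objective's sensitivity to it.
x0 = np.array([1.0, 0.5, 100.0])
groups = {"focal": np.array([0]), "rotation": np.array([1]),
          "translation": np.array([2])}
print(group_sensitivities(x0, groups))
```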

Figure 6 shows two sample lines fitted to one row of corners (calibration objects) in a distorted image (left) and in its undistorted version (right). It is clear from this comparison that the original image was badly distorted by the multi-camera system 42, and that the distortion has been corrected by the optimisation of the calibration values. The multi-camera system can now be used to image unknown objects as required.

In further embodiments of the method the system is adapted for a multi-camera system, such as that shown in Figure 2, with a variable number of cameras 40. This is possible, in part, because the error matrix is arranged to have a separate column for each camera in the system. This means that, where a further camera 40 is introduced, only one column of the error matrix needs to be updated, which is intended to provide an accurate initial value. Similarly, where different cameras are used, the system does not attempt to find a single value for what may be different cameras, but enables details about each camera to be processed separately.
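Purely as an illustration of this layout (the dimensions and zero initialisation are placeholders, not values from the text):

```python
import numpy as np

# One column per camera: adding a camera appends a column, leaving the
# previously optimised columns untouched as good initial values.
n_error_elements = 500                           # placeholder for L x N x (C + 2)
error_matrix = np.zeros((n_error_elements, 4))   # a four-camera system

def add_camera_column(error_matrix):
    """Extend the error matrix for a newly introduced camera."""
    return np.hstack([error_matrix, np.zeros((error_matrix.shape[0], 1))])

error_matrix = add_camera_column(error_matrix)   # now five cameras
```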

Figure 1b shows a flowchart of an embodiment of the system. In a first step, the initial values of the parameters of the multi-camera system are estimated or initialised 30. This may use a rough calibration system, as in the earlier method, or may use any one or more of the methods described above. For example, a checkerboard template 50 may be used for estimating the initial values of the parameters 30 at the first step because of its good reliability in the presence of lens and perspective distortions. The estimated initial values of the lens distortion coefficients allow most of the lens distortion effects to be corrected 33 in the images 32. Removing the lens distortion effects paves the way for using concentric circle templates, which are typically more sensitive to lens distortions than checkerboard templates, but can provide higher accuracy for localising the control points in low-distortion images. In a particular embodiment a series of calibration images is taken with the system and used to estimate the initial values of the parameters of the multi-camera system. Other methods of obtaining initial calibration values will be known to the skilled person.
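For example, the rough initial values for a single camera could be estimated with a standard checkerboard calibration; the sketch below uses OpenCV's calibrateCamera as a stand-in for this step, with an assumed pattern size and square spacing:

```python
import cv2
import numpy as np

def initial_checkerboard_calibration(images, pattern_size=(9, 6), square_mm=10.0):
    """Rough single-camera calibration from checkerboard images.

    pattern_size (inner corners) and square_mm are illustrative values,
    not taken from the patent.
    """
    objp = np.zeros((pattern_size[0] * pattern_size[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:pattern_size[0], 0:pattern_size[1]].T.reshape(-1, 2)
    objp *= square_mm
    obj_points, img_points = [], []
    image_size = None
    for img in images:
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        image_size = gray.shape[::-1]                 # (width, height)
        found, corners = cv2.findChessboardCorners(gray, pattern_size)
        if found:
            corners = cv2.cornerSubPix(               # refine corners to subpixel accuracy
                gray, corners, (11, 11), (-1, -1),
                (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
            obj_points.append(objp)
            img_points.append(corners)
    # Returns RMS reprojection error, intrinsics, distortion coefficients,
    # and per-image rotation and translation vectors.
    return cv2.calibrateCamera(obj_points, img_points, image_size, None, None)
```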

In an embodiment the initial values of the camera parameters are found using a calibration method. The initial parameters are then refined using a designed calibration target (for instance one consisting of concentric circles) and a reconstructed model 31 of this calibration target. The reconstructed, or control, model (a synthetically generated model 31) is, for instance, reconstructed in the estimated 3D position of the calibration target in the calibration images using a projection estimate. This enables simultaneous refinement of the control point locations and estimation of the lens distortion effects. The discrepancies 34 between the calibration target and its reconstructed model are measured using an algorithm for subpixel image registration. Zernike polynomials can be used as mapping functions to define a forward lens distortion model. This process improves the localisation of the control points and the characterisation of the lens distortion. It can be implemented with the earlier method, or any other method, where lens distortion is calculated 35. The improved localisation of the control points means that a following calibration step 36 is more accurate, because the input parameters (e.g. reprojection errors) have been more accurately calculated. In an embodiment of the system a model-based technique is employed to refine the camera parameters and estimate the lens distortion model. A further set of calibration images is taken 32 by each of the cameras 40 of the multi-camera system 42. This may use a different calibration target 50, such as the concentric circle calibration target of Figure 7. Images are taken in the FOV at various distances and positions with respect to the cameras, and the lens distortion effects are removed from the calibration images using the initial values of the estimated parameters.

After the calibration images have been obtained 32, embodiments of the method proceed as follows:

a series of control points, or calibration objects, is found in the undistorted calibration images and sorted in a pre-specified order;

the sorted control points are refined, and the effects of lens distortion are estimated;

a lens distortion model is defined and used to remove the lens distortion effects from the images 33; and

the cameras are recalibrated based on the new estimated parameters 36.

The step of locating or obtaining the calibration object locations (control points) involves segmenting the template from the background. This may use a Canny edge detection algorithm to convert the images to binary images. The components which fall outside a size range that identifies the calibration objects may then be removed (for instance 100 pixels < calibration object < 1000 pixels); the size range will depend on the camera image size and the calibration object. The calibration objects were then found based on the geometric characteristics of the components of the binary image, although other methods are possible. In a particular embodiment using concentric circles, ellipse fitting was used to find the centre of the concentric circles. The ratio of the length of the major axis to the minor axis of the fitted ellipse (Ra, ranging from 1 for a perfect circle upwards) was calculated, because circles become elliptic under perspective distortion in the camera images. The components of the binary image where Ra was smaller than, for example, 2 can be selected as the components that have the required geometric characteristics to be the circles 53 of the calibration template 50. Even though the upper threshold for the Ra value depends on the amount of perspective distortion in the calibration images, the selected Ra value is valid over a wide range of calibration images.
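A sketch of this segmentation and ellipse-fitting step using OpenCV follows; the Canny thresholds (50, 150) are assumptions, and contour area is used as a stand-in for component size:

```python
import cv2

def find_circle_components(gray, min_size=100, max_size=1000, max_ratio=2.0):
    """Segment the template from the background and keep near-circular components.

    The size bounds follow the illustrative 100-1000 pixel range above.
    """
    edges = cv2.Canny(gray, 50, 150)                  # convert to a binary edge image
    contours, _ = cv2.findContours(edges, cv2.RETR_LIST, cv2.CHAIN_APPROX_NONE)
    centres = []
    for c in contours:
        if not (min_size < cv2.contourArea(c) < max_size):
            continue                                  # outside calibration-object size range
        if len(c) < 5:
            continue                                  # cv2.fitEllipse needs >= 5 points
        (cx, cy), axes, _angle = cv2.fitEllipse(c)    # least-squares ellipse fit
        ra = max(axes) / max(min(axes), 1e-9)         # major/minor axis ratio (1 = circle)
        if ra < max_ratio:                            # keep components with Ra < 2
            centres.append((cx, cy))
    return centres
```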

Because circles 53 may become elliptic under perspective distortion, a least-squares method was used to find the best-fit ellipse for the data points of each component of the binary image. The centres of the fitted ellipses were used as the centres of the circles of the calibration template 50. Even though the concentric circles 53 have a common centre, because of errors in identifying the centre of each circle, several closely placed centres were found for the members of each group of concentric circles. Therefore, the k-means clustering method was used to divide the found centre positions into a number of clusters equal to the number of control points, and the median values of the x and y positions of the centres in each cluster were selected as the centre of that group of concentric circles (i.e. the control point of the calibration template). The median value of the centres helps to reduce the error in finding the centres of the circles in the calibration images.

After being found, the centres (control points, calibration points) were sorted, for instance using defined markers of the calibration template, to estimate the position and orientation of the calibration template. The four markers were identified in the calibration images based on a characteristic of the markers; for instance, in the object of Figure 7, the number of circles in each group of concentric circles (marker 1 had five, marker 2 had four, and markers 3 and 4 each had three concentric circles). The number of circles 53 was found after clustering the centres using k-means clustering in the previous step. In other cases distance could be used: for instance, markers 3 and 4, which had the same number of circles, were distinguished based on the Euclidean distance of their centres from the centre of marker 1 in the calibration images (marker 3 had the shorter distance). If necessary the remaining calibration objects can also be given a specified order to allow sorting. In an embodiment of the method, the control points of the calibration images were mapped to the control points of the calibration model (the synthetically generated model) using some known information about the geometry of the calibration template 50.

In embodiments of the method the system uses a comparison between the image(s) obtained from the multi-camera arrangement and the model of the calibration target 31 (which may be referred to as a control model). This process attempts to reduce the error of the algorithm used in localising the control points and the remaining effects of lens distortion. The control model may be generated or prepared 31 using an image of the calibration target, such as its SVG image, and the distance between the control points can be converted from mm to pixels by known means. Although the control model can be obtained in a number of ways (by measurement, ray tracing or otherwise), an advantage of using a 3D modelling program is that the control model is designed with known dimensions, so the exact locations of the control points are known in the model of the calibration template.
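By way of example, the clustering and median selection might be sketched with scikit-learn's KMeans; the helper name and the use of scikit-learn are assumptions:

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_control_points(centres, n_control_points):
    """Group the closely spaced circle centres of each concentric-circle set
    and take the median x, y of each cluster as the control point."""
    centres = np.asarray(centres, dtype=float)
    labels = KMeans(n_clusters=n_control_points, n_init=10).fit_predict(centres)
    control_points, cluster_sizes = [], []
    for k in range(n_control_points):
        members = centres[labels == k]
        control_points.append(np.median(members, axis=0))   # median reduces centre error
        cluster_sizes.append(len(members))                  # circle count identifies markers
    return np.array(control_points), cluster_sizes
```

The cluster sizes returned here correspond to the number of circles per group, which is the characteristic used above to identify markers 1 and 2.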

This enables a comparison 34 to be made between the calibration objects of the measured model (i.e. from the images) and of the control model (the ideal model from the calibration target). A projective transformation between the control points of the calibration template model and the control points of the control model may be found using a least-squares method. In embodiments the initial values of the control points were estimated in undistorted images, which helped to estimate a more accurate projective transformation between the calibration model and the calibration image. This projective transformation is based on the current lens distortion parameters; therefore it can be used to reconstruct the control model in the estimated location of the measured model. That is to say, the projective transformation is used to make the control model appear to match the measured model. Any differences now seen between the projected control model and the measured model are discrepancies, or errors, which can be measured and compensated 35. This is a forward lens distortion model, which maps from the distorted locations to the undistorted locations. In different embodiments an inverse model may be used.
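A least-squares projective transformation of this kind can be sketched with OpenCV (method 0 in findHomography is the plain least-squares estimator; the function name and output-size parameter are illustrative):

```python
import cv2
import numpy as np

def reconstruct_control_model(model_points, image_points, model_image, output_size):
    """Fit a projective transformation from control-model points to the
    measured control points, then warp the control model into the estimated
    location of the measured model. output_size is the (width, height)
    of the calibration image."""
    H, _ = cv2.findHomography(np.float32(model_points),
                              np.float32(image_points), 0)
    return cv2.warpPerspective(model_image, H, output_size), H
```

Residual differences between the warped control model and the calibration image are then the discrepancies to be measured and compensated 35.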

In particular, the method can be used where the comparison 34 between the measured model and the control model is a comparison of a fixed property of the calibration target. This includes, but is not limited to, a spatial dimension (e.g. length errors) or a 3D shape feature of the calibration target. These features are most useful because they can be extracted from calibration images in the multi-camera system 42 and compared to a model. Furthermore, these features are fixed, and do not alter during the calibration process.

Figure 8 shows a sample calibration image 80 (Fig. 8a) and the generated model of the calibration template 81 (Fig. 8b). Zoomed views of the concentric circles on the bottom left side of the calibration template in the calibration image 82 and the model 83 are also provided in Fig. 8 for comparison. As the zoomed views 82, 83 illustrate, the expected view of the calibration template (i.e. the model 83) was reconstructed very similarly to the actual calibration image 82. However, the reconstructed model of the calibration template and the calibration image show some minor local discrepancies due to errors in localising the control points and lens distortion effects.

The local discrepancies (subpixel shifts or errors) between the reconstructed model of the calibration template (the control model) and the calibration images (as taken by the multi-camera system) may now be estimated in the x and y directions. This may use any pixel registration method but preferably uses a subpixel image registration such as that described in NZ720269. In a preferred embodiment the method uses the phase-based Savitzky-Golay gradient-correlation (P-SG-GC) subpixel image registration algorithm described in NZ720269; however other means will be known. In a particular embodiment the discrepancies between the calibration image and the reconstructed model of the calibration template (Dx and Dy) were measured in subimages of size 128 pixel × 128 pixel using the P-SG-GC algorithm, the subimages being chosen around the (x, y) coordinates of the control points of the calibration template (64 pixel in each direction) for both the reconstructed model and the calibration images. The local discrepancy data (subpixel shifts) were measured in the subimages of all the calibration images of the multiple cameras in the x and y directions, and were used to create the lens distortion model. The subimage size of 128 pixel × 128 pixel provided a good trade-off between the locality of the measurements and having adequate image features in the subimages. Image features are useful for performing subpixel image registration, and the concentric circles of the calibration template provided suitable features for this purpose. However other features are suitable and, in particular, features with a range of spatial frequencies may be particularly useful.
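The P-SG-GC algorithm of NZ720269 is not reproduced here; as a stand-in, standard phase cross-correlation from scikit-image can illustrate how 128 × 128 pixel subimages around each control point yield local Dx, Dy values:

```python
import numpy as np
from skimage.registration import phase_cross_correlation

def measure_discrepancies(calib_image, model_image, control_points, half=64):
    """Measure local subpixel shifts (Dx, Dy) in 128 x 128 subimages centred
    on each control point. Phase cross-correlation stands in for the
    P-SG-GC algorithm of NZ720269, which is described only in that document."""
    dx, dy = [], []
    for x, y in control_points:
        r, c = int(round(y)), int(round(x))
        sub_img = calib_image[r - half:r + half, c - half:c + half]
        sub_mod = model_image[r - half:r + half, c - half:c + half]
        shift, _, _ = phase_cross_correlation(sub_mod, sub_img, upsample_factor=100)
        dy.append(shift[0])    # row (y) shift
        dx.append(shift[1])    # column (x) shift
    return np.array(dx), np.array(dy)
```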

NZ720269 describes an image registration technique for a plurality of images. The image registration technique comprises the steps of: obtaining an image characteristic at a plurality of points in each image; and estimating the gradient of the image characteristic, the gradient estimate comprising a feature extracting function. An accurate registration is achieved because the gradient calculation combines multiple neighbouring points of the gradient measurement. The process may also involve the application of a feature extracting function, such as a smoothing function. This may be combined with (or include) a further operator for extracting or emphasising some of the image characteristics, such as a differentiator kernel (gradient). Although the gradient appears to be a minimal factor in the calculation, and a more complex or higher-order function increases the computational load, the addition of a smooth gradient, or a smoothing filter combined with a differentiator kernel, has a large beneficial impact on the image registration. Further aspects of the image registration method include the steps of: obtaining a frequency domain representation of a smoothing function; and applying the frequency domain representation of the smoothing function to a frequency domain representation of a cross correlation. In a further aspect the image registration comprises a two-step process comprising the steps of obtaining an estimate of the integer pixel shift between the images, and obtaining an estimate of the subpixel shift between the images.

In embodiments of the method the control points of the reconstructed calibration model in each image were selected as the refined control points of the calibration target in that image, and the Dx and Dy values were used to characterise the lens distortion effects. This is because the errors associated with localising the control points are likely to be smaller and more random than the lens distortion effects.

As described above, Brown's distortion model can be used to correct radial and tangential distortions. Brown's distortion model uses a Taylor series expansion around the principal point. In embodiments of the invention Zernike polynomials are used instead. Other suitable polynomials include those which can accurately model symmetric shapes and include some basis functions; an example of these types of functions is Bessel functions. In an embodiment of the method the measured Dx and Dy values provided data about the lens distortion behaviour at the locations of the control points of the calibration target in all of the calibration images of the camera. To characterise this behaviour for each camera, two independent Zernike polynomials were fitted to the measured Dx and Dy values in all of the cameras. The x and y inputs of the Zernike polynomials were the x and y coordinates of the control points in the calibration images, and the z input was the measured Dx or Dy value at that location. Preferably, the x and y positions of the control points were normalised within the unit circle prior to being used for fitting the Zernike polynomials; this takes advantage of the orthogonality of Zernike polynomials within the unit circle.

Zernike polynomials have some advantages: they are orthogonal over the continuous unit circle, and they can readily capture and model different aspects of the signal shapes. After the lens distortion effects were characterised, two separate sets of Zernike polynomials were used to estimate the mapping function that maps the distorted x and y locations of the points to their undistorted locations. The forward lens distortion model was used to avoid solving an inverse problem, and to increase accuracy.

By fitting Zernike polynomials to the Dx and Dy values, the Zernike polynomials become a mapping function that, given the x and y coordinates of a point, can estimate the amount of shift caused by the lens distortion effects at that location. Thus, the undistorted locations of points can be recovered by subtracting the estimated shift from the distorted locations. An advantage of using Zernike polynomials to map the amount of shift between distorted and undistorted images, rather than to map the distorted locations directly to undistorted locations, is their suitability for fitting to symmetric shapes that are similar to lens distortions or lens aberrations.
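A sketch of this fitting and forward mapping follows, with a small hand-rolled Cartesian Zernike basis up to third order; the particular mode set is an assumption, as the text does not list the modes used:

```python
import numpy as np

def zernike_basis(x, y):
    """Cartesian Zernike modes up to third order on the unit disc (illustrative set)."""
    r2 = x**2 + y**2
    return np.column_stack([
        np.ones_like(x),                    # piston
        x, y,                               # tilt
        2*r2 - 1,                           # defocus
        x**2 - y**2, 2*x*y,                 # astigmatism
        (3*r2 - 2)*x, (3*r2 - 2)*y,         # coma
        x**3 - 3*x*y**2, 3*x**2*y - y**3,   # trefoil
    ])

def fit_shift_model(px, py, dx, dy):
    """Fit independent Zernike expansions to the measured Dx and Dy shifts.
    px, py must already be normalised into the unit circle, as described above."""
    A = zernike_basis(px, py)
    cx = np.linalg.lstsq(A, dx, rcond=None)[0]
    cy = np.linalg.lstsq(A, dy, rcond=None)[0]
    return cx, cy

def undistort_points(px, py, cx, cy):
    """Forward model: subtract the estimated shift from the distorted locations."""
    A = zernike_basis(px, py)
    return px - A @ cx, py - A @ cy
```

Subtracting the modelled shift implements the forward model described above, avoiding the inverse problem.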

The Gram-Schmidt orthogonalisation technique allows the expansion of discrete data in terms of the Zernike polynomials while retaining orthogonality; this accounts for the fact that Zernike polynomials are only orthogonal over the continuous unit circle, whereas the Dx and Dy data characterising the lens distortion are discrete. Preferably third-order Zernike polynomials are used, as this offers a balance between complexity and overfitting in higher-order systems. Other polynomials may also be useful.

The method has now obtained improved calibration estimates 36 of the multi-camera system 42. For instance, the refined control points of a concentric calibration target 50 are less prone to error compared to the corners of a checkerboard. In a preferred embodiment the calibration process is repeated or iterated. This refines the parameters of the multi-camera system 42 using the refined control points of the designed concentric calibration target obtained from the calibration images. Even though the lens distortion effects had been estimated and modelled, the radial and tangential distortion coefficients were included in the process of recalibrating the cameras with initial values of zero. As they were estimated to be zero, this showed that these distortive effects had largely been corrected. The four cameras of the stereoscopic system were recalibrated using the error-matrix multi-camera calibration technique described above; however other methods will be known.
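Returning to the Gram-Schmidt orthogonalisation mentioned above, a minimal sketch over the discrete sample points (a standard construction, not code from the patent) is:

```python
import numpy as np

def gram_schmidt_columns(A):
    """Orthonormalise the columns of the Zernike design matrix over the
    discrete sample points, since the polynomials are only orthogonal
    over the continuous unit disc."""
    Q = np.zeros_like(A, dtype=float)
    for j in range(A.shape[1]):
        v = A[:, j].astype(float).copy()
        for i in range(j):
            v -= (Q[:, i] @ A[:, j]) * Q[:, i]   # remove projections onto earlier modes
        Q[:, j] = v / np.linalg.norm(v)
    return Q
```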

In a multi-camera system 42 a first step is to match the corresponding points in each of the cameras for 3D reconstruction of the surface of the flat object. Traditional methods such as block-matching, which typically use cross-correlation to match the corresponding points between cameras, are challenging when matching points between arbitrarily positioned cameras 40, which cause substantial differences between image views. In a preferred embodiment a first camera 40 is chosen as the reference camera and a projective transformation is applied to the images of the remaining cameras to make their views similar to that of the reference camera. The projective transformation is only used as an initial estimate; thus, it does not need to be accurate. Corresponding points between cameras 40 may then be found by extracting and matching image features, or by using block-matching methods and subpixel image registration to find the corresponding points from the transformed images.

In a particular embodiment Camera One was used as the reference camera. A 24 × 39 virtual grid of points with a step size of 10 pixels (i.e. 936 points) was selected on the surface of the flat object in the image of the reference camera. Then, the P-SG-GC subpixel image registration algorithm, with subimages of size 128 pixel × 128 pixel, was used to match the corresponding points between the image of the reference camera and the transformed images of the non-reference cameras. The matched corresponding points were transformed back to the original coordinate systems of the non-reference cameras using the inverse of the projective transformation.
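A sketch of this matching step follows; phase cross-correlation again stands in for P-SG-GC, and the helper name, the assumption that grid points lie away from the image border, and the sign convention of the shift are illustrative:

```python
import cv2
import numpy as np
from skimage.registration import phase_cross_correlation

def match_grid_to_camera(ref_image, other_image, H_to_ref, grid_points, half=64):
    """Find correspondences for a virtual grid of reference-camera points in a
    non-reference camera. The rough projective transform H_to_ref only needs
    to make the views similar; subpixel registration does the precise matching."""
    h, w = ref_image.shape[:2]
    warped = cv2.warpPerspective(other_image, H_to_ref, (w, h))
    matches = []
    for x, y in grid_points:
        r, c = int(round(y)), int(round(x))
        sub_ref = ref_image[r - half:r + half, c - half:c + half]
        sub_oth = warped[r - half:r + half, c - half:c + half]
        shift, _, _ = phase_cross_correlation(sub_ref, sub_oth, upsample_factor=100)
        # shift registers the warped view onto the reference, so the matching
        # point in the warped view is offset by -shift (convention may vary
        # with the registration routine used)
        matches.append([x - shift[1], y - shift[0]])
    # Map matched points back into the original non-reference coordinates.
    pts = np.float32(matches).reshape(-1, 1, 2)
    return cv2.perspectiveTransform(pts, np.linalg.inv(H_to_ref)).reshape(-1, 2)
```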

From the foregoing it will be seen that systems and/or methods of calibration are provided which enable robust and accurate calibration of cameras and, in particular, of multi-camera systems.

Unless the context clearly requires otherwise, throughout the description, the words "comprise", "comprising", and the like, are to be construed in an inclusive sense as opposed to an exclusive or exhaustive sense, that is to say, in the sense of "including, but not limited to".

Although this invention has been described by way of example and with reference to possible embodiments thereof, it is to be understood that modifications or improvements may be made thereto without departing from the scope of the invention. The invention may also be said broadly to consist in the parts, elements and features referred to or indicated in the specification of the application, individually or collectively, in any or all combinations of two or more of said parts, elements or features. Furthermore, where reference has been made to specific components or integers of the invention having known equivalents, then such equivalents are herein incorporated as if individually set forth. Any discussion of the prior art throughout the specification should in no way be considered as an admission that such prior art is widely known or forms part of common general knowledge in the field.