


Title:
INDOOR NAVIGATION SYSTEM AND METHOD
Document Type and Number:
WIPO Patent Application WO/2014/153429
Kind Code:
A1
Abstract:
Disclosed is a method for indoor navigation. Two or more cameras capture images of a first set of spots and images of a second set of spots, each set made by one or more laser beams on surfaces of an indoor space. The laser beams making the first and second sets of spots are emitted by a laser projector in first and second sets of four or more different directions during first and second time intervals, respectively. Three-dimensional locations of the spots are estimated from images captured by at least two cameras during the first and second time intervals. A position of the laser projector in the indoor space during the first and second time intervals is estimated by space resection given the first and second sets of four or more different directions and the three-dimensional locations of the spots.

Inventors:
JANKY JAMES M (US)
SHARP KEVIN A I (US)
MCCUSKER MICHAEL V (US)
ULMAN MORRISON (US)
Application Number:
PCT/US2014/031269
Publication Date:
September 25, 2014
Filing Date:
March 19, 2014
Assignee:
TRIMBLE NAVIGATION LTD (US)
International Classes:
G01C21/20; G01S5/16; G01S17/89
Domestic Patent References:
WO2012041687A12012-04-05
Foreign References:
US20080204699A12008-08-28
EP0221643A21987-05-13
DE69019767T21996-02-15
US20040001197A12004-01-01
Other References:
SEBASTIAN TILCH ET AL: "CLIPS proceedings", INDOOR POSITIONING AND INDOOR NAVIGATION (IPIN), 2011 INTERNATIONAL CONFERENCE ON, IEEE, 21 September 2011 (2011-09-21), pages 1 - 6, XP031990152, ISBN: 978-1-4577-1805-2, DOI: 10.1109/IPIN.2011.6071937
SEBASTIAN TILCH ET AL: "Current investigations at the ETH Zurich in optical indoor positioning", POSITIONING NAVIGATION AND COMMUNICATION (WPNC), 2010 7TH WORKSHOP ON, IEEE, PISCATAWAY, NJ, USA, 11 March 2010 (2010-03-11), pages 174 - 178, XP031814808, ISBN: 978-1-4244-7158-4
HARALICK ET AL.: "Review and Analysis of Solutions of the Three Point Perspective Pose Estimation Problem", INTERNATIONAL JOURNAL OF COMPUTER VISION, vol. 13, no. 3, 1994, pages 331 - 356, XP008049398, DOI: doi:10.1007/BF02028352
E. CHURCH: "Revised Geometry of the Aerial Photograph", 1945, SYRACUSE UNIVERSITY PRESS
E. CHURCH: "Theory of Photogrammetry", 1948, SYRACUSE UNIVERSITY PRESS
M. DHOME; M. RICHETIN; J. T LAPRESTE; G. RIVES: "The Inverse Perspective Problem from a Single View for Polyhedra Location", IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, 1988, pages 61 - 68
S. FINSTERWALDER; W SCHEUFELE: "Sebastian Finsterwalder zum 75. Geburtstage", 1937, VERLAG HERBERT WICHMANN, article "Das Rückwärtseinschneiden im Raum", pages: 86 - 100
M. A. FISCHLER; R. C. BOLLES: "Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography", GRAPHICS AND IMAGE PROCESSING, vol. 24, no. 6, 1981, pages 381 - 395, XP001149167, DOI: doi:10.1145/358669.358692
S. GANAPATHY: "Decomposition of Transformation Matrices for Robot Vision", PROCEEDING OF INTERNATIONAL CONF ON ROBOTICS AND AUTOMATION, 1984, pages 130 - 139
E. W. GRAFAREND; P LOHSE; B. SCHAFFRIN: "Dreidimensionaler Rückwärtsschnitt Teil I: Die projektiven Gleichungen", ZEITSCHRIFT FÜR VERMESSUNGSWESEN, GEODÄTISCHES INSTITUT, 1989, pages 1 - 37
J. A. GRUNERT: "Das Pothenotische Problem in erweiterter Gestalt nebst über seine Anwendungen in der Geodäsie", GRUNERTS ARCHIV FÜR MATHEMATIK UND PHYSIK, vol. 1, 1841, pages 238 - 248
R. M. HARALICK; H. JOO; C. N. LEE; X. ZHUANG; VG. VAIDYA; M. B. KIM: "Pose Estimation from Corresponding Point Data", IEEE TRANS. ON SYSTEMS, MAN, AND CYBERNETICS, vol. 19, 6 November 1989 (1989-11-06), XP055003136, DOI: doi:10.1109/21.44063
B. K. P. HORN: "Closed-Form Solution of absolute Orientation Using Orthonormal Matrices", J. OPT. SOC. AM. A, vol. 5, no. 7, 1988, pages 1127 - 1135
R. HORAUD: "New Methods for Matching 3-D Objects with Single Perspective Views", IEEE TRANSACTIONS ON THE PATTERN ANALYSIS AND MACHINE INTELLIGENCE, vol. PAMI-9, no. 3, May 1987 (1987-05-01)
S. LINNAINMAA; D. HARWOOD; L. S. DAVIS: "Pose Estimation of a Three-Dimensional Object Using Triangle Pairs", IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, vol. 10, no. 5, 1988, pages 634 - 647, XP011478180, DOI: doi:10.1109/34.6772
P. LOHSE: "Dreidimensionaler Rückwärtsschnitt Ein Algorithmus zur Streckenberechnung ohne Hauptachsentransformation", GEODÄTISCHES INSTITUT, 1989
D. G. LOWE: "Three-Dimensional Object Recognition from Single Two-Dimensional Images", ARTIFICIAL INTELLIGENCE, vol. 31, 1987, pages 355 - 395, XP008018761, DOI: doi:10.1016/0004-3702(87)90070-1
E. L. MERRITT: "Explicit Three-Point Resection in Space", PHOTOGRAMMETRIC ENGINEERING, vol. XV, no. 4, 1949, pages 649 - 655
E. L. MERRITT: "Analytical Photogrammetry", PITMAN PUBLISHING CORPORATION, article "General Explicit Equations for a Single Photograph", pages: 43 - 79
F. J. MÜLLER: "Direkte (exakte) Lösung des einfachen Rückwärtseinschneidens im Raume", ALLGEMEINE VERMESSUNGS-NACHRICHTEN, 1925
P. H. SCHONEMANN; R.M. CAROLL: "Fitting One Matrix to Another Under Choice of a Central Dilation and a Rigid Body Motion", PSYCHOMETRIKA, vol. 35, no. 2, June 1970 (1970-06-01), pages 245 - 255
P. H. SCHONEMANN: "A Generalized Solution of the Orthogonal Procrustes Problem", PSYCHOMETRIKA, vol. 31, no. 1, March 1966 (1966-03-01), pages 1 - 10, XP002317334
G. H. SCHUT: "On Exact Linear Equations for the Computation of the Rotational Elements of Absolute Orientation", PHOTOGRAMMETRIA, vol. 16, no. 1, 1960, pages 34 - 37
"American Society of Photogrammetry", 1980, FALLS CHURCH, article "Manual of Photogrammetry"
A.D.N. SMITH: "The Explicit Solution of Single Picture Resection Problem with a Least Squares Adjustment to Redundant Control", PHOTOGRAMMETRIC RECORD, vol. V, no. 26, October 1965 (1965-10-01), pages 113 - 122
E.H. THOMPSON: "Space Resection: Failure Cases", PHOTOGRAMMETRIC RECORD, vol. V, no. 27, April 1966 (1966-04-01), pages 201 - 204
R.Y. TSAI: "A Versatile Camera Calibration Technique for High-Accuracy 3D Machine Vision Metrology Using Off-the-Shelf TV Cameras and Lenses", IEEE JOURNAL OF ROBOTICS AND AUTOMATION, vol. RA-3, no. 4, 1987, pages 323 - 344, XP002633633, DOI: doi:10.1109/JRA.1987.1087109
J. VLACH; K. SINGHAL, COMPUTER METHODS FOR CIRCUIT ANALYSIS AND DESIGN, 1983
J.H. WILKINSON: "Rounding Errors in Algebraic Processes", H.M. STATIONERY OFFICE, 1963
P.R. WOLF: "Elements of Photogrammetry", 1974, MCGRAW HILL
W.J. WOLFE; D. MATHIS; C.W. SKLAIR; M. MAGEE: "The Perspective View of Three Points", IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, vol. 13, no. 1, 1991, pages 66 - 73, XP009084623, DOI: doi:10.1109/34.67632
Attorney, Agent or Firm:
GALLENSON, Mavis S. et al. (Suite 2100, Los Angeles, California, US)
Claims:
What is claimed is:

1. An indoor navigation system comprising:

a laser projector that emits one or more laser beams in four or more different directions during a time interval Δt;

two or more cameras that capture images of spots made by the laser beams on surfaces of an indoor space during Δt, the cameras calibrated such that three-dimensional locations of the spots can be estimated from images captured by at least two cameras; and,

a processor in communication with the cameras, the processor estimating position of the laser projector in the indoor space by space resection given the four or more different directions and the three-dimensional locations of the spots.

2. The system of Claim 1, the processor further in communication with the projector to determine orientation of the projector with respect to an object during Δt.

3. The system of Claim 2, the projector attached to the object via a gimbal mount.

4. The system of Claim 1, the projector comprising a rotating mirror; and the processor further in communication with the projector to determine an angle of the rotating mirror during Δt.

5. The system of Claim 1, the projector comprising a micro-electromechanical (MEMS) spatial light modulator; and the processor further in communication with the projector to determine unit vectors representing directions of laser beams emitted by the projector during Δt.

6. The system of Claim 1, each pair of directions of the four or more defining an angle different from that of any other pair.

7. The system of Claim 1, the four or more different directions coinciding at a point.

8. The system of Claim 1, the laser beams having an infrared wavelength and the cameras equipped with infrared filters to select the wavelength in preference to background light.

9. The system of Claim 1, each laser beam modulated with a modulation signal that distinguishes it from all of the other laser beams, and each camera demodulating the signal to identify a correspondence between spots and laser beam directions.

10. The system of Claim 9, the modulation signal being sinusoidal.

11. The system of Claim 9, the modulation signal being a pseudorandom code.

12. The system of Claim 9, the modulation signal further carrying information pertaining to the orientation of each laser beam with respect to an object.

13. A method for indoor navigation comprising:

capturing, with two or more cameras, images of spots made by one or more laser beams on surfaces of an indoor space, the laser beams emitted by a laser projector in four or more different directions during a time interval Δt;

estimating three-dimensional locations of the spots from images captured by at least two cameras during Δt; and,

estimating position of the laser projector in the indoor space by space resection given the four or more different directions and the three-dimensional locations of the spots.

14. The method of Claim 13 further comprising: estimating an orientation of an object during Δt, given a relative orientation of the object and the projector.

15. The method of Claim 14, the projector attached to the object via gimbal mount.

16. The method of Claim 13 wherein the laser projector comprises a rotating mirror; and further comprising: estimating an orientation of an object during Δt, given an angle of the rotating mirror with respect to the object during Δt.

17. The method of Claim 13 wherein the laser projector comprises a micro-electromechanical (MEMS) spatial light modulator; and further comprising: estimating an orientation of an object during Δt, given a set of unit vector coordinates describing directions of laser beams emitted by the projector with respect to the object during Δt.

18. The method of Claim 13, the cameras calibrated by:

capturing simultaneously with two or more cameras images of four or more spots on a planar surface;

determining homographies between pairs of cameras by identifying corresponding spots in the images captured;

determining from the homographies: relative poses between pairs of cameras, and orientation of the planar surface with respect to the cameras;

fitting the orientation of the planar surface to a model of the indoor space; and,

determining the location and pose of each camera in the indoor space.

19. The method of Claim 13, the space resection including:

creating a virtual point not coplanar with three actual spot locations used in resection;

using correspondences between locations of the three actual spots and the virtual point expressed with respect to the indoor space and the laser projector to establish a transformation matrix; and,

using the transformation matrix to test a fourth actual spot to resolve geometric ambiguities in resection.

20. The method of Claim 13, each laser beam modulated with a modulation signal that distinguishes it from all of the other laser beams, and each camera demodulating the signal to identify a correspondence between spots and laser beam directions.

21. The method of Claim 20, the modulation signal being sinusoidal.

22. The method of Claim 20, the modulation signal being a pseudorandom code.

23. The method of Claim 13, each pair of directions of the four or more defining an angle different from that of any other pair.

24. The method of Claim 13, the four or more different directions coinciding at a point.

25. The method of Claim 13, the laser beams having an infrared wavelength and the cameras equipped with infrared filters to select the wavelength in preference to background light.

26. A method for indoor navigation comprising:

capturing, with two or more cameras, images of a first set of spots made by one or more laser beams on surfaces of an indoor space, the laser beams emitted by a laser projector in a first set of four or more different directions during a time interval Δt₁ and images of a second set of spots made by one or more laser beams on surfaces of the indoor space, the laser beams emitted by the laser projector in a second set of four or more different directions during a time interval Δt₂;

estimating three-dimensional locations of the spots from images captured by at least two cameras during Δt₁ and Δt₂; and,

estimating a position of the laser projector in the indoor space during Δt₁ and Δt₂ by space resection given the first and second sets of four or more different directions and the three-dimensional locations of the spots.

27. The method of Claim 26 further comprising: estimating an orientation of an object during Δt₁ and Δt₂, given a relative orientation of the object and the projector.

28. The method of Claim 27, the projector attached to the object via gimbal mount.

29. The method of Claim 26 wherein the laser projector comprises a rotating mirror; and further comprising: estimating an orientation of an object during Δt₁ and Δt₂, given angles of the rotating mirror with respect to the object during Δt₁ and Δt₂.

30. The method of Claim 26 wherein the laser projector comprises a micro-electromechanical (MEMS) spatial light modulator; and further comprising: estimating an orientation of an object during Δt₁ and Δt₂, given a set of unit vector coordinates describing directions of laser beams emitted by the projector with respect to the object during Δt₁ and Δt₂.

31. The method of Claim 26, the cameras calibrated by:

capturing simultaneously with two or more cameras images of four or more spots on a planar surface;

determining homographies between pairs of cameras by identifying corresponding spots in the images captured;

determining from the homographies: relative poses between pairs of cameras, and orientation of the planar surface with respect to the cameras;

fitting the orientation of the planar surface to a model of the indoor space; and,

determining the location and pose of each camera in the indoor space.

32. The method of Claim 26, the space resection including:

creating a virtual point not coplanar with three actual spot locations used in resection;

using correspondences between locations of the three actual spots and the virtual point expressed with respect to the indoor space and the laser projector to establish a transformation matrix; and,

using the transformation matrix to test a fourth actual spot to resolve geometric ambiguities in resection.

33. The method of Claim 26, each laser beam modulated with a modulation signal that distinguishes it from all of the other laser beams, and each camera demodulating the signal to identify a correspondence between spots and laser beam directions.

34. The method of Claim 33, the modulation signal being sinusoidal.

35. The method of Claim 33, the modulation signal being a pseudorandom code.

36. The method of Claim 26, each pair of directions of each set of four or more defining an angle different from that of any other pair.

37. The method of Claim 26, each set of four or more different directions coinciding at a point.

38. The method of Claim 26, the laser beams having an infrared wavelength and the cameras equipped with infrared filters to select the wavelength in preference to background light.

Description:
INDOOR NAVIGATION SYSTEM AND METHOD

Cross-Reference to Related Applications

[01] This application is related to and claims priority to U.S. Non-Provisional Patent Application Serial No. 13/922,530 filed June 20, 2013, which is hereby incorporated by reference in its entirety.

[02] This application is also related to and claims priority to U.S. Non-Provisional Patent Application Serial No. 13/847,585 filed March 20, 2013, which is hereby incorporated by reference in its entirety.

[03] U.S. Non-Provisional Patent Application Serial No. 13/922,530 is a continuation-in-part of U.S. Non-Provisional Patent Application Serial No. 13/847,585.

Technical Field

[04] The disclosure is generally related to indoor navigation via multi-beam laser projection.

Background

[05] Conventional indoor navigation techniques include ultrasonic or laser ranging, tracking marked objects with cameras, and interpreting video scenes as captured by a camera. This last method, navigating as a person would by interpreting his visual surroundings, is an outstanding problem in computer vision research.

[06] A variety of challenges are associated with these and other indoor navigation techniques. Occlusion, for example, occurs when a camera or detector's view is blocked. Lack of sensitivity can be an issue when object-tracking cameras are located too close to one another, leading to small angle measurements. Some vision-based navigation systems depend on surface texture which may not always be available in an image. Finally, incremental positioning methods may accumulate errors which degrade positioning accuracy.

[07] Building construction is one scenario in which indoor navigation is a valuable capability. Robots that lay out construction plans or install fixtures need accurate position and orientation information to do their jobs. Assembly of large aircraft parts offers another example. Precisely mating airplane fuselage or wing sections is helped by keeping track of the position and orientation of each component. In scenarios like these, as a practical matter, it is helpful for indoor navigation solutions to be expressed in the same coordinates as locations of building structures such as walls, floors, ceilings, doorways and the like.

[08] Many vision-based indoor navigation systems cannot run in real time because the computational requirements are too great. Finally, a navigation system for a small robot is impractical if it consumes too much power or weighs or costs too much. What is needed is an indoor navigation system that permits accurate tracking of the location and orientation of objects in an indoor space while considering the challenges mentioned above and without requiring excessive computational capacity, electrical power or weight.

Brief Description of the Drawings

[09] Fig. 1 shows an indoor navigation system.

[10] Fig. 2 illustrates a multi-beam laser projector.

[11] Fig. 3 is a map of projected laser beams.

[12] Fig. 4 illustrates a camera placement example.

[13] Fig. 5 is a flow chart for a calibration procedure.

[14] Figs. 6A and 6B illustrate space resection geometry.

[15] Fig. 7 is a flow chart for an indoor navigation method.

[16] Fig. 8 illustrates a multi-beam laser projector that can rotate around roll, pitch and yaw axes.

[17] Fig. 9 illustrates a multi-beam laser projector mounted on a gimbal mount.

[18] Fig. 10 illustrates a multi-beam laser projector that may emit beams in almost any direction.

[19] Fig. 11 shows an indoor navigation system in which a multi-beam laser projector emits beams in sets.

[20] Fig. 12 illustrates orientation data communication between a multi-beam laser projector and a microprocessor.

[21] Fig. 13 illustrates a laser projector based on a laser beam deflected by a scan mirror.

[22] Fig. 14 illustrates a multi-beam laser projector based on a laser beam modulated by a spatial light modulator.

[23] Fig. 15 is a flow chart for an indoor navigation method.

Detailed Description

Introduction

[24] Part I below describes an indoor navigation system that uses space resection to find the position and orientation of a laser projector that creates landmarks on the walls of an indoor space.

[25] Part II below describes laser projection systems for use with the navigation system described in Part I.

Part I: Indoor navigation system

[26] The indoor navigation systems and methods described below involve solving a problem known in computer vision as "perspective pose estimation" and in photogrammetry as "space resection", namely: Determine the position of each of the vertices of a known triangle in three dimensional space given a perspective projection of the triangle. Haralick, et al. show how this problem was first solved by the German mathematician Grunert in 1841 and solved again by others later ("Review and Analysis of Solutions of the Three Point Perspective Pose Estimation Problem," International Journal of Computer Vision, 13, 3, 331 - 356 (1994), incorporated herein by reference, and which is hereto attached as Appendix A).

[27] Space resection has been used in the past to find the position and orientation of a camera based on the appearance of known landmarks in a camera image. Here, however, space resection is used to find the position and orientation of a laser projector that creates landmarks on the walls of an indoor space. In contrast to traditional space resection, in the present case angles to the landmarks are set by the laser projector rather than measured. When the projector is attached to an object, the position and orientation of the object may be estimated and tracked.

[28] Navigation based on this new technique is well suited to indoor spaces such as office buildings, aircraft hangars, underground railway stations, etc. Briefly, a laser projector is attached to a robot, machine tool or other item whose position and orientation are to be estimated in an indoor space. The projector emits laser beams in four or more different directions and these beams are seen as spots on the walls of the indoor space. ("Walls" is defined to include walls, ceiling, floor and other surfaces in an indoor space upon which laser spots may be formed.) Multiple fixed cameras view the spots and provide data used to estimate their positions in three dimensions. Finally, the space resection problem is solved to estimate the position and orientation of the laser projector given the location of the spots on the walls and the relative directions of the laser beams transmitted from the object.

[29] Indoor navigation based on multi-beam laser projection minimizes occlusion and sensitivity concerns through the use of a set of several laser beams spread out over a large solid angle. Multiple beams provide redundancy in cases such as a beam striking a wall or other surface at such an oblique angle that the center of the resulting spot is hard to determine, or half the beam landing on one surface and half landing on another. Having several beams pointed in various directions spread out over a half-sphere or greater solid angle, for example, largely eliminates sensitivity to unlucky geometries - small angles may be avoided. Each new measurement of laser projector position and orientation is directly referenced to building coordinates so measurement errors do not accumulate over time. Finally, the computational requirements are modest and computations may be performed in a fixed unit separate from a tracked object.

[30] The major components of a multi-beam laser projection indoor navigation system are: a laser projector, a set of observation cameras and a processor that solves space resection and other geometrical tasks. Fig. 1 shows such a system.

[31] In Fig. 1 a robot 105 is situated in a room 110 that includes floor 111, walls 112, 113 and 114, and ceiling 115 ("walls" or "surfaces"). A laser projector 120 mounted on the robot emits four laser beams 121 - 124 which form spots 1 - 4. Spots 1 and 4 are on wall 113 while spots 2 and 3 are on ceiling 115. Although not illustrated here, spots may also fall on other surfaces and/or objects in the room. Cameras 130 and 131 are fixed in position such that they each can view spots 1 - 4 as suggested by the dash-dot and dashed lines. Opaque obstacle 135 blocks a direct line of sight from camera 130 to projector 120, but does not affect the operation of the navigation system. Microprocessor 140 (which includes associated memory and input/output devices) is in communication with cameras 130 and 131 via a wired or wireless data connection. Robot 105 is, of course, just an example of an object that may be tracked by the navigation system. Any object to which a laser projector can be attached would suffice.

[32] When properly calibrated, cameras 130 and 131 may be used to estimate the three dimensional position of any point that both can see. For example, if both cameras can see spots 1, 2, 3 and 4, then the three dimensional coordinates of each spot can be estimated in a coordinate system used to locate the walls and other features of the room. Meanwhile, the laser projector emits laser beams 121 - 124 at known azimuths and elevations as measured with respect to the robot. The angle between each pair of laser beams is therefore also known. As discussed in detail below, this provides enough information to estimate the position and orientation of the laser projector, and the object to which it is attached, in room coordinates.

[33] Fig. 2 illustrates a multi-beam laser projector 205 in greater detail. In the example provided by Fig. 2, projector 205 emits five laser beams A, B, C, D and E. Four is a practical minimum number of beams and more is helpful. Despite the two-dimensional appearance of the figure, the beams do not all lie in one plane. In an embodiment no more than two beams lie in any particular plane and the angle between any pair of laser beams is different from that between any other pair. In an embodiment the back projections of each beam intersect at a common point, P. Said another way, directions of the laser beams coincide at point P. Point P need not lie inside the projector although it is illustrated that way in Fig. 2. The location and orientation at P may then be estimated by the navigation system.
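As an illustration of why known azimuths and elevations imply known inter-beam angles, the following minimal sketch converts each beam direction to a unit vector, computes the angle between every pair, and checks that all pairwise angles differ by more than a tolerance. The azimuth/elevation convention, the function names, the tolerance and the five-beam layout are assumptions made for this example; they are not taken from the disclosure.

```python
import math
from itertools import combinations


def beam_unit_vector(azimuth_deg, elevation_deg):
    """Unit vector for a beam direction.

    Assumed convention: azimuth is measured in the x-y plane from the
    x-axis; elevation is measured up from the x-y plane.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    return (math.cos(el) * math.cos(az),
            math.cos(el) * math.sin(az),
            math.sin(el))


def angle_between_deg(u, v):
    """Angle in degrees between two unit vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    dot = max(-1.0, min(1.0, dot))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(dot))


def pairwise_angles(beams):
    """Map each pair of beam names to the angle between those beams."""
    vectors = {name: beam_unit_vector(az, el) for name, (az, el) in beams.items()}
    return {(a, b): angle_between_deg(vectors[a], vectors[b])
            for a, b in combinations(sorted(vectors), 2)}


def angles_are_distinct(angles, tolerance_deg=1.0):
    """True if every pairwise angle differs from every other by more than the tolerance."""
    values = sorted(angles.values())
    return all(hi - lo > tolerance_deg for lo, hi in zip(values, values[1:]))


# Hypothetical five-beam layout (azimuth, elevation) in degrees.
BEAMS = {"A": (0, 80), "B": (70, 55), "C": (160, 40), "D": (230, 65), "E": (300, 25)}

if __name__ == "__main__":
    for pair, angle in pairwise_angles(BEAMS).items():
        print(pair, round(angle, 2))
```

A symmetric layout, such as four beams at equal elevation spaced 90 degrees apart in azimuth, fails the distinctness check because adjacent pairs share the same angle.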

[34] Fig. 2 shows five lasers 210 - 214 provided to emit the five laser beams. However, fewer lasers may be used with beam splitters or diffractive elements to create multiple beams from one laser source. To avoid interference from room lighting the lasers may operate in the near infrared and the cameras may be equipped with near infrared bandpass optical filters. In an embodiment each laser beam is modulated or encoded so that it may be distinguished from all the others and identified by cameras equipped with appropriate demodulators or decoders. Commonly available diode lasers may be directly modulated with analog or digital signals up to tens of megahertz, suggesting, for example, sine wave modulation of each beam at a unique frequency. Alternatively, beams may be modulated with orthogonal digital codes in analogy to code-division multiple access radio systems. If one laser is split into several beams, then each beam may be provided with its own modulator.
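One way to picture identification of a beam by sine wave modulation at a unique frequency is sketched below: a sampled intensity record for one spot is correlated against each candidate frequency and the strongest match is taken as the beam identity. This is only an illustrative sketch. The sample rate, the frequencies and the function names are hypothetical, and the disclosure does not specify how the demodulators are implemented.

```python
import math


def tone_power(samples, sample_rate_hz, freq_hz):
    """Power of the samples at freq_hz, found by correlating with cosine and sine."""
    n = len(samples)
    mean = sum(samples) / n  # remove the constant (unmodulated) part of the intensity
    i_sum = 0.0
    q_sum = 0.0
    for k, s in enumerate(samples):
        phase = 2.0 * math.pi * freq_hz * k / sample_rate_hz
        i_sum += (s - mean) * math.cos(phase)
        q_sum += (s - mean) * math.sin(phase)
    return (i_sum * i_sum + q_sum * q_sum) / (n * n)


def identify_beam(samples, sample_rate_hz, beam_freqs_hz):
    """Return the name of the beam whose modulation frequency best matches the samples."""
    return max(beam_freqs_hz,
               key=lambda name: tone_power(samples, sample_rate_hz, beam_freqs_hz[name]))


if __name__ == "__main__":
    # Hypothetical values: 1 kHz sampling of spot intensity, three beam frequencies.
    rate = 1000.0
    freqs = {"A": 50.0, "B": 80.0, "C": 110.0}
    spot = [1.0 + 0.5 * math.sin(2.0 * math.pi * 80.0 * k / rate + 0.3) for k in range(200)]
    print(identify_beam(spot, rate, freqs))
```

The record length in the usage example is chosen so that each candidate frequency completes a whole number of cycles, which keeps the candidate tones orthogonal over the record.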

[35] Fig. 3 is a map of projected laser beams that helps visualize the situation of Fig. 2. Beams such as 305, 310, 315, 320 are represented in Fig. 3 as points where the beams would intersect a half-sphere placed over a projector such as 205. In the figure, "0", "30", and "60" represent degrees of elevation up from the horizon. Thus, for example, beams 310 and 315 are located between 30 and 60 degrees elevation. A set of laser beams spread out over the half sphere provides robustness against geometric ambiguities. Depending on the application, more beams may be placed at high elevations to avoid conflicts with potential obstructions such as object 135 in Fig. 1. A projector that emits beams below the horizon may also be used when the projector is far from the nearest room surface, for example on top of a tall robot. And, of course, the projector may be oriented sideways or even upside down for objects that move on walls or ceilings.

[36] Cameras, such as cameras 130 and 131 in Fig. 1, are used to estimate the position of laser spots on walls in three dimensions. This can be accomplished as long as a spot is viewable simultaneously by at least two cameras. When more cameras are available, the location of a spot may be estimated more robustly. Fig. 4 illustrates a camera placement example. In the example of Fig. 4, an L-shaped room is monitored by three cameras, C₁, C₂ and C₃. The part of the room labeled "1, 2, 3" is viewable by all three cameras. The part labeled "1, 2" is viewable by cameras C₁ and C₂ only, while the part labeled "2, 3" is viewable by cameras C₂ and C₃ only. All parts of the room are viewable by at least two cameras.
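The two-camera requirement can be illustrated with a simple midpoint triangulation. The sketch below assumes that calibration has already converted each camera's image observation of a spot into a ray (camera position plus direction) in room coordinates; it then returns the point midway between the closest points of the two rays. Midpoint triangulation is chosen here for brevity and is not stated in the disclosure to be the method used; all names are hypothetical.

```python
def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def triangulate_midpoint(origin1, direction1, origin2, direction2):
    """Estimate a 3D point from two viewing rays.

    Each ray is origin + t * direction, expressed in room coordinates.
    Returns the midpoint of the shortest segment joining the two rays.
    """
    w = tuple(a - b for a, b in zip(origin1, origin2))
    a = _dot(direction1, direction1)
    b = _dot(direction1, direction2)
    c = _dot(direction2, direction2)
    d = _dot(direction1, w)
    e = _dot(direction2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; the spot cannot be triangulated")
    t1 = (b * e - c * d) / denom  # parameter of the closest point on ray 1
    t2 = (a * e - b * d) / denom  # parameter of the closest point on ray 2
    p1 = tuple(o + t1 * k for o, k in zip(origin1, direction1))
    p2 = tuple(o + t2 * k for o, k in zip(origin2, direction2))
    return tuple((u + v) / 2.0 for u, v in zip(p1, p2))


if __name__ == "__main__":
    # Hypothetical cameras at (0, 0, 0) and (4, 0, 0) both viewing a spot at (2, 3, 1).
    print(triangulate_midpoint((0, 0, 0), (2, 3, 1), (4, 0, 0), (-2, 3, 1)))
```

With noisy observations the two rays do not intersect exactly, and the midpoint gives a reasonable estimate; with more than two cameras a least-squares solution over all rays would typically be preferred.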

[37] If only one camera is available, but it is aimed at a scene with known geometry (e.g. a flat wall at a known location), then that is enough to locate laser spots. This situation may be hard to guarantee in practice, however. Using two or more cameras eliminates issues that arise when spots fall on surfaces at unknown locations. As described below, one known surface may be used during system calibration.

[38] If the laser beams used in an indoor navigation system are near infrared, then corresponding filters may be used with the cameras to remove background room light. Similarly, if the laser beams are modulated or encoded, then the cameras may be equipped with corresponding demodulators or decoders. Finally, as used here, a "camera" includes processors or other components to demodulate or decode laser spots and report their two dimensional position in an image. Cameras may thus report a set of time-stamped two-dimensional spot coordinates to a central computer (e.g. 140 in Fig. 1) for processing. This data stream has low enough bandwidth requirements that a wireless link between cameras and the central computer may be used.

[39] Calibration is done to estimate the pose of each camera in room coordinates before navigation commences. Fig. 5 is a flow chart for one calibration procedure; others are possible. In general, any method that determines the pose of each camera in the coordinate system of the indoor space involved is sufficient. The procedure outlined here uses the same equipment that is later used for navigation.

[40] The first steps 505 and 510 in the calibration procedure of Fig. 5 are to project four or more spots of light onto a planar surface such as a flat wall or ceiling and capture images of all of the spots with two or more cameras. The next step 515 is to determine or identify a homography between each pair of cameras by identifying corresponding spots in the images. The homography is then used to determine the relative poses of the cameras and the orientation of the planar surface with respect to the cameras; steps 520 and 525. Next, the planar surface is fit to a model of the indoor space of which it is a part; step 530. For example, a computer aided design model of a building may be available showing the location of a wall. Finally, the location and pose of each camera is determined in the coordinate system of the building or indoor space in step 535. Fitting a plane surface to a building design model removes an overall scale ambiguity that may not be resolvable from the homography alone. Repeating this procedure on two or more planar surfaces may also be helpful to resolve other geometric ambiguities that sometimes exist.

[41] An example of indoor navigation using multi-beam laser projection is now presented using Figs. 6A and 6B to illustrate space resection geometry. An object such as robot 105 in Fig. 1 is equipped with a laser projector as described above. The coordinate system of the object, also called the reference coordinate system, R, may be defined by two of the laser beams emitted by the projector. For example the origin of the reference coordinates may be at the intersection of the laser beams, point P in Figs. 1, 2, 6A and 6B. The z-axis may then be defined to coincide with one of the beams and the y-axis may be the cross product of rays coincident with an ordered pair of the first beam and another beam. Coordinates used to describe a room or other indoor space are known as world coordinates, W.
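The construction of the reference coordinate system from two beams can be sketched as follows. The z-axis is taken along the first beam and the y-axis along the cross product of the first beam with a second beam, as described above. The text does not say how the x-axis is chosen; this sketch assumes it completes a right-handed frame (x = y × z). Function names are hypothetical.

```python
import math


def normalize(v):
    """Scale a 3-vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    if length == 0.0:
        raise ValueError("zero-length vector")
    return tuple(c / length for c in v)


def cross(u, v):
    """Cross product u x v of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def reference_axes(first_beam, other_beam):
    """Axes of the reference frame R defined by an ordered pair of beam directions.

    z lies along the first beam; y is the normalized cross product of the
    first beam with the other beam; x = y x z is an assumption that makes
    the frame right-handed.
    """
    z = normalize(first_beam)
    y = normalize(cross(first_beam, other_beam))  # fails if the beams are parallel
    x = cross(y, z)
    return x, y, z


if __name__ == "__main__":
    print(reference_axes((0.0, 0.0, 2.0), (1.0, 0.0, 1.0)))
```

Because the y-axis depends on the order of the two beams, swapping them reverses its direction, which is why the pair is described as ordered.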

[42] After some introductory comments on notation, the example proceeds as follows. A set of reference unit vectors corresponding to the directions of laser beams projected from a laser projector are defined. Next, distances are defined from the projector to laser spots that appear on walls, ceilings or other surfaces. These distances are scalar numbers that multiply the reference unit vectors. The unit vectors and the distance scalars therefore define the position of observed laser spots in the reference (i.e. laser projector) coordinate system.

[43] The next step in the example is to assume a transformation matrix that defines the relationship between reference and world coordinate systems. This matrix is used to find the position of observed laser spots in the world (i.e. room) coordinate system. The task of the navigation system is to find the transformation matrix given the reference unit vectors (from the design of the laser projector) and the laser spot locations in world coordinates (as observed by a set of calibrated cameras).

[44] The mathematics of space resection has been worked out several times by various researchers independently over the last 170 years. Here we follow Haralick et al., "Review and Analysis of Solutions of the Three Point Perspective Pose Estimation Problem," International Journal of Computer Vision, 13, 3, 331 - 356 (1994); see, especially, p. 332 - 334. Other valid solutions to the space resection problem work just as well. It turns out that space resection based on three observed points often leads to more than one solution. Two solutions are common, but as many as four are possible. Thus, the next part of the example shows a way to determine which solution is correct. Finally, as an optional step, the four by four transformation matrix between reference and world coordinate systems expressed in homogeneous coordinates is decomposed into Euler angles and a translation vector.

[45] Fig. 6A is similar to Haralick Fig. 1. Point P is called the center of perspectivity by Haralick; here it is the location of the laser projector. Points $P_1$, $P_2$ and $P_3$ represent the locations of observed laser spots. Haralick uses $s_1$, $s_2$ and $s_3$ to represent the distances to these points. In the example below the distances are first defined as $m_1$, $m_2$ and $m_3$; these distances are then calculated following Haralick. The symbols $a$, $b$, $c$, $\alpha$, $\beta$, $\gamma$ in Fig. 6A correspond directly to the same symbols in Haralick. Fig. 6B illustrates unit vectors $p_1$, $p_2$ and $p_3$ that indicate the directions of laser beams emitted from a laser projector and are used to define the reference coordinate system. Angles $\theta_{ij}$ between unit vectors are discussed below.

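[46] Two functions are used to transform homogeneous coordinates to non-homogeneous coordinates and vice versa. $\mathcal{H}(\cdot)$ transforms non-homogeneous coordinates to homogeneous coordinates while $\mathcal{H}^{-1}(\cdot)$ transforms homogeneous coordinates to non-homogeneous coordinates. Both functions operate on column vectors such that if $v = [v_1 \;\; v_2 \;\; \dots \;\; v_n]^T$ then:

$$\mathcal{H}(v) = [v_1 \;\; v_2 \;\; \dots \;\; v_n \;\; 1]^T, \qquad \mathcal{H}^{-1}(v) = [v_1/v_n \;\; v_2/v_n \;\; \dots \;\; v_{n-1}/v_n]^T \qquad (1)$$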

[47] The pose in world coordinates of the object to be tracked can be defined as the coordinate transform between reference and world coordinate systems. The transform can be carried out by left-multiplying a 4 by 1 vector, describing a homogeneous three-dimensional coordinate, by a 4 by 4 matrix $X_{R \to W}$ to give another homogeneous, three-dimensional coordinate.
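The two coordinate conversions and the 4 by 4 transform can be sketched in a few lines. The function names and the pure-translation matrix `X_shift` below are illustrative assumptions, not part of the disclosure.

```python
def to_homogeneous(v):
    """Append a 1 to a non-homogeneous coordinate vector."""
    return list(v) + [1.0]

def from_homogeneous(v):
    """Divide by the last element and drop it."""
    return [c / v[-1] for c in v[:-1]]

def transform(X, point_h):
    """Left-multiply a homogeneous 4-vector by a 4 by 4 matrix (list of rows)."""
    return [sum(X[r][c] * point_h[c] for c in range(4)) for r in range(4)]

# Hypothetical transform: a pure translation by (2, -1, 0.5).
X_shift = [[1, 0, 0, 2],
           [0, 1, 0, -1],
           [0, 0, 1, 0.5],
           [0, 0, 0, 1]]
p_world = from_homogeneous(transform(X_shift, to_homogeneous([1.0, 2.0, 3.0])))
```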

[48] Let $p_1, p_2, p_3, p_4$ denote non-homogeneous coordinates on a unit sphere for reference rays in the reference coordinate system. (See, e.g., $p_1$, $p_2$ and $p_3$ in Fig. 6B.) These rays are coincident with laser beams projected by the laser projector. Let $P_1^R, P_2^R, P_3^R, P_4^R$ denote 3D homogeneous coordinates of detected spots (i.e. points where laser beams are incident on walls or other surfaces) along the rays in the reference coordinate system. Then:

$$P_1^R = \mathcal{H}(m_1 p_1), \quad P_2^R = \mathcal{H}(m_2 p_2), \quad P_3^R = \mathcal{H}(m_3 p_3), \quad P_4^R = \mathcal{H}(m_4 p_4) \qquad (2)$$

where $m_1, m_2, m_3, m_4$ are positive scalars that describe how far along each ray light is intercepted by a surface to create a detected spot. The homogeneous coordinates of the 3D detected spots in the world coordinate system are denoted by $P_1^W, P_2^W, P_3^W, P_4^W$ where:

$$P_i^W = X_{R \to W} \, P_i^R \qquad (3)$$

[49] The following reference unit vectors are defined for purposes of example:

$$\begin{aligned}
p_1 &= [-0.71037 \;\; -0.2867 \;\; 0.64279]^T \\
p_2 &= [0.71037 \;\; 0.2867 \;\; 0.64279]^T \\
p_3 &= [-0.88881 \;\; 0.45828 \;\; 0]^T \\
p_4 &= [0.56901 \;\; -0.37675 \;\; 0.73095]^T
\end{aligned} \qquad (4)$$

[50] The angle $\theta_{ij}$ between $p_i$ and $p_j$ is given by $\theta_{ij} = \cos^{-1}(p_i^T p_j)$; therefore,

$$\begin{aligned}
\theta_{12} &= 100^\circ, & \theta_{13} &= 60^\circ, & \theta_{14} &= 80^\circ \\
\theta_{23} &= 120^\circ, & \theta_{24} &= 40^\circ, & \theta_{34} &= 132.7^\circ
\end{aligned} \qquad (5)$$
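The pairwise beam angles can be checked numerically from the reference unit vectors. The sketch below uses the four vectors of the example; the helper name `angle_deg` and the 0.1 degree rounding used for the distinctness check are assumptions made for illustration.

```python
import math
from itertools import combinations

# Reference unit vectors from the worked example, keyed by beam index.
beams = {
    1: [-0.71037, -0.2867, 0.64279],
    2: [0.71037, 0.2867, 0.64279],
    3: [-0.88881, 0.45828, 0.0],
    4: [0.56901, -0.37675, 0.73095],
}

def angle_deg(u, v):
    """Angle between two unit vectors in degrees."""
    dot = sum(a * b for a, b in zip(u, v))
    dot = max(-1.0, min(1.0, dot))  # guard against rounding just outside [-1, 1]
    return math.degrees(math.acos(dot))

angles = {(i, j): angle_deg(beams[i], beams[j])
          for i, j in combinations(sorted(beams), 2)}

# True when no two beam pairs share the same angle (to 0.1 degree).
all_distinct = len({round(a, 1) for a in angles.values()}) == len(angles)
```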

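The set of reference vectors $p_i$ has been chosen in this example such that the angle between each pair of vectors is different. This property helps avoid ambiguities in pose estimation but is not required. For purposes of illustration, the $m_i$ are chosen as follows: $m_1 = 1$, $m_2 = 4$, $m_3 = 7$ and $m_4 = 10$. Then, using equation (2), we have:

$$\begin{aligned}
P_1^R &= [-0.71037 \;\; -0.2867 \;\; 0.64279 \;\; 1]^T \\
P_2^R &= [2.8415 \;\; 1.1468 \;\; 2.5712 \;\; 1]^T \\
P_3^R &= [-6.2217 \;\; 3.2079 \;\; 0 \;\; 1]^T \\
P_4^R &= [5.6901 \;\; -3.7675 \;\; 7.3095 \;\; 1]^T
\end{aligned} \qquad (6)$$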

[51] Let us assume the following transformation matrix:

$$X_{R \to W} = \begin{bmatrix}
0.917 & -0.38924 & 0.087156 & 7 \\
0.39439 & 0.91746 & -0.052137 & 11 \\
-0.059668 & 0.082183 & 0.99483 & 0.1 \\
0 & 0 & 0 & 1
\end{bmatrix} \qquad (7)$$

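[52] Then, using equation (3),

$$\begin{aligned}
P_1^W &= [6.5162 \;\; 10.423 \;\; 0.75829 \;\; 1]^T \\
P_2^W &= [9.3834 \;\; 13.039 \;\; 2.5826 \;\; 1]^T \\
P_3^W &= [0.046048 \;\; 11.489 \;\; 0.73487 \;\; 1]^T \\
P_4^W &= [14.321 \;\; 9.4065 \;\; 6.7226 \;\; 1]^T
\end{aligned} \qquad (8)$$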
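The step from beam directions and distances to world-coordinate spot positions can be reproduced with a short sketch. The numbers are those of the example; the function name `spot_in_world` is an assumption for illustration.

```python
# Reference unit vectors, distances and transform from the worked example.
beams = [
    [-0.71037, -0.2867, 0.64279],
    [0.71037, 0.2867, 0.64279],
    [-0.88881, 0.45828, 0.0],
    [0.56901, -0.37675, 0.73095],
]
distances = [1.0, 4.0, 7.0, 10.0]
X_ref_to_world = [
    [0.917, -0.38924, 0.087156, 7.0],
    [0.39439, 0.91746, -0.052137, 11.0],
    [-0.059668, 0.082183, 0.99483, 0.1],
    [0.0, 0.0, 0.0, 1.0],
]

def spot_in_world(beam, distance, X):
    """Scale the unit vector by its distance, then map it into world coordinates."""
    p_ref_h = [distance * c for c in beam] + [1.0]
    p_world_h = [sum(X[r][c] * p_ref_h[c] for c in range(4)) for r in range(4)]
    return [c / p_world_h[3] for c in p_world_h[:3]]

spots_world = [spot_in_world(b, m, X_ref_to_world)
               for b, m in zip(beams, distances)]
```

In a working system the world positions would come from the calibrated cameras rather than from a known transform; the forward calculation is useful mainly for building test cases such as this one.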

[53] We now have the required inputs, $p_1, p_2, p_3, P_1^W, P_2^W, P_3^W$, for a space resection algorithm such as the one described in Haralick. The algorithm determines $X_{R \to W}$ up to a possible four-fold ambiguity. To resolve the ambiguity each real solution may be checked to see whether or not it projects $P_4^W$ to $p_4$. The space resection method detailed in Haralick first determines distances from the origin to each of the reference points. These distances are called $s_1, s_2, s_3$ by Haralick and if correctly calculated should be equal to $m_1, m_2, m_3$ respectively. Given these distances we can then calculate $P_1^R, P_2^R, P_3^R$.

[54] Given the coordinates of three 3D points expressed in both the reference coordinate system and the world coordinate system one can find the transformation matrix between the two coordinate systems. This may be done by Procrustes analysis; see, for example, Peter H. Schoenemann, "A Generalized Solution of the Orthogonal Procrustes Problem", Psychometrika, 31, 1, 1 - 10 (1966). A simpler method is presented below, however.

[55] If $a$, $b$, $c$, $\alpha$, $\beta$, $\gamma$ take the meanings described in Haralick then they can be calculated as:

$$a = \lVert P_2^W - P_3^W \rVert, \qquad b = \lVert P_1^W - P_3^W \rVert, \qquad c = \lVert P_1^W - P_2^W \rVert \qquad (9)$$

$$\cos\alpha = p_2^T p_3 = -0.5000, \qquad \cos\beta = p_1^T p_3 = 0.5000, \qquad \cos\gamma = p_1^T p_2 = -0.1736 \qquad (10)$$

[56] Inserting these values into Haralick's equation (9) gives:

$$A_4 = 0.1128, \quad A_3 = -1.5711, \quad A_2 = 6.5645, \quad A_1 = -8.6784, \quad A_0 = 7.2201 \qquad (11)$$

[57] The quartic function in $v$ with these coefficients has the following roots:

$$v = 7.0000 \;\text{ or }\; v = 5.4660 \;\text{ or }\; v = 0.7331 - 1.066i \;\text{ or }\; v = 0.7331 + 1.066i \qquad (12)$$

[58] The complex roots may be ignored and the real roots substituted into Haralick's equation (8) to give corresponding values for $u$:

$$u = 4.0000,\; v = 7.0000 \quad\text{or}\quad u = 2.9724,\; v = 5.4660 \qquad (13)$$

[59] Substituting $u$ and $v$ into Haralick's equations (4) and (5) leads to:

$$s_1 = 1.0000,\; s_2 = 4.0000,\; s_3 = 7.0000 \quad\text{or}\quad s_1 = 1.3008,\; s_2 = 3.8666,\; s_3 = 7.1104 \qquad (14)$$
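The chain from triangle geometry to candidate distances can be sketched numerically. The coefficient expressions below are a transcription of Grunert's quartic as presented by Haralick and should be checked against that paper before being relied upon; the function name `grunert`, the shorthand `k` and `q`, and the use of NumPy's `roots` as the root finder are assumptions of this sketch. The squared side lengths are obtained here from the known distances and beam angles by the law of cosines, which is equivalent to measuring them between world points.

```python
import numpy as np

# Cosines of the angles opposite sides a, b, c (beam pairs 2-3, 1-3, 1-2).
cos_a, cos_b, cos_g = np.cos(np.radians([120.0, 60.0, 100.0]))
# Squared side lengths of the spot triangle for m = 1, 4, 7 (law of cosines).
a2 = 4**2 + 7**2 - 2 * 4 * 7 * cos_a   # |P2 - P3|^2
b2 = 1**2 + 7**2 - 2 * 1 * 7 * cos_b   # |P1 - P3|^2
c2 = 1**2 + 4**2 - 2 * 1 * 4 * cos_g   # |P1 - P2|^2

def grunert(a2, b2, c2, cos_a, cos_b, cos_g):
    """Return quartic coefficients [A4..A0] and real solutions (s1, s2, s3)."""
    k = (a2 - c2) / b2
    q = (a2 + c2) / b2
    A4 = (k - 1)**2 - 4 * c2 / b2 * cos_a**2
    A3 = 4 * (k * (1 - k) * cos_b - (1 - q) * cos_a * cos_g
              + 2 * c2 / b2 * cos_a**2 * cos_b)
    A2 = 2 * (k**2 - 1 + 2 * k**2 * cos_b**2 + 2 * (b2 - c2) / b2 * cos_a**2
              - 4 * q * cos_a * cos_b * cos_g + 2 * (b2 - a2) / b2 * cos_g**2)
    A1 = 4 * (-k * (1 + k) * cos_b + 2 * a2 / b2 * cos_g**2 * cos_b
              - (1 - q) * cos_a * cos_g)
    A0 = (1 + k)**2 - 4 * a2 / b2 * cos_g**2
    coeffs = [A4, A3, A2, A1, A0]
    solutions = []
    for v in np.roots(coeffs):
        if abs(v.imag) > 1e-9 or v.real <= 0:
            continue  # ignore complex and non-physical roots
        v = v.real
        u = ((k - 1) * v**2 - 2 * k * cos_b * v + 1 + k) / (2 * (cos_g - v * cos_a))
        if u <= 0:
            continue
        s1 = np.sqrt(b2 / (1 + v**2 - 2 * v * cos_b))
        solutions.append((s1, u * s1, v * s1))  # s2 = u*s1, s3 = v*s1
    return coeffs, sorted(solutions)

coeffs, solutions = grunert(a2, b2, c2, cos_a, cos_b, cos_g)
```

For the example data this yields two real candidates, which is why a further check against a fourth beam is needed.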

[60] One can see that the first solution corresponds to the values picked for $m_1, m_2, m_3$ above. Of course, at this point we know this only because we know how the problem was constructed. We will now recover the transformation, $X_{R \to W}$, for each solution and then determine which solution is correct.

[61] It is noted in Haralick and elsewhere that the transformation has 12 parameters but the point correspondences only give 9 equations. The conventional solution to this problem is to enforce orthogonality constraints in the rotation part of the transform to reduce the number of parameters. However, there is an easier and somewhat surprising method: we can manufacture a virtual fourth point whose location is linearly independent from those of the first three points. This virtual point is consistent with a rigid transformation, so its coordinates and those of the three actual points, as expressed in the reference and world coordinate systems, give the transformation directly.

[62] The fourth point is found by considering the vectors from one actual point to each of the other two. If we take a point that is separated from the first point by the cross product of the two vectors then we can be sure that it is not coplanar with the three actual points and is therefore linearly independent. Since in a Euclidean transform the vectors are simply rotated, so is their cross product. Hence we have a fourth point correspondence which is linearly independent but enforces the orthogonality constraint.

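[63] We call this point $P_5^R$ in the reference coordinate system and $P_5^W$ in the world coordinate system. Formally it may be defined as:

$$\begin{aligned}
P_5^R &= \mathcal{H}\Big(\mathcal{H}^{-1}(P_1^R) + \big(\mathcal{H}^{-1}(P_2^R) - \mathcal{H}^{-1}(P_1^R)\big) \times \big(\mathcal{H}^{-1}(P_3^R) - \mathcal{H}^{-1}(P_1^R)\big)\Big) \\
P_5^W &= \mathcal{H}\Big(\mathcal{H}^{-1}(P_1^W) + \big(\mathcal{H}^{-1}(P_2^W) - \mathcal{H}^{-1}(P_1^W)\big) \times \big(\mathcal{H}^{-1}(P_3^W) - \mathcal{H}^{-1}(P_1^W)\big)\Big)
\end{aligned} \qquad (15)$$

[64] We first consider the solution where $s_1 = 1.0000$, $s_2 = 4.0000$, $s_3 = 7.0000$. Calculated values are indicated using a 'hat'. For example:

$$\begin{aligned}
\hat{P}_1^R &= \mathcal{H}(s_1 p_1) = [-0.7104 \;\; -0.2867 \;\; 0.6428 \;\; 1]^T \\
\hat{P}_2^R &= \mathcal{H}(s_2 p_2) = [2.8415 \;\; 1.1468 \;\; 2.5712 \;\; 1]^T \\
\hat{P}_3^R &= \mathcal{H}(s_3 p_3) = [-6.2217 \;\; 3.2079 \;\; 0 \;\; 1]^T
\end{aligned} \qquad (16)$$

[65] Using equation (15) we find:

$$\begin{aligned}
\hat{P}_5^R &= [-8.3707 \;\; -8.6314 \;\; 20.9556 \;\; 1]^T \\
P_5^W &= [4.5101 \;\; -1.3129 \;\; 20.7373 \;\; 1]^T
\end{aligned} \qquad (17)$$

[66] Stacking 3D point correspondences gives:

$$\begin{aligned}
\begin{bmatrix} P_1^W & P_2^W & P_3^W & P_5^W \end{bmatrix} &= X_{R \to W} \begin{bmatrix} \hat{P}_1^R & \hat{P}_2^R & \hat{P}_3^R & \hat{P}_5^R \end{bmatrix} \\
X_{R \to W} &= \begin{bmatrix} P_1^W & P_2^W & P_3^W & P_5^W \end{bmatrix} \begin{bmatrix} \hat{P}_1^R & \hat{P}_2^R & \hat{P}_3^R & \hat{P}_5^R \end{bmatrix}^{-1} \\
X_{R \to W} &= \begin{bmatrix}
0.9170 & -0.3892 & 0.0872 & 7.0000 \\
0.3944 & 0.9175 & -0.0521 & 11.0000 \\
-0.0597 & 0.0822 & 0.9948 & 0.1000 \\
0 & 0 & 0 & 1
\end{bmatrix}
\end{aligned} \qquad (18)$$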
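The virtual-point construction and the stacking step can be sketched with NumPy. The function names are assumptions; the data are the first candidate solution of the example, with world points generated from the assumed transform of equation (7) so that the recovered matrix can be compared against it.

```python
import numpy as np

beams = np.array([[-0.71037, -0.2867, 0.64279],
                  [0.71037, 0.2867, 0.64279],
                  [-0.88881, 0.45828, 0.0]])
s = np.array([1.0, 4.0, 7.0])  # first candidate distances
X_true = np.array([[0.917, -0.38924, 0.087156, 7.0],
                   [0.39439, 0.91746, -0.052137, 11.0],
                   [-0.059668, 0.082183, 0.99483, 0.1],
                   [0.0, 0.0, 0.0, 1.0]])

def virtual_point(P1, P2, P3):
    """Point offset from P1 by the cross product of the two edge vectors."""
    return P1 + np.cross(P2 - P1, P3 - P1)

def transform_from_three_points(ref_pts, world_pts):
    """Stack three points plus the virtual point as columns and solve for X."""
    ref = np.column_stack([*ref_pts, virtual_point(*ref_pts)])
    world = np.column_stack([*world_pts, virtual_point(*world_pts)])
    ref_h = np.vstack([ref, np.ones(4)])      # homogeneous 4 by 4
    world_h = np.vstack([world, np.ones(4)])
    return world_h @ np.linalg.inv(ref_h)

ref_pts = s[:, None] * beams                                # rows are spots
world_pts = ref_pts @ X_true[:3, :3].T + X_true[:3, 3]      # simulated observations
P5_ref = virtual_point(*ref_pts)
X_est = transform_from_three_points(ref_pts, world_pts)
```

The inverse exists only when the three spots are not collinear; nearly collinear spots make the stacked matrix poorly conditioned.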

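[67] Comparison with equation (7) shows that this is the correct solution. This may be verified independently by transforming the fourth world point into reference coordinates, projecting it onto the unit sphere, and comparing to the corresponding reference unit vector:

$$\begin{aligned}
\hat{P}_4^R &= X_{R \to W}^{-1} \, P_4^W = [5.6901 \;\; -3.7675 \;\; 7.3095 \;\; 1]^T \\
\hat{p}_4 &= \mathcal{H}^{-1}(\hat{P}_4^R) \,/\, \lVert \mathcal{H}^{-1}(\hat{P}_4^R) \rVert = [0.5690 \;\; -0.3767 \;\; 0.7310]^T
\end{aligned} \qquad (19)$$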

[68] Comparing this to equation (4) shows that the fourth world point does indeed agree with the fourth reference coordinate and we can therefore conclude that the calculated transform is correct.
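The fourth-beam check can be sketched as an angular error between the predicted and the designed beam direction. The two candidate matrices are the rounded results of equations (18) and (21); the function name `beam_error_deg` and the idea of picking the smallest angular error are assumptions of this sketch.

```python
import numpy as np

p4 = np.array([0.56901, -0.37675, 0.73095])         # designed direction of beam 4
P4_world = np.array([14.321, 9.4065, 6.7226, 1.0])  # observed spot, homogeneous

candidates = {
    "first": np.array([[0.9170, -0.3892, 0.0872, 7.0],
                       [0.3944, 0.9175, -0.0521, 11.0],
                       [-0.0597, 0.0822, 0.9948, 0.1],
                       [0.0, 0.0, 0.0, 1.0]]),
    "second": np.array([[0.9043, -0.4153, 0.0989, 7.1142],
                        [0.4265, 0.8898, -0.1626, 11.2852],
                        [-0.0205, 0.1892, 0.9817, -0.0109],
                        [0.0, 0.0, 0.0, 1.0]]),
}

def beam_error_deg(X, spot_world_h, beam):
    """Angle between the beam and the spot direction seen from the projector."""
    spot_ref = np.linalg.inv(X) @ spot_world_h
    direction = spot_ref[:3] / spot_ref[3]
    direction = direction / np.linalg.norm(direction)
    return np.degrees(np.arccos(np.clip(direction @ beam, -1.0, 1.0)))

errors = {name: beam_error_deg(X, P4_world, p4) for name, X in candidates.items()}
best = min(errors, key=errors.get)
```

With measured data the error of the correct candidate will not be zero, so a tolerance tied to camera accuracy would be needed in practice.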

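[69] Now consider the second solution where $s_1 = 1.3008$, $s_2 = 3.8666$, $s_3 = 7.1104$. Plugging these values into equation (2) gives:

$$\begin{aligned}
\hat{P}_1^R &= [-0.9241 \;\; -0.3729 \;\; 0.8362 \;\; 1]^T \\
\hat{P}_2^R &= [2.7467 \;\; 1.1085 \;\; 2.4854 \;\; 1]^T \\
\hat{P}_3^R &= [-6.3198 \;\; 3.2585 \;\; 0.0000 \;\; 1]^T \\
\hat{P}_5^R &= [-8.1519 \;\; -6.2022 \;\; 22.1599 \;\; 1]^T
\end{aligned} \qquad (20)$$

[70] Stacking these points as we did in equation (18) leads to the transform matrix:

$$\begin{aligned}
X_{R \to W} &= \begin{bmatrix}
6.5162 & 9.3834 & 0.0460 & 4.5101 \\
10.4233 & 13.0387 & 11.4894 & -1.3129 \\
0.7583 & 2.5826 & 0.7349 & 20.7373 \\
1 & 1 & 1 & 1
\end{bmatrix}
\begin{bmatrix}
-0.9241 & 2.7467 & -6.3198 & -8.1519 \\
-0.3729 & 1.1085 & 3.2585 & -6.2022 \\
0.8362 & 2.4854 & 0.0000 & 22.1599 \\
1 & 1 & 1 & 1
\end{bmatrix}^{-1} \\
X_{R \to W} &= \begin{bmatrix}
0.9043 & -0.4153 & 0.0989 & 7.1142 \\
0.4265 & 0.8898 & -0.1626 & 11.2852 \\
-0.0205 & 0.1892 & 0.9817 & -0.0109 \\
0 & 0 & 0 & 1
\end{bmatrix}
\end{aligned} \qquad (21)$$

[71] Testing this with the fourth world point leads to:

$$\begin{aligned}
\hat{P}_4^R &= X_{R \to W}^{-1} \, P_4^W = [5.5783 \;\; -3.3910 \;\; 7.6286 \;\; 1]^T \\
\hat{p}_4 &= \mathcal{H}^{-1}(\hat{P}_4^R) \,/\, \lVert \mathcal{H}^{-1}(\hat{P}_4^R) \rVert = [0.5556 \;\; -0.3377 \;\; 0.7598]^T
\end{aligned} \qquad (22)$$

[72] Here the elements of $\hat{p}_4$ differ from those of $p_4$ (see equation (4)) indicating that this is not a correct solution.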

[73] For many purposes it is unnecessary to decompose the transformation matrix, $X_{R \to W}$; however, we present the decomposition here for completeness. The transform describes directly how the basis vectors of one coordinate system relate to the basis vectors of another. The coordinate system is defined by the point at infinity on the x-axis, the point at infinity on the y-axis, the point at infinity on the z-axis, and the origin. We denote the basis vectors of the reference coordinate system in world coordinates as $B_R^W$, and the basis vectors of the reference coordinate system in reference coordinates as $B_R^R$. If we stack the basis vectors we get the four by four identity matrix:

$$B_R^R = \begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix} \qquad (23)$$

[74] Since,

$$B_R^W = X_{R \to W} \, B_R^R = X_{R \to W} \qquad (24)$$

the transformation can be read as the basis vectors of the reference coordinate system in the world coordinate system. Thus the question "What is the position of the reference system (i.e. the laser projector)?" is equivalent to asking "Where is the origin of the reference coordinate frame in the world coordinate system?" This is given by the fourth column of $X_{R \to W}$; the column that corresponds to $[0 \;\; 0 \;\; 0 \;\; 1]^T$ in $B_R^R$. Likewise the other columns tell us how the reference frame has rotated (i.e. the orientation of the laser projector). However, those unfamiliar with projective geometry often prefer to consider the rotation in terms of Euler angles. For a z-y-x Euler sequence we can consider the transformation to be composed as:

$$X_{R \to W} = T(x, y, z) \, R_x(\theta_x) \, R_y(\theta_y) \, R_z(\theta_z) \qquad (25)$$

where $T(x, y, z)$ is a translation by $[x \;\; y \;\; z]^T$ and $R_x$, $R_y$, $R_z$ are the homogeneous matrices for counter-clockwise rotations about the x-, y- and z-axes.

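[75] In this convention $\theta_z$ (yaw) is a counter-clockwise rotation about the z-axis, $\theta_y$ (pitch) is a counter-clockwise rotation about the new y-axis, and $\theta_x$ (roll) is a counter-clockwise rotation about the new x-axis. To avoid singularities in the inversion of the transform, $\theta_y$ is restricted to the open interval $-90^\circ < \theta_y < 90^\circ$. When $\theta_y = \pm 90^\circ$ gimbal lock occurs and Euler angles are inadequate for describing the rotation. With this caveat the transform can be decomposed as:

$$\begin{aligned}
\theta_x &= \operatorname{atan2}(-r_{23},\, r_{33}) \\
\theta_y &= \sin^{-1}(r_{13}) \\
\theta_z &= \operatorname{atan2}(-r_{12},\, r_{11})
\end{aligned}
\qquad\text{where}\qquad
X_{R \to W} = \begin{bmatrix}
r_{11} & r_{12} & r_{13} & x \\
r_{21} & r_{22} & r_{23} & y \\
r_{31} & r_{32} & r_{33} & z \\
0 & 0 & 0 & 1
\end{bmatrix} \qquad (26)$$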

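[76] Applying this to the transformation of equation (18) we get:

$$X_{R \to W} = \begin{bmatrix}
0.9170 & -0.3892 & 0.0872 & 7.0000 \\
0.3944 & 0.9175 & -0.0521 & 11.0000 \\
-0.0597 & 0.0822 & 0.9948 & 0.1000 \\
0 & 0 & 0 & 1
\end{bmatrix}$$

$$\begin{aligned}
\theta_x &= \operatorname{atan2}(0.0521,\, 0.9948) = 3^\circ \\
\theta_y &= \sin^{-1}(0.0872) = 5^\circ \\
\theta_z &= \operatorname{atan2}(0.3892,\, 0.9170) = 23^\circ \\
x &= 7, \quad y = 11, \quad z = 0.1
\end{aligned} \qquad (27)$$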
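The decomposition of equation (26) can be sketched with the standard library alone. The function name `decompose` is an assumption; the matrix is the transform of equation (7), and the sketch does not handle the gimbal-lock case noted above.

```python
import math

X_ref_to_world = [
    [0.917, -0.38924, 0.087156, 7.0],
    [0.39439, 0.91746, -0.052137, 11.0],
    [-0.059668, 0.082183, 0.99483, 0.1],
    [0.0, 0.0, 0.0, 1.0],
]

def decompose(X):
    """Return ((theta_x, theta_y, theta_z) in degrees, (x, y, z))."""
    theta_x = math.atan2(-X[1][2], X[2][2])   # atan2(-r23, r33)
    theta_y = math.asin(X[0][2])              # asin(r13)
    theta_z = math.atan2(-X[0][1], X[0][0])   # atan2(-r12, r11)
    angles = tuple(math.degrees(t) for t in (theta_x, theta_y, theta_z))
    position = (X[0][3], X[1][3], X[2][3])
    return angles, position

angles_deg, position = decompose(X_ref_to_world)
```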

[77] Thus the position of the origin of the reference coordinate system (i.e. the position of the laser projector) expressed in the world coordinate system is (7, 11, 0.1) and the orientation of the laser projector in the world coordinate system is described by Euler angles 3°, 5° and 23°.

[78] To recap: Knowledge of the location of laser spots on the walls of a room, combined with knowledge of the relative directions of laser beams emitted by a laser projector, leads to the location and orientation of the laser projector expressed in room coordinates. The location of the spots is determined with a calibrated set of cameras and the relative directions of the projected laser beams are set during manufacture and/or set-up of the laser projector.

[79] A few subtleties of the system and methods described above are worth mentioning or revisiting at this point. For example, in an embodiment the directions of each laser beam coincide at a point, P. If this is not the case the mathematics of the space resection problem becomes more complicated.

[80] Correspondences between laser beams and their spots may be accomplished by trial and error until a solution to the space resection problem is found. This process is made more robust when the angles between pairs of laser beams are different.

[81] Alternatively, each laser beam may be modulated or encoded to facilitate identification. Each beam may be modulated with its own frequency sine wave or its own pseudo random code, as examples. Demodulating cameras may be used to identify beams or demodulation may be done later using a separate microprocessor. Unique beam identification becomes even more helpful when multiple laser projectors (e.g. on multiple robots or other objects) are tracked at once.
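One way to picture identification by sine-wave modulation is to correlate a spot's brightness over successive frames against each candidate frequency and pick the strongest match. Everything specific in the sketch below is hypothetical: the beam labels, the modulation frequencies, the 100 Hz sampling rate and the function name `identify_beam` are illustrative choices, not values from the disclosure.

```python
import math

def identify_beam(samples, sample_rate_hz, beam_freqs_hz):
    """Return the beam id whose modulation frequency best matches the samples."""
    mean = sum(samples) / len(samples)
    best_id, best_power = None, -1.0
    for beam_id, freq in beam_freqs_hz.items():
        i_sum = q_sum = 0.0
        for n, s in enumerate(samples):
            phase = 2.0 * math.pi * freq * n / sample_rate_hz
            i_sum += (s - mean) * math.cos(phase)   # in-phase correlation
            q_sum += (s - mean) * math.sin(phase)   # quadrature correlation
        power = i_sum**2 + q_sum**2                 # independent of signal phase
        if power > best_power:
            best_id, best_power = beam_id, power
    return best_id

# Hypothetical setup: three beams, one second of brightness samples at 100 Hz.
beam_freqs_hz = {"F": 2.0, "G": 3.0, "H": 5.0}
rate = 100.0
samples = [1.0 + 0.5 * math.sin(2.0 * math.pi * 3.0 * n / rate + 0.7)
           for n in range(100)]
detected = identify_beam(samples, rate, beam_freqs_hz)
```

Using both the in-phase and quadrature sums means the camera does not need to know the phase of the modulation, only the set of candidate frequencies.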

[82] The use of four, five or even more beams per laser projector helps make the system more robust in the face of potential geometric ambiguities. Furthermore, once an ambiguity has been resolved, such as finding that the first rather than the second solution is correct in the example above, it will tend to stay resolved in the same way as a tracked object makes incremental movements from one location and pose to the next.

[83] In light of the detailed example given above and the subtleties just mentioned, Fig. 7 is a flow chart for an indoor navigation method. According to Fig. 7, the first step 705 in the method is to project laser beams in four or more different directions from a laser projector onto surfaces of an indoor space. Next, in step 710, images of the spots made by the beams are captured with two or more cameras. At least two cameras are necessary to determine the three dimensional position of the laser spots in the coordinate system of the indoor space. Ideally images are captured by each camera at the same time. Delay between images may reduce the accuracy of the system depending on the speed of a tracked object. Cameras may provide a time-stamp along with the two dimensional coordinates of observed laser spots so that a processor can use sets of spots observed as close to simultaneously as possible.

[84] The next step 715 is to identify the observed points based on unique modulation signals applied to each laser beam. This step is not required if no laser modulation is used. Given the observed location of laser spots as determined from data supplied by two or more cameras and knowledge of the geometry of the laser projector, the space resection problem is now solved in step 720. The solution may proceed in analogy to the example provided above or it may use another method. The solution may include resolving geometric ambiguities which may arise.

[85] The solution includes comparing the coordinates of known points (e.g. laser spots) as expressed in reference and world coordinates to find a matrix describing a coordinate transform between the two coordinate systems. This may be done through Procrustes analysis or using the method of manufacturing a virtual, linearly independent point as described above.
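For comparison with the virtual-point method, a Procrustes-style fit can be sketched with a singular value decomposition. This is the common SVD-based rigid alignment (often called the Kabsch algorithm), used here as a stand-in for the cited analysis rather than a reproduction of it. The function name, the test rotation of 30 degrees about z, and the translation are assumptions for illustration.

```python
import numpy as np

def procrustes_rigid(ref_pts, world_pts):
    """Least-squares R, t such that world ~ R @ ref + t (rows are points)."""
    ref_c = ref_pts.mean(axis=0)
    world_c = world_pts.mean(axis=0)
    H = (ref_pts - ref_c).T @ (world_pts - world_c)   # 3 by 3 covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))            # reject reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = world_c - R @ ref_c
    return R, t

# Spots in the reference frame: example beam directions scaled by 1, 4, 7, 10.
beams = np.array([[-0.71037, -0.2867, 0.64279],
                  [0.71037, 0.2867, 0.64279],
                  [-0.88881, 0.45828, 0.0],
                  [0.56901, -0.37675, 0.73095]])
ref_pts = np.array([1.0, 4.0, 7.0, 10.0])[:, None] * beams

# Hypothetical ground truth used only to generate world points for the sketch.
yaw = np.radians(30.0)
R_true = np.array([[np.cos(yaw), -np.sin(yaw), 0.0],
                   [np.sin(yaw), np.cos(yaw), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([7.0, 11.0, 0.1])
world_pts = ref_pts @ R_true.T + t_true

R_est, t_est = procrustes_rigid(ref_pts, world_pts)
```

Unlike the virtual-point method, this fit uses all available spots at once and returns an exactly orthogonal rotation even when the measurements are noisy.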

[86] A system including a multi-beam laser projector attached to an object to be tracked, a set of calibrated cameras that observe laser spots on the walls of a room, and a processor that solves the space resection problem is thus able to provide an indoor navigation solution. The system avoids many difficulties associated with traditional camera-based navigation including issues such as occlusion and geometric insensitivity while requiring neither extraordinary processing power nor high-bandwidth data transfer.

Part II: Advanced laser projection systems

[87] The purpose of advanced laser projection systems described below is to increase the number of observable laser spots on the surfaces of an indoor space. More spots can make an indoor navigation system more robust by helping to avoid poor spot geometries, mitigating occlusions, and providing additional measurements to resolve ambiguities.

[88] Previously the location and orientation of a tracked object, e.g. a robot, were determined from observations of laser spots projected from a projector rigidly attached to the tracked object. In other words, the relative orientation of the tracked object and the laser projector was fixed. In an advanced laser projection system, however, a projector may be mounted on a tip/tilt platform or gimbal mount attached to the tracked object instead of directly on the tracked object. (Note: "Tracked object" means an object whose position and orientation are desired; it is not meant to specify whether or not an object, such as a robot, travels on tracks.)

[89] The space resection method used to find the position and orientation of the tracked object remains the same. However, an additional coordinate transformation may be needed to relate laser projector attitude to tracked object attitude.

[90] Advanced projection systems increase the number of observable spots but may slow down position update rate if not all spots are available simultaneously. Spots may appear in groups or one-by-one depending on the projection system.

[91] Fig. 8 illustrates a multi-beam laser projector 805 that can rotate around roll, pitch and yaw axes: X, Y and Z. Projector 805 emits four laser beams F, G, H and J. Despite the two-dimensional appearance of the figure, the beams need not all lie in one plane. In an embodiment no more than two beams lie in any particular plane and the angle between any pair of laser beams is different from that between any other pair. Fig. 8 shows four lasers 810 - 813 provided to emit the four laser beams.

[92] In an embodiment the back projections of each beam intersect at a common point, P. Said another way, directions of the laser beams coincide at point P. Roll, pitch and yaw axes (X, Y and Z) intersect at P. Thus roll, pitch and yaw rotations change the orientation of the projector with respect to a tracked object, but not the position of point P.

[93] One way to provide the roll, pitch and yaw degrees of freedom illustrated in Fig. 8 is to mount multi-beam laser projector 805 on a gimbal mount 910 as shown in Fig. 9. The mount permits the projector to rotate through roll, pitch and yaw Euler angles γ, β and α, respectively. The gimbal mount may be fixed to a tracked object such as a robot.

[94] As an example, if projector 805 and mount 910 are configurable in two different orientations, $\alpha_1, \beta_1, \gamma_1$ and $\alpha_2, \beta_2, \gamma_2$, then laser beams F, G, H and J create two sets of spots on the surfaces of an indoor space, one corresponding to the first configuration and another to the second. The space resection problem may then be solved twice to find the position and orientation at point P.

[95] Full α, β, γ rotational freedom is not always necessary. Fig. 10 illustrates a multi-beam laser projector 1005 that may emit beams in almost any direction. The projector is mounted on a rotation mount 1010 that permits only yaw rotation as indicated by yaw angle α. Despite the two-dimensional appearance of the figure, laser beams K, L, M, N, Q, R, S, T need not all lie in one plane. In an embodiment no more than two beams lie in any particular plane and the angle between any pair of laser beams is different from that between any other pair. In an embodiment the back projections of the beams intersect at a common point, P. Said another way, directions of the laser beams coincide at point P.

[96] The configuration of laser projector 1005 is specified by yaw angle α. Sets of spots projected by laser beams K ... T may be grouped according to yaw angle. Thus one set of spots may correspond to $\alpha = \alpha_1$ and another to $\alpha = \alpha_2$, for example. Rotation mount 1010 may be fixed to a tracked object, such as a robot.

[97] If the orientation of a tracked object is desired, then the relative orientation between the laser projector and the tracked object is needed, e.g. angles α, β, γ in Fig. 9 or angle α in Fig. 10. On the other hand, if only the position of a tracked object is required, then relative orientation of the projector and the object is not needed as long as the position of point P is fixed with respect to the object. In either case, the space resection problem can only be solved when angles between laser beams that generate spots are known.

[98] Fig. 11 shows an indoor navigation system in which a multi-beam laser projector emits beams in sets. Fig. 11 is similar to Fig. 1 and like reference numbers refer to the same items in each figure. However, in Fig. 11 laser projector 120 is rotatable through Euler angles α, β, γ and is therefore capable of generating two sets of laser spots on the walls of room 110: spots 1 - 4 and spots 1' - 4'. Microprocessor 140 may solve the space resection problem twice, once for each set of spots. Of course, projector 120 may be capable of projecting many more sets of spots associated with different sets of Euler angles.

[99] The angles between laser beams are the same for each set of spots, but the relative orientation of the set of laser beams and the tracked object is different. This relative orientation must be communicated to microprocessor 140 if the attitude of the object (e.g. robot 105) is desired. On the other hand, the position of point P (and the robot) may be determined without knowing the relative orientation between laser projector and the object. Data describing the orientation of the laser projector with respect to the tracked object may be communicated as part of a laser beam modulation pattern or by another data link.

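[100] The orientation of the tracked object may be obtained using a transformation between reference (i.e. laser projector) coordinates and tracked-object coordinates. The rotation may be expressed with Euler angles:

$$\begin{bmatrix} O_X \\ O_Y \\ O_Z \end{bmatrix} =
\begin{bmatrix}
\cos\alpha\cos\beta & \cos\alpha\sin\beta\sin\gamma - \sin\alpha\cos\gamma & \cos\alpha\sin\beta\cos\gamma + \sin\alpha\sin\gamma \\
\sin\alpha\cos\beta & \sin\alpha\sin\beta\sin\gamma + \cos\alpha\cos\gamma & \sin\alpha\sin\beta\cos\gamma - \cos\alpha\sin\gamma \\
-\sin\beta & \cos\beta\sin\gamma & \cos\beta\cos\gamma
\end{bmatrix}
\begin{bmatrix} R_X \\ R_Y \\ R_Z \end{bmatrix}$$

Here $[O_X \;\; O_Y \;\; O_Z]^T$ is a vector in the tracked object coordinate system and $[R_X \;\; R_Y \;\; R_Z]^T$ is a vector in the projector (reference) coordinate system. α, β and γ are counterclockwise rotation angles of the projector around object axes Z, Y and X respectively, where the γ rotation is performed first, then the β rotation, and the α rotation last. (This coordinate transformation assumes that object and reference coordinate systems are related by rotation without translation.)

[101] As a simple example, if projector 1005 of Fig. 10 rotates around the Z axis with rotation rate ω radians per second (i.e. $\alpha = \omega t$ and $\beta = \gamma = 0$), then

$$\begin{bmatrix} O_X \\ O_Y \\ O_Z \end{bmatrix} =
\begin{bmatrix}
\cos\omega t & -\sin\omega t & 0 \\
\sin\omega t & \cos\omega t & 0 \\
0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} R_X \\ R_Y \\ R_Z \end{bmatrix}$$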

[102] In this case the position of spots generated by lasers K ... T is a function of time. If the orientation of a tracked object is desired, then orientation data must be exchanged between the projector and the processor performing space resection. Fig. 12 illustrates orientation data communication between a multi-beam laser projector and a microprocessor.

[103] In Fig. 12 multi-beam laser projector 1205 is attached to, but rotatable with respect to, a tracked object, robot 1210. The projector communicates data 1215 to microprocessor 1220: at time $t_1$, for example, the relative orientation of the projector with respect to the tracked object is $O_1$. Microprocessor 1220 solves the space resection problem (as described in Part I above) for sets of laser spots observed at various times. For example, the space resection problem may be solved for laser spots observed at time $t = t_1$ and the position and orientation of laser projector 1205 estimated at that time. If the orientation of robot 1210 is also desired at $t_1$, then a coordinate transformation based on orientation data $O_1$ may be used to relate projector and robot orientations. Orientation data may include Euler angles describing a coordinate transformation between object and projector coordinate systems.
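A rotation from projector coordinates to tracked-object coordinates, built from reported Euler angles, can be sketched as below. The matrix is the product of a Z rotation by α, a Y rotation by β and an X rotation by γ, with the X rotation applied first; the function name, the yaw rate and the sample beam direction are assumptions for illustration.

```python
import numpy as np

def projector_to_object(alpha, beta, gamma):
    """Rotation taking projector-frame vectors to object-frame vectors (radians)."""
    ca, sa = np.cos(alpha), np.sin(alpha)
    cb, sb = np.cos(beta), np.sin(beta)
    cg, sg = np.cos(gamma), np.sin(gamma)
    return np.array([
        [ca * cb, ca * sb * sg - sa * cg, ca * sb * cg + sa * sg],
        [sa * cb, sa * sb * sg + ca * cg, sa * sb * cg - ca * sg],
        [-sb, cb * sg, cb * cg],
    ])

# Hypothetical single-axis case: yaw rate of pi/2 rad/s, evaluated at t = 1 s.
omega = np.pi / 2
t = 1.0
beam_in_projector = np.array([1.0, 0.0, 0.0])
beam_in_object = projector_to_object(omega * t, 0.0, 0.0) @ beam_in_projector
```

A processor could apply the same matrix either to the recovered projector orientation or to the reference unit vectors before solving space resection; the two choices are equivalent.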

[104] An advanced laser projector may report its orientation to a microprocessor. Alternatively, a microprocessor may command a laser projector to assume a specific orientation. Thus orientation data may be transferred between projector and processor in either direction. Orientation data may be communicated via standard wired or wireless links, or it may be included in a laser modulation scheme.

[105] As an example, consider a laser projector that operates in several orientations. Lasers in the projector may be modulated, for example to aid beam identification. Modulation may also specify the orientation of the projector. Spot detection cameras may then demodulate observed laser spots. The cameras then report spot position, laser beam identification and laser projector orientation information to a microprocessor as inputs to a space resection navigation method.

[106] Roll, pitch and yaw rotations of a multi-beam laser projector may be implemented in various ways including the use of gimbal and single-axis spinning mounts described above. Similar effects may also be obtained with scan mirror systems or spatial light modulators.

[107] Fig. 13 illustrates a laser projector based on a laser beam deflected by a scan mirror. In Fig. 13, laser projector 1305 includes laser 1310 which emits a beam incident on scan mirror 1315. Reflected beam 1320 is shown at time $t = t_1$. At later times, e.g. $t_2$, $t_3$, $t_4$, etc., the beam points in different directions as the scan mirror rotates. Scan mirror 1315 may be mounted on a galvo scanning unit or may be part of a spinning polygon mirror, as examples. In Fig. 13, the scan mirror rotates around an axis (not shown) perpendicular to the page. This motion leads to reflected laser beams lying on one plane coincident with the page. In general, however, a scan mirror may also rotate around other axes and thereby produce laser beam directions not all in one plane.

[108] Laser 1310 may be modulated such that it emits light only at discrete times. For example, if scan mirror 1315 rotates at a constant speed, but laser 1310 emits light only at discrete times, e.g. $t_1$, $t_2$, $t_3$, $t_4$, etc., then laser spots created on a wall of an indoor space by projector 1305 will appear in rapid succession, but not simultaneously. Successive spots may still be used for navigation by space resection depending on relative time scales of relevant events. If a set of spots is projected within the time that a camera (e.g. camera 130) captures one image frame, then the camera cannot distinguish the temporal sequence of the spots; as far as the camera is concerned, the spots appear simultaneously.

[109] For best navigation accuracy, new complete sets of spots should appear faster than the time it takes for a tracked object to move appreciably. Further, spots should appear long enough for cameras to gather enough light to make accurate estimates of spot location in images. The signal to noise ratio of spot detection may be improved by modulating a laser beam.

[110] Fig. 14 illustrates a multi-beam laser projector based on a laser beam modulated by a spatial light modulator. In Fig. 14, laser projector 1405 includes laser 1410 and spatial light modulator (SLM) 1415. SLM 1415 transforms a beam of light from laser 1410 into a projected image. The image may contain a single spot of light or it may include many spots. As an example, in Fig. 14, SLM 1415 modulates a laser beam from laser 1410 into beams 1420 and 1422 which appear simultaneously at time $t = t_1$. Spatial light modulator 1415 may be based on a micro-electromechanical (MEMS) mirror array such as Texas Instruments' Digital Mirror Device or a MEMS ribbon array such as a grating light modulator.

[111] A spatial light modulator may be used to create a set of numerous spots that covers a large solid angle. Alternatively, an SLM may be used to create successive sets of just one, two or a few spots. The latter approach may allow each spot to be brighter depending on the SLM technology used.

[112] A projector based on a MEMS SLM need not move (on a macroscopic scale) with respect to a tracked object to which it is attached. Thus, instead of sending Euler angle data to a space resection processor, a projector may send a set of reference unit vector coordinates analogous to unit vectors $p_1$, $p_2$ and $p_3$ shown in Fig. 6B. In fact, any of the advanced laser projectors described above may take this approach. It is a matter of engineering choice whether to supply a rotation matrix to relate projector and object coordinates or to recalculate the unit vector coordinates in the object coordinate system at any given time.

[113] Gimbal mounts, spinning mirrors and MEMS based SLMs all introduce the possibility that sets of spots are projected in different directions from an object at different times. A navigation solution based on space resection is valid for spots observed during a short time interval Δt. At later times different sets of spots may be used and the space resection problem may need to be solved again even if a tracked object has not moved. Considering the time dependence of laser beam directions, Fig. 15 is a flow chart for an indoor navigation method.

[114] According to Fig. 15, the first step 1505 in the method is to project laser beams in four or more different directions from a laser projector onto surfaces of an indoor space during a time interval Δt. Next, in step 1510, images of the spots made by the beams are captured with two or more cameras. At least two cameras are necessary to determine the three dimensional position of the laser spots in the coordinate system of the indoor space. Ideally images are captured by each camera at the same time. Delay between images may reduce the accuracy of the system depending on the speed of a tracked object. Cameras may provide a time-stamp along with the two dimensional coordinates of observed laser spots so that a processor can use sets of spots observed as close to simultaneously as possible.

[115] The next step 1515 is to identify the observed points based on unique modulation signals applied to each laser beam. This step is not required if no laser modulation is used. Given the observed location of laser spots as determined from data supplied by two or more cameras and knowledge of the geometry of the laser projector, the space resection problem is now solved in step 1520. The solution may proceed in analogy to the example provided above or it may use another method. The solution may include resolving geometric ambiguities which may arise. Finally in step 1525, if the relative orientation of the projector and a tracked object is known, that information may be used to find the orientation of a tracked object. That orientation may only be constant during Δt. Thus new projector-object relative orientation information may be needed at later times.
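Step 1515 can be sketched for the case of sinusoidal modulation. The example assumes each beam is modulated at its own known frequency and that a camera records the brightness of one spot over successive frames; the frame rate, the frequencies and the function name are illustrative assumptions, not values taken from this disclosure.

```python
import numpy as np

def identify_beam(samples, sample_rate_hz, beam_freqs_hz):
    """Return the index of the beam whose modulation frequency best matches the samples."""
    samples = np.asarray(samples, dtype=float)
    samples = samples - samples.mean()            # remove the constant background level
    t = np.arange(samples.size) / sample_rate_hz
    scores = []
    for f in beam_freqs_hz:
        # The magnitude of the correlation with a complex sinusoid does not depend on phase.
        scores.append(abs(np.sum(samples * np.exp(-2j * np.pi * f * t))))
    return int(np.argmax(scores))

rate = 120.0                          # assumed camera frame rate in Hz
freqs = [10.0, 15.0, 20.0, 25.0]      # assumed modulation frequency of each beam in Hz
t = np.arange(120) / rate
observed = 5.0 + np.sin(2 * np.pi * 20.0 * t + 0.7)   # spot brightness with unknown phase
beam_index = identify_beam(observed, rate, freqs)
```

A pseudorandom code could be handled the same way by correlating against each beam's code instead of a sinusoid.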

Conclusion

[116] The above description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the principles defined herein may be applied to other embodiments without departing from the scope of the disclosure. Thus, the disclosure is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

[117] All elements, parts and steps described herein are preferably included. It is to be understood that any of these elements, parts and steps may be replaced by other elements, parts and steps or deleted altogether as will be obvious to those skilled in the art.

[118] This writing discloses at least the following. An indoor navigation system is based on a multi-beam laser projector, a set of calibrated cameras, and a processor that uses knowledge of the projector design and data on laser spot locations observed by the cameras to solve the space resection problem to find the location and orientation of the projector.

CONCEPTS

This writing discloses at least the following concepts.

Concept 1. An indoor navigation system comprising:

a laser projector that emits one or more laser beams in four or more different directions during a time interval Δt;

two or more cameras that capture images of spots made by the laser beams on surfaces of an indoor space during Δt, the cameras calibrated such that three-dimensional locations of the spots can be estimated from images captured by at least two cameras; and,

a processor in communication with the cameras, the processor estimating position of the laser projector in the indoor space by space resection given the four or more different directions and the three-dimensional locations of the spots.

Concept 2. The system of Concept 1, the processor further in communication with the projector to determine orientation of the projector with respect to an object during Δt.

Concept 3. The system of Concept 2, the projector attached to the object via a gimbal mount.

Concept 4. The system of Concept 1, the projector comprising a rotating mirror; and the processor further in communication with the projector to determine an angle of the rotating mirror during Δt.

Concept 5. The system of Concept 1, the projector comprising a micro-electromechanical (MEMS) spatial light modulator; and the processor further in communication with the projector to determine unit vectors representing directions of laser beams emitted by the projector during Δt.

Concept 6. The system of any one of the preceding Concepts, each pair of directions of the four or more defining an angle different from that of any other pair.

Concept 7. The system of any one of the preceding Concepts, the four or more different directions coinciding at a point.

Concept 8. The system of any one of the preceding Concepts, the laser beams having an infrared wavelength and the cameras equipped with infrared filters to select the wavelength in preference to background light.

Concept 9. The system of any one of the preceding Concepts, each laser beam modulated with a modulation signal that distinguishes it from all of the other laser beams, and each camera demodulating the signal to identify a correspondence between spots and laser beam directions.

Concept 10. The system of Concept 9, the modulation signal being sinusoidal.

Concept 11. The system of Concept 9, the modulation signal being a pseudorandom code.

Concept 12. The system of Concept 9, the modulation signal further carrying information pertaining to the orientation of each laser beam with respect to an object.

Concept 13. A method for indoor navigation comprising:

capturing, with two or more cameras, images of spots made by one or more laser beams on surfaces of an indoor space, the laser beams emitted by a laser projector in four or more different directions during a time interval Δt;

estimating three-dimensional locations of the spots from images captured by at least two cameras during Δt; and,

estimating position of the laser projector in the indoor space by space resection given the four or more different directions and the three-dimensional locations of the spots.

Concept 14. The method of Concept 13 further comprising: estimating an orientation of an object during Δt, given a relative orientation of the object and the projector.

Concept 15. The method of Concept 14, the projector attached to the object via gimbal mount.

Concept 16. The method of Concept 13 wherein the laser projector comprises a rotating mirror; and further comprising: estimating an orientation of an object during Δt, given an angle of the rotating mirror with respect to the object during Δt.

Concept 17. The method of Concept 13 wherein the laser projector comprises a micro-electromechanical (MEMS) spatial light modulator; and further comprising: estimating an orientation of an object during Δt, given a set of unit vector coordinates describing directions of laser beams emitted by the projector with respect to the object during Δt.

Concept 18. The method of any one of Concepts 13 - 17, the cameras calibrated by:

capturing simultaneously with two or more cameras images of four or more spots on a planar surface;

determining homographies between pairs of cameras by identifying corresponding spots in the images captured;

determining from the homographies: relative poses between pairs of cameras, and orientation of the planar surface with respect to the cameras;

fitting the orientation of the planar surface to a model of the indoor space; and,

determining the location and pose of each camera in the indoor space.

Concept 19. The method of any one of Concepts 13 - 18, the space resection including:

creating a virtual point not coplanar with three actual spot locations used in resection;

using correspondences between locations of the three actual spots and the virtual point expressed with respect to the indoor space and the laser projector to establish a transformation matrix; and,

using the transformation matrix to test a fourth actual spot to resolve geometric ambiguities in resection.

Concept 20. The method of any one of Concepts 13 - 19, each laser beam modulated with a modulation signal that distinguishes it from all of the other laser beams, and each camera demodulating the signal to identify a correspondence between spots and laser beam directions.

Concept 21. The method of Concept 20, the modulation signal being sinusoidal.

Concept 22. The method of Concept 20, the modulation signal being a pseudorandom code.

Concept 23. The method of any one of Concepts 13 - 22, each pair of directions of the four or more defining an angle different from that of any other pair.

Concept 24. The method of any one of Concepts 13 - 23, the four or more different directions coinciding at a point.

Concept 25. The method of any one of Concepts 13 - 24, the laser beams having an infrared wavelength and the cameras equipped with infrared filters to select the wavelength in preference to background light.

Concept 26. A method for indoor navigation comprising:

capturing, with two or more cameras, images of a first set of spots made by one or more laser beams on surfaces of an indoor space, the laser beams emitted by a laser projector in a first set of four or more different directions during a time interval Δt1, and images of a second set of spots made by one or more laser beams on surfaces of the indoor space, the laser beams emitted by the laser projector in a second set of four or more different directions during a time interval Δt2;

estimating three-dimensional locations of the spots from images captured by at least two cameras during Δt1 and Δt2; and,

estimating a position of the laser projector in the indoor space during Δt1 and Δt2 by space resection given the first and second sets of four or more different directions and the three-dimensional locations of the spots.

Concept 27. The method of Concept 26 further comprising: estimating an orientation of an object during Δt1 and Δt2, given a relative orientation of the object and the projector.

Concept 28. The method of Concept 27, the projector attached to the object via gimbal mount.

Concept 29. The method of Concept 26 wherein the laser projector comprises a rotating mirror; and further comprising: estimating an orientation of an object during Δt1 and Δt2, given angles of the rotating mirror with respect to the object during Δt1 and Δt2.

Concept 30. The method of Concept 26 wherein the laser projector comprises a micro-electromechanical (MEMS) spatial light modulator; and further comprising: estimating an orientation of an object during Δt1 and Δt2, given a set of unit vector coordinates describing directions of laser beams emitted by the projector with respect to the object during Δt1 and Δt2.

Concept 31. The method of any one of Concepts 26 - 30, the cameras calibrated by:

capturing simultaneously with two or more cameras images of four or more spots on a planar surface;

determining homographies between pairs of cameras by identifying corresponding spots in the images captured;

determining from the homographies: relative poses between pairs of cameras, and orientation of the planar surface with respect to the cameras;

fitting the orientation of the planar surface to a model of the indoor space; and,

determining the location and pose of each camera in the indoor space.

Concept 32. The method of any one of Concepts 26 - 31, the space resection including:

creating a virtual point not coplanar with three actual spot locations used in resection;

using correspondences between locations of the three actual spots and the virtual point expressed with respect to the indoor space and the laser projector to establish a transformation matrix; and,

using the transformation matrix to test a fourth actual spot to resolve geometric ambiguities in resection.

Concept 33. The method of any one of Concepts 26 - 32, each laser beam modulated with a modulation signal that distinguishes it from all of the other laser beams, and each camera demodulating the signal to identify a correspondence between spots and laser beam directions.

Concept 34. The method of Concept 33, the modulation signal being sinusoidal.

Concept 35. The method of Concept 33, the modulation signal being a pseudorandom code.

Concept 36. The method of any one of Concepts 26 - 35, each pair of directions of each set of four or more defining an angle different from that of any other pair.

Concept 37. The method of any one of Concepts 26 - 36, each set of four or more different directions coinciding at a point.

Concept 38. The method of any one of Concepts 26 - 37, the laser beams having an infrared wavelength and the cameras equipped with infrared filters to select the wavelength in preference to background light.

APPENDIX A

"Review and Analysis of Solutions of the Three Point Perspective Pose Estimation Problem,"

International Journal of Computer Vision, 13, 3, 331 - 356 (1994).


© 1994 Kluwer Academic Publishers. Manufactured in The Netherlands.

Systems and Replication

Review and Analysis of Solutions of the Three Point Perspective Pose Estimation Problem

ROBERT M. HARALICK

Intelligent Systems Laboratory, Department of Electrical Engineering

FT-10, University of Washington, Seattle, WA 98195, USA

CHUNG-NAN LEE

Institute of Information Engineering, National Sun Yat-Sen University, Kaohsiung, Taiwan 80424, ROC

KARSTEN OTTENBERG

Philips-Forschungslabor, Vogt-Kölln Straße 30, D-2000 Hamburg 54, Germany

MICHAEL NÖLLE

Technische Universität Hamburg-Harburg, Technische Informatik I, Harburger Schloßstraße 20, D-2100 Hamburg 90, Germany

Received October 30, 1990; revised March 27, 1992 and October 29, 1993

Abstract. In this paper, the major direct solutions to the three point perspective pose estimation problem are reviewed from a unified perspective, beginning with the first solution, which was published in 1841 by a German mathematician, continuing through the solutions published in the German and then American photogrammetry literature, and ending with the most recent solutions in the computer vision literature. The numerical stability of these three point perspective solutions is also discussed. We show that even in cases where the solution is not near the geometrically unstable region, considerable care must be exercised in the calculation. Depending on the order of the substitutions utilized, the relative error can change by more than a factor of a thousand. This difference is due entirely to the way the calculations are performed and not to any geometric structural instability of any problem instance. We present an analysis method which produces a numerically stable calculation.

1 Introduction

Given the perspective projection of three points constituting the vertices of a known triangle in 3D space, it is possible to determine the position of each of the vertices. There may be as many as four possible solutions for point positions in front of the center of perspectivity and four corresponding solutions whose point positions are behind the center of perspectivity. In photogrammetry, this problem is called the three point space resection problem.

This problem is important in photogrammetry as well as in computer vision, because it has a variety of applications, such as camera calibration (Tsai 1987), object recognition, robot picking, and robot navigation (Linnainmaa et al. 1988; Horaud 1987; Lowe 1987; Dhome 1988) in computer vision and the determination of the location in space from a set of landmarks appearing in the image in photogrammetry (Fischler and Bolles 1981). Three points is the minimal information to solve such a problem. It was solved by a direct solution first by a German mathematician in 1841 (Grunert 1841) and then refined by German photogrammetrists in 1904 and 1925 (Müller 1925). Then it was independently solved by an American photogrammetrist in 1949 (Merritt 1949).

The direct solution became less important to the photogrammetry community with the advent of iterative solutions which could be done by computer. The iterative solution technique, which was first published by Church (1945, 1948), needs a good starting value which constitutes an approximate solution. In most photogrammetry situations scale and distances are known to within 10% and angle is known to within 15°. This is good enough for the iterative technique, which is just a repeated adjustment to the linearized equations. The technique can be found in many photogrammetry books such as Wolf (1974) or the Manual of Photogrammetry (Slama 1980).

The exact opposite is true for computer vision problems. Most of the time approximate solutions are not known, so the iterative method cannot be used. This makes the direct solution method more important in computer vision. Indeed, in 1981 the computer vision community independently derived its first direct solution (Fischler and Bolles 1981). The community has produced a few more direct solutions since then.

In this paper, first, we give a consistent treatment of all the major direct solutions to the three point pose estimation problem. There is a bit of mathematical tedium in describing the various solutions, and perhaps it is worthwhile to put them all in one place so that another vision researcher can be saved from having to redo the tedium himself or herself. Then, we compare the differences of the algebraic derivations and discuss the singularity of all the solutions. In addition to determining the positions of the three vertices in the 3D camera coordinate system, it is desirable to determine the transformation function from a 3D world coordinate system to the 3D camera coordinate system, which is called the absolute orientation in photogrammetry. Though many solutions to the absolute orientation can be found in photogrammetry literature (Schut 1960; Schönemann 1966; 1970; Wolf 1974; Slama 1980; Horn 1988; and Haralick et al. 1989) we present a simple linear solution in Appendix I to make the three point perspective pose estimation solution complete.

Second, we run experiments to study the numerical stability of each of these solutions and to evaluate some analysis methods described in Appendix II to improve the numerical stability of the calculation. It is well-known that rounding errors accumulate with increasing amounts of calculation and are significantly magnified in some kinds of operations. Furthermore, we find that the order of using these equations to derive the final solution affects the accuracy of numerical results. The results show that the accuracy can be improved by a factor of about $10^3$. Since the accumulated rounding errors are propagated into the calculation of the absolute orientation problem, the error can be very serious at the final stage. With the advent of better sensors and higher image resolution, numerical stability will play a more dominant role in the errors of computer vision problems.

Finally, we summarize the results of hundreds of thousands of experiments which study the numerical behaviors of the six different solution techniques, the effect of the order of equation manipulation, and the effectiveness of analysis methods. These results show that the analysis techniques in Appendix II are effective in determining equation order manipulation. The source codes and documentation used for the experiments in the paper are available on a CDROM. Interested readers can send a request to the Intelligent Systems Laboratory at the University of Washington.

2 The Problem Definition

Grunert (1841) appears to have been the first one to solve the problem. The solution he gives is outlined by Müller (1925). The problem can be set up in the following way, which is illustrated in Figure 1.

Let the unknown positions of the three points of the known triangle be

$p_1$, $p_2$, and $p_3$; $p_i = (x_i, y_i, z_i)^T$, $i = 1, 2, 3$.

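Let the known side lengths of the triangle be

$$a = \|p_2 - p_3\|, \qquad b = \|p_1 - p_3\|, \qquad c = \|p_1 - p_2\|.$$

We take the origin of the camera coordinate frame to be the center of perspectivity and the image projection plane to be a distance $f$ in front of the center of perspectivity. Let the observed perspective projections of $p_1, p_2, p_3$ be $q_1, q_2, q_3$; $q_i = (u_i, v_i)^T$, $i = 1, 2, 3$. By the perspective equations,

$$u_i = f\,\frac{x_i}{z_i}, \qquad v_i = f\,\frac{y_i}{z_i}.$$

The unit vectors pointing from the center of perspectivity to the observed points $p_1, p_2, p_3$ are given by

$$j_i = \frac{1}{\sqrt{u_i^2 + v_i^2 + f^2}}\,(u_i, v_i, f)^T, \qquad i = 1, 2, 3,$$

respectively. The center of perspectivity together with the three points of the 3D triangle form a tetrahedron. Let the angles at the center of perspectivity opposite sides $a$, $b$, $c$ be $\alpha$, $\beta$, $\gamma$, so that

$$\cos\alpha = j_2 \cdot j_3, \qquad \cos\beta = j_1 \cdot j_3, \qquad \cos\gamma = j_1 \cdot j_2.$$

Let the unknown distances of the points $p_1, p_2, p_3$ from the center of perspectivity be $s_1, s_2, s_3$; $s_i = \|p_i\|$. To determine the positions of the points it is sufficient to determine $s_1, s_2, s_3$, since $p_i = s_i j_i$.

3 The Solutions

There are six solutions presented by Grunert (1841), Finsterwalder (1937), Merritt (1949), Fischler and Bolles (1981), Linnainmaa et al. (1988), and Grafarend et al. (1989), respectively. In this section we first give the derivation of the Grunert solution to show how the problem can be solved. Then, we analyze the major differences among the algebraic derivations of these solutions before we give the detailed derivations for the rest of the solutions. Finally, we give comparisons of the algebraic derivations among the six solutions.

Grunert's Solution

Grunert proceeded in the following way. By the law of cosines,

$$s_2^2 + s_3^2 - 2 s_2 s_3 \cos\alpha = a^2 \tag{1}$$

$$s_1^2 + s_3^2 - 2 s_1 s_3 \cos\beta = b^2 \tag{2}$$

$$s_1^2 + s_2^2 - 2 s_1 s_2 \cos\gamma = c^2. \tag{3}$$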
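The quantities that enter the resection problem can be made concrete with a short Python sketch. It starts from three hypothetical points given in camera coordinates, computes the side lengths, the cosines of the angles at the center of perspectivity and the distances $s_1, s_2, s_3$, and evaluates the residuals of the three law-of-cosines relations. In an actual resection the distances are the unknowns; they are computed here from known points only to check the relations. Function names and point coordinates are assumptions for illustration.

```python
import numpy as np

def resection_inputs(points_cam):
    """Side lengths (a, b, c), cosines (alpha, beta, gamma) and distances (s1, s2, s3)."""
    p1, p2, p3 = (np.asarray(p, dtype=float) for p in points_cam)
    s = np.array([np.linalg.norm(p) for p in (p1, p2, p3)])
    j1, j2, j3 = p1 / s[0], p2 / s[1], p3 / s[2]     # unit vectors toward each point
    sides = (np.linalg.norm(p2 - p3), np.linalg.norm(p1 - p3), np.linalg.norm(p1 - p2))
    cosines = (j2 @ j3, j1 @ j3, j1 @ j2)            # angles opposite sides a, b, c
    return sides, cosines, s

def law_of_cosines_residuals(sides, cosines, s):
    """Residuals of the three law-of-cosines relations; all zero for a consistent input."""
    a, b, c = sides
    cos_a, cos_b, cos_g = cosines
    s1, s2, s3 = s
    return np.array([
        s2**2 + s3**2 - 2 * s2 * s3 * cos_a - a**2,  # equation (1)
        s1**2 + s3**2 - 2 * s1 * s3 * cos_b - b**2,  # equation (2)
        s1**2 + s2**2 - 2 * s1 * s2 * cos_g - c**2,  # equation (3)
    ])

points = [(1.0, 0.0, 2.0), (-1.0, 0.5, 3.0), (0.2, -1.5, 2.5)]   # hypothetical points
sides, cosines, s = resection_inputs(points)
residuals = law_of_cosines_residuals(sides, cosines, s)
```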

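Let

$$s_2 = u s_1 \quad\text{and}\quad s_3 = v s_1. \tag{4}$$

Then

$$s_1^2 = \frac{a^2}{u^2 + v^2 - 2uv\cos\alpha} = \frac{b^2}{1 + v^2 - 2v\cos\beta} = \frac{c^2}{1 + u^2 - 2u\cos\gamma}. \tag{5}$$

Hence,

$$u^2 + \frac{b^2 - a^2}{b^2}\,v^2 - 2uv\cos\alpha + \frac{2a^2}{b^2}\,v\cos\beta - \frac{a^2}{b^2} = 0 \tag{6}$$

$$u^2 - \frac{c^2}{b^2}\,v^2 + 2v\,\frac{c^2}{b^2}\cos\beta - 2u\cos\gamma + \frac{b^2 - c^2}{b^2} = 0. \tag{7}$$

From equation (6),

$$u^2 = \frac{a^2 - b^2}{b^2}\,v^2 + 2uv\cos\alpha - \frac{2a^2}{b^2}\,v\cos\beta + \frac{a^2}{b^2}.$$

This expression for $u^2$ can be substituted into equation (7). This permits a solution for $u$ to be obtained in terms of $v$.

$$u = \frac{\left(-1 + \frac{a^2 - c^2}{b^2}\right)v^2 - 2\left(\frac{a^2 - c^2}{b^2}\right)\cos\beta\,v + 1 + \frac{a^2 - c^2}{b^2}}{2(\cos\gamma - v\cos\alpha)} \tag{8}$$

This expression for $u$ can then be substituted back into equation (6) to obtain a fourth order polynomial in $v$,

$$A_4 v^4 + A_3 v^3 + A_2 v^2 + A_1 v + A_0 = 0 \tag{9}$$

where

$$A_4 = \left(\frac{a^2 - c^2}{b^2} - 1\right)^2 - \frac{4c^2}{b^2}\cos^2\alpha$$

$$A_3 = 4\left[\frac{a^2 - c^2}{b^2}\left(1 - \frac{a^2 - c^2}{b^2}\right)\cos\beta - \left(1 - \frac{a^2 + c^2}{b^2}\right)\cos\alpha\cos\gamma + 2\,\frac{c^2}{b^2}\cos^2\alpha\cos\beta\right]$$

$$A_2 = 2\left[\left(\frac{a^2 - c^2}{b^2}\right)^2 - 1 + 2\left(\frac{a^2 - c^2}{b^2}\right)^2\cos^2\beta + 2\left(\frac{b^2 - c^2}{b^2}\right)\cos^2\alpha - 4\left(\frac{a^2 + c^2}{b^2}\right)\cos\alpha\cos\beta\cos\gamma + 2\left(\frac{b^2 - a^2}{b^2}\right)\cos^2\gamma\right]$$

$$A_1 = 4\left[-\frac{a^2 - c^2}{b^2}\left(1 + \frac{a^2 - c^2}{b^2}\right)\cos\beta + \frac{2a^2}{b^2}\cos^2\gamma\cos\beta - \left(1 - \frac{a^2 + c^2}{b^2}\right)\cos\alpha\cos\gamma\right]$$

$$A_0 = \left(1 + \frac{a^2 - c^2}{b^2}\right)^2 - \frac{4a^2}{b^2}\cos^2\gamma.$$

Outline of the Differences of Algebraic Derivation

As we can find in the Grunert solution, the procedure to solve the problem is first to reduce the three unknown variables $s_1$, $s_2$, and $s_3$ of the three quadratic equations (1), (2) and (3) to two variables $u$ and $v$, and then to reduce the two variables $u$ and $v$ to one variable $v$, from which we find the solutions of $v$ and substitute them back into equation (5) and equation (4) to obtain $s_1$, $s_2$, and $s_3$. Though all six solutions mainly follow the outline mentioned above, there are several differences from the algebraic derivation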

point of view. We classify the differences from the following aspects.

Change of variables

Linnainmaa et al. use $s_2 = u + \cos\gamma\, s_1$ and $s_3 = v + \cos\beta\, s_1$ instead of $s_2 = u s_1$ and $s_3 = v s_1$, which are used by the others.

Different pairs of equations

There are three unknowns in the three equations (1), (2), and (3). After the change of variables is used, any two pairs of equations can be used to eliminate the third variable. For example, Grunert uses the pair of equations (1) and (2) and the pair of equations (2) and (3), and Merritt uses the pair of equations (1) and (2) and the pair of equations (1) and (3).

Approaches of further variables reduction

When reducing two variables into one variable, Grunert and Merritt use substitution. Fischler and Bolles and Linnainmaa et al. use direct elimination to reduce the variables. Finsterwalder and Grafarend et al. introduce a new variable $\lambda$ before reducing the variables.

The flow chart shown in Figure 2 gives a summary of the differences of algebraic derivation of the six solutions in a unified frame. In the flow chart we start from the three equations (1), (2), and (3), make different changes of variables, use different pairs of equations, do further variable reduction by different approaches, if necessary solve the new variable, and then we have six different solution techniques.

Fig. 2. Shows the differences of algebraic derivations among six solution techniques.

Finsterwalder's Solution

Finsterwalder (1903) as summarized by Finsterwalder and Scheufele (1937) proceeded in a manner which required only finding a root of a cubic polynomial and the roots of two quadratic polynomials rather than finding all the roots of a fourth order polynomial. Finsterwalder multiplies equation (7) by $\lambda$ and adds the result to equation (6) to produce

$$A u^2 + 2Buv + C v^2 + 2Du + 2Ev + F = 0 \tag{10}$$

where the coefficients depend on $\lambda$:

$$A = 1 + \lambda, \qquad B = -\cos\alpha, \qquad C = \frac{b^2 - a^2}{b^2} - \lambda\,\frac{c^2}{b^2},$$

$$D = -\lambda\cos\gamma, \qquad E = \left(\frac{a^2}{b^2} + \lambda\,\frac{c^2}{b^2}\right)\cos\beta, \qquad F = \frac{-a^2}{b^2} + \lambda\left(\frac{b^2 - c^2}{b^2}\right).$$

Finsterwalder considers this as a quadratic equation in $v$. Solving for $v$,

$$v = \frac{-2(Bu + E) \pm \sqrt{4(Bu + E)^2 - 4C(Au^2 + 2Du + F)}}{2C} = \frac{-(Bu + E) \pm \sqrt{(B^2 - AC)u^2 + 2(BE - CD)u + E^2 - CF}}{C}. \tag{11}$$

The numerically stable way of doing this computation is to determine the small root in terms of the larger root.

$$v_{\text{large}} = \frac{-\operatorname{sgn}(Bu + E)}{C}\left[\,|Bu + E| + \sqrt{(B^2 - AC)u^2 + 2(BE - CD)u + E^2 - CF}\,\right]$$

$$v_{\text{small}} = \frac{Au^2 + 2Du + F}{C\,v_{\text{large}}}$$

Now Finsterwalder asks, can a value for $\lambda$ be found which makes $(B^2 - AC)u^2 + 2(BE - CD)u + E^2 - CF$ a perfect square? For in this case $v$ can be expressed as a first order polynomial in terms of $u$. The geometric meaning of this case is that the solution to (10) corresponds to two intersecting lines. This first order polynomial can then be substituted back into equation (6) or (7), either one of which yields a quadratic equation which can be solved for $u$, and then using the just determined value for $u$ in the first order expression for $v$, a value for $v$ can be determined. Four solutions are produced since there are two first order expressions for $v$ and when each of them is substituted back into equation (6) or (7) the resulting quadratic in $u$ has two solutions.

The value of $\lambda$ which produces a perfect square satisfies

$$(B^2 - AC)u^2 + 2(BE - CD)u + E^2 - CF = (pu + q)^2. \tag{12}$$

Hence,

$$B^2 - AC = p^2, \qquad BE - CD = pq, \qquad E^2 - CF = q^2.$$

Since $p^2 q^2 = (pq)^2$,

$$(B^2 - AC)(E^2 - CF) = (BE - CD)^2.$$

After expanding this out, canceling a $B^2E^2$ on each side and dividing all terms by a common $C$ there results

$$C(AF - D^2) + B(2DE - BF) - AE^2 = 0, \tag{13}$$

or expressed as a determinant

$$\begin{vmatrix} A & B & D \\ B & C & E \\ D & E & F \end{vmatrix} = 0.$$
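The search for a value of $\lambda$ that makes the determinant vanish can be sketched numerically in Python. The sketch below is an illustration under stated assumptions, not the authors' procedure: it recovers the cubic in $\lambda$ by sampling the determinant at four values and interpolating (a shortcut chosen here to avoid hand-coded cubic coefficients), picks a real root, checks the perfect-square identity, and then solves the conic of equation (10) for $v$ at a known $u$ using the "small root from the large root" form. The function names and the three test points are hypothetical.

```python
import numpy as np

def conic_coefficients(lam, a, b, c, cos_a, cos_b, cos_g):
    """Coefficients A..F of equation (10) for a given lambda."""
    A = 1.0 + lam
    B = -cos_a
    C = (b*b - a*a - lam*c*c) / (b*b)
    D = -lam * cos_g
    E = (a*a + lam*c*c) * cos_b / (b*b)
    F = (-a*a + lam*(b*b - c*c)) / (b*b)
    return A, B, C, D, E, F

def conic_determinant(lam, *geometry):
    A, B, C, D, E, F = conic_coefficients(lam, *geometry)
    return np.linalg.det(np.array([[A, B, D], [B, C, E], [D, E, F]]))

def stable_roots(qa, qb, qc):
    """Real roots of qa*x^2 + 2*qb*x + qc = 0 (qa != 0); small root from the large one."""
    disc = qb*qb - qa*qc
    if disc < -1e-12:
        return []
    root = np.sqrt(max(disc, 0.0))
    sign = 1.0 if qb >= 0.0 else -1.0
    large = -sign * (abs(qb) + root) / qa      # no cancellation in this sum
    if large == 0.0:
        return [0.0]
    return [large, qc / (qa * large)]          # product of the roots is qc/qa

# Hypothetical geometry built from three known points in camera coordinates.
p1, p2, p3 = np.array([1.0, 0.0, 2.0]), np.array([-1.0, 0.5, 3.0]), np.array([0.2, -1.5, 2.5])
s1, s2, s3 = (np.linalg.norm(p) for p in (p1, p2, p3))
geometry = (np.linalg.norm(p2 - p3), np.linalg.norm(p1 - p3), np.linalg.norm(p1 - p2),
            p2 @ p3 / (s2 * s3), p1 @ p3 / (s1 * s3), p1 @ p2 / (s1 * s2))
u_true, v_true = s2 / s1, s3 / s1

# The determinant is a cubic in lambda, so four samples determine its coefficients.
samples = np.array([-1.0, 0.0, 1.0, 2.0])
cubic = np.linalg.solve(np.vander(samples, 4),
                        np.array([conic_determinant(x, *geometry) for x in samples]))
lam0 = min(np.roots(cubic), key=lambda r: abs(r.imag)).real   # a real root

A, B, C, D, E, F = conic_coefficients(lam0, *geometry)
perfect_square_lhs = (B*B - A*C) * (E*E - C*F)
perfect_square_rhs = (B*E - C*D) ** 2

# Equation (10) as a quadratic in v at the known u:
# C v^2 + 2 (B u + E) v + (A u^2 + 2 D u + F) = 0.
v_roots = stable_roots(C, B*u_true + E, A*u_true**2 + 2*D*u_true + F)
```

The stable form avoids subtracting two nearly equal numbers when the square root is close to the magnitude of the linear coefficient, which is the cancellation the text warns about.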

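Equation (13) is a cubic equation for $\lambda$:

$$G\lambda^3 + H\lambda^2 + I\lambda + J = 0 \tag{14}$$

where

$$G = c^2(c^2\sin^2\beta - b^2\sin^2\gamma)$$

$$H = b^2(b^2 - a^2)\sin^2\gamma + c^2(c^2 + 2a^2)\sin^2\beta + 2b^2c^2(-1 + \cos\alpha\cos\beta\cos\gamma)$$

$$I = b^2(b^2 - c^2)\sin^2\alpha + a^2(a^2 + 2c^2)\sin^2\beta + 2a^2b^2(-1 + \cos\alpha\cos\beta\cos\gamma)$$

$$J = a^2(a^2\sin^2\beta - b^2\sin^2\alpha).$$

Solve this equation for any root $\lambda_0$. This determines $p$ and $q$:

$$p = \sqrt{B^2 - AC}, \qquad q = \operatorname{sgn}(BE - CD)\sqrt{E^2 - CF}.$$

Then from equation (11)

$$v = [-(Bu + E) \pm (pu + q)]/C = [-(B \mp p)u - (E \mp q)]/C = um + n,$$

where

$$m = [-B \pm p]/C \quad\text{and}\quad n = [-(E \mp q)]/C.$$

Substituting this back into equation (7) and simplifying there results

$$(b^2 - m^2c^2)u^2 + 2\left(c^2(\cos\beta - n)m - b^2\cos\gamma\right)u - c^2n^2 + 2c^2 n\cos\beta + b^2 - c^2 = 0. \tag{16}$$

The numerically stable way to calculate $u$ is to compute the smaller root in terms of the larger root. Let

$$A = b^2 - m^2c^2, \qquad B = c^2(\cos\beta - n)m - b^2\cos\gamma, \qquad C = -c^2n^2 + 2c^2 n\cos\beta + b^2 - c^2,$$

then

$$u_{\text{large}} = \frac{-\operatorname{sgn}(B)}{A}\left[\,|B| + \sqrt{B^2 - AC}\,\right], \qquad u_{\text{small}} = \frac{C}{A\,u_{\text{large}}}.$$

Merritt's Solution

[The opening of this section is not legible in the source.] Multiplying equation (1) by $b^2$ and equation (2) by $a^2$ and subtracting gives

$$a^2s_1^2 - b^2s_2^2 + (a^2 - b^2)s_3^2 - 2a^2s_1s_3\cos\beta + 2b^2s_2s_3\cos\alpha = 0.$$

Similarly, after multiplying equation (1) by $c^2$ and equation (3) by $a^2$ and subtracting there results

$$a^2s_1^2 + (a^2 - c^2)s_2^2 - c^2s_3^2 - 2a^2s_1s_2\cos\gamma + 2c^2s_2s_3\cos\alpha = 0.$$

Then using the substitution of equation (4) we obtain the following two equations.

$$-b^2u^2 + (a^2 - b^2)v^2 - 2a^2\cos\beta\,v + 2b^2\cos\alpha\,uv + a^2 = 0 \tag{18}$$

$$(a^2 - c^2)u^2 - c^2v^2 - 2a^2\cos\gamma\,u + 2c^2\cos\alpha\,uv + a^2 = 0. \tag{19}$$

Hence, from equation (18),

$$v^2 = \frac{b^2u^2 - 2b^2\cos\alpha\,uv + 2a^2\cos\beta\,v - a^2}{a^2 - b^2}. \tag{20}$$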

Substituting this expression for $v^2$ into equation (19) and simplifying to obtain

$$(a^2 - b^2 - c^2)u^2 + 2c^2\cos\alpha\,uv - 2c^2\cos\beta\,v - 2(a^2 - b^2)\cos\gamma\,u + a^2 - b^2 + c^2 = 0.$$

There results

$$v = \frac{b^2 - a^2 - c^2 + (b^2 + c^2 - a^2)u^2 + 2(a^2 - b^2)\cos\gamma\,u}{2c^2(\cos\alpha\,u - \cos\beta)}. \tag{21}$$

Substituting this expression for $v$ into equation (19) produces the fourth order polynomial equation

$$B_4u^4 + B_3u^3 + B_2u^2 + B_1u + B_0 = 0 \tag{22}$$

where [the coefficient expressions $B_4, \ldots, B_0$ are not legible in the source].

[The remainder of Merritt's derivation is only partly legible in the source. It adds terms to both sides of the polynomial equation and chooses $\lambda$ so that the right hand side is a perfect square, which converts the fourth order polynomial into two quadratics.]
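The substitution step described above, in which equation (21) is inserted into equation (19) to obtain a fourth order polynomial in $u$, can be sketched with polynomial arithmetic in Python. This is a minimal illustration under stated assumptions: the quartic is assembled numerically with NumPy rather than from closed-form coefficients, real positive roots are kept, and $s_1$ is recovered from the last expression in equation (5). The function name, tolerances and test points are hypothetical.

```python
import numpy as np

def substitution_p3p(a, b, c, cos_a, cos_b, cos_g):
    """Candidate (s1, s2, s3) triples from substituting equation (21) into (19)."""
    # Equation (21): v = N(u) / D(u); coefficients are listed highest power first.
    N = np.array([b*b + c*c - a*a, 2.0*(a*a - b*b)*cos_g, b*b - a*a - c*c])
    D = np.array([2.0*c*c*cos_a, -2.0*c*c*cos_b])
    # Equation (19) reads R(u) - c^2 v^2 + 2 c^2 cos(alpha) u v = 0.
    R = np.array([a*a - c*c, -2.0*a*a*cos_g, a*a])
    u_poly = np.array([1.0, 0.0])
    # Multiplying (19) by D(u)^2 clears the denominator and leaves a quartic in u.
    quartic = np.polyadd(
        np.polymul(R, np.polymul(D, D)),
        np.polyadd(-c*c*np.polymul(N, N),
                   2.0*c*c*cos_a*np.polymul(u_poly, np.polymul(N, D))))
    solutions = []
    for root in np.roots(quartic):
        if abs(root.imag) > 1e-6 or root.real <= 0.0:
            continue                       # keep real, positive distance ratios only
        u = float(root.real)
        denom = np.polyval(D, u)
        if abs(denom) < 1e-12:             # singular case of equation (21)
            continue
        v = np.polyval(N, u) / denom
        if v <= 0.0:
            continue
        s1 = c / np.sqrt(1.0 + u*u - 2.0*u*cos_g)   # from equation (5)
        solutions.append((s1, u*s1, v*s1))
    return solutions

# Hypothetical test geometry: three known points in camera coordinates.
p1, p2, p3 = np.array([1.0, 0.0, 2.0]), np.array([-1.0, 0.5, 3.0]), np.array([0.2, -1.5, 2.5])
true_s = np.array([np.linalg.norm(p) for p in (p1, p2, p3)])
candidates = substitution_p3p(
    np.linalg.norm(p2 - p3), np.linalg.norm(p1 - p3), np.linalg.norm(p1 - p2),
    p2 @ p3 / (true_s[1] * true_s[2]),
    p1 @ p3 / (true_s[0] * true_s[2]),
    p1 @ p2 / (true_s[0] * true_s[1]))
```

More than one candidate may be returned, since the three point problem can have several valid solutions; an additional observation is needed to select among them.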

Fischler and Bolles' Solution

Fischler and Bolles (1981) were apparently not aware of the earlier American or earlier German solutions to the problem. From equation (5), they obtain

$$\left(1 - \frac{a^2}{b^2}\right)v^2 + 2\left(\frac{a^2}{b^2}\cos\beta - \cos\alpha\,u\right)v + u^2 - \frac{a^2}{b^2} = 0 \tag{23}$$

$$v^2 + 2(-\cos\alpha\,u)v + \left(1 - \frac{a^2}{c^2}\right)u^2 + \frac{2a^2}{c^2}\cos\gamma\,u - \frac{a^2}{c^2} = 0. \tag{24}$$

Equation (23) is identical to equation (6) but equation (24) is different from equation (7) since it arises by manipulating a different pair of equations than was used to obtain equation (6).

Multiply (23) by

$$\left(1 - \frac{a^2}{c^2}\right)u^2 + \frac{2a^2}{c^2}\cos\gamma\,u - \frac{a^2}{c^2},$$

multiply (24) by

$$u^2 - \frac{a^2}{b^2}$$

and subtract to produce

$$\left[(a^2 - b^2 - c^2)u^2 + 2(b^2 - a^2)\cos\gamma\,u + (a^2 - b^2 + c^2)\right]v + 2b^2\cos\alpha\,u^3 + \left(2(c^2 - a^2)\cos\beta - 4b^2\cos\alpha\cos\gamma\right)u^2 + \left[4a^2\cos\beta\cos\gamma + 2(b^2 - c^2)\cos\alpha\right]u - 2a^2\cos\beta = 0. \tag{25}$$

Multiply (24) by

$$1 - \frac{a^2}{b^2}$$

and subtract from (23).

$$2c^2(\cos\alpha\,u - \cos\beta)v + (a^2 - b^2 - c^2)u^2 + 2(b^2 - a^2)\cos\gamma\,u + a^2 - b^2 + c^2 = 0 \tag{26}$$

Finally, multiply (25) by

$$2c^2(\cos\alpha\,u - \cos\beta),$$

multiply (26) by

$$\left[(a^2 - b^2 - c^2)u^2 + 2(b^2 - a^2)\cos\gamma\,u + (a^2 - b^2 + c^2)\right]$$

and subtract to eliminate $v$. This produces the fourth order polynomial equation

$$D_4u^4 + D_3u^3 + D_2u^2 + D_1u + D_0 = 0 \tag{27}$$

where

$$D_4 = 4b^2c^2\cos^2\alpha - (a^2 - b^2 - c^2)^2$$

$$D_3 = -4c^2(a^2 + b^2 - c^2)\cos\alpha\cos\beta - 8b^2c^2\cos^2\alpha\cos\gamma + 4(a^2 - b^2 - c^2)(a^2 - b^2)\cos\gamma$$

$$D_2 = 4c^2(a^2 - c^2)\cos^2\beta + 8c^2(a^2 + b^2)\cos\alpha\cos\beta\cos\gamma + 4c^2(b^2 - c^2)\cos^2\alpha - 2(a^2 - b^2 - c^2)(a^2 - b^2 + c^2) - 4(a^2 - b^2)^2\cos^2\gamma$$

$$D_1 = -8a^2c^2\cos^2\beta\cos\gamma - 4c^2(b^2 - c^2)\cos\alpha\cos\beta - 4a^2c^2\cos\alpha\cos\beta + 4(a^2 - b^2)(a^2 - b^2 + c^2)\cos\gamma$$

$$D_0 = 4a^2c^2\cos^2\beta - (a^2 - b^2 + c^2)^2.$$

Corresponding to each of the four roots of equation (27) for $u$ there is an associated value for $v$ through equation (26) or equation (25).

Grafarend, Lohse, and Schaffrin's Solution

Grafarend, Lohse, and Schaffrin (1989), aware of all the previous work except for the Fischler-Bolles solution, proceed in the following way. They begin with equations (1), (2), and (3) and seek to reduce them to a homogeneous form. After multiplying equation (3) by

$$-\frac{a^2}{c^2}$$

and adding the result to equation (1) there results

$$-\frac{a^2}{c^2}s_1^2 + \left(1 - \frac{a^2}{c^2}\right)s_2^2 + s_3^2 + 2\frac{a^2}{c^2}s_1s_2\cos\gamma - 2\cos\alpha\,s_2s_3 = 0. \tag{28}$$

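After multiplying equation (3) by

$$-\frac{b^2}{c^2}$$

and adding the result to equation (2), there results

$$\left(1 - \frac{b^2}{c^2}\right)s_1^2 - \frac{b^2}{c^2}s_2^2 + s_3^2 + 2\frac{b^2}{c^2}s_1s_2\cos\gamma - 2\cos\beta\,s_1s_3 = 0. \tag{29}$$

Next they use the same idea as Finsterwalder. They multiply equation (29) by $\lambda$ and from it subtract equation (28) to produce

$$(s_1\ s_2\ s_3)\,A\,(s_1\ s_2\ s_3)^T = 0 \tag{30}$$

where, in the scaling used by equation (31),

$$A = \begin{pmatrix} a^2 - \lambda(b^2 - c^2) & (-a^2 + \lambda b^2)\cos\gamma & -\lambda c^2\cos\beta \\ (-a^2 + \lambda b^2)\cos\gamma & a^2 - c^2 - \lambda b^2 & c^2\cos\alpha \\ -\lambda c^2\cos\beta & c^2\cos\alpha & c^2(-1 + \lambda) \end{pmatrix},$$

and seek a value of $\lambda$ which makes the determinant of $A$ zero. Setting the determinant of $A$ to zero produces a cubic for $\lambda$. For this value of $\lambda$ the solution to equation (30) becomes a pair of planes intersecting at the origin.

They let $p = s_2/s_1$ and $q = s_3/s_1$ and rewrite the homogeneous equation (30) in $s_1$, $s_2$, and $s_3$ as a non-homogeneous equation in $p$ and $q$.

$$(a^2 - c^2 - \lambda b^2)p^2 + 2c^2\cos\alpha\,pq + c^2(-1 + \lambda)q^2 + 2(-a^2 + \lambda b^2)\cos\gamma\,p - 2\lambda c^2\cos\beta\,q + a^2 - \lambda(b^2 - c^2) = 0 \tag{31}$$

Now since $|A| = 0$, and assuming

$$\begin{vmatrix} a^2 - c^2 - \lambda b^2 & c^2\cos\alpha \\ c^2\cos\alpha & c^2(-1 + \lambda) \end{vmatrix} \neq 0,$$

a value for $(p_0, q_0)$ exists such that equation (31) can be written in the homogeneous form

$$(a^2 - c^2 - \lambda b^2)(p - p_0)^2 + 2c^2\cos\alpha\,(p - p_0)(q - q_0) + c^2(-1 + \lambda)(q - q_0)^2 = 0 \tag{32}$$

where $(p_0, q_0)$ solves the linear system

$$(a^2 - c^2 - \lambda b^2)p_0 + c^2\cos\alpha\,q_0 = (a^2 - \lambda b^2)\cos\gamma, \qquad c^2\cos\alpha\,p_0 + c^2(-1 + \lambda)q_0 = \lambda c^2\cos\beta.$$

Rotating the coordinate system by angle $\theta$ so that the cross term can be eliminated, $\theta$ must satisfy

$$\tan 2\theta = \frac{2c^2\cos\alpha}{a^2 - \lambda(b^2 + c^2)}. \tag{33}$$

Define the new coordinate $(p', q')$ by

$$p' = (p - p_0)\cos\theta + (q - q_0)\sin\theta, \qquad q' = -(p - p_0)\sin\theta + (q - q_0)\cos\theta. \tag{34}$$

Then

$$A\,p'^2 + B\,q'^2 = 0 \tag{35}$$

where

$$A = \frac{a^2 - 2c^2 + \lambda(c^2 - b^2) \mp \sqrt{(a^2 - \lambda(b^2 + c^2))^2 + (2c^2\cos\alpha)^2}}{2}$$

$$B = \frac{a^2 - 2c^2 + \lambda(c^2 - b^2) \pm \sqrt{(a^2 - \lambda(b^2 + c^2))^2 + (2c^2\cos\alpha)^2}}{2}.$$

We choose the negative square root term for $A$ and the positive square root term for $B$ when the value of $2\theta$ falls in the first and third quadrant, and choose the positive square root term for $A$ and the negative square root term for $B$ when the value of $2\theta$ falls in the second and fourth quadrant.

Assuming $B/A < 0$, (35) results in

$$p' = \pm K q' \tag{36}$$

where $K = \sqrt{-B/A}$. Using (34) there results

$$p[\cos\theta \pm K\sin\theta] + q[\sin\theta \pm K(-\cos\theta)] + [-p_0(\cos\theta \pm K\sin\theta) + q_0(-\sin\theta \pm K\cos\theta)] = 0. \tag{37}$$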

Equation (36) is a function of $\lambda$. For any $\lambda$, equation (36) degenerates into a pair of straight lines intersecting in the $p$, $q$ plane. All possible combinations of any two $\lambda$s out of $\lambda_1$, $\lambda_2$, and $\lambda_3$ will give a real solution for $p$ and $q$. Then we solve for $p$ and $q$. Finally, from equations (1), (2), and (3), $s_1$ is obtained as given in equation (39) [expression not legible in the source].

Linnainmaa, Harwood, and Davis' Solution

Linnainmaa, Harwood, and Davis (1988) give another direct solution. They begin with equations (1), (2), and (3) and make a change of variables.

$$s_2 = u + \cos\gamma\,s_1 \tag{42}$$

$$s_3 = v + \cos\beta\,s_1 \tag{43}$$

Then equations (3) and (2) become

$$u^2 + (1 - \cos^2\gamma)s_1^2 = c^2, \qquad v^2 + (1 - \cos^2\beta)s_1^2 = b^2.$$

Letting

$$q_1 = 1 - \cos^2\gamma$$

$$q_2 = 1 - \cos^2\beta$$

$$q_3 = 2(\cos^2\gamma - \cos\alpha\cos\beta\cos\gamma + \cos^2\beta - 1)$$

$$q_4 = c^2 + b^2 - a^2$$

$$q_5 = 2(\cos\alpha\cos\beta - \cos\gamma) \tag{47}$$

[the remaining definitions, including $q_6$ and the coefficients $r_i$, are not legible in the source]. Then to eliminate the $uv$ term, they square [the resulting expressions and the coefficients $t_i$ are not legible in the source].

and (49) can be solved for two values of $u$ and $v$. Each of these can then be substituted into equations (42) and (43) to obtain the positive solutions for $s_2$ and $s_3$.

Comparisons of the Algebraic Derivations

The main difference between the Grunert solution and the Merritt solution is that they use different pairs of equations. As a result, the coefficients in their fourth order polynomials are different. However, if we replace $b$ with $c$, $c$ with $b$, $\beta$ with $\gamma$, and $\gamma$ with $\beta$ in equation (9), then we can obtain equation (22). Therefore, from the algebraic point of view, their solutions are identical. But Merritt converts the fourth order polynomial into two quadratics instead of solving it directly. The difference between Fischler and Bolles' and Grunert's solution is that the former just multiplies some terms to two pairs of equations and then subtracts each other without expressing one variable in terms of the other.

Grunert and Merritt use the substitution to reduce the two variables into one variable. The advantage of the substitution approach is that it is pretty trivial. But there exists a singular region when the denominator is zero in equations (8) and (21). This is discussed more fully in the next subsection. Fischler and Bolles and Linnainmaa et al. use direct elimination to reduce the variables. Though the approaches are not trivial, they do not generate any singular point during the derivation.

Linnainmaa et al. use $s_2 = u + \cos\gamma\, s_1$ and $s_3 = v + \cos\beta\, s_1$ as the change of variables. Naturally, this leads to another different derivation to the problem. Although we consider equation (52) as a fourth order equation in $s_1^2$, the total complexity of the coefficients is much higher than that of Grunert's fourth order equation.

Finsterwalder and Grafarend et al. introduce the same variable $\lambda$, but they use different approaches to solve for $\lambda$. Finsterwalder solves equation (10) for $v$ and seeks a $\lambda$ to make the term inside the square root be a perfect square. Grafarend et al. actually rewrite the quadratic equations into matrix form $(s_1\ s_2\ s_3)P(s_1\ s_2\ s_3)^t = 0$ and $(s_1\ s_2\ s_3)Q(s_1\ s_2\ s_3)^t = 0$, then try to solve the eigensystem $(s_1\ s_2\ s_3)(P - \lambda Q)(s_1\ s_2\ s_3)^t = 0$, which is another form of equation (30). At this point these two approaches are algebraically equivalent.

Singularity of Solutions

It is well-known that there exist some geometric structures for the three point space resection, on which the resection is unstable (Thompson 1966) or indeterminate (Smith 1965). For the unstable geometric structure, a small change in the position of the center of perspectivity will result in a large change in the position of three vertices. For the indeterminate geometric structure, the position of three vertices cannot be solved. Besides the singularity caused by geometric structures, there also exist some singularities caused by the algebraic derivation of solutions. In the following paragraphs we will give detailed explanations and examples.

The danger cylinder is a typical case for the unstable geometric structure and refers to the geometric structure where the center of perspectivity is located on a circular cylinder passing through the three vertices of a triangle and having its axis normal to the plane of the triangle. An illustration of the danger cylinder is shown in Figure 3.a. The reason for the instability can be explained as follows. Instead of determining the position of three vertices, we fix them and let the coordinates of the center of perspectivity be unknown, $(x, y, z)$, as in the resection problem. Since the problem is mainly to solve the three unknown variables $s_1$, $s_2$, and $s_3$, there is actually no difference between fixing the vertices or the center of perspectivity. Now the values of $s_1$, $s_2$, and $s_3$ are functions of $x$, $y$, $z$. Rewrite equations (1), (2), and (3) into $f_1(x, y, z) = 0$, $f_2(x, y, z) = 0$, and $f_3(x, y, z) = 0$, and take total derivatives. We then have

$$AB \begin{pmatrix} dx \\ dy \\ dz \end{pmatrix} = \begin{pmatrix} df_1 \\ df_2 \\ df_3 \end{pmatrix} \qquad (53)$$

where

$$A = \begin{pmatrix} x - x_1 & y - y_1 & z - z_1 \\ x - x_2 & y - y_2 & z - z_2 \\ x - x_3 & y - y_3 & z - z_3 \end{pmatrix}$$


and $p_i = (x_i, y_i, z_i)^t$, $i = 1, 2, 3$, as defined before are the positions of the three vertices of the triangle in the camera coordinate frame.

To make the system stable the $dx$, $dy$, $dz$ must have no solutions other than zeros; that is, matrices $A$ and $B$ must be non-singular. The determinant of matrix $A$ is proportional to the volume of a tetrahedron formed by the three vertices and the center of perspectivity. As long as these four points are not coplanar the matrix $A$ is nonsingular. When the matrix $B$ is singular, that is, where the determinant of $B$ is zero, we can expand the determinant of $B$ and express $s_1$, $s_2$, $s_3$, $\cos\alpha$, $\cos\beta$, and $\cos\gamma$ in terms of $x$, $y$, $z$. Then we can obtain an equation in $x$, $y$, and $z$.

Determination of the Absolute Orientation

Once the position of the three vertices of the triangle is determined, the transformation function which governs where the 3D camera coordinate system is with respect to the 3D world coordinate system can be calculated.

The problem can be stated as follows. Given three points in the 3D camera coordinate system and their corresponding three points in the 3D world coordinate system, we want to determine a rotation matrix $R$ and translation vector $T$ which satisfy

$$p_i = Rp'_i + T, \qquad i = 1, 2, 3 \qquad (54)$$


where $R$ is a 3 by 3 orthonormal matrix, i.e., $RR^t = I$, and $T = (t_x, t_y, t_z)^t$. The problem can be solved by a linear (Schut 1960), an iterative (Wolf 1974; Slama 1980), or noniterative closed-form solution (Horn 1988). We give a simple linear solution in Appendix I.

Table I. Summary of the characteristics of the six solutions.

Authors                    Features                                                          Algebraic singularity
Grunert 1841               Direct solution, solve a fourth order polynomial                  Yes
Finsterwalder 1903         Form a cubic polynomial and find the roots of two quadratics      Yes
Merritt 1949               Direct solution, solve a fourth order polynomial                  Yes
Fischler and Bolles 1981   Another approach to form a fourth order polynomial                No
Linnainmaa et al. 1988     Generate an eighth order polynomial                               No
Grafarend et al. 1989      Form a cubic polynomial and find intersection of two quadratics   Yes

4 The Experiments

To characterize the numerical sensitivity of the six different 3 point resection solutions we perform experiments. The experiments study the effects of rounding errors and numerical instability of each of these six different solutions in both single and double precision mode. In addition, we examine the relation of the equation manipulation order. This is accomplished by changing the order in which the three corresponding point pairs are given to the resection procedure.

Since singularities and unstable structures exist in the three point perspective pose estimation problem, we wanted to know how often they can happen in the testing data. To isolate these singularities and unstable structures, we ran 100000 experiments on the Grunert solution, because it has both algebraic and geometric singularities. Then we screened out the singular cases by picking those trials whose error is larger than a certain value.

4.1 Test Data Generation

The coordinates of the vertices of the 3D triangle are randomly generated by a uniform random number generator. The ranges of the $x$, $y$, and $z$ coordinates are [-25, 25], [-25, 25], and $[f + a, b]$ respectively. Since the image plane is located in front of the camera at the distance of the focal length, $f$, the $z$ coordinate must be larger than the focal length. So $a > 0$ and $b > f + a$. The $a$ and $b$ are used as parameters to test the solution under different sets of depth. Projecting the 3D spatial coordinates into the image frame we obtain the perspective image coordinates $u$ and $v$.

4.1.1 Permutation of Test Data. To test the numerical stability of each resection technique we permute the order of the three vertices of a triangle and the order of the perspective projection of the 3D triangle vertices. Assume the original order of vertices is 123 for vertex one, vertex two and vertex three, respectively; then the other five permutations are 312, 231, 132, 321, and 213. The permutation of triangle vertices means permuting in a consistent way the 3D triangle side lengths, the 3D vertices and the corresponding 2D perspective projection vertices.

4.2 The Design of Experiments

In this section we will summarize the parameters in the experiments discussed in Appendix II. The experimental procedure will be presented too. The parameters and methods involved in accuracy and picking the best permutation are denoted by


$N_1$ - the number of trials = 10000
$N_2$ - the number of trials = 100000
$P$ - different number of precisions = 2
$d_1$ - the first set of depths along z axis
$d_2$ - the second set of depths along z axis
$S_w = \sum_{i=0}^{n} |S_i|$ - the worst sensitivity value for all coefficients,

where $S_i = \partial x / \partial a_i$ and $P = a_4x^4 + a_3x^3 + a_2x^2 + a_1x + a_0$. The $\epsilon_{wrre_i}$ and $\epsilon_{ware_i}$ are the total relative and absolute rounding errors propagated from the first to the last mathematical operations of each coefficient of the polynomial $P$.

4.2.1 The Design Procedures. The experimental procedures and the characteristics to be studied are itemized as follows:

Step 0. Do the following steps $N$ times.
Step 1. Generate the coordinates of vertices of the 3D triangle.
        $-25 < x_i < 25$ where $i = 1, 2, 3$
        $-25 < y_i < 25$
        For the $z_i$ coordinate there are several sets to be tested.
        1. $d_1 = \{(a, b) \mid (a, b) \in \{(0, 5), (4, 20)\}, f = 1\}$
        2. $d_2 = \{(a, b) \mid (a, b) \in \{(0, 5), (4, 20), (24, 75)\}, f = 1\}$
Step 2. For single and double precision do the resection calculation.
Step 3. Permutation of the vertices. Let the original vertex order be 123 (vertex one, vertex two and vertex three, respectively), then we permute the order as 312, 231, 132, 321, and 213.
Step 4. For each of the resection techniques, determine the location of the 3D vertices if the calculation can succeed.
Step 4.1. For any calculation which has succeeded record the absolute distance error (ADE) associated with each permutation. The mean absolute distance error (MADE) is defined from the ADE of the individual trials, where the ADE compares the calculated point coordinates with the correct generated point coordinates. The error standard deviation is expressed as follows:

$$sd = \sqrt{\frac{\sum_{i=1}^{n} (ADE_i - MADE)^2}{n - 1}}$$

Step 5. This procedure is only applied to Grunert's solution.
Step 5.1. Calculate the sensitivity of zero w.r.t. each coefficient and total sensitivity for all coefficients based on the discussion in A.2.3.
Step 5.2. Calculate the worst absolute and relative rounding errors for each coefficient based on the discussion in A.2.4. The number of significant digits is the same as the mantissa representation of the machine for multiplication and division. For addition and subtraction the possibly lost significant digits in each operation must be checked.


Step 5.3. Calculate the polynomial zero drift.
Step 5.4. Record the values of the sensitivity $S_w$, the normal sensitivity $S_{wn}$, the worst absolute rounding error $\epsilon_{ware}$, the worst relative rounding error $\epsilon_{wrre}$, the worst polynomial zero drift due to absolute rounding error $\epsilon_{sware}$, and the worst polynomial zero drift due to relative rounding error $\epsilon_{swrre}$ for each permutation.
Step 5.5. Based on the smallest value of each measure ($\epsilon_{ware}$, $\epsilon_{wrre}$, $S_w$, $S_{wn}$, $\epsilon_{sware}$, or $\epsilon_{swrre}$), pick the corresponding error generated by the corresponding permutation and accumulate the number of its rank in the six permutations. Rank each permutation in terms of the error associated with the permutation. The rank one is associated with the smallest error and the rank six is associated with the largest error.
Step 6. Check for singular cases. Redo the whole procedure again by changing $N_1$ to $N_2$ and $d_1$ to $d_2$ and use Grunert's solution only. If the largest absolute distance error is greater than $10^{-7}$, redo step 5 and record the corresponding values for the large error cases.

5 Results and Discussion

In this section we discuss the results of the experiments. The software is coded in the C language and the experiments are run on both a Sun 3/280 workstation and a Vax 8500 computer. Unless stated otherwise, the results in the following paragraphs are obtained from the Sun 3/280. Table II shows the results of random permutation of six different solutions. From Table II we find that Finsterwalder's solution (solution two) gives the best accuracy and Merritt's solution gives the worst result. Grunert's solution (solution one), Fischler's solution (solution four) and Grafarend's (solution six) are about the same order and give the second best accuracy. The reasons for the better results can be explained in terms of the order of polynomial and the complexity of computation. Linnainmaa's solution (solution five) generates an eighth order polynomial. Though it doesn't have to solve the eighth order polynomial, the complexity of the coefficients of Linnainmaa's solution is still higher than that of others. Finsterwalder's solution only needs to solve a third order polynomial. The higher order polynomial and higher complexity calculations tend to be less numerically stable. However, Merritt's solution also converts the fourth order polynomial problem into a third order polynomial problem, but it gives a worse result. This is because the conversion process itself is not the most numerically stable. An experiment which directly solves Merritt's fourth order polynomial was conducted. Laguerre's method was used to find the zeros of the polynomial. The results are similar to those of Grunert's solution.

The histogram of the absolute distance errors

Table II. Results of random permutation of six solutions in double precision (D.P.) and single precision (S.P.).

Algorithms               Precision   Mean absolute distance error   Standard deviation
Sol. 1 (Grunert)         D.P.        0.19e-08                       0.16e-06
                         S.P.        0.31e-01                       0.88e-00
Sol. 2 (Finsterwalder)   D.P.        0.22e-10                       0.90e-09
                         S.P.        0.89e-02                       0.51e-01
Sol. 3 (Merritt)         D.P.        0.11e-05                       0.64e-04
                         S.P.        0.28e-01                       4.15e-00
Sol. 4 (Fischler)        D.P.        0.62e-08                       0.59e-06
                         S.P.        0.14e-01                       0.34e-00
Sol. 5 (Linnainmaa)      D.P.        0.74e-07                       0.61e-05
                         S.P.        0.32e-01                       0.82e-00
Sol. 6 (Grafarend)       D.P.        0.46e-08                       0.43e-06
                         S.P.        0.20e-01                       0.75e-01
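The test-data generation of Section 4.1 and the permutation step of Section 4.1.1 can be sketched as follows. This is only an illustration, not the authors' C code: the function names are hypothetical, the pinhole projection $u = fx/z$, $v = fy/z$ is assumed, and the side-length convention ($a$ opposite vertex one, and so on) is an assumption chosen to match the usual law-of-cosines form of equations (1) to (3).

```python
import math
import random

# Vertex orders 123, 312, 231, 132, 321, 213 expressed as zero-based indices.
PERMUTATIONS = [(0, 1, 2), (2, 0, 1), (1, 2, 0), (0, 2, 1), (2, 1, 0), (1, 0, 2)]


def generate_triangle(a, b, f=1.0, rng=random):
    """Draw three vertices with x, y in [-25, 25] and z in [f + a, b]."""
    return [
        (rng.uniform(-25.0, 25.0), rng.uniform(-25.0, 25.0), rng.uniform(f + a, b))
        for _ in range(3)
    ]


def project(vertex, f=1.0):
    """Perspective image coordinates (u, v) of a 3D point (assumed pinhole model)."""
    x, y, z = vertex
    return (f * x / z, f * y / z)


def side_lengths(vertices):
    """Side lengths (a, b, c); a is assumed to be the side opposite vertex one."""
    p1, p2, p3 = vertices
    return (math.dist(p2, p3), math.dist(p1, p3), math.dist(p1, p2))


def permuted_inputs(vertices, f=1.0):
    """Yield (order, side lengths, image points) for the six consistent permutations."""
    for order in PERMUTATIONS:
        permuted = [vertices[i] for i in order]
        yield order, side_lengths(permuted), [project(v, f) for v in permuted]


if __name__ == "__main__":
    triangle = generate_triangle(a=4.0, b=20.0, f=1.0, rng=random.Random(0))
    for order, sides, image_points in permuted_inputs(triangle):
        print(order, sides, image_points)
```

Each yielded tuple is what a resection routine would receive for one permutation, so the six results of a trial can be compared against the known vertices.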


Fig. 4. Histograms of the absolute distance error (ADE) of random permutations, in log scale, for the six solution techniques. Each of the six panels is titled "Histogram generated by random pick", with the ADE in log scale on the horizontal axis.

Table III. The best and the worst mean absolute distance error in single precision.

Algorithms               The best MADE   Standard deviation   The worst MADE   Standard deviation
Sol. 1 (Grunert)         0.10e-03        0.25e-02             0.81e-01         1.45e-00
Sol. 2 (Finsterwalder)   0.74e-04        0.12e-02             0.59e-01         1.71e-00
Sol. 3 (Merritt)         0.17e-02        0.54e-01             1.29e-00         8.53e-00
Sol. 4 (Fischler)        0.87e-04        0.47e-03             0.40e-01         0.47e-00
Sol. 5 (Linnainmaa)      0.16e-02        0.14e-00             0.11e-00         2.16e-00
Sol. 6 (Grafarend)       0.77e-04        0.14e-02             0.94e-01         2.75e-00

(ADE) of 10000 trials are shown in Figure 4. From the histogram of the ADE we can see all the solutions can give an accuracy to the order of $10^{-13}$ in double precision. The populations of high accuracy results of solution one and solution four are larger than that of solution two. But the population of less accurate results for solution one and solution four also is a little bit more than that of solution two.

As we can expect, the double precision calculation gives much better results than the single precision calculation. For single precision most of the solutions give the accuracy of the ADE to the order of $10^{-5}$. Generally speaking, the results of double precision are about $10^{7}$ times better than the results of single precision. In the single precision mode the root finder subroutine fails in several cases and thus brings up the MADE. Therefore, if possible, double precision calculation is recommended for the 3-points perspective projection calculation.

The best MADE and the worst MADE of six permutations for the double precision and the single precision are shown in Table III and Table IV. The best results are about $10^{4}$ times better than the worst results. Finsterwalder's solution, Grunert's solution and Fischler's solution give the same best accuracy.

Because Grunert's solution has the second best accuracy and is easier to analyze, we use it


to demonstrate how analysis methods can discriminate the worst and the best from the six permutations. The analysis methods can be applied to the other solution techniques as well. In the following paragraphs we discuss the results of analysis.

Table IV. The best and the worst mean absolute distance error in double precision.

Algorithms               The best MADE   Standard deviation   The worst MADE   Standard deviation
Sol. 1 (Grunert)         0.41e-12        0.90e-11             0.60e-08         0.26e-06
Sol. 2 (Finsterwalder)   0.34e-12        0.73e-11             0.20e-09         0.51e-08
Sol. 3 (Merritt)         0.26e-10        0.15e-08             0.18e-04         0.13e-02
Sol. 4 (Fischler)        0.69e-12        0.19e-10             0.13e-07         0.69e-06
Sol. 5 (Linnainmaa)      0.35e-11        0.24e-09             0.36e-06         0.23e-04
Sol. 6 (Grafarend)       0.44e-12        0.16e-10             0.88e-08         0.48e-06

Table V. The comparison of the mean absolute distance error of random order, the best and the worst, and the mean absolute distance error picked by the $\epsilon_{ware}$, $\epsilon_{wrre}$, $S_w$, $S_{wn}$, $\epsilon_{swrre}$ and $\epsilon_{sware}$ for two different depths.

                       Mean absolute distance error
Picking methods        Depth 1 < z < 5   Depth 5 < z < 20
Random order           0.19e-08          0.16e-06
The best               0.41e-12          0.19e-11
The worst              0.60e-08          0.87e-08
$\epsilon_{ware}$      0.99e-11          0.34e-09
$\epsilon_{wrre}$      0.40e-08          0.31e-08
$S_w$                  0.15e-08          0.75e-09
$S_{wn}$               0.89e-12          0.58e-11
$\epsilon_{swrre}$     0.90e-12          0.11e-10
$\epsilon_{sware}$     0.93e-12          0.11e-10

For each trial there are six permutations by which the data can be presented to the resection technique. In the controlled experiments where the correct answers are known, the six resection results can be ordered from least error (best pick) to the highest error (worst pick) using the square error distance between the correct 3D position of the triangle vertices and the calculated 3D position of the triangle vertices. The fraction of times each selection technique selects the data permutation giving the best (least) error to the worst (most) error for two different depths are plotted in Figures 5 and 6. The histogram of the absolute distance error of the six selection methods is shown in Figure 7. Figures 5 and 6 show that the drift of zeros is not affected by the absolute error (i.e. WS) or the relative error (i.e. WRRE). The worst sensitivity (i.e. WS) and the worst relative error (i.e. WRRE) do not permit an accurate choice to be made for the picking order. The worst normalized sensitivity produces the best results and can effectively stabilize the calculation of the coefficients of the polynomial.

The absolute drift of polynomial zeros is changed by both the absolute error of coefficients and the sensitivity of the polynomial zero with respect to the coefficients. Thus, the $\epsilon_{sware}$ methods can suppress the probability of picking the worst result from the six permutations. Both the relative error of coefficients and the worst normalized sensitivity of the polynomial zero with respect to the coefficients give the relative drift of the zeros. Hence, the $\epsilon_{swrre}$ method also gives a pretty good accuracy. The comparisons of the MADE of random order, the best and the worst, and the MADE picked by the $\epsilon_{ware}$, $\epsilon_{wrre}$, $S_w$, $S_{wn}$, $\epsilon_{swrre}$ and $\epsilon_{sware}$ for two different depths are shown in Table V. The goal is to achieve the best accuracy. The accuracy of the best permutation is about ten thousand times better than the accuracy obtained by the worst case and the accuracy obtained by choosing a random permutation. The $S_{wn}$, $\epsilon_{sware}$, and $\epsilon_{swrre}$ methods have approximately a half of the accuracy obtained by the best permutation. Any of these three methods can be used to choose a permutation order which gives reasonably good accuracy. However, the worst normalized sensitivity only involves the sensitivity calculation. So it is a good method to quickly pick the right permutation. Although the histograms of probability of $S_{wn}$, $\epsilon_{sware}$, and $\epsilon_{swrre}$ do not have very high value around the


Fig. 5. For each of the six selection techniques (one panel per technique, e.g. "picked by WS", "picked by SWRRE"), the fraction of times the technique selected the data permutation giving the best (least) error to the worst (most) error for all 10000 experimental cases for which the depth z is in the range 1 < z < 5. The horizontal axis of each panel is the kind of pick; #1 indicates the best pick and #6 indicates the worst pick.

Fig. 6. For each of the six selection techniques (one panel per technique, e.g. "picked by WS", "picked by SWRRE"), the fraction of times the technique selected the data permutation giving the best (least) error to the worst (most) error for all 10000 experimental cases for which the depth z is in the range 5 < z < 20. The horizontal axis of each panel is the kind of pick; #1 indicates the best pick and #6 indicates the worst pick.
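The quantity plotted in Figures 5 and 6 can be illustrated with a short sketch. It assumes that, for every trial, the absolute distance error of each permutation and the value of one selection measure for each permutation are already available; the function name `rank_fractions` and the data layout are hypothetical and are not taken from the original software.

```python
def rank_fractions(errors, scores):
    """Fraction of trials in which the selected permutation has rank 1, 2, ...

    errors[t][k] is the absolute distance error of permutation k in trial t.
    scores[t][k] is the value of a selection measure for the same permutation;
    the permutation with the smallest score is the one picked.
    """
    n_perm = len(errors[0])
    counts = [0] * n_perm
    for trial_errors, trial_scores in zip(errors, scores):
        # Permutation chosen by the selection measure (smallest value wins).
        pick = min(range(n_perm), key=trial_scores.__getitem__)
        # Permutations ordered from least error (best pick) to most error.
        by_error = sorted(range(n_perm), key=trial_errors.__getitem__)
        counts[by_error.index(pick)] += 1
    return [count / len(errors) for count in counts]


if __name__ == "__main__":
    # Two toy trials with three permutations each (the paper uses six).
    toy_errors = [[3.0, 1.0, 2.0], [1.0, 2.0, 3.0]]
    toy_scores = [[0.5, 0.1, 0.9], [9.0, 1.0, 5.0]]
    print(rank_fractions(toy_errors, toy_scores))
```

A selection measure is useful when most of the mass of the returned list sits in the first entries, which is what the figures show for the normalized-sensitivity and zero-drift measures.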


best pick, they still have a very accurate absolute distance error compared to the best absolute distance error. This reveals that in many cases the accuracies of the six permutations are too close to be discriminated.

Fig. 7. Histograms of the absolute distance error of the six selection methods in log scale.

In order to study the frequency with which singularities and instabilities may happen we pick the large error cases whose absolute distance error is greater than $10^{-7}$, run more trials and add different depths for Grunert's technique. Around each singularity we find a region within the parameter space leading to large absolute distance errors in the Grunert solution, diverging with decreasing distance to the point of singularity. Because the real singularities may seldom happen in the numerical calculation, in most cases we only have to deal with very large errors in the vicinity of singular points in the parameter space. Because the set of the vicinities of all singularities in the parameter space does not have the full symmetry of the permutation group, we always can find a better parametrization of our experiment. Our task is to define an objective function on the parameter space, which allows us to select a parametrization from the six possible parametrizations, which has the smallest absolute distance error to the exact solution. The results are shown in Table VII.

Table VI and Table VII, whose results are obtained from the Vax 8500 running the VMS operating system, contain the statistics of the absolute distance error of the different selection methods for the three different depth cases, based on the sample of all 100000 experiments in Table VI and based on the subsample of large error cases in Table VII. The sample size for these cases is about 69 for the first depth, about 96 for the second depth and about 495 for the large depth. Table VII shows that the singular cases do not really happen in these experiments because the mean ADE is about $10^{-2}$. However, in the vicinity of singular points the error is much larger compared to that of Table VI. The results in Table VII also show that the selection methods work fine in these cases.

When the experiments of Table VI and Table VII are run on the Sun 3/280, results are similar to those obtained from the VAX 8500 and the magnitude differences in numerical accuracy of results between the two systems are within


Table VI. The same as Table V, but with 100000 trials and three different depths.

                      Depth [1...5]           Depth [5...20]          Depth [25...75]
Picking               MADE       Std. dev.    MADE       Std. dev.    MADE       Std. dev.
Random                2.22e-07   6.58e-05     4.49e-09   6.11e-07     1.44e-07   1.72e-05
Best                  6.69e-12   1.79e-09     2.01e-12   2.41e-10     4.18e-11   3.88e-09
Worst                 1.69e-06   4.39e-04     7.14e-07   1.90e-04     1.61e-04   5.50e-02
$\epsilon_{ware}$     1.06e-08   3.31e-06     4.49e-10   6.24e-08     3.70e-07   1.14e-04
$\epsilon_{wrre}$     1.68e-06   4.39e-04     6.31e-07   1.88e-04     1.83e-06   3.81e-04
$S_w$                 5.99e-09   1.33e-06     5.98e-07   1.88e-04     1.34e-08   2.13e-06
$S_{wn}$              9.18e-12   1.88e-09     3.76e-12   3.55e-10     2.43e-10   3.89e-08
$\epsilon_{sware}$    7.64e-12   1.80e-09     3.66e-12   4.17e-10     1.21e-10   1.18e-08
$\epsilon_{swrre}$    7.57e-12   1.80e-09     4.16e-12   4.53e-10     1.21e-10   1.17e-08
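The $S_w$ rows in these tables rest on the sensitivity of a polynomial zero with respect to the coefficients, as discussed in A.2.3. The sketch below assumes the standard result of that differentiation, $S_i = \partial x/\partial a_i = -x^i / P'(x)$ evaluated at the zero, and sums the absolute values to obtain $S_w$. The function names are illustrative only, and the normalized measure $S_{wn}$ is not reproduced because its definition is not legible in this text.

```python
def zero_sensitivities(coeffs, zero):
    """Sensitivities S_i = dx/da_i of a simple zero of P(x) = sum(a_i * x**i).

    coeffs is [a_0, a_1, ..., a_n]. Differentiating P(a, x(a)) = 0 with
    respect to a_i gives x**i + P'(x) * dx/da_i = 0, hence the formula below.
    """
    degree = len(coeffs) - 1
    dp_dx = sum(i * coeffs[i] * zero ** (i - 1) for i in range(1, degree + 1))
    return [-(zero ** i) / dp_dx for i in range(degree + 1)]


def worst_sensitivity(coeffs, zero):
    """Worst sensitivity S_w: the sum of |S_i| over all coefficients."""
    return sum(abs(s) for s in zero_sensitivities(coeffs, zero))


if __name__ == "__main__":
    # P(x) = x^2 - 3x + 2 = (x - 1)(x - 2), evaluated at the zero x = 2.
    coefficients = [2.0, -3.0, 1.0]
    print(zero_sensitivities(coefficients, 2.0))
    print(worst_sensitivity(coefficients, 2.0))
```

For this small example a perturbation $h$ of $a_0$ moves the zero at 2 to approximately $2 - h$, which agrees with $S_0 = -1$.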

Table VII. The same as Table VI, but considering only the large error cases.

                      Depth [1...5]           Depth [5...20]          Depth [25...75]
Picking               MADE       Std. dev.    MADE       Std. dev.    MADE       Std. dev.
Random                1.35e-05   6.35e-05     4.03e-06   2.57e-05     1.22e-04   2.43e-03
Best                  7.23e-08   5.16e-07     2.43e-09   1.64e-08     1.37e-08   2.38e-07
Worst                 1.18e-04   5.78e-04     2.59e-03   2.52e-02     5.43e-02   1.20e-00
$\epsilon_{ware}$     6.76e-07   4.30e-06     2.94e-07   1.21e-06     9.56e-06   2.08e-04
$\epsilon_{wrre}$     6.02e-05   3.18e-04     2.58e-03   2.52e-02     1.29e-04   2.43e-03
$S_w$                 6.21e-06   4.70e-05     1.14e-07   5.25e-07     1.09e-04   2.42e-03
$S_{wn}$              5.42e-07   4.23e-06     8.02e-09   4.49e-08     2.18e-08   2.97e-07
$\epsilon_{sware}$    5.20e-07   4.23e-06     7.93e-09   4.43e-08     1.78e-08   2.87e-07
$\epsilon_{swrre}$    5.20e-07   4.23e-06     8.01e-09   4.49e-09     1.73e-08   2.87e-07

an order of one except in worst cases with depth [5...20] and depth [25...75] and in the $S_w$ case with depth [5...20], whose magnitude differences are an order of two and three, respectively.

6 Conclusions

We have reviewed the six solutions of the three point perspective pose estimation problem from a unified perspective. We gave the comparisons of the algebraic derivations among the six solutions and observed the situations in which there may be numerical instability and indeterminate solutions. We ran hundreds of thousands of experiments to analyze the numerical stability of the solutions. The results show that the Finsterwalder solution gives the best accuracy, about $10^{-10}$ in double precision and about $10^{-2}$ in single precision. We have shown that the use of different pairs of equations and change of variables can produce different numerical behaviors. We have described an analysis method to almost always produce a numerically stable calculation for the Grunert solution. The analysis method described here can pick the solution's accuracy about $0.9 \times 10^{-12}$, which is very close to the best accuracy $0.41 \times 10^{-12}$ that can be achieved by picking the best permutation each trial and about a thousand times better than $0.19 \times 10^{-8}$, which is achieved by picking the random permutation.

Acknowledgment

The authors wish to thank the reviewers for their helpful suggestions.

Appendix I: A Simple Linear Solution for the Absolute Orientation

Let us restate the problem. Given three points in the 3D camera coordinate system and their


corresponding three points in the 3D world coordinate system, we want to determine a rotation matrix $R$ and translation vector $T$ which satisfy

$$p_i = Rp'_i + T, \qquad i = 1, 2, 3 \qquad (a.1)$$

where

$$R = \begin{pmatrix} r_{11} & r_{12} & r_{13} \\ r_{21} & r_{22} & r_{23} \\ r_{31} & r_{32} & r_{33} \end{pmatrix}$$

Then, equation (a.1) is an underconstrained system of 9 equations in 12 unknowns. However, as stated in Ganapathy (1984) those unknowns in the rotation matrix are not independent. There exist some constraints as follows

$$r_{11}^2 + r_{12}^2 + r_{13}^2 = r_{21}^2 + r_{22}^2 + r_{23}^2 = r_{31}^2 + r_{32}^2 + r_{33}^2 = 1$$

$$r_{13} = r_{21}r_{32} - r_{22}r_{31}$$
$$r_{23} = r_{12}r_{31} - r_{11}r_{32} \qquad (a.2)$$
$$r_{33} = r_{11}r_{22} - r_{12}r_{21}$$

Since the three vertices of the triangle are coplanar, with the constraints above we can assume $z'_i = 0$, $i = 1, 2, 3$. Thus, equation (a.1) can be written as

$$x_i = r_{11}x'_i + r_{12}y'_i + t_x$$
$$y_i = r_{21}x'_i + r_{22}y'_i + t_y \qquad i = 1, 2, 3$$
$$z_i = r_{31}x'_i + r_{32}y'_i + t_z$$

In terms of matrix form we have

$$AX = B$$

Appendix II: The Numerical Accuracy of the Solutions

A.1 The Problem Definition

In general, all the solutions given in Section 3 can be used to solve the three point perspective resection problem. However, the behavior of the numerical calculations is different for the different solution techniques. Furthermore, for each solution technique the numerical behavior will be different when the order of the equation manipulation or variables is different. For example, if we let $s_1 = us_2$ and $s_3 = vs_2$ instead of $s_2 = us_1$ and $s_3 = vs_1$, then the coefficients of equation (9) will be changed. These changes can be reflected by replacing $a$ with $b$, $b$ with $a$, $\alpha$ with $\beta$ and $\beta$ with $\alpha$. As a result, it may affect the numerical accuracy of the final results. The order of the equation manipulation combined with choosing different pairs of equations for substitution can produce six different numerical behaviors for each solution. To simulate these effects we preorder the 2D perspective projection and the corresponding 3D points in the six different possible permutations.

In this appendix we describe some analysis methods that can be used to determine the numerical stability of the solutions and truly aid in

determining a good order in which to present the three corresponding point pairs to the resection procedure.

A.2 The Analysis: A Rounding Error Consideration

There are several sensitivity measures which can be used. They include the numerical relative and absolute errors, and the drift of polynomial zeros. We are mainly concerned about how the manipulation order affects the rounding error propagation and the computed roots of the polynomial. Since both the absolute rounding error and the relative rounding error may affect the final accuracy, we consider both factors. The sensitivity analysis focuses on the roots of the polynomial formed by the three-point perspective solutions. In contrast, the polynomial zero drift considers both the errors and the sensitivity of the polynomial zero. However, all factors can affect the numerical results. Each of these measures will be used to predict sensitivity in terms of the mean absolute error.

A.2.1 The Effect of Significant Digits. In this analysis all computations are conducted in both single precision and double precision for the six techniques. The quantity measured is the mean absolute distance error for each precision.

A.2.2 The Histogram of the Mean Absolute Distance Error. The histogram analysis will give the distribution of the absolute distance error. A technique may give a large number of highly accurate results, but produce a few large errors due to degenerate cases; others may give accurate results to all trials without any degenerate cases. The analysis of the histogram of the errors will help us to discriminate the techniques which are uniformly good from those which are only good sometimes.

A.2.3 The Sensitivity Analysis of Polynomial Zeros. The global accuracy is affected by the side lengths, the angles at the center of perspectivity with respect to side lengths, and the permutation order in which the input data is given. These effects will appear in the coefficients of the computed polynomial and affect the stability of the zeros of the polynomial. For an ill-conditioned polynomial a small change in the value of a coefficient will dramatically change the location of one or more zeros. This change will then propagate to the solution produced by the 3 point perspective resection technique. The sensitivity of the zeros of a polynomial with respect to a change in the coefficients is best derived by assuming the zero location is a function of the coefficients (Vlach and Singhal 1983). Thus for the $j$-th zero $x_j$ of the polynomial $P(a_0, a_1, \ldots, a_n, x) = a_nx^n + a_{n-1}x^{n-1} + \cdots + a_1x + a_0$ we represent

$$P(a_0, a_1, \ldots, a_n, x(a_0, a_1, \ldots, a_n))\big|_{x = x_j} = 0$$

Differentiating with respect to $a_i$ gives

$$\frac{\partial P}{\partial a_i} + \frac{\partial P}{\partial x}\frac{\partial x}{\partial a_i} = 0$$

Rearranging the equation gives

$$S_i = \frac{\partial x}{\partial a_i} = -\frac{\partial P/\partial a_i}{\partial P/\partial x}\bigg|_{x = x_j}$$

where $a_0, a_1, \ldots, a_n$ are the coefficients of the polynomial and $x_j$ is the $j$-th zero of the polynomial. Consider the total sensitivity, $S$, of all the coefficients on a particular zero. We have

$$S = \sum_{i=0}^{n} S_i$$

To avoid the cancellation among positive and negative terms, we take the absolute value of each term and consider the worst case. We express the worst sensitivity $S_w$ by

$$S_w = \sum_{i=0}^{n} |S_i|$$

A large sensitivity of the zero with respect to the coefficients may lead to a large error in the final result. Laguerre's method is used to find the zeros of the polynomial. It has the advantage of first extracting the zeros with small absolute values to better preserve accuracy in the deflation of the polynomial and can converge to a complex

APPENDIX A 63 Haralick, Lee, Ottenberg, and Nolle zero from a real initial estimate. The accuracy higher order terms are very small, thus they are for the iterative stop criterion is the rounding omitted.

error of the machine.
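As a rough illustration of the sensitivity measures in A.2.3, the following minimal sketch evaluates the sensitivity of a zero to each coefficient and sums the absolute values. It assumes the polynomial is written in ascending powers, P(x) = a_0 + a_1 x + ... + a_n x^n, so that the partial derivative of P with respect to a_i is x^i, and it assumes the zero is simple. The function names and the test polynomial are hypothetical and chosen only for this example.

```python
def poly_eval(coeffs, x):
    """Evaluate P(x) = sum(coeffs[i] * x**i) by Horner's rule."""
    result = 0.0
    for a in reversed(coeffs):
        result = result * x + a
    return result


def poly_derivative(coeffs):
    """Coefficients of dP/dx in the same ascending-power convention."""
    return [i * a for i, a in enumerate(coeffs)][1:]


def zero_sensitivities(coeffs, zero):
    """Return dx/da_i at x = zero, i.e. -zero**i / P'(zero), for every i.

    Assumes a simple zero, so P'(zero) is non-zero.
    """
    slope = poly_eval(poly_derivative(coeffs), zero)
    return [-(zero ** i) / slope for i in range(len(coeffs))]


def worst_sensitivity(coeffs, zero):
    """Sum of the absolute sensitivities over all coefficients."""
    return sum(abs(s) for s in zero_sensitivities(coeffs, zero))


# Hypothetical test polynomial (x - 1)(x - 2)(x - 3) = -6 + 11x - 6x^2 + x^3.
coeffs = [-6.0, 11.0, -6.0, 1.0]
for zero in (1.0, 2.0, 3.0):
    print(zero, zero_sensitivities(coeffs, zero), worst_sensitivity(coeffs, zero))
```

For this test polynomial the zero at 2 has sensitivities 1, 2, 4 and 8, giving a worst sensitivity of 15, while the zero at 1 gives only 2. The same coefficient error therefore moves some zeros much further than others.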

A.2.4 The Numerical Stability. Discussions in most numerical books show how calculations involving finite-digit arithmetic can lead to significant errors in some circumstances. For example, the division of a finite-digit result by a small number, i.e., multiplying by a relatively large number, is numerically unstable. Another example is the subtraction of large and nearly equal numbers, which can produce an unacceptable rounding error. In order to study how large an absolute rounding error can be produced by the mathematical operations, we will calculate the worst absolute and relative error for each kind of arithmetic operation. Let $fl$ be the floating point mathematical operator. Hence, the rounding error produced by $fl$ on two numbers which themselves have rounding error or truncation error (Wilkinson 1963) can be modeled as follows:

$$\begin{aligned}
fl(x_1 + x_2) &= \bigl(x_1(1 + e_{x_1}) + x_2(1 + e_{x_2})\bigr)(1 + e_r) = (x_1 + x_2)\left(1 + e_r + \frac{x_1 e_{x_1} + x_2 e_{x_2}}{x_1 + x_2}\right) \\
fl(x_1 - x_2) &= \bigl(x_1(1 + e_{x_1}) - x_2(1 + e_{x_2})\bigr)(1 + e_r) = (x_1 - x_2)\left(1 + e_r + \frac{x_1 e_{x_1} - x_2 e_{x_2}}{x_1 - x_2}\right) \\
fl(x_1 \times x_2) &= x_1 x_2 (1 + e_{x_1})(1 + e_{x_2})(1 + e_r) = x_1 x_2 \,(1 + e_r + e_{x_1} + e_{x_2}) \\
fl\!\left(\frac{x_1}{x_2}\right) &= \frac{x_1(1 + e_{x_1})}{x_2(1 + e_{x_2})}(1 + e_r) = \frac{x_1}{x_2}\,(1 + e_r + e_{x_1} - e_{x_2})
\end{aligned}$$

$$-0.5 \times 10^{1-d} \le e_r \le 0.5 \times 10^{1-d}$$

where $d$ is the number of significant digits of $fl(x_1 + x_2)$; $e_r$ is the relative error introduced by each operation; the relative errors of $x_1$ and $x_2$ are $e_{x_1}$ and $e_{x_2}$ respectively, and these are propagated from the previous operations. The higher order terms are very small, thus they are omitted.

DEFINITION. A sequence $\{OP_1, OP_2, \ldots, OP_{n-1}\}$ of binary mathematical operators from the class of addition, subtraction, multiplication and division applied to a series of numbers $(x_1, x_2, \ldots, x_n)$ two at a time is given as follows:

$$OP_i(x_i, x_{i+1})\, f(e_{x_i}, e_{x_{i+1}}, e_r) = \mathcal{E}\,(1 + e_{total})$$

where $f$ is a function of $e_{x_i}$, $e_{x_{i+1}}$ and $e_r$, $\mathcal{E}$ is the result of the operation assuming infinite precision computation, and $e_{total}$ is the total relative error propagated from the first operation to the last operation. Hence, $\mathcal{E}(1 + e_{total})$ is the result of the calculation using finite precision. Similarly, $e_{x_i}$ is the relative error of $x_i$; $e_{x_{i+1}}$ is the relative error of $x_{i+1}$.

We consider the worst case for each operation, i.e., $e_r = 0.5 \times 10^{1-d}$. Thus, the worst relative rounding error ($e_{wrre}$) is expressed by

$$e_{wrre} = |e_{total}|$$

and the worst absolute rounding error ($e_{ware}$) is given by

$$e_{ware} = |\mathcal{E} \times e_{total}|$$

The $e_{ware}$ and $e_{wrre}$ will be accumulated for each of the coefficients. As in the sensitivity of zero section, we expect a large relative or absolute error to lead to a large final error.

A.2.5 Polynomial Zero Drift. The zero sensitivity helps us to understand how a permutation of the polynomial coefficients affects the zeros. The worst relative and absolute errors provide a quantitative measurement of errors. The drift of a polynomial zero from its correct value depends on both sensitivity and error variation. In this paragraph we will give the definition of polynomial zero drift. Define the normalized sensitivity of a zero with respect to a coefficient by

$$\frac{\partial x / x}{\partial a_i / a_i} = \frac{a_i}{x}\,\frac{\partial x}{\partial a_i}$$

and the function $x$ by

$$x\big|_{x = x_j} = x(a_0, a_1, \ldots, a_n)$$
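The normalized sensitivity defined above relates a relative change in one coefficient to the resulting relative change in a zero. The sketch below checks that reading numerically: it perturbs one coefficient by a small relative error and compares the observed relative shift of the zero with the first-order prediction. It assumes an ascending-power polynomial P(x) = a_0 + a_1 x + ... + a_n x^n and a simple zero. A plain Newton iteration stands in for the root finder purely to keep the example short (the text itself uses Laguerre's method). All names and the test polynomial are hypothetical.

```python
def poly_eval(coeffs, x):
    """Evaluate P(x) = sum(coeffs[i] * x**i) by Horner's rule."""
    result = 0.0
    for a in reversed(coeffs):
        result = result * x + a
    return result


def poly_slope(coeffs, x):
    """Evaluate dP/dx at x by Horner's rule on i * coeffs[i]."""
    result = 0.0
    for i in range(len(coeffs) - 1, 0, -1):
        result = result * x + i * coeffs[i]
    return result


def normalized_sensitivity(coeffs, zero, i):
    """(a_i / x) * dx/da_i at x = zero, using dx/da_i = -x**i / P'(x)."""
    return (coeffs[i] / zero) * (-(zero ** i) / poly_slope(coeffs, zero))


def refine_zero(coeffs, start, steps=50):
    """Newton iteration from a nearby start (stand-in root finder)."""
    x = start
    for _ in range(steps):
        x -= poly_eval(coeffs, x) / poly_slope(coeffs, x)
    return x


# Hypothetical test polynomial (x - 1)(x - 2)(x - 3) = -6 + 11x - 6x^2 + x^3.
coeffs = [-6.0, 11.0, -6.0, 1.0]
zero, i, rel_err = 2.0, 2, 1e-6

# Apply a relative error rel_err to coefficient a_i only.
perturbed = list(coeffs)
perturbed[i] *= 1.0 + rel_err

predicted = normalized_sensitivity(coeffs, zero, i) * rel_err
observed = (refine_zero(perturbed, zero) - zero) / zero
print(predicted, observed)
```

In this example the normalized sensitivity of the zero at 2 with respect to a_2 is -12, so a relative coefficient error of one part in a million is expected to shift the zero by roughly twelve parts in a million. The observed shift agrees up to second-order terms.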

Then, the worst normalized sensitivity ($S_{wn}$) is

$$S_{wn} = \sum_{i=0}^{n} \left|\frac{a_i}{x}\,\frac{\partial x}{\partial a_i}\right|_{x = x_j}$$

and for the worst relative drift case due to relative rounding error we have

$$e_{swrre} = \sum_{i=0}^{n} \left|\frac{a_i}{x}\,\frac{\partial x}{\partial a_i}\right|_{x = x_j} \times e_{wrre}^{(i)}$$

where $e_{wrre}^{(i)}$ is the worst relative rounding error accumulated for the coefficient $a_i$. As discussed above, the final error is expected to be in proportion to the value of the worst drift $e_{swrre}$ and $e_{sware}$.

References

E. Church, "Revised Geometry of the Aerial Photograph," Syracuse University Press, Bulletin 15, Syracuse, NY, 1945.

E. Church, "Theory of Photogrammetry," Syracuse University Press, Bulletin 19, Syracuse, NY, 1948.

M. Dhome, M. Richetin, J. T. Lapreste and G. Rives, "The Inverse Perspective Problem from a Single View for Polyhedra Location," IEEE Conference on Computer Vision and Pattern Recognition, 1988, pp. 61-68.

S. Finsterwalder and W. Scheufele, "Das Rückwärtseinschneiden im Raum," Sebastian Finsterwalder zum 75. Geburtstage, Verlag Herbert Wichmann, Berlin, Germany, 1937, pp. 86-100.

M. A. Fischler and R. C. Bolles, "Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography," Graphics and Image Processing, Vol. 24, No. 6, 1981, pp. 381-395.

S. Ganapathy, "Decomposition of Transformation Matrices ..."

S. Linnainmaa, D. Harwood, and L. S. Davis, "Pose Estimation of a Three-Dimensional Object Using Triangle Pairs," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 10, No. 5, 1988, pp. 634-647.

P. Lohse, "Dreidimensionaler Rückwärtsschnitt: Ein Algorithmus zur Streckenberechnung ohne Hauptachsentransformation," Geodätisches Institut, Universität Stuttgart, 1989.

D. G. Lowe, "Three-Dimensional Object Recognition from Single Two-Dimensional Images," Artificial Intelligence, Vol. 31, 1987, pp. 355-395.

E. L. Merritt, "Explicit Three-Point Resection in Space," Photogrammetric Engineering, Vol. XV, No. 4, 1949, pp. 649-655.

E. L. Merritt, "General Explicit Equations for a Single Photograph," Analytical Photogrammetry, Pitman Publishing Corporation, New York, USA, pp. 43-79.

F. J. Müller, "Direkte (exakte) Lösung des einfachen Rückwärtseinschneidens im Raume," Allgemeine Vermessungs-Nachrichten, 1925.

P. H. Schönemann and R. M. Carroll, "Fitting One Matrix to Another Under Choice of a Central Dilation and a Rigid Body Motion," Psychometrika, 35(2):245-255, June 1970.

P. H. Schönemann, "A Generalized Solution of the Orthogonal Procrustes Problem," Psychometrika, 31(1):1-10, March 1966.

G. H. Schut, "On Exact Linear Equations for the Computation of the Rotational Elements of Absolute Orientation," Photogrammetria 16(1), 1960, pp. 34-37.

C. C. Slama, ed., "Manual of Photogrammetry," American Society of Photogrammetry, Falls Church, Virginia, 1980.

A. D. N. Smith, "The Explicit Solution of Single Picture Resection Problem with a Least Squares Adjustment to Redundant Control," Photogrammetric Record, Vol. V, No. 26, October 1965, pp. 113-122.

E. H. Thompson, "Space Resection: Failure Cases," Photogrammetric Record, Vol. V, No. 27, April 1966, pp. 201-204.

R. Y. Tsai, "A Versatile Camera Calibration Technique for High-Accuracy 3D Machine Vision Metrology Using Off-the-Shelf TV Cameras and Lenses," IEEE Journal of Robotics and Automation, Vol. RA-3, No. 4, 1987, pp. 323-344.

J. Vlach and K. Singhal, "Computer Methods for Circuit Analysis and Design," 1983.

J. H. Wilkinson, "Rounding Errors in Algebraic Processes," H.M. Stationery Office, London, 1963.

P. R. Wolf, Elements of Photogrammetry, McGraw Hill, New York, USA, 1974.

W. J. Wolfe, D. Mathis, C. W. Sklair, and M. Magee, "The Perspective View of Three Points," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 13, No. 1, 1991, pp. 66-73.
