KARTZ MICHAEL W (US)
FLATH LAURENCE M (US)
I Claim:
1. A geo-registration method comprising:
obtaining digital image data of an image source using a camera located
a distance D from a center field of view (CFOV) of the image source, where f is
the focal length of a lens of the camera and D ≫ f;
obtaining inertial navigation system (INS) data of the camera associated
with the digital image data, said INS data including camera position data and
camera attitude data including roll, pitch, and heading;
loading the digital image data into a GPU of a computer system to be
processed thereby;
in the GPU of the computer system:
calculating relative angles and distances between the camera
and the image source from said INS data;
performing geometry correction of the digital image data
using the calculated relative angles and distances;
performing geo-rectification of the geometrically corrected
digital image data using an affine homogenous coordinate
transformation, to produce a geo-rectified digital image data;
performing a rotation transformation about a z-axis of the
geo-rectified digital image data to remove α_heading; and
performing geo-registration with the heading-adjusted and
geo-rectified digital image data.
2. The geo-registration method of claim 1,
wherein the step of calculating relative angles and distances includes
calculating the relative altitude, Z, of the camera with respect to the CFOV,
and α_pitch of the camera.
3. The geo-registration method of claim 1,
wherein the step of performing geometry correction includes
converting measurement units into uniform units of measure.
4. The geo-registration method of claim 1,
wherein the step of performing geometry correction includes
performing a rotation transformation on the digital image data to offset any
α_roll about the optical axis.
5. The geo-registration method of claim 1,
wherein the affine homogenous transformation equation is
represented by a 4 × 4 homogeneous coordinate matrix with elements:
m0 = D/f; m5 = D²/(Zf); m7 = G/(Zf); m1,2,3,4,6,8,9,11,12,13,14 = 0; and m10,15 = 1.
6. The geo-registration method of claim 1,
wherein the step of transferring the digital image data into a GPU
includes transferring the digital image data into high-speed video random-
access memory (VRAM) of the GPU.
7. The geo-registration method of claim 1,
wherein the step of performing geo-registration includes removing jitter
due to INS data uncertainty by:
geo-registering a small segment of the heading-adjusted and geo-
rectified digital image data;
performing Fourier transform and inverse transform of two
image segments;
determining a position of a correlation peak having the highest
intensity, which identifies the appropriate relative positions of the
two images; and
applying the relative position shift as part of the geo-registration
step of the complete image.
8. The geo-registration method of claim 1, wherein
the step of calculating relative angles and distances includes calculating
the relative altitude, Z, of the camera with respect to the CFOV, and
calculating α_pitch of the camera;
the step of performing geometry correction includes converting
measurement units into uniform units of measure;
the step of performing geometry correction includes performing a
rotation transformation on the digital image data to offset any α_roll about the
optical axis; and
the affine homogenous transformation equation is represented by a
4 × 4 homogeneous coordinate matrix with elements:
m0 = D/f; m5 = D²/(Zf); m7 = G/(Zf); m1,2,3,4,6,8,9,11,12,13,14 = 0; and m10,15 = 1;
9. The geo-registration method of claim 8,
wherein the step of converting measurement units includes using a
scaling transformation matrix.
10. The geo-registration method of claim 8,
wherein the step of performing geo-rectification includes calculating D
and G from Z and α_pitch.
11. The geo-registration method of claim 8,
wherein the step of performing geo-rectification includes calculating D
and G from the camera position data.
12. The geo-registration method of claim 8,
wherein the step of transferring the digital image data into a GPU
includes transferring the digital image data into high-speed video random-
access memory (VRAM) of the GPU.
13. The geo-registration method of claim 8,
wherein the step of performing geo-registration includes removing jitter
due to INS data uncertainty by:
geo-registering a small segment of the heading-adjusted and geo-
rectified digital image data;
performing Fourier transform and inverse transform of two
image segments;
determining a position of a correlation peak having the highest
intensity, which identifies the appropriate relative positions of the
two images; and
applying the relative position shift as part of the geo-registration
step of the complete image.
14. An article of manufacture comprising:
a computer usable medium having computer readable program code
means embodied therein for geo-registering digital image data of an image
source using a GPU of a computer, said digital image data obtained using a
camera located a distance D from a center field of view (CFOV) of the image
source, where f is the focal length of a lens of the camera and D ≫ f, and said
digital image data associated with inertial navigation system (INS) data of the
camera including camera position data and camera attitude data, the computer
readable program code means in said article of manufacture comprising:
computer readable program code means for causing the GPU to
calculate relative angles and distances between the camera and the
image source from said INS data;
computer readable program code means for causing the GPU to
perform geometry correction of the digital image data using the
calculated relative angles and distances; computer readable program code means for causing the GPU to
perform geo-rectification of the geometrically corrected digital image
data using an affine homogenous coordinate transformation, to produce
a geo-rectified digital image data;
computer readable program code means for causing the GPU to
perform a rotation transformation about a z-axis of the geo-rectified
digital image data to remove α_heading; and
computer readable program code means for causing the GPU to
perform geo-registration with the heading-adjusted and geo-rectified
digital image data.
15. The article of manufacture of claim 14, wherein
the computer readable program code means for causing the GPU to
calculate relative angles and distances between the camera and the image
source includes computer readable program code means for causing the GPU
to calculate the relative altitude, Z, of the camera with respect to the CFOV,
and calculate α_pitch of the camera;
the computer readable program code means for causing the GPU to
perform geometry correction includes computer readable program code means
for causing the GPU to convert measurement units into uniform units of
measure; the computer readable program code means for causing the GPU to
perform geometry correction includes computer readable program code means
for causing the GPU to perform a rotation transformation on the digital image
data to offset any α_roll about the optical axis; and
the affine homogenous transformation equation is represented by a
4 × 4 homogeneous coordinate matrix with elements:
m0 = D/f; m5 = D²/(Zf); m7 = G/(Zf); m1,2,3,4,6,8,9,11,12,13,14 = 0; and m10,15 = 1;
16. The article of manufacture of claim 15,
wherein the computer readable program code means for causing the
GPU to perform geo-registration with the heading-adjusted and geo-rectified
digital image data includes computer readable program code means for
causing the GPU to remove jitter due to INS data uncertainty by geo-
registering a small segment of the heading-adjusted and geo-rectified digital
image data; performing Fourier transform and inverse transform of two image
segments; determining a position of a correlation peak having the highest
intensity, which identifies the appropriate relative positions of the two
images; and applying the relative position shift as part of the geo-registration
step of the complete image.
REAL-TIME GEO-REGISTRATION OF IMAGERY USING COTS GRAPHICS PROCESSORS
[0001] The United States Government has rights in this invention pursuant to
Contract No. W-7405-ENG-48 between the United States Department of Energy
and the University of California for the operation of Lawrence Livermore National
Laboratory.
I. CLAIM OF PRIORITY IN PROVISIONAL APPLICATION
[0002] This application claims the benefit of U.S. provisional application No.
60/651,839 filed February 9, 2005, entitled, "Real-Time Geo-Registration of
Imagery Using Commercial Off-the-Shelf Graphics Processors" by Laurence M.
Flath et al.
II. FIELD OF THE INVENTION
[0003] The present invention relates to geo-registration methods, and more
particularly relates to a method of performing real-time geo-registration of high-
resolution digital imagery using existing graphics processing units (GPUs) already
available with current personal computers.
III. BACKGROUND OF THE INVENTION
[0004] A wide range of sensor technologies (visible, infrared, radar, etc.) and a
wide variety of platforms (mountaintops, aircraft, satellite, etc) are currently used
to obtain imagery of planetary surfaces. Geo-registration is the process of
mapping imagery obtained from such various sources into predetermined
planetary coordinates and conditions, i.e. calibrating/ correlating such image data
to the real world so as to enable, for example, the determination of absolute
positions (e.g. GPS (global positioning system) coordinates), distances, etc. of
features found in the image. However, in order to overlay the images from the
various types of cameras (sensor fusion), the image data from all of the disparate
sources must first be modified into a common coordinate system. Geo-
rectification is the process of converting or otherwise transforming an image (e.g.
an off -axis image) recorded from an arbitrary position and camera orientation,
into one that appears as a nadir view, i.e. a view from directly above the
scene/ object/ features of interest looking straight down, as at a map. Geo-
rectification thus enables various images to share the same orthogonal perspective
so that they can be geo-registered and correlated to/ against each other or a
reference image.
[0005] Image processing, however, is often computationally expensive; the
number of image processing computations necessary to perform geo-rectification
of off-axis high-resolution images is typically very large even for small source
images, requiring significant computational resources and making real-time
visualization of live data difficult. Real-time image data processing is therefore
typically managed as a trade off between image size (number of pixels) and data
rate (frames per second).
[0006] Current techniques for fast image rendering and geo-registration either
employ software only for post processing of data, or require expensive custom
hardware i.e. dedicated pixel processors, even for relatively low-resolution source
data. Software-only techniques, for example, can perform the image
transformation calculations necessary for geo-registration on the central
processing unit (CPU) of a computer or workstation. Due to inadequate memory
bandwidth, however, these methods typically take 2-3 seconds per mega-pixel of
image data, even with currently available high-end workstations preventing such
software only methods from performing in real-time.
[0007] And the custom hardware approach typically utilizes dedicated pixel
processors, which are specially designed graphics cards (printed circuit boards)
and software capable of high throughputs which enable real-time performance of
image transformations and geo-registration. For example, one particular custom
hardware/ dedicated pixel processor for performing real time geo-registration,
known as Acadia™, is commercially available from Sarnoff/ Pyramid Vision
Technologies. This representative custom device, however, operates at a low
resolution with RS-170 quality video, which is ~ 640 x 480 pixels at 30 Hz
Moreover, there is a high cost for such custom hardware and the programming
time to custom-configure such hardware. For example, such custom dedicated
pixel processors typically cost in the tens of thousands of dollars for the hardware
alone, and an additional cost ranging up to $100K for the configuration of the
software.
[0008] What is needed therefore is a digital image processing methodology for
performing geo-registration that is faster (real time streaming), more cost effective,
and with higher resolution than software-only techniques or the use of expensive
custom hardware/ dedicated pixel processors.
IV. SUMMARY OF THE INVENTION
[0009] One aspect of the present invention includes a geo-registration method
comprising: obtaining digital image data of an image source using a camera
located a distance D from a center field of view (CFOV) of the image source,
where f is the focal length of a lens of the camera and D ≫ f; obtaining inertial
navigation system (INS) data of the camera associated with the digital image data,
said INS data including camera position data and camera attitude data including
roll, pitch, and heading; loading the digital image data into a GPU of a computer
system to be processed thereby; in the GPU of the computer system: calculating
relative angles and distances between the camera and the image source from said
INS data; performing geometry correction of the digital image data using the
calculated relative angles and distances; performing geo-rectification of the
geometrically corrected digital image data using an affine homogenous coordinate
transformation, to produce a geo-rectified digital image data; performing a
rotation transformation about a z-axis of the geo-rectified digital image data to
remove α_heading; and performing geo-registration with the heading-adjusted and
geo-rectified digital image data.
[0010] Another aspect of the present invention includes an article of manufacture
comprising: a computer usable medium having computer readable program code
means embodied therein for geo-registering digital image data of an image source
using a GPU of a computer, said digital image data obtained using a camera
located a distance D from a center field of view (CFOV) of the image source,
where f is the focal length of a lens of the camera and D ≫ f, and said digital
image data associated with inertial navigation system (INS) data of the camera
including camera position data and camera attitude data, the computer readable
program code means in said article of manufacture comprising: computer
readable program code means for causing the GPU to calculate relative angles and
distances between the camera and the image source from said INS data; computer
readable program code means for causing the GPU to perform geometry
correction of the digital image data using the calculated relative angles and
distances; computer readable program code means for causing the GPU to
perform geo-rectification of the geometrically corrected digital image data using
an affine homogenous coordinate transformation, to produce a geo-rectified
digital image data; computer readable program code means for causing the GPU
to perform a rotation transformation about a z-axis of the geo-rectified digital
image data to remove α_heading; and computer readable program code means for
causing the GPU to perform geo-registration with the heading-adjusted and geo-
rectified digital image data.
[0011] The present invention is directed to a geo-registration method for
performing transformations of high-resolution digital imagery that operates in
real time, is faster and typically produces higher resolution images than software
only or custom hardware/ dedicated pixel processor techniques, and is less
expensive than such existing techniques by using commercial off-the-shelf (COTS)
image processing hardware already found in most current personal computers
(PCs). In particular, the method and system of the invention performs image
transformations using a COTS graphics processing unit (GPU) typically found in a
conventional PC rather than the main central processing unit (CPU) of the
computer as is often done with existing geo-registration technology. Personal
computers with graphics cards typically cost less than $3K.
V. BRIEF DESCRIPTION OF THE DRAWINGS
[0012] The accompanying drawings, which are incorporated into and form a part
of the disclosure, are as follows:
[0013] Figure 1 is a schematic illustration of first-order transverse imaging.
[0014] Figure 2 is a schematic perspective view of an exemplary camera system
imaging the ground from a remote camera platform at an off-axis angle.
[0015] Figure 3 is an enlarged schematic view of an image of the rectangular
ground patch of Figure 2 that is projected onto the camera's focal plane for non-
zero α_pitch.
[0016] Figure 4 is a schematic side view of the arrangement shown in Figure 2
illustrating the mapping of y-coordinates from ground to camera focal plane.
[0017] Figure 5 is a schematic side view of the arrangement shown in Figure 4
illustrating the trigonometric relationship for relating y_g in terms of y_c.
[0018] Figure 6 is a general flowchart of the geo-registration method of a preferred
embodiment of the present invention.
[0019] Figure 7 is a detailed flowchart of the geo-registration method of a
preferred embodiment of the present invention showing the specific operations
performed by the GPU.
[0020] Figure 8 is a code snippet of exemplary source code of the method of the
present invention.
[0021] Figure 9 is a screen shot showing the transformation of a colored square
using an exemplary embodiment of the present invention.
[0022] Figure 10 is a screen shot showing the transformation of a texture-mapped
square using an exemplary embodiment of the present invention.
[0023] Figure 11 is a screen shot of an "Experiment View" mode of an exemplary
embodiment of the present invention using INS data and camera/ lens
characteristics to recreate the flight pattern of an aircraft carrying a camera system
used in the present invention.
[0024] Figure 12 is a screen shot of a source image prior to geo-registration using
an exemplary embodiment of the present invention.
[0025] Figure 13 is a screen shot of the source image of Figure 12, after performing
geo-rectification and geo-registration with an exemplary embodiment of the
present invention.
[0026] Figure 14 is a schematic illustration of the Radeon™ 9700 video card, a
representative GPU known in the prior art.
VI. DETAILED DESCRIPTION
A. Theory of Geo-registration
[0027] In order to describe the geometric transformations involved in geo-
rectification, Figure 1 shows the simple case of first-order transverse imaging of a
planar object surface that is perpendicular to the optical axis, onto the real image
plane of a camera. In particular, Figure 1 shows a camera system, represented by
lens 10, located at a position C(x, y, z) in space, and staring in a nadir-looking
direction along an optical axis 13 at a point G(x, y, 0) on the ground plane 11. The
distance along the optical axis 13 between the lens 10 and the point G of the planar
object is shown as D_g. If the focal length of the lens is f, and the distance between
the lens 10 and a real image plane is D_c (not shown), then the thin lens equation is:
1/f = 1/D_c + 1/D_g (1)
Furthermore, if we assume a perfect thin lens (i.e. neglecting the effects of optical
aberrations) and assume that D_g ≫ f, then the real image distance D_c is
approximately equal to the focal length f, as follows:
D_c = f·D_g/(D_g - f) ≈ f (2)
As such, the real image plane and the focal plane are considered the same (shown
at reference character 12), and ground objects are projected as a real image on the
focal plane. As used herein and in the claims, the image plane and focal plane are
therefore treated as equivalents and used interchangeably. In this manner, ground
objects along the x_g-axis are projected onto the camera's focal plane 12 in first-
order fashion, which is identical to the setup and operation of an ideal pinhole
camera. Based on this approximation, a similar-triangle relationship is obtained,
as shown in Figure 1, between objects and images on the x-axis. This is expressed
as:
x_c / x_g = f / D_g (3)
where x_g is the dimension of an object in the x_g-axis of the ground plane, and x_c is
the corresponding dimension of the image projection in the x_c-axis of the focal
plane 12.
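For illustration only (this fragment and its function name are ours, not the patent's exemplary source code), the first-order projection of Equation (3) can be sketched as:

```python
def project_nadir(x_g, f, d_g):
    """First-order pinhole projection of Equation (3):
    x_c / x_g = f / D_g, valid when D_g >> f."""
    return x_g * f / d_g

# A 10 m ground feature imaged from 1000 m through a 50 mm lens
# projects to 0.5 mm on the focal plane.
x_c = project_nadir(10.0, 0.050, 1000.0)
```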
[0028] In contrast to the example of Figure 1, a more general imaging scenario may
be considered of a remote mobile camera system imaging a ground plane at an
oblique ("off-axis") angle (see Figures 2-5). Generally, several quantities define
the projection of the off-axis ground plane to the camera focal plane. The first
three quantities are the relative position (X, Y, Z) of the camera with respect to the
center field of view (CFOV). In aviation parlance, X, Y, and Z are the relative
longitude, latitude, and altitude, respectively. The next three quantities involve the
camera orientation or attitude, which include the camera rotation angles
corresponding to heading (α_heading), pitch (α_pitch), and roll (α_roll). It is appreciated
that α_heading is characterized as the angular deviation from a reference direction
(e.g. North) due to rotation about a Z-axis orthogonal to the X-Y ground plane,
α_pitch is the angle the optical axis makes with the X-Y ground plane, and α_roll is the
angular deviation of the camera from a reference direction due to rotation about
the optical axis. And lastly, optical system characteristics, such as the lens focal
length, f, the focal plane array (FPA) size, and FPA pixel pitch determine the
magnification of the camera's optical system. It is appreciated that images
captured by the camera will be uniquely associated with the camera positions and
attitudes at the time of image acquisition.
[0029] It is notable that since it is impossible to map an arbitrary 3-D surface onto
a 2-D image plane without distortion, many types of planetary-scale projections
exist to prioritize a particular property. For example, Mercator gives straight
meridians and parallels, Azimuthal-Equidistant gives equal areas, etc. All such
projections require a 'warping' of the source data and are, at best, an
approximation of reality. However, for the imaging applications considered here
involving mapping a localized area, the earth's curvature can essentially be
ignored. As such, the transforming of imagery from a recorded first perspective
into another for geo-registration is possible with only the position and
orientation/ attitude information of the sensor (i.e. camera) platform. To this end,
a global positioning system (GPS) locator is typically used to provide the camera
position data (accurate to better than a few meters) anywhere on the globe, and
inertial measurement hardware (i.e. an inertial measurement unit, "IMU") known in
the art, is used to provide the camera attitude, e.g. roll, pitch, and heading.
Together, the GPS and IMU are characterized as an inertial navigation system
("INS").
[0030] In the general imaging scenario, camera roll (α_roll) about the optical axis
will cause the off-axis projected image to appear as an asymmetric quadrilateral,
making direct calculations of the corner positions rather involved. The problem
can be simplified by performing two basic coordinate transformations. First, by
viewing along a ground line from the camera's position to the CFOV, the relative
positions X and Y of the camera can be reduced to a single quantity G (where
G = √(X² + Y²)), and the three camera orientation angles (α_heading, α_pitch, α_roll) are also
reduced to a single pitch angle, α_pitch. Second, by rotating the image to remove the
roll component, α_roll, rectangular ground regions map to symmetrical
trapezoidal areas on the camera focal plane, and vice-versa (where symmetrical
trapezoidal ground regions map to a rectangular area on the camera focal plane,
as shown in Figures 11 and 12). Deriving the relationship between the ground and
the camera focal plane then becomes straightforward geometry, and geo-
rectification may be subsequently performed.
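The first coordinate transformation above can be sketched numerically; this is an illustrative fragment (the function name and the use of atan2 are our assumptions, not the patent's code):

```python
import math

def relative_geometry(X, Y, Z):
    """Collapse the camera-relative position (X, Y, Z) into the ground
    distance G to the CFOV, the slant range D, and the single remaining
    pitch angle (radians), per the first transformation described above."""
    G = math.hypot(X, Y)            # G = sqrt(X^2 + Y^2)
    D = math.hypot(G, Z)            # slant range from camera to CFOV
    alpha_pitch = math.atan2(Z, G)  # tan(alpha_pitch) = Z / G
    return G, D, alpha_pitch

# A scaled 5-12-13 check: X = 300 m, Y = 400 m, Z = 1200 m
# gives G = 500 m and D = 1300 m.
G, D, alpha = relative_geometry(300.0, 400.0, 1200.0)
```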
[0031] Figures 2-5 illustrate the simplified general imaging scenario described
above with the coordinate transformation on G and the rotational transformation
on a roll already made. As such, Figure 2 shows the image source as a rectangular
patch 22 of the ground plane, and Figure 3 shows the projected image 23 as a
symmetrical trapezoidal area. The remote system camera is represented as lens 20
located at an altitude Z from the ground plane, and a distance G from the CFOV as
viewed along a ground line. The optical axis 21 of the camera is oblique to the
rectangular patch 22 at a pitch angle α_pitch to intersect the rectangular patch 22 at
the CFOV. The distance between the lens 20 and the CFOV is labeled D in Figures
4 and 5, and the projected image of the rectangular patch 22 onto the real image
plane is indicated at reference character 23 in Figures 2 and 3. Similar to the
discussion of Figure 1, here too the optical imaging system is treated like a pinhole
camera, i.e. Z, G ≫ f.
[0032] Figure 3 shows an enlarged view of the projected image 23 of the
rectangular ground patch on the camera's focal plane for non-zero α_pitch. Because
of the perspective effect, objects closer to the camera appear larger than those that
are far away. Therefore, assuming a point on the ground having coordinates (x_g,
y_g) is projected onto the focal plane at coordinates (x_c, y_c), the coordinate
transformation from x_g into x_c is not necessarily the same as from y_g into y_c. In
particular, the projected x_c coordinate will be dependent on both the x_g and y_g
ground coordinates (due to the perspective effect), whereas the projected y_c
coordinate will be dependent only on the y_g ground coordinate, since all points for
a given y_g on the ground map onto the same y_c line on the focal plane, regardless
of the x_g ground coordinate. The simplest means to derive this relationship is to
note that the camera has a single-point perspective of the scene, as shown in
Figure 3. The equation of a line drawn through the single-point is:
y_c = -(y_∞/x_F)·x_c + y_∞ (4)
where x_F is the x_c-intercept given by Equation (3), and y_∞ represents the point where
all object points at an infinite distance will image.
[0033] To relate the ground x-coordinate (x_g) of the ground patch 22 to the camera
image plane coordinates (x_c), and the ground y-coordinate (y_g) of the ground patch
to the camera image plane coordinates (y_c), the quantities y_∞ and x_F are
computed. In particular, x_F is computed using Equation (3) as follows:
x_F = (f/D)·x_g (5)
And y_∞ is computed using the trigonometric arrangement illustrated in Figure 4,
as follows:
y_∞ = -f·tan α_pitch = -f·Z/G (6)
Substituting Equations (5) and (6) into Equation (4):
y_c = (DZ/(G·x_g))·x_c - f·Z/G (7)
and rearranging gives the transformation equation between x_g and x_c:
x_g = (D/f)·x_c / (1 + (G/(Zf))·y_c) (8)
[0034] Similarly, the ground y-coordinate (y_g) is calculated as it relates to the y-
coordinate (y_c) of the camera focal plane using the trigonometric arrangement
illustrated in Figure 5, as follows:
sin α_pitch = y_L/y_g ⟹ y_L = y_g·sin α_pitch (9)
cos α_pitch = (D - D_1)/y_g ⟹ D_1 = D - y_g·cos α_pitch (10)
Substituting Equations (9) and (10) into a similar-triangle equation similar to
Equation (3):
y_c = f·y_L/D_1 = f·y_g·sin α_pitch / (D - y_g·cos α_pitch) (11)
and rearranging gives:
y_g = (D²/(Zf))·y_c / (1 + (G/(Zf))·y_c) (12)
It is notable that the -y_g/+y_c set of triangles in Figure 5 was chosen for ease of
description; a derivation using the +y_g/-y_c set of triangles is slightly more
involved, but yields the same result.
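Equations (8) and (12) can be exercised with a short numeric sketch (an illustration only, under the sign conventions reconstructed above; the function name is ours):

```python
def focal_plane_to_ground(x_c, y_c, f, Z, G):
    """Map focal-plane coordinates (x_c, y_c) to ground coordinates
    (x_g, y_g) via Equations (8) and (12), with D^2 = G^2 + Z^2."""
    D2 = G * G + Z * Z
    D = D2 ** 0.5
    w = 1.0 + (G / (Z * f)) * y_c   # shared perspective divisor
    return (D / f) * x_c / w, (D2 / (Z * f)) * y_c / w

# On the CFOV row (y_c = 0) the mapping reduces to the first-order
# magnification D/f of Equation (3).
x_g, y_g = focal_plane_to_ground(0.001, 0.0, f=0.05, Z=1200.0, G=500.0)
```

Note how both coordinates share the divisor 1 + (G/(Zf))·y_c; this is exactly what the single homogeneous-coordinate matrix of Section B exploits.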
B. COTS Graphics Processing Units (GPUs)
[0035] The graphics subsystems in currently available personal computers are
designed to rapidly perform 3-D calculations of extremely complex scenes in real-
time, and are especially well-suited for performing perspective transformations
like those in Equations 8 and 12. Mostly due to demand by the gaming industry,
the GPUs found on the top-of-the-line video cards, such as for example ATI™,
nVidia™, and Intel™, contain more transistors than the main CPU of a PC, and
utilize the same state-of-the-art semiconductor fabrication/ design rules. When
perspective calculations are performed on a general-purpose CPU, the division
and square-root operators are the most time-consuming and temporally non-
deterministic computations, not to mention the possibility of a divide-by-zero
exception, which would halt the process. This is not the case with GPUs, which
are specifically designed for performing complex arithmetic operators such as
inverse square root, clamping, and homogeneous coordinates, as well as
employing affine transformations for shift, scale, rotation, and shear, and point
operators for value scaling. Figure 14 shows a schematic illustration of the
Radeon™ 9700 video card commercially available from ATI Technologies, Inc.
(Ontario, Canada), illustrating some of the many sophisticated data processing
components and functions available in COTS GPUs.
[0036] It is appreciated that GPUs can be an adapter, i.e. a removable expansion
card in the PC, or can be an integrated part of the system board. Basic
components of a GPU include a video chip set for creating the signals to form an
image on a screen; some form of random access memory (RAM), such as EDO,
SGRAM, SDRAM, DDR, VRAM, etc. as a frame buffer where the entire screen
image is stored; and a display interface, such as a RAMDAC (digital/ analog)
through which signals are transmitted to be displayed on screen. The digital
image transferred into RAM is often called a texture map, and is applied to some
surface in the GPU's 3-D world. Preferably, the digital image is transferred into
the GPU's dedicated high-speed video random access memory (VRAM). It is
appreciated that VRAM is fast memory designed for storing the image to be
displayed on a computer's monitor. VRAM may be built from special memory
integrated circuits designed to be accessed sequentially. The VRAM may be dual
ported in order to allow the display electronics and the CPU to access it at the
same time.
[0037] Perhaps one downside of using GPUs, however, directly relates to the
relative immaturity of available low-level development/ coding tools (assembly
level). To achieve high-performance custom code, GPUs must be programmed in
assembly language without the availability of debuggers. Fortunately, high-level
application programming interfaces (APIs) to the GPU are provided by the
manufacturers of COTS video cards. Using languages, such as for example
OpenGL and DirectX, the vast majority of complex 3-D operations are performed
transparently in the graphics subsystem hardware so that outputs are fully
defined, known, and predictable given a set of inputs. At the high-level API, 3-D
perspective transformations make use of homogeneous coordinates, as suggested
in OpenGL Programming Guide, Third Edition, Addison Wesley, 1999, Appendix F
(pp 669-674). In this system, all coordinates are scale-invariant:

(x, y, z, w) = (kx, ky, kz, kw), k != 0 (13)

and can be used to represent scaled coordinates in 3-D:

(x, y, z, w) -> (x/w, y/w, z/w) Homogeneous -> 3-D World (14)
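The scale-invariance of Equation 13 and the homogeneous-to-world mapping of Equation 14 can be illustrated with a minimal Python sketch (illustrative only; the patent's implementation uses OpenGL):

```python
# Homogeneous coordinates are scale-invariant: (x, y, z, w) and
# (kx, ky, kz, kw) name the same 3-D point, which is recovered by
# dividing each component through by w.

def to_world(p):
    """Map a homogeneous 4-vector (x, y, z, w) to 3-D world coordinates."""
    x, y, z, w = p
    return (x / w, y / w, z / w)

p = (2.0, 4.0, 6.0, 2.0)
k = 5.0
q = tuple(k * c for c in p)   # same point at a different overall scale

assert to_world(p) == to_world(q) == (1.0, 2.0, 3.0)
```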
A completely general affine transform in homogeneous coordinates may thus be
implemented via a vector-matrix multiplication:

(x', y', z', w') = M (x, y, z, w), where M is the 4x4 matrix with elements m0 - m15 (15)

where the matrix with elements m0 - m15 is initialized as an identity matrix. In order to
implement Equations 8 and 12, a geo-registration matrix is constructed by scaling
three elements of an identity matrix as follows:

m0 = D/f, m5 = D^2/(Z*f), m13 = -G/(Z*f) (16)

and setting a zero value or a value of 1 for the remaining elements of the identity
matrix, as follows:

m1,2,3,4,6,7,8,9,11,12,14 = 0, m10,15 = 1 (17)
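The matrix construction of Equations 16 and 17 can be sketched as follows. This is an illustrative Python stand-in for the patent's OpenGL code, not the implementation itself; the row-major element numbering m0 - m15, the helper names, and the sample values for Z, D, G, and f are all assumptions for illustration.

```python
def georect_matrix(Z, D, G, f):
    """Build the 4x4 geo-registration matrix of Equations 16-17
    (row-major element numbering m0..m15 assumed)."""
    m = [0.0] * 16
    m[0] = D / f              # x scale
    m[5] = D * D / (Z * f)    # y scale
    m[13] = -G / (Z * f)      # perspective term: w' depends on y
    m[10] = 1.0
    m[15] = 1.0
    return m

def apply(m, v):
    """Multiply the 4x4 (row-major) matrix by a homogeneous column
    vector, then divide through by the resulting w'."""
    out = [sum(m[4 * r + c] * v[c] for c in range(4)) for r in range(4)]
    w = out[3]
    return (out[0] / w, out[1] / w)

# Example values (assumed): altitude 1000 m, slant range 1250 m,
# ground range 750 m, focal length 0.05 m.
m = georect_matrix(Z=1000.0, D=1250.0, G=750.0, f=0.05)
```

Note that a point at the center field of view (x = y = 0) maps to the origin unchanged, while points with nonzero y are rescaled by the y-dependent w', which produces the trapezoidal keystone correction.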
C. Operation of Geo-Registration Method Using COTS GPU
[0038] Figure 6 shows a generalized flowchart of the geo-registration algorithm of
the present invention based on camera location, camera attitude, and optical
system characteristics. First a source image is obtained along with INS data, as
indicated at block 60, via the optical imaging (camera) platform and INS. Then the
angles and distances, such as for example Z, D, G, and α_pitch, are calculated from INS
data at block 61. It is appreciated that depending on which method induces the
least error, D and G may be calculated directly from Z and α_pitch (via trigonometric
relations) or by other means, such as GPS position data. Geometry correction of
the image is then performed, such as image flipping and distortion correction at
block 62. Geo-rectification is then performed at block 63 by mapping perspective
equations to a homogeneous coordinate transformation using the earlier
calculated values (e.g. Z, D, G, f) as discussed with respect to Equation 16, and
propagating source pixels produced from the coordinate transformation to an
output image plane. At this point, interpolation and anti-aliasing may be
performed, and blank patches may be filled-in. And finally, geo-registration is
performed at block 64 to register (or spatially correlate) the geo-rectified image to
a known target or previous imagery. Preferably, the registration region is
rendered by reading pixels, performing correlation on the CPU and providing
feedback shifts to the transformation matrix to shift the result to sub-pixel
resolution and thereby remove jitter caused by GPS uncertainty or IMU drift. And
the entire transformed image is rendered onscreen. In the present invention,
blocks 61-64 are preferably performed by a conventional GPU in a PC to realize
the benefits previously described.
[0039] Figure 7 shows a more detailed flow chart of a preferred embodiment of the
present invention, illustrating the transfer of the digital image data into the GPU
and the specific steps performed thereby. As shown at block 70, digital image
data and INS data are first obtained as previously discussed. At block 71, the
digital source image data and INS data are moved/loaded into a conventional
GPU of a PC for image processing. Subsequent processing steps 73-79 are shown
inside region 72 as being steps performed by the GPU. Preferably the image is
loaded into the GPU's dedicated high-speed VRAM of the GPU as a texture map
applied to some surface in the GPU's 3-D world.
[0040] Blocks 73 and 74 represent two calculation steps which are based on GPS
position data obtained by the GPS locator. In particular, the relative altitude, Z, of
the camera is calculated with respect to the CFOV at block 73, and the pitch angle,
α_pitch, is calculated at block 74. Depending on the INS's coordinate transformation
model, the calculation of α_pitch may be a simple substitution or require a complex
set of trigonometric calculations involving heading, pitch, and roll.
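On a flat-earth model, the quantities computed at blocks 73 and 74 follow from simple trigonometry. The sketch below is illustrative (the variable names and the depression-angle convention are assumptions, not the patent's code):

```python
import math

# Flat-earth sketch of the block 73-74 quantities: relative altitude Z
# above the CFOV, ground range G, slant range D, and a pitch angle
# derived from them by simple trigonometric relations.

def camera_geometry(cam_alt, cfov_alt, ground_range):
    Z = cam_alt - cfov_alt            # relative altitude above the CFOV
    G = ground_range                  # horizontal distance to the CFOV
    D = math.hypot(G, Z)              # slant range: D^2 = G^2 + Z^2
    alpha_pitch = math.atan2(Z, G)    # depression angle toward the CFOV
    return Z, D, G, alpha_pitch

Z, D, G, a = camera_geometry(cam_alt=3000.0, cfov_alt=0.0, ground_range=4000.0)
```

As the text notes, D and G could equally be derived from Z and α_pitch, or taken from GPS position data, depending on which path induces the least error.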
[0041] Since it is unlikely that the GPU's 3D world, GPS coordinates, pixel spacing,
etc. will have identical measurement units, block 75 shows the step of converting
units, which may be considered a part of the geometry correction step 62 of Figure
6. This potential problem can be addressed by using a scaling transformation
matrix for unit conversion. It is notable, however, that care must be taken when
operating on numbers of vastly different orders-of-magnitude. The high-level
language may use double-precision floating point, but current GPUs typically
only support 16- or 24-bit numbers. Thus, in the alternative, unit conversion scaling
may be propagated into the geo-rectification transformation matrix of equations
15-17.
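The unit-conversion step of block 75 amounts to a diagonal scaling matrix, and folding that scale into the geo-rectification matrix can be done as an ordinary matrix product. A minimal sketch, assuming row-major 4x4 matrices and an illustrative metres-to-world-units factor:

```python
# Sketch of block 75: a diagonal scaling matrix converts e.g. metres
# into the GPU's world units. Because current GPUs carry only 16- or
# 24-bit precision, it can be safer to fold this scale into the
# geo-rectification matrix on the CPU (at double precision) than to
# combine numbers of vastly different magnitudes on the GPU.

def scale_matrix(s):
    """Row-major 4x4 uniform scale of x, y, z; w left untouched."""
    return [s, 0, 0, 0,
            0, s, 0, 0,
            0, 0, s, 0,
            0, 0, 0, 1]

def matmul(a, b):
    """Row-major 4x4 matrix product a @ b."""
    return [sum(a[4 * r + k] * b[4 * k + c] for k in range(4))
            for r in range(4) for c in range(4)]

metres_to_world = scale_matrix(0.001)   # 1 world unit = 1 km (assumed)
```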
[0042] Another aspect of the geometry correction step includes removing the roll,
α_roll, of the camera at block 76, i.e. offsetting any angular deviation from a
reference direction caused by rotation of the camera about the optical axis. This is
accomplished by performing a rotation transformation on the source image.
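The roll-removal rotation can be sketched as follows (illustrative Python, not the patent's OpenGL code; the row-major convention and helper names are assumptions):

```python
import math

# Sketch of roll removal: rotate the source image by -alpha_roll about
# the optical (z) axis so the angular deviation introduced by camera
# roll is offset.

def rotate_z(alpha):
    """Row-major 4x4 rotation about the z axis by alpha radians."""
    c, s = math.cos(alpha), math.sin(alpha)
    return [c, -s, 0, 0,
            s,  c, 0, 0,
            0,  0, 1, 0,
            0,  0, 0, 1]

def transform(m, v):
    """Apply a row-major 4x4 matrix to a homogeneous 4-vector."""
    return [sum(m[4 * r + c] * v[c] for c in range(4)) for r in range(4)]

def remove_roll(alpha_roll, v):
    """Undo a roll of alpha_roll applied to homogeneous point v."""
    return transform(rotate_z(-alpha_roll), v)
```

The same rotation machinery applies later when the heading angle is removed about the z-axis of the geo-rectified image.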
[0043] Geo-rectification is then performed at block 77 by calculating m0, m5,
and m13 from the previously calculated values for Z, D, G, and f, and calculating
the transformation matrix of equations (16) and (17) above, to produce a geo-
rectified image.
[0044] At block 78, the heading angle, α_heading, is removed to offset and adjust for
any angular deviation from a reference direction. This is accomplished by
performing a rotation transformation about the z-axis of the geo-rectified image
such that the result is oriented with a reference direction as 'up' (such as North).
As with the pitch angle, removing the heading angle, α_heading, may involve a
transformation from the INS unit's coordinate system.
[0045] And in block 79, geo-registration is performed for spatial correlation, and a
registration region is rendered. It is notable that this step is highly sensitive to
errors in the determination of α_pitch. On the scale of most geo-registration
applications, the uncertainty in position of GPS (< 10 m) has very little effect on
the calculation results. Unfortunately, this cannot be said for the angular data.
Depending on the INS hardware, heading may drift up to several degrees per hour.
Even so, as long as Z >> f, the result of small angular errors is a transverse offset in
the resulting geo-rectified image. Correlation, or more sophisticated
morphological techniques may be required to stabilize the imagery to a reference
position. Note that GPUs are very efficient at translating an image and can
automatically perform the anti-aliasing required for non-integer shifts.
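The anti-aliasing a GPU performs for a non-integer shift amounts to interpolating between neighbouring samples. A 1-D linear-interpolation sketch (illustrative only; a GPU would use its texture filtering hardware in 2-D):

```python
import math

# Sketch of a sub-pixel translation: shift a row of samples by a
# possibly non-integer amount, linearly interpolating between the two
# nearest source samples; taps outside the row read as 0.

def subpixel_shift(row, shift):
    out = []
    for i in range(len(row)):
        x = i - shift                  # source position for output sample i
        i0 = math.floor(x)
        t = x - i0                     # fractional part
        a = row[i0] if 0 <= i0 < len(row) else 0.0
        b = row[i0 + 1] if 0 <= i0 + 1 < len(row) else 0.0
        out.append((1 - t) * a + t * b)
    return out
```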
[0046] Thus, when performing the geo-registration step, a preferred embodiment
of the invention additionally provides jitter control of the generated image. In this
enhancement, a small segment of the image is geo-registered, and after a process
involving Fourier transforms and inverse transforms of two image segments, the
position of the correlation "peak" (i.e. the point of highest intensity light) in the
resulting image is discerned. This identifies the appropriate relative positions of
the two images, as a relative position shift. This relative position shift is then
applied as part of a geo-registration of the complete image. The advantage of this
approach is that the jittering of an image can be stabilized far more quickly than if
the correlation technique had been applied to the entire image.
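The jitter-control idea of correlating a small segment and reading off the peak can be illustrated directly. The patent describes an FFT-based correlation; the 1-D direct-correlation sketch below is an equivalent (if slower) stand-in, with illustrative names and data:

```python
# Correlate a small segment of the new frame against a reference and
# return the shift at which the correlation peaks; that shift is then
# applied to the full-image transformation matrix to remove jitter.
# The returned shift s satisfies seg[i - s] ~ ref[i].

def correlation_shift(ref, seg, max_shift):
    best_shift, best_score = 0, float("-inf")
    for s in range(-max_shift, max_shift + 1):
        score = sum(ref[i] * seg[i - s]
                    for i in range(len(ref))
                    if 0 <= i - s < len(seg))
        if score > best_score:
            best_shift, best_score = s, score
    return best_shift

ref = [0, 0, 1, 5, 1, 0, 0, 0]
seg = [0, 0, 0, 1, 5, 1, 0, 0]   # same feature, one sample to the right
```

Because only a small segment is correlated, the stabilization is far cheaper than correlating the whole frame, which is exactly the advantage the text describes.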
[0047] Additionally, while the method of the present invention may assume a
flat-earth model (as previously discussed), it may also, in the alternative, incorporate
digital elevation maps (DEMs) (which are available for much of the earth's
surface) into the image projection process by 'draping' the texture map (source
image data) over a 3-D surface built from DEM data.
D. Example Software Implementation
[0048] The procedure outlined in the previous section C. has been implemented by
Applicants in an exemplary embodiment using a custom MacOS X application for
the user interface, and OpenGL for the GPU code, collectively the "software."
Figure 7 shows a screen shot showing a snippet of exemplary source code of the
software representing the display routine used by OpenGL to perform the steps
discussed above for geo-rectification. The signs of m5 and m13 are opposite to those
described in Equation 16 due to the coordinate system of the texture map.
Algebraic manipulation of m5 leads to an alternate equation, namely:

m5 = (1/f)(G^2/Z + Z) (18)
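The equivalence behind Equation 18 follows from the flat-earth relation D^2 = G^2 + Z^2, so that D^2/(Z*f) = (1/f)(G^2/Z + Z). A quick numerical check (sample values are illustrative):

```python
import math

# Verify that the two forms of m5 agree: the D-based form of
# Equation 16 and the G-based form of Equation 18, given D^2 = G^2 + Z^2.

def m5_from_D(Z, D, f):
    return D * D / (Z * f)

def m5_from_G(Z, G, f):
    return (G * G / Z + Z) / f

Z, G, f = 1000.0, 750.0, 0.05
D = math.hypot(G, Z)
assert math.isclose(m5_from_D(Z, D, f), m5_from_G(Z, G, f))
```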
[0049] Figures 8-12 show various screen shots produced in real time by the
software using raw data including camera position, attitude, and optical system
characteristics. In particular, Figures 8 and 9 show screen shots of a colored
square and a texture-mapped square, respectively, along with the onscreen
controls for graphically transforming the images. Figure 10 shows an "experiment
view" mode in the software, using INS data and optical system characteristics to
create the flight pattern of a mobile camera platform (e.g. an aircraft) carrying the
camera system. Figure 11 is an exemplary screen shot of a source image prior to
geo-rectification and geo-registration, along with the onscreen controls and view
parameters. And Figure 12 is an exemplary screen shot of a geo-rectified and geo-
registered image following Figure 11. It is appreciated that the square shape of the
projected image produced by the software in Figure 11 indicates that the imaged
source region should have a symmetric trapezoidal geometry, as previously
discussed. This is shown in Figure 12 where the transformed image (after geo-
rectification and geo-registration by the software) reflects the true shape and
boundaries of the region captured in the image. Furthermore, controls are
provided on screen so that the output resolution [m/ pixel] is user adjustable.
[0050] When used to process sample mobile-platform imagery recorded with
corresponding INS data, the software has been shown to process 4 Mpixel frames
at over 10 Hz on an 800 MHz PowerBook G4. In contrast, geo-registration of the
identical image data performed by custom software on a 2GHz-class Pentium 4
system required approximately 15 seconds to process each 4 Mpixel frame. Thus
the method of the present invention has been shown to work very well even on
mid-range GPUs. However, where the technique of the present invention
outperforms custom hardware solutions is in the area of high-resolution digital
imagery. It is appreciated that the ultimate limitations of size/ speed will be
determined by both the amount of VRAM of the GPU, and the bandwidth of the
graphics card-to-PC motherboard interface. Rendering time is itself not a
limitation, since texture mapping and transforming a single quadrilateral is a
negligible load for GPUs designed to process millions of triangles per second. Current
GPUs typically have up to 64 MB of VRAM (some high-end cards now have up to
256 MB), which must be shared between the screen's frame buffer, texture maps,
and other 3-D objects. Since this technique renders directly to the video buffer,
output resolutions will be limited to the maximum frame buffer size allowed by
the graphics card. Higher resolutions may be achieved by 'stitching' individual
geo-rectifications together. For large, multi-MB images, the most critical times are
those of moving the source data from the PC's main memory into the VRAM,
processing to the frame buffer, and then reading the result back out. The
Accelerated Graphics Port (AGP) interface is used for most graphics cards and PC
motherboards. The current 3.0 version (a.k.a. AGP 8x, as suggested on the Intel
AGP website: www.intel.com/support/graphics) supports transfers up to 2
GB/sec, and unlike in previous versions, the bandwidth is symmetrical for
reading and writing data. Nevertheless, if additional processing on the
transformed imagery is required, it is extremely advantageous to leave it in the
graphics card's VRAM and perform the calculations with the GPU. Depending on
the algorithm, this may or may not be possible.
[0051] While particular operational sequences, materials, temperatures,
parameters, and particular embodiments have been described and/or illustrated,
such are not intended to be limiting. Modifications and changes may become
apparent to those skilled in the art, and it is intended that the invention be limited
only by the scope of the appended claims.