

Title:
METHOD AND APPARATUS TO DETERMINE DEPTH INFORMATION FOR A SCENE OF INTEREST
Document Type and Number:
WIPO Patent Application WO/2013/052781
Kind Code:
A1
Abstract:
Depth information about a scene of interest is acquired by illuminating the scene, capturing reflected light energy from the scene with one or more photodetectors, and processing resulting signals. In at least one embodiment, a pseudo-randomly generated series of spatial light modulation patterns is used to modulate the light pulses either before or after reflection.

Inventors:
KIRMANI GHULAM AHMED (US)
GOYAL VIVEK K (US)
Application Number:
PCT/US2012/058924
Publication Date:
April 11, 2013
Filing Date:
October 05, 2012
Assignee:
MASSACHUSETTS INST TECHNOLOGY (US)
KIRMANI GHULAM AHMED (US)
GOYAL VIVEK K (US)
International Classes:
G01S17/89; G01S7/481; G01S17/10
Foreign References:
DE102005028570A1 2006-12-28
US20060227317A1 2006-10-12
DE102008021465A1 2009-11-05
Other References:
GREGORY A HOWLAND ET AL: "Compressive sensing LIDAR for 3D imaging", LASERS AND ELECTRO-OPTICS (CLEO), LASER SCIENCE TO PHOTONIC APPLICATIONS)-CLEO: 2011 - LASER SCIENCE TO PHOTONIC APPLICATIONS- 1-6 MAY 2011, BALTIMORE, MD, USA, IEEE, US, 1 May 2011 (2011-05-01), pages 1 - 2, XP031891443, ISBN: 978-1-4577-1223-4
C E PARRISH ET AL: "Empirical comparison of full-waveform lidar algorithms: Range extraction and discrimination performance", PHOTOGRAMMETRIC ENGINEERING AND REMOTE SENSING, vol. 77, no. 8, 1 August 2011 (2011-08-01), pages 825 - 838, XP055049568, ISSN: 0099-1112
F. BRETAR ET AL: "Managing Full Waveform LIDAR Data: A Challenging Task for the Forthcoming Years", ISPRS ARCHIVES - VOLUME XXXVII PART B1: XXIST ISPRS CONGRESS, 3 July 2008 (2008-07-03), Beijing, China, pages 415 - 420, XP055049582, Retrieved from the Internet [retrieved on 20130115]
RONG ZHU ET AL: "Application of the deconvolution method in the processing of Full-waveform Lidar data", IMAGE AND SIGNAL PROCESSING (CISP), 2010 3RD INTERNATIONAL CONGRESS ON, IEEE, PISCATAWAY, NJ, USA, 16 October 2010 (2010-10-16), pages 2975 - 2979, XP031808782, ISBN: 978-1-4244-6513-2
AHMED KIRMANI ET AL: "Exploiting sparsity in time-of-flight range acquisition using a single time-resolved sensor", OPTICS EXPRESS, vol. 19, no. 22, 24 October 2011 (2011-10-24), pages 21485, XP055049584, ISSN: 1094-4087, DOI: 10.1364/OE.19.021485
PIER LUIGI DRAGOTTI ET AL: "Sampling Moments and Reconstructing Signals of Finite Rate of Innovation: Shannon Meets Strang-Fix", IEEE TRANSACTIONS ON SIGNAL PROCESSING, IEEE SERVICE CENTER, NEW YORK, NY, US, vol. 55, no. 5, 1 May 2007 (2007-05-01), pages 1741 - 1757, XP011177342, ISSN: 1053-587X, DOI: 10.1109/TSP.2006.890907
DRAGOTTI: "Sampling Moments and Reconstructing Signals of Finite Rate of Innovation: Shannon Meets Strang-Fix", IEEE TRANS. SIGNAL PROCESS., vol. 55, 2007, pages 1741 - 1757, XP011177342, DOI: doi:10.1109/TSP.2006.890907
Attorney, Agent or Firm:
SCOTT, John, C. et al. (Crowley Mofford & Durkee, LLP, 354A Turnpike Street, Suite 301, Canton, Massachusetts, US)
Claims:
What is claimed is:

1. A machine implemented imaging method for generating depth information for a three dimensional scene, the method comprising:

transmitting light toward the scene;

receiving reflected light at at least one detector resulting from reflection of the transmitted light from the scene;

spatially modulating the light before it reaches the at least one detector using a number of different spatial modulation patterns;

converting output signals of the at least one detector corresponding to different spatial modulation patterns to digital samples;

processing digital samples corresponding to different spatial modulation patterns to estimate a number of scene impulse responses; and

processing the scene impulse responses to generate a depth map for the scene.

2. The method of claim 1, wherein:

spatially modulating the light is performed before the light reaches the scene.

3. The method of claim 1, wherein:

spatially modulating the light is performed after the light is reflected from the scene.

4. The method of claim 1, wherein:

processing digital samples corresponding to different spatial modulation patterns includes deconvolving each set of digital samples using an impulse response of the at least one detector.

5. The method of claim 1, wherein:

processing digital samples corresponding to different spatial modulation patterns includes performing parametric signal deconvolution.

6. The method of claim 1, wherein processing digital samples includes: processing digital samples corresponding to a first spatial modulation pattern to estimate minimum and maximum depths associated with the scene; and processing digital samples corresponding to other spatial modulation patterns using the minimum and maximum depths.

7. The method of claim 6, wherein:

the first spatial modulation pattern is a fully transparent pattern.

8. The method of claim 6, wherein:

the intensity of transmitted light or at least one spatial modulation pattern depends on the result of processing of digital samples corresponding to at least one previous spatial modulation pattern.

9. The method of claim 1, wherein:

spatially modulating the light before it reaches the at least one detector using a number of different spatial modulation patterns includes modulating the light using pseudo-randomly generated spatial light modulation patterns.

10. The method of claim 1, wherein:

transmitting light includes transmitting light from a single stationary source.

11. The method of claim 1, wherein:

the at least one detector includes fewer detectors than the number of array elements of the resulting depth map.

12. The method of claim 1, wherein:

the at least one detector includes a single detector.

13. An imaging device comprising:

a light source to generate light to illuminate a three dimensional scene of interest;

at least one detector to detect light reflected from the scene of interest;

a spatial light modulator to modulate the light before it reaches the at least one detector using a number of different spatial modulation patterns;

an analog to digital converter (ADC) to digitize output signals of the at least one detector to generate digital samples corresponding to different spatial modulation patterns; and

at least one digital processor to:

process digital samples corresponding to different spatial modulation patterns to estimate a number of scene impulse responses; and

process the scene impulse responses to generate a depth map for the scene of interest.

14. The imaging device of claim 13, wherein:

the spatial light modulator is located proximate to an output of the light source to modulate the light before it reaches the scene of interest.

15. The imaging device of claim 13, wherein:

the spatial light modulator is located proximate to an input of the at least one detector to modulate the light after it has been reflected from the scene of interest.

16. The imaging device of claim 13, wherein:

the at least one digital processor is configured to process digital samples corresponding to a spatial modulation pattern by deconvolving the digital samples using an impulse response of the at least one detector.

17. The imaging device of claim 13, wherein:

the at least one digital processor is configured to process digital samples corresponding to a spatial modulation pattern using parametric signal deconvolution.

18. The imaging device of claim 13, wherein:

the at least one digital processor is configured to process digital samples corresponding to a first spatial modulation pattern to estimate minimum and maximum image depths associated with the scene of interest, wherein the minimum and maximum image depths associated with the scene of interest are used to process digital samples corresponding to other spatial modulation patterns.

19. The imaging device of claim 18, wherein:

the first spatial modulation pattern is a fully transparent pattern.

20. The imaging device of claim 13, wherein:

the spatial light modulator is to modulate the light using pseudo randomly generated spatial modulation patterns.

21. The imaging device of claim 13, wherein:

the at least one detector includes fewer detectors than are needed to represent the level of spatial resolution of the resulting depth map.

22. The imaging device of claim 13, wherein:

the at least one detector includes a single photodetector.

23. The imaging device of claim 13, wherein:

the at least one detector includes an avalanche photodiode.

24. The imaging device of claim 13, wherein:

the light source includes a single light emitting element.

25. The imaging device of claim 13, wherein:

the spatial light modulator comprises a digital micromirror device.

26. The imaging device of claim 13, wherein:

the spatial light modulator comprises a liquid crystal spatial light modulator.

27. The imaging device of claim 13, further comprising:

one or more optical elements to process the light as it travels between the light source and the at least one detector.

28. The imaging device of claim 13, wherein:

the imaging device is part of a portable wireless communication device.

29. The imaging device of claim 13, wherein:

the imaging device is part of a device that receives inputs from a user or displays information to a user.

Description:
METHOD AND APPARATUS TO DETERMINE DEPTH INFORMATION FOR A SCENE OF INTEREST

GOVERNMENT RIGHTS

[0001] This work was supported by the National Science Foundation under Contract No. CCF-0643838. The Government has certain rights in this invention.

FIELD

[0002] Subject matter disclosed herein relates generally to imaging and, more particularly, to techniques for three dimensional scene acquisition.

BACKGROUND

[0003] Sensing three dimensional (3D) scene structure is an integral part of applications ranging from 3D microscopy to geographical surveying. While two dimensional (2D) imaging is a mature technology, 3D acquisition techniques have room for significant improvements in spatial resolution, range accuracy, and cost effectiveness. Humans use both monocular cues, such as motion parallax, and binocular cues, such as stereo disparity, to perceive depth, but camera-based stereo vision techniques suffer from poor range resolution and high sensitivity to noise. Computer vision techniques (including structured-light scanning, depth-from-focus, depth-from-shape, and depth-from-motion) are computation intensive, and the range output from these methods is highly prone to errors from miscalibration, absence of sufficient scene texture, and low signal-to-noise ratio (SNR).

[0004] In comparison, active range acquisition systems, such as light detection and ranging (LIDAR) systems and time of flight (TOF) cameras, are more robust against noise, work in real-time at video frame rates, and acquire range information from a single viewpoint with little dependence on scene reflectance or texture. Both LIDAR and TOF cameras operate by measuring the time elapsed between transmitting a pulse and sensing a reflection from the scene. LIDAR systems consist of a pulsed illumination source such as a laser, a mechanical 2D laser scanning unit, and a single time-resolved photodetector or avalanche photodiode. A TOF camera illumination unit is composed of an array of omnidirectional, modulated, infrared light emitting diodes (LEDs). The reflected light from the scene, with time delay proportional to distance, is focused at a 2D array of TOF range sensing pixels.

[0005] A major shortcoming of LIDAR systems and TOF cameras is low spatial resolution, or the inability to resolve sharp spatial features in the scene. For real-time operability, LIDAR devices have low 2D scanning resolution. Similarly, due to limitations in the 2D TOF sensor array fabrication process and readout rates, the number of pixels in commercially-available TOF camera sensors is currently limited to a maximum of 320 × 240 pixels. Consequently, it is desirable to develop novel, real-time range sensors that possess high spatial resolution without increasing the device cost and complexity.

SUMMARY

[0006] Techniques, systems, and devices are provided herein that are capable of capturing depth information for a three dimensional scene of interest in an efficient and cost effective manner. In some implementations, spatial light modulation is used to either modulate a series of light pulses transmitted toward the scene of interest or modulate the light reflected by the scene before it is incident on a time-resolved sensor. Light energy reflected from the scene of interest is then captured in one or more time-resolved or time-sampling detectors and digitized. The resulting digital data may then be processed using parametric signal deconvolution to generate a range profile about the scene of interest. The one-dimensional range profile of a scene is a combined indicator of how much scene content is present at a particular depth and how much of it was illuminated or rejected by the spatial modulation pattern. The range profile corresponding to different spatial patterns may then be further processed using spatial recovery techniques, for example convex optimization, to generate a two-dimensional depth map for the scene of interest. In one embodiment, it was determined that the depth map of a typical natural scene has a Laplacian that is sparse; more generally, the depth map of a scene may be approximated well using only a small number of values in an appropriately chosen transform domain, like a discrete cosine transform or wavelet transform. The techniques provided herein may take advantage of this sparsity in recovery of the depth map of a scene from the digital samples of the light reflected from the scene.
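
The sparsity property described above can be checked on a toy example. The following is a minimal sketch that is not taken from the application: it builds a hypothetical 16 x 16 depth map from two planar facets and counts the nonzero entries of its discrete 5-point Laplacian. The function names `two_facet_depth_map` and `laplacian`, the facet equations, and the map size are all illustrative assumptions.

```python
def two_facet_depth_map(n):
    """Hypothetical n x n depth map made of two planar facets.

    Left half: a plane tilted along x. Right half: a plane tilted along y.
    Integer depths keep the arithmetic exact.
    """
    half = n // 2
    return [[100 + 2 * x if x < half else 300 + y for x in range(n)]
            for y in range(n)]


def laplacian(z):
    """Discrete 5-point Laplacian evaluated on the interior pixels of z."""
    n = len(z)
    out = []
    for y in range(1, n - 1):
        row = []
        for x in range(1, n - 1):
            row.append(z[y - 1][x] + z[y + 1][x]
                       + z[y][x - 1] + z[y][x + 1] - 4 * z[y][x])
        out.append(row)
    return out


if __name__ == "__main__":
    lap = laplacian(two_facet_depth_map(16))
    nonzero = sum(1 for row in lap for v in row if v != 0)
    total = sum(len(row) for row in lap)
    # The Laplacian vanishes inside each plane and is nonzero only
    # along the boundary between the two facets.
    print(f"{nonzero} nonzero Laplacian values out of {total}")
```

In this toy scene the only nonzero values lie in the two pixel columns adjacent to the facet boundary, which is the kind of structure a sparsity-promoting recovery method can exploit.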

[0007] Before other digital processing is performed, time-sampled digital data corresponding to one (or more) of the transmitted light pulses and spatial modulation patterns may be processed using signal deconvolution, including parametric methods, to determine a depth or range profile associated with the scene of interest. The depth or range profile information may then be used during the subsequent digital processing to extract scene ranges of interest from the digital samples. In at least one embodiment, a single light pulse that is either spatially unmodulated (omnidirectional) or modulated with a fully transparent spatial light modulation (SLM) pattern is used to determine the overall range profile of the scene without transverse spatial resolution.
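
As a worked illustration of how a range profile bounds the scene depth, the sketch below finds the first and last time samples of a digitized profile that exceed a threshold and converts them to distances. It is not the parametric method of the application. The threshold detector, the function name `depth_bounds`, the 1 ns sampling period, and the assumption that sample times are round-trip times (so that distance = c·t/2) are all illustrative assumptions.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def depth_bounds(profile, sample_period_s, threshold):
    """Return (d_min, d_max) in metres from a sampled range profile.

    Assumes sample index i corresponds to a round-trip time of
    i * sample_period_s, so one-way distance is c * t / 2.
    """
    hits = [i for i, v in enumerate(profile) if v > threshold]
    if not hits:
        raise ValueError("no sample exceeds the threshold")
    t_min = hits[0] * sample_period_s
    t_max = hits[-1] * sample_period_s
    return C * t_min / 2.0, C * t_max / 2.0


if __name__ == "__main__":
    # Hypothetical profile: scene content between samples 10 and 20.
    profile = [0.0] * 40
    for i in range(10, 21):
        profile[i] = 1.0
    d_min, d_max = depth_bounds(profile, 1e-9, 0.5)
    print(f"scene lies between {d_min:.3f} m and {d_max:.3f} m")
```

With 1 ns samples, each sample step corresponds to roughly 15 cm of one-way range under the round-trip assumption.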

[0008] In various implementations, the techniques, systems, and devices of the present disclosure may provide many benefits over conventional range/depth acquisition techniques. For example, the techniques, systems, and devices are capable of providing better spatial resolution in the resulting depth maps than would be expected for the number of light sensors being used (i.e., the spatial information is "compressed" within the detected light information). In some embodiments, for example, a single light sensor may be used. In addition, in some implementations, enhanced depth/temporal resolution may be achieved relative to the speed/bandwidth of the detector(s) used even when the spatial information is mixed at a single photodetector. The disclosed techniques, systems, and devices are also capable of being implemented in a manner that consumes very little power relative to more conventional depth acquisition schemes. For example, because fewer and simpler circuit elements may be used in different implementations (e.g., light sources, light detectors, etc.), power consumption may be kept to a minimum.

[0009] In addition to the above, the disclosed techniques, systems, and devices are also capable of minimizing the negative effects of ambient light. This is because the techniques typically utilize the entire time profile for image processing, which allows low frequency components (associated with background illumination) to be rejected. As is well known, background light presents a problem in many conventional depth acquisition techniques. Based on some or all of the above described benefits, the techniques, systems, and devices of the present disclosure are well suited for use in applications having limited energy availability (e.g., battery powered applications, etc.) and applications having smaller form factors (handheld devices such as, for example, cellular phones, smart phones, tablet and laptop computers, personal digital assistants, digital cameras, and others).
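
One simple way to see why access to the full time profile helps against background light is baseline subtraction. The sketch below is an illustration under stated assumptions, not a procedure given in the application: ambient light is modeled as a constant offset, and the offset is estimated from samples recorded before any reflected pulse arrives. The function name `remove_ambient` and the parameter `n_baseline` are hypothetical.

```python
def remove_ambient(samples, n_baseline):
    """Subtract a constant background estimated from the first n_baseline samples.

    Assumes the first n_baseline samples contain only ambient light,
    i.e. they precede the earliest reflected return.
    """
    if not 0 < n_baseline <= len(samples):
        raise ValueError("n_baseline must be between 1 and len(samples)")
    offset = sum(samples[:n_baseline]) / n_baseline
    return [s - offset for s in samples]


if __name__ == "__main__":
    pulse = [0.0, 0.0, 0.0, 0.0, 1.0, 3.0, 1.0, 0.0]  # hypothetical return
    ambient = 5.0                                      # constant background
    measured = [p + ambient for p in pulse]
    print(remove_ambient(measured, 4))
```

A constant offset is the zero-frequency component of the time profile, so removing it is the simplest case of rejecting low frequency content; slowly varying backgrounds would need a high-pass filter instead.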

[0010] In accordance with one aspect of the concepts, systems, circuits, and techniques described herein, a machine implemented imaging method for generating depth information for a three dimensional scene comprises: transmitting light toward the scene; receiving reflected light at at least one detector resulting from reflection of the transmitted light from the scene; spatially modulating the light before it reaches the at least one detector using a number of different spatial modulation patterns; converting output signals of the at least one detector corresponding to different spatial modulation patterns to digital samples; processing digital samples corresponding to different spatial modulation patterns to estimate a number of scene impulse responses; and processing the scene impulse responses to generate a depth map for the scene.

[0011] In some embodiments, the spatial modulation of light is performed before the light reaches the scene. In some embodiments, the spatial modulation of light is performed after the light is reflected from the scene.

[0012] In some embodiments, the processing of the digital samples corresponding to the different spatial modulation patterns includes deconvolving each set of digital samples using an impulse response of the at least one detector.

[0013] In some embodiments, the processing of the digital samples corresponding to different spatial modulation patterns includes performing parametric signal deconvolution.

[0014] In some embodiments, the processing of digital samples includes (i) processing digital samples corresponding to a first spatial modulation pattern to estimate minimum and maximum depths associated with the scene; and (ii) processing digital samples corresponding to other spatial modulation patterns using the minimum and maximum depths. In some embodiments, the first spatial modulation pattern is a fully transparent pattern.

[0015] In some embodiments, the intensity of transmitted light of at least one spatial modulation pattern depends on the result of processing digital samples corresponding to at least one previous spatial modulation pattern.

[0016] In some embodiments, spatially modulating the light before it reaches the at least one detector using a number of different spatial modulation patterns includes modulating the light using pseudo-randomly generated spatial light modulation patterns.

[0017] In some embodiments, transmitting light includes transmitting light from a single stationary source.

[0018] In some embodiments, the at least one detector includes fewer detectors than the number of array elements of the resulting depth map. In some embodiments, the at least one detector includes a single detector.

[0019] In accordance with another aspect of the concepts, systems, circuits, and techniques described herein, an imaging device comprises: a light source to generate light to illuminate a three dimensional scene of interest; at least one detector to detect light reflected from the scene of interest; a spatial light modulator to modulate the light before it reaches the at least one detector using a number of different spatial modulation patterns; an analog to digital converter (ADC) to digitize output signals of the at least one detector to generate digital samples corresponding to different spatial modulation patterns; and at least one digital processor to: (i) process digital samples corresponding to different spatial modulation patterns to estimate a number of scene impulse responses; and (ii) process the scene impulse responses to generate a depth map for the scene of interest.

[0020] In some embodiments, the spatial light modulator is located proximate to an output of the light source to modulate the light before it reaches the scene of interest. In some embodiments, the spatial light modulator is located proximate to an input of the at least one detector to modulate the light after it has been reflected from the scene of interest.

[0021] In some embodiments, the at least one digital processor is configured to process digital samples corresponding to a spatial modulation pattern by deconvolving the digital samples using an impulse response of the at least one detector.

[0022] In some embodiments, the at least one digital processor is configured to process digital samples corresponding to a spatial modulation pattern using parametric signal deconvolution.

[0023] In some embodiments, the at least one digital processor is configured to process digital samples corresponding to a first spatial modulation pattern to estimate minimum and maximum image depths associated with the scene of interest, wherein the minimum and maximum image depths associated with the scene of interest are used to process digital samples corresponding to other spatial modulation patterns. In some embodiments, the first spatial modulation pattern is a fully transparent pattern.

[0024] In some embodiments, the spatial light modulator is to modulate the light using pseudo randomly generated spatial modulation patterns.

[0025] In some embodiments, the at least one detector includes fewer detectors than are needed to represent the level of spatial resolution of the resulting depth map. In some embodiments, the at least one detector includes a single photodetector. In some embodiments, the at least one detector includes an avalanche photodiode.

[0026] In some embodiments, the light source includes a single light emitting element.

[0027] In some embodiments, the spatial light modulator comprises a digital micromirror device. In some embodiments, the spatial light modulator comprises a liquid crystal spatial light modulator.

[0028] In some embodiments, the imaging device further includes one or more optical elements to process the light as it travels between the light source and the at least one detector. In some embodiments, the imaging device is part of a portable wireless communication device. In some embodiments, the imaging device is part of a device that receives inputs from a user or displays information to a user.

BRIEF DESCRIPTION OF THE DRAWINGS

[0029] The foregoing features may be more fully understood from the following description of the drawings in which:

[0030] Fig. 1A is a block diagram illustrating an exemplary imaging system in which spatial modulation of transmitted light is used in accordance with an embodiment;

[0031] Fig. 1B is a block diagram illustrating an exemplary imaging system in which light is spatially modulated after reflection from a scene of interest in accordance with an embodiment;

[0032] Fig. 2 is a flowchart illustrating a machine implemented method for use in recovering depth information for a scene of interest in accordance with an embodiment;

[0033] Fig. 3 is a flowchart illustrating a machine implemented method for use in processing digital sample data to generate a depth map for a scene of interest in accordance with an embodiment;

[0034] Figs. 4A-4D are diagrams illustrating a coordinate system for use in describing an imaging environment in accordance with an embodiment;

[0035] Figs. 5A-5C are diagrams illustrating the response of a single rectangular facet of a scene of interest to a fully transparent SLM pattern in accordance with an embodiment;

[0036] Fig. 6 is a diagram illustrating a spherical shell centered at an origin of a coordinate system with inner and outer radii equal to the closest distance to the scene and the farthest distance to the scene from the imaging device, respectively;

[0037] Figs. 7A-7D are diagrams illustrating the response of a single rectangular facet of a scene of interest to a binary SLM pattern in accordance with an embodiment;

[0038] Fig. 8 is a diagram illustrating the generation of depth masks in accordance with an embodiment;

[0039] Fig. 9 is a diagram illustrating parametric modeling for non-rectangular planes in accordance with an embodiment; and

[0040] Figs. 10A-10F are diagrams illustrating parametric modeling for scenes with multiple planar facets in accordance with an embodiment.

DETAILED DESCRIPTION

[0041] Fig. 1A is a block diagram illustrating an exemplary imaging system 10 in accordance with an embodiment. The imaging system 10 may be used to, for example, capture depth map information for a three dimensional scene of interest 12. The depth map information may include, for example, a two dimensional array of values corresponding to distances from system 10 to various points in the scene of interest. Such depth map information may be useful in various applications including, for example, gaming applications, security/surveillance applications, biometrics, augmented reality systems, natural and gestural user interfaces, outdoor and indoor navigation, automobile navigation, remote telepresence and conferencing systems, head mounted displays, automatic object detection, tracking and segmentation, geographical terrain mapping and medical imaging. During operation, a user may, for example, point imaging system 10 at scene of interest 12 and press an actuator or otherwise initiate a depth capture operation. Imaging system 10 may then direct light energy toward scene of interest 12 and sense light energy reflected from the scene. Imaging system 10 may process the reflected light energy to determine depth information for scene of interest 12.

[0042] In some implementations, imaging system 10 may be provided as a standalone imaging device. In other implementations, imaging system 10 may be made part of a larger system or device such as, for example, a handheld wireless communicator (e.g., a cell phone, a smart phone, a satellite communicator, a pager, a tablet computer having wireless functionality, etc.), a digital camera or camcorder, personal computers including tablets, desktops and laptops, or some other electronic device including consumer electronic devices. In still other implementations, imaging system 10 may be provided as a removable unit that may be coupled to or installed within another system (e.g., a PC card inserted into a laptop computer, an imaging device that may be coupled to a port of a computer or smartphone, etc.).

[0043] As illustrated in Fig. 1A, imaging system 10 may include: a light source 14, a spatial light modulator 16, one or more optical elements 18, one or more photodetectors 20, an analog to digital converter (ADC) 22, a signal processing unit 24, a memory 26, and a controller 28. Light source 14 is operative for generating coherent or incoherent light that is used to illuminate scene of interest 12 during depth acquisition operations. The light that is generated may be in the form of light pulses or a continuous or intermittent stream of light. In the discussion that follows, it will be assumed that light pulses are being used. Spatial light modulator 16 is operative for modulating the light pulses before they reach scene of interest 12 or after they are reflected back from the scene 12. Optical element(s) 18 may be used to appropriately focus or direct the light pulses on scene of interest 12. Photodetector(s) 20 is operative for sensing light reflected from scene of interest 12 resulting from the transmitted light pulses. ADC 22 samples the output signal of photodetector(s) 20 to generate digital samples representative of the detected light energy. Light source 14 may generate light pulses one at a time. Thus, a separate set of digital samples may be generated by ADC 22 for each transmitted light pulse as well as for each spatial pattern. These digital samples may be temporarily stored within memory 26 to await further processing. Signal processing unit 24 is operative for processing some or all of the collected digital samples to generate a depth map for scene of interest 12. Signal processing unit 24 may store the generated depth map in memory 26.
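
The acquisition described above, in which one set of time samples is produced per spatial pattern, can be mimicked with an idealized forward model. In the sketch below, each scene pixel that the pattern leaves open contributes a unit impulse at its round-trip delay (expressed in samples), and a single detector sums these contributions. This is an illustrative assumption only: it ignores pulse shape, detector response, reflectance, and the facet response model the application goes on to develop. The names `range_profile`, `delays`, and `pattern` are hypothetical.

```python
def range_profile(delays, pattern, n_samples):
    """Idealized single-detector time profile for one spatial pattern.

    delays:  2D list of integer round-trip delays (in samples), one per pixel.
    pattern: 2D list of 0/1 values; 1 means the pixel is transparent (open).
    Returns a list of length n_samples counting open pixels at each delay.
    """
    profile = [0] * n_samples
    for delay_row, mask_row in zip(delays, pattern):
        for d, is_open in zip(delay_row, mask_row):
            if is_open:
                profile[d] += 1
    return profile


if __name__ == "__main__":
    # Hypothetical 4 x 4 scene with four flat regions at different delays.
    delays = [[3, 3, 4, 4],
              [3, 3, 4, 4],
              [6, 6, 7, 7],
              [6, 6, 7, 7]]
    all_open = [[1] * 4 for _ in range(4)]
    checker = [[(x + y) % 2 for x in range(4)] for y in range(4)]
    # One profile per pattern, as one digitized record per pattern.
    print(range_profile(delays, all_open, 10))
    print(range_profile(delays, checker, 10))
```

In this model the fully open pattern reveals which delays are occupied at all, while each additional pattern changes how many pixels contribute at each delay; those changes carry the spatial information that is later unmixed.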

[0044] Controller 28 may be used to control the operation and synchronization of some or all of the other components of system 10. Thus, for example, controller 28 may synchronize the transmission of light pulses by light source 14 with the changing of SLM patterns within spatial light modulator 16. Controller 28 may also synchronize the sampling of detected light energy by ADC 22 with the transmission of light pulses by light source 14. Controller 28 may also, in some implementations, control the storage and retrieval of digital sample data to/from memory 26 and the initiation of processing within signal processing unit 24.

[0045] In conceiving certain embodiments described herein, it was appreciated that most natural scenes may be modeled as a number of discrete planar facets with possibly some curved surfaces. As will be described in greater detail, reflected light signals from both planar facets and curved surfaces take the form of parametric signals. In contrast to general signals, parametric signals may be described using only a small number of parameters. In addition, parametric signals are generally smooth and vary in a linear manner. Thus, superposition will apply when adding parametric signals. As will be described in greater detail, in some embodiments, signal processing unit 24 may employ, among other things, signal deconvolution techniques (e.g., parametric signal deconvolution, etc.) as part of the process to recover the depth map information for scene of interest 12.
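
To make the idea of signal deconvolution concrete, the sketch below recovers a short signal from its convolution with a known impulse response by direct recursive inversion. This is a plain, noiseless, non-parametric deconvolution chosen for illustration, not the parametric signal deconvolution of the application. The function names `convolve` and `deconvolve` and the example values are assumptions.

```python
def convolve(x, h):
    """Full discrete convolution of sequences x and h."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def deconvolve(y, h):
    """Recover x from y = convolve(x, h) by recursive inversion.

    Requires h[0] != 0 and noiseless data; with noise this simple
    recursion can amplify errors.
    """
    if h[0] == 0:
        raise ValueError("h[0] must be nonzero")
    n = len(y) - len(h) + 1
    x = [0.0] * n
    for i in range(n):
        acc = y[i]
        # Remove contributions of already-recovered samples.
        for j in range(1, min(i, len(h) - 1) + 1):
            acc -= h[j] * x[i - j]
        x[i] = acc / h[0]
    return x


if __name__ == "__main__":
    h = [1.0, 0.5, 0.25]                      # hypothetical detector response
    x = [0.0, 0.0, 3.0, 3.0, 3.0, 0.0, 1.0, 0.0]  # hypothetical scene response
    y = convolve(x, h)                        # what the ADC would sample
    print(deconvolve(y, h))
```

Because convolution is linear, the responses of separate facets add in the measured signal, which is the superposition property the paragraph above relies on.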

[0046] Light source 14 may include any type of light source that is capable of generating light of a sufficient intensity to travel round trip to and from scene 12 and still be detected by photodetector(s) 20. This may include light sources having a single or multiple light generating elements (e.g., a single laser diode, an array of laser diodes or light emitting diodes, an incandescent light source, edge emitting laser diodes, mode-locked lasers, etc.). In at least one implementation, a single low cost light element is used as light source 14.

[0047] Spatial light modulator 16 may include any type of modulation device that is capable of controllably modulating light traveling through the device in a spatial manner. In one approach, spatial light modulator 16 may include a two dimensional array of pixels that may be individually and controllably modified to change an amount of attenuation for light passing through the pixel. The individual pixels may be binary in nature (e.g., changed between substantially transparent and substantially opaque states) or they may have more than two individual attenuation levels (e.g., grey scale). In at least one embodiment, spatial light modulator 16 may include a micromirror array. Other types of spatial light modulators 16 may also be used including, for example, phase based light modulators and spatial modulators based on liquid crystal displays and printed masks.
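
A set of binary modulation patterns of the kind described above can be generated in a few lines. The following is a minimal sketch under stated assumptions: each pixel is independently transparent (1) or opaque (0) with equal probability, and a seeded generator makes the sequence reproducible so that the processing stage can regenerate the same patterns. The function names `binary_patterns` and `fully_transparent`, the equal-probability choice, and the seed are illustrative and are not specified by the application.

```python
import random


def binary_patterns(n_patterns, rows, cols, seed=0):
    """Return n_patterns pseudo-random binary masks of size rows x cols.

    1 = substantially transparent pixel, 0 = substantially opaque pixel.
    The same seed always yields the same sequence of patterns.
    """
    rng = random.Random(seed)
    return [[[rng.randint(0, 1) for _ in range(cols)] for _ in range(rows)]
            for _ in range(n_patterns)]


def fully_transparent(rows, cols):
    """Return a mask in which every pixel is transparent."""
    return [[1] * cols for _ in range(rows)]


if __name__ == "__main__":
    patterns = [fully_transparent(4, 4)] + binary_patterns(3, 4, 4, seed=42)
    for k, pattern in enumerate(patterns):
        print(f"pattern {k}:")
        for row in pattern:
            print(" ".join(str(v) for v in row))
```

Grey scale modulators would replace the 0/1 draw with a draw over several attenuation levels; the binary case is shown because it maps directly onto a micromirror array.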

[0048] Optical element(s) 18 is operative for directing the light energy towards the scene and collecting the light reflected by the scene and focusing it on the detector(s). Optical element(s) 18 may include, for example, one or more lenses or microlenses, grating, or electro-mechanical and optical shutters.

[0049] Photodetector(s) 20 may include any type of device or component that is capable of detecting light energy and converting the light energy to an electrical signal. Photodetector(s) 20 may include a single light detecting element (e.g., a single photodiode or phototransistor) or multiple elements (e.g., an array of detectors). In a typical implementation, the spatial resolution of the depth map generated by system 10 will be much larger than the number of sensors of photodetector(s) 20. For instance, in one exemplary implementation, a 16×16 sensor array may be used to generate a megapixel depth map. In at least one embodiment, a single low cost sensor may be used as photodetector(s) 20.

[0050] Signal processing unit 24 may be implemented using, for example, one or more digital processing devices. The digital processing device(s) may include, for example, a general purpose microprocessor, a digital signal processor (DSP), a reduced instruction set computer (RISC), a microcontroller, an embedded controller, a field programmable gate array (FPGA), a digital logic array (DLA), an application specific integrated circuit (ASIC), and/or others, including combinations of the above. In some implementations, programs and/or other configuration information may be stored within memory 26 for use in configuring signal processing unit 24. Controller 28 may also be implemented using one or more digital processing devices. Controller 28 may be implemented within the same or different digital processing devices than signal processing unit 24.

[0051] Memory 26 may include any type of system, device, or component, or combination thereof, that is capable of storing digital information (e.g., digital data, computer executable instructions and/or programs, configuration information for reconfigurable hardware, etc.) for access by signal processing unit 24 and/or other components of system 10. Memory 26 may include, for example, semiconductor memories, magnetic data storage devices, disc based storage devices, optical storage devices, read only memories (ROMs), random access memories (RAMs), non-volatile memories, flash memories, and/or others.

[0052] In at least one embodiment, processing within signal processing unit 24 may take place in two stages. In a first stage, digital samples resulting from one spatial modulation pattern (or possibly multiple patterns) may be processed using signal deconvolution to recover the range profile of scene of interest 12 (i.e., information about a lowest depth value and a highest depth value and how the scene varies as a function of range). In some implementations, this depth range information may be specified as a shortest light pulse transit time $T_{\min}$ to scene 12 and a longest light pulse transit time $T_{\max}$ to scene 12. In at least one implementation, a single unmodulated or omnidirectional light pulse, or a light pulse modulated using an SLM pattern, may be used to generate the depth range information (although other patterns may be used in other implementations). As will be described in greater detail, an impulse response of photodetector 20 may be acquired for use in performing the signal deconvolution operation.

[0053] In a second processing stage, after the scene range profile has been determined, digital samples resulting from transmission of other light pulses may be processed along with the range profile within signal processing unit 24, using signal deconvolution, to generate range profile information corresponding to each of the different SLM spatial patterns (see Fig. ?D for an example of the scene range profile occurring between $T_{\min}$ and $T_{\max}$). The signal deconvolution operations will recover the temporal (or depth) resolution information of scene 12. In some embodiments, each of the light pulses that are used for the second stage of processing will have been modulated using a different SLM pattern from the other pulses. In some implementations, the SLM patterns may be pseudo-randomly generated for use in the depth acquisition procedure. Because the SLM patterns are different, the scene range profiles recovered for each of the transmitted light pulses during the second stage of processing may be different (i.e., different amplitude variations with depth, etc.). It is this difference between the scene range profiles associated with the different modulated light pulses that carries the spatial information from which the depth map may be generated. As described above, an impulse response of photodetector 20 may be used to perform the signal deconvolution operations.

[0054] After scene range profiles have been generated for each of the modulated light pulses, the resulting digital sample information may be processed in signal processing unit 24 to recover the spatial resolution of the scene and generate the depth map. As will be described in greater detail, spatial processing techniques, for example optimization, may be used to process the scene range profiles corresponding to different spatial patterns within signal processing unit 24. As will be appreciated, in some implementations, signal processing unit 24 may process digital sample information resulting from less than all of the light pulses transmitted toward scene of interest 12. In general, the number of spatial patterns used to recover the depth map will depend on the desired spatial resolution of the depth map. In some implementations, the number of pixels in the resulting depth map will be equal to the number of pixels in the SLM pattern utilized. In some embodiments, the number of different spatial patterns used will vary between 1% and 10% of the number of pixels in the SLM pattern.

[0055] In system 10 of Fig. 1A, the spatial light modulation is performed before the light signal reaches scene 12. In other embodiments, the light modulation may be performed after the light is reflected from scene 12. Fig. 1B is a block diagram illustrating an exemplary imaging system 40 that performs light modulation after reflection in accordance with an embodiment. As illustrated, system 40 includes a spatial light modulator 32 just before photodetector(s) 20. Light source 14 illuminates scene 12 with unmodulated light. Spatial light modulator 32 then receives the light energy reflected from scene 12 and modulates the light using different SLM patterns before the light is detected by detector 20. The patterns that are used may be controlled by controller 28. As described previously, in some embodiments, controller 28 may pseudo-randomly generate the SLM patterns that are used. Other techniques for generating the SLM patterns may alternatively be used. The processing of the detected light may then proceed in substantially the same manner described previously.

[0056] Fig. 2 is a flowchart illustrating an exemplary machine implemented method 40 for use in recovering depth information for a scene of interest in accordance with an embodiment. The method 40 may be used in connection with, for example, imaging device 10 of Fig. 1 or other similar systems. First, an impulse response of a photodetector may be acquired (block 42). In one approach, the impulse response may be acquired by direct measurement by applying a light impulse to an input of the photodetector and recording a resulting response. In other approaches, the impulse response may be retrieved from a memory or from some other source. A light pulse that is either unmodulated or modulated using an SLM pattern may next be transmitted toward the scene of interest (block 44). Reflected light energy resulting from reflection of the transmitted pulse from the scene of interest may then be sensed by one or more photodetectors (block 46). The output of the photodetector may then be digitized to generate digital samples associated with the transmitted pulse (block 48). The digital samples may be temporarily stored in a memory for later use (block 50).

[0057] It next may be determined whether all spatial patterns that will be used to generate the depth map have been transmitted (block 52). If not, another spatial pattern will be used to modulate the light going towards the scene of interest or light reflected back from the scene (block 54). The reflected light energy for the new SLM pattern will then be sensed, digitized and stored (blocks 46, 48, 50). This process may then be repeated until a desired number of spatial patterns have been transmitted. In at least one embodiment, the SLM patterns that are used to modulate the light pulses may be pseudo-randomly selected. In other embodiments, the same series of SLM patterns may be used for each depth map acquisition operation or the SLM patterns may be selected in another manner. After all the spatial light modulations are complete (block 52-Y), the digital sample information collected for the transmitted pulses may be processed to generate a depth map for the scene of interest (block 56).
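As a minimal sketch of the pattern selection step (block 54), the following Python function draws pseudo-random binary SLM patterns. The function name `make_slm_patterns`, the seed handling, and the rule that each pattern turns on a random number of pixels between half and all of them are illustrative assumptions rather than details of the described method; the default pattern count of 5% of the pixel count is simply one value inside the 1% to 10% range mentioned in this description.

```python
import math
import random


def make_slm_patterns(n, fraction=0.05, min_on=0.5, seed=0):
    """Return M pseudo-random binary n-by-n patterns (lists of 0/1 rows).

    M is `fraction` of the n*n pixel count (hypothetical default: 5%).
    Each pattern has at least `min_on` of its pixels set to 1 (assumption).
    """
    rng = random.Random(seed)          # seeded so an acquisition is repeatable
    num_pixels = n * n
    m = max(1, round(fraction * num_pixels))
    lowest_on = math.ceil(min_on * num_pixels)
    patterns = []
    for _ in range(m):
        k = rng.randint(lowest_on, num_pixels)       # number of "on" pixels
        on = set(rng.sample(range(num_pixels), k))   # which pixels are "on"
        patterns.append([[1 if i * n + j in on else 0 for j in range(n)]
                         for i in range(n)])
    return patterns


patterns = make_slm_patterns(16)
print(len(patterns), "patterns of", len(patterns[0]), "x", len(patterns[0][0]))
```

Seeding the generator means the same pattern series can be replayed during processing, which matters because the reconstruction needs to know exactly which pattern produced each set of samples.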

[0058] Fig. 3 is a flowchart illustrating an exemplary machine implemented method 50 for use in processing digital sample data to generate a depth map for a scene of interest in accordance with an embodiment. The method 50 may be used as part of method 40 of Fig. 2 or similar methods. Digital samples associated with a first transmitted pulse may first be processed using signal deconvolution to determine the scene range profile (block 52). In at least one embodiment, the first light pulse may be unmodulated or omnidirectional, or it can be modulated with an SLM pattern. As described previously, $T_{\min}$ and $T_{\max}$ represent a shortest light pulse transit time and a longest light pulse transit time to a corresponding scene of interest and may be used as an indication of the depth ranges present in the scene. As will be described in greater detail, to perform the signal deconvolution, the impulse response of the photodetector may be used. Although referred to above and elsewhere herein as a "first" pulse, it should be appreciated that this pulse does not have to be the first pulse transmitted in time. That is, the word "first" is being used here to identify the pulse and not to signify a temporal relationship.

[0059] After the range profile has been determined, the digital samples associated with the other transmitted pulses (along with the impulse response of the photodetector) may be processed using signal deconvolution techniques to generate range profile information for all the other spatial or SLM patterns (block 54). The range profiles for the different SLM patterns may then be processed using spatial processing algorithms to generate the two-dimensional depth map (block 56). In the discussion that follows, the above described processing steps will be described in greater detail for example implementations. First, however, a coordinate system for use in describing a measurement environment will be described.

[0060] Fig. 4 is a series of diagrams illustrating an exemplary measurement setup and coordinate system that may be used during a depth information acquisition procedure for a scene of interest in accordance with an embodiment. As shown in Fig. 4A, a selected SLM pattern 70 may be focused on the scene 72 using a focusing system 74. The center of the focusing system 74 may be denoted by O and is also the origin for a 3D Cartesian coordinate system 78. All angles and distances may be measured with respect to this global coordinate system. In the present example, the pixels of the SLM pattern are considered to be binary (i.e., each SLM pixel is chosen to be either fully opaque or fully transparent). As will be described later, in other implementations, continuous-valued or gray-scale SLM patterns may be used (e.g., to compensate for rapidly varying scene texture and reflectance, etc.).

[0061] As shown in Fig. 4C, the light reflected from scene 72 is focused at photodetector 78. The origin O is the effective optical center of the entire imaging setup (illumination and detector).

[0062] In the discussion that follows, it will be assumed for simplicity of description that the scene of interest includes a single rectangular planar facet 82. This will later be extended to scenes having more complex content. As shown in Fig. 4A, the dimensions of the facet 82 are $W \times L$. Line OC may be defined as the line that lies in the Y-Z plane and is also perpendicular to the rectangular facet. The plane is tilted from the zero-azimuth axis (marked Z in Fig. 4). However, as will be described later, this tilt does not affect the depth map construction. For simplicity, it will be assumed that there is no tilt from the zenith axis (marked in Fig. 4). However, as with the tilt from the zero-azimuth axis, this tilt would not affect the depth map construction.

[0063] As shown in Fig. 4C, the following parameters may be used to completely specify rectangular facet 82:

• $d_\perp$ denotes the length of the line OC.

• $\phi_1$ and $\phi_2$ are angles between line OC and the extreme rays connecting the vertical edges of rectangular facet 82 to O, and $\Delta\phi = |\phi_1 - \phi_2|$ is their difference; clearly, $\Delta\phi$ is related to L.

• $\theta_1$ and $\theta_2$ are angles between line OC and the extreme rays connecting the horizontal edges of rectangular facet 82 to O, and $\Delta\theta = |\theta_1 - \theta_2|$ is their difference; clearly, $\Delta\theta$ is related to W.

• $\alpha$ is the angle between OC and the Z axis in the Y-Z plane.

[0064] The response of a single rectangular facet to a fully transparent SLM pattern will now be described. As shown in Fig. 4, Q is a point on the rectangular planar facet at an angle of $\theta$ ($\theta_1 < \theta < \theta_2$) and $\phi$ ($\phi_1 < \phi < \phi_2$) with respect to the line OC. An illumination pulse, $s(t)$, that originates at the origin at time $t = 0$ will be reflected from Q, attenuated due to scattering, and arrive back at the detector 78 delayed in time by an amount proportional to the distance $2|OQ|$. Since the speed of light is set to unity, the delay is exactly equal to the distance $2|OQ|$. Thus, the signal incident on photodetector 78 in response to impulse illumination of Q is mathematically given by:

$$q(t) = a\,\delta(t - 2|OQ|),$$

where $a$ is the total attenuation (transmissivity) of the unit-intensity pulse. Since the photodetector has an impulse response, denoted by $h(t)$, the electrical output $r_q(t)$ of the photodetector is mathematically equivalent to convolution of the signal $q(t)$ and the detector response $h(t)$:

$$r_q(t) = h(t) * a\,\delta(t - 2|OQ|) = a\,h(t - 2|OQ|).$$

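Next, the expression for $r_q(t)$ may be used to model the response of scene 72 in illumination to a fully transparent SLM pattern (see Fig. 5). The signal $r(t)$ obtained in this case is the total light incident at photodetector 78 from all possible positions of Q on the rectangular facet:

$$r(t) = a \int_{\phi_1}^{\phi_2} \int_{\theta_1}^{\theta_2} h\big(t - 2|OQ(\phi,\theta)|\big)\, d\theta\, d\phi, \qquad (1)$$

presuming a linear detector response. From Fig. 4, it should be noted that

$$|OQ(\phi,\theta)| = d_\perp \sqrt{\sec^2\phi + \tan^2\theta}.$$

Thus, substituting in the above equation results in:

$$r(t) = a \int_{\phi_1}^{\phi_2} \int_{\theta_1}^{\theta_2} h\Big(t - 2 d_\perp \sqrt{\sec^2\phi + \tan^2\theta}\Big)\, d\theta\, d\phi = a \int_{0}^{\Delta\phi} \int_{0}^{\Delta\theta} h\Big(t - 2 d_\perp \sqrt{\sec^2(\phi_1 + \phi) + \tan^2(\theta_1 + \theta)}\Big)\, d\theta\, d\phi,$$

where the equality follows from a change of variables $\phi \leftarrow (\phi - \phi_1)$ and $\theta \leftarrow (\theta - \theta_1)$. Since $\theta \in [0, \Delta\theta]$ and $\phi \in [0, \Delta\phi]$ are small angles, $\sqrt{\sec^2(\phi_1 + \phi) + \tan^2(\theta_1 + \theta)}$ is approximated well using a first-order expansion:

$$\sqrt{\sec^2(\phi_1 + \phi) + \tan^2(\theta_1 + \theta)} \approx \sqrt{\sec^2\phi_1 + \tan^2\theta_1} + \frac{(\tan\phi_1 \sec^2\phi_1)\,\phi + (\tan\theta_1 \sec^2\theta_1)\,\theta}{\sqrt{\sec^2\phi_1 + \tan^2\theta_1}}.$$

For notational simplicity, let $\gamma(\phi_1, \theta_1) = \sqrt{\sec^2\phi_1 + \tan^2\theta_1}$. The above equation for $r(t)$ may thus be approximated by:

$$r(t) = a \int_{0}^{\Delta\phi} \int_{0}^{\Delta\theta} h\left(t - 2 d_\perp \left(\gamma(\phi_1,\theta_1) + \frac{(\tan\phi_1 \sec^2\phi_1)\,\phi + (\tan\theta_1 \sec^2\theta_1)\,\theta}{\gamma(\phi_1,\theta_1)}\right)\right) d\theta\, d\phi = a \int_{0}^{\Delta\phi} \int_{0}^{\Delta\theta} h\big(t - \tau(\phi,\theta)\big)\, d\theta\, d\phi,$$

where

$$\tau(\phi,\theta) = 2 d_\perp \gamma(\phi_1,\theta_1) + \frac{2 d_\perp}{\gamma(\phi_1,\theta_1)} \Big( (\tan\phi_1 \sec^2\phi_1)\,\phi + (\tan\theta_1 \sec^2\theta_1)\,\theta \Big). \qquad (2)$$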
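The quality of the first-order expansion can be checked numerically. The sketch below assumes the distance factor has the form $\sqrt{\sec^2\phi + \tan^2\theta}$ used in the expansion; the function names and the sample angles are hypothetical and chosen only for illustration.

```python
import math


def exact_gamma(phi, theta):
    """sqrt(sec^2(phi) + tan^2(theta)), the assumed distance factor."""
    return math.sqrt(1.0 / math.cos(phi) ** 2 + math.tan(theta) ** 2)


def first_order_gamma(phi1, theta1, dphi, dtheta):
    """First-order expansion of exact_gamma around (phi1, theta1)."""
    g = exact_gamma(phi1, theta1)
    slope_phi = math.tan(phi1) / math.cos(phi1) ** 2        # tan(phi1) sec^2(phi1)
    slope_theta = math.tan(theta1) / math.cos(theta1) ** 2  # tan(theta1) sec^2(theta1)
    return g + (slope_phi * dphi + slope_theta * dtheta) / g


phi1, theta1, dphi, dtheta = 0.3, 0.2, 0.01, 0.01   # radians, hypothetical values
exact = exact_gamma(phi1 + dphi, theta1 + dtheta)
approx = first_order_gamma(phi1, theta1, dphi, dtheta)
print(f"exact={exact:.6f} approx={approx:.6f} error={abs(exact - approx):.2e}")
```

For angular offsets of about 0.01 rad the residual is second order in the offsets, which is why the delay can be treated as linear in $\phi$ and $\theta$ across a small facet.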

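It is now noted that the range profile of a scene comprising a single rectangular plane is a linear function of the depth values. More generally, the range profile of a natural scene is a smoothly varying function of depth with only a finite number of discontinuities. This is the central observation that allows the returned signal (the scene range profile) to be modeled using a parametric signal processing framework and recover the scene depth variations using the proposed acquisition setup. Again, for notational simplicity, let:

$$T_0 = 2 d_\perp \gamma(\phi_1,\theta_1), \qquad T_\phi = \frac{2 d_\perp}{\gamma(\phi_1,\theta_1)} \tan\phi_1 \sec^2\phi_1, \qquad T_\theta = \frac{2 d_\perp}{\gamma(\phi_1,\theta_1)} \tan\theta_1 \sec^2\theta_1.$$

Note that $T_0 > 0$ for all values of $\phi_1$ and $\theta_1$, but $T_\phi$ and $T_\theta$ may be negative or positive. With this notation and a change of variables, $\tau_1 \leftarrow T_\phi \phi$ and $\tau_2 \leftarrow T_\theta \theta$, the following is obtained:

$$r(t) = a \int_{0}^{\Delta\phi} \int_{0}^{\Delta\theta} h(t - T_0 - T_\phi \phi - T_\theta \theta)\, d\theta\, d\phi = \frac{a}{T_\phi T_\theta} \int_{0}^{T_\phi \Delta\phi} \int_{0}^{T_\theta \Delta\theta} h(t - T_0 - \tau_1 - \tau_2)\, d\tau_2\, d\tau_1 = \frac{a}{T_\phi T_\theta}\, h(t) * \delta(t - T_0) * B(t, T_\phi \Delta\phi) * B(t, T_\theta \Delta\theta),$$

where $B(t, T)$ is the box function with width $|T|$ as shown in Fig. 5C and defined as:

$$B(t, T) = \begin{cases} 1, & \text{for } t \text{ between } 0 \text{ and } T; \\ 0, & \text{otherwise.} \end{cases}$$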

[0065] The function $B(t, T)$ is a parametric function that can be described with a small number of parameters despite its infinite Fourier bandwidth. The convolution of $B(t, T_\phi \Delta\phi)$ and $B(t, T_\theta \Delta\theta)$, delayed in time by $T_0$, is another parametric function as shown in Fig. 5C. This function will be denoted $P(t, T_0, T_\phi \Delta\phi, T_\theta \Delta\theta)$. It is piecewise linear and plays a central role in the present depth acquisition approach for piecewise-planar scenes. With this notation, the following is obtained:

$$r(t) = \frac{a}{T_\phi T_\theta}\, h(t) * P(t, T_0, T_\phi \Delta\phi, T_\theta \Delta\theta).$$

[0066] The function $P(t, T_0, T_\phi \Delta\phi, T_\theta \Delta\theta)$ is nonzero over a time interval $t \in [T_{\min}, T_{\max}]$ that is precisely the time interval in which reflected light from the points on the rectangular planar facet arrives at the detector. Also, it is noted that $T_0$ is equal to the distance between O and the lower left corner of the rectangular plane, but it may or may not be the point on the plane closest to O. With knowledge of $T_{\min}$ and $T_{\max}$, a region of certainty may be obtained in which the rectangular facet lies. This region is a spherical shell centered at O with inner and outer radii equal to $T_{\min}$ and $T_{\max}$, respectively (see Fig. 8). Within this shell, the rectangular planar facet may have many possible orientations and positions.
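To make the shape of the range profile concrete, the following sketch builds a delayed convolution of two box functions numerically. It assumes positive box widths and a uniform time grid; the function name `range_profile` and the numeric values are hypothetical. The result is a trapezoid (piecewise linear) that starts at the delay and lasts for the sum of the two widths.

```python
import numpy as np


def range_profile(t0, width_phi, width_theta, dt=0.001):
    """Sampled convolution of two boxes, delayed by t0 (positive widths assumed)."""
    box_phi = np.ones(int(round(width_phi / dt)))
    box_theta = np.ones(int(round(width_theta / dt)))
    p = np.convolve(box_phi, box_theta) * dt   # Riemann-sum approximation
    t = t0 + dt * np.arange(p.size)            # time axis starting at the delay
    return t, p


t, p = range_profile(t0=4.0, width_phi=1.0, width_theta=0.5)
print("support:", t[0], "to", round(float(t[-1]), 3), "peak:", round(float(p.max()), 3))

# Piecewise linearity: the second difference vanishes except at the kinks.
kinks = int(np.count_nonzero(np.abs(np.diff(p, 2)) > 1e-9))
print("kinks in the sampled profile:", kinks)
```

The first and last nonzero samples play the role of $T_{\min}$ and $T_{\max}$ in this toy setting.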

[0067] In the ongoing example of a scene comprising a single planar facet, we desire to estimate the function $P(t, T_0, T_\phi \Delta\phi, T_\theta \Delta\theta)$, and hence the values of $T_{\min}$ and $T_{\max}$, by processing the digital samples $r[k]$ of the function $r(t)$. The detector impulse response $h(t)$ may be modeled as a band limited lowpass filter. Thus, the general deconvolution problem of obtaining $P(t, T_0, T_\phi \Delta\phi, T_\theta \Delta\theta)$ from samples $r[k]$ is ill-posed and highly sensitive to noise. However, modeling shows that the light transport function $P(t, T_0, T_\phi \Delta\phi, T_\theta \Delta\theta)$, or the scene range profile, is a parametric signal for natural scenes. This knowledge makes the recovery of $P(t, T_0, T_\phi \Delta\phi, T_\theta \Delta\theta)$ a well posed deconvolution problem that may be solved using a variety of signal deconvolution methods including, for example, the parametric signal processing framework described in "Sampling Moments and Reconstructing Signals of Finite Rate of Innovation: Shannon Meets Strang-Fix," by Dragotti et al., IEEE Trans. Signal Process. 55, 1741-1757 (2007), which is hereby incorporated by reference in its entirety.
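The sketch below illustrates the deconvolution step on synthetic data. It deliberately swaps in a simpler technique, a Tikhonov-regularized inverse filter in the frequency domain, rather than the finite-rate-of-innovation framework cited above, and it assumes a circular convolution model, a Gaussian detector response, and a half-peak threshold for locating the support. All names and values are hypothetical.

```python
import numpy as np


def regularized_deconvolve(r, h, lam=1e-3):
    """Tikhonov-regularized inverse filter (circular convolution model)."""
    R = np.fft.fft(r)
    H = np.fft.fft(h, n=len(r))
    return np.fft.ifft(R * np.conj(H) / (np.abs(H) ** 2 + lam)).real


def estimate_support(profile, dt, rel_threshold=0.5):
    """Return (t_min, t_max) where the profile exceeds a fraction of its peak."""
    idx = np.flatnonzero(profile >= rel_threshold * profile.max())
    return idx[0] * dt, idx[-1] * dt


# Synthetic example: a box-shaped range profile blurred by a Gaussian detector.
n, dt = 256, 1.0
true_profile = np.zeros(n)
true_profile[60:100] = 1.0                      # nonzero on samples 60..99
k = np.arange(n)
h = np.exp(-0.5 * ((k - 10) / 2.0) ** 2)        # delayed Gaussian response
h /= h.sum()
r = np.fft.ifft(np.fft.fft(true_profile) * np.fft.fft(h)).real
t_min, t_max = estimate_support(regularized_deconvolve(r, h), dt)
print(t_min, t_max)
```

The regularization constant trades noise amplification against edge sharpness; a parametric method would instead fit the small number of parameters of the profile directly.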

[0068] It is important to emphasize that the analysis up to this point is independent of the tilt and orientation of the rectangular plane with respect to the global coordinate system (i.e., the tilt has not appeared in any mathematical expression). Thus, the scene range profile is the function $P(t, T_0, T_\phi \Delta\phi, T_\theta \Delta\theta)$, which describes that the light transport between the imaging device and the rectangular planar facet is independent of the orientation of the line OC. This is intuitive because all the results were derived by considering a new frame of reference involving the rectangular plane and the normal to the plane from the origin, OC. The derived parametric light signal expressions themselves do not depend on how OC is oriented with respect to the global coordinate system, but rather depend on the relative position of the plane with respect to OC. This explains why it is not possible to infer the position and orientation of the planar facet in the field of view of the system from the estimates of $P(t, T_0, T_\phi \Delta\phi, T_\theta \Delta\theta)$. Recovery of the position and orientation of a rectangular planar facet is accomplished using patterned illuminations and associated processing.

[0069] In general, the SLM pixels discretize the FOV into small squares of size $\Delta \times \Delta$. The SLM pixels and the corresponding scene points may be indexed by $(i, j)$. Since the scene is illuminated with a series of M different binary SLM patterns, an index p may be used for the illumination patterns. The full collection of binary SLM values may be denoted $\{c^p_{ij} : i = 1, \ldots, N,\ j = 1, \ldots, N,\ p = 1, \ldots, M\}$.

[0070] In the discussion that follows, D will be used to denote the depth map to be constructed. In addition, $D_{ij}$ will be used to denote the depth in the direction of illumination of SLM pixel $(i, j)$, assuming rays in that direction intersect the rectangular facet; $D_{ij}$ will be set to zero otherwise. As shown in Fig. 7A, the lower-left corner of the projection of the pixel onto the planar facet may be used. It is convenient to also define an index map, $I = \{I_{ij} : i = 1, \ldots, N,\ j = 1, \ldots, N\}$, associated with the rectangular facet, where $I_{ij} = 1$ if rays along SLM illumination pixel $(i, j)$ intersect the rectangular facet and $I_{ij} = 0$ otherwise.

[0071] If we consider the rectangular facet as being composed of smaller rectangular facets of size $\Delta \times \Delta$, then following the derivation described above, it is found that the light signal received at the detector in response to patterned, impulsive illumination of the rectangular facet is given by:

$$r^p(t) = \sum_{i=1}^{N} \sum_{j=1}^{N} c^p_{ij} I_{ij} \left( a\,h(t) * \int_0^{\Delta}\!\!\int_0^{\Delta} \delta(t - 2D_{ij} - 2x_\ell - 2y_\ell)\, dx_\ell\, dy_\ell \right) = \frac{a}{4}\, h(t) * \left( \sum_{i=1}^{N} \sum_{j=1}^{N} c^p_{ij} I_{ij} \big( \delta(t - 2D_{ij}) * B(t, \Delta) * B(t, \Delta) \big) \right).$$

Next, the signal $U^p(t)$ is defined as:

$$U^p(t) = \sum_{i=1}^{N} \sum_{j=1}^{N} c^p_{ij} I_{ij} \big( \delta(t - 2D_{ij}) * B(t, \Delta) * B(t, \Delta) \big).$$

The function $\Delta(t, \Delta) = B(t, \Delta) * B(t, \Delta)$ has a triangular shape with a base width of $2\Delta$ as shown in Fig. 7C. In practice, when the SLM has high spatial resolution then $\Delta$ is very small (i.e., $\Delta \ll W$, $\Delta \ll L$) and $\Delta(t, \Delta)$ approximates a Dirac delta function $\delta(t)$. Thus, for a high resolution SLM, the signal $U^p(t)$ is a weighted sum of uniformly-spaced impulses where the spacing between impulses is equal to $2\Delta$. Mathematically, we use $\lim_{\Delta \to 0} B(t, \Delta) * B(t, \Delta) = \lim_{\Delta \to 0} \Delta(t, \Delta) = \delta(t)$ in the above equation to obtain:

$$r^p(t) = \frac{a}{4}\, h(t) * U^p(t), \qquad U^p(t) = \sum_{i=1}^{N} \sum_{j=1}^{N} c^p_{ij} I_{ij}\, \delta(t - 2D_{ij}). \qquad (3)$$

The parametric signal $U^p(t)$ is obtained in the process of illuminating the scene with a patterned illumination and collecting light from illuminated portions of the scene ($c^p_{ij} = 1$) where the rectangular planar facet is present ($I_{ij} = 1$). In particular, for a small value of $\Delta$ and a fully transparent SLM pattern (all-ones, or $c^p_{ij} = 1$ for $i = 1, \ldots, N$, $j = 1, \ldots, N$) we have the following relation:

$$r^{\text{all-ones}}(t) = \sum_{i=1}^{N} \sum_{j=1}^{N} I_{ij} \left( a\,h(t) * \int_0^{\Delta}\!\!\int_0^{\Delta} \delta(t - 2D_{ij} - 2x_\ell - 2y_\ell)\, dx_\ell\, dy_\ell \right) = r(t),$$

which follows from the fact that the double-summation approximates the double integral in the limiting case ($\Delta \to 0$). Additionally, this equation implies that $U^{\text{all-ones}}(t) = P(t, T_0, T_\phi \Delta\phi, T_\theta \Delta\theta)$. An important observation that stems from this fact is that for any chosen illumination pattern, the signal $U^p(t)$ and the signal $P(t, T_0, T_\phi \Delta\phi, T_\theta \Delta\theta)$, which is obtained by using the all-ones or fully-transparent illumination pattern, have support in time $[T_{\min}, T_{\max}]$. To be precise, if the points on the rectangular planar facet that are closest and farthest to O are illuminated, then both $U^p(t)$ and $P(t, T_0, T_\phi \Delta\phi, T_\theta \Delta\theta)$ have exactly the same duration and time delay. In practice, the binary patterns are randomly chosen with at least half of the SLM pixels "on," so it is highly likely that at least one point near the point closest to O and at least one point near the point farthest from O are illuminated. Hence, $U^p(t)$ and $P(t, T_0, T_\phi \Delta\phi, T_\theta \Delta\theta)$ are likely to have approximately the same time support and time delay offset. Because the speed of light is normalized to unity, this implies $D_{ij} \in [T_{\min}, T_{\max}]$.

[0072] Digital samples of the received signal $r^p[k]$ allow the depth map D to be recovered. First, it is noted that the set of distance values, $\{D_{ij} : i = 1, \ldots, N,\ j = 1, \ldots, N\}$, may contain repetitions (i.e., several $(i, j)$ positions may have the same depth value $D_{ij}$). All of these points will lie on a circular arc on the rectangular facet as shown in Fig. 7A. Each $D_{ij}$ belongs to the set of equally-spaced distinct depth values $\{d_1, d_2, \ldots, d_L\}$ where:

$$L = \frac{T_{\max} - T_{\min}}{2\Delta}, \qquad d_1 = T_{\min}, \qquad d_\ell = d_1 + 2\Delta\ell, \quad \ell = 1, \ldots, L.$$

Note that the linear variation of the depths $d_1, d_2, \ldots, d_L$ is a direct consequence of Eq. (2), which states that there is a linear variation of distance from O of the closest point on the rectangular facet to the farthest. In the case of a fully transparent SLM illumination pattern discussed previously, the continuous signal $P(t, T_0, T_\phi \Delta\phi, T_\theta \Delta\theta)$ may be obtained. In the case of patterned illumination, a signal $U^p(t)$ is obtained that is a weighted sum of uniformly-spaced impulses. With this new observation, we have:

$$U^p(t) = \sum_{i=1}^{N} \sum_{j=1}^{N} c^p_{ij} I_{ij}\, \delta(t - 2D_{ij}) = \sum_{\ell=1}^{L} \left( \sum_{i=1}^{N} \sum_{j=1}^{N} c^p_{ij} I^{\ell}_{ij} \right) \delta(t - 2d_\ell), \qquad (4)$$

where we define the matrix $I^{\ell}$ as:

$$I^{\ell}_{ij} = \begin{cases} 1, & \text{if } D_{ij} = d_\ell; \\ 0, & \text{otherwise,} \end{cases}$$

so that $I_{ij} = \sum_{\ell=1}^{L} I^{\ell}_{ij}$ and $D = \sum_{\ell=1}^{L} d_\ell I^{\ell}$. With this new notation, the depth map D associated with the rectangular facet is the weighted sum of the index maps $\{I^{\ell} : \ell = 1, \ldots, L\}$ (see Fig. 8). Thus, constructing the depth map is now solved by finding the L binary-valued index maps.
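The relationship between a depth map and its index maps can be sketched in a few lines. The example depth map, the list of distinct depths, and the function name `index_maps` are hypothetical; a depth of zero marks directions that miss the facet, as in the convention described earlier.

```python
import numpy as np


def index_maps(depth_map, depths):
    """Return an (L, N, N) array of binary maps, one per distinct depth."""
    depth_map = np.asarray(depth_map, dtype=float)
    return np.stack([np.isclose(depth_map, d).astype(int) for d in depths])


# Hypothetical 3x3 depth map; 0 marks directions that miss the facet.
D = np.array([[0.0, 2.0, 2.5],
              [2.0, 2.5, 3.0],
              [2.5, 3.0, 0.0]])
depths = [2.0, 2.5, 3.0]            # the distinct values d_1 ... d_L
maps = index_maps(D, depths)

# The depth map is the depth-weighted sum of the index maps.
rebuilt = np.tensordot(depths, maps, axes=1)
print(np.array_equal(rebuilt, D))
```

Each pixel belongs to at most one index map, so recovering the L binary maps is equivalent to recovering the depth map.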

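[0073] Taking the Fourier transform $\mathcal{F}\{\cdot\}$ of the signals on both sides of Eq. (4) gives

$$\mathcal{F}\{U^p(t)\}(\omega) = \sum_{\ell=1}^{L} \left( \sum_{i=1}^{N} \sum_{j=1}^{N} c^p_{ij} I^{\ell}_{ij} \right) e^{-i \omega 2 d_\ell},$$

where $i = \sqrt{-1}$. From elementary Fourier analysis and Eq. (3) it is known that

$$\mathcal{F}\{r^p(t)\} = \frac{a}{4}\, \mathcal{F}\{h(t)\}\, \mathcal{F}\{U^p(t)\}.$$

[0074] The ADC is used to sample the signal incident on the photodetector at a sampling frequency of f samples per second. K is used to denote the total number of samples collected by the ADC. Likewise, the discrete Fourier transform (DFT) of the samples $\{r^p[k] : k = 1, \ldots, K\}$ is denoted by $\{R^p[k] : k = 1, \ldots, K\}$. Similarly, $\{H[k] : k = 1, \ldots, K\}$ is defined for the impulse response samples $\{h[k] : k = 1, \ldots, K\}$. Then, using the elementary sampling relation:

$$R^p[k] = \frac{a f}{4}\, H[k] \sum_{\ell=1}^{L} \left( \sum_{i=1}^{N} \sum_{j=1}^{N} c^p_{ij} I^{\ell}_{ij} \right) e^{-i (4\pi f d_\ell) k / K}, \qquad k = 1, \ldots, K. \qquad (5)$$

For notational simplicity, let

$$y^p_\ell = \sum_{i=1}^{N} \sum_{j=1}^{N} c^p_{ij} I^{\ell}_{ij}, \qquad \ell = 1, \ldots, L. \qquad (6)$$

The constants a and f may be computed using calibration and may be computationally compensated using normalization. Since the values $d_1, d_2, \ldots, d_L$ are known, Eq. (5) can be represented as a system of linear equations as follows:

$$\begin{bmatrix} R^p[1]/H[1] \\ \vdots \\ R^p[k]/H[k] \\ \vdots \\ R^p[K]/H[K] \end{bmatrix} = \begin{bmatrix} e^{-i(4\pi f d_1)/K} & \cdots & e^{-i(4\pi f d_L)/K} \\ \vdots & & \vdots \\ e^{-i(4\pi f d_1)k/K} & \cdots & e^{-i(4\pi f d_L)k/K} \\ \vdots & & \vdots \\ e^{-i(4\pi f d_1)} & \cdots & e^{-i(4\pi f d_L)} \end{bmatrix} \begin{bmatrix} y^p_1 \\ \vdots \\ y^p_L \end{bmatrix},$$

which can be compactly written as:

$$R^p / H = V y^p \qquad (7)$$

(where the division is elementwise). The matrix V is a Vandermonde matrix. Therefore, $K \geq L$ ensures that we can uniquely solve the linear system in Eq. (7). Furthermore, a larger value of K allows us to mitigate the effect of noise by producing least square estimates of $y^p$.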
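A least-squares solve of a Vandermonde system of this kind can be sketched as follows. The sketch assumes the matrix entries take the form $e^{-i(4\pi f d_\ell)k/K}$, which corresponds to a round-trip delay of $2d_\ell$ sampled at f samples per second, and that the constants a and f have already been normalized out. The function names and the numeric values are hypothetical.

```python
import numpy as np


def vandermonde(depths, f, K):
    """K-by-L matrix with entries exp(-1j * 4*pi*f*d_l * k / K), k = 1..K."""
    k = np.arange(1, K + 1)[:, None]
    d = np.asarray(depths, dtype=float)[None, :]
    return np.exp(-1j * 4.0 * np.pi * f * d * k / K)


def solve_pattern_weights(ratio, depths, f):
    """Least-squares estimate of y^p from the elementwise ratio R^p / H."""
    V = vandermonde(depths, f, len(ratio))
    y, *_ = np.linalg.lstsq(V, ratio, rcond=None)
    return y.real                       # the weights are real-valued counts


depths = [1.0, 2.0, 3.0, 4.0]           # hypothetical d_1 ... d_L
f, K = 1.0, 16                          # hypothetical sampling rate and length
y_true = np.array([3.0, 0.0, 5.0, 2.0])
ratio = vandermonde(depths, f, K) @ y_true      # noise-free R^p / H
print(np.round(solve_pattern_weights(ratio, depths, f), 6))
```

With K larger than L the system is overdetermined, so the same call averages out measurement noise instead of fitting it exactly.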

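[0075] Next, from Eq. (6) we see that $y^p$ can also be represented with a linear system of equations as follows:

$$\begin{bmatrix} y^p_1 \\ \vdots \\ y^p_\ell \\ \vdots \\ y^p_L \end{bmatrix} = \begin{bmatrix} I^1_{11} & \cdots & I^1_{NN} \\ \vdots & & \vdots \\ I^\ell_{11} & \cdots & I^\ell_{NN} \\ \vdots & & \vdots \\ I^L_{11} & \cdots & I^L_{NN} \end{bmatrix} \begin{bmatrix} c^p_{11} \\ \vdots \\ c^p_{NN} \end{bmatrix}. \qquad (8)$$

From the M different binary SLM illumination patterns, we get M instances of Eq. (8) that can be combined into the compact representation as follows:

$$\underbrace{y}_{L \times M} = \underbrace{[\,I^1 \cdots I^\ell \cdots I^L\,]}_{L \times N^2}\ \underbrace{C}_{N^2 \times M},$$

where each index map $I^\ell$ is arranged as a row of length $N^2$. This system of equations is under-constrained since there are $L \times N^2$ unknowns (corresponding to the unknown values of $[\,I^1 \cdots I^\ell \cdots I^L\,]$) and only $L \times M$ available transformed data observations y. Note that y is computed using a total of $K \times M$ samples of the light signals received in response to $M \ll N^2$ patterned illuminations.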
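The compact forward model can be illustrated with small arrays. In the sketch below each index map is flattened into a row and each pattern into a column, which is one possible layout and an assumption of this example; the random labels stand in for a real scene, and the name `forward_model` is hypothetical.

```python
import numpy as np


def forward_model(index_maps, patterns):
    """Compute y = I C for index maps (L, N, N) and patterns (M, N, N)."""
    L = index_maps.shape[0]
    M = patterns.shape[0]
    I_rows = index_maps.reshape(L, -1)        # L x N^2, one index map per row
    C = patterns.reshape(M, -1).T             # N^2 x M, one pattern per column
    return I_rows @ C                         # L x M observations


rng = np.random.default_rng(0)
N, L, M = 4, 3, 2
labels = rng.integers(0, L, size=(N, N))      # hypothetical depth label per pixel
maps = np.stack([(labels == l).astype(int) for l in range(L)])
pats = rng.integers(0, 2, size=(M, N, N))     # binary SLM patterns
y = forward_model(maps, pats)
print(y.shape, "observations:", y.size, "unknowns:", L * N * N)
```

Even in this toy case there are fewer observations than unknown index-map entries, which is why additional structure in the depth map is needed for recovery.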

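[0076] The goal now is to recover the depth map D, which has $N \times N$ entries. To enable depth map reconstruction even though there are much fewer observations than unknowns, the structure of scene depth may be exploited. It is known that the depth values $D_{ij}$ correspond to the distances from O to points that are constrained to lie on a rectangular facet and that the distances $D_{ij}$ are also linearly spaced between $d_1$ and $d_L$. The planar constraint and linear variation imply that the depth map D is sparse in the second-finite difference domain. By exploiting this sparsity of the depth map, it is possible to recover D from the data y by solving the following constrained $\ell_1$-regularized optimization problem:

$$\text{OPT:} \quad \underset{D}{\text{minimize}} \quad \big\| y - [\,I^1 \cdots I^\ell \cdots I^L\,]\, C \big\|_F^2 + \big\| (\Phi \otimes \Phi^T) D \big\|_1$$

$$\text{subject to} \quad \sum_{\ell=1}^{L} I^\ell_{ij} = 1 \ \text{for all } (i, j), \qquad \sum_{\ell=1}^{L} d_\ell I^\ell = D, \qquad \text{and} \quad I^\ell_{ij} \in \{0, 1\}, \quad \ell = 1, \ldots, L,\ i = 1, \ldots, N,\ j = 1, \ldots, N.$$

Here the Frobenius matrix norm squared $\|\cdot\|_F^2$ is the sum-of-squares of the entries, the matrix $\Phi$ is the second-order finite difference operator matrix:

$$\Phi = \begin{bmatrix} 1 & -2 & 1 & 0 & \cdots & 0 \\ 0 & 1 & -2 & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & 1 & -2 & 1 \end{bmatrix},$$

and $\otimes$ is the standard Kronecker product for matrices.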
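The sparsity claim can be illustrated numerically. The sketch below applies a second-order finite difference matrix separately along each axis of a square depth map, which is a simplification of the Kronecker form used in the optimization problem and an assumption of this example. The plane coefficients and the function names are hypothetical.

```python
import numpy as np


def second_difference_matrix(n):
    """(n-2)-by-n matrix whose rows are shifted copies of [1, -2, 1]."""
    phi = np.zeros((n - 2, n))
    for r in range(n - 2):
        phi[r, r:r + 3] = [1.0, -2.0, 1.0]
    return phi


def count_nonzero_second_differences(depth_map, tol=1e-9):
    """Count nonzero second differences along rows and along columns."""
    phi = second_difference_matrix(depth_map.shape[0])   # square map assumed
    down = phi @ depth_map            # differences along the first axis
    across = depth_map @ phi.T        # differences along the second axis
    return int(np.sum(np.abs(down) > tol) + np.sum(np.abs(across) > tol))


n = 8
i, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
plane = 5.0 + 0.5 * i + 0.25 * j          # hypothetical planar depth map
facet = np.zeros((n, n))
facet[2:6, 2:6] = plane[2:6, 2:6]         # planar facet on a zero background
print(count_nonzero_second_differences(plane),
      count_nonzero_second_differences(facet))
```

A depth map that is planar everywhere has no nonzero second differences, and a facet on an empty background has them only near the facet boundary.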

[0077] The optimization problem OPT has an intuitive interpretation. Our objective is to find the depth map D that is most consistent with having a piecewise-planar scene. Such scenes are characterized by D having a discrete two-dimensional Laplacian $(\Phi \otimes \Phi^T) D$ with a small number of nonzero entries (corresponding to the boundaries of the planar facets). The number of nonzero entries (the $\ell_0$ pseudonorm) is difficult to use because it is nonconvex and not robust to small perturbations, and the $\ell_1$ norm is a suitable proxy with many optimality properties. The problem OPT combines the above objective with maintaining fidelity with the measured data by keeping $\| y - [\,I^1 \cdots I^\ell \cdots I^L\,]\, C \|_F^2$ small. The constraints $I^\ell_{ij} \in \{0, 1\}$ and $\sum_{\ell=1}^{L} I^\ell_{ij} = 1$ for all $(i, j)$ are a mathematical rephrasing of the fact that each point in the depth map has a single depth value, so different depth values cannot be assigned to one position $(i, j)$. The constraint $\sum_{\ell=1}^{L} d_\ell I^\ell = D$ expresses how the depth map is constructed from the index maps.

[0078] While the optimization problem OPT already contains a convex relaxation in its use of $\|\Phi D\|_1$, it is nevertheless computationally intractable because of the integrality constraints $I^\ell_{ij} \in \{0, 1\}$. Using a further relaxation of $I^\ell_{ij} \in [0, 1]$ yields the following tractable formulation:

$$\text{R-OPT:} \quad \underset{D}{\text{minimize}} \quad \big\| y - [\,I^1 \cdots I^\ell \cdots I^L\,]\, C \big\|_F^2 + \big\| (\Phi \otimes \Phi^T) D \big\|_1$$

$$\text{subject to} \quad \sum_{\ell=1}^{L} I^\ell_{ij} = 1 \ \text{for all } (i, j), \qquad \sum_{\ell=1}^{L} d_\ell I^\ell = D, \qquad \text{and} \quad I^\ell_{ij} \in [0, 1], \quad \ell = 1, \ldots, L,\ i = 1, \ldots, N,\ j = 1, \ldots, N.$$

In at least one example implementation, the convex optimization problem R-OPT was solved using CVX, a package for specifying and solving convex programs.

[0079] In the discussion that follows, the received signal model and depth map reconstruction developed above is generalized to planar facets of any shape and scenes with multiple planar facets. The signal modeling techniques described above also apply to a planar facet with non-rectangular shape. For example, consider the illumination of a single triangular facet with a fully transparent SLM pattern, as shown in Fig. 9 (left panel). In this case, the light signal received at the detector is:

$$r(t) = a \int_{\phi_1}^{\phi_2} \int_{\theta_1(\phi)}^{\theta_2(\phi)} h\big(t - 2|OQ(\phi,\theta)|\big)\, d\theta\, d\phi.$$

Contrasting with Eq. (1), since the shape is not a rectangle, the angle $\theta$ does not vary over the entire range $[\theta_1, \theta_2]$. Instead, for a fixed value of angle $\phi$, the angle $\theta$ can only vary between some $\theta_1(\phi)$ and some $\theta_2(\phi)$. These limits of variation are determined by the shape of the object, as shown in Fig. 9 (right panel).

[0080] Since the planar facet is in the far field, the distances of plane points from O still vary linearly. As a result, $r(t)$ is still equal to the convolution of the detector impulse response with a parametric signal whose shape depends on the shape of the planar facet. For example, as shown in Fig. 9 (right panel), the profile of the signal $P(t, T_0, T_\phi \Delta\phi, T_\theta \Delta\theta)$ is triangular with jagged edges. The task of estimating the signal $P(t, T_0, T_\phi \Delta\phi, T_\theta \Delta\theta)$ corresponding to a general shape, such as a triangle, from the samples is more difficult than estimating $P(t, T_0, T_\phi \Delta\phi, T_\theta \Delta\theta)$ in the case of a rectangular facet. However, as can be seen from Fig. 9 (right panel), a good piecewise-linear fit may still be obtained using the samples $r[k]$. This piecewise-linear approximation, although not exact, suffices for the purpose of estimating the shortest and farthest distances to the points on the planar facet. Thus, it is possible to estimate the values $T_{\min}$ and $T_{\max}$ using the samples $r[k]$ without any dependence on the shape of the planar facet. Once $T_{\min}$ and $T_{\max}$ are estimated, we use the framework described previously to recover the depth map of the scene, which will also reveal the exact shape and orientation of the planar facet.

[0081] When the scene has multiple planar facets, as shown in Fig. 10A, the linearity of light transport and the linear response of the detector together imply that the detector output is the sum of the signals received from each of the individual planar facets. This holds equally well for the cases of fully-transparent and patterned SLM illumination. Fig. 10A illustrates a scene composed of two planar facets 100, 102 illuminated with a fully transparent SLM pattern. The total response is given by:

$$r(t) = r_1(t) + r_2(t) = P_1(t, T_{0,1}, T_{\phi,1} \Delta\phi_1, T_{\theta,1} \Delta\theta_1) + P_2(t, T_{0,2}, T_{\phi,2} \Delta\phi_2, T_{\theta,2} \Delta\theta_2),$$

where $r_i(t)$ and $P_i$ denote the response from planar facet i. The total response is thus a parametric signal. When points on two different planar facets are at the same distance from O (see, e.g., Fig. 10C), there is time overlap between $P_A(t, T_{0,A}, T_{\phi,A} \Delta\phi_A, T_{\theta,A} \Delta\theta_A)$ and $P_B(t, T_{0,B}, T_{\phi,B} \Delta\phi_B, T_{\theta,B} \Delta\theta_B)$ (see, e.g., Fig. 10E). In any case, the closest distance $T_{\min}$ and the farthest distance $T_{\max}$ can be estimated from $r(t)$. Thus, the framework developed previously for estimating the distance set $\{d_1, d_2, \ldots, d_L\}$ applies here as well. Note that no prior information is needed on how many planar facets are present in the scene.

[0082] Figure 10B illustrates the same scene illuminated with a patterned SLM setting. The response to pattern p may be expressed as:

$$r^p(t) = r^p_1(t) + r^p_2(t),$$

where $r^p_i(t)$ is the response from planar facet i. We can similarly write:

$$U^p(t) = U^p_1(t) + U^p_2(t).$$

Thus, the problem of depth map reconstruction in case of scenes constituted of multiple planar facets may also be solved using convex optimization techniques. Figure 10 illustrates rectangular facets that do not occlude each other, but the lack of occlusion is not a fundamental limitation. If a portion of a facet is occluded, it effectively becomes nonrectangular.

[0083] in summary, in some embodiments, the procedure for reconstructing the depth map of a natural scene is as follows:

1 , easure the digital samples of the impulse response of the photodetector {h[k] : k ~ 1, . . , ,K}. It is assumed thai the ADC samples are at least twice as fast as the bandwidth of the photodetector (Nyquist criterion).

2, illuminate the entire scene with the first light pulse which may either be unmodulated or omnidirectional, or spatially modulated using an SLM pattern; and measure the digital samples of the received signal { } k - 1, . . , ,KJ. In case the source is periodic, such as an impulse train, the received signal iff) will also be periodic and hence the samples need to be collected only in one period.

3. Process the received signal samples {r[k] : k = 1, ..., K} and the impulse response samples {h[k] : k = 1, ..., K} using signal deconvolution to estimate the scene range profile.

4. Spatially pattern the scene M = N/100 to N/10 times (1% to 10%) using pre-chosen SLM patterns, again using an impulsive light source. This spatial patterning of the scene may be done either by using an SLM to modulate the transmitted light pulses before they reach the scene, or by using an SLM to spatially modulate the light that is reflected back from the scene; both methods of spatially patterning the scene may be used simultaneously as well.

5. Record K digital time samples of the light signal received at the photodetector in response to each of the scene patterns.

6. For each pattern, compute the Fourier transformed data and process it using the range profile information from step 3.

7. Construct the matrix C from the binary SLM patterns.

8. Spatially process the transformed data from step 6 using the matrix C from step 7 to reconstruct the depth map D associated with the rectangular facet. The depth map will contain information about the position, orientation, and shape of the planar facet.
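Steps 4 and 7 above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the construction used in any particular embodiment: the function name `make_slm_patterns`, the pseudo-random 0/1 pixel values, and the choice to form C by stacking each vectorized pattern as one row are all assumptions. Only the measurement budget (1% to 10% of the pixel count) is taken from the text.

```python
import numpy as np

def make_slm_patterns(rows, cols, fraction=0.05, seed=0):
    """Generate M pseudo-random binary SLM patterns and the matrix C.

    M is `fraction` of the pixel count (the text suggests 1% to 10%).
    Assumption: row m of C is pattern m flattened in row-major order.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must lie in (0, 1]")
    rng = np.random.default_rng(seed)          # seeded for repeatability
    num_pixels = rows * cols
    num_patterns = max(1, int(round(fraction * num_pixels)))
    # Each pixel is independently opaque (0) or transparent (1).
    patterns = rng.integers(0, 2, size=(num_patterns, rows, cols),
                            dtype=np.uint8)
    C = patterns.reshape(num_patterns, num_pixels).astype(float)
    return patterns, C

patterns, C = make_slm_patterns(64, 64, fraction=0.05)
```

With a 64 x 64 SLM and a 5% budget this yields 205 patterns, so C has 205 rows and 4096 columns. Seeding the generator lets the same patterns be regenerated at reconstruction time instead of being stored.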

[0084] The procedure described in the preceding paragraph is extended to natural scenes that have texture and reflectance variation along with depth variation by modifying the spatial processing in step 8 to estimate scene reflectance and texture. In at least one embodiment, this is accomplished by incorporating additional variables corresponding to reflectance and texture variation in the optimization procedure used to reconstruct spatial information. In some other embodiments, scene reflectance and texture are compensated through the use of gray scale SLM patterning to illuminate darker scene regions with more light power and lighter regions with less light power. Scene depth, texture, and reflectance information may also be estimated or compensated for using additional sensor information (if available), such as a two-dimensional RGB image.

[0085] In some embodiments described above, parametric signal deconvolution is employed to process captured image data to estimate a scene impulse response. The following steps describe a signal processing procedure that may be used for estimating the scene impulse response from digital samples of the detector(s) impulse response and digital samples collected at the detector(s) output.

1. Input K digital samples from the detector, denoted by y_1, ..., y_K.

2. Compute an N-point Fourier transform (N > K) of y_1, ..., y_K. Denote the transformed data as Y_1, ..., Y_N.

3. Compute the N-point Fourier transform of the detector impulse response. Denote the transformed data as H_1, ..., H_N.

4. Choose an appropriate interpolation kernel based on the kind of scene being imaged. If the scene to be imaged is composed mostly of planar objects, then a linear interpolation kernel is appropriate. If there are curved objects in the scene, a higher-order kernel such as a spline may be used. Denote the Fourier coefficients of the interpolation kernel as G_1, ..., G_N.

5. Rescale each of the N coefficients Y_1, ..., Y_N by the corresponding values of H_1, ..., H_N and G_1, ..., G_N. Denote the rescaled data as Z_1, ..., Z_N.

6. Use Z_1, ..., Z_N to estimate the number of discontinuities (or kinks) in the scene impulse response. In at least one embodiment, this was accomplished by forming a structured Hankel matrix using Z_1, ..., Z_N followed by computing the rank of the matrix using singular value decomposition. Denote the number of discontinuities in the scene impulse response by L. Note that by the definition of matrix rank, L < N.

7. Use the computed value of L along with the rescaled data Z_1, ..., Z_N to compute the positions of discontinuities in the scene impulse response. In at least one embodiment, this may be accomplished by forming a structured Hankel matrix, H, of size (N-L) x (L+1) and computing the smallest eigenvector of H, which is taken as the coefficients of the polynomial whose roots are the estimates for the positions of kinks in the scene impulse response. Denote these L kink position estimates as d_1, ..., d_L. Other methods to estimate kink positions in the scene impulse response may alternatively be employed. These methods are based on spectral estimation techniques.

8. Once the L kink locations are identified, the amplitudes of these kinks may be estimated using the data Z_1, ..., Z_N and d_1, ..., d_L. In at least one embodiment, this may be accomplished using a fast implementation of a linear Vandermonde filtering operation. Other techniques may alternatively be used. Denote the amplitude estimates A_1, ..., A_L.

9. Use the estimates d_1, ..., d_L and A_1, ..., A_L along with the interpolation kernel G to produce an estimate of the scene impulse response.
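Steps 5 through 8 of the procedure above can be sketched in Python with NumPy as follows. This is a minimal, noise-free illustration under several assumptions that the text does not state: the rescaled coefficients are modeled as Z_n = sum over l of A_l exp(-2 pi i n d_l), with positions d_l normalized to one period; the rank is found with a relative singular-value threshold; the "smallest eigenvector" is taken to be the right singular vector for the smallest singular value; and the amplitudes are obtained by ordinary least squares on a Vandermonde matrix rather than by a fast Vandermonde filtering implementation. All function names are hypothetical.

```python
import numpy as np

def rescale(Y, H, G):
    """Step 5: divide out detector response H and kernel G (assumed nonzero)."""
    return Y / (H * G)

def hankel_matrix(Z, num_cols):
    """Hankel matrix whose rows are sliding windows of Z."""
    num_rows = len(Z) - num_cols + 1
    return np.array([Z[i:i + num_cols] for i in range(num_rows)])

def estimate_num_kinks(Z, rel_tol=1e-8):
    """Step 6: numerical rank of a Hankel matrix via singular values."""
    s = np.linalg.svd(hankel_matrix(Z, len(Z) // 2), compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))

def estimate_positions(Z, L):
    """Step 7: null vector of the (N-L) x (L+1) Hankel matrix gives the
    coefficients of a polynomial whose roots encode the kink positions."""
    _, _, Vh = np.linalg.svd(hankel_matrix(Z, L + 1))
    c = Vh[-1].conj()                    # singular vector, smallest value
    roots = np.roots(c[::-1])            # np.roots wants highest power first
    return np.sort(np.mod(-np.angle(roots) / (2 * np.pi), 1.0))

def estimate_amplitudes(Z, d):
    """Step 8: least-squares fit on the Vandermonde matrix built from d."""
    n = np.arange(len(Z))
    V = np.exp(-2j * np.pi * np.outer(n, d))
    A, *_ = np.linalg.lstsq(V, Z, rcond=None)
    return A.real

# Synthetic check with three hypothetical kinks.
N = 32
n = np.arange(N)
d_true = np.array([0.12, 0.37, 0.80])
A_true = np.array([1.0, 0.5, 2.0])
H = 1.0 / (1.0 + 0.1 * n)                # made-up detector spectrum
G = np.sinc(n / N) ** 2                  # made-up kernel spectrum
Y = (np.exp(-2j * np.pi * np.outer(n, d_true)) @ A_true) * H * G

Z = rescale(Y, H, G)
L = estimate_num_kinks(Z)
d_est = estimate_positions(Z, L)
A_est = estimate_amplitudes(Z, d_est)
```

With measurement noise, the rank threshold and the root-finding step become sensitive, which is one reason the text allows other spectral estimation techniques to be substituted for step 7.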

[0086] The depth and reflectance estimation procedures described above may be extended to scenes having curved objects or a combination of curved and planar objects by, for example, modifying item 4 in the previous paragraph to use higher-order interpolation kernels, such as splines, to estimate the scene impulse response of scenes with curved objects. Once the scene impulse response is estimated, the spatial processing procedure is used to recover the shape and positions of the curved objects and planar surfaces.
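The choice between a linear kernel and a higher-order spline kernel can be illustrated with B-splines, whose continuous Fourier transform has a simple closed form: a B-spline of degree m with unit knot spacing transforms to sinc raised to the power m + 1. The Python sketch below evaluates that expression on an N-point frequency grid. The function name, the knot spacing parameter, and the use of the continuous-time transform (which ignores aliasing from sampling the kernel) are assumptions for illustration; the text does not specify how the coefficients G_1, ..., G_N are computed.

```python
import numpy as np

def bspline_kernel_coeffs(N, degree=1, knot_spacing=1.0):
    """Fourier coefficients of a B-spline interpolation kernel.

    degree=1 is the linear (triangle) kernel; degree=3 is a cubic spline.
    Frequencies are in cycles per sample on the standard N-point FFT grid.
    """
    if degree < 0:
        raise ValueError("degree must be non-negative")
    freqs = np.fft.fftfreq(N)
    # np.sinc(x) is sin(pi x) / (pi x), the normalized sinc.
    return np.sinc(knot_spacing * freqs) ** (degree + 1)

G_linear = bspline_kernel_coeffs(64, degree=1)   # mostly planar scenes
G_cubic = bspline_kernel_coeffs(64, degree=3)    # scenes with curved objects
```

With unit knot spacing, every coefficient is nonzero on this grid, so the rescaling in step 5 of the previous paragraph remains well defined. The cubic kernel decays faster with frequency, so dividing by it amplifies high-frequency noise more than the linear kernel does.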

[0087] Having described exemplary embodiments of the invention, it will now become apparent to one of ordinary skill in the art that other embodiments incorporating their concepts may also be used. The embodiments contained herein should not be limited to disclosed embodiments but rather should be limited only by the spirit and scope of the appended claims. All publications and references cited herein are expressly incorporated herein by reference in their entirety.