

Title:
MULTI-CAMERA THREE-DIMENSIONAL CAPTURING AND RECONSTRUCTION SYSTEM
Document Type and Number:
WIPO Patent Application WO/2022/153207
Kind Code:
A1
Abstract:
A multi-camera three-dimensional capturing and reconstruction system comprising: a mechanical structure (10); at least five cameras (11-15) are placed on said structure (10); characterised in that said at least five cameras (11-15) are cameras each using a fisheye optic; said at least five cameras (11-15) are configured so that, compared to a forward direction of said system: a first front camera (11) is directed with the optics pointed at 0° ± 10°; a second front-right camera (12) is directed with the optics pointed at +45° ± 10°; a third front-left camera (13) is directed with the optics pointed at -45° ± 10°; a fourth rear-right camera (14) is directed with the optics pointed at 60° ± 20°; a fifth rear-left camera (15) is directed with the optics pointed at -60° ± 20°; x defines the distance between said second front-right camera (12) and said third front-left camera (13), the distance y between said third front-left camera (13) and said fifth rear-left camera (15) is equal to or greater than the distance x, with y being greater than or equal to 10 cm.

Inventors:
FASSI FRANCESCO (IT)
PERFETTI LUCA (IT)
PARRI STEFANO (IT)
Application Number:
PCT/IB2022/050253
Publication Date:
July 21, 2022
Filing Date:
January 13, 2022
Assignee:
MILANO POLITECNICO (IT)
International Classes:
G02B13/06; G01C11/02; G03B37/04; G06T7/55; H04N5/232
Domestic Patent References:
WO2018076154A12018-05-03
Foreign References:
US20170280056A12017-09-28
US20200106960A12020-04-02
Other References:
JI SHUNPING ET AL: "Panoramic SLAM from a multiple fisheye camera rig", ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING, AMSTERDAM [U.A.] : ELSEVIER, AMSTERDAM, NL, vol. 159, 29 November 2019 (2019-11-29), pages 169 - 183, XP085961777, ISSN: 0924-2716, [retrieved on 20191129], DOI: 10.1016/J.ISPRSJPRS.2019.11.014
MARIANA CAMPOS ET AL: "A Backpack-Mounted Omnidirectional Camera with Off-the-Shelf Navigation Sensors for Mobile Terrestrial Mapping: Development and Forest Application", SENSORS, vol. 18, no. 3, 9 March 2018 (2018-03-09), pages 827, XP055734705, DOI: 10.3390/s18030827
Attorney, Agent or Firm:
GATTI, Enrico et al. (IT)
CLAIMS

1. A multi-camera three-dimensional capturing and reconstruction system comprising: a mechanical structure (10); at least five cameras (11-15) are placed on said structure (10); characterised in that said at least five cameras (11-15) are cameras each using a fisheye optic; said at least five cameras (11-15) are configured so that, compared to a forward direction of said system: a first front camera (11) is directed with the optics pointed at 0° ± 10°; a second front-right camera (12) is directed with the optics pointed at +45° ± 10°; a third front-left camera (13) is directed with the optics pointed at -45° ± 10°; a fourth rear-right camera (14) is directed with the optics pointed at 60° ± 20°; a fifth rear-left camera (15) is directed with the optics pointed at -60° ± 20°; x defines the distance between said second front-right camera (12) and said third front-left camera (13), the distance y between said third front-left camera (13) and said fifth rear-left camera (15) is equal to or greater than the distance x, with y being greater than or equal to 10 cm.

2. The system according to claim 1, characterised in that said at least five cameras (11-15) are configured so that, compared to a vertical plane passing through each of said at least five cameras (11-15), they are placed at an angle of 0° ± 20°.

3. The system according to claim 1, characterised in that said second camera (12), said third camera (13), said fourth camera (14) and said fifth camera (15) lie substantially on the vertices of a square.

4. The system according to claim 3, characterised in that the distance y between said second camera (12) and said fourth camera (14), and the distance y between said third camera (13) and said fifth camera (15), are each comprised between 10 and 200 cm.

5. The system according to claim 1, characterised in that said second camera (12), said third camera (13), said fourth camera (14) and said fifth camera (15) lie substantially on the vertices of a rectangle.

6. The system according to claim 1, characterised in that said first camera (11), said second camera (12), said third camera (13), said fourth camera (14) and said fifth camera (15) lie substantially in a same horizontal plane.

7. The system according to claim 1, characterised in that said first camera (11) is raised compared to said second camera (12), said third camera (13), said fourth camera (14) and said fifth camera (15).

8. The system according to claim 1, characterised in that said at least five cameras (11-15) use an image capturing method of the global shutter type.

9. The system according to claim 1, characterised in that said first camera (11) is in a more forward position compared to said second camera (12) and to said third camera (13).

10. The system according to claim 1, characterised in that said system further comprises a transportable control centre (20) comprising a power-supply battery (22) and a computer (21).

11. The system according to claim 1, characterised in that said system is transportable.

MULTI-CAMERA THREE-DIMENSIONAL CAPTURING AND RECONSTRUCTION SYSTEM

DESCRIPTION

The present invention refers to a multi-camera three-dimensional (3D) capturing and reconstruction system.

3D capturing and reconstruction systems face a practical problem common to many organisations operating in the architectural, archaeological, geological and infrastructural fields, namely the unsustainability, in terms of time and costs, of the classic techniques for surveying environments such as tunnels, narrow staircases, attics, underground utilities, catacombs, aqueducts, etc.: all those environments characterised by structural complexity, little or no lighting, possible danger in the event of long stays and, above all, by a tunnel shape, that is, environments that are very narrow in width and small in height but very long. The market offers few if any possibilities for surveying these environments; there is a real difficulty in finding instruments that are agile enough to be moved in narrow, but also very long, environments and that are able to detect very close objects.

All measurement techniques available today, such as dynamic capture scanner systems, suffer from the problem of error propagation along the prevailing measurement direction, leading to deviations that can be significant, are difficult to predict or control, and often cannot be corrected in post-processing.

For this reason, they are not considered reliable per se in metric terms, requiring support, correction and verification measurements, which are not always possible and are very difficult.

Classic measurement techniques require capturing time and time spent on site that are often not physically or economically sustainable. The need to collect very redundant data (given the shape of the environments) also makes post-processing barely feasible.

An aim of the present invention is to provide a multi-camera 3D capturing and reconstruction system that overcomes the drawbacks of the known art.

Another aim is to provide a multi-camera 3D capturing and reconstruction system that is designed, in particular, for small and narrow environments.

A further aim is to provide a system which is easy to handle.

In accordance with the present invention, these aims, and others still, are achieved by a multi-camera three-dimensional capturing and reconstruction system comprising: a mechanical structure; at least five cameras are placed on said structure; characterised in that said at least five cameras are cameras each using a fisheye optic; said at least five cameras are configured so that, compared to a forward direction of said system: a first front camera is directed with the optics pointed at 0° ± 10°; a second front-right camera is directed with the optics pointed at +45° ± 10°; a third front-left camera is directed with the optics pointed at -45° ± 10°; a fourth rear-right camera is directed with the optics pointed at 60° ± 20°; a fifth rear-left camera is directed with the optics pointed at -60° ± 20°; x defines the distance between said second front-right camera and said third front-left camera, the distance y between said third front-left camera and said fifth rear-left camera is equal to or greater than the distance x, with y being greater than or equal to 10 cm.

Further characteristics of the invention are described in the dependent claims.

The advantages of this solution compared to solutions of the known art are various.

The system allows three-dimensional reconstruction and photographic inspection of both artificial and natural confined spaces in order to tackle the problem of survey accuracy and repeatability.

It finds its natural application in situations where a 360-degree digitisation is required, comprising high-resolution 3D geometric information and complete photographic documentation, which is increasingly common and required in the field of digital information and which opens up possibilities for exploitation in the field of modelling (Building Information Modeling - BIM), virtual reality (VR) and augmented reality (AR) experiences and "data sharing online".

The system is designed to be held and used by a single operator independently by walking through the environment/tunnel to be surveyed at the normal walking speed and allowing a complete and fully automatic capture in very short times. Alternatively, the system can be mounted on a cart.

Using the principle of image matching and the rules of digital photogrammetry, exploiting structure-from-motion algorithms and a multi-camera system specifically designed for the type of reference application, the device is intended as an alternative to modern dynamic 3D surveying systems on the market, offering high-accuracy shape measurement combined with the capture of images of superior quality: optimal characteristics for digitisation and detailed inspection of surfaces.

The device makes it possible to limit the propagation of drift error during long captures. This is due to the combined exploitation of the wide angle of view of the fisheye optics, of at least five cameras arranged so that the entire scene except for the operator is shot in its entirety at all times, of the accurate synchronisation of the photographic shots, of the accurate calibration of the fixed distances between the cameras, and of the processing of the data by means of the structure-from-motion process. All this allows the identification of a large number of key points in each direction around the structure and of a large number of constraints between the images (homologous points), both between the cameras forming part of the structure and between consecutive positions of the structure itself during movement. The wide angle of view means that the same key points can be recognised and used as constraints (homologous points) in a large number of consecutive positions of the structure before they fall outside the field of view of the cameras. The redundancy of these constraints reduces the drift error in long captures.

The device makes it possible to reduce the number of images required for 360° 3D reconstruction thanks to the wide angle of view of the fisheye optics, which allows complete coverage of the framed scene, except for the operator, to be obtained using only five cameras. Furthermore, the wide angle of view of the structure makes it possible to obtain a great redundancy of constraints (homologous points connecting subsequent positions of the structure) while allowing a large ratio between the shooting base and the shooting distance (bp/dp), namely the ratio between the distance of the barycentres of two consecutive positions of the structure and the distance between the barycentre of the structure and the photographed surface, equal to 1:1; by using straight projection optics, instead, this ratio must be lower (approximately 1:2), significantly increasing the number of images required.
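
The effect of the bp/dp ratio on the number of images can be shown with a back-of-the-envelope sketch; the tunnel length and shooting distance below are illustrative assumptions, not values from the description:

```python
import math

def stations_needed(tunnel_length_m, shooting_distance_m, bp_dp_ratio):
    """Capture positions needed along a tunnel when the baseline between
    consecutive positions is bp = bp_dp_ratio * shooting distance."""
    baseline_m = bp_dp_ratio * shooting_distance_m
    return math.ceil(tunnel_length_m / baseline_m)

# Illustrative figures: a 100 m tunnel shot at a 2 m distance.
fisheye_stations = stations_needed(100, 2.0, 1.0)      # bp/dp = 1:1
rectilinear_stations = stations_needed(100, 2.0, 0.5)  # bp/dp ~ 1:2
```

With these figures the straight-projection configuration needs twice as many capture positions and, at five images per position, twice as many images overall.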

The device makes it possible to obtain a reconstruction directly to scale without the need for additional support measurements by exploiting the fixed relative position of the calibrated cameras and the synchronisation of the cameras.

It also makes capture fast and achievable even for non-expert photogrammeters; this is also possible in a completely autonomous manner by mounting the structure on a vehicle or other mode of movement.

The device makes use of global shutter type cameras and uses fisheye lenses.

The fisheye optics, with an angle of view of 190° (in any case greater than 150°), arranged at different angles and directed outwards from the barycentre of the structure, allow a hemispherical shot of the framed scene, excluding the operator, and provide omnidirectional constraint points that make both the final reconstruction and the determination of the device's position at the moment of capture more robust; furthermore, the reduced focal length favours a wide depth of field that allows very close and very far objects to be in focus simultaneously, improving the result of the subsequent image processing.

The relative arrangement between the cameras and in particular the presence of a significant distance between the shooting centres of the cameras, in particular along the direction of pointing of the instrument, as well as the walking direction, allows the accurate triangulation of homologous points and the scaling of the resulting three-dimensional reconstruction.

The constrained, fixed and calibrated position between the cameras allows automatic and accurate dimensioning, even in the case of very large environments.

The device is also transportable and lightweight, therefore it can be used in extreme conditions.

The instrument can be scaled in size, by moving the cameras closer together or further apart to better adapt it to more or less spacious environments, and in the number of cameras, which can be increased as needed.

The instrument works autonomously and for a long time, and can therefore be easily mounted on a mobile support for even very long captures and used even by non-expert photogrammeters.

It is also possible to integrate other sensors, useful to further improve the accuracy and/or to expand the type of data acquired, such as, for example, thermal cameras.

It is also possible to process or pre-process the data in real time.

The characteristics and advantages of the present invention will be clear from the following detailed description of a practical embodiment thereof, illustrated by way of non-limiting example in the accompanying drawings, wherein:

Figure 1 shows a multi-camera 3D capturing and reconstruction system, viewed from above, in accordance with the present invention;

Figure 2 shows a multi-camera 3D capturing and reconstruction system, viewed from the side, in accordance with the present invention;

Figure 3 shows a block diagram of a control centre of a multi-camera 3D capturing and reconstruction system, in accordance with the present invention;

Figure 4 schematically shows a first arrangement of the cameras of a multi-camera 3D capturing and reconstruction system, in accordance with the present invention;

Figure 5 schematically shows a second arrangement of the cameras of a multi-camera 3D capturing and reconstruction system, in accordance with the present invention;

Figure 6 schematically shows a third arrangement of the cameras of a multi-camera 3D capturing and reconstruction system, in accordance with the present invention;

Figure 7 schematically shows a fourth arrangement of the cameras of a multi-camera 3D capturing and reconstruction system, in accordance with the present invention;

Figure 8 schematically shows a fifth camera arrangement of a multi-camera 3D capturing and reconstruction system, in accordance with the present invention.

With reference to the attached figures, a multi-camera 3D capturing and reconstruction system, in accordance with the present invention, comprises a mechanical structure 10 on which at least five cameras 11-15 are placed. On the structure 10 there is also preferably a screen 17 (replaceable with a tablet or smartphone) and preferably at least one LED lighting lamp 18. The screen and the lamp may also be placed on other structures.

Other sensors such as inertial sensors or thermal cameras (not shown) can also be installed on the structure 10.

The cameras 11-15, the screen 17, the lamps 18 and other sensors are electrically connected to a transportable control centre 20 comprising a computer 21, for management of the system, for image capturing and storage and possibly first processing of the captured images, and a power-supply battery 22.

The structure 10, in one embodiment thereof, basically consists of a rod 30 with a first camera 11 placed at its front end. A first crossbar 31, perpendicular to the rod 30, is placed in proximity of the front end of the rod 30. The front cameras 12 and 13 are placed at the ends of the first crossbar 31.

A second crossbar 32, perpendicular to the rod 30, is placed at the rear of the first crossbar 31. The rear cameras 14 and 15 are placed at the ends of the second crossbar 32.

A vertical handle 33 which allows the operator to carry the system is placed, in a barycentric position, underneath the rod 30.

In an alternative embodiment, the structure 10 comprises a metal plate shaped as a rigid fastening between all the cameras, therefore without rod and crossbars.

With respect to a horizontal axis, having the angle of 0° positioned frontally, namely in the forward direction of the cameras 11-15, the first front camera 11 is directed with the optics pointed at 0°, the second front-right camera 12 is directed with the optics pointed at +45°, the third front-left camera 13 is directed with the optics pointed at -45° (315°), the fourth rear-right camera 14 is directed with the optics pointed at 60°, and the fifth rear-left camera 15 is directed with the optics pointed at -60° (300°).

The above angles, in which the cameras 11-13 are directed, can be varied by approximately ±10° by shifting on a horizontal plane (right/left); along a vertical plane the cameras are positioned at 0°, namely in a horizontal position, and this angle can vary by approximately ±20° (up/down).

The above angles, in which the cameras 14-15 are positioned, can be varied during fastening on the structure by approximately ±20°, both along the horizontal plane (right/left) and along the vertical plane (up/down).
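
The pointing directions and tolerances described above (±10° in yaw for the front cameras 11-13, ±20° otherwise) can be summarised in a small table; this is an illustrative sketch, not part of the claimed device:

```python
# Nominal yaw (degrees, relative to the forward direction) and mounting
# tolerances per camera, as given in claim 1 and the description.
CAMERA_SPEC = {
    11: {"yaw": 0,   "yaw_tol": 10, "pitch_tol": 20},  # front
    12: {"yaw": 45,  "yaw_tol": 10, "pitch_tol": 20},  # front-right
    13: {"yaw": -45, "yaw_tol": 10, "pitch_tol": 20},  # front-left
    14: {"yaw": 60,  "yaw_tol": 20, "pitch_tol": 20},  # rear-right
    15: {"yaw": -60, "yaw_tol": 20, "pitch_tol": 20},  # rear-left
}

def within_spec(cam_id, yaw_deg, pitch_deg):
    """Check a measured mounting orientation against the nominal spec."""
    spec = CAMERA_SPEC[cam_id]
    return (abs(yaw_deg - spec["yaw"]) <= spec["yaw_tol"]
            and abs(pitch_deg) <= spec["pitch_tol"])
```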

The front cameras 12 and 13 and the rear cameras 14 and 15 are aligned with each other and are located, substantially, at the vertices of a square or a rectangle.

The front camera 11 is slightly more forward compared to the front cameras 12 and 13 to avoid interference with them.

The cameras 11-15 all lie substantially on a same horizontal plane.

Preferably, the front camera 11 is raised a few centimetres to avoid framing interference with the cameras 12 and 13.

A preferred arrangement of the cameras is that with the cameras 12-15 arranged on the vertices of a square, with the front camera 11 arranged centrally and frontally.

Another arrangement provides the rear cameras 14 and 15 closer to each other at a distance less than the distance between the front cameras 12 and 13.

In this description, distance between the cameras means the distance between the sensor centres of the cameras themselves.

A further arrangement provides the front cameras 12 and 13 closer to each other at a distance less than the distance between the rear cameras 14 and 15.

A further arrangement provides the front cameras 12 and 13, and the rear cameras 14 and 15 closer to each other and arranged on the vertices of a rectangle.

A further arrangement provides the front cameras 12 and 13, and the rear cameras 14 and 15, spaced apart and arranged on the vertices of a rectangle. The distance between the cameras can be defined so that, if x is the distance between the cameras 12 and 13 (or between 14 and 15), the distance y between the cameras 13 and 15 (or between 12 and 14) is equal to or greater than the distance x, where y is greater than or equal to 10 cm; the cameras 12-15 will therefore be arranged either on a square or on a rectangle elongated in the direction of movement of the device.

The front camera 11 is always arranged centrally, namely at a distance x/2 from the camera 12 and from the camera 13.

The front camera 11 may be aligned with the cameras 12 and 13 or may be in a more forward position compared to them in order to avoid interference.

It is also possible to determine the distance y between the cameras 12 and 14 or between the cameras 13 and 15 based on the width z of the reference environment (typically a tunnel), and preferably the distance y will be equal to z/4 ± 10%.

This geometry is then effective for tunnel widths from z/2 to 4z.

Therefore, for tunnel widths between 0.4 m and 8 m, the distance y is normally comprised between 10 cm and 200 cm.
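
The sizing rule above can be sketched in a few lines; the function name is illustrative, and the clamping simply enforces the 10-200 cm range stated in the description:

```python
def crossbar_spacing_cm(tunnel_width_m):
    """Distance y between the front and rear cameras on the same side,
    following the y = z/4 rule from the description."""
    y_cm = tunnel_width_m / 4 * 100  # y = z/4, converted to cm
    # The description bounds y between 10 cm and 200 cm.
    return min(max(y_cm, 10.0), 200.0)

# A 0.4 m tunnel gives roughly 10 cm, a 2 m tunnel roughly 50 cm,
# and an 8 m tunnel the 200 cm upper bound.
```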

For each camera 11-15, a 2/3" colour camera with a resolution of 2448x2048 pixels and a detector pitch of 3.45 µm was used.

Cameras with different features in terms of resolution and radiometric capabilities can also be used.

The cameras use fisheye optics, required to have sufficient coverage between images in a single capture, a robust overlap between successive shots covering a large portion of the tunnel even at a close view, and a very broad depth of field allowing the lens focus to be locked. Fisheye optics with equidistant projection, a field of view of 190° and an image circle of 7.2 mm, and in any case with a minimum field of view of 150°, were used.

The camera orientation angles are a function of the coverage of the fisheye optics used and the fisheye projection employed.

Sensors of the global shutter type were also used, namely an image capturing method in which all the pixels that make up an image are captured at the same time.

The geometric configuration of the cameras, fixed by fastening the cameras in their position, is periodically calibrated with an accuracy of approximately 0.1 mm to ensure accurate scaling of the 3D reconstruction.

The cameras must also be synchronised; this takes place through a hardware connection between the cameras of the device. Accurate synchronisation is important to avoid the introduction of distortions due to the presence of relative movement between the cameras and the framed scene. A synchronisation cable connects the master camera 11 to the slave cameras 12-15; the connection ensures that the capturing signal towards the master camera also triggers capturing for the slave cameras.

The simultaneous presence of a synchronisation error and of a relative shift between the camera system and the framed scene causes a deformation of the geometric ratios between the cameras (the calibrated distances) that varies as a function of the type of synchronisation error, the type of shift and its direction, and the distance between the cameras and the surface being shot. The presence of this deformation prevents the imposition of the calibrated distances as a fixed constraint during processing, given that the actual position of the cameras at the moment of the shot varies as a function of movement (if one camera shoots even slightly later than another, the distance between the positions of the two at the moment of the respective shots increases as the speed of the shift increases); imposing the calibrated distances as a constraint would shift the deformation into the three-dimensional reconstruction.

The synchronisation problem was approached from the point of view of the cameras, considering the cameras as stationary and the framed scene as moving. In the presence of a synchronisation error, a point P of the scene is shot at time t0 by camera 1 and at time t1 by camera 2. The point P at time t1 is projected on the sensor of camera 2 in a position different from the one onto which it would have been projected at time t0. The distance between these two image points is z; if z is less than two pixels of the sensor of camera 2, the distortion produces no effect. The synchronisation problem was therefore tackled by defining a maximum threshold of 2 pixels for the shift error of a given object point in image space due to the synchronisation error and the relative movement between object and camera.

With a regular slow walking speed of approximately 1 m/s and a distance of the object to be detected of 50 cm, the estimated maximum acceptable synchronisation error is approximately 1 ms to satisfy the condition according to which this shift error is less than the size of two pixels.

The synchronisation error measured in one embodiment of the device was 200 µs, which allows distortion-free capturing of a surface at a distance of 50 cm up to a maximum movement speed of 5 m/s, equivalent to 18 km/h.
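
The 2-pixel criterion can be checked with a small-angle sketch. The equidistant projection model (r = f·θ) and the focal length derived from the 7.2 mm image circle over the 190° field of view are assumptions based on the optics described; the figures are indicative only:

```python
import math

PIXEL_PITCH_M = 3.45e-6      # detector pitch stated in the description
IMAGE_CIRCLE_M = 7.2e-3      # fisheye image circle
FOV_RAD = math.radians(190)  # fisheye field of view

# Equidistant projection r = f * theta: half the image circle
# corresponds to half the field of view.
FOCAL_M = (IMAGE_CIRCLE_M / 2) / (FOV_RAD / 2)

def max_sync_error_s(speed_m_s, object_distance_m, max_shift_px=2):
    """Largest synchronisation error that keeps the image-space shift of
    an object point below max_shift_px pixels (small-angle approximation)."""
    max_shift_m = max_shift_px * PIXEL_PITCH_M   # shift on the sensor
    max_angle_rad = max_shift_m / FOCAL_M        # angular shift of the point
    max_object_shift_m = max_angle_rad * object_distance_m
    return max_object_shift_m / speed_m_s

# Walking at 1 m/s past a surface 50 cm away: the result is on the
# order of 1 ms, consistent with the figure quoted above.
dt_max = max_sync_error_s(1.0, 0.5)
```

The same model shows that the measured 200 µs error leaves ample margin at 5 m/s and 50 cm.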

The images captured by the cameras are stored on the onboard computer 21. The data processing step to obtain a reconstruction in scale, following a photogrammetric pipeline (structure from motion) in which the known calibrated distances between the cameras are provided as a constraint, takes place in post-production (off-line); however, a low-resolution three-dimensional reconstruction can also be performed on board with the computer 21, in order to preview and verify the results on the screen 17.
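
The scaling constraint can be illustrated with a minimal sketch: a structure-from-motion reconstruction is defined only up to scale, and the known calibrated distance between two rig cameras fixes that scale. Names and numbers below are purely illustrative, not the actual processing pipeline:

```python
import math

def scale_factor(cam_a_xyz, cam_b_xyz, calibrated_dist_m):
    """Metric scale correction: ratio between the calibrated rig baseline
    and the same baseline measured in the arbitrary-scale reconstruction."""
    reconstructed_dist = math.dist(cam_a_xyz, cam_b_xyz)
    return calibrated_dist_m / reconstructed_dist

# Say cameras 12 and 13 come out 2.0 units apart in the reconstruction,
# but are calibrated to be 0.40 m apart on the rig.
s = scale_factor((0.0, 0.0, 0.0), (2.0, 0.0, 0.0), 0.40)

# Apply the factor to every reconstructed point to obtain metric units.
points_metric = [(s * x, s * y, s * z) for (x, y, z) in [(1.0, 2.0, 3.0)]]
```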

The instrument is carried by hand by an operator who can set the shooting parameters from the screen 17 and control the shooting of images during the survey.

Images can be shot in different modes such as single shot, timed sequence shot or video. Once the capturing has started, the operator simply walks around the scene framing the object.