PANDIAN ENRICO (IT)
US20150334302A1 | 2015-11-19
US8462206B1 | 2013-06-11
US20020159628A1 | 2002-10-31
US20030038801A1 | 2003-02-27
US20140354769A1 | 2014-12-04
US20150365636A1 | 2015-12-17
US20170193680A1 | 2017-07-06
CLAIMS
1. Video footage device, the device consisting of a set of cameras supported by a camera support; the cameras have high resolution and can be of different types, such as fish-eye, linear and depth cameras; a rotation plate is provided, which is a disk put in motion by a stepper motor and fully controlled remotely through a web interface; the rotation velocity, direction and angle are controlled parameters; the rotating plate is configured to expose the product in front of the cameras from different angles and distances; the rotating plate may consist of multiple rotating plates of different diameters stacked one over the other.
2. Video footage device according to claim 1, wherein the rotation plate may be constructed of a transparent material, suited for footage from the bottom.
3. Video footage device according to one or more of the preceding claims, wherein an illumination device is provided, composed of a set of lamps.
4. Video footage device according to one or more of the preceding claims, wherein the device is situated within a box, and the specific background of the box can be exchanged.
5. Video footage device according to claim 4, wherein the background can be of different colors and may or may not have patterns on it.
6. Video footage device according to one or more of the preceding claims, wherein the device is configured so that when it starts rotating a video stream is sent remotely, while a copy of the video is stored on the local device.
7. Video footage device according to one or more of the preceding claims, wherein, right after the cycle of video footage of a product has finished, another product can be placed to be scanned by the device.
8. A 3D model reconstruction method operating on the output videos from the device of claim 1, characterized in that the output videos are used as input for an algorithm able to create a 3D model of the product.
9. Method according to claim 8, wherein the 3D model is then used for data augmentation, in which one or more 3D models of products are processed at once by an algorithm to produce 2D images with variations in scale and/or occlusion and/or translation and/or rotation and/or illumination and/or other conditions.
10. Method according to claim 8 or 9, wherein high-resolution 2D images are generated, and they form the data set with the annotations necessary for the deep neural network.
BACKGROUND OF THE INVENTION
The present invention relates generally to deep neural networks and, more particularly, to a method and apparatus that permit automating the routine task of data set creation.
Deep learning is the application of artificial neural networks that contain more than one hidden layer. Deep learning architectures such as deep neural networks and recurrent neural networks have been applied to fields including computer vision, speech recognition, natural language processing, audio recognition, social network filtering, machine translation and bioinformatics, where they have produced results comparable to, and in some cases superior to, human experts.
The quality of the data set is essential for training deep learning networks. One of the hardest problems in deep learning has nothing to do with neural nets: it is the problem of getting the right data in the right format. Deep learning models need a good training set to work properly, and collecting and constructing the training set takes time and domain-specific knowledge of where and how to gather relevant information. The problem of data is well described by the renowned scientist Andrew Ng in the following rocket analogy:
"I think AI is akin to building a rocket ship. You need a huge engine and a lot of fuel. If you have a large engine and a tiny amount of fuel, you won't make it to orbit. If you have a tiny engine and a ton of fuel, you can't even lift off. To build a rocket you need a huge engine and a lot of fuel. The analogy to deep learning is that the rocket engine is the deep learning models and the fuel is the huge amounts of data we can feed to these algorithms." (Andrew Ng)
An incorrectly constructed data set can lead to a poorly performing network despite the model's potential. The table below summarizes some deep learning models and their data sets:

VGGNet. Used for: image recognition. Input: images. Output: 1000 categories. Data size: 2 million images with an assigned category.
Pascal VOC. Used for: image recognition and segmentation. Input: images. Output: 20 object classes. Data size: 9,993 segmented images.
COCO. Used for: image recognition and segmentation. Input: images. Output: 80 object categories. Data size: 2.5 million segmented object instances.
Quoting the article, "Microsoft COCO: Common Objects in Context"
"Segmenting 2,500,000 object instances is an extremely time consuming task requiring over 22 worker hours per 1,000 segmentations."
In recent times, therefore, there has been a pressing need to develop intelligent semi-automatic systems that aid humans in the creation of data sets suited for training deep neural networks.
SUMMARY
The present invention relates to a video footage device. The device consists of a set of cameras supported by a camera support; the cameras have high resolution and can be of different types, such as fish-eye, linear and depth cameras. A rotation plate is provided, which is a disk put in motion by a stepper motor and fully controlled remotely through a web interface; the rotation velocity, direction and angle are controlled parameters. The rotating plate is configured to expose the product in front of the cameras from different angles and distances, and may consist of multiple rotating plates of different diameters stacked one over the other.
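The remote control of the plate's velocity, direction, and angle can be illustrated with a minimal sketch. The function name and the 200 steps-per-revolution figure are assumptions for illustration; the patent does not specify the motor or interface details.

```python
STEPS_PER_REV = 200  # assumption: a 1.8-degrees-per-step motor

def plan_rotation(angle_deg: float, velocity_deg_s: float, direction: str):
    """Convert the three controlled parameters (angle, velocity, direction)
    into a pulse count, inter-pulse delay, and spin direction."""
    if direction not in ("cw", "ccw"):
        raise ValueError("direction must be 'cw' or 'ccw'")
    # Number of stepper pulses needed to cover the requested angle.
    steps = round(abs(angle_deg) / 360.0 * STEPS_PER_REV)
    # Pulses per second implied by the requested angular velocity.
    steps_per_sec = velocity_deg_s / 360.0 * STEPS_PER_REV
    delay_s = 1.0 / steps_per_sec if steps_per_sec > 0 else 0.0
    return steps, delay_s, direction
```

A web interface could simply forward the three parameters from an HTTP request to such a planner before driving the motor.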
In an embodiment, the rotation plate may be constructed of a transparent material, suited for footage from the bottom.
In an embodiment, an illumination device is provided, which is composed of a set of lamps.
In an embodiment, the device is situated within a box, and the specific background of the box can be exchanged. In an embodiment, the background can be of different colors and may or may not have patterns on it.
In an embodiment, the device is configured so that when it starts rotating a video stream is sent remotely, while a copy of the video is stored on the local device.
In an embodiment, right after the cycle of video footage of a product has finished, another product can be placed to be scanned by the device.
The invention further relates to a 3D model reconstruction method operating on the output videos from the device described above, wherein the output videos are used as input for an algorithm able to create a 3D model of the product.
In an embodiment, the 3D model is then used for data augmentation, in which one or more 3D models of products are processed at once by an algorithm to produce 2D images with variations in scale and/or occlusion and/or translation and/or rotation and/or illumination and/or other conditions.
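As a rough illustration of this augmentation step, the sketch below samples random 2D rendering conditions for a product model. The parameter names and ranges are hypothetical; the patent does not specify them.

```python
import random

def sample_augmentation(rng=random):
    """Sample one set of rendering conditions for projecting a 3D product
    model into a 2D image (hypothetical parameter ranges)."""
    return {
        "scale": rng.uniform(0.5, 1.5),
        "rotation_deg": rng.uniform(0.0, 360.0),
        "translation_px": (rng.randint(-50, 50), rng.randint(-50, 50)),
        "occlusion_frac": rng.uniform(0.0, 0.3),
        "illumination_gain": rng.uniform(0.7, 1.3),
    }

def augmentation_plan(n_images, seed=0):
    """Build a reproducible list of n_images augmentation settings."""
    rng = random.Random(seed)
    return [sample_augmentation(rng) for _ in range(n_images)]
```

Each sampled dictionary would parameterize one rendered 2D view of the reconstructed model.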
In an embodiment, high-resolution 2D images are generated, and they form the data set with the annotations necessary for training the deep neural network.
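One plausible annotation format for such a data set is the COCO-style JSON schema mentioned in the background section; the sketch below assembles a minimal file in that style. The helper name and the choice of fields are assumptions, not part of the patent.

```python
import json

def export_annotations(records, categories):
    """Assemble a minimal COCO-style annotation file from per-image
    records: each record carries image metadata plus one bounding box."""
    images, annotations = [], []
    for ann_id, r in enumerate(records, start=1):
        images.append({"id": r["image_id"], "file_name": r["file_name"],
                       "width": r["width"], "height": r["height"]})
        x, y, w, h = r["bbox"]
        annotations.append({"id": ann_id, "image_id": r["image_id"],
                            "category_id": r["category_id"],
                            "bbox": [x, y, w, h], "area": w * h,
                            "iscrowd": 0})
    cats = [{"id": i, "name": n} for i, n in enumerate(categories, start=1)]
    return json.dumps({"images": images, "annotations": annotations,
                       "categories": cats})
```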
BRIEF DESCRIPTION OF THE DRAWING
FIG. 1 shows the video footage device scheme. Two views are presented: a front view and a top view. The video footage device is composed of a rotating plate and four cameras. The cameras are placed in the vertical plane as shown in the figure.
Camera 1 (horizontal): acquires video stream in horizontal plane, 0°
Camera 2 (oblique): acquires video stream in plane rotated -45°
Camera 3 (top): acquires video stream in plane rotated -90°
Camera 4 (bottom): acquires video stream in plane rotated 90°
DESCRIPTION
Accurate three-dimensional shape reconstruction of objects using a video footage device can be achieved by using 3D imaging techniques along with the relative pose (i.e., translation and rotation) of the object. To this end, a novel device and corresponding methods are proposed to construct a data set suited for training a deep neural network.
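For a product standing on the rotating plate, the relative pose between views reduces to a rotation about the plate's axis with zero translation. A minimal sketch, assuming the z-axis is the vertical rotation axis:

```python
import numpy as np

def turntable_pose(angle_deg):
    """Relative pose of the object after the plate rotates by angle_deg:
    a rotation about the vertical z-axis and a zero translation vector."""
    t = np.deg2rad(angle_deg)
    R = np.array([[np.cos(t), -np.sin(t), 0.0],
                  [np.sin(t),  np.cos(t), 0.0],
                  [0.0,        0.0,       1.0]])
    return R, np.zeros(3)
```

A reconstruction algorithm can feed these known poses, together with the per-camera video frames, into a standard multi-view pipeline.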
The data set creation is performed in the following steps:
Step 1. Video capturing
Step 2. Video processing
Step 3. Object extraction from video
Step 4. 3D object model reconstruction
Step 5. Metadata export
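The five steps above can be sketched as a pipeline. Each step function below is a hypothetical stub standing in for the corresponding stage; only the orchestration is shown.

```python
def capture_videos(product_id):
    """Step 1: acquire one video per camera (stub returns file labels)."""
    return [f"{product_id}_cam{i}.mp4" for i in range(1, 5)]

def process_video(video):
    """Step 2: decode and normalize the video frames (stub)."""
    return {"video": video, "frames": []}

def extract_object(processed):
    """Step 3: segment the product out of the frames (stub)."""
    return {"source": processed["video"], "masks": []}

def reconstruct_3d(extractions):
    """Step 4: fuse the per-camera extractions into a 3D model (stub)."""
    return {"model": "mesh", "views": [e["source"] for e in extractions]}

def export_metadata(model):
    """Step 5: attach annotations and export the result (stub)."""
    return {"model": model, "annotations": {}}

def create_dataset_entry(product_id):
    videos = capture_videos(product_id)                  # Step 1
    processed = [process_video(v) for v in videos]       # Step 2
    extracted = [extract_object(p) for p in processed]   # Step 3
    model = reconstruct_3d(extracted)                    # Step 4
    return export_metadata(model)                        # Step 5
```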