

Title:
AN IMAGE PROCESSING METHOD AND SYSTEM
Document Type and Number:
WIPO Patent Application WO/2020/016612
Kind Code:
A1
Abstract:
The present invention relates to a method for processing images. The method includes: capturing one or more source data images of a first subject; capturing one or more target data images of a second subject; and processing the one or more source data images and one or more target images to swap the head of the first subject with the head of the second subject using a processing method. A system for processing images is also disclosed.

Inventors:
KONSTANTINIDIS IOANNIS (GB)
SAWAS JAMIL (GB)
LEFAKIS CHRISTOS (GB)
Application Number:
PCT/GB2019/052043
Publication Date:
January 23, 2020
Filing Date:
July 19, 2019
Assignee:
SUPERPERSONAL LTD (GB)
International Classes:
H04N5/232; G06K9/00; G06T7/174
Other References:
PABLO GARRIDO ET AL: "Automatic Face Reenactment", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 8 February 2016 (2016-02-08), XP080682208, DOI: 10.1109/CVPR.2014.537
YUEZUN LI ET AL: "In Ictu Oculi: Exposing AI Generated Fake Face Videos by Detecting Eye Blinking", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 7 June 2018 (2018-06-07), XP080888350
THIES JUSTUS ET AL: "Face2Face: Real-Time Face Capture and Reenactment of RGB Videos", 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), IEEE, 27 June 2016 (2016-06-27), pages 2387 - 2395, XP033021418, DOI: 10.1109/CVPR.2016.262
Attorney, Agent or Firm:
RATIONAL IP LIMITED (GB)
Claims

1. A method for processing images, including:

a) capturing one or more source data images of a first subject;

b) capturing one or more target data images of a second subject; and

c) processing the one or more source data images and one or more target images to swap the head of the first subject with the head of the second subject using a processing method.

2. A method as claimed in claim 1, wherein the one or more source data images form a video sequence.

3. A method as claimed in any one of the preceding claims, wherein the one or more target data images form a video sequence.

4. A method as claimed in any one of the preceding claims, wherein the first and second subjects are human beings.

5. A method as claimed in any one of the preceding claims, wherein the processing method uses a network.

6. A method as claimed in claim 5, wherein the network is a neural network or a generative adversarial network.

7. A method as claimed in claims 5 or 6, wherein the network includes an auto-encoder.

8. A method as claimed in any one of the preceding claims, further including:

preprocessing the one or more source data images using a first preprocessing method; and

preprocessing the one or more target data images using a second preprocessing method.

9. A method as claimed in claim 8, wherein the first and second preprocessing methods are the same.

10. A method as claimed in any one of claims 8 to 9, wherein the first and/or second preprocessing methods include one or more steps selected from the set of:

i) applying a face detection algorithm to each image and deleting images where the faces do not match predefined conditions;

ii) applying a motion blur detection algorithm to each image and deleting images which exceed a blurriness threshold;

iii) aligning the heads within the image to a zero roll;

iv) cropping the images to isolate the head in the image;

v) scaling the images;

vi) segmenting the images to isolate the head in the image; and

vii) colour matching the target data images to source data images and/or vice versa.

11. A method as claimed in any one of the preceding claims, wherein the one or more source data images are captured in accordance with a first protocol.

12. A method as claimed in claim 11, wherein the one or more target data images are captured in accordance with a second protocol.

13. A method as claimed in claim 12, wherein the first and second protocols are the same.

14. A method as claimed in any one of claims 12 or 13, wherein the first and/or second protocols include one or more of the following conditions:

i) the head of the subject is wholly in the frame;

ii) the background is white or a light off-white colour;

iii) the lighting conditions are uniform with no hard or directional light;

iv) the head of the subject includes side-to-side and up-and-down motion;

v) the head of the subject includes movement in rotation;

vi) the head of the subject includes movement in a spiral;

vii) the head of the subject includes a plurality of expressions including at least talking; and

viii) the head of the subject includes various lighting conditions.

15. A method as claimed in any one of the preceding claims, further including generating one or more additional target images by projecting the one or more target images onto a 3D head object, modifying attributes of the 3D object, and capturing the one or more additional target images at differing attributes.

16. A method as claimed in claim 15, wherein the attributes include one or more selected from the set of rotation and lighting conditions.

17. A method as claimed in any one of the preceding claims when dependent on claim 5, wherein the network is trained, at least in part, using the one or more source images and/or the one or more target images.

18. A method as claimed in claim 17, wherein, prior to training with the one or more target images, the network is selected from a plurality of networks pre-trained with a different set of images.

19. A method as claimed in claim 18, wherein the set of images differ by facial details, facial structures, skin colour, hair colour, and/or hair style.

20. A method as claimed in any one of claims 18 to 19, wherein the network is selected based upon the similarity of the one or more target images to the set of images which are used to pre-train the network.

21. A method as claimed in any one of the preceding claims when dependent on claim 17, wherein the processing method includes applying the trained network to the one or more source images to generate one or more headshot images.

22. A method as claimed in claim 21 when dependent on claim 10, wherein the processing method includes compositing the one or more headshot images with the one or more source images using properties determined during the first and/or second preprocessing methods.

23. A method as claimed in claim 22, wherein the properties include position and/or roll.

24. A system for processing images, including:

A source image capture system configured to capture one or more source data images of a first subject;

A target image capture system configured to capture one or more target data images of a second subject; and

A processor configured to process the one or more source data images and one or more target images to swap the head of the first subject with the head of the second subject in accordance with a processing method.

25. Software configured for performing the method of any one of claims 1 to 23.

26. An electronically readable medium configured to store the software of claim 25.

Description:
An Image Processing Method and System

Field of Invention

The present invention is in the field of image processing. More particularly, but not exclusively, the present invention relates to processing images to combine information between source and target images.

Background

In certain applications, it can be useful to process photographic/video images from different sources to generate composite images.

Traditionally, this process involves user manipulation of image processing software to identify elements from the images to generate the composite image. Clearly, this process does not scale to enable fast processing of large volumes of images to generate composite images. This limits the application of image compositing.

There is a desire to apply image compositing to different fields where responsiveness and scalability of the process is desirable. For example, it would be desirable if images of a user could be used to customise advertising or product visualisations to enhance purchase decisions.

It is an object of the present invention to provide an image processing method and system which overcomes the disadvantages of the prior art, or at least provides a useful alternative.

Summary of Invention

According to a first aspect of the invention there is provided a method for processing images, including: a) capturing one or more source data images of a first subject;

b) capturing one or more target data images of a second subject; and

c) processing the one or more source data images and one or more target images to swap the head of the first subject with the head of the second subject using a processing method.

According to another aspect of the invention there is provided a system for processing images, including:

A source image capture system configured to capture one or more source data images of a first subject;

A target image capture system configured to capture one or more target data images of a second subject; and

A processor configured to process the one or more source data images and one or more target images to swap the head of the first subject with the head of the second subject in accordance with a processing method.

Other aspects of the invention are described within the claims.

Brief Description of the Drawings

Embodiments of the invention will now be described, by way of example only, with reference to the accompanying drawings in which:

Figure 1 : shows a block diagram illustrating a system in accordance with an embodiment of the invention;

Figure 2: shows a flow diagram illustrating a method in accordance with an embodiment of the invention;

Figure 3: shows a flow diagram illustrating an image capture method in accordance with an embodiment of the invention;

Figure 4: shows a flow diagram illustrating an image preprocessing method in accordance with an embodiment of the invention;

Figure 5: shows a flow diagram illustrating a method in accordance with an embodiment of the invention;

Figure 6: shows a diagram illustrating capture of an image in accordance with an embodiment of the invention;

Figure 7: shows a diagram illustrating a laplacian kernel for use with embodiments of the invention;

Figure 8: shows a diagram illustrating face alignment in accordance with an embodiment of the invention;

Figure 9: shows a diagram illustrating training of a neural network in accordance with an embodiment of the invention;

Figure 10: shows a diagram illustrating a convolutional filter for use with an embodiment of the invention;

Figure 11: shows a diagram illustrating an example of the result of a convolutional filter such as in Figure 10;

Figure 12: shows a diagram illustrating training of a neural network in accordance with an embodiment of the invention;

Figure 13: shows a diagram illustrating training of a neural network in accordance with an embodiment of the invention; and

Figure 14: shows a flow diagram illustrating a method in accordance with an embodiment of the invention.

Detailed Description of Preferred Embodiments

The present invention provides a method and system for processing images.

The inventors have discovered that source images and target images can be automatically processed to swap the head of a subject within the source image with the head of a subject in the target image. In this way composite images can be generated quickly and at scale to facilitate the personalisation of, for example, content where the head of a user could replace that of a model modelling a product such as clothing.

In Figure 1, a system 100 in accordance with an embodiment of the invention is shown.

A capture system 101 for source images is shown. The source capture system 101 may include one or more cameras 102, such as still or video cameras, one or more lights 103, and a backdrop 104. One or more capture systems 105 for target images are shown. Each target capture system 105 may include one or more cameras 106, such as still or video cameras, one or more lights 107, and a backdrop 108. The one or more cameras 106 may be included within a user device 109 such as a mobile computing device (e.g. a smartphone or tablet).

One or more preprocessors 110 are shown. Each preprocessor 110 may be configured to process the source and/or target images captured to clean up, align, crop, and resize the images. One or more processors 111 are shown. The processor(s) 111 may be configured to use a neural network such as an autoencoder to create an algorithmic model to subsequently generate composite images where head information between the source and target images has been swapped. In some embodiments, the autoencoder is a deep neural network and comprises an encoder and two decoders. The source images may be compressed by the encoder and decoded by the first decoder. A loss value between a source image and the decoded source image may be generated, representing the difference between the two images, which the neural network improves over time. The target images may be compressed by the encoder and decoded by the second decoder. A loss value between a target image and the decoded target image may be generated, representing the difference between the two images, which the neural network improves over time. The process of compressing and decoding the source and target images is repeated until the loss value falls below a desired threshold. An algorithmic model for the target images may then be derived from the neural network. The model may be applied to the preprocessed source images to produce final target headshot images. The final target headshot images may be aligned and superimposed over the original source images to generate the composite images.

The preprocessors 110 and/or processors 111 may be located within a server 112 or may be distributed across a plurality of servers or computing devices. In one embodiment, a preprocessor 110 for processing target images is located at the user device 109 with the one or more cameras 106 for capturing target images.

The composite images may be transmitted from the server 112 to the user device 109.

A communications system 113 as shown may be configured to facilitate transmission of image data between the capture systems 101 and 105, preprocessors 110, and processors 111. The communications system 113 may be a network or a combination of multiple networks (such as the Internet). It will be appreciated by those skilled in the art that other infrastructure combinations may be used.

Referring to Figure 2, a method 200 for processing images in accordance with embodiments of the invention will be described.

In step 201, the source image(s) are captured, for example, in accordance with a method described in relation to Figure 3.

In step 202, the source image(s) are preprocessed, for example, in accordance with a method described in relation to Figure 4.

In step 203, the target image(s) are captured, for example, in accordance with a method described in relation to Figure 3.

In step 204, the target image(s) are preprocessed, for example, in accordance with a method described in relation to Figure 4.

In step 205, a neural network is trained using the source and target image(s) to generate a unique algorithmic model for the target image(s).

The neural network may be an autoencoder. In some embodiments, the autoencoder is a deep neural network and comprises an encoder and two decoders. The source images may be compressed by the encoder and decoded by the first decoder. A loss value between a source image and the decoded source image may be generated, representing the difference between the two images, which the neural network improves over time. The target images may be compressed by the encoder and decoded by the second decoder. A loss value between a target image and the decoded target image may be generated, representing the difference between the two images, which the neural network improves over time. The process of compressing and decoding the source and target images is repeated until the loss value falls below a desired threshold. An algorithmic model for the target images may then be derived from the neural network.

In one embodiment, one of a plurality of unique algorithmic models is selected for the neural network for the target images. The models have been previously generated from different users or different subjects. The model may be selected using a facial recognition method to identify the subject within the models most similar to the subject of the target images.

In other embodiments, the neural network is a generative adversarial network instead of an autoencoder.

In step 206, the unique algorithmic model is applied to the preprocessed source image(s) to generate target headshot images.

In step 207, the target headshot images are re-aligned and superimposed over the original source image(s).

In step 208, the composite images are transmitted to a user device (e.g. 109).

Referring to Figure 3, an image capture method 300 in accordance with an embodiment of the invention will be described.

In step 301, the head of the subject is positioned to be wholly within the frame of the capture device (e.g. a camera or video camera such as 106 or 102).

In step 302, the background, for example, via the backdrop (e.g. 104 or 108), is configured to be of a white or light monochromatic colour.

In step 303, the lighting conditions are configured, for example with one or more lights (e.g. 103 or 107), to be uniform with no hard or directional light.

In step 304, the image(s) are captured. In embodiments, for target images, the subject rotates their head in a circle and/or moves their head side-to-side and/or up-and-down.

Referring to Figure 4, an image preprocessing method 400 in accordance with an embodiment of the invention will be described.

A series of images for one subject is provided as input to the image preprocessing method 400.

In some embodiments, in step 401, a face detection algorithm is applied to each image and images where the faces do not match predefined conditions are deleted.

In some embodiments, in step 402, a motion blur detection algorithm is applied to each image and images which exceed a blurriness threshold are deleted.

In some embodiments, in step 403, the heads of the subject within each image are aligned to a zero roll.

In some embodiments, in step 404, the images are cropped to isolate the subject’s head in the image.

In some embodiments, in step 405, the images are scaled.

In some embodiments, in step 406, the images are segmented to isolate the subject’s head in the image.

In some embodiments, in step 407, the series of images is colour matched to another series of images and/or vice versa. For example, target image(s) may be colour matched to source image(s).

Referring to Figures 5 to 14, embodiments of the invention will be described.

A1a. Capture Source Material for Final Content Use

Using controlled shooting conditions, the images and videos required for the final piece of content are captured. The main consideration is that the shooting conditions should be easy for the user to replicate during their recording.

In one embodiment, the source model for the user may be a CG created model of a human rather than a real human. In this case, the capturing of the images of the source model would use a virtual camera.

1. The head of the subject needs to be wholly in the frame.

2. The background needs to be of a white or a light off-white colour.

3. The lighting conditions should be uniform with no hard or directional light.

A1b. Additional Image Capturing of the Source Subject

Around a thousand video frames of the source face may be sufficient for an acceptable quality of swapping. If the capturing of A1a produces fewer images, it is preferable to record more images following the same conditions as in A1a. Shooting a 40-second video may be sufficient.

In some embodiments, fewer than 1000 frames may be used. For example, acceptable results may be possible with 32 images/frames. An acceptable result may even be possible with just one image.

Additionally, it may be important to capture the source head from many angles. A simple rotation of the head, as well as side-to-side and up-and-down motion, may be sufficient.

A2a. Source Data Clean Up

A face detection algorithm built using deep learning from an open-source library called Dlib may be used. Alternatively, another algorithm available through Dlib, called Histogram of Oriented Gradients, may be used to achieve the same results. There are several face detection solutions similar to the ones provided by Dlib (for example, commercial software like Face++), and they all analyse the image to extract three properties:

1. Detection and Location of the Face bounding box (i.e. the (x, y)-coordinates of the face in the image).

2. Key facial structures on the face region, using 68 landmarks that correspond to points on facial features (i.e. mouth, right eyebrow, left eyebrow, right eye, left eye, nose, jaw).

3. Estimates of the head’s orientation in the image (i.e. yaw, pitch, roll).

The output of the analysis in a visual format is shown in Figure 6.

The blue box provides the location of the face, while the coloured dots indicate the location of the facial landmarks outlining the contour and elements of the face.

Additionally, the algorithm provides the estimates of the head’s yaw, pitch, and roll.

Based on the face detection analysis and the facial landmark positions, images containing multiple faces, partially hidden faces (i.e., with one or more landmarks missing), and overly small faces (i.e., where the distance between the eyes is below 40 pixels) are removed. Also removed are faces that are rotated away from the camera (i.e., with a yaw greater than 45 degrees and a pitch greater than 30 degrees).
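As an illustration of how such filtering might be implemented with Dlib, the following sketch applies the single-face and minimum eye-distance checks. The function name, the model file names, and the deferral of yaw/pitch filtering to a separate head-pose estimate are assumptions for the purpose of the sketch, not part of the original disclosure.

```python
# Hypothetical sketch of the clean-up filters described above, using Dlib.
import math
import cv2
import dlib

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def keep_image(path, min_eye_distance=40):
    """Return True if the image passes the face-based clean-up filters."""
    img = cv2.imread(path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    faces = detector(gray, 1)
    if len(faces) != 1:            # reject images with zero or multiple faces
        return False

    shape = predictor(gray, faces[0])
    # Outer eye corners are landmarks 36 and 45 in the 68-point scheme.
    left, right = shape.part(36), shape.part(45)
    eye_distance = math.hypot(right.x - left.x, right.y - left.y)
    if eye_distance < min_eye_distance:   # reject overly small faces
        return False

    # Rejecting faces with yaw > 45 degrees or pitch > 30 degrees would
    # additionally require a head-pose estimate, e.g. cv2.solvePnP against a
    # generic 3D face model (not shown here).
    return True
```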

Additionally, a motion blur detection algorithm is applied that works by calculating the total variance of the Laplacian of an image: the greyscale channel of the image is taken and convolved with the 3 x 3 Laplacian kernel (as shown in Figure 7).

The variance of the response is then taken, which provides a quick and accurate method for scoring how blurry the image is. Based on this score, images that are blurry may be automatically removed.
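A minimal sketch of this blur score, assuming OpenCV is available; the threshold value is illustrative and would be tuned per dataset.

```python
# Variance-of-Laplacian blur score, as described above.
import cv2

def is_blurry(image_bgr, threshold=100.0):  # threshold is an assumed value
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    # Convolve with the 3x3 Laplacian kernel (Figure 7) and take the variance.
    score = cv2.Laplacian(gray, cv2.CV_64F).var()
    return score < threshold   # low variance of the Laplacian => blurry image
```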

A2b. Source Data Alignment

Referring to Figure 8, next all faces should be aligned to 0 degrees Roll so that there is a common orientation with the user images, which will also be aligned at 0 degrees Roll. This makes it easier for the neural network algorithm to execute the algorithmic operation. Based on the orientation results, the images are aligned (rotated) so that the resulting Roll of the face is 0 degrees. Pitch and Yaw alignment is not performed, as it is desirable for the neural network to learn the visual variation in Pitch and Yaw, whereas the Roll differences are not relevant to the final swapping.

Additionally, the original position and Roll information of the head is recorded for all the images of stage A1a. This is crucial for stage D1, when the process is inverted.
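The following sketch illustrates one way the zero-roll alignment and the recording of the original roll and position could be implemented, assuming the roll angle is estimated from the eye landmarks of the 68-point scheme; the helper name and the exact landmark choice are assumptions.

```python
# Illustrative zero-roll alignment using Dlib landmarks and OpenCV.
import math
import cv2
import numpy as np

def align_to_zero_roll(image, shape):
    """Rotate `image` so that the face roll becomes 0 degrees."""
    left_eye = np.mean([(shape.part(i).x, shape.part(i).y) for i in range(36, 42)], axis=0)
    right_eye = np.mean([(shape.part(i).x, shape.part(i).y) for i in range(42, 48)], axis=0)

    # Roll is the angle of the line joining the two eye centres.
    roll = math.degrees(math.atan2(right_eye[1] - left_eye[1],
                                   right_eye[0] - left_eye[0]))
    centre = (float((left_eye[0] + right_eye[0]) / 2.0),
              float((left_eye[1] + right_eye[1]) / 2.0))

    rotation = cv2.getRotationMatrix2D(centre, roll, 1.0)
    aligned = cv2.warpAffine(image, rotation, (image.shape[1], image.shape[0]))
    # Keep the original roll and face position so the process can be inverted
    # at the compositing stage (D1b).
    return aligned, {"roll": roll, "centre": centre}
```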

A2c. Source Data Cropping

The images are then cropped to isolate the head in the image and remove any visual information that is not needed, such as the body or clothes. It is important that the head does not reach the edge of the frame and that all visual information of the head is within the frame, otherwise the neural network tries to create the missing information, causing artifacts and smudged areas. To make sure there is enough space around the head, the image is cropped with a margin around the face equal to the distance between the two outermost horizontal landmark points (i.e. the landmarks at the top of the jawline).
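A possible implementation of this cropping rule, assuming the 68-point landmark scheme in which points 0 and 16 are the outermost jawline landmarks; the margin scaling parameter is an assumption.

```python
# Illustrative head crop with a landmark-based margin.
import math

def crop_head(image, shape, margin_scale=1.0):
    xs = [shape.part(i).x for i in range(shape.num_parts)]
    ys = [shape.part(i).y for i in range(shape.num_parts)]

    # Landmarks 0 and 16 are the outermost points of the jawline.
    p0, p16 = shape.part(0), shape.part(16)
    margin = int(margin_scale * math.hypot(p16.x - p0.x, p16.y - p0.y))

    h, w = image.shape[:2]
    x1, y1 = max(min(xs) - margin, 0), max(min(ys) - margin, 0)
    x2, y2 = min(max(xs) + margin, w), min(max(ys) + margin, h)
    return image[y1:y2, x1:x2]
```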

A2d. Source Data Resizing

The resulting images of the dataset are then scaled to 256x256 pixel images.

B1a. User/Target Image Capturing

The user records their head following the same general controlled conditions as in step A1a.

1. The head of the subject needs to be wholly in the frame.

2. The background needs to be of a white or a light monochromatic colour.

3. The lighting conditions should be uniform with no hard or directional light.

Recording a 40-second video is sufficient, but a combination of images and videos, or just videos, are also good options.

To get the best results the user needs to rotate their head in a circle as well as move it side-to-side and up-and-down.

Subsequently the user sends the recorded data to a server for processing. In some embodiments, processing may occur at a user device.

B2a. Target Data Clean Up

B2b. Target Data Alignment

B2c. Target Data Cropping

B2d. Target Data Resizing

All steps from B2a. to B2d. are identical to steps A2a. to A2d.

The results of A2 and B2 are two datasets of different head identities (source and target) but which hold the same overall properties:

1. The images of both sets are of the same size

2. The images of both sets are aligned to be 0 degrees Roll

3. The heads within the images of both sets are of the same size

4. Lighting and background conditions are similar.

C1. Neural Network Training

A neural network commonly known as an Autoencoder is employed to create an algorithmic model that will enable the swapping of the head information between source and target images.

An autoencoder learns to compress (encode) data from the input layer into a short code, and then uncompress that code (decode) into something that closely matches the original data. This forces the autoencoder to engage in dimensionality reduction.

(Alternatively a Generative Adversarial Network could be used to create similar results.)

The autoencoder uses one encoder and two decoders as shown in Figure 9.

The autoencoder is a Deep Neural Network. In particular, the Encoder is comprised of a series of layers consisting of multiple convolutional steps. Each convolutional step is applied at every location (i, j) of the input image I that completely overlaps with the convolutional filter, as shown in Figure 10.

The result of a convolution depends on the value of the convolutional filter. An example is seen in Figure 11.

Every layer of the Encoder executes a series of distinct convolutional steps, resulting in a compressed/encoded representation in the form of a code.

Decoders A and B work in a similar fashion, except that the convolutional layers work backwards to try to reconstruct the original image.
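A simplified PyTorch sketch of this architecture (one shared encoder, two decoders) is given below; the layer sizes, kernel sizes, and activations are assumptions chosen for 256x256 inputs, as the disclosure does not specify them.

```python
# Sketch of a shared-encoder, two-decoder autoencoder for 256x256 RGB images.
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    # Halves the spatial resolution at each step.
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 5, stride=2, padding=2), nn.LeakyReLU(0.1))

def deconv_block(in_ch, out_ch):
    # Doubles the spatial resolution at each step.
    return nn.Sequential(nn.ConvTranspose2d(in_ch, out_ch, 4, stride=2, padding=1), nn.LeakyReLU(0.1))

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            conv_block(3, 32), conv_block(32, 64), conv_block(64, 128), conv_block(128, 256)
        )  # 256x256x3 image -> 16x16x256 compressed code

    def forward(self, x):
        return self.net(x)

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            deconv_block(256, 128), deconv_block(128, 64), deconv_block(64, 32),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid()
        )  # code -> reconstructed 256x256x3 image

    def forward(self, code):
        return self.net(code)

encoder = Encoder()
decoder_a = Decoder()   # reconstructs the source identity (A)
decoder_b = Decoder()   # reconstructs the target identity (B)
```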

In more detail the autoencoder process happens in the following manner:

1. Referring to Figure 12, first a source image (A1) passes through the Encoder to create a compressed representation. The compressed representation is then decoded, with Decoder A, to produce a reconstructed representation of the source data (A1r). A1r is compared to A1 and a loss value is generated representing the difference between the two images. The task of the neural network is to lower this loss value as much as possible. As more images pass through the system, both the Encoder and Decoder A improve, lowering the loss value.

2. Referring to Figure 13, after the network has been trained for the A1-to-A1r function, this process is repeated for B1-to-B1r while retaining the trained properties of the Decoder. As the B1 images pass through the Encoder and are then decoded with Decoder B to produce B1r, a new loss value is generated which the system tries to improve over time.

3. Steps 1 and 2 are then repeated several times, progressively improving the Encoder as well as Decoders A and B.

4. The process stops automatically when the loss values of A1-to-A1r and B1-to-B1r have reached the desired level.
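Continuing the architecture sketch above, the following illustrative training loop alternates the A1-to-A1r and B1-to-B1r passes and minimises both reconstruction losses until they reach a chosen level; the optimiser, loss function, learning rate, and stopping threshold are assumptions.

```python
# Illustrative training loop; encoder, decoder_a and decoder_b are the modules
# from the architecture sketch above.
import itertools
import torch
import torch.nn.functional as F

params = itertools.chain(encoder.parameters(), decoder_a.parameters(), decoder_b.parameters())
optimizer = torch.optim.Adam(params, lr=5e-5)

def training_step(batch_a, batch_b):
    # Step 1: A1 -> code -> A1r, compared against A1.
    loss_a = F.l1_loss(decoder_a(encoder(batch_a)), batch_a)
    # Step 2: B1 -> code -> B1r, compared against B1.
    loss_b = F.l1_loss(decoder_b(encoder(batch_b)), batch_b)

    optimizer.zero_grad()
    (loss_a + loss_b).backward()
    optimizer.step()
    return loss_a.item(), loss_b.item()

# Steps 1 and 2 are repeated until both losses reach the desired level, e.g.:
# while max(training_step(next(loader_a), next(loader_b))) > 0.02:
#     pass
```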

C2. Unique Algorithmic Model (UAM)

From step C1, a Unique Algorithmic Model is derived, which is in essence the resulting B1-to-B1r.

The A1-to-A1r part of the algorithm is not needed for the next steps and is discarded from the model.

C3. UAM Database

In some embodiments, a large database of Unique Algorithmic Models may be created, each of which corresponds to different target facial details, facial structures, skin colour, hair colour, and hair style. New users are matched to the closest existing UAMs from the archive, and the training of the autoencoder begins midway through the process to create the new UAM, saving processing time and cost.

The Dlib Face Recognition algorithm, which is a common pretrained deep neural network of 19 convolutional layers, is used. Once a new user’s headshot is passed through the algorithm, the closest match from the UAM Database is obtained, together with a matching accuracy value. If the value is above the desired threshold, the matching UAM is used as a starting point for the creation of the user’s UAM.
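A hedged sketch of this matching step using Dlib's pretrained face recognition network: the new user's headshot is embedded as a 128-dimensional descriptor and compared, by Euclidean distance, against stored descriptors for the existing UAMs. The database layout, helper names, and the 0.6 distance threshold are assumptions.

```python
# Hypothetical UAM database matching with Dlib face descriptors.
import numpy as np
import dlib

detector = dlib.get_frontal_face_detector()
shape_predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")
face_encoder = dlib.face_recognition_model_v1("dlib_face_recognition_resnet_model_v1.dat")

def embed(rgb_image):
    """Return the 128-D face descriptor of the first detected face."""
    face = detector(rgb_image, 1)[0]          # assumes one face is present
    shape = shape_predictor(rgb_image, face)
    return np.array(face_encoder.compute_face_descriptor(rgb_image, shape))

def closest_uam(user_image, uam_database, max_distance=0.6):
    """uam_database: mapping of UAM id -> stored 128-D face embedding."""
    query = embed(user_image)
    best_id, best_dist = min(
        ((uam_id, np.linalg.norm(query - emb)) for uam_id, emb in uam_database.items()),
        key=lambda pair: pair[1],
    )
    # Only reuse an existing UAM as a starting point if the match is close enough.
    return best_id if best_dist < max_distance else None
```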

D1a. Application of UAM on Final Content Headshot Images

The final UAM is applied to the original headshot images captured at A1a and processed during A2. This results in the final target headshot images. These are the headshots of the final content that is going to be sent to the user.

D1b. Compositing of the Final Content Headshot Images

The generated headshot images are then re-aligned and superimposed over the original captured images (A1a) using the properties (position and roll) recorded at stage A2b. In essence, this is the inverse of the process of stage A2b, and the goal is to align the generated image to the coordinates of the source image. Finally, the edges of the superimposed generated headshots are blended with the corresponding pixels of the source content to create a unified final result. Depending on the exact look and lighting conditions of the source content, the contrast of the generated headshots may need to be increased. For this, a simple contrast-enhancing function may be used.
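One possible realisation of this compositing step is sketched below: the generated headshot is rotated back by the recorded roll, placed at the recorded position, and its edges feathered into the source frame. The feathered-mask blend and the function signature are assumptions; the disclosure only states that the edges are blended.

```python
# Illustrative compositing of a generated headshot back onto the source frame.
import cv2
import numpy as np

def composite(source_frame, headshot, roll, top_left):
    # Invert the alignment of A2b: rotate the generated headshot back by -roll.
    h, w = headshot.shape[:2]
    rotation = cv2.getRotationMatrix2D((w / 2, h / 2), -roll, 1.0)
    headshot = cv2.warpAffine(headshot, rotation, (w, h))

    # Feathered mask so the headshot edges blend into the source pixels.
    mask = np.zeros((h, w), dtype=np.float32)
    cv2.rectangle(mask, (8, 8), (w - 8, h - 8), 1.0, thickness=-1)
    mask = cv2.GaussianBlur(mask, (31, 31), 0)[..., None]

    # Assumes the headshot region lies fully inside the source frame.
    x, y = top_left
    region = source_frame[y:y + h, x:x + w].astype(np.float32)
    blended = mask * headshot.astype(np.float32) + (1.0 - mask) * region

    out = source_frame.copy()
    out[y:y + h, x:x + w] = blended.astype(np.uint8)
    return out
```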

E1. Final Content sent to User

The resulting content is sent from the server to the user. In some embodiments, where the processing occurs at the user device, this step is not necessary.

In some embodiments, the following aspects may also be useful:

F. Universal Middle Agent

Referring to Figure 14, once the UAM for a new user has been created, this UAM can be reapplied to new source content as long as the identity of the person in the source content is the same. For some applications this is not possible, and the user might need to head-swap a number of different people (e.g. source content of a fashion shoot that uses several different models). Creating UAMs for all the different source identities would be inefficient and very costly. To solve this, a Universal Middle Agent may be created. To do this, the process proceeds from A to E, but the source head is trained against the head of another person (the Universal Middle Agent). This is not the user but rather a proxy that sits between the multiple source identities and the user. Subsequently, the training process is repeated treating the Universal Middle Agent as the source content.

In this way, the data of all the users can be trained against this Universal Middle Agent rather than the actual source(s). Therefore, changing the source or having multiple different sources will not require a separate UAM for an individual user.

In some embodiments, for example when using some types of GANs, the user does not need to be trained against a model, so the creation of the Universal Middle Agent may not be necessary.

Background Reconstruction through segmentation

Some applications do not allow for controlled background conditions (i.e. a white background) for either the source or the target, for example replacing the head in a fashion photoshoot that was done outdoors.

In this case, a segmentation algorithm may be employed that automatically separates the body and head of a person from the background and reapplies them over a white background. In essence, common conditions are forced between source and target to enable accurate head swapping. The final result is therefore presented on a white background, unless the body is segmented again and the final result applied to a different background.
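Given a person mask from any such segmentation solution, forcing the common white-background condition can be as simple as the following sketch; the mask format (1 for body/head pixels, 0 for background) is an assumption.

```python
# Reapply a segmented person over a white background.
import numpy as np

def reapply_on_white(image_bgr, mask):
    mask = mask.astype(np.float32)[..., None]        # HxW -> HxWx1 for broadcasting
    white = np.full_like(image_bgr, 255)
    composited = mask * image_bgr + (1.0 - mask) * white
    return composited.astype(np.uint8)
```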

There are several commercial solutions that enable body/head segmentation, the best of which use artificial intelligence to segment either the whole body or individual parts of the body (hair, clothes, skin, etc.).

Body Database

Some applications that require very accurate solutions for personalisation have to take into account the general body size of the person and also their skin colour. For this, when recording the source content, the recording is repeated several times using individuals of different body sizes and skin colours, building a Body Database. The Universal Middle Agent process is used to create source content with different body types but the same head. This means that the user can change between body types and skin colours without the need to train a new UAM, which would be costly and inefficient.

Data Augmentation

Some applications prohibit the recording of enough data by the target. In this case, a data augmentation methodology may be implemented using a 3D face reconstruction process, where the original image of the user’s head is projected onto an approximate 3D head object. The 3D object is subsequently rotated and further images are recorded, capturing different angles of the object.

An extension of this data augmentation is the use of 3D lights to create images with additional lighting conditions. As above, the original image is projected onto the 3D object and then a new lighting condition is applied to the object, which is subsequently recorded and used as part of the data.

Potential advantages of some embodiments of the present invention include that greater realism across a greater range of subjects is possible over the prior art, and that processing of images to composite headshots is both fast and scalable.

While the present invention has been illustrated by the description of the embodiments thereof, and while the embodiments have been described in considerable detail, it is not the intention of the applicant to restrict or in any way limit the scope of the appended claims to such detail. Additional advantages and modifications will readily appear to those skilled in the art. Therefore, the invention in its broader aspects is not limited to the specific details, representative apparatus and method, and illustrative examples shown and described. Accordingly, departures may be made from such details without departure from the spirit or scope of applicant’s general inventive concept.