Title:
METHODS AND SYSTEMS FOR PROCESSING DIGITAL VIDEO FILES FOR IMAGE INSERTION INVOLVING COMPUTERIZED DETECTION OF SIMILAR BACKGROUNDS
Document Type and Number:
WIPO Patent Application WO/2017/066874
Kind Code:
A1
Abstract:
Computerized processing of frames of one or more digital video files for image insertion involves determining first Slot Data for a Placement Slot of a first frame for insertion of a first image, analyzing the first frame and a second frame to determine if the first frame and the second frame have a Similar Background, and upon detecting a Similar Background, determining, based on the first Slot Data, second Slot Data for a Placement Slot of the second frame for insertion of a second image. Further, second Object Data for insertion of the second image in the Placement Slot of the second frame may be determined based on the first Object Data, or the second image may be adjusted for insertion in the Placement Slot of the second frame based on a relationship between the first Slot Data and first Object Data.

Inventors:
DHARSSI FATEHALI (CA)
Application Number:
PCT/CA2016/051213
Publication Date:
April 27, 2017
Filing Date:
October 19, 2016
Assignee:
DHARSSI FATEHALI (CA)
International Classes:
H04N21/854; G06Q30/02; H04N21/44; H04N21/458
Foreign References:
US20140359656A12014-12-04
US20080243636A12008-10-02
US20030028432A12003-02-06
US20150071613A12015-03-12
CA2875891A12012-12-13
Attorney, Agent or Firm:
PARLEE MCLAWS LLP (CA)
Claims:
CLAIMS

The embodiments of the invention in which an exclusive property or privilege is claimed are defined as follows:

1. A method for processing a plurality of frames of at least one digital video file for image insertion, the plurality of frames comprising a first frame and a second frame, the method comprising the steps of:

(a) determining first Slot Data for a Placement Slot of the first frame for insertion of a first image;

(b) analyzing, using a computer system, the first frame and the second frame to determine if a background image in the second frame has a threshold degree of similarity with a background image in the first frame (detecting a Similar Background); and

(c) in response to detecting the Similar Background, using the computer system to take a related action comprising:

(i) determining, based on the first Slot Data, second Slot Data for a Placement Slot of the second frame for insertion of a second image.

2. The method of claim 1 wherein:

(a) the method further comprises the step of determining first Object Data for the first image; and

(b) the related action further comprises the step of:

(i) determining, based on the first Object Data, second Object Data for insertion of the second image in the Placement Slot of the second frame.

3. The method of claim 2 wherein the second Object Data is determined to be the same as the first Object Data.

4. The method of claim 2 wherein the related action further comprises providing the second Object Data to another computer system.

5. The method of claim 4 wherein the another computer system is controlled by an advertising exchange.

6. The method of claim 4 wherein providing the second Object Data is responsive to a request to view the at least one digital file comprising the second frame.

7. The method of claim 2, wherein the related action further comprises the steps of:

(i) obtaining data encoding the second image, wherein the second image is formatted to conform with the determined second Object Data; and

(ii) modifying data encoding the second frame with the data encoding the second image so as to insert the second image in the second frame.

8. The method of claim 7 wherein the data encoding the second image is different from the data encoding the first image.

9. The method of claim 1 wherein:

(a) the method further comprises the step of determining a relationship between the first Slot Data and the first Object Data for adjusting the first image for insertion in the Placement Slot of the first frame;

(b) the related action further comprises the steps of:

(i) obtaining data encoding the second image;

(ii) determining second Object Data for the second image;

(iii) adjusting the second image for insertion in the Placement Slot of the second frame based on the determined relationship, the second Slot Data and the second Object Data; and

(iv) modifying data encoding the second frame with the data encoding the adjusted second image so as to insert the adjusted second image in the second frame.

10. The method of any one of claims 1 to 9 wherein the related action further comprises the steps of:

(i) obtaining viewer data related to a viewer of the second image; and

(ii) selecting the second image based on the viewer data.

11. The method of any one of claims 1 to 9 wherein the related action further comprises the steps of:

(i) obtaining advertiser data related to an advertiser of the second image; and

(ii) selecting the second image based on the advertiser data.

12. The method of any one of claims 1 to 9 wherein the Slot Data comprises a position within the frame suitable for image insertion.

13. The method of any one of claims 1 to 9 wherein the Object Data comprises at least one of: appropriate type of image; size of image; lighting information; shadow information; and scene content.

14. The method of any one of claims 1 to 9 wherein the at least one digital video file comprises a single digital video file that comprises both the first frame and the second frame.

15. The method of any one of claims 1 to 9 wherein the at least one digital file comprises a first digital video file and a different second digital video file, wherein the first digital video file comprises the first frame, and wherein the second digital video file comprises the second frame.

16. The method of any one of claims 1 to 9 wherein the second image is an image of a product.

17. The method of any one of claims 1 to 9 wherein the second image is an image of an advertisement.

18. A system for processing a plurality of frames of at least one digital video file for image insertion, the plurality of frames comprising a first frame and a second frame, the system comprising a computing system comprising a processor and a memory comprising a non-transitory medium storing instructions executable by the processor to implement a method comprising the steps of:

(a) storing first Slot Data for a Placement Slot of the first frame for insertion of a first image;

(b) analyzing the first frame and the second frame to determine if a background image in the second frame has a threshold degree of similarity with a background image in the first frame (detecting a Similar Background); and

(c) in response to detecting the Similar Background, using the computer system to take a related action comprising:

(i) determining, based on the first Slot Data, second Slot Data for a Placement Slot of the second frame for insertion of a second image.

19. The system of claim 18 wherein:

(a) the method further comprises the step of storing first Object Data for the first image; and

(b) the related action further comprises the step of:

(i) determining, based on the first Object Data, second Object Data for insertion of the second image in the Placement Slot of the second frame.

20. The system of claim 19 wherein the second Object Data is determined to be the same as the first Object Data.

21. The system of claim 19 wherein the related action further comprises providing the second Object Data to another computer system.

22. The system of claim 21 wherein providing the second Object Data is responsive to a request to view the at least one digital file comprising the second frame.

23. The system of claim 19, wherein the related action further comprises the steps of:

(i) obtaining data encoding the second image, wherein the second image is formatted to conform with the determined second Object Data; and

(ii) modifying data encoding the second frame with the data encoding the second image so as to insert the second image in the second frame.

24. The system of claim 23 wherein the data encoding the second image is different from the data encoding the first image.

25. The system of claim 18 wherein:

(a) the method further comprises the step of determining a relationship between the first Slot Data and the first Object Data for adjusting the first image for insertion in the Placement Slot of the first frame;

(b) the related action further comprises the steps of:

(i) obtaining data encoding the second image;

(ii) determining second Object Data for the second image;

(iii) adjusting the second image for insertion in the Placement Slot of the second frame based on the determined relationship, the second Slot Data and the second Object Data; and

(iv) modifying data encoding the second frame with the data encoding the adjusted second image so as to insert the adjusted second image in the second frame.

26. The system of any one of claims 18 to 25 wherein the related action further comprises the steps of:

(i) obtaining viewer data related to a viewer of the second image; and

(ii) selecting the second image based on the viewer data.

27. The system of any one of claims 18 to 25 wherein the related action further comprises the steps of:

(i) obtaining advertiser data related to an advertiser of the second image; and

(ii) selecting the second image based on the advertiser data.

28. The system of any one of claims 18 to 25 wherein the Slot Data comprises a position within the frame suitable for image insertion.

29. The system of any one of claims 18 to 25 wherein the Object Data comprises at least one of: appropriate type of image; size of image; lighting information; shadow information; and scene content.

30. The system of any one of claims 18 to 25 wherein the at least one digital video file comprises a single digital video file that comprises both the first frame and the second frame.

31. The system of any one of claims 18 to 25 wherein the at least one digital file comprises a first digital video file and a different second digital video file, wherein the first digital video file comprises the first frame, and wherein the second digital video file comprises the second frame.

Description:
METHODS AND SYSTEMS FOR PROCESSING DIGITAL VIDEO FILES FOR IMAGE INSERTION INVOLVING COMPUTERIZED DETECTION OF SIMILAR BACKGROUNDS

FIELD OF THE INVENTION

[0001] This invention relates to methods and systems for inserting an image, such as an image of a product or an advertisement, into a digital video file, such as a digital video file downloaded via the Internet.

BACKGROUND OF THE INVENTION

[0002] Digital video and modification thereof. Digital video creation is the process of capturing moving pictures as digitally stored files. Digital video files may be recorded on non-transitory media including magnetic media or solid-state media such as found in video tape, hard disks, flash memory, or other media which can record digital data. As digital video technology has improved over the years, digital video creation has become increasingly common. Many television shows and feature films are now recorded as digital video files.

[0003] Realistic postproduction modification of digital video content is useful for changing scene content, inserting an image of a product or advertisement, or adapting content aesthetics based on viewer information or preferences. Since digital video is often distributed "online" (i.e., via a communications network such as the Internet), this modification is desirably performed when the viewer downloads the digital video file, so that the modifications can be current, as well as targeted and different for each viewer.

[0004] Online digital video and advertising. Online digital video has become a popular medium for watching video, accounting for over 8 billion views per day and growing. YouTube™, one of the largest and most popular online digital video repositories, accounts for about half of all digital video viewing on the Internet. YouTube™ and other similar platforms have progressed from distributing user-generated content to distributing "professional" digital videos of all types (e.g., sports, movies, TV programs, news programs, music, and interviews). Additionally, YouTube™ and other similar platforms have developed channels in which producers upload numerous digital videos, resulting in "channels" with regular subscribers and viewers. Such channels account for over half of the viewership on YouTube™, with the more popular channels collectively having a few billion views per month and tens of millions of regular subscribers.

[0005] Accordingly, advertisers are increasingly interested in online digital videos for delivering promotional content, and more particularly, targeted advertising based on viewer-specific information such as the viewer's demographic, purchase behavior, and attitudes. For example, an advertiser may wish to advertise Coke™ when the viewer is a Pepsi™ consumer. Advertisers are also interested in improving the relevance of advertising to the content of the online digital video. For example, an advertiser may want to know the detailed scene content layout in order to place a product in a position that makes the product noticeable and provides a positive impression of the product. As another example, to reinforce the impact of a "pre-roll" advertisement shown before a selected digital video, an advertiser may want to place a branded promotional item in appropriate scenes of the digital video itself.

[0006] Images of products or advertisements inserted into an online digital video may have to be periodically changed. Advertisements are usually made only when there is a desire to raise consumer awareness, such as during introduction of a new product, an increase in competitive activity, or a sale event. For consumer products, most advertising activity typically lasts for only one to four weeks during a seasonal event (e.g., spring holiday, peak summer weekends, Thanksgiving and Christmas, back to school, etc.). As popular online videos can be viewed over several years, image insertion of a product or advertisement would have to be repeated numerous times to maintain currency.

[0007] Viewers of online digital videos are averse to any obvious modification to the video or interference with watching the video. Obvious modifications can create a negative impression of the product being "pushed" onto the viewer, and cause viewers to skip or shut down the video. Therefore, any type of product or advertising placement in an online digital video is preferably done so that it appears as if it were part of the originally produced digital video.

[0008] Computerized object detection and editing of digital videos. Object detection in digital video may be performed with software. Three types of object detection methods are as follows. A first method, commonly referred to in the industry as "feature matching", involves searching for and identifying dominant distinctive features of a reference item, in a frame-by-frame analysis of a digital video file. Feature matching works well if the scenes have discernible and distinctive features (in terms of color, shape, or intensity gradients) that can be consistently identified over numerous frames. A second method involves placing artificial "glyph" markers or bar codes in the digital video files for post-production video analysis and/or editing. A third method involves computer learning algorithms that compare numerous images of various instances of a similar object. Learning algorithms work best if an item is structurally similar across a wide range of circumstances. These programs analyze images to make statistical inferences about the recurring characteristics of the object to identify similar objects in a frame-by-frame analysis of a digital video. For example, a human face would cause a learning algorithm to focus on the spatial consistency of the location of shadows cast by a person's eyebrows, nose, and chin.

[0009] Once a location in a digital video is identified, rendering software can be used to insert an image of a product or an advertisement (e.g., a poster, tent top, screen saver, billboard, or advertising blimp) in a digital video file. Some rendering software such as Photoshop™ (Adobe Systems) can be used to make changes frame-by-frame. Alternatively, specialized post-production software programs can be used to make changes to multiple frames at a time. Rendering software may require manual adjustments for factors such as movement of the camera or of objects in the video, lighting, reflections, shade, and blur. Recent developments in rendering software have focused on automating the handling of such factors (especially movement), and as a result have reduced the time and manual effort it takes to insert images or make changes to a video.

[0010] Despite recent developments, the process of detecting objects and rendering images in digital video files remains primarily based on statistical modelling of the various factors identified above, and is therefore subject to inaccuracies and false positives. Since statistical modelling is rarely 100% accurate, current methods continue to require manual auditing and editing. This is not practically feasible if the digital video needs to be changed numerous times to insert images of different types of products or advertising images, if an image needs to be inserted in millions of digital videos, or if the digital video needs to be changed in real-time when downloaded by a viewer in a manner that is targeted towards the viewer based on information about the viewer.

SUMMARY OF THE INVENTION

[0011] The present invention provides methods and systems for processing digital video files for image insertion. Embodiments of the invention may be used to insert an image of a product (e.g., a beverage container or a cereal box) or an advertisement (e.g., a poster on a wall, a screen saver on a television or computer screen, a printed message on a garment, a billboard on a highway, a product logo on a tent top) in a digital video file, when the digital video file is to be downloaded via a communications network such as the Internet. Other embodiments of the invention may be used to tailor digital videos to viewers for applications such as video games, or educational videos. The use of computer systems is essential to the invention so that images can be inserted in large numbers of digital video files, in real-time with the downloading of the digital video files, with limited or no manual auditing and editing.

[0012] The present invention involves processing of a plurality of frames of at least one digital video file for image insertion. The plurality of frames comprises a first frame and a second frame.

[0013] In a first aspect, the present invention comprises a method for processing the plurality of frames of the at least one digital video file for image insertion. In a second aspect, the present invention comprises a system for processing the plurality of frames of the at least one digital video file for image insertion. The system comprises a computing system comprising a processor and a memory comprising a non-transitory medium storing instructions executable by the processor to implement a method of the present invention.

[0014] The method of the present invention comprises the steps of: (a) determining and/or storing first Slot Data for a Placement Slot of the first frame for insertion of a first image; (b) analyzing, using a computer system, the first frame and the second frame to determine if a background image in the second frame has a threshold degree of similarity with a background image in the first frame (detecting a Similar Background); and (c) in response to detecting the Similar Background, using the computer system to take a related action comprising: (i) determining, based on the first Slot Data, second Slot Data for a Placement Slot of the second frame for insertion of a second image.

[0015] In embodiments of the method, the method further comprises the step of determining and/or storing first Object Data for the first image, and the related action further comprises the step of determining, based on the first Object Data, second Object Data for insertion of the second image in the Placement Slot of the second frame. The second Object Data may be determined to be the same as the first Object Data. The related action may further comprise providing the second Object Data to another computer system, which may be controlled by an advertising exchange, in response to a request to view the at least one digital file comprising the second frame. The related action may further comprise the steps of: (i) obtaining data encoding the second image, wherein the second image is formatted to conform with the determined second Object Data; and (ii) modifying data encoding the second frame with the data encoding the second image so as to insert the second image in the second frame. The data encoding the second image may be different from the data encoding the first image.

[0016] In embodiments of the method, the method further comprises the step of determining a relationship between the first Slot Data and the first Object Data for adjusting the first image for insertion in the Placement Slot of the first frame, and the related action further comprises the steps of: (i) obtaining data encoding the second image; (ii) determining second Object Data for the second image; (iii) adjusting the second image for insertion in the Placement Slot of the second frame based on the determined relationship, the second Slot Data and the second Object Data; and (iv) modifying data encoding the second frame with the data encoding the adjusted second image so as to insert the adjusted second image in the second frame.

[0017] In embodiments of the method, the related action further comprises the steps of: (i) obtaining viewer data and/or advertiser data related to a viewer and/or advertiser of the second image; and (ii) selecting the second image based on the viewer data and/or advertiser data.

[0018] In embodiments of the method, the Slot Data comprises a position within the frame suitable for image insertion.

[0019] In embodiments of the method, the Object Data comprises at least one of: an appropriate type of image; a size of image; lighting information; shadow information; and scene content.

[0020] In embodiments of the method, the at least one digital video file comprises a single digital video file that comprises both the first frame and the second frame.

[0021] In embodiments of the method, the at least one digital file comprises a first digital video file and a different second digital video file, wherein the first digital video file comprises the first frame, and wherein the second digital video file comprises the second frame.

[0022] In embodiments of the method, the second image is an image of a product, or an image of an advertisement.

BRIEF DESCRIPTION OF THE DRAWINGS

[0023] Exemplary embodiments of the present invention are described with reference to the following drawings. In the drawings, like elements are assigned like reference numerals. The drawings are not necessarily to scale, with the emphasis instead placed upon the principles of the present invention. Additionally, each of the embodiments depicted is but one of a number of possible arrangements utilizing the fundamental concepts of the present invention. The drawings are briefly described as follows:

[0024] Fig. 1 is a block diagram of an embodiment of an overall system architecture used to implement methods of the present invention; and

[0025] Figs. 2A to 2C are a flow chart of an embodiment of a method of the present invention, as implemented by the embodiment of the system architecture shown in Fig. 1.

DETAILED DESCRIPTION OF THE INVENTION

[0026] The following provides a detailed description of a non-limiting exemplary embodiment of the invention. Any term or expression not expressly defined herein shall have its commonly accepted definition understood by a person skilled in the art.

[0027] Referring to Figure 1, an embodiment of the architecture of an overall system (10) for implementing the invention includes a plurality of computer systems in communication with each other via a communications network comprising the Internet. While Figure 1 shows each computer system as a single discrete component, they may comprise a plurality of components, either physically integrated or separated, that are operatively connected to each other through one or a combination of wired or wireless communication means. A viewer's computer (12) transmits a request for a download of a digital video file to an application server (14) having access to a repository of stored digital video files (16). Upon receiving the request, the application server (14) notifies an advertising exchange server (18), which in turn transmits an invitation to bid on an advertising Placement Slot in the digital video file to at least one advertiser's computer (20). The advertiser's computer (20) transmits a bid for the advertising Placement Slot to the advertising exchange server (18). The advertising exchange server (18) accepts the bid (e.g., if it is the highest bid or satisfies some other criteria) and transmits notification of the acceptance to the application server (14). In addition, the advertising exchange server (18) may also transmit data related to the image to be inserted in the digital video file. The image may be of a product or an advertisement. In embodiments, the transmitted data may encode information that allows for rendering of the image. Alternatively, this data may encode information that allows for selection of an image to be rendered, with the actual information that allows for rendering of the image being stored in a repository of stored image files accessible to the application server (14) or an edge server (22). For example, the information may be an identifier for the advertiser with the winning bid that can be associated with an image. 
Upon receipt of the notification, the application server (14) working in conjunction with an edge server (22) inserts an image of a product or an advertisement into the digital video file, based on information provided by the advertiser (24) and, optionally, the viewer's information (26) (e.g., the viewer's behavioural data). The modified digital video file is transmitted to the viewer's computer (12) and displayed thereon. Except for requesting the download of the digital file and the display of the modified digital file on the viewer's computer, the process is imperceptible to the viewer.
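
The bid acceptance step performed by the advertising exchange server (18) can be illustrated with a minimal sketch. It assumes the simplest criterion mentioned above, namely accepting the highest bid. The names `Bid`, `advertiser_id`, `amount`, `accept_bid` and the `reserve` parameter are hypothetical and do not appear in the application; other acceptance criteria are equally possible.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Bid:
    """Hypothetical bid record; field names are illustrative assumptions."""
    advertiser_id: str  # identifier that could later be associated with an image
    amount: float       # bid amount for the advertising Placement Slot


def accept_bid(bids: Sequence[Bid], reserve: float = 0.0) -> Optional[Bid]:
    """Return the highest bid at or above the reserve, or None if none qualifies.

    The reserve is an assumption added for illustration; with the default of
    0.0 this reduces to "accept the highest bid".
    """
    eligible = [bid for bid in bids if bid.amount >= reserve]
    if not eligible:
        return None
    # On a tie, max() keeps the first bid encountered.
    return max(eligible, key=lambda bid: bid.amount)
```

In this sketch the `advertiser_id` of the accepted bid plays the role of the identifier that the application server (14) or edge server (22) could use to select an image from a repository of stored image files.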

[0028] Figures 2A to 2C outline in greater detail the processes performed by the computer systems. Referring to Figure 2A, the application server (14) processes a group of digital video files to initialize the system. A group of digital video files includes at least one digital video file and, in embodiments, may include several hundreds or thousands of digital video files. For each one of a plurality of digital video files in the group, the application server (14) opens each digital video file (step 200) (i.e., accesses it), splits it into individual frames (step 202), and stores the frames (step 204) in a memory.

[0029] The application server (14) analyzes the stored frames to detect a Similar Background amongst the stored frames (step 206) for the group of digital video files. As used herein, a "Similar Background" refers to images depicted by two or more frames of one or more digital video files that satisfy a threshold degree of similarity with each other. The result of detecting Similar Backgrounds is a record in the form of a statistical histogram of the features of the frames. In embodiments, the threshold degree of similarity may be based on statistical similarity of the images, or other metrics of the degree of similarity of the images.
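
As one possible illustration of a threshold degree of similarity, the sketch below compares two frames by the intersection of their normalized colour histograms. The application does not prescribe this metric; histogram intersection, the bin count, the 0.8 threshold and all function names are assumptions chosen for illustration. Frames are represented as flat sequences of (R, G, B) tuples.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each component in 0..255


def colour_histogram(pixels: Sequence[Pixel], bins_per_channel: int = 4) -> List[float]:
    """Return a normalized joint RGB histogram (entries sum to 1)."""
    if not pixels:
        raise ValueError("frame has no pixels")
    n = bins_per_channel
    counts = [0] * (n ** 3)
    for r, g, b in pixels:
        # value * n // 256 maps 0..255 onto bin indices 0..n-1
        index = ((r * n // 256) * n + (g * n // 256)) * n + (b * n // 256)
        counts[index] += 1
    total = len(pixels)
    return [count / total for count in counts]


def histogram_intersection(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Similarity in [0, 1]; 1 means identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def has_similar_background(frame_a: Sequence[Pixel],
                           frame_b: Sequence[Pixel],
                           threshold: float = 0.8) -> bool:
    """True if the two frames meet the (assumed) similarity threshold."""
    score = histogram_intersection(colour_histogram(frame_a),
                                   colour_histogram(frame_b))
    return score >= threshold
```

A histogram discards spatial layout, so two different scenes with similar colour distributions would also pass this test; a practical system would likely combine it with the other techniques described in this application.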

[0030] The Similar Background detection process can be performed by one or a combination of conventional computer-based object detection techniques. These techniques may include (but are not limited to) frame-by-frame matching (step 208) using frame-by-frame feature matching (step 210) and statistical analysis for prominent objects (step 212), as discussed in the Background of the Invention of this application. Under conventional methodology, a video is initially diagnosed frame-by-frame using object detection software to decipher video content. This tends to work well (i.e., with low false positives) when detecting very distinctive objects like human faces (using facial recognition software), or objects with distinctive features like cups, brand names/logos, pets, and outdoor features like trees, sky, clouds. However, even some distinctive objects can result in higher false positives. For example, it may be difficult to identify a branded product (e.g., a can of Coke™) where the brand logo is only partially visible or distorted by lighting in the video, or a cup where the handle is not fully visible, since the shape of the logo and the handle are fundamental in the mathematical analysis of the pixels to determine/find the shape of the logo or the cup. Furthermore, less distinctive items or objects like tables, chairs, TV and computer screens, carpets and wall features, having numerous variations can result in fairly high false positives. Therefore, once such objects are detected, some human checking/input may be required to identify and correct false positives or mistakes.

[0031] These techniques may also include pixel data analysis (step 214), which involves analysis of one or more pixels in each frame, in a continuous set of frames or in an intermittent set of frames (i.e., the pixel may be the same pixel location in multiple frames or multiple pixel locations in multiple frames) to determine changes in pixel factors. These pixel factors may include change in color and aspects relating to colour including intensity level and the history across frames of such change for each individual pixel. Thereafter, the result of the pixel analysis is analyzed to determine the statistical probability of a scene change between successive frames. If such probability exceeds a pre-established threshold, then a scene change is considered to have occurred between the successive frames. As an example, suppose the pixel analysis indicates that a scene change occurs between the 25th and 26th frame of a digital video file, and another scene change occurs between the 75th and 76th frame of the digital video file. In this case, a first scene change begins at the 26th frame and ends at the 75th frame, and a second scene change begins at the 76th frame. The 26th to 75th frames, inclusive, may be considered as having Similar Backgrounds. This principle of detecting frames with Similar Backgrounds may be applied to detect frames with Similar Backgrounds in frames of different digital video files, and adapted to other techniques for frame-by-frame analysis of one or more digital video files.
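
The scene change logic in the example above can be sketched as follows. This is a simplified stand-in, not the statistical probability model described in the application: it uses the mean absolute difference of pixel intensities between successive frames and compares it with a fixed threshold. The function names, the grayscale frame representation and the threshold value of 40.0 are all assumptions.

```python
from typing import List, Sequence, Tuple

Frame = Sequence[int]  # grayscale intensities (0..255); all frames the same length


def mean_abs_difference(a: Frame, b: Frame) -> float:
    """Average per-pixel intensity change between two frames."""
    if len(a) != len(b) or not a:
        raise ValueError("frames must be non-empty and the same size")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def scene_boundaries(frames: Sequence[Frame], threshold: float = 40.0) -> List[int]:
    """Return each 0-based index i where a scene change occurs between
    frames[i - 1] and frames[i]."""
    return [i for i in range(1, len(frames))
            if mean_abs_difference(frames[i - 1], frames[i]) > threshold]


def split_into_scenes(frames: Sequence[Frame],
                      threshold: float = 40.0) -> List[Tuple[int, int]]:
    """Return inclusive (first, last) 0-based index ranges, one per scene."""
    if not frames:
        return []
    starts = [0] + scene_boundaries(frames, threshold)
    ends = [start - 1 for start in starts[1:]] + [len(frames) - 1]
    return list(zip(starts, ends))
```

With 100 frames whose content changes after the 25th and 75th frames, `split_into_scenes` returns the 0-based ranges (0, 24), (25, 74) and (75, 99). The middle range corresponds to the 26th to 75th frames, which would be treated as having Similar Backgrounds.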

[0032] With this in mind, the Similar Background detection process may be facilitated by recognizing that digital videos within a group may fall into one of three categories, as follows. Category 1 includes digital videos that depict recurring settings and backgrounds that are sufficiently consistent and distinctive to be identified by object identification software. Examples include digital videos on a YouTube™ channel or Facebook™ page by the same producer, such as online video personality "Jenna Marbles" who records digital videos in her kitchen or living room. Another example is a digital video of a sitcom such as "Seinfeld" where numerous scenes take place in the Seinfeld character's kitchen or living room, or "Two and a Half Men" where numerous scenes take place in the same living room, eat-in kitchen or balcony. Feature matching and statistical methods of prominent object detection may be well suited for detection of Similar Backgrounds in frames of Category 1 digital videos, but it will be appreciated that other object detection techniques may also be used to detect Similar Backgrounds in frames of Category 1 videos.

[0033] Category 2 includes digital videos of outdoor scenes that are expected to have blue sky and white clouds or green forests. (In digital videos with outdoor scenes, it may be appropriate to insert images for advertising blimps, outdoor billboards, or products that are appropriate to place outdoors.) Since the blue colour of the sky or the green colour of the forests can be expected to fall within a certain range of tones, feature matching may be used to detect such outdoor scenes. Pixel data analysis and other object detection techniques may also be used for detection of Similar Backgrounds in frames of Category 2 digital videos.
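As a non-limiting sketch of the tone-range idea described above, the following checks what fraction of a frame's pixels fall within an assumed range of sky-blue or forest-green hues. This is a plain colour-range test rather than a full feature-matching method. The hue ranges, saturation and brightness floors, and the minimum fraction are hypothetical values that would need tuning on real footage.

```python
import colorsys
from typing import Iterable, Tuple

RGB = Tuple[int, int, int]

# Hypothetical hue ranges in degrees (assumptions, not values from the source).
SKY_BLUE = (190.0, 250.0)
FOREST_GREEN = (80.0, 160.0)


def in_tone_range(rgb: RGB, hue_range: Tuple[float, float],
                  min_saturation: float = 0.25, min_value: float = 0.2) -> bool:
    """True if the pixel's hue lies in hue_range and it is not grey or near-black."""
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    lo, hi = hue_range
    return lo <= h * 360.0 <= hi and s >= min_saturation and v >= min_value


def tone_fraction(pixels: Iterable[RGB], hue_range: Tuple[float, float]) -> float:
    """Fraction of pixels that fall inside the given tone range."""
    pixels = list(pixels)
    if not pixels:
        return 0.0
    return sum(in_tone_range(p, hue_range) for p in pixels) / len(pixels)


def looks_outdoor(pixels: Iterable[RGB], min_fraction: float = 0.4) -> bool:
    """Flag a frame as a candidate outdoor scene if enough of it is sky or forest."""
    pixels = list(pixels)
    return (tone_fraction(pixels, SKY_BLUE) >= min_fraction
            or tone_fraction(pixels, FOREST_GREEN) >= min_fraction)
```

A frame flagged in this way would only be a candidate; pixel data analysis or other object detection techniques could then confirm the Similar Background.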

[0034] Category 3 includes digital videos that do not fit into either Category 1 or Category 2. Examples include "random" online videos that become popular or "go viral" and receive tens or hundreds of millions of views. Similar Background detection may not be appropriate for Category 3 digital videos since they may not have similar backgrounds. If so, Similar Background detection may be skipped for Category 3 digital videos, and the process may proceed directly to the step of identifying a Placement Slot (as discussed below), such as when the viewership of a Category 3 digital video increases rapidly in a short time.

[0035] The next step is to identify a Placement Slot within at least one frame of at least one digital video (step 216). A purpose of identifying a Placement Slot is to facilitate repeated and automated image insertion in the frame. As used herein, a "Placement Slot" refers to a position within the frame in which a digital image (e.g., an image of a product to be placed or an advertisement) is to be inserted.

[0036] In analyzing the frames of a digital video, the application server (14) determines whether a Similar Background has been previously detected. If the particular Similar Background has not been previously detected or a Placement Slot has not already been identified for a previously detected Similar Background, then the application server (14) may perform computerized object detection (e.g., using computerized object detection techniques discussed above) on the frame to determine the position of an object of interest (e.g., a table, a TV/computer screen, a cup, a car, or external features such as sky and clouds, etc.) (step 218). The Placement Slot can then be determined as the position of the object of interest or a position defined relative thereto (e.g., an area adjacent to the object of interest). As necessary, the position of the Placement Slot as determined by the computer can be verified manually for appropriateness (e.g., to prevent false positives from the computer analysis) and/or determined manually (step 220).

[0037] Alternatively, if the particular Similar Background and an associated Placement Slot have already been identified in a previously analyzed frame, then the previously identified Placement Slot may be used. It will be appreciated that this can be used advantageously to expedite the analysis of large numbers of frames in large numbers of digital videos. For example, additional digital videos produced by the same producer can also be analyzed using data from the analysis of the past digital videos from the producer to identify Similar Backgrounds and Placement Slots.

[0038] As an example, suppose that frame-by-frame analysis of a first digital video and a second digital video indicates that a frame of the first digital video and a frame of the second digital video have a Similar Background depicting a kitchen scene with a kitchen counter. Object detection analysis of the frame of the first digital video indicates the presence of a cup on the kitchen counter. The position of the cup can be used as a Placement Slot in the first frame. Further, given that the frame of the first digital video and the frame of the second digital video both depict the similar kitchen scene, a Placement Slot can be identified in the frame of the second digital video based on the position of the Placement Slot determined for the frame of the first digital video.
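The reuse of a previously identified Placement Slot may be illustrated, without limitation, by a small registry keyed by Similar Background. The sketch assumes that each Similar Background has already been given an identifier and that object detection is supplied as a callable. `PlacementSlot`, `SlotRegistry` and the pixel-rectangle layout are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class PlacementSlot:
    """Position of a slot within a frame, in pixels (hypothetical layout)."""
    x: int
    y: int
    width: int
    height: int


class SlotRegistry:
    """Remembers the Placement Slot found for each Similar Background."""

    def __init__(self) -> None:
        self._slots: Dict[str, PlacementSlot] = {}

    def slot_for(self, background_id: str,
                 detect: Callable[[], PlacementSlot]) -> PlacementSlot:
        """Reuse a stored slot, or run object detection once and store the result."""
        slot = self._slots.get(background_id)
        if slot is None:
            slot = detect()  # e.g. locate the cup on the kitchen counter
            self._slots[background_id] = slot
        return slot
```

With such a registry, the frame of the second digital video that shares the kitchen background would receive the slot found in the first without object detection being run again.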

[0039] Once a Placement Slot has been identified, Slot Data is identified for the Placement Slot (step 222). As used herein, "Slot Data" refers to information about digital video characteristics of the Placement Slot. The purpose of Slot Data is to allow for adjustment of the image to be inserted into the frame to give it appropriate alignment, tracking, lighting, reflections, shadow, etc. In this manner, the inserted image can be made to appear as much as possible as if it were part of the originally produced digital video. In embodiments, identifying Slot Data may include identifying one or a combination of the following information: (a) XYZ alignment data (step 224), e.g., positional coordinates of the various surfaces in the frame; (b) tracking points (step 226): points on objects/surfaces in the frame/scene with high contrast, which data is used to make the placed product/advertising image appear stable in the context of any camera movement in the scene; (c) lighting characteristics in the frame: e.g., reflections (step 228), overall brightness, colors (especially dominant colors), basic lighting (step 230) especially as it relates to creating shadows (step 232), surface roughness (a rougher surface produces different reflections compared to a smoother surface); and (d) camera and content movement, and blur. Once the application server (14) determines the Slot Data, it stores the Slot Data (step 234) for future use.

[0040] Referring to Figure 2B, the next step is to identify Object Data (step 236). As used herein, "Object Data" refers to data describing an image (e.g., of a product or an advertisement), to be inserted into a Placement Slot of a digital video file. In embodiments, Object Data includes surface features (step 238), such as level of surface shine, surface roughness or texture (step 240), level of reflection, object shape (step 242) and object size (step 244), etc. The application server (14) may determine Object Data for each image that has the potential to be inserted and stores the Object Data (step 246) accordingly.
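For illustration only, Slot Data and Object Data might be held in records such as the following. The field names, units and value ranges are assumptions introduced for this sketch; the description lists the kinds of information involved but does not prescribe a layout. `DataStore` is a hypothetical in-memory stand-in for whatever storage the application server uses.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Point3 = Tuple[float, float, float]
Point2 = Tuple[int, int]
RGB = Tuple[int, int, int]


@dataclass
class SlotData:
    """Video characteristics of a Placement Slot (hypothetical field names)."""
    alignment_xyz: List[Point3]      # (a) positional coordinates of surfaces
    tracking_points: List[Point2]    # (b) high-contrast points for stabilisation
    brightness: float                # (c) overall brightness, assumed 0.0 to 1.0
    dominant_colors: List[RGB] = field(default_factory=list)
    surface_roughness: float = 0.0   # rougher surfaces reflect differently
    motion_blur: float = 0.0         # (d) camera and content movement, blur


@dataclass
class ObjectData:
    """Characteristics of an image to be inserted (hypothetical field names)."""
    surface_shine: float
    surface_roughness: float
    reflection_level: float
    shape: str
    height_cm: float


class DataStore:
    """In-memory stand-in for storing Slot Data and Object Data for later use."""

    def __init__(self) -> None:
        self.slot_data: Dict[str, SlotData] = {}
        self.object_data: Dict[str, ObjectData] = {}

    def save_slot(self, slot_id: str, data: SlotData) -> None:
        self.slot_data[slot_id] = data

    def save_object(self, image_id: str, data: ObjectData) -> None:
        self.object_data[image_id] = data
```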

[0041] The next step is a correlation process to develop a relationship between the Slot Data and Object Data (step 256) so that any image inserted into a Placement Slot substantially appears to the viewer as if the object depicted by the inserted image were an integral part of the original digital video file. It will be appreciated that modification of the digital video (i.e., the insertion of the image in the digital video) may well be detectable upon close scrutiny by the viewer either with the unaided eye or using technical means for analyzing the digital video. However, in practical application, it may be sufficient for the Slot Data and the Object Data to be correlated so that the modification is not obvious to the viewer casually observing the video without a specific motivation to detect the modification.

[0042] The result of the correlation process is a mathematical relationship that describes how an image associated with given Object Data needs to be adjusted for the Placement Slot associated with certain Slot Data in a particular Similar Background. This correlation process can be performed by rendering a frame with a first image inserted into a Placement Slot (step 248). The rendering process may be performed using software programs to alter an image or to add an image to the frame of a digital video. Software programs for rendering objects or images in videos may be based on mathematical and statistical analysis of light, movement, shade, blur, camera jitter, planar analysis, etc. and, as a result, are subject to inaccuracies and false positives. The results of such rendering programs may need to be checked and corrected by humans.

[0043] Accordingly, at this point, a quality control check of the frame may be manually performed (step 250) to ensure that the inserted image appears satisfactory. In the case of the first digital video that is rendered with the Slot Data and the Object Data, the digital video may be subject to a manual approval process (step 252), after which the digital video is approved for distribution (step 254). Conversely, if the inserted image does not appear satisfactory, adjustments may be made to the relationship between the Slot Data and the Object Data.

[0044] Consider again the previous example of the Similar Background depicting a kitchen scene with an identified Placement Slot on the kitchen counter. Suppose that it is desired to insert an image of a metallic soda can in the Placement Slot in place of the cup. The soda can image is associated with Object Data indicating that the soda can has a high level of surface reflection, and a cylindrical shape with a height of about 12 cm. An initial rendering of the soda can image on the kitchen counter is made. However, the quality control check indicates that the initial rendering results in the soda can image lacking the expected reflectivity and appearing out of scale with other appliances on the kitchen counter. Accordingly, adjustments are made to the rendering until the soda can image appears satisfactorily as part of the original digital video. The adjustments made for the soda can image are noted, and used to determine the relationship between the Slot Data and the Object Data that produces the desired result. For example, it may be determined that the brightness of the soda can image needs to be increased by 20 percent to depict the expected reflectivity, and the soda can image needs to be increased in size by 50 percent to appear in scale with other appliances on the kitchen counter in this particular Placement Slot. A relationship between the Slot Data and the Object Data involving appropriate brightness calibration factors and scaling calibration factors can then be determined.

[0045] Once a satisfactory relationship between the Slot Data and Object Data has been developed and stored, other images associated with different Object Data may be subsequently inserted using the developed relationship (step 258). Accordingly, subsequent image insertions can be fully automated/computerised (without the need for any manual processing or approval) and the quality of image insertion would consistently be as required, and thus automatically be approved for distribution (step 260) without the need for manual auditing and editing.

[0046] Further, it will be appreciated that the subsequently inserted image need not be the same as the first inserted image, and need not have the same Object Data. Consider again the previous example of the Similar Background depicting a kitchen scene with an identified Placement Slot on the kitchen counter. The relationship between the Slot Data and the Object Data was developed by inserting a soda can image. Now, suppose that it is desired to insert an image of a cereal box in the same Placement Slot. The cereal box image is associated with Object Data indicating that the box has a low level of surface reflection, and a rectangular prismatic shape with a height of about 30 cm, all of which is different from the Object Data associated with the soda can image. However, the relationship between Slot Data and Object Data that resulted in a satisfactory insertion of the soda can image in the Placement Slot can be applied to produce a satisfactory insertion of the cereal box image in the same Placement Slot. For example, a brightness calibration factor and a scaling calibration factor for the cereal box image may be interpolated or extrapolated from the relationship between Slot Data and Object Data developed from the quality check and approval process for the soda can image. The cereal box image can be then rendered and inserted into the Placement Slot taking these calibration factors into account.
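A minimal, non-limiting sketch of such calibration factors follows. It assumes the simplest possible relationship, in which the brightness and scaling factors found for the soda can (a 20 percent brightness increase and a 50 percent size increase) are stored for the Placement Slot and reused unchanged for the next image. Interpolation or extrapolation across Object Data, as mentioned above, is not modelled. All names, the pixels-per-centimetre value and the brightness scale are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Calibration:
    """Per-slot correction factors learned from a manually approved rendering."""
    brightness_factor: float
    scale_factor: float


def derive_calibration(initial_brightness: float, approved_brightness: float,
                       initial_height_px: float,
                       approved_height_px: float) -> Calibration:
    """Compare the initial rendering with the approved one to get the factors."""
    if initial_brightness <= 0 or initial_height_px <= 0:
        raise ValueError("initial values must be positive")
    return Calibration(
        brightness_factor=approved_brightness / initial_brightness,
        scale_factor=approved_height_px / initial_height_px,
    )


def render_params(cal: Calibration, base_brightness: float,
                  height_cm: float, px_per_cm: float) -> Dict[str, float]:
    """Apply the stored factors to a different image for the same slot."""
    return {
        # Brightness is assumed to lie on a 0.0 to 1.0 scale and is capped at 1.0.
        "brightness": min(1.0, base_brightness * cal.brightness_factor),
        "height_px": round(height_cm * px_per_cm * cal.scale_factor),
    }


# Soda can (12 cm): initial rendering at 48 px and brightness 0.5,
# approved after adjustment at 72 px and brightness 0.6.
soda_can_cal = derive_calibration(0.5, 0.6, 48, 72)

# Cereal box (30 cm) rendered for the same slot using the stored factors.
cereal_box = render_params(soda_can_cal, base_brightness=0.4,
                           height_cm=30.0, px_per_cm=4.0)
```

Keeping the factors separate from any one image is what allows a later image with different Object Data to be rendered for the same Placement Slot without a further manual check.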

[0047] Referring to Figure 2C, the roles of the advertising exchange server (18) and the advertiser's computer (20) are further illustrated. Advertising exchanges are a recent creation in online commerce where an advertiser can buy from such an advertising exchange an individual advertising event in real-time based on data specific to that event. For example, if a viewer uses a computer mouse to "click" on a hyperlink to select a digital video for a sports event for viewing on an online site, a message is then sent from the host website of such a digital video to the advertising exchange server (18) maintained by an independent third party. The advertising exchange server (18) then sends this message (of an impending viewer event) to all advertisers' computers (20) that are connected or enrolled with this particular advertising exchange. The advertiser then confirms whether its product is appropriate for the Placement Slot (step 262) and bids for such an advertising event based on the data relating to the specific event (e.g., viewer demographics and/or purchase behavior data, viewing time, data on the website and type of video to be watched, etc.). The advertiser that wins the opportunity to advertise in the Placement Slot may be selected according to a set of pre-defined rules. As a non-limiting example, the winning advertiser may be selected as the highest bidding advertiser, or according to one or more other criteria used in the alternative or in combination. The winning advertiser would then receive such an opportunity to advertise and would then transmit the Object Data (step 264) and the image (e.g., of the product to be placed or the advertisement to be shown) (step 266) to be inserted during this specific viewing event.

This whole process of finding, selecting, notifying potential advertisers, bidding, selling/allocating an individual advertising event and transmitting an advertising message, occurs in micro-seconds so that there is no apparent delay in the viewing of the sports video by the viewer. This is all done on a network of computers (i.e., those of the application server (14), the advertising exchange server (18), the advertiser's computer (20), and the edge server (22)) without any human involvement. The edge server (22) can then obtain the stored Slot Data for the Placement Slot (step 268), store the relationship between the Slot Data and the Object Data (step 270), render the digital video with the image received from the advertiser (step 272), and download the video to the viewer (step 274).
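The highest-bidder rule given above as a non-limiting example may be sketched as follows. The `Bid` record and its fields are hypothetical, and the sketch assumes that each advertiser has already indicated whether its product is appropriate for the Placement Slot (step 262). Other selection criteria could replace or supplement the bid amount.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Bid:
    """One advertiser's response to an advertising event (hypothetical fields)."""
    advertiser: str
    amount: float
    product_fits_slot: bool  # outcome of the advertiser's check at step 262


def select_winner(bids: Iterable[Bid]) -> Optional[Bid]:
    """Pick the highest positive bid among advertisers whose product fits the slot."""
    eligible = [b for b in bids if b.product_fits_slot and b.amount > 0]
    if not eligible:
        return None  # no advertiser wins this event
    # On a tie, max() keeps the first of the tied bids in iteration order.
    return max(eligible, key=lambda b: b.amount)
```

A return value of `None` represents an event for which no image is inserted, in which case the video could simply be delivered unmodified.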

[0048] The method according to this invention is used to determine the Similar Backgrounds and Placement Slots in digital videos, with storing of the related Slot Data and Object Data. This would also be very applicable to advertising exchanges since the whole process of inserting a product or an advertisement in a video, under this invention, can be fully automated / computerized. For the methodology under this invention to work on an advertising exchange, the advertisers may transmit data related to an image for a product or advertisement for insertion in videos. As an example, a viewer is about to watch an online digital video for a soccer game. This digital video has several identified Placement Slots such as for: (a) insertion of a ceiling dangler in a scene near one of the goal posts; (b) placement of a can or bottle of a soda or a juice on a table where one of the coaches is sitting watching the game; and (c) a changing advertisement on the electronic advertising boards that are on the perimeter of the soccer field. The data that would be sent to the advertiser for each specific Placement Slot would specify the type of product that can be placed/advertised, data on size, characteristics of the placement, type of image, etc. For such advertising to work appropriately, the advertiser that wins the bid on such an event would transmit in return data related to an image that meets the pre-set Object Data specifications. (Alternatively, it is conceivable that the Object Data specification may be unrestricted, provided that the image can be adjusted in accordance with a pre-established relationship between Object Data and Slot Data for one of the Placement Slots.) The transmitted data may encode information that allows for rendering of the image.
Alternatively, this data may encode information that allows for selection of an image to be rendered, with the actual information that allows for rendering of the image being stored in a repository of stored image files accessible to the application server (14) or an edge server (22). For example, suppose that the winning advertiser wishes to insert an image of a can of soda on the table where one of the coaches is sitting. The advertiser may transmit data that allows for rendering of the can of soda. Alternatively, the advertiser may transmit an indicator that the image to be rendered is a can of soda having a certain brand. Based on this indicator, the application server (14) or edge server (22) may then retrieve data from a stored repository, which data allows for rendering of such a branded can. It will be appreciated that the latter technique may minimize the amount of data that needs to be transmitted at run-time to the application server (14) or edge server (22).
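The two transmission options may be illustrated, without limitation, by a resolver that accepts either inline image data or an indicator that selects a stored image. The payload keys `image_data` and `asset_id`, and the use of a dictionary as the repository, are assumptions made for this sketch and are not part of the described system.

```python
from typing import Dict, Mapping


def resolve_image(payload: Mapping[str, object],
                  repository: Mapping[str, bytes]) -> bytes:
    """Return renderable image data from an advertiser's transmitted payload.

    Hypothetical payload shapes:
      {"image_data": b"..."}  -> the data needed for rendering is sent inline
      {"asset_id": "..."}     -> only an indicator is sent; the data is looked up
    """
    inline = payload.get("image_data")
    if isinstance(inline, bytes):
        return inline
    asset_id = payload.get("asset_id")
    if not isinstance(asset_id, str):
        raise ValueError("payload needs either 'image_data' or 'asset_id'")
    if asset_id not in repository:
        raise LookupError(f"no stored image for indicator {asset_id!r}")
    return repository[asset_id]


# Hypothetical repository accessible to the application server or edge server.
stored_images: Dict[str, bytes] = {"branded-soda-can": b"<stored render data>"}
```

In the indicator case only a short identifier crosses the network at run-time, which is the data-minimizing effect noted above.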

[0049] Further exemplary embodiments of the invention are now described.

[0050] Example 1: Creating Placement Slots in videos with recurring backgrounds to automate product and advertising insertion in online video

[0051] In this example, the steps outlined below may be followed to insert an image of a first product and subsequent products on the kitchen counter of the kitchen scene in a digital video of the "Seinfeld" television program in which the characters Seinfeld and Kramer are typically seen standing.

1. Using the application server, analyze a group of digital videos for the "Seinfeld" television program using feature and pattern matching, and other machine learning methods to identify the specific videos and the frame location within the digital videos with the kitchen setting as the Similar Background.

2. Identify Placement Slots on the kitchen counter where products can be placed in the Similar Background.

3. For each Placement Slot:

(a) identify an appropriate type of product or advertising that can be inserted;

(b) insert an image of an appropriate product in the Placement Slot and manually check the appearance of the inserted image;

(c) as necessary, calibrate the size and other parameters of the image (e.g., lighting, reflections, surface roughness, shadows, etc.) of the inserted product so that the inserted image appears as if it were originally in the video in the context of the nature of the light in the scene having regard to the visual characteristics of the product in the inserted image (e.g., if it is shiny, dark, metallic, transparent, etc., the image should display in the appropriate way);

(d) determine the relationship between Slot Data for the Placement Slot and Object Data for the inserted image that results in satisfactory appearance of the inserted image;

(e) identify data relating to the scene content to assist in targeting the appropriate product for insertion (e.g., whether the Seinfeld or Kramer characters are present in the scene so as to place a product that is, in an advertising context, associated with this character); such data may be part of the text associated with a particular video or in online viewer comments.

4. Once the digital video is set up with the above analysis and data for Similar Backgrounds, Slot Data, and Object Data, use the edge server to insert images for other products automatically in the Placement Slot, without any human intervention or checking since the relationship between the Slot Data for each Placement Slot and the specified Object Data would ensure appropriate placement. Such insertion may be based on data retrieved via the Internet that is indicative of the purchase or behavioral data of the viewer for target marketing purposes.

[0052] Example 2: Using Similar Background detection and Placement Slot methodology to sell advertising on an online advertising exchange

[0053] In this example, the steps outlined below may be followed after Slot Data and Object Data have been set up for a digital video, to facilitate repeat insertion of images of products or advertising messages in the digital video, using an advertising exchange.

1. When a viewer visits a website and selects a digital video to watch, a message is sent to the advertising exchange of such an imminent advertising opportunity in a Placement Slot, with all the data relating to the video (i.e. genre, content, type of scene where product/ad placement can be done, etc.) and data relating to the viewer (i.e. demographics such as age, gender, purchase behavior such as brands used/purchased by viewer, response by viewer towards different types of advertising in terms of product switching or other change in buying behavior, etc.).

2. The advertising exchange server communicates the message via a computing network to the computers of various advertisers who buy advertising from the exchange.

3. The advertiser's computer reviews the data transmitted and, if appropriate, bids for or purchases this advertising opportunity.

4. Once a sale is completed via the computing networks, the advertiser's computer transmits the graphical data for the image of the product or advertising that is to be inserted in the digital video. Such graphical data (i.e., a two or three dimensional representation of a product or other type of advertising) that is transmitted would be consistent with predetermined specifications of Object Data required for such advertising.

5. This whole process would take a few milliseconds to complete so that the viewer does not experience any delay in downloading and watching the video.

6. The edge server renders the digital video with the image inserted in the Placement Slot.

[0054] Computer implementation. The following discussion provides a brief and general description of a suitable computing environment in which various embodiments of the system may be implemented. Although not required, embodiments will be described in the general context of computer-executable instructions, such as program applications, modules, objects or macros being executed by a computer. Those skilled in the relevant art will appreciate that the invention, or components thereof, can be practiced with other computing system configurations, including hand-held devices, multiprocessor systems, microprocessor-based or programmable consumer electronics, personal computers ("PCs"), network PCs, mini-computers, mainframe computers, mobile phones, smart phones, personal digital assistants, personal music players (like iPods™) and the like. The embodiments can be practiced in distributed computing environments where tasks or modules are performed by remote processing devices, which are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.

[0055] As used herein, the term "module" refers to a computing system as described in the following. A computing system may include one or more processing units (e.g. processor), system memories, and system buses that couple various system components including system memory to a processor. Computing system will at times be referred to in the singular herein, but this is not intended to limit the application to a single computing system since in typical embodiments, there will be more than one computing system or other device involved. Other computing systems may be employed, such as conventional and personal computers, where the size or scale of the system allows. The processing unit may be any logic processing unit, such as one or more central processing units ("CPUs"), digital signal processors ("DSPs"), application-specific integrated circuits ("ASICs"), etc. Unless described otherwise, the construction and operation of the various components are of conventional design. As a result, such components need not be described in further detail herein, as they will be understood by those skilled in the relevant art.

[0056] The computing system includes a system bus that can employ any known bus structures or architectures, including a memory bus with memory controller, a peripheral bus, and a local bus. The system also will have a memory which may include read-only memory ("ROM") and random access memory ("RAM"). A basic input/output system ("BIOS"), which can form part of the ROM, contains basic routines that help transfer information between elements within the computing system, such as during startup.

[0057] The computing system also includes non-volatile memory. The non-volatile memory may take a variety of forms, for example a hard disk drive for reading from and writing to a hard disk, and an optical disk drive and a magnetic disk drive for reading from and writing to removable optical disks and magnetic disks, respectively. The optical disk can be a CD-ROM or BLU-RAY, while the magnetic disk can be a magnetic floppy disk or diskette. The hard disk drive, optical disk drive and magnetic disk drive communicate with the processing unit via the system bus. The hard disk drive, optical disk drive and magnetic disk drive may include appropriate interfaces or controllers coupled between such drives and the system bus, as is known by those skilled in the relevant art. The drives, and their associated computer-readable media, provide nonvolatile storage of computer readable instructions, data structures, program modules and other data for the computing system. Although computing systems may employ hard disks, optical disks and/or magnetic disks, those skilled in the relevant art will appreciate that other types of non-volatile computer-readable media that can store data accessible by a computer may be employed, such as magnetic cassettes, flash memory cards, digital video disks ("DVD"), Bernoulli cartridges, RAMs, ROMs, smart cards, etc.

[0058] Various program modules or application programs and/or data can be stored in the system memory. For example, the system memory may store an operating system, end user application interfaces, server applications, and one or more application program interfaces ("APIs").

[0059] The system memory also includes one or more networking applications, for example a Web server application and/or Web client or browser application for permitting the computing system to exchange data with sources, such as clients operated by users and members via the Internet, corporate Intranets, or other networks as described below, as well as with other server applications on servers such as those further discussed below. The networking application in the preferred embodiment is markup language based, such as hypertext markup language ("HTML"), extensible markup language ("XML") or wireless markup language ("WML"), and operates with markup languages that use syntactically delimited characters added to the data of a document to represent the structure of the document. A number of Web server applications and Web client or browser applications are commercially available, such as those available from Mozilla and Microsoft.

[0060] The present invention has been described above and shown in the drawings by way of exemplary embodiments and uses, having regard to the accompanying drawings. The exemplary embodiments and uses are intended to be illustrative of the present invention. It is not necessary for a particular feature of a particular embodiment to be used exclusively with that particular exemplary embodiment. Instead, any of the features described above and/or depicted in the drawings can be combined with any of the exemplary embodiments, in addition to or in substitution for any of the other features of those exemplary embodiments. One exemplary embodiment's features are not mutually exclusive to another exemplary embodiment's features. Instead, the scope of this disclosure encompasses any combination of any of the features. Further, it is not necessary for all features of an exemplary embodiment to be used. Instead, any of the features described above can be used, without any other particular feature or features also being used. Accordingly, various changes and modifications can be made to the exemplary embodiments and uses without departing from the scope of the invention as defined in the claims that follow.