

Title:
EMBLEM IDENTIFICATION
Document Type and Number:
WIPO Patent Application WO/2018/023004
Kind Code:
A1
Abstract:
Methods of identifying a target emblem in a video stream include receiving video data having a video frame, where the video frame comprises an image. The methods include segmenting the image to generate a segment map; and extracting, from the image, features to generate a feature map. At least one of the features is classified to identify the target emblem in the video frame. The identified target emblem is tracked by identifying an additional video frame in which the identified target emblem appears.

Inventors:
SCHUPP-OMID DANIEL RILEY (US)
RATNER EDWARD (US)
Application Number:
PCT/US2017/044389
Publication Date:
February 01, 2018
Filing Date:
July 28, 2017
Assignee:
LYRICAL LABS VIDEO COMPRESSION TECH LLC (US)
International Classes:
G06K9/00
Foreign References:
US 13/428,707 (filed March 23, 2012)
US 13/868,749 (filed April 23, 2013)
Other References:
"Dictionary of Computer Vision and Image Processing, 2nd ed.", 1 January 2014, SPRINGER, article ROBERT B. FISHER: "Dictionary of Computer Vision and Image Processing, 2nd ed.", pages: 86, XP055405590
"TV Content Analysis: Techniques and Applications", 1 January 2012, CRC, article YIANNIS KOMPATSIARIS: "TV Content Analysis: Techniques and Applications", pages: 122 - 153, XP055405589
"Semantics An International Handbook of Natural Language Meaning", 1 January 2011, DE GRUYTER MOUTON, article CLAUDIA MAIENBORN ET AL: "Semantics An International Handbook of Natural Language Meaning", pages: 1485, XP055357650
RAINER LIENHART; JOCHEN MAYDT: "An Extended Set of Haar-like Features for Rapid Object Detection", IEEE ICIP, vol. 1, September 2002 (2002-09-01), pages 900 - 903
IURI FROSIO; ED RATNER: "Adaptive Segmentation Based on a Learned Quality Metric", PROCEEDINGS OF THE 10TH INTERNATIONAL CONFERENCE ON COMPUTER VISION THEORY AND APPLICATIONS, March 2015 (2015-03-01)
NAVNEET DALAL; BILL TRIGGS: "Histograms of Oriented Gradients for Human Detection", 2005, Retrieved from the Internet
JOHN DAUGMAN: "Complete Discrete 2-D Gabor Transforms by Neural Networks for Image Analysis and Compression", IEEE TRANSACTIONS ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, vol. 36, no. 7, 1988, XP001379099
D.G. LOWE: "Distinctive Image Features from Scale-Invariant Keypoints", INTERNATIONAL JOURNAL OF COMPUTER VISION, vol. 60, no. 2, 2004, pages 91 - 110, XP002756976, DOI: doi:10.1023/B:VISI.0000029664.99615.94
ANDREY GRITSENKO; EMIL EIROLA; DANIEL SCHUPP; ED RATNER; AMAURY LENDASSE: "Probabilistic Methods for Multiclass Classification Problems", PROCEEDINGS OF ELM-2015, vol. 2, January 2016 (2016-01-01)
GABRIELLA CSURKA; CHRISTOPHER R. DANCE; LIXIN FAN; JUTTA WILLAMOWSKI; CEDRIC BRAY: "Visual Categorization with Bags of Keypoints", 2004, XEROX RESEARCH CENTRE EUROPE
Attorney, Agent or Firm:
DUEBNER, Ryan et al. (US)
Claims:
CLAIMS

The following is claimed:

1. A method of identifying a target emblem in a video stream, the method comprising: receiving video data comprising a video frame, wherein the video frame comprises an image; segmenting the image to generate a segment map; extracting, from the image, a plurality of features to generate a feature map; classifying at least one of the plurality of features, using a classifier, to identify the target emblem in the video frame; and tracking the identified target emblem by identifying an additional video frame in which the identified target emblem appears.

2. The method of claim 1, further comprising pre-filtering the image, comprising: determining, using one or more color metrics, that each segment of a first set of segments is unlikely to include the target emblem; and removing the first set of segments from the image to generate a pre-filtered image.

3. The method of claim 2, further comprising removing, from the feature map, a first set of features, wherein at least a portion of each of the first set of features is located in at least one of the segments of the first set of segments.

4. The method of any of claims 1 - 3, further comprising masking, using an additional classifier, at least one feature corresponding to a non-target emblem.

5. The method of any of claims 1 - 4, further comprising: receiving target emblem information, the target emblem information comprising an image of the target emblem; and extracting a set of target emblem features from the target emblem.

6. The method of claim 5, further comprising training the classifier based on the set of target emblem features.

7. The method of either of claims 5 or 6, further comprising: receiving non-target emblem information, the non-target emblem information comprising an image of the non-target emblem; extracting a set of non-target emblem features from the non-target emblem; and training the classifier based on the set of non-target emblem features.

8. The method of any of claims 1 - 7, further comprising: classifying the at least one of the plurality of features, using the classifier, to identify a candidate target emblem in the video frame; determining that the candidate target emblem does not appear in an additional video frame; and identifying, based on determining that the candidate target emblem does not appear in an additional video frame, the candidate target emblem as a false-positive.

9. A system for identifying a target emblem in a video stream, the video stream comprising a video frame, wherein the video frame comprises an image, the system comprising: a memory having one or more computer-executable instructions stored thereon; and a processor configured to access the memory and to execute the computer-executable instructions, wherein the computer-executable instructions are configured to cause the processor, upon execution, to instantiate at least one component, the at least one component comprising: a segmenter configured to segment the image to generate a segment map; and an emblem identifier configured to identify, using the segment map, the target emblem in the video frame.

10. The system of claim 9, the emblem identifier comprising: a feature extractor configured to extract, from the image, a plurality of features to generate a feature map; and a classifier configured to classify at least one of the plurality of features to identify the target emblem in the video frame.

11. The system of either of claims 9 or 10, the emblem identifier further comprising a pre-filter, the pre-filter configured to: determine, using one or more color metrics, that each segment of a first set of segments of the segment map is unlikely to include the target emblem; and remove the first set of segments from the image to generate a pre-filtered image.

12. The system of any of claims 9 - 11, the emblem identifier further comprising an additional classifier, the additional classifier configured to mask at least one feature corresponding to a non-target emblem.

13. The system of any of claims 9 - 12, further comprising a tracker configured to track the identified target emblem by identifying an additional video frame in which the identified target emblem appears.

14. The system of claim 13, wherein the tracker is further configured to: classify the at least one of the plurality of features, using the classifier, to identify a candidate target emblem in the video frame; determine that the candidate target emblem does not appear in an additional video frame; and identify, based on determining that the candidate target emblem does not appear in an additional video frame, the candidate target emblem as a false-positive.

15. The system of any of claims 9 - 14, wherein the feature extractor is further configured to: receive target emblem information, the target emblem information comprising an image of the target emblem; receive non-target emblem information, the non-target emblem information comprising an image of the non-target emblem; extract a set of target emblem features from the target emblem; and extract a set of non-target emblem features from the non-target emblem, wherein the classifier is trained based on the set of target emblem features and the set of non-target emblem features.

Description:
EMBLEM IDENTIFICATION

CROSS-REFERENCE TO RELATED APPLICATION

[0001] This application claims priority to Provisional Application No. 62/368,853, filed July 29, 2016, the entirety of which is hereby incorporated herein by reference.

BACKGROUND

[0001] There is a growing interest in identifying emblems in video scenes. An emblem is a visible representation of something. For example, emblems may include symbols, letters, numbers, pictures, drawings, logos, colors, patterns, and/or the like, and may represent any number of different things such as, for example, concepts, companies, brands, people, things, places, emotions, and/or the like. In embodiments, for example, marketers, corporations, and content providers have an interest in quantifying ad visibility, both from ads inserted by the content deliverer (e.g., commercials) and ads inserted by the content creators (e.g., product placement in-program). For example, knowing the visibility at point of delivery of a banner at a football match can inform decisions by marketers and stadium owners regarding value. Further, when purchasing ad space in content delivery networks (e.g., internet-based video providers), logo owners may desire to purchase ad space that is not proximal to ad space occupied by logos of their competitors. Conventional emblem identification techniques are characterized by computationally intensive, brute-force pattern matching.

SUMMARY

[0002] Embodiments of the disclosure include systems, methods, and computer-readable media for identifying emblems in a video stream. In embodiments, a video stream is analyzed frame-by-frame to identify emblems, utilizing the results of efficient and robust segmentation processes to facilitate the use of high-precision classification engines that might otherwise be too computationally expensive for large-scale deployment. In embodiments, the emblem identification may be performed by, or in conjunction with, an encoding device. Embodiments of the technology for identifying emblems in video streams disclosed herein may be used with emblems of any kind, and generally may be most effective with emblems that have relatively static color schemes, shapes, sizes, and/or the like. Emblems are visual representations of objects, persons, concepts, brands, and/or the like, and may include, for example, logos, aspects of trade dress, colors, symbols, crests, and/or the like.

[0003] In an Example 1, a method of identifying a target emblem in a video stream, the method comprising: receiving video data comprising a video frame, wherein the video frame comprises an image; segmenting the image to generate a segment map; extracting, from the image, a plurality of features to generate a feature map; classifying at least one of the plurality of features, using a classifier, to identify the target emblem in the video frame; and tracking the identified target emblem by identifying an additional video frame in which the identified target emblem appears.

[0004] In an Example 2, the method of Example 1, further comprising pre-filtering the image, comprising: determining, using one or more color metrics, that each segment of a first set of segments is unlikely to include the target emblem; and removing the first set of segments from the image to generate a pre-filtered image.

[0005] In an Example 3, the method of Example 2, further comprising removing, from the feature map, a first set of features, wherein at least a portion of each of the first set of features is located in at least one of the segments of the first set of segments.

[0006] In an Example 4, the method of any of Examples 1 - 3, further comprising masking, using an additional classifier, at least one feature corresponding to a non-target emblem.

[0007] In an Example 5, the method of any of Examples 1 - 4, further comprising: receiving target emblem information, the target emblem information comprising an image of the target emblem; and extracting a set of target emblem features from the target emblem.

[0008] In an Example 6, the method of Example 5, further comprising training the classifier based on the set of target emblem features.

[0009] In an Example 7, the method of either of Examples 5 or 6, further comprising: receiving non-target emblem information, the non-target emblem information comprising an image of the non-target emblem; extracting a set of non-target emblem features from the non-target emblem; and training the classifier based on the set of non-target emblem features.

[0010] In an Example 8, the method of any of Examples 1 - 7, further comprising: classifying the at least one of the plurality of features, using the classifier, to identify a candidate target emblem in the video frame; determining that the candidate target emblem does not appear in an additional video frame; and identifying, based on determining that the candidate target emblem does not appear in an additional video frame, the candidate target emblem as a false-positive.

[0011] In an Example 9, a system for identifying a target emblem in a video stream, the video stream comprising a video frame, wherein the video frame comprises an image, the system comprising: a memory having one or more computer-executable instructions stored thereon; and a processor configured to access the memory and to execute the computer-executable instructions, wherein the computer-executable instructions are configured to cause the processor, upon execution, to instantiate at least one component, the at least one component comprising: a segmenter configured to segment the image to generate a segment map; and an emblem identifier configured to identify, using the segment map, the target emblem in the video frame.

[0012] In an Example 10, the system of Example 9, the emblem identifier comprising: a feature extractor configured to extract, from the image, a plurality of features to generate a feature map; and a classifier configured to classify at least one of the plurality of features to identify the target emblem in the video frame.

[0013] In an Example 11, the system of either of Examples 9 or 10, the emblem identifier further comprising a pre-filter, the pre-filter configured to: determine, using one or more color metrics, that each segment of a first set of segments of the segment map is unlikely to include the target emblem; and remove the first set of segments from the image to generate a pre-filtered image.

[0014] In an Example 12, the system of any of Examples 9 - 11, the emblem identifier further comprising an additional classifier, the additional classifier configured to mask at least one feature corresponding to a non-target emblem.

[0015] In an Example 13, the system of any of Examples 9 - 12, further comprising a tracker configured to track the identified target emblem by identifying an additional video frame in which the identified target emblem appears.

[0016] In an Example 14, the system of Example 13, wherein the tracker is further configured to: classify the at least one of the plurality of features, using the classifier, to identify a candidate target emblem in the video frame; determine that the candidate target emblem does not appear in an additional video frame; and identify, based on determining that the candidate target emblem does not appear in an additional video frame, the candidate target emblem as a false-positive.

[0017] In an Example 15, the system of any of Examples 9 - 14, wherein the feature extractor is further configured to: receive target emblem information, the target emblem information comprising an image of the target emblem; receive non-target emblem information, the non-target emblem information comprising an image of the non-target emblem; extract a set of target emblem features from the target emblem; and extract a set of non-target emblem features from the non-target emblem, wherein the classifier is trained based on the set of target emblem features and the set of non-target emblem features.

BRIEF DESCRIPTION OF THE DRAWINGS

[0018] FIG. 1 is a block diagram depicting an illustrative content delivery system, in accordance with embodiments of the disclosed subject matter.

[0019] FIG. 2 is a block diagram illustrating an operating environment, in accordance with embodiments of the disclosed subject matter.

[0020] FIG. 3 is a flow diagram depicting an illustrative method of emblem identification, in accordance with embodiments of the disclosed subject matter.

[0021] FIG. 4 is a flow diagram depicting another illustrative method of emblem identification, in accordance with embodiments of the disclosed subject matter.

[0022] FIG. 5A is an illustrative image of a video frame, in accordance with embodiments of the disclosed subject matter.

[0023] FIG. 5B is an illustrative segment map generated by segmenting the illustrative image of FIG. 5A, in accordance with embodiments of the disclosed subject matter.

[0024] FIG. 5C is an illustrative pre-filtered image generated by pre-filtering the image of FIG. 5A based on the segment map of FIG. 5B, in accordance with embodiments of the disclosed subject matter.

[0025] While the disclosed subject matter is amenable to various modifications and alternative forms, specific embodiments have been shown by way of example in the drawings and are described in detail below. The intention, however, is not to limit the subject matter disclosed herein to the particular embodiments described. On the contrary, the disclosure is intended to cover all modifications, equivalents, and alternatives falling within the scope of the subject matter disclosed herein, and as defined by the appended claims.

[0026] As used herein in association with values (e.g., terms of magnitude, measurement, and/or other degrees of qualitative and/or quantitative observations that are used herein with respect to characteristics (e.g., dimensions, measurements, attributes, components, etc.) and/or ranges thereof, of tangible things (e.g., products, inventory, etc.) and/or intangible things (e.g., data, electronic representations of currency, accounts, information, portions of things (e.g., percentages, fractions), calculations, data models, dynamic system models, algorithms, parameters, etc.), "about" and "approximately" may be used, interchangeably, to refer to a value, configuration, orientation, and/or other characteristic that is equal to (or the same as) the stated value, configuration, orientation, and/or other characteristic or equal to (or the same as) a value, configuration, orientation, and/or other characteristic that is reasonably close to the stated value, configuration, orientation, and/or other characteristic, but that may differ by a reasonably small amount such as will be understood, and readily ascertained, by individuals having ordinary skill in the relevant arts to be attributable to measurement error; differences in measurement and/or manufacturing equipment calibration; human error in reading and/or setting measurements; adjustments made to optimize performance and/or structural parameters in view of other measurements (e.g., measurements associated with other things); particular implementation scenarios; imprecise adjustment and/or manipulation of things, settings, and/or measurements by a person, a computing device, and/or a machine; system tolerances; control loops; machine-learning; foreseeable variations (e.g., statistically insignificant variations, chaotic variations, system and/or model instabilities, etc.); preferences; and/or the like.

[0027] Although the term "block" may be used herein to connote different elements illustratively employed, the term should not be interpreted as implying any requirement of, or particular order among or between, various blocks disclosed herein. Similarly, although illustrative methods may be represented by one or more drawings (e.g., flow diagrams, communication flows, etc.), the drawings should not be interpreted as implying any requirement of, or particular order among or between, various steps disclosed herein. However, certain embodiments may require certain steps and/or certain orders between certain steps, as may be explicitly described herein and/or as may be understood from the nature of the steps themselves (e.g., the performance of some steps may depend on the outcome of a previous step). Additionally, a "set," "subset," or "group" of items (e.g., inputs, algorithms, data values, etc.) may include one or more items, and, similarly, a subset or subgroup of items may include one or more items. A "plurality" means more than one.

[0028] As used herein, the term "based on" is not meant to be restrictive, but rather indicates that a determination, identification, prediction, calculation, and/or the like, is performed by using, at least, the term following "based on" as an input. For example, predicting an outcome based on a particular piece of information may additionally, or alternatively, base the same determination on another piece of information.

DETAILED DESCRIPTION

[0029] Embodiments of the disclosed subject matter include systems and methods configured for identifying one or more emblems in a video stream. The video stream may include multimedia content targeted at end-user consumption. Embodiments of the system may perform segmentation, classification, tracking, and reporting.
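The tracking stage above, which confirms an identified emblem by finding it in an additional frame (and which the claims use to reject false positives), can be sketched as follows. This is an illustrative sketch only, not the disclosed implementation; the representation of per-frame detections as sets of labels is an assumption.

```python
def confirm_detections(per_frame_candidates):
    """Keep only candidates that reappear in the following frame.

    per_frame_candidates: a list with one entry per video frame, each
    entry a set of emblem labels detected in that frame (illustrative
    format). A candidate seen in frame i but absent from frame i + 1
    is treated as a false positive, per the tracking step.
    """
    confirmed = []
    for i, candidates in enumerate(per_frame_candidates[:-1]):
        confirmed.append({c for c in candidates
                          if c in per_frame_candidates[i + 1]})
    confirmed.append(set())  # no follow-up frame to confirm the last one
    return confirmed
```

For example, `confirm_detections([{"logoA", "logoB"}, {"logoA"}, {"logoA"}])` discards "logoB" because it never reappears.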

[0030] Embodiments of the classification procedures described herein may be similar to, or include, cascade classification (as described, for example, in Rainer Lienhart and Jochen Maydt, "An Extended Set of Haar-like Features for Rapid Object Detection," IEEE ICIP, Vol. 1, pp. 900-903 (September 2002), attached herein as Appendix A, the entirety of which is incorporated herein by reference), an industry standard in object detection. Cascade classification largely takes advantage of two types of features: Haar-like features and local binary patterns (LBP). These feature detectors have shown substantial promise for a variety of applications. For example, live face detection for autofocusing cameras is typically accomplished using this technique.
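For readers unfamiliar with the Haar-like features mentioned above, the following minimal sketch shows how a two-rectangle Haar-like feature is computed in constant time from an integral image (summed-area table), which is what makes cascade classification fast enough for live detection. This is textbook background, not the patent's implementation.

```python
def integral_image(img):
    """Summed-area table: ii[y][x] = sum of img[0..y][0..x]."""
    h, w = len(img), len(img[0])
    ii = [[0] * w for _ in range(h)]
    for y in range(h):
        row = 0
        for x in range(w):
            row += img[y][x]
            ii[y][x] = row + (ii[y - 1][x] if y > 0 else 0)
    return ii

def rect_sum(ii, x0, y0, x1, y1):
    """Sum of pixels in the inclusive rectangle (x0, y0)-(x1, y1)."""
    s = ii[y1][x1]
    if x0 > 0:
        s -= ii[y1][x0 - 1]
    if y0 > 0:
        s -= ii[y0 - 1][x1]
    if x0 > 0 and y0 > 0:
        s += ii[y0 - 1][x0 - 1]
    return s

def two_rect_haar(ii, x, y, w, h):
    """Horizontal two-rectangle Haar-like feature: left half minus right half."""
    half = w // 2
    left = rect_sum(ii, x, y, x + half - 1, y + h - 1)
    right = rect_sum(ii, x + half, y, x + w - 1, y + h - 1)
    return left - right
```

A strong response (large absolute value) indicates a vertical intensity edge inside the window, which is the kind of cue a cascade stage thresholds on.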

[0031] In embodiments, emblem information is received from emblem owners (e.g., customers of ad services associated with delivery of the video streams), and the emblem information is processed to enable more robust classification. That is, for example, emblem information may include images of emblems. The images may be segmented and feature extraction may be performed on the segmented images. The results of the feature extraction may be used to train classifiers to more readily identify the emblems.
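The train-then-classify flow described above can be illustrated with a deliberately simple stand-in: coarse color-histogram features extracted from emblem images, used to train a nearest-centroid classifier. The actual embodiments use the richer feature sets discussed elsewhere in this disclosure; the feature choice and classifier here are assumptions made purely for illustration.

```python
def color_histogram(pixels, bins=4):
    """Coarse, normalized RGB histogram for a list of (r, g, b) pixels."""
    hist = [0] * (bins ** 3)
    step = 256 // bins
    for r, g, b in pixels:
        hist[(r // step) * bins * bins + (g // step) * bins + (b // step)] += 1
    total = max(1, len(pixels))
    return [v / total for v in hist]

def train_centroids(labeled_examples):
    """labeled_examples: {label: [feature_vector, ...]} -> {label: centroid}."""
    centroids = {}
    for label, vecs in labeled_examples.items():
        n = len(vecs)
        centroids[label] = [sum(col) / n for col in zip(*vecs)]
    return centroids

def classify(feature, centroids):
    """Return the label whose centroid is nearest (squared Euclidean)."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda lbl: dist2(feature, centroids[lbl]))
```

Training on provided emblem images and classifying extracted frame features follows the same pattern regardless of which feature extractor and classifier an embodiment actually uses.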

[0032] Disadvantages of cascade classification include multiple detection, sensitivity to lighting conditions, and classification performance. The multiple detection issue may be due to overlapping sub-windows (caused by the sliding-window part of the approach). Embodiments mitigate this disadvantage by implementing a segmentation algorithm that provides a complete but disjoint labeling of the image. The sensitivity and the performance issues may be mitigated by embodiments of the disclosure by using a more robust feature set that would normally be too computationally expensive for live detection.
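To make concrete what a "complete but disjoint labeling" means, the sketch below labels an image by 4-connected components: every pixel receives exactly one label, so no region can be reported twice the way overlapping sliding windows can. This is a generic connected-components pass, not the segmentation algorithm of the disclosed embodiments.

```python
from collections import deque

def label_components(mask):
    """Complete, disjoint labeling of a 2-D image by 4-connected
    components of equal pixel values. Every pixel gets exactly one
    label, in contrast to overlapping sliding-window detections."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    next_label = 0
    for sy in range(h):
        for sx in range(w):
            if labels[sy][sx]:
                continue  # already assigned to a component
            next_label += 1
            value = mask[sy][sx]
            labels[sy][sx] = next_label
            q = deque([(sy, sx)])
            while q:  # breadth-first flood fill of this component
                y, x = q.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not labels[ny][nx] and mask[ny][nx] == value):
                        labels[ny][nx] = next_label
                        q.append((ny, nx))
    return labels
```

Because the labeling partitions the image, each candidate emblem region is classified at most once, which addresses the multiple-detection issue described above.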

[0033] FIG. 1 depicts an illustrative content delivery system 100 having an encoding device 102. In embodiments, the encoding device 102 illustratively receives an image from the image source 104 over a network 106. The image source 104 may be a digital image device (e.g., a camera), a content provider, a storage device, and/or the like. Exemplary images include, but are not limited to, digital photographs, digital images from medical imaging, machine vision images, video images (e.g., frames of a video stream), and any other suitable images having a plurality of pixels. The encoding device 102 is illustratively coupled to a receiving device 108 by the network 106. In embodiments, the encoding device 102 communicates encoded video data to the receiving device 108 over the network 106. In some embodiments, the network 106 may include a wired network, a wireless network, or a combination of wired and wireless networks. Illustrative networks include any number of different types of communication networks such as, for example, a local area network (LAN), a wireless LAN (WLAN), a wide area network (WAN), the Internet, a P2P network, or other suitable networks. The network 106 may include a combination of multiple networks.

[0034] Although not illustrated herein, the receiving device 108 may include any combination of components described herein with reference to the encoding device 102, components not shown or described, and/or combinations of these. In embodiments, the encoding device 102 may include, or be similar to, the encoding computing systems described in U.S. Application No. 13/428,707, filed Mar. 23, 2012, entitled "VIDEO ENCODING SYSTEM AND METHOD;" and/or U.S. Application No. 13/868,749, filed April 23, 2013, entitled "MACROBLOCK PARTITIONING AND MOTION ESTIMATION USING OBJECT ANALYSIS FOR VIDEO COMPRESSION;" the disclosure of each of which is expressly incorporated by reference herein.

[0035] The encoding device 102 may be configured to encode video data received from the image source 104 and may, in embodiments, be configured to facilitate insertion of ads into the encoded video data. In embodiments, the encoding device may encode video data into which ads have already been inserted by other components (e.g., ad network components). The ads may be provided by an ad provider 110 that communicates with the encoding device 102 via the network 106. In embodiments, an emblem owner (e.g., a company that purchases ads containing its emblem from the ad provider 110) may interact, via the network 106, with the ad provider 110, the encoding device 102, the image source 104, and/or the like, using an emblem owner device 112. In embodiments, the emblem owner may wish to receive reports containing information about the placement of its emblem(s) in content encoded by the encoding device. The emblem owner may also, or alternatively, wish to purchase ad space that is not also proximate to emblem placement by its competitors. For example, an emblem owner may require that a video stream into which its emblem is to be inserted not also contain an emblem of a competitor, or the emblem owner may require that its emblem be separated, within the video stream, from a competitor's emblem by a certain number of frames, a certain amount of playback time, and/or the like.
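The frame-separation requirement described above (an emblem owner requiring its emblem to be at least some number of frames from any competitor emblem) reduces to a simple check over the per-emblem frame indices an identification report would contain. The report format assumed here is hypothetical.

```python
def separation_ok(own_frames, competitor_frames, min_gap):
    """Return True if every appearance of the owner's emblem is at
    least min_gap frames away from every competitor appearance.

    own_frames / competitor_frames: iterables of frame indices at
    which each emblem was identified (hypothetical report format).
    """
    return all(abs(a - b) >= min_gap
               for a in own_frames for b in competitor_frames)
```

For example, with a 100-frame minimum gap, detections at frames 10 and 500 are acceptable against a competitor detection at frame 200, while frames 10 and 50 would violate the constraint.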

[0036] In embodiments, the encoding device 102 may instantiate an emblem identifier 114 configured to identify one or more emblems within a video stream. According to embodiments, emblem identification may be performed by an emblem identifier that is instantiated independently of the encoding device 102. For example, the emblem identifier 114 may be instantiated by another component such as a stand-alone emblem identification device, a virtual machine running in an encoding environment, by a device that is maintained by an entity that is not the entity that maintains/provides the encoding device 102 (e.g., emblem identification may be provided by a third-party vendor), by a component of an ad provider, and/or the like. Thus, although various aspects of embodiments of emblem identification are described herein in the context of an emblem identifier 114 implemented as part of an encoding device, this context is provided only as an example, and for clarity of description, and is not intended to limit the subject matter described herein to implementation on an encoding device. In this manner, according to embodiments, an emblem identifier 114 that is not implemented on an encoding device 102 may be implemented on a device and/or in an environment that includes a segmenter and/or that interacts with another device/environment having a segmenter. That is, for example, in embodiments, an emblem identifier may utilize the output of a segmenter implemented by an encoding device. In embodiments, the emblem identifier may include a segmenter.

[0037] In embodiments, the emblem owner may provide emblem information to the emblem identifier 114. The emblem information may include images of the emblem owner's emblem, images of emblems of the emblem owner's competitors, identifications of the emblem owner's competitors (e.g., which the emblem identifier 114 may use to look up, from another source of information, emblem information associated with the competitors), and/or the like. The emblem identifier 114 may store the emblem information in a database 116. In embodiments, the emblem identifier 114 may process the emblem information to facilitate more efficient and accurate identification of the associated emblem(s). For example, the emblem identifier 114 may utilize a segmenter implemented by the encoding device 102 to segment images of emblems, and may perform feature extraction on the segmented images of the emblems, to generate feature information that may be used to at least partially train one or more classifiers to identify the emblems. By associating the emblem identifier 114 with the encoding device 102 (e.g., by implementing the emblem identifier 114 on the encoding device 102, by facilitating communication between the emblem identifier 114 and the encoding device 102, etc.), the emblem identifier 114 may be configured to utilize the robust and efficient segmentation performed by the encoding device 102 to facilitate emblem identification.
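One use of color information derived from provided emblem images is the color-metric pre-filter recited in the claims (and illustrated in FIG. 5C): segments whose colors make them unlikely to contain the target emblem are removed before the more expensive classification. The specific metric below, Euclidean RGB distance from a segment's mean color to the nearest known emblem color, is an assumed stand-in for whatever color metrics an embodiment uses.

```python
def prefilter_segments(segment_colors, emblem_colors, max_dist=80.0):
    """Sketch of a color-metric pre-filter over a segment map.

    segment_colors: {segment_id: (r, g, b) mean color of the segment}
    emblem_colors: list of (r, g, b) colors drawn from the target emblem
    max_dist: assumed threshold on Euclidean RGB distance

    Returns the set of segment ids kept for classification; the rest
    are deemed unlikely to include the target emblem and are dropped.
    """
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return {sid for sid, color in segment_colors.items()
            if any(dist(color, ec) <= max_dist for ec in emblem_colors)}
```

For a predominantly red target emblem, a nearly red segment survives while a green segment is discarded, shrinking the region the classifier must examine.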

[0038] The illustrative system 100 shown in FIG. 1 is not intended to suggest any limitation as to the scope of use or functionality of embodiments of the present disclosure. Neither should the illustrative system 100 be interpreted as having any dependency or requirement related to any single component or combination of components illustrated therein. Additionally, any one or more of the components depicted in FIG. 1 may be, in embodiments, integrated with various ones of the other components depicted therein (and/or components not illustrated), all of which are considered to be within the ambit of the present disclosure.

[0039] FIG. 2 is a block diagram illustrating an operating environment 200 in accordance with embodiments of the subject matter disclosed herein. In embodiments, aspects of the operating environment 200 may be, include, or be included in, a system for identifying a target emblem in a video stream. The operating environment 200 includes an encoding device 202 (e.g., the encoding device 102 depicted in FIG. 1) that may be configured to encode video data 204 to create encoded video data 206. As shown in FIG. 2, the encoding device 202 may also be configured to communicate the encoded video data 206 to a decoding device 208 (e.g., receiving device 108 depicted in FIG. 1) via a communication link 210. In embodiments, the communication link 210 may be, include, and/or be included in, a network (e.g., the network 106 depicted in FIG. 1).

[0040] As shown in FIG. 2, the encoding device 202 may be implemented on a computing device that includes a processor 212, a memory 214, and an input/output (I/O) device 216. Although the encoding device 202 is referred to herein in the singular, the encoding device 202 may be implemented in multiple instances, distributed across multiple computing devices, instantiated within multiple virtual machines, and/or the like. In embodiments, the processor 212 executes various program components stored in the memory 214, which may facilitate encoding the video data 204. In embodiments, the processor 212 may be, or include, one processor or multiple processors. In embodiments, the I/O device 216 may be, or include, any number of different types of devices such as, for example, a monitor, a keyboard, a printer, a disk drive, a universal serial bus (USB) port, a speaker, a pointer device, a trackball, a button, a switch, a touch screen, and/or the like.

[0041] According to embodiments, as indicated above, various components of the operating environment 200, illustrated in FIG. 2, may be implemented on one or more computing devices. A computing device may include any type of computing device suitable for implementing embodiments of the subject matter disclosed herein. Examples of computing devices include specialized computing devices or general-purpose computing devices such as "workstations," "servers," "laptops," "desktops," "tablet computers," "hand-held devices," and the like, all of which are contemplated within the scope of FIG. 2 with reference to various components of the operating environment 200. For example, according to embodiments, the encoding device 202 (and/or the video decoding device 208) may be, or include, a general-purpose computing device (e.g., a desktop computer, a laptop, a mobile device, and/or the like), a specially-designed computing device (e.g., a dedicated video encoding device), and/or the like. Additionally, although not illustrated herein, the decoding device 208 may include any combination of components described herein with reference to encoding device 202, components not shown or described, and/or combinations of these.

[0042] In embodiments, a computing device includes a bus that, directly and/or indirectly, couples the following devices: a processor, a memory, an input/output (I/O) port, an I/O component, and a power supply. Any number of additional components, different components, and/or combinations of components may also be included in the computing device. The bus represents what may be one or more busses (such as, for example, an address bus, data bus, or combination thereof). Similarly, in embodiments, the computing device may include a number of processors, a number of memory components, a number of I/O ports, a number of I/O components, and/or a number of power supplies. Additionally, any number of these components, or combinations thereof, may be distributed and/or duplicated across a number of computing devices.

[0043] In embodiments, the memory 214 includes computer-readable media in the form of volatile and/or nonvolatile memory and may be removable, nonremovable, or a combination thereof. Media examples include Random Access Memory (RAM); Read Only Memory (ROM); Electronically Erasable Programmable Read Only Memory (EEPROM); flash memory; optical or holographic media; magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices; data transmissions; or any other medium that can be used to store information and can be accessed by a computing device such as, for example, quantum state memory, and the like. In embodiments, the memory 214 stores computer-executable instructions for causing the processor 212 to implement aspects of embodiments of system components discussed herein and/or to perform aspects of embodiments of methods and procedures discussed herein. Computer-executable instructions may include, for example, computer code, machine-useable instructions, and the like such as, for example, program components capable of being executed by one or more processors associated with a computing device. Examples of such program components include a segmenter 218, an emblem identifier 220, an encoder 222, and a communication component 224. Program components may be programmed using any number of different programming environments, including various languages, development kits, frameworks, and/or the like. Some or all of the functionality contemplated herein may also, or alternatively, be implemented in hardware and/or firmware.

[0044] In embodiments, as described above with reference to FIG. 1 , the segmenter 218 and/or the emblem identifier 220 may be implemented on the encoding device 202 and/or on (or in association with) any other device such as, for example, a device that is independent of (but that may communicate with) the encoding device 202. Thus, although various aspects of embodiments of emblem identification are described herein in the context of a segmenter 218 and an emblem identifier 220 implemented as part of an encoding device 202, this context is provided only as an example, and for clarity of description, and is not intended to limit the subject matter described herein to implementation on an encoding device.

[0045] In embodiments, the segmenter 218 may be configured to segment a video frame into a number of segments to generate a segment map. The segments may include, for example, objects, groups, slices, tiles, and/or the like. The segmenter 218 may employ any number of various automatic image segmentation methods known in the field. In embodiments, the segmenter 218 may use image color and corresponding gradients to subdivide an image into segments that have similar color and texture. Two examples of image segmentation techniques include the watershed algorithm and optimum cut partitioning of a pixel connectivity graph. For example, the segmenter 218 may use Canny edge detection to detect edges on a video frame for optimum cut partitioning, and create segments using the optimum cut partitioning of the resulting pixel connectivity graph. In embodiments, the segmenter 218 implements aspects of the segmentation techniques described in Iuri Frosio and Ed Ratner, "Adaptive Segmentation Based on a Learned Quality Metric," Proceedings of the 10th International Conference on Computer Vision Theory and Applications, March 2015, attached herein as Appendix B, the entirety of which is incorporated herein by reference.
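As a non-limiting illustration of the segmentation step, the following Python sketch assigns a segment index to every pixel by flood-filling regions of similar color. The function name `segment_image` and the tolerance parameter `tol` are hypothetical; a production segmenter would instead use one of the techniques named above, such as the watershed algorithm or optimum cut partitioning.

```python
from collections import deque

def segment_image(pixels, width, height, tol=30):
    # Segment map: one index per pixel; -1 means "not yet assigned".
    seg_map = [[-1] * width for _ in range(height)]
    next_index = 0
    for sy in range(height):
        for sx in range(width):
            if seg_map[sy][sx] != -1:
                continue
            # Flood-fill a new segment outward from this seed pixel.
            seed = pixels[sy][sx]
            seg_map[sy][sx] = next_index
            queue = deque([(sx, sy)])
            while queue:
                x, y = queue.popleft()
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if 0 <= nx < width and 0 <= ny < height and seg_map[ny][nx] == -1:
                        # Grow the segment while the color stays near the seed.
                        if sum(abs(a - b) for a, b in zip(pixels[ny][nx], seed)) <= tol:
                            seg_map[ny][nx] = next_index
                            queue.append((nx, ny))
            next_index += 1
    return seg_map
```

The returned segment map matches the description in the following paragraph: every pixel carries an index, and each index can serve as a mask over the frame.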

[0046] The resulting segment map of image segments includes an assignment of indices to every pixel in the image, which allows for the frame to be dealt with in a piecemeal fashion. Each index, which may be indexed in a database 226 stored in the memory 214, may be considered a mask for this purpose. The database 226, which may refer to one or more databases, may be, or include, one or more tables, one or more relational databases, one or more multi-dimensional data cubes, and the like. Further, though illustrated as a single component, the database 226 may, in fact, be a plurality of databases 226 such as, for instance, a database cluster, which may be implemented on a single computing device or distributed between a number of computing devices, memory components, or the like.

[0047] In embodiments, the emblem identifier 220 may be configured to identify, using the segment map, the presence of emblems within digital images such as, for example, frames of video. In embodiments, the emblem identifier 220 may perform emblem identification on images that have not been segmented. In embodiments, results of emblem identification may be used by the segmenter 218 to inform a segmentation process. According to embodiments, as shown in FIG. 2, the emblem identifier 220 includes a pre-filter 228 configured to filter segments that are determined to be unlikely to contain an emblem from a segmented image.

[0048] According to embodiments, the pre-filter 228 is configured to compute basic color metrics for each of the segments of a segmented image and to identify, based on emblem data 230 (which may, for example, include processed emblem information), segments that are unlikely to contain a particular emblem. As used herein, the term "based on" is not meant to be restrictive, but rather indicates that a determination, identification, prediction, calculation, or the like, is performed by using, at least, the term following "based on" as an input. For example, the pre-filter 228 may identify segments unlikely to contain a certain emblem based on emblem data, and/or may identify those segments based on other information such as, for example, texture information, known information regarding certain video frames, and/or the like.

[0049] For example, in embodiments, the pre-filter is configured to determine, using one or more color metrics, that each segment of a first set of segments of a segment map is unlikely to include the target emblem; and to remove the first set of segments from the segment map to generate a pre-filtered segment map. As used herein, the term "target emblem" refers to an emblem that an emblem identifier (e.g., the emblem identifier 220) is tasked with identifying within a video stream. The color metrics may include, in embodiments, color histogram matching to the emblem data 230. For example, in embodiments, by comparing means and standard deviations associated with color distributions in the frame and the emblem data 230, embodiments may facilitate removing segments that are unlikely to contain a target emblem. In embodiments, the pre-filter 228 pre-filters the image on a per-emblem basis, thereby facilitating a more efficient and accurate feature extraction process.
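A minimal sketch of the mean/standard-deviation comparison described above, assuming (for illustration only) that each segment is represented as a list of RGB tuples and the emblem data 230 as a reference list of colors; the function names and tolerance values are hypothetical:

```python
import statistics

def color_stats(colors):
    # Per-channel mean and standard deviation for a list of RGB tuples.
    channels = list(zip(*colors))
    means = [statistics.mean(c) for c in channels]
    stdevs = [statistics.pstdev(c) for c in channels]
    return means, stdevs

def prefilter_segments(segments, emblem_colors, mean_tol=60, std_tol=60):
    # Keep only segments whose color statistics are close enough to the
    # emblem's to plausibly contain it (per-emblem pre-filtering).
    e_mean, e_std = color_stats(emblem_colors)
    kept = {}
    for index, colors in segments.items():
        s_mean, s_std = color_stats(colors)
        mean_gap = max(abs(a - b) for a, b in zip(s_mean, e_mean))
        std_gap = max(abs(a - b) for a, b in zip(s_std, e_std))
        if mean_gap <= mean_tol and std_gap <= std_tol:
            kept[index] = colors
    return kept
```

Because the filter is parameterized by the emblem's own color statistics, running it once per target emblem yields a different pre-filtered segment map for each emblem, as the paragraph above contemplates.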

[0050] The emblem identifier 220 also may include a feature extractor 232 configured to extract one or more features from an image to generate a feature map. In embodiments, the feature extractor 232 may represent more than one feature extractor. The feature extractor 232 may include any number of different types of feature extractors, implementations of feature extraction algorithms, and/or the like. For example, the feature extractor 232 may perform histogram of oriented gradients feature extraction ("HOG," as described, for example, in Navneet Dalal and Bill Triggs, "Histograms of Oriented Gradients for Human Detection," available at http://lear.inrialpes.fr/people/triggs/pubs/Dalal-cvpr05.pdf, 2005, attached herein as Appendix C, the entirety of which is hereby incorporated herein by reference), Gabor feature extraction (as explained, for example, in John Daugman, "Complete Discrete 2-D Gabor Transforms by Neural Networks for Image Analysis and Compression," IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. 36, No. 7, 1988, attached herein as Appendix D, the entirety of which is hereby incorporated herein by reference), Kaze feature extraction, speeded-up robust features (SURF, as explained, for example, in D.G. Lowe, "Distinctive Image Features from Scale-Invariant Keypoints," International Journal of Computer Vision, 60, 2, pp. 91-110, 2004, attached herein as Appendix E, the entirety of which is hereby incorporated herein by reference) feature extraction, features from accelerated segment test (FAST) feature extraction, scale-invariant feature transform (SIFT) feature extraction, and/or the like. In embodiments, the feature extractor 232 may detect features in an image based on emblem data 230. By generating the features on the full frame, embodiments allow for feature detection in cases where nearby data could still be useful (e.g., an edge at the end of a segment).
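As a toy illustration of gradient-orientation features in the spirit of HOG, the sketch below builds a single orientation histogram over a grayscale grid. It omits the cell/block structure and block normalization of the real HOG descriptor, and the function name and bin count are illustrative assumptions:

```python
import math

def orientation_histogram(gray, bins=8):
    # Accumulate gradient magnitude into unsigned-orientation bins
    # (a toy single-cell version of HOG feature extraction).
    h = len(gray)
    w = len(gray[0])
    hist = [0.0] * bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gray[y][x + 1] - gray[y][x - 1]   # horizontal gradient
            gy = gray[y + 1][x] - gray[y - 1][x]   # vertical gradient
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            angle = math.atan2(gy, gx) % math.pi   # fold to [0, pi)
            hist[min(int(angle / math.pi * bins), bins - 1)] += magnitude
    total = sum(hist)
    return [v / total for v in hist] if total else hist
```

An image dominated by a vertical edge, for instance, concentrates its mass in the bin for horizontal gradients, which is the kind of local evidence the classifier later consumes.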

[0051] The emblem identifier 220 (e.g., via the feature extractor 232 and/or classifier 234) may be further configured to use the list of culled segments from the pre-filter step to remove features that fall outside the expected areas of interest. This practice may also facilitate, for example, classification against different corporate emblems by masking different areas depending on the target emblem. For example, the emblem identifier 220 may be configured to remove, from a feature map, a first set of features, wherein at least a portion of each of the first set of features is located in at least one of the segments of the first set of segments.
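The masking step above can be sketched as follows, assuming for illustration that each extracted feature records the pixel location where it was detected (the dictionary layout is hypothetical, not the claimed data structure):

```python
def mask_features(features, seg_map, culled_indices):
    # Drop any feature whose location falls inside a segment that the
    # pre-filter culled, so classification only sees plausible regions.
    kept = []
    for feature in features:
        x, y = feature["x"], feature["y"]
        if seg_map[y][x] not in culled_indices:
            kept.append(feature)
    return kept
```

Running this with a different `culled_indices` set per target emblem masks different areas of the same frame, matching the per-emblem behavior described above.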

[0052] After masking, the remaining features may be classified using a classifier 234 configured to classify at least one of the plurality of features to identify the target emblem in the video frame. In embodiments, the emblem identifier 220 may further comprise an additional classifier configured to mask at least one feature corresponding to a non-target emblem. The classifier 234 may be configured to receive input information and produce output that may include one or more classifications. In embodiments, the classifier 234 may be a binary classifier and/or a non-binary classifier. The classifier 234 may include any number of different types of classifiers such as, for example, a support vector machine (SVM), an extreme learning machine (ELM), a neural network, a kernel-based perceptron, a k-NN classifier, a bag-of-visual-words classifier, and/or the like. In embodiments, high-quality matches are selected, providing both identification and location for the target emblem. Embodiments of classification techniques that may be utilized by the classifier 234 include, for example, techniques described in Andrey Gritsenko, Emil Eirola, Daniel Schupp, Ed Ratner, and Amaury Lendasse, "Probabilistic Methods for Multiclass Classification Problems," Proceedings of ELM-2015, Vol. 2, January 2016, attached herein as Appendix F; and Gabriella Csurka, Christopher R. Dance, Lixin Fan, Jutta Willamowski, and Cedric Bray, "Visual Categorization with Bags of Keypoints," Xerox Research Centre Europe, 2004, attached herein as Appendix G; the entirety of each of which is hereby incorporated herein by reference.
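For illustration only, a nearest-centroid classifier, a much simpler stand-in for the SVM, ELM, neural network, or bag-of-visual-words classifiers named above, might score a feature vector like this (the function name and threshold parameter are hypothetical):

```python
import math

def nearest_centroid_classify(feature, centroids, threshold):
    # Label a feature vector with the closest emblem centroid, or None
    # when no centroid is close enough to count as a high-quality match.
    best_label, best_dist = None, float("inf")
    for label, centroid in centroids.items():
        dist = math.dist(feature, centroid)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label if best_dist <= threshold else None
```

The distance threshold plays the role of the "high quality" cutoff: features far from every emblem centroid produce no match rather than a forced label.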

[0053] The emblem identifier 220 may include a tracker 236 that is configured to track the identified target emblem by identifying an additional video frame in which the identified target emblem appears. For example, to provide valuable insights, it may be desirable to identify an emblem on more than a frame-by-frame basis, thereby eliminating the need for a human operator to interpret a marked-up stream to provide a timing report. That is, it may be desirable to track the identified emblem from frame to frame. According to embodiments, given a match in a frame (determined by the appearance of a high-quality feature match), the tracker 236 looks at neighboring frames for high-quality matches that are well localized. This not only allows robust reporting, but also may improve match quality. In embodiments, single-frame appearances may be discarded as false positives, and temporal hints may allow improved robustness for correct classification for each frame of the video. As an example, in embodiments, the classifier 234 may classify at least one of a plurality of features to identify a candidate target emblem in the video frame. The tracker 236 may be configured to determine that the candidate target emblem does not appear in an additional video frame; and identify, based on determining that the candidate target emblem does not appear in an additional video frame, the candidate target emblem as a false-positive.
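The single-frame false-positive filtering described above can be sketched as follows; the data layout (mapping frame indices to sets of detected emblem labels) is an assumption made for illustration:

```python
def filter_single_frame_detections(detections):
    # Keep a detection only when the same emblem also appears in an
    # adjacent frame; single-frame appearances are treated as false
    # positives. 'detections' maps frame index -> set of emblem labels.
    confirmed = {}
    for frame, labels in detections.items():
        neighbors = detections.get(frame - 1, set()) | detections.get(frame + 1, set())
        kept = {label for label in labels if label in neighbors}
        if kept:
            confirmed[frame] = kept
    return confirmed
```

A real tracker would also check that matches in neighboring frames are well localized (near the same image position), which this sketch omits.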

[0054] The tracked identified emblems may be used to generate a report of emblem appearance, location, apparent size, and duration of appearance. As shown in FIG. 2, the emblem identifier 220 may further include a reporting component 238 configured to generate a report based on the tracked identified emblem. The report may include any number of different types of information including, for example, a listing of identified emblems, placement of each identified emblem, duration of appearance of each identified emblem, and/or the like. The reporting component 238 may provide the report, via the communication component 224, to an emblem owner. In embodiments, the communication component 224 may be configured to send a notification to the emblem owner, facilitate access to the report via a webpage, and/or the like.
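A minimal sketch of report generation from tracked detections; the field names and the frame-rate parameter are illustrative assumptions, and a full report would also carry placement and apparent size per the paragraph above:

```python
def build_report(confirmed, fps=30.0):
    # Summarize tracked detections into per-emblem frame counts and
    # on-screen duration in seconds.
    report = {}
    for frame, labels in confirmed.items():
        for label in labels:
            report.setdefault(label, {"frames": 0})["frames"] += 1
    for entry in report.values():
        entry["seconds"] = entry["frames"] / fps
    return report
```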

[0055] As shown in FIG. 2, the encoding device 202 also includes an encoder 222 configured for entropy encoding of partitioned video frames. In embodiments, the communication component 224 is configured to communicate encoded video data 206. For example, in embodiments, the communication component 224 may facilitate communicating encoded video data 206 to the decoding device 208.

[0056] According to embodiments, the emblem identifier 220 may be configured to process emblem information to generate the emblem data 230. That is, for example, prior to live identification, the database 226 of target emblems may be processed for feature identification offline. Processing the emblem information may be performed by the segmenter 218 and the feature extractor 232. By processing the emblem information before performing an emblem identification procedure, embodiments of the subject matter disclosed herein facilitate training classifiers (e.g., the classifier 234) that can more efficiently identify the emblems. Additionally, in this manner, emblems that are split by the segmentation algorithm at runtime may still be well identified by the classifier 234. As an example, an emblem with several leaves incorporated has a high chance of the leaves being segmented apart. By identifying the local features for each segment (e.g., a shape/texture descriptor for a leaf), embodiments of the subject matter disclosed herein facilitate identifying those features on a segment-by-segment basis in the video stream.

[0057] In embodiments, for example, the encoding device 202 may be configured to receive target emblem information, the target emblem information including an image of the target emblem. The encoding device 202 may be further configured to receive non-target emblem information, the non-target emblem information including an image of one or more non-target emblems. The segmenter 218 may be configured to segment the images of the target emblems and/or non-target emblems, and the feature extractor 232 may be configured to extract a set of target emblem features from the target emblem; and extract a set of non-target emblem features from the non-target emblem. In this manner, the emblem identifier 220 may train the classifier 234 based on the set of target emblem features and the set of non-target emblem features.
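Continuing the nearest-centroid illustration from above, the offline training step might compute one centroid per class from labeled feature vectors. The names are hypothetical, and this is only a stand-in for training whichever classifier (SVM, ELM, bag-of-visual-words, etc.) an embodiment actually uses:

```python
def train_centroids(target_features, nontarget_features):
    # Compute one centroid per class from labeled feature vectors,
    # mirroring the offline target / non-target training step.
    def centroid(vectors):
        dims = list(zip(*vectors))
        return tuple(sum(d) / len(d) for d in dims)
    return {"target": centroid(target_features),
            "non_target": centroid(nontarget_features)}
```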

[0058] The illustrative operating environment 200 shown in FIG. 2 is not intended to suggest any limitation as to the scope of use or functionality of embodiments of the present disclosure. Neither should the illustrative operating environment 200 be interpreted as having any dependency or requirement related to any single component or combination of components illustrated therein. Additionally, any one or more of the components depicted in FIG. 2 may be, in embodiments, integrated with various ones of the other components depicted therein (and/or components not illustrated), all of which are considered to be within the ambit of the present disclosure.

[0059] FIG. 3 is a flow diagram depicting an illustrative method 300 of identifying a target emblem in a video stream, in accordance with embodiments of the subject matter disclosed herein. In embodiments, aspects of the method 300 may be performed by an encoding device (e.g., the encoding device 102 depicted in FIG. 1 and/or the encoding device 202 depicted in FIG. 2). As shown in FIG. 3, embodiments of the illustrative method 300 may include receiving video data containing an image (block 302). In embodiments, the image may be a video frame, which may include, for example, one or more video frames received by the encoding device from another device (e.g., a memory device, a server, and/or the like).

[0060] Embodiments of the method 300 further include segmenting the image to generate a segment map (block 304). The image may be pre-filtered (block 306), based on the segment map. For example, in embodiments, the method 300 includes pre-filtering the image by determining, using one or more color metrics, that each segment of a first set of segments is unlikely to include the target emblem; and removing the first set of segments from the image to generate a pre-filtered image. In embodiments, the one or more color metrics includes metrics generated by performing color histogram matching.

[0061] According to embodiments, the method 300 further includes extracting features from the pre-filtered image to generate a feature map (block 308). Embodiments of the method 300 further include identifying the target emblem in the image (block 310). Identifying the target emblem may include classifying at least one of a plurality of features, using a classifier, to identify the target emblem in the video frame. In embodiments, the classifier may include a bag-of-visual-words model. In embodiments, before classification, the method 300 further includes removing, from the feature map, a first set of features, wherein at least a portion of each of the first set of features is located in at least one of the segments of the first set of segments. Additionally, or alternatively, embodiments of the method 300 further include masking, using an additional classifier, at least one feature corresponding to a non-target emblem.

[0062] The method 300 may further include tracking the identified target emblem by identifying an additional video frame in which the identified target emblem appears (block 312). Although not illustrated, embodiments of the method 300 may further include generating a report based on the tracked identified emblem. According to embodiments, the report may include any number of different types of information including, for example, target emblem appearance frequency, target emblem size, target emblem placement, and/or the like.

[0063] FIG. 4 is a flow diagram depicting another illustrative method 400 of identifying a target emblem in a video stream, in accordance with embodiments of the subject matter disclosed herein. The video stream may include video data, the video data including one or more video frames, where each video frame includes an image. In embodiments, aspects of the method 400 may be performed by an encoding device (e.g., the encoding device 102 depicted in FIG. 1 and/or the encoding device 202 depicted in FIG. 2). As shown in FIG. 4, embodiments of the illustrative method 400 include processing emblem information (block 402). Processing emblem information may include, for example, receiving target emblem information, the target emblem information including an image of the target emblem; and extracting a set of target emblem features from the target emblem. Processing emblem information may also include receiving non-target emblem information, the non-target emblem information including an image of one or more non-target emblems; and extracting a set of non-target emblem features from the one or more non-target emblems. Processing the emblem information may further include training one or more classifiers based on the target and/or non-target emblem features.

[0064] Embodiments of the illustrative method 400 may include segmenting the image to generate a segment map (block 404). Embodiments of the method 400 further include pre-filtering the image by determining a first set of segments unlikely to include the target emblem (block 406) and removing the first set of segments from the image to generate a pre-filtered image (block 408). For example, the image may be the illustrative image 500 depicted in FIG. 5A, containing an NBC emblem. The method 400 may include segmenting the image 500 to generate a segment map 502, depicted in FIG. 5B. A pre-filter (e.g., the pre-filter 228 depicted in FIG. 2) may be used to identify segments of the segment map 502 that are not likely to include the NBC emblem. That is, for example, color metrics may be determined and matching utilized to identify segments that are not likely to include the colors and/or color patterns or characteristics of the NBC emblem. Those segments may be removed from the image to generate the filtered image 504 depicted in FIG. 5C.

[0065] Embodiments of the method 400 further include generating a feature map (block 410). In embodiments, the method 400 may also include removing a first set of features corresponding to the first set of segments (block 412) and masking features corresponding to the non-target emblems (block 414). A classifier is used to identify a candidate target emblem (block 416). In embodiments, the method 400 includes tracking the candidate target emblem (block 418) and identifying, based on the tracking, the candidate target emblem as the target emblem or as a false-positive (block 420). For example, in embodiments, the method 400 includes determining that the candidate target emblem does not appear in an additional video frame; and identifying, based on determining that the candidate target emblem does not appear in an additional video frame, the candidate target emblem as a false-positive. In embodiments, the method 400 may alternatively include determining that the candidate target emblem appears in an additional video frame; and identifying, based on determining that the candidate target emblem appears in an additional video frame, the candidate target emblem as the target emblem.

[0066] While embodiments of the present disclosure are described with specificity, the description itself is not intended to limit the scope of this patent. Thus, the inventors have contemplated that the claimed disclosure might also be embodied in other ways, to include different steps or features, or combinations of steps or features similar to the ones described in this document, in conjunction with other technologies.