

Title:
SYSTEM AND METHOD FOR ESTIMATING CHARACTERISTICS OF PERSONS OR THINGS
Document Type and Number:
WIPO Patent Application WO/2009/039350
Kind Code:
A1
Abstract:
In one embodiment of the subject application, there is a method and system of estimating the number of persons or things. The method and system include receiving data representing a visual image of the persons or things; analyzing the data in the frequency domain to observe one or more edge properties of one or more edges of an outline of the persons or things in the visual representation; and estimating a number of persons or things represented by the data by comparing the one or more edge properties against a model set of characteristics for the persons or things. A person or thing is counted in the number of persons or things for each set of the one or more edge properties that correlate to the model set of characteristics.

Inventors:
SLOWE THOMAS (US)
HALL PETER (CA)
DACHUK JOSEPH (CA)
Application Number:
PCT/US2008/076977
Publication Date:
March 26, 2009
Filing Date:
September 19, 2008
Assignee:
MICRO TARGET MEDIA HOLDINGS IN (CA)
SLOWE THOMAS (US)
HALL PETER (CA)
DACHUK JOSEPH (CA)
International Classes:
G06F19/00; G06V10/50
Foreign References:
US20070186165A12007-08-09
US20040249650A12004-12-09
US20030033145A12003-02-13
Attorney, Agent or Firm:
MIZER, Susan (925 Euclid Ave., 1150 Huntington Bldg., Cleveland, Ohio, US)
Claims:
What is claimed:

1. A system for extracting population information from a digital image comprising: means adapted for receiving digital image data inclusive of geographic data corresponding to an image of a geographical location; means adapted for receiving sector data defining at least one sub-portion of interest relative to the digital image data; means adapted for receiving boundary data; boundary detection means adapted for detecting at least one boundary area within the at least one sub-portion of interest in accordance with received boundary data; means adapted for generating population data in accordance with each detected boundary area; and means adapted for outputting the population data.

2. The system of claim 1 wherein the boundary data includes histogram data corresponding to at least one physical human characteristic such that the at least one boundary area includes a human.

3. The system of claim 2 further comprising means adapted for isolating feature data corresponding to at least one human feature disposed within the at least one boundary area.

4. The system of claim 3 wherein the human feature includes clothing information.

5. The system of claim 3 wherein the human feature includes a human body characteristic.

6. The system of claim 5 wherein the at least one sub-portion of interest is associated with delivery of a commercial message directed thereto.

7. The system of claim 6 further comprising:

Imanage\011495.000051\1015194.1-JXG 14

categorizing means adapted for categorizing each isolated human body characteristic as positive or negative; and means adapted for generating feedback data corresponding to effectiveness of the commercial message in accordance with an output of the categorizing means.

8. A method for extracting population information from a digital image comprising the steps of: receiving digital image data inclusive of geographic data corresponding to an image of a geographical location; receiving sector data defining at least one sub-portion of interest relative to the digital image data; receiving boundary data; detecting at least one boundary area within the at least one sub-portion of interest in accordance with received boundary data; generating population data in accordance with each detected boundary area; and outputting the population data.

9. The method of claim 8 wherein the boundary data includes histogram data corresponding to at least one physical human characteristic such that the at least one boundary area includes a human.

10. The method of claim 9 further comprising the step of isolating feature data corresponding to at least one human feature disposed within the at least one boundary area.

11. The method of claim 10 wherein the human feature includes clothing information.

12. The method of claim 10 wherein the human feature includes a human body characteristic.

13. The method of claim 12 wherein the at least one sub-portion of interest is associated with delivery of a commercial message directed thereto.

14. The method of claim 13 further comprising the steps of: categorizing each isolated human body characteristic as positive or negative; and generating feedback data corresponding to effectiveness of the commercial message in accordance with an output of the categorizing step.

Description:

SYSTEM AND METHOD FOR ESTIMATING CHARACTERISTICS OF PERSONS OR THINGS

Cross-Reference to Related Applications

This application claims priority to U.S. Provisional Application 60/973,678, filed 19 September 2007, and incorporates by this reference the disclosures of co-pending patent applications 11/558,031 entitled METHOD FOR DISPLAY OF ADVERTISING and filed 9 November 2006, 60/870,258 entitled SYSTEM AND METHOD FOR DISPLAY OF ADVERTISING, AND METHODS OF TRACKING VIEWINGS THEREOF and filed 15 December 2006, 60/871,507 entitled SYSTEM AND METHOD FOR DISPLAYING ADVERTISING AND TRACKING VIEWINGS THEREOF and filed 22 December 2006, 60/938,013 entitled SYSTEM AND METHOD FOR OBTAINING AND UTILIZING ADVERTISING INFORMATION and filed 15 May 2007, and 60/970,191 entitled SYSTEM AND METHOD FOR ESTIMATING CHARACTERISTICS OF PERSONS OR THINGS and filed 5 September 2007; including all appendices and other documents attached thereto.

Background of the Invention

The invention relates to systems and methods for estimating a number and/or other characteristics of persons or things, and particularly to systems and methods useful for estimating numbers and other characteristics of persons and other things included in visual representations and/or images of such persons, things and the like.

Summary of the Invention

In one embodiment of the subject application, there are provided apparatus, systems, methods, and computer programming for estimating a number of persons or things.

In one embodiment of the subject application, there is a method and system of estimating the number of persons or things. The method and system include receiving data representing a visual image of the persons or things; analyzing the data in the frequency domain to observe one or more edge properties of one or more edges of an outline of the persons or things in the visual representation; and estimating a number of persons or things represented by the data by comparing the one or more edge properties against a model set of characteristics for the persons or things. A person or thing is counted in the number of persons or things for each set of the one or more edge properties that correlate to the model set of characteristics.

The analyzing of the data may include separating one or more areas of the visual representation showing the persons or things from one or more background areas, and analyzing the one or more areas showing the persons or things to observe the one or more edge properties of the persons or things. The model set of characteristics can be predetermined. The model set of characteristics can be updated. The model set of characteristics can be updated by self-training.

The one or more background areas may be determined by comparison to a background model set of characteristics, and the background model may be updatable. The one or more edge properties may be determined to correlate to the model set of characteristics by meeting a threshold number of characteristics in the model set of characteristics.

Still other advantages, aspects and features of the subject application will become readily apparent to those skilled in the art from the following description wherein there is shown and described a preferred embodiment of the subject application, simply by way of illustration of one of the best modes best suited to carry out the subject application. As it will be realized, the subject application is capable of other different embodiments and its several details are capable of modifications in various obvious aspects all without departing from the scope of the subject application. Accordingly, the drawings and descriptions will be regarded as illustrative in nature and not as restrictive.

Brief Description of the Drawings

The foregoing and other aspects of the invention will become more apparent from the following description of specific embodiments thereof and the accompanying drawings which illustrate, by way of example only, the principles of the invention. In the drawings, where like elements feature like reference numerals (and wherein individual elements bear unique alphabetical suffixes):

Figure 1 is a flow chart block diagram of an exemplary method of estimating a number of persons or things in accordance with the invention;

Figure 2 provides transition charts relating to data analysis techniques useful in implementing embodiments of the method of Figure 1;

Figure 3 is a flow chart block diagram of an exemplary method of estimating a number of persons or things in accordance with the invention, incorporating the method of Figure 1;

Figure 4 is a graph showing a density and probability curve in an exemplary implementation of the method of Figure 1;

Figures 5 and 6 are schematic block diagrams of exemplary processes useful in implementing embodiments of the invention;

Figures 7 and 8 are schematic block diagrams of exemplary processes useful in implementing alternate embodiments of the invention;

Figure 9 is a block diagram of a further embodiment of the invention; and

Figure 10 is a block diagram of a still further embodiment of the invention.

Detailed Description of the Preferred Embodiments

The description which follows, and the embodiments described therein, are provided by way of illustration of an example, or examples, of particular embodiments of the principles of the present invention. These examples are provided for the purposes of explanation, and not limitation, of those principles and of the invention.

The apparatus, systems and methods described herein are useful in determining numbers and other characteristics of persons and/or other things present within or otherwise appearing in a given area or image, such as, for example, within a live or stored visual representation, such as still or moving images, or within a field of view. Such apparatus, systems and methods are particularly useful, for example, for implementation in computer-controlled applications for estimating the numbers and reactions of persons in a crowd being monitored, such as by a surveillance camera or cameras at an event. Such embodiments of the invention can be useful, for instance, for estimating the number and other characteristics of spectators at an event, numbers and other characteristics of persons at designated locations (at an event or otherwise), or the numbers or other characteristics of persons that are in the vicinity of certain buildings, landmarks, attractions, or advertising media. In addition to estimating numbers and other characteristics of persons in such circumstances, the estimation of numbers and other characteristics of other things can also be desired.

The estimation of the number and other characteristics of objects (be it either persons or things) within a visual representation can tend to be difficult, particularly where such persons or objects are present in high density, due to different factors including occlusion of objects by each other; varied motion or the lack thereof; unknown intrinsic camera parameters for obtaining the visual representation; unknown camera position relative to the scene of the visual representation; and/or unpredictable lighting changes.

Figure 1 is a flow chart block diagram of an exemplary method for use in estimating numbers or other characteristics of objects in accordance with the invention. Feature extraction process 100 of Figure 1 comprises providing data corresponding to a visual representation 102 to a computing system or other data processor for processing. At 104, visual representation data 102 is compared to data representing a background model, which permits the analysis of data representing "foreground" areas that may represent objects of interest, such as people. Such areas of interest are sometimes referred to herein as "blobs". As visual representation 102 is processed, the background model can be updated as appropriate, at 106, such as to adjust for daylight to nighttime changes and/or to stationary objects placed into the scene which become part of the background. In some applications, the extraction of foreground data for further number analysis can be limited to one or more particular areas of the visual representation that are of interest, for example, such as may be desirable if one is trying to determine the number of persons in line at a concession stand or the number of persons within a certain distance from an advertisement.
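The background comparison at 104 can be illustrated as a simple per-pixel differencing step. This is only a minimal sketch: the grayscale representation, the fixed threshold of 30, and the function name are illustrative assumptions rather than details taken from the specification, which contemplates statistical per-pixel models rather than a single reference image.

```python
import numpy as np

def extract_foreground_mask(frame, background, threshold=30):
    """Flag pixels that differ sharply from the background model.

    frame, background: 2-D arrays of grayscale intensities (0-255).
    Returns a boolean mask where True marks candidate "blob" pixels.
    """
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    return diff > threshold

# Toy example: a 4x4 background with one bright 2x2 "object" in the frame.
background = np.full((4, 4), 50, dtype=np.uint8)
frame = background.copy()
frame[1:3, 1:3] = 200          # new object enters the scene
mask = extract_foreground_mask(frame, background)
print(int(mask.sum()))          # 4 foreground pixels flagged
```

This toy version only shows how a sharp departure from the background flags candidate blob pixels for further analysis.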

As will be appreciated by those skilled in the relevant arts, "background" models useful in processes according to the invention are models of any information likely to be present in a representation of an image that is not of interest. Such models can represent, for example, static items located within a field of view, regardless of their relative position within the field of view, or predicted or expected items, such as items which appear on a recurrent or predictable basis and are not of interest to the analyst.

A background model can be defined using a number of characteristics of a background scene. For instance, for a scene at an event in which a number of persons present within a given area is to be estimated, a background model can be derived using a statistical model of the scene as it appears prior to entry of people to be counted. For example, one manner of analyzing a background model is to record data representing the background scene on a pixel-by-pixel basis. Referring to Figure 2, one concept of an exemplary method of updating a model of a stationary background, as shown at 106 in Figure 1, is shown. The entry of a new object into the visual field can be determined as a sharp change in the image characteristics over time. For example, changes within pixels representing the entirety or a sampling of an image can be observed over time, such that a sharp transition (shown as I: New Object) can be interpreted as entry into the scene of a new object, whereas a gradual change in the pixel (image) quality or characteristics can be interpreted to be merely a change in the background, such as due to changing lighting conditions. Should a new object be determined to have entered the scene, and if the new object remains in the scene for long enough, the background model can be updated to reflect that the background scene should include the new object.
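The sharp-versus-gradual distinction illustrated in Figure 2 can be sketched as a test on a single pixel's intensity history; the jump threshold, labels, and function name below are assumptions made for illustration only.

```python
import numpy as np

def classify_pixel_change(series, jump_threshold=40):
    """Classify a pixel's intensity history as 'new object' or 'background drift'.

    A single frame-to-frame jump above jump_threshold is read as a new
    object entering the scene; slow cumulative change is treated as a
    background change such as shifting lighting.
    """
    steps = np.abs(np.diff(np.asarray(series, dtype=float)))
    return "new object" if steps.max() > jump_threshold else "background drift"

gradual = [50, 55, 61, 66, 72, 78]      # slow lighting change
sudden  = [50, 52, 51, 180, 181, 182]   # sharp transition: object enters
print(classify_pixel_change(gradual))   # background drift
print(classify_pixel_change(sudden))    # new object
```

A production system would apply such a test per pixel and combine the results spatially before deciding to update the background model.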

Conversely, a short-term or other previously-undetected presence of a new object can be interpreted as entry of a person or other thing of interest into the scene. Thus, a person skilled in the relevant arts would appreciate that the processes of locating areas of interest and updating background models can inform one another. Furthermore, as shown in Figure 1, the process of updating the background model can also include manual intervention by an operator of the computing system for estimating the number of objects, especially for difficult cases in which the system has lower confidence in determining background change or area location. For example, the system can flag particular change scenarios for operator intervention, either in real time or as stored scenarios for later analysis. Thus, in an exemplary embodiment, background model 106 can include a set of statistical measures of the behavior of the pixels that collectively represent the appearance of the scene from a given camera in the form of an image, such as a video frame image. The background model is for measuring static areas of the image, such that when a new dynamic object enters the field of view of the camera, a difference can be detected between its visual appearance and the appearance of the scene behind it. For example, if a pixel is thought of as a random variable with modelable statistical traits, pixels depicting portions of a new object on the scene would be detected as having significantly changed statistical traits.

The identification of areas of interest within an image can be accomplished through visual comparison of a background model against another visual representation. Alternatively or additionally, foreground models can be constructed to detect foregrounds (i.e., areas of interest). This could, for example, be accomplished using orthogonal models to detect areas that appear to include objects for which a number or other characteristic is to be determined, which models set out generic features of the object. Another foreground detection method that can be used is motion detection, in which frame subtraction methods are used to determine foregrounds, if the object is a mobile one (such as persons or vehicles).

Referring back to Figure 1, a person of skill will appreciate that, optionally, background separation and the identification of areas of interest 104 can be skipped, and the visual representation can be passed directly to edge detection 108 without first removing or otherwise accounting for the background. While this may tend to be more computationally intensive, it can tend to reduce or eliminate the need to create and update a background model. For example, one way of proceeding can include using foreground modeling and/or segmentation processing to find any areas of interest. Regardless of whether areas of interest are identified, the process then can move to edge detection processing 108 of the area(s) of interest, or the entire visual representation 102, as the case may be. The following description refers to "blobs" or "areas of interest", but it is equally applicable to an implementation in which the entire visual representation 102 is analyzed.

In edge detection processing 108, the system analyses the areas of interest to observe one or more frequency properties of the edges of the outline(s) of each area of interest. For example, a frequency transform applied to an exemplary two-dimensional (such as an x, y pixel pair) signal of the visual representation 102 can be taken to determine edge properties of the area(s) of interest. A frequency decomposition algorithm known in the art, such as the Fourier transform, the discrete cosine transform and/or wavelets, can be used to reorganize image information in terms of frequency instead of space, which can be considered a visual image's innate form. Several frequency decomposition algorithms can be used to perform a subset of the normal decompositions, focusing only upon a range of frequencies. In general, these algorithms are termed "edge detection algorithms". In an exemplary implementation, the Sobel edge detection algorithm can be employed with standard settings for both horizontally and vertically oriented frequencies to obtain edge property information. Edge detection processing 108 can also be informed by a scene model 110, which, like the background model, can be updatable to describe a geometric relationship between a visual source (e.g. a camera) and a three-dimensional scene being observed by the visual source. Scene model 110 can, but need not, also describe a camera's parameters such as lens focal length, field of view, or other properties. Scene model 110 can be used with edge detection 108 to help inform processing 108 in its detection of edge properties of any identified areas of interest.
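As a rough illustration of the horizontally and vertically oriented Sobel decomposition described above, the following sketch convolves a synthetic step-edge image with the two standard 3x3 Sobel kernels. The hand-rolled "valid" convolution and the test image are assumptions made so the example is self-contained; a real system would use an optimized library routine.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T  # transpose gives the vertically oriented kernel

def convolve2d(image, kernel):
    """Minimal 'valid'-mode 2-D correlation, sufficient for Sobel filtering."""
    h, w = kernel.shape
    rows = image.shape[0] - h + 1
    cols = image.shape[1] - w + 1
    out = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            out[r, c] = np.sum(image[r:r+h, c:c+w] * kernel)
    return out

# A vertical step edge: the horizontal kernel responds, the vertical one does not.
image = np.zeros((5, 6))
image[:, 3:] = 255.0
gx = convolve2d(image, SOBEL_X)
gy = convolve2d(image, SOBEL_Y)
magnitude = np.hypot(gx, gy)   # edge strength as a vector magnitude
print(float(magnitude.max()))  # 1020.0, at the columns straddling the edge
```

The combined magnitude image is what the later feature-extraction step thresholds when deciding which pixels count as edge features.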

Once edge detection 108 is complete, the process moves on to breaking each edge, and its associated edge properties, into oriented feature(s) at 112. An oriented feature is, for example, an edge property that relates to the orientation of an edge on the visual representation, such as vertical, horizontal, or diagonal, including at various degrees and angles. Edge properties, such as oriented features, can be tabulated or tracked as a feature list 114.

Feature list 114 can for example include a plot or a histogram of information for any edge property, or feature, that is broken out at 112. To estimate the number of objects in the visual representation, feature list 114 can be compared against a model set of characteristics for the object whose number is being estimated. For instance, if the number of persons is being estimated, there can be edge characteristics of persons that are set out in the model, which can be compared to feature list 114 to estimate the number of persons in visual representation 102. In one implementation, it has been found that a human model with eight defined edge characteristics can provide a fairly reliable indication of person(s) in a visual representation. In the exemplary implementation, the eight edge features are derived from their orientations, and can be computed as follows. The image is convolved with a horizontal and vertical Sobel filter using standard settings, resulting in two corresponding horizontal and vertical images, in which the intensity of the pixel value at any given location implies a strength of an edge. The total strength of the edge at any particular point in the image can therefore be defined as a vector magnitude as calculated from the horizontal and vertical edge images. In this example, if this magnitude is greater than half the maximum magnitude across the entire image being considered, then it is considered a feature. The particular feature can be measured for its orientation by calculating the vector angle. For example, a 360 degree range can be broken up into eight equal parts, each representing 45 degrees, the first of which can be defined to start at -22.5 degrees. A histogram of these eight features can then be assembled based upon the number of incidences of each feature within a given region. It will be appreciated that the example given above is a simplification of an approach that can incorporate the use of more than a slice of image frequencies, coupled with spatial constraints that can further model the outline of object(s) in an area of interest.
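The eight-bin oriented-feature computation described above can be sketched as follows: keep gradients whose magnitude exceeds half the regional maximum, measure each survivor's vector angle, and bin it into one of eight 45-degree ranges, the first starting at -22.5 degrees. The toy gradient values are illustrative assumptions.

```python
import numpy as np

def orientation_histogram(gx, gy):
    """Bin strong edges into eight 45-degree orientation features.

    A pixel counts as a feature when its gradient magnitude exceeds half
    the maximum magnitude in the region; the first bin starts at -22.5 deg.
    """
    magnitude = np.hypot(gx, gy)
    strong = magnitude > 0.5 * magnitude.max()
    angles = np.degrees(np.arctan2(gy, gx))
    bins = np.floor((angles + 22.5) / 45.0).astype(int) % 8
    hist = np.zeros(8, dtype=int)
    for b in bins[strong]:
        hist[b] += 1
    return hist

# Two strong horizontal-gradient pixels and one weak pixel that is discarded.
gx = np.array([[10.0, 10.0, 1.0]])
gy = np.array([[0.0, 0.0, 0.0]])
hist = orientation_histogram(gx, gy)
print(hist.tolist())   # [2, 0, 0, 0, 0, 0, 0, 0]
```

The resulting eight-element histogram is one concrete form the feature list 114 could take for a single region.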

Thus, in an embodiment the estimation of a number of objects can be handled by the computing system by matching a histogram of feature list 114 against an object model and looking for the number of matches. In the example of a person, one or more edge characteristics can be defined for each body part (such as the head and/or arms), which can be matched against feature list 114 generated from visual representation 102. From the number of resulting matches, an estimate can be made, within desired or otherwise-specified error margins as dictated in part by the level of detail in the object model, of the number of persons (i.e. objects) in visual representation 102. In the embodiment, the system can be trained by providing multiple examples of humans at a distance and crowds varied in density and numbers, which can be hand labeled for location and rough outline. The training can be a fully automated process, such as with artificial intelligence, or partially or wholly based on manual operator intervention. With this training information, a feature histogram can be generated for each person, where it is normalized for person size given by a scene model. Each of these "people models" can then be used to train a machine-learning algorithm such as a support vector machine, neural network, or other algorithm, resulting in a generalized model of human appearance ("GMHA") in the feature space. Thus, a simple initial approach can be to accumulate individual feature histograms to create a collection of features of an entire group, which can then be normalized by a total number of people used for training to result in the GMHA. During live operation, new images and/or sub-parts thereof can be feature-extracted, normalized and used to produce feature histogram(s). These new feature histogram(s) can then be compared to the GMHA, using a machine learning algorithm such as those described above. In a basic example, the number of incidences of GMHA features within the new feature histograms can denote the number of objects (i.e., persons or things) within a given visual representation, such as an image or a sub-image.
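The simple accumulate-and-normalize GMHA training, and a basic counting comparison, might be sketched as below. The least-squares style fit used to count "people models" in a scene histogram is an assumed stand-in for the machine-learning comparison the specification leaves open, and the training histograms are invented for illustration.

```python
import numpy as np

def train_gmha(person_histograms):
    """Accumulate per-person feature histograms and normalize by the number
    of training people, yielding a generalized model of human appearance."""
    stacked = np.array(person_histograms, dtype=float)
    return stacked.sum(axis=0) / len(person_histograms)

def estimate_count(scene_histogram, gmha):
    """Estimate how many 'person models' worth of features the scene holds
    (a naive least-squares fit of scene = n * gmha)."""
    gmha = np.asarray(gmha, dtype=float)
    scene = np.asarray(scene_histogram, dtype=float)
    return float(scene @ gmha / (gmha @ gmha))

# Three hand-labeled training people with similar 8-bin feature histograms.
training = [[4, 2, 1, 0, 0, 1, 2, 4],
            [5, 2, 1, 0, 0, 1, 2, 5],
            [3, 2, 1, 0, 0, 1, 2, 3]]
gmha = train_gmha(training)

# A scene histogram holding roughly two people's worth of features.
scene = 2 * np.array(training[0])
print(round(estimate_count(scene, gmha)))   # 2
```

A support vector machine or neural network would replace the naive fit here, but the train-then-compare flow is the same.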

Thus, it will be appreciated that greater or fewer characteristics can be defined in an object model with respect to the object being estimated, which can provide for greater or lesser confidence in an estimation of the number of objects in a visual representation being analyzed. Since the model characteristics, and the threshold or criteria for declaring a match can all be set and adjusted as desired for a particular application, the estimation process can tend to be optimized towards particular applications. For example, for the estimation of numbers of persons in dense crowds, the system would tend to have a more detailed object model of a human head and/or shoulder, so that only a partial view of the head and/or shoulder would be sufficient to generate the edge property that would result in a match. Referring for example to an implementation for counting persons in a crowd, as shown in Figure 3, process 100 provides feature list 114 (not shown in Figure 3) to comparator 308 for matching edge properties of the visual representation 102 against features of object model 306. Also shown in Figure 3 is a training process that can optionally be used to update the object feature model 306. Therein, a video archive of crowds can be fed through feature extraction process 100 to generate an archive feature list that the system can learn at 304 as being characteristics of persons in a crowd, which can then be used to update or revise model 306 with edge properties as appropriate.

From a comparison of feature list 114 with object model 306 in block 308, a number (or density)/probability curve 310 can be constructed to track if a match has been made. An example of such a curve is shown in Figure 4. Such a curve shows the number (or density) of persons at different probabilities, and permits a performance threshold to be set by a user of the system. For example, the curve of Figure 4 permits reports to be generated to state that a certain number of persons are shown in the visual representation at a particular percentage probability.
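Reading a count off a number (or density)/probability curve at a user-set performance threshold, as described for Figure 4, could look like the following; the curve values and helper name are hypothetical.

```python
def counts_at_threshold(curve, min_probability):
    """From (count, probability) pairs, report the largest count whose
    probability meets a user-set performance threshold."""
    eligible = [count for count, p in curve if p >= min_probability]
    return max(eligible) if eligible else 0

# Hypothetical density/probability curve in the spirit of Figure 4:
# larger counts are asserted with lower confidence.
curve = [(10, 0.99), (20, 0.95), (30, 0.80), (40, 0.50), (50, 0.20)]
print(counts_at_threshold(curve, 0.80))   # 30
print(counts_at_threshold(curve, 0.95))   # 20
```

This is how a report could state, for example, that at least 30 persons are shown at 80 percent probability.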

In alternate embodiments, additional or alternative characteristics of persons or other objects can be determined in addition to merely the number of objects. For example, if the system is used to estimate the number of persons, more parameters regarding the persons can be specified, such as number of persons of particular age/gender/ethnicity, number of persons with positive facial expressions, number of persons with negative facial expressions, or number of persons wearing clothes of a particular color or style. In particular, for implementations relating to advertising, it can be desirable to be able to estimate or otherwise determine the number of persons that react "positively" or "strongly" to the advertising by observing the number of persons with "positive" or "strong" facial expressions in the vicinity of the advertising. For example, in advertising media, one audience measurement metric is whether there is a strong reaction to advertising that can be correlated to memory retention by the audience. It will be appreciated that for other objects, different estimation parameters or characteristics may be specified.

Referring to Figure 5, there is shown an example of a video analysis architecture 500 for estimating the number and determining other characteristics of a group of persons within a video image. In architecture 500, visual representation 502 of the group of persons is analyzed by the feature extraction process 100, customized for persons as described above, in addition to one or more of face view estimator 506, gender/ethnicity estimation 508, expression estimation 510, or other analysis 512. Feature models 516 relating to each of these analysis processes can then be compared with extracted features from each or any of 100, 506, 508, 510 and 512 at block 514, so as to determine a number of matches for each feature to estimate the number of persons fitting parameters defined with the feature extraction in 506, 508, 510 and 512. Models 516 in this example could include model object features in model 306 described above, as well as other features relating to the estimation parameters defined with 506, 508, 510 and 512. A person of skill in the relevant arts will appreciate that these model characteristics and the comparison thereof to generate number (density)/probability curves 518 are similar to that described above with respect to curve 310, and so such details are not described again with respect to 506, 508, 510, 512 and 518.

While the foregoing has been described with reference to a single source of visual information, the apparatus, systems and methods described herein can be applied to multiple sources of visual information so as to provide scalability over large areas. Alternatively, if two or more visual information sources are directed at the same physical location, the estimates resulting from each source can be correlated to provide greater confidence in the estimate of the number of the object in the location covered by the visual information sources. For example, building on the example described above with reference to Figure 5, in Figure 6 there is shown an architecture 600 (designated as "macro" as opposed to the "micro" designation of architecture 500 shown in Figure 5) that utilizes multiple cameras to provide multiple visual representations of different locations of an event, in which a micro architecture 500 is associated with each camera in order to generate number estimations and number (density)/probability curves 605 for the event. In architecture 600, any overlaps in views captured by different cameras can be calculated and stored as global scene models 604, which can be used to ensure that the same objects, such as persons, are not counted more than once due to the object appearing within views of two or more cameras or visual sources. The total cumulative number (density)/probability estimates of an event can then be created as curves 606, representing estimates as seen by the entire camera or visual source network.
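The macro-level combination in architecture 600, which sums per-camera estimates while using the global scene models to avoid counting the same people twice where camera views overlap, can be sketched as a simple inclusion-exclusion total. The camera names and overlap counts below are hypothetical.

```python
def combine_camera_counts(per_camera_counts, pairwise_overlaps):
    """Inclusion-exclusion style total: sum per-camera estimates, then
    subtract people counted in regions seen by two cameras."""
    return sum(per_camera_counts.values()) - sum(pairwise_overlaps.values())

per_camera = {"cam_a": 120, "cam_b": 95, "cam_c": 60}
# People standing where two camera views overlap, per the global scene model.
overlaps = {("cam_a", "cam_b"): 15, ("cam_b", "cam_c"): 5}
print(combine_camera_counts(per_camera, overlaps))   # 255
```

Higher-order overlaps (three or more cameras seeing the same spot) would need additional inclusion-exclusion terms, which the sketch omits.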

The output of the micro/macro architectures need not be number (density)/probability estimates or curves; the system can be specified to output other types of information as well, including, for example, statistics and counts. Referring for example to Figures 7 and 8, micro architecture 700 and macro architecture 800, similar to architectures 500 and 600 respectively, are shown. However, in place of outputting a number (density)/probability curve as in architecture 500, architecture 700 is set to estimate and output demographic-based counts and scene statistics (such as of visual representation 502). Thus, macro architecture 800 shown in Figure 8 can be utilized to measure large scale event statistics similarly to architecture 600, but output results as event demographic counts and statistics 806.

As will be appreciated by those skilled in the relevant arts, any type of information derivable from data representing images may be used as output, including, particularly in advertising applications, those types of data useful in assessing the effectiveness of displayed images, including, for example, advertising images.

A camera used in a system described herein can be calibrated in order to give greater confidence in number estimations. For example, a camera can be calibrated to generate geometric relationship(s) between the camera and the three-dimensional scene being observed by the camera. Such calibration can be automatic or manual, and can include the use of template patterns, calibration periods and/or computer annotation of video frames. For instance, an automatic approach can leverage any prior or common knowledge about the size of readily detectable objects. As an example, persons can generally be readily detected through an approach involving background segmentation as discussed above. If an algorithm is tuned to assume that objects of particular pixel masses are persons, the knowledge that people are generally roughly 170 cm tall can be used to calculate a rough relationship between the size of objects in an observed scene and their pixel representation(s). Thus, if the algorithm performs this task upon people standing in at least 3 locations in an image, an estimate of the camera's orientation relative to the physical scene can be calculated.
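The height-based calibration described above can be sketched as follows, assuming detected people at three or more image rows and a nominal height of 170 cm. The linear cm-per-pixel scale model and all names are illustrative assumptions, not the patent's method:

```python
def fit_scale_model(observations, person_height_cm=170.0):
    """Fit cm-per-pixel as a linear function of image row y:
    scale(y) = a*y + b, from people detected at >= 3 image locations.
    `observations` is a list of (image_row_y, pixel_height) pairs.
    Closed-form least squares; no external libraries needed."""
    pts = [(y, person_height_cm / h_px) for (y, h_px) in observations]
    n = len(pts)
    sy = sum(y for y, _ in pts)
    ss = sum(s for _, s in pts)
    syy = sum(y * y for y, _ in pts)
    sys_ = sum(y * s for y, s in pts)
    a = (n * sys_ - sy * ss) / (n * syy - sy * sy)
    b = (ss - a * sy) / n
    return a, b

# People lower in the frame (larger y) appear bigger -> smaller cm/px.
obs = [(200, 100.0), (400, 150.0), (600, 200.0)]  # (row y, pixel height)
a, b = fit_scale_model(obs)
print(round(a, 6), round(b, 4))
```

From the fitted scale(y), pixel distances at a given image row can be converted to rough physical distances, which in turn constrains the camera's orientation relative to the ground plane.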

Referring to Figure 9, there is shown an embodiment of the invention in which an estimation system is configured for use with mobile station 900. Station 900 can include a vehicle, or a mobile platform that can be moved by a person or vehicle from location to location. The embodiment of Figure 9 is useful, for example, for setting up an estimation system at temporary locations, with one or more stations 900 deployed at the time and place at which an event is taking place and estimations are desired.

As shown in Figure 9, a plurality of cameras 902 providing one or more visual representations can be connected to station 900 via post 906. For some embodiments, it can be desirable to elevate cameras 902 above the persons or objects to be counted, so that the depth perception of the visual representation(s) can be improved. It will be appreciated that in other embodiments, a mobile station can have multiple posts and/or other camera mounts to provide additional cameras 902, visual sources, and/or viewing angles. Each station 900, with its array of cameras 902, can monitor an area 904 defined by the viewing angle and range of its associated cameras 902. In this example, mobile station 900 is shown to be among persons in area 904, whose numbers and/or characteristics, including demographics, can be estimated by the systems and methods of estimation described above operating in conjunction with mobile station 900. Such estimation can occur locally at station 900; alternatively, raw, compressed, and/or processed data can be transferred live to another location for processing. Still alternatively, such data can be stored at station 900 for a time, and then off-loaded or transferred for processing, such as when mobile station 900 returns to dock at a processing base or centre.

For the example shown in Figure 9, processing as described above with reference to Figures 1 to 8 can be conducted locally at station 900 with processing 908. Processing 908 includes models for estimating, for the crowd in area 904, different numbers and characteristics such as those set out in data sets 910 and 912. These include head counts (or an estimate of the number of persons in area 904), traffic density, face views, length of face views, ethnicity of viewers, gender of viewers, emotional reaction (such as to an advertisement associated with station 900) and/or group demographics.
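One purely illustrative way to structure the kinds of estimates listed above; the field names are assumptions for this sketch, not the actual contents of data sets 910 and 912:

```python
from dataclasses import dataclass, field

@dataclass
class StationEstimates:
    """Illustrative container for per-station crowd estimates."""
    head_count: int = 0                      # estimated persons in the area
    traffic_density: float = 0.0             # persons per unit area
    face_views: int = 0                      # faces turned toward the station
    mean_face_view_seconds: float = 0.0      # average length of a face view
    gender_counts: dict = field(default_factory=dict)
    ethnicity_counts: dict = field(default_factory=dict)
    emotion_counts: dict = field(default_factory=dict)

est = StationEstimates(head_count=42, face_views=17,
                       gender_counts={"male": 20, "female": 22})
print(est.head_count, est.face_views)  # → 42 17
```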

Systems and methods of estimation can also be used at a stationary position. Referring to Figure 10, there is shown an exemplary embodiment in which the systems and methods of estimation are implemented at a fixed location, such as with a fixed billboard advertisement (shown in side view). System 1000 can be set up with a billboard-style advertisement that may have a passive or fixed image, or an actively changing image or multimedia presentation. In system 1000, there can be provided an enclosure 1004 having one or more cameras 1002 that are set up to estimate the number and characteristics of possible observers of the billboard advertisement or of objects near the billboard. System 1000 further includes a battery 1012 to operate the system's electronics and computing circuitry, and a solar panel 1010 to charge battery 1012 during daylight. Alternatively, wired AC power can be used as well. System 1000 further includes processing 1014 to process the visual representation(s) observed by camera(s) 1002, such as described above with reference to Figures 1 to 8. System 1000 is also equipped with a transceiver 1006 connected to antenna 1008 for wirelessly transmitting the results of processing 1014 to a remote location for review. For example, the results of processing 1014 (such as number/probability curves, demographic information, face reactions and/or event statistics) can be transferred from system 1000 to a server (not shown), which then posts the results for access over the Internet or a private network. Alternatively, raw, compressed or processed data from camera(s) 1002 can be stored and later transferred, or transferred live, through wired or wireless connections to a remote location for estimation processing as described above with reference to Figures 1 to 8.

For the embodiment shown, system 1000 is set up near a road 1016 with sidewalk 1020. Cameras 1002 are set up for observing vehicles 1018 on road 1016 and persons 1022 on sidewalk 1020, so as to be able to estimate the number of persons and/or vehicles that come into proximity of an advertisement associated with system 1000, and to estimate characteristics such as demographics and/or reactions of viewers to the advertisement, such as face view estimations, gender/ethnicity estimation, face expression estimation, length of face views, person/vehicle counts, traffic density, emotional reaction to the advertisement, and/or demographics.

The observation of persons 1022 on sidewalk 1020 is similar to that described above with respect to Figures 1 to 9, and so the details are not repeated here. With respect to vehicles 1018, in addition to being trained to estimate the numbers and characteristics of the vehicles, system 1000 can also be trained to detect the direction of travel of vehicles 1018, so as to be able to determine the length of time that a billboard advertisement associated with system 1000 is, for example, in direct frontal view of a vehicle 1018, or the number of vehicles 1018 and the length of time that they are at an angle that is not directly frontal but from which the billboard advertising is still visible. By utilizing higher resolution cameras 1002, it is also possible to observe and estimate the number and characteristics of persons in vehicles 1018 as well.
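The frontal-view timing described above can be sketched as follows, assuming per-frame vehicle headings relative to the billboard are already available from tracking; the angular tolerance, frame rate, and all names are illustrative assumptions:

```python
def frontal_view_seconds(headings, fps=30.0, frontal_tolerance_deg=15.0):
    """Count the seconds a tracked vehicle spends facing the billboard
    roughly head-on. `headings` is the per-frame angle (degrees) between
    the vehicle's direction of travel and the billboard's facing
    direction, with 0 meaning direct frontal view."""
    frontal_frames = sum(1 for h in headings
                         if abs(h) <= frontal_tolerance_deg)
    return frontal_frames / fps

# Vehicle drives toward the billboard for 60 frames at 30 fps, then
# turns away: 60 / 30 = 2.0 seconds of direct frontal view.
track = [0.0] * 60 + [45.0] * 30
print(frontal_view_seconds(track))  # → 2.0
```

A second, wider tolerance band could be used in the same way to measure the time a vehicle is not directly frontal but still at an angle from which the billboard is visible.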

While the foregoing invention has been described in some detail for purposes of clarity and understanding, it will be appreciated by those skilled in the relevant arts, once they have been made familiar with this disclosure, that various changes in form and detail can be made without departing from the true scope of the invention as set forth in the appended claims. The invention is therefore not to be limited to the exact components or details of methodology or construction set forth above. Except to the extent necessary or inherent in the processes themselves, no particular order to steps or stages of methods or processes described in this disclosure, including the Figures, is intended or implied. In many cases the order of process steps may be varied without changing the purpose, effect, or import of the methods described.
