

Title:
METHOD, DEVICE, USER EQUIPMENT AND COMPUTER PROGRAM FOR OBJECT EXTRACTION FROM MULTIMEDIA CONTENT
Document Type and Number:
WIPO Patent Application WO/2015/162027
Kind Code:
A2
Abstract:
A method and a device (100) for extracting objects of a multimedia content are disclosed. The multimedia content comprises multiple image segments and each image segment comprises multiple pixels. The device calculates (S1) a force reflecting an edge strength for at least two pixels in an image segment of the multiple image segments. The device (100) identifies (S2) at least two interest points among the pixels in the image segment based on the calculated forces for the at least two pixels. Each interest point is associated with a directionality and exit points. The directionality gives an angle indicative of a strength of the force. The exit points are used for connecting one interest point to another interest point. The device (100) creates (S3) at least one contour by connecting the at least two interest points by using at least one predefined pattern, selected from a number of predefined patterns. The device (100) extracts (S4) the at least one created contour. Furthermore, the device (100) extracts (S5) at least one object from the at least one extracted contour. A corresponding computer program is also disclosed.

Inventors:
ARNGREN TOMMY (SE)
KORNHAMMAR TIM (SE)
Application Number:
PCT/EP2015/057968
Publication Date:
October 29, 2015
Filing Date:
April 13, 2015
Assignee:
ERICSSON TELEFON AB L M (SE)
Other References:
None
Attorney, Agent or Firm:
EGRELIUS, Fredrik (Patent Unit Kista DSM, Stockholm, SE)
Claims:
CLAIMS

1. A method for extracting objects of a multimedia content, wherein the multimedia content comprises multiple image segments and each image segment comprises multiple pixels, wherein the method comprises:

calculating (S1) a force reflecting an edge strength for at least two pixels in an image segment of the multiple image segments,

identifying (S2) at least two interest points among the pixels in the image segment based on the calculated forces for the at least two pixels, wherein each interest point is associated with a directionality and exit points, wherein the directionality gives an angle indicative of a strength of the force, and wherein the exit points are used for connecting one interest point to another interest point,

creating (S3) at least one contour by connecting the at least two interest points by using at least one predefined pattern, selected from a number of predefined patterns, extracting (S4) the at least one created contour, and

extracting (S5) at least one object from the at least one extracted contour.

2. The method according to claim 1, wherein the extracted objects are represented by a number of frequencies.

3. The method according to claim 2, wherein the method comprises:

indexing the objects, and

storing the objects in an index table.

4. The method according to any one of claims 1-3, wherein the extracted objects are stored as images.

5. The method according to any one of claims 1-4, wherein additional rules are imposed for connections between the interest points, wherein the additional rules include:

the connections are only started at an exit of an interest point and end in the exit of another interest point, or

an interest point is not connected to another interest point if the difference of their forces is above a certain threshold.

6. The method according to any one of claims 1-5, wherein the method comprises:

setting, for each non-interest point, a counter to check how many times said each non-interest point is used in creating connections.

7. The method according to any one of claims 1-6, wherein the at least one extracted object has a shape obtained by morphological filtering applied on extracted contours.

8. A device (100) for extracting objects of a multimedia content, wherein the multimedia content comprises multiple image segments, wherein each image segment comprises multiple pixels, wherein the device (100) is configured to: calculate a force reflecting an edge strength for at least two pixels in an image segment of the multiple image segments,

identify at least two interest points among the pixels in the image segment based on the calculated forces for the at least two pixels, wherein each interest point is associated with a directionality and exit points, wherein the directionality gives an angle indicative of a strength of the force, and wherein the exit points are used for connecting one interest point to another interest point,

create at least one contour by connecting the at least two interest points by using at least one predefined pattern, selected from a number of predefined patterns,

extract the at least one created contour, and

extract at least one object from the at least one extracted contour.

9. The device (100) according to claim 8, wherein the extracted objects are represented by a number of frequencies.

10. The device (100) according to claim 9, wherein the device (100) is configured to index the objects, and store the objects in an index table.

11. The device (100) according to claim 8, wherein the extracted objects are stored as images.

12. A computer program for extracting objects of a multimedia content, wherein the computer program comprises executable code units, which, when executed by a device (100), cause the device (100) to perform the method according to any one of claims 1-7.

13. A user equipment comprising a device according to any one of claims 8-11.

Description:
METHOD, DEVICE, USER EQUIPMENT AND COMPUTER PROGRAM FOR OBJECT EXTRACTION FROM MULTIMEDIA CONTENT

TECHNICAL FIELD

[0001] This disclosure relates to a method, a device, a user equipment and a computer program for object extraction from multimedia content.

BACKGROUND

[0002] The World Wide Web (the "Web") grows larger and larger every day. Many users of the Web access it multiple times a day every day of the week using a variety of communication devices, such as Personal computers (PC), phones, tablets, cameras and Internet Protocol Television (IP-TV) devices. Advances in mobile technologies have made it easier for a user to capture multimedia content (e.g. audio content, video content, image content), and different social network and video sharing Web sites make it possible for the user to share such content on the Web.

[0003] As the amount of digital data increases, search engines are being deployed not only for Internet search, but also for proprietary, personal, or special-purpose databases, such as personal multimedia archives, user generated content sites, proprietary data stores, workplace databases, and others. For example, personal computers may host search engines to find content on the entire computer or in special-purpose archives (e.g., personal music or video collection). User generated content sites, which host multimedia and/or other types of content, may provide custom search functionality tailored to that type of content.

[0004] The value of information carried in multimedia content often depends on how easily it can be found, retrieved, accessed, filtered and managed and there is accordingly a growing need to have it processed in such a way that the extracted information it carries is maximized. Multimedia content typically contains several layers of information, the so-called modalities, e.g. image, sound, spoken language, or text.

[0005] Recently, objects have been recognized as an important modality in multimedia content and a lot of research has been done in the area of object extraction. In the field of computer vision there are different approaches for identifying objects and shapes, e.g. object recognition, image registration and image segmentation with two distinct approaches: segmentation by texture or by color.

[0006] The methods for object recognition focus on matching existing objects, images or shapes, to each tested image in a data set or a database. The problem with these methods is that they are computationally very heavy.

[0007] Image registration methods analyze parts of the image by selecting fixed points in the image, which then are compared with other images. This typically requires high similarity between the images and is therefore much more sensitive to the quality, translation, scaling, rotation or color difference of the images.

[0008] Image segmentation is the process of partitioning a digital image into multiple segments (sets of pixels). The goal of segmentation is to separate different parts of an image by colors, edges or patterns. An advantage of this method is the ability to analyze each part separately. Edge detection is one example of image segmentation. This method uses difference in colors to identify edges. Edge detection is a noise sensitive approach but at the same time it is more computationally efficient. The result of edge detection is contours that can be extracted from the image.

[0009] Solutions available today that can segment images generate separated parts of the image that are difficult to compare with other shapes. With these solutions, different segments and shapes can be found. However, these shapes do not have enough detail to distinguish objects.

[0010] What is desired is, for example, a fast and efficient method for object extraction that would improve the search results.

SUMMARY

[0011] It is a general objective to provide a fast and efficient object extraction from multimedia content. This objective is met by at least one of the embodiments disclosed herein.

[0012] An aspect of the embodiments defines a method for extracting objects of a multimedia content. The multimedia content comprises multiple image segments. Each segment comprises multiple pixels. The method comprises calculation of a force reflecting an edge strength for at least two pixels in an image segment of the multiple image segments. The method further comprises identification of at least two interest points among the pixels in the image segment based on the calculated forces for the at least two pixels. This means that the identification is based on the calculated forces of the pixels. Each interest point is associated with a directionality and exit points. The directionality gives an angle indicative of a strength of the force, i.e. an angle in which the force is strongest. The exit points are used for connecting one interest point to another interest point. The method further comprises creation of at least one contour by connecting the at least two interest points by using at least one predefined pattern, selected from a number of predefined patterns. Expressed differently, the method further comprises creation of at least one contour from the identified interest points, where a contour is created by connecting the interest points by some predefined patterns. The method further comprises extraction of at least one created contour, e.g. obtained by connecting the interest points. The method further comprises extraction of at least one object from the at least one extracted contour.

[0013] Another aspect of the embodiments defines a device for extracting objects of a multimedia content, wherein the multimedia content comprises multiple image segments, wherein each image segment comprises multiple pixels. The device is configured to calculate a force reflecting an edge strength for at least two pixels in an image segment of the multiple image segments. Furthermore, the device is configured to identify at least two interest points among the pixels in the image segment based on the calculated forces for the at least two pixels, wherein each interest point is associated with a directionality and exit points, wherein the directionality gives an angle indicative of a strength of the force, and wherein the exit points are used for connecting one interest point to another interest point. Moreover, the device is configured to create at least one contour by connecting the at least two interest points by using at least one predefined pattern, selected from a number of predefined patterns. Additionally, the device is configured to extract the at least one created contour, and to extract at least one object from the at least one extracted contour.

[0014] According to one embodiment of the device, there is provided a device for extracting objects of a multimedia content having the following characteristics. The device comprises a force calculator for calculating a force reflecting an edge strength for the at least two pixels in an image segment of the multiple image segments. The device further comprises an identifier, also referred to as an identifier module, for identifying at least two interest points among the pixels in the image segment based on the calculated forces for the at least two pixels. Each interest point is associated with a directionality and exit points. The directionality gives an angle indicative of a strength of the force. The exit points are used for connecting one interest point to another interest point. The device further comprises a contour creator for creating at least one contour by connecting the at least two interest points by using at least one predefined pattern. Thus, the at least one contour may be created from the identified interest points. The device further comprises a contour extractor for extracting the at least one created contour, e.g. obtained by connecting the interest points. The device further comprises an object extractor for extracting the at least one object from the at least one extracted contour.

[0015] The device for extracting objects of a multimedia content comprising multiple image segments may also comprise a processor and a memory, said memory containing instructions executable by said processor whereby said device is operative to: calculate a force for at least two pixels in the image segment, identify at least two interest points among the pixels in the image segment based on the calculated forces of pixels, create at least one contour from the identified interest points, extract at least one created contour obtained by connecting the interest points and extract at least one object from the at least one extracted contour.

[0016] A further aspect of the embodiments defines a computer program for extracting objects of a multimedia content. The multimedia content comprises multiple image segments. The computer program comprises a force calculator module for calculating a force reflecting an edge strength for at least two pixels in an image segment of the multiple image segments. The computer program further comprises a module for identifying at least two interest points among the pixels in the image segment based on the calculated forces for the at least two pixels. The computer program further comprises a contour creator module for creating at least one contour from the identified interest points, where a contour is created by connecting the at least two interest points by using at least one predefined pattern, selected from a number of predefined patterns. The computer program further comprises a contour extractor module for extracting at least one created contour, e.g. obtained by connecting the interest points. The computer program further comprises an object extractor module for extracting at least one object from the at least one extracted contour.

BRIEF DESCRIPTION OF THE DRAWINGS

[0017] The accompanying drawings, which are incorporated herein and form part of the specification, illustrate various embodiments and, together with the description, further serve to explain the principles of at least one embodiment herein and to enable a person skilled in the pertinent art to make and use one or more embodiments herein.

[0018] FIG. 1 illustrates functional components of a system wherein the embodiments can be implemented.

[0019] FIG. 2 illustrates a flowchart representing a sequence of actions from the moment a new multimedia content is accessed to the moment when the objects are extracted from the new content, according to an embodiment.

[0020] FIG. 3 illustrates an example of an image segment and its calculated force, according to an embodiment.

[0021] FIG. 4 illustrates how the interest points are identified, according to an embodiment.

[0022] FIG. 5 shows examples of the patterns that can be used for connecting the identified interest points, according to an embodiment.

[0023] FIG. 6 depicts a rule according to which the interest points whose difference in forces is above the threshold are not connected by any pattern, according to an embodiment.

[0024] FIG. 7 shows how the remaining unconnected pixels can be connected by following the directions of the strongest force, according to an embodiment.

[0025] FIG. 8 summarizes the steps of forming of contours from the identified interest points, according to an embodiment.

[0026] FIG. 8a' shows an example of an image segment for which the contours are to be found, according to an embodiment.

[0027] FIG. 8b illustrates the straight lines that are extracted from the image segment in FIG. 8a', according to an embodiment.

[0028] FIG. 8c shows the extracted patterns after a search for semi-straight lines is performed, according to an embodiment.

[0029] FIG. 8d shows the extracted patterns when the arc-like structures are included as well, according to an embodiment.

[0030] FIG. 8e illustrates the extracted contours after the attempt to connect the remaining unconnected interest points by following the direction of the strongest force, according to an embodiment.

[0031] FIG. 9 illustrates an example of morphological operations of dilation and erosion for a rectangular structuring element, according to an embodiment.

[0032] FIG. 10a illustrates an extracted contour from an image segment, according to an embodiment.

[0033] FIG. 10b illustrates a result of flood filling performed on the contour in FIG. 10a, according to an embodiment.

[0034] FIG. 10c shows the difference between dilation and erosion of the flood- filled image segment from FIG. 10b, according to an embodiment.

[0035] FIG. 11 is a schematic block diagram illustrating a device for object extraction from multimedia content according to an embodiment.

[0036] FIG. 12 is a schematic block diagram further illustrating a device for object extraction from multimedia content according to an embodiment.

[0037] FIG. 13 is a schematic block diagram illustrating a computer comprising a computer program product with a computer program for object extraction from multimedia content according to an embodiment.

DETAILED DESCRIPTION

[0038] The embodiments described herein relate to object extraction in multimedia content comprising multiple images or multiple image segments. FIG. 1 illustrates functional components of a system in which the embodiments may be implemented. A user submits search queries to a search and indexing server 101. A search query 102 can contain a shape or a keyword, for example. The search and indexing server 101 may be deployed for internet search, e.g. online content 103 as depicted in FIG. 1, or for proprietary, personal, or special-purpose databases, such as personal multimedia archives, user generated content sites, proprietary data stores, workplace databases, and others (all represented by the "content database" 104 in FIG. 1).

[0039] Multimedia content typically contains several layers of information, the so-called modalities, e.g. objects, sound, spoken language, or text. Recently, objects have been recognized as an important modality in multimedia. Object extraction is performed by the search and indexing server 101 and may be initiated as soon as a new multimedia content is uploaded or when the server 101 detects consumption of a new content. The extraction of objects is usually done on the level of images (frames) or image segments of a multimedia content. The object extraction is a step performed before identification - see below. The object extraction deals with how to separate what possibly can be identified as objects from an image segment in order to allow comparison of objects.

[0040] The extracted objects can, for example, be represented by a number of frequencies, including one or more frequencies, and further indexed and stored in the index table 105. This makes the objects searchable within an image segment or a frame of a multimedia content. In addition, each extracted object may be associated with a time stamp.

[0041] FIG. 2 illustrates a flowchart representing a sequence of actions performed for object extraction, in accordance with embodiments herein. Multimedia content is usually segmented into smaller parts that can be handled separately, i.e. image segments. For example, it is common to split a video into separate frames that are further processed independently. Alternatively, a video can be segmented into groups containing a number of adjacent frames with high temporal correlation. It is also possible to perform segmentation on the image level. In this application, the term image will be used to represent both image and image segment. In this example, the object extraction is performed by the search and indexing server 101. In other examples, the method herein may be performed by other entities, such as a device, a computer etc.

[0042] Object extraction is performed on an image segment level, which will result in a list of searchable items associated with a multimedia content. For example, in the case of video, there may be a video_id, an extracted object and a time_stamp when this object is detected in a video. A time stamp may be a frame number as well.
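As a sketch of the searchable items described above, one index record per extracted object could look as follows. The field names and the example values are hypothetical; the text only names a video_id, an extracted object and a time_stamp:

```python
from dataclasses import dataclass

@dataclass
class IndexRecord:
    """One searchable item per extracted object (hypothetical layout)."""
    video_id: str
    object_signature: tuple  # e.g. the object represented by a number of frequencies
    time_stamp: float        # seconds into the video, or a frame number

record = IndexRecord("video-42", (3.1, 7.4), 12.0)
```

Such records make each extracted object searchable within a frame or image segment of the multimedia content.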

[0043] In a step S1, the search and indexing server 101 calculates a force reflecting an edge strength for at least two pixels in an image segment of the multiple image segments.

[0044] This may mean that in the first step S1 of the object extraction, the calculation of the rate of change, also referred to as force, of an image segment is performed. This may, for example, be done by gradient-based edge detection. The rate of change $F(\theta)$ and the angle $\theta$ along which the image segment has this rate of change can be calculated as follows:

$$\theta = \frac{1}{2}\arctan\!\left(\frac{2 D_{xy}}{D_{x^2} - D_{y^2}}\right)$$

$$F(\theta) = \frac{1}{2}\left\{\left(D_{x^2} + D_{y^2}\right) + \cos(2\theta)\left(D_{x^2} - D_{y^2}\right) + 2 D_{xy}\sin(2\theta)\right\}$$

where $D_{x^2}$, $D_{y^2}$ and $D_{xy}$ are the tensor functions, R, G and B are the red, green and blue image segment components respectively, x and y the image segment coordinates, $\frac{\partial R}{\partial x}, \frac{\partial G}{\partial x}, \frac{\partial B}{\partial x}, \frac{\partial R}{\partial y}, \frac{\partial G}{\partial y}, \frac{\partial B}{\partial y}$ are the gradients of the R, G and B image segment components in the horizontal (x) and vertical (y) directions, and r, g and b are the unitary vectors associated with the R, G and B components respectively:

$$u = \frac{\partial R}{\partial x}\,r + \frac{\partial G}{\partial x}\,g + \frac{\partial B}{\partial x}\,b, \qquad v = \frac{\partial R}{\partial y}\,r + \frac{\partial G}{\partial y}\,g + \frac{\partial B}{\partial y}\,b$$

$$D_{x^2} = u \cdot u = \left(\frac{\partial R}{\partial x}\right)^{2} + \left(\frac{\partial G}{\partial x}\right)^{2} + \left(\frac{\partial B}{\partial x}\right)^{2}$$

$$D_{y^2} = v \cdot v = \left(\frac{\partial R}{\partial y}\right)^{2} + \left(\frac{\partial G}{\partial y}\right)^{2} + \left(\frac{\partial B}{\partial y}\right)^{2}$$

$$D_{xy} = u \cdot v = \frac{\partial R}{\partial x}\frac{\partial R}{\partial y} + \frac{\partial G}{\partial x}\frac{\partial G}{\partial y} + \frac{\partial B}{\partial x}\frac{\partial B}{\partial y}$$

[0045] Using digital approximations of the image segment gradients $\frac{\partial R}{\partial x}, \frac{\partial G}{\partial x}, \frac{\partial B}{\partial x}, \frac{\partial R}{\partial y}, \frac{\partial G}{\partial y}, \frac{\partial B}{\partial y}$ is common in practice. For example, the approximation of $\frac{\partial R}{\partial x}$ can be obtained by convolving R with:

$$\begin{pmatrix} -3 & 0 & 3 \\ -10 & 0 & 10 \\ -3 & 0 & 3 \end{pmatrix}$$

Similarly, the approximation of $\frac{\partial R}{\partial y}$ can be obtained by convolving R with the transposed kernel:

$$\begin{pmatrix} -3 & -10 & -3 \\ 0 & 0 & 0 \\ 3 & 10 & 3 \end{pmatrix}$$

The matrices above define the so-called Scharr operator. However, this disclosure is by no means limited to the Scharr operator. Other examples of operators that could be used are: Sobel, Laplacian, Prewitt, Roberts etc.

[0046] The angle of the rate of change, $\theta$, is calculated from the equations above and is subsequently used to calculate the actual rate of change (force). A common approach is to store the force of an image segment in matrix form, referred to as a matrix of forces. Thus, each matrix element of the matrix of forces reflects the edge strength of the corresponding pixel. FIG. 3 shows an example of an image segment and a calculated rate of change for all the pixels in that image segment. The left portion of FIG. 3 shows an image segment, denoted 'image', that is a color photograph converted into black/white for proper reproducibility. The right portion of FIG. 3 shows the force calculated on the image segment to the left.
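As a minimal sketch of this step, the following computes the Scharr gradients, the tensor entries, the angle and the force for one pixel of a single-channel image. For an RGB segment the per-channel gradients would be summed into the tensor entries as in the formulas above; the function names are illustrative, and the kernels are applied as correlation, as is common in image libraries:

```python
import math

# Scharr kernels approximating d/dx and d/dy.
KX = [[-3, 0, 3], [-10, 0, 10], [-3, 0, 3]]
KY = [[-3, -10, -3], [0, 0, 0], [3, 10, 3]]

def gradients(img, y, x):
    """Approximate the horizontal and vertical gradients at (y, x)."""
    gx = gy = 0.0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            v = img[y + dy][x + dx]
            gx += KX[dy + 1][dx + 1] * v
            gy += KY[dy + 1][dx + 1] * v
    return gx, gy

def force(img, y, x):
    """Angle theta and force F(theta) for one pixel, single channel."""
    gx, gy = gradients(img, y, x)
    dx2, dy2, dxy = gx * gx, gy * gy, gx * gy   # tensor entries
    theta = 0.5 * math.atan2(2 * dxy, dx2 - dy2)
    f = 0.5 * ((dx2 + dy2) + math.cos(2 * theta) * (dx2 - dy2)
               + 2 * dxy * math.sin(2 * theta))
    return theta, f

# A vertical edge between a dark and a bright half: strong force, angle 0.
img = [[0, 0, 1, 1] for _ in range(3)]
theta, f = force(img, 1, 1)
```

Repeating this for every interior pixel fills the matrix of forces described in the next paragraph.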

[0047] In a step S2, the search and indexing server 101 identifies at least two interest points among the pixels in the image segment based on the calculated forces for the at least two pixels, wherein each interest point is associated with a directionality and exit points, wherein the directionality gives an angle indicative of a strength of the force, and wherein the exit points are used for connecting one interest point to another interest point.

[0048] In more detail, a number of interest points (or equivalently pixels) are identified from the matrix of forces described above. In order to define an interest point, the notion of a point's weakness in the directions of north, south, east and west is introduced first. A point is weaker in the direction of north if the forces of the three nearest points directly above the tested point are weaker than the forces of the three points consisting of the tested point and its two nearest neighbors to the left and right respectively. The forces are compared point-wise, that is, the force of the tested point is compared to the force of the point on top of it, etc. To test whether a point is weak in the direction of south, we compare the forces of the tested point and its left and right neighbors to the nearest pixels right below. Similarly, we define weak points for the directions of east and west. For example, the center point with a force value of 6 in FIG. 4 is weaker to the north in example (a), weaker to the west in example (c), weaker to both north and south in example (b) and weaker to both east and west in example (d).

[0049] A point that is weaker on two opposite sides is an interest point, where by opposite sides it is meant north-south, east-west, northeast-southwest, northwest- southeast or, in mathematical terms, the sides that are 180° apart.
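A sketch of the weakness test and the interest-point criterion on a matrix of forces F might look as follows. Only the north-south and east-west pairs are tested here; the diagonal pairs would be handled analogously, and the names are illustrative:

```python
def weaker(F, y, x, direction):
    """True if the point at (y, x) is weaker toward `direction`: the three
    nearest forces on that side are each below the forces of the tested
    point and its two in-line neighbours, compared point-wise."""
    if direction in ("north", "south"):
        step = -1 if direction == "north" else 1
        here = [F[y][x + d] for d in (-1, 0, 1)]
        side = [F[y + step][x + d] for d in (-1, 0, 1)]
    else:  # "east" or "west"
        step = 1 if direction == "east" else -1
        here = [F[y + d][x] for d in (-1, 0, 1)]
        side = [F[y + d][x + step] for d in (-1, 0, 1)]
    return all(s < h for s, h in zip(side, here))

def is_interest_point(F, y, x):
    """Weaker on two opposite sides (diagonal pairs omitted for brevity)."""
    return ((weaker(F, y, x, "north") and weaker(F, y, x, "south")) or
            (weaker(F, y, x, "east") and weaker(F, y, x, "west")))

# The centre point is weaker to both north and south, so it qualifies.
F = [[1, 2, 1],
     [5, 6, 5],
     [1, 2, 1]]
center_is_ip = is_interest_point(F, 1, 1)
```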

[0050] Each interest point is characterized by a directionality measured as an angle between the horizontal direction and the direction in which the interest point is weaker. For example, an interest point that is weaker in the directions of north and south has a directionality of 0°, an interest point that is weak in the direction of north-west and southeast has a directionality of 45° and an interest point that is weaker towards east and west has a directionality of 90°.

[0051] An interest point is further characterized by the so-called exit points. An interest point with directionality of 0° has the exit points to the east and west. An interest point with directionality of 45° has exit points to the north-east and south-west. Similarly, an interest point with 90° directionality has exits to the north and south. The exit points are used for connecting one interest point to another interest point. Thus, the exit points are used for connection to other interest points.
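The mapping from directionality to exit points described above can be written down directly. The 135° entry is not spelled out in the text and is filled in here by the same symmetry, so treat it as an assumption:

```python
# Exit points per directionality (degrees).
EXIT_POINTS = {
    0: ("east", "west"),
    45: ("north-east", "south-west"),
    90: ("north", "south"),
    135: ("north-west", "south-east"),  # assumed by symmetry
}
```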

[0052] Each interest point is therefore described with its coordinates, force, directionality and exit points. In a next step S3, the search and indexing server 101 creates at least one contour by connecting the at least two interest points by using at least one predefined pattern, selected from a number of predefined patterns.

[0053] In more detail, the interest points are connected in order to form a contour for an object to be extracted. The interest points can be connected by using predefined patterns. A predefined pattern is determined in advance of the execution of the method herein, i.e. not dynamically created when the method is performed. Some typical examples of patterns are depicted in FIG. 5. The patterns are marked with boxes. An alternative representation is to set the squares with boxes to a grey shade, which however is more difficult to reproduce. The top left pattern in FIG. 5 shows a straight horizontal line that consists of two interest points with a directionality of 0° and two non-interest points in between them. A semi-straight line (pattern 2), further depicted in the bottom left, consists of two interest points with directionality 0°, placed in two adjacent rows and separated by two pixels in the horizontal direction. These interest points are connected via two other points, one at the east exit of the first interest point and one at the west exit of the second interest point, to form a semi-straight line. The same figure further shows various other examples of arc-like patterns and other common patterns. However, embodiments herein are by no means limited to these patterns only.

[0054] To obtain different orientations of the patterns depicted in FIG. 5 for the whole range of 0-360°, these patterns are rotated and mirrored. For example, pattern 1 from FIG. 5 can be rotated by 90° to obtain a vertical line. Pattern 2 can be rotated and mirrored to produce a total of four semi-straight-line-like patterns. Patterns 3, 5 and 6 each produce three more patterns of the same shape in different directions by rotation by 90°, whereas patterns 4 and 7 each produce an additional seven patterns with different orientations. This gives a total of 34 predefined patterns to be tested when connecting the interest points, for this example.

[0055] The interest points are now connected by the predefined patterns. Some of these patterns may be considered more important than the others, depending on the application. For example, horizontal lines may be considered more important than the others because of their impact on the human visual system or because they are most likely to capture the true horizontal edge in an image segment. Therefore, the search for patterns can be performed for every interest point in a priority order that can be set. For example, the priority order may correspond to the number of every pattern depicted in FIG. 5, where for each pattern we test its rotated and mirrored versions as well. Thus, for each interest point it is first verified whether it can be connected to another interest point by pattern 1 (provided, of course, that the interest point has the proper directionality of 0°). The same steps are then repeated for each interest point with a vertical line, and so on.
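The enumeration of rotated and mirrored orientations can be sketched by treating a pattern as a set of occupied cells. The exact footprints below are guesses reconstructed from the descriptions of patterns 1 and 2, so take them as illustrative:

```python
def normalize(cells):
    """Shift a cell set so its bounding box starts at (0, 0)."""
    y0 = min(y for y, x in cells)
    x0 = min(x for y, x in cells)
    return frozenset((y - y0, x - x0) for y, x in cells)

def orientations(cells):
    """All distinct 90-degree rotations and mirror images of a pattern."""
    out = set()
    cur = frozenset(cells)
    for _ in range(4):
        cur = frozenset((x, -y) for y, x in cur)                # rotate 90
        out.add(normalize(cur))
        out.add(normalize(frozenset((y, -x) for y, x in cur)))  # mirror
    return out

# Pattern 1: a straight horizontal line (two interest points with two
# non-interest points in between).
line = {(0, 0), (0, 1), (0, 2), (0, 3)}
# Pattern 2: a semi-straight line across two adjacent rows.
semi = {(0, 0), (0, 1), (1, 2), (1, 3)}

n_line = len(orientations(line))   # horizontal + vertical
n_semi = len(orientations(semi))   # rotated and mirrored variants
```

Summing such counts over the seven patterns of FIG. 5 (2 + 4 + 4 + 8 + 4 + 4 + 8) reproduces the 34 predefined patterns mentioned above.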

[0056] Some additional rules may be imposed for connecting the interest points. For example, the connections may only start at an exit of an interest point and end in the exit of another interest point. Or, an interest point may not be connected to another interest point if the difference of their forces is above a certain threshold. This is to prevent connecting interest points that, for example, belong to different objects. FIG. 6 illustrates this scenario. Points A and B, having forces of 50 each with forces of 40 for the in-between pixels, are connected by a straight line (pattern 1). However, the difference in forces between points C and D, as well as between point C and the in-between points, is considered too large for the points to be connected, despite the fact that these points could be connected by a straight line (pattern 1).
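The force-difference rule can be sketched as a simple predicate. The threshold is application-specific and the numeric values for the rejected pair below are assumed, since FIG. 6 only gives forces for points A and B:

```python
def may_connect(force_a, force_b, threshold=20):
    """Rule: interest points whose forces differ by more than the
    threshold are not connected, even if a pattern would fit."""
    return abs(force_a - force_b) <= threshold

# Points A and B in FIG. 6 both have force 50, so they may connect;
# a pair with a large force gap (values assumed) may not.
a_b = may_connect(50, 50)
c_d = may_connect(50, 90)
```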

[0057] When it is found that interest points can be connected by one of the predefined patterns, the information for every connected interest point is updated with the formed connections. For example, the updated information can be that the interest point with coordinates (i,j) is connected to the interest point with coordinates (m,n) by a semi-straight line (pattern 2).

[0058] If some of the interest points remain unconnected upon testing for all the predefined patterns, one can follow the strongest force direction in order to connect them. An example illustrating this step is depicted in FIG. 7. Points A and B are initially unconnected. Point A has exits to the east and west, whereas point B has exits to the north-east and south-west. By following the directions of the strongest force for these two points one can find the path marked with framing as shown in FIG. 7.
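One way to sketch the "follow the strongest force" fallback is a greedy walk over the matrix of forces. The names and the traversal details are illustrative, since the text does not pin down the exact procedure:

```python
def follow_strongest(F, start, goal, max_steps=50):
    """Greedy walk from `start`: repeatedly step to the unvisited
    8-neighbour with the strongest force, stopping at `goal`."""
    h, w = len(F), len(F[0])
    path = [start]
    y, x = start
    for _ in range(max_steps):
        if (y, x) == goal:
            return path
        nbrs = [(F[ny][nx], (ny, nx))
                for ny in (y - 1, y, y + 1)
                for nx in (x - 1, x, x + 1)
                if 0 <= ny < h and 0 <= nx < w
                and (ny, nx) != (y, x) and (ny, nx) not in path]
        if not nbrs:
            break
        _, (y, x) = max(nbrs)          # strongest unvisited neighbour
        path.append((y, x))
    return None  # gave up: the points stay unconnected

# A diagonal ridge of strong forces joins the two end points, as in FIG. 7.
F = [[9, 1, 1],
     [1, 8, 1],
     [1, 1, 9]]
path = follow_strongest(F, (0, 0), (2, 2))
```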

[0059] Some of the interest points may still remain unconnected even after the attempt to follow the strongest force direction. These interest points may simply be disregarded in the following steps.

[0060] For each non-interest point we may set a counter to check how many times it is used in creating connections. If it is used more than once, we can set it as an interest point.
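The counter of paragraph [0060] can be sketched as follows (illustrative Python; the connection paths and coordinates are hypothetical):

```python
from collections import Counter

# Count how often each non-interest point is used when forming
# connections; points used more than once are promoted to interest points.
connections = [[(0, 0), (0, 1), (0, 2)],   # hypothetical connection paths;
               [(1, 0), (0, 1), (0, 2)]]   # end points are interest points
interest_points = {(0, 0), (0, 2), (1, 0)}

usage = Counter()
for path in connections:
    for p in path:
        if p not in interest_points:
            usage[p] += 1

promoted = {p for p, n in usage.items() if n > 1}
print(promoted)  # {(0, 1)}: used in both connections, so promoted
```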

[0061] FIG. 8 summarizes the step S3 described above. FIG. 8(a) is the original image segment for which the contours are to be found. In this Figure, the original color photograph has been converted to a black/white image segment for proper reproducibility. FIG. 8(b) shows the straight lines (pattern 1) that are extracted first, whereas FIG. 8(c) shows the extracted patterns after a search for semi-straight lines is performed. In FIG. 8(d) the extracted arc-like structures are included as well. Finally, FIG. 8(e) shows the extracted contours after the attempt to connect the remaining unconnected interest points by following the direction of the strongest force.

[0062] The contour(s) found following the steps described above may be filtered to remove noise. This can be done, for example but not limited to, by morphological filtering. This step can be performed even prior to connecting the remaining unconnected points by following the strongest force.

[0063] In a step S4, the search and indexing server 101 extracts the at least one created contour.

[0064] In more detail, the found contour(s) (or, equivalently, object(s)) are finally extracted from an image segment. This is initiated by a search for the connected interest points according to some scanning order (for example a line-by-line search starting from the top left corner of an image segment). The first encountered connected interest point is assigned to belong to the first contour to be extracted. Further on, all the points having a connection to this interest point will be added to this contour, and all of their connected points will be added, and so on. The search is continued to find the next non-extracted interest point until all of the interest points are exhausted.
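The extraction of step S4 amounts to grouping transitively connected interest points, which can be sketched as a scan-seeded breadth-first traversal (illustrative Python; the representation of the stored connections as an adjacency mapping is an assumption):

```python
from collections import deque

def extract_contours(connected, height, width):
    """Sketch of step S4: scan the segment line by line from the top-left
    corner; the first connected interest point found seeds a contour, and
    every point reachable through the stored connections is added to it.
    `connected` maps each interest point to the points it is connected to."""
    seen = set()
    contours = []
    for i in range(height):          # line-by-line scanning order
        for j in range(width):
            p = (i, j)
            if p in connected and p not in seen:
                contour = []
                seen.add(p)
                queue = deque([p])
                while queue:         # gather all transitively connected points
                    q = queue.popleft()
                    contour.append(q)
                    for r in connected[q]:
                        if r not in seen:
                            seen.add(r)
                            queue.append(r)
                contours.append(contour)
    return contours

connected = {(0, 0): [(0, 1)], (0, 1): [(0, 0)], (2, 2): []}
print(extract_contours(connected, 4, 4))  # [[(0, 0), (0, 1)], [(2, 2)]]
```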

[0065] In a next step S5, the search and indexing server 101 extracts at least one object from the at least one extracted contour.

[0066] In one example, an object is extracted from the extracted contours. Each contour can be considered as a black and white image segment where the white pixels are the contour pixels and the black pixels are the non-contour pixels.

[0067] Each extracted object should have a shape defined by the outer contour of the object. This shape is obtained with, for example, morphological filtering, which is usually performed on binary image segments, i.e. image segments containing only two colors, black and white. However, extensions to gray-scale image segments are available as well. The two most basic methods in morphological filtering are dilation and erosion. Both methods use a structuring object to define their operation, and the structuring object can have different effects depending on its shape. One such object, a rectangle with a size of 2x1 pixels, is depicted in FIG. 9. In dilation, this object will offset each white pixel outwards (widening), according to the shape of the structuring object. In erosion, the same operation is performed inwards (thinning) instead, see FIG. 9.
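Dilation and erosion with the 2x1 structuring object can be sketched over a set of white-pixel coordinates (illustrative Python, a minimal sketch rather than the patented implementation; the offset-set representation of the structuring object is an assumption):

```python
def dilate(white, se_offsets):
    """Dilation sketch: offset each white pixel by every offset of the
    structuring object, widening the white region."""
    return {(i + di, j + dj) for (i, j) in white for (di, dj) in se_offsets}

def erode(white, se_offsets):
    """Erosion sketch: keep only pixels where the whole structuring
    object fits inside the white region, thinning it."""
    return {(i, j) for (i, j) in white
            if all((i + di, j + dj) in white for (di, dj) in se_offsets)}

SE_2x1 = [(0, 0), (0, 1)]  # 2x1-pixel rectangle, as in FIG. 9

white = {(0, 0), (0, 1), (0, 2)}       # a short horizontal white line
print(dilate(white, SE_2x1))           # widened by one pixel to the right
print(erode(white, SE_2x1))            # thinned by one pixel on the right
```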

[0068] A useful technique for morphological operations is the so-called flood-fill. If the contour (white pixels) is closed, then, from a given start point, the technique fills all the connected black pixels into a third set (for example a gray scale). At this point, all the black pixels remaining in the image segment are the inner parts of the closed contour. By converting the black pixels to white pixels and then the gray pixels back to black, the contour is filled without any holes. An example of this operation can be seen in FIG. 10, where (a) shows the original shape, (b) the flood-filled shape of (a), and finally (c) the difference between dilation and erosion of the flood-filled image (b).
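The flood-fill step can be sketched as follows (illustrative Python; starting the fill from the border pixels of the segment is an assumption about where the "given start point" lies):

```python
from collections import deque

def flood_fill_contour(white, height, width):
    """Sketch of the flood-fill in paragraph [0068]: mark all black
    (background) pixels reachable from the segment border as a third set
    ('gray'); the remaining black pixels are the inner parts of the
    closed contour. Returning white plus those inner pixels yields the
    contour filled without any holes."""
    gray = set()
    queue = deque((i, j) for i in range(height) for j in range(width)
                  if (i in (0, height - 1) or j in (0, width - 1))
                  and (i, j) not in white)
    gray.update(queue)
    while queue:
        i, j = queue.popleft()
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            n = (i + di, j + dj)
            if (0 <= n[0] < height and 0 <= n[1] < width
                    and n not in white and n not in gray):
                gray.add(n)
                queue.append(n)
    inside = {(i, j) for i in range(height) for j in range(width)
              if (i, j) not in white and (i, j) not in gray}
    return white | inside

# A closed 3x3 ring of white pixels inside a 5x5 segment
ring = {(1, 1), (1, 2), (1, 3), (2, 1), (2, 3), (3, 1), (3, 2), (3, 3)}
filled = flood_fill_contour(ring, 5, 5)
print((2, 2) in filled)  # True: the hole inside the ring is now filled
```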

[0069] The extracted objects can be stored as image segments. The image segment size of each object is object_height x object_width. Each object may be stored and associated with the video_id and a time_stamp/frame. This may be realized as a hyperlink to a certain position in the video where the extracted object occurs.

[0070] FIG. 11 is a schematic block diagram of a device 100 for object extraction according to an embodiment. The device 100 is configured to calculate a force reflecting an edge strength for at least two pixels in an image segment of the multiple image segments, identify at least two interest points among the pixels in the image segment based on the calculated forces for the at least two pixels, wherein each interest point is associated with a directionality and exit points, wherein the directionality gives an angle indicative of a strength of the force, and wherein the exit points are used for connecting one interest point to another interest point. Moreover, the device 100 is configured to create at least one contour by connecting the at least two interest points by using at least one predefined pattern, selected from a number of predefined patterns, to extract the at least one created contour, and to extract at least one object from the at least one extracted contour.

[0071] In further embodiments, the device may comprise one or more of the following modules, such as hardware or software modules.

[0072] The device 100 comprises a force calculator 110, e.g. a force calculating module, configured to calculate a force reflecting an edge strength for the at least two pixels in an image segment of the multiple image segments. This means that the force calculator 110 calculates forces for pixels in an image or image segment.

[0073] An identifier 120 of interest points, e.g. an identifying module, is configured to identify at least two interest points among the pixels in the image segment based on the calculated forces for the at least two pixels. This means that the identifier 120 is configured to select, or identify, the interest points based on the calculated forces.

[0074] A contour creator 130, e.g. a contour creating module, is configured to create at least one contour by connecting the at least two interest points by using at least one predefined pattern. This means that the contour creator 130 performs forming, or creation, of contours by connecting the identified interest points by some predefined patterns.

[0075] A contour extractor 140, e.g. a contour extracting module, of the device 100 is configured to extract the at least one created contour, e.g. obtained by connecting the interest points.

[0076] An object extractor 150, e.g. an object extracting module, is configured to extract the at least one object from the at least one extracted contour. Expressed somewhat differently, the object extractor 150 is configured to extract the objects from the extracted contours.

[0077] FIG. 12 is a schematic block diagram of a device 100 for object extraction according to another embodiment of the device 100. The device 100 comprises a processor 160 and a memory 170, said memory containing instructions 180 executable by said processor whereby said device is operative to: calculate a force for at least two pixels in the image segment, identify at least two interest points among the pixels in the image segment based on the calculated forces of pixels, create at least one contour from the identified interest points, extract at least one created contour obtained by connecting the interest points and extract at least one object from the at least one extracted contour.

[0078] The device 100 may be implemented in hardware, in software or a combination of hardware and software. The device may be implemented in, e.g. comprised in, user equipment, such as a mobile telephone, tablet, desktop, netbook, multimedia player, video streaming server, set-top box or computer.

[0079] Alternative embodiments of the device 100 are possible where some or all of its units are implemented as computer program modules running on a general purpose processor.

[0080] FIG. 13 schematically illustrates an embodiment of a computer 200 having a processing unit 201, such as a DSP (Digital Signal Processor) or CPU (Central Processing Unit). The processing unit 201 may be a single unit or a plurality of units for performing different steps of the method described herein. The computer also comprises an input/output (I/O) unit 202 for receiving the image segments and for outputting the extracted objects from the input image segments. The I/O unit 202 has been illustrated as a single unit in FIG. 13 but can likewise be in the form of a separate input unit and a separate output unit.

[0081] Furthermore, the computer comprises at least one computer program product 203 in the form of a non-volatile memory, for instance an EEPROM (Electrically Erasable Programmable Read-Only Memory), a flash memory or a disk drive. The computer program product 203 comprises a computer program 204, which comprises code means which, when run on the computer, such as by the processing unit 201, causes the computer 200 to perform the steps of the method described in the foregoing. Hence, in an embodiment the code means in the computer program 204 comprises a force calculator module 210 for calculating forces in an image segment, an identifier of interest points module 220 for identifying the interest points based on the calculated forces, a contour creator module 230 for creating contours by connecting the identified interest points by some predefined patterns, a contour extractor module 240 for extracting the contours obtained by connecting the interest points, and an object extractor module 250 for extracting the objects from the extracted contours. These modules essentially perform the steps of the flow diagram in FIG. 2 when run on, or executed by, the processing unit.

[0082] The embodiments described above are to be understood as a few illustrative examples among further possible working examples. It will be understood by those skilled in the art that various modifications, combinations and changes may be made to the embodiments without departing from the scope as defined by the appended independent claims. In particular, different part solutions in the different embodiments can be combined in other configurations, where technically possible.