


Title:
MULTIPLE REGION VIDEO CONFERENCE ENCODING
Document Type and Number:
WIPO Patent Application WO/2014/094216
Kind Code:
A1
Abstract:
Systems and methods may provide for a computing device that encodes multiple regions of a video frame at different quality levels. In particular, a first region of one or more frames containing a speaker's face may be located and encoded at a first quality level. A second region containing a background, on the other hand, may be located and encoded at a second quality level. Optionally, a third region containing additional faces may be located and encoded at a third quality level, and a fourth region may be located and encoded at a fourth quality level.

Inventors:
YAN LIU (CN)
WANG BIN (CN)
Application Number:
PCT/CN2012/086805
Publication Date:
June 26, 2014
Filing Date:
December 18, 2012
Export Citation:
Assignee:
INTEL CORP (US)
YAN LIU (CN)
WANG BIN (CN)
International Classes:
H04N7/15; H04N5/14
Domestic Patent References:
WO2011126500A1 (2011-10-13)
Foreign References:
US20040158719A1 (2004-08-12)
CN101141608A (2008-03-12)
Other References:
See also references of EP 2936802A4
Attorney, Agent or Firm:
CHINA PATENT AGENT (HK) LTD. (Great Eagle Center, 23 Harbour Road, Wanchai, Hong Kong, CN)
Claims:
CLAIMS

We claim:

1. A system to encode a video conference, comprising:

a camera to capture one or more frames associated with the video conference; and a teleconferencing device including:

one or more region determination modules to determine in the one or more frames:

a first region to include a speaker's face; and

a second region to include a background, and

one or more encoders to encode:

the first region at a first quality; and

the second region at a second quality, the second quality being less than the first quality.

2. The system according to claim 1, further including a face recognition module to locate the speaker's face.

3. The system according to claim 1, further including a face tracking module to track the location of the speaker's face.

4. The system according to any one of claims 1-3, wherein:

the one or more region determination modules further define a third region including additional faces; and

the one or more encoders encode the third region at a third quality less than the first quality.

5. The system according to any one of claims 1-3, wherein:

the one or more region determination modules further define a fourth region specified by a user; and

the one or more encoders encode the fourth region at a fourth quality less than the first quality.

6. An apparatus for encoding video, comprising:

one or more region determination modules to determine in one or more frames:

a first region to include a speaker's face; and

a second region to include a background, and

one or more encoders to encode:

the first region at a first quality, and

the second region at a second quality less than the first quality.

7. The apparatus according to claim 6, further including a face recognition module to locate the speaker's face.

8. The apparatus according to claim 6, further including a face tracking module to track the location of the speaker's face.

9. The apparatus according to any one of claims 6-8, wherein:

the one or more region determination modules further define a third region including additional faces; and

the one or more encoders encode the third region at a third quality less than the first quality.

10. The apparatus according to any one of claims 6-8, wherein:

the one or more region determination modules further define a fourth region specified by a user; and

the one or more encoders encode the fourth region at a fourth quality less than the first quality.

11. The apparatus according to any one of claims 6-8, wherein the one or more region determination modules reassign the first region to include a new speaker's face.

12. A method of encoding video, comprising:

locating a first region of one or more frames containing a speaker's face;

locating a second region of the one or more frames containing a background;

encoding the first region at a first quality, and

encoding the second region at a second quality.

13. The method according to claim 12, further including:

locating a third region of the one or more frames containing additional faces; and encoding the third region at a third quality.

14. The method according to claim 12, further including:

locating a fourth region of the one or more frames defined by a user; and

encoding the fourth region at a fourth quality.

15. The method according to claim 14, wherein the fourth quality is lower than the first quality.

16. The method according to claim 13, wherein the third quality is lower than the second quality.

17. The method according to claim 12, further including defining the first region using face recognition.

18. The method according to claim 12, further including adjusting the first region to track the speaker's face.

19. The method according to claim 12, further including reassigning the first region to a new speaker's face.

20. The method according to claim 12, wherein encoding employs MPEG compression.

21. The method according to any one of claims 12-20, wherein the second quality is lower than the first quality.

22. At least one non-transitory machine-readable medium comprising one or more instructions for encoding video which, if executed by a processor, cause a computer to:

locate a first region of one or more frames containing a speaker's face;

locate a second region of the one or more frames containing a background;

encode the first region at a first quality; and

encode the second region at a second quality.

23. The medium according to claim 22, wherein the instructions, if executed, further cause the computer to:

locate a third region of the one or more frames containing additional faces; and encode the third region at a third quality.

24. The medium according to claim 22, wherein the instructions, if executed, further cause the computer to:

locate a fourth region of the one or more frames defined by a user; and

encode the fourth region at a fourth quality.

25. The medium according to claim 24, wherein the fourth quality is lower than the first quality.

26. The medium according to claim 23, wherein the third quality is lower than the second quality.

27. The medium according to claim 22, wherein the instructions, if executed, further cause the computer to define the first region using face recognition.

28. The medium according to claim 22, wherein the instructions, if executed, further cause the computer to adjust the first region to track the speaker's face.

29. The medium according to claim 22, wherein the instructions, if executed, further cause the computer to reassign the first region to a new speaker's face.

30. The medium according to any one of claims 22-29, wherein the second quality is lower than the first quality.

Description:
Multiple Region Video Conference Encoding

BACKGROUND

[0001] The communication quality of video conference applications may rely heavily on the real-time status of a network. Many current video conference systems introduce complicated algorithms to smooth network disturbance(s) caused by, among other things, the unmatched bit-rate between what the video conference application generates and a network's ability to process streamed data. However, these algorithms may bring extra complexity to conferencing systems and still fail to perform well under environments where the communication quality may be significantly restricted by limited available bandwidth. Examples of such environments include: mobile communications networks, rural communications networks, combinations thereof, and/or the like. What is needed is a way to decrease the bit-rate of a video conference without sacrificing the quality of important information in a video frame.

BRIEF DESCRIPTION OF THE DRAWINGS

[0002] The various advantages of the embodiments of the present invention will become apparent to one skilled in the art by reading the following specification and appended claims, and by referencing the following drawings, in which:

[0003 ] FIG. 1 illustrates an example video conferencing scheme as per an aspect of an embodiment of the present invention;

[0004] FIG. 2A illustrates an example video frame with various identified entities and objects as per an aspect of an embodiment of the present invention;

[0005] FIG. 2B illustrates the example video frame with various identified regions as per an aspect of an embodiment of the present invention;

[0006] FIGs. 3A and 3B illustrate the example video frame with various identified regions as per an aspect of an embodiment of the present invention;

[0007] FIGs. 4 and 5 are block diagrams of example multiple region video conference encoders as per an aspect of an embodiment of the present invention;

[0008] FIG. 6 is a flow diagram of an example multiple region video conference encoding mechanism as per an aspect of an embodiment of the present invention;

[0009] FIGs. 7-9 are example flow diagrams of a video conference encoding mechanism as per an aspect of an embodiment of the present invention; and

[0010] FIGs. 10 and 11 are illustrations of an embodiment of the present invention.

DETAILED DESCRIPTION

[0011] Embodiments of the present invention may decrease the bit-rate of a video conference without sacrificing the quality of important information in a video frame by encoding different regions of the video frame at different quality levels. For example, it may be determined that the most important part of a frame is a speaker's face. In such a case, embodiments may encode a region of the frame that includes the speaker's face at a higher quality than the rest of the video frame. This selective encoding may result in a smaller frame size that may safely decrease the bit-rate of the video conference stream.
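As a rough, non-normative illustration of this trade-off (the patent does not specify a codec), the following Python sketch encodes an assumed face rectangle at high JPEG quality and the rest of a synthetic frame at low quality, then compares the combined size against encoding the entire frame at high quality. The coordinates, quality values, and the use of JPEG as a stand-in are all assumptions for demonstration only.

```python
# Illustrative sketch only; JPEG stands in for the video codec, and the
# face rectangle and quality values are assumed for demonstration.
import cv2
import numpy as np

# Synthetic stand-in for a captured frame (a smooth gradient).
frame = np.tile(np.linspace(0, 255, 640, dtype=np.uint8), (480, 1))
frame = cv2.cvtColor(frame, cv2.COLOR_GRAY2BGR)

x, y, w, h = 200, 100, 200, 200            # hypothetical speaker-face region
face = frame[y:y + h, x:x + w].copy()
background = frame.copy()
background[y:y + h, x:x + w] = 0           # blank the face before the low-quality pass

def jpeg_bytes(img, quality):
    ok, buf = cv2.imencode(".jpg", img, [int(cv2.IMWRITE_JPEG_QUALITY), quality])
    return len(buf)

uniform = jpeg_bytes(frame, 90)                            # whole frame, high quality
split = jpeg_bytes(face, 90) + jpeg_bytes(background, 40)  # face high, rest low
print(f"uniform: {uniform} bytes, split: {split} bytes")
```

On most content, the split encoding comes out noticeably smaller than the uniform high-quality encoding, which is the effect the selective scheme relies on.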

[0012] An example video conference is illustrated in FIG. 1. In this example video conference, a camera 120 may capture a video 130 of a group of presenters 110. The video 130 may then be input and processed by a teleconferencing device 140. The teleconferencing device 140 may be, for example: a computer system with an attached and/or integrated camera; a discrete teleconferencing device; a combination thereof; and/or the like. In some embodiments, the camera 120 may be integrated with the teleconferencing device 140, forming a teleconferencing system 100.

[0013] The teleconferencing device 140 may generate an encoded video signal 150 from video 130 using a codec, wherein a codec can be a device or a computer program running on a computing device that is capable of encoding a video for storage, transmission, encryption, decoding for playback or editing, a combination thereof, and/or the like. Codecs, as per certain embodiments, may be designed and/or configured to emphasise certain regions of the video over other regions of the video. Examples of available codecs include, but are not limited to: Dirac available from the British Broadcast System; Blackbird available from Forbidden Technologies PLC; DivX available from DivX, Inc.; Neo Digital available from Nero AG; ProRes available from Apple Inc.; and VP8 available from On2 Technologies. Many of the codecs use compression algorithms such as MPEG-1, MPEG-2, MPEG-4 ASP, H.261, H.263, VC-3, WMV7, WMV8, MJPEG, MPEG-4v3, and DV.

[0014] Video codec rate control strategies may use variable bit rate (VBR) and constant bit rate (CBR) rate control strategies. Variable bit rate (VBR) is a strategy to maximize the visual video quality and minimize the bit rate. For example, on fast motion scenes, a variable bit rate may use more bits than it does on slow motion scenes of similar duration yet achieve a consistent visual quality. For real-time and non-buffered video streaming, when the available bandwidth may be fixed (e.g. in videoconferencing delivered on channels of fixed bandwidth), a constant bit rate (CBR) may be used. CBR may be used for applications such as videoconferencing, satellite and cable broadcasting, combinations thereof, and/or the like.

[0015] The quality that the codec may achieve may be affected by the compression format the codec uses. Multiple codecs may implement the same compression specification. For example, MPEG-1 codecs typically do not achieve a quality/size ratio comparable to codecs that implement the more modern H.264 specification. However, the quality/size ratio of output produced by different implementations of the same specification may also vary.

[0016] Encoded video 150 may be transported through a network to a second teleconferencing device. The network may be a local network (e.g. an intranet), a basic communications network (e.g. a POTS (plain old telephone system)), an advanced telecommunications system (e.g. a satellite relayed system), a hybrid mixed network, the Internet, and/or the like. The teleconferencing device 170 may be similar to the teleconferencing device 140. However, in this example, teleconferencing device 170 may need to have a decoder compatible with the codec. A decoder may be a device or software operating in combination with computing hardware which does the reverse operation of an encoder, undoing the encoding so that the original information can be retrieved. In this case, the decoder may need to retrieve the information encoded by teleconferencing device 140.

[0017] The encoder and decoder in teleconferencing devices 140 and 170 may be endecs. An endec may be a device that acts as both an encoder and a decoder on a signal or data stream, either with the same or separate circuitry or algorithm. In some literature, the term codec is used equivalently to the term endec. A device or program running in combination with hardware which uses a compression algorithm to create MPEG audio and/or video is often called an encoder, and one which plays back such files is often called a decoder. However, this may also often be called a codec.

[0018] The decoded video 180 may be communicated from teleconferencing device 170 to a display device 190 to present the decoded video 195. The display device may be a computer, a TV, a projector, a combination thereof, and/or the like.

[0019] FIG. 2A illustrates an example video frame 200 with various identified entities (210, 232, 234, 236, and 238) and objects (240) as per an aspect of an embodiment of the present invention. As shown in this illustration, figure 210 in the foreground is a primary speaker. Entities 232, 234, 236, and 238 are additional participants. Object(s) 240 are additional item(s) that may be important for demonstrative purposes during a teleconference.

[0020] FIG. 2B illustrates the video frame with various regions as per an aspect of an embodiment of the present invention. In this illustration, an area covering a speaker may be identified as a first region 212 and the remainder of the frame (the background) may be identified as a second region 222.

[0021] FIG. 3A and FIG. 3B illustrate the video frame with various alternative regions identified. In FIG. 3A, an area covering a speaker may be identified as a first region 212, an area covering additional entities/participants 232, 234, 236, and 238 (FIG. 2A) may be identified as a third region 330, an area covering object(s) 240 may be identified as a fourth region 342, and the remainder of the frame (the background) may be identified as a second region 222. The regions may vary in size. For example, in FIG. 3A, the first region 212 includes the speaker and a portion of the speaker's exposed body. However, in FIG. 3B, the first region 212 includes only the speaker's head. Similarly, in FIG. 3A, the third region 330 includes the additional participants and a portion of the additional participants' exposed bodies. In FIG. 3B, however, the third region 330 includes only the additional participants' heads.

[0022] According to some of the various embodiments, region discrimination may be performed by teleconferencing device 140. FIG. 4 is a block diagram of a multiple region video conference encoder as per an aspect of an embodiment of the present invention. The teleconferencing device 140 may include one or more region determination modules 420 to determine one or more regions in one or more frames 415. Region determination modules 420 may include a multitude of region determination modules such as region determination module 1 (421), region determination module 2 (422), and so forth up to region determination module n (429). Each of the region determination modules may be configured to identify different regions (e.g. regions 212, 330, 342, and 222; FIGs. 3A and 3B) in frame(s) 200 (FIGs. 2A and 2B). Each region determination module (421, 422, ..., and 429) may generate from video 415 region data (431, 432, ..., and 439 respectively), wherein the region data (431, 432, ..., and 439) may be encoded by encoder modules 440 at different qualities. For example, region 1 data 431 may be encoded by region 1 encoder module 441 at a first quality, region 2 data 432 may be encoded by region 2 encoder module 442 at a second quality, and so on, up to region n data 439, which may be encoded by region n encoder module 449 at yet a different quality. In some embodiments, it is possible that some region determination modules may process more than one region. It is also possible that more than one region data (431, 432, ..., and/or 439) may be encoded at a same or similar quality by different or the same encoder module (441, 442, ..., and/or 449). The output of the encoder modules 440 may be encoded video 490 that has encoded different regions at different qualities to improve the overall bit rate of the encoded video without decreasing the quality of important elements of the frame, such as a speaker's face.
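The module arrangement of FIG. 4 can be summarized structurally as below. This is a hypothetical sketch: RegionModule and its fields stand in for the region determination modules (421-429) and encoder modules (441-449); it is not an implementation prescribed by the patent.

```python
# Structural sketch of the FIG. 4 pipeline; names and callables are assumed.
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

@dataclass
class RegionModule:
    name: str
    locate: Callable[[Any], Any]         # frame -> region data (mask, box, pixels)
    encode: Callable[[Any, int], bytes]  # region data + quality -> encoded bytes
    quality: int                         # quality level for this region

def encode_frame(frame: Any, modules: List[RegionModule]) -> Dict[str, bytes]:
    encoded = {}
    for m in modules:
        region_data = m.locate(frame)                       # region n determination module
        encoded[m.name] = m.encode(region_data, m.quality)  # region n encoder module
    return encoded                                          # together: encoded video 490
```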

[0023] With continuing reference to FIGs. 2A, 2B and 4, a first region 212 may include a speaker's face. This region 212 may be determined using a region 1 determination module 421. The region 1 determination module 421 may include a face recognition module to locate the speaker's face 210 in a video frame 200. The face recognition module may employ a computer application in combination with computing hardware or other hardware solutions to identify the location of person(s) from a video frame 200. Additionally, the face recognition module may identify the identity of the person(s). One methodology to locate a head in a frame is to detect facial features such as the shape of a head and the locations of features such as eyes, mouths, and noses. Example face recognition systems include: Betaface available at betaface.com, and Semantic Vision Technologies available from the Warsaw University of Technology in Warsaw, Poland.
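As one concrete, purely illustrative way such a face-locating step might work, the sketch below uses OpenCV's bundled Haar cascade detector; the patent does not mandate this or any other specific detector.

```python
# One possible face-location step, using OpenCV's bundled Haar cascade.
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def locate_faces(frame_bgr):
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    # Returns (x, y, w, h) rectangles; the largest might seed the first region 212.
    return cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
```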

[0024] The region 1 determination module 421 may include a face tracking module to track the location of a speaker's face. Using this face tracking module, region 1 may be adjusted to track the speaker's face as the speaker moves around in the frame. Face tracking may use features on a face such as nostrils, the corners of the lips and eyes, and wrinkles to track the movement of the face. This technology may use active appearance models, principal component analysis, Eigen tracking, deformable surface models, other techniques to track the desired facial features from frame to frame, combinations thereof, and/or the like. Example face tracking technologies that may be applied sequentially to frames of video, resulting in face tracking, include the Neven Vision system (formerly Eyematics, now acquired by Google, Inc.), which allows real-time 2D face tracking with no person-specific training.

[0025] According to some of the various embodiments, region determination module(s) 420 may reassign the first region to include a new speaker's face. This may be accomplished, for example, using extensions to face recognition techniques already discussed. When a face recognition mechanism is employed to locate a head in a frame by detecting facial features such as the shape of a head and the locations of features such as eyes, mouths, and noses, the features may be compared to a database of known entities to identify specific users. Region determination module(s) 420 may reassign the first region to include a new speaker's face, belonging to another identified user, when instructed that the other user is speaking. The instruction that another user is speaking may come from a user of the system and/or automatically from the region determination module(s) 420 themselves. For example, some vision-based approaches to face recognition may also have the ability to detect and analyze lip and/or tongue movement. By tracking the lip and tongue movement, the system may also be able to identify which speaker is talking at any one time and cause an adjustment in region 1 to include and/or move to this potentially new speaker.
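A minimal sketch of the tracking and reassignment behavior follows, assuming detections are (x, y, w, h) boxes: each frame, the detection nearest the previous face location keeps region 1 anchored on the moving speaker, and reassignment to a new speaker simply replaces the tracked box. A real face tracking module would use the feature-based techniques named above.

```python
# Minimal tracking sketch: re-detect each frame, keep the nearest box.
def track_face(prev_box, detections):
    if len(detections) == 0:
        return prev_box                          # hold the last known region
    px = prev_box[0] + prev_box[2] / 2.0         # previous box center
    py = prev_box[1] + prev_box[3] / 2.0
    def center_dist_sq(box):
        cx, cy = box[0] + box[2] / 2.0, box[1] + box[3] / 2.0
        return (cx - px) ** 2 + (cy - py) ** 2
    return min(detections, key=center_dist_sq)   # box closest to the old face
```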

[0026] According to some of the various embodiments of the present invention, additional region determination modules may be employed. For example, a third region determination module may identify an area covering additional entities 232, 234, 236, and 238 as a third region 330. This region may be identified using an additional region determination module 422. This module may use similar technologies as the region 1 determination module 421 to identify where the additional participants 232, 234, 236, and 238 reside in the frame(s). Additionally, a fourth region determination module may identify an area covering additional objects 240 and/or the like as a fourth region 342. This region may be identified using an automated system configured to identify such objects, and/or the region may be identified by a user. For example, a user may draw a line around a region of the frame to indicate that this area is fourth region 342 (FIGs. 3A and 3B). Alternatively, the presentation may include an object such as a white board which could be identified as a region such as the fourth region 342.

[0027] As described earlier, the remainder of the frame (the background) may be identified as a second region 222. To accomplish this, the other regions (e.g. 212, 330, and 342) may be subtracted from the area encompassing the complete frame 200. However, in some embodiments, the background may be determined in other ways. For example, the background may be determined employing a technique such as chroma (or color) keying, employing a predetermined nesting shape, and/or the like. Chroma keying is a technique for compositing (layering) two images or video streams together based on color hues (chroma range). The technique and/or aspects of the technique, however, may be employed to identify a background from subject(s) of a video. In other words, a color range may be identified and used to create an image mask. In some of the various embodiments, the mask may be used to define a region such as the second (e.g., background) region 222. Variations of the chroma keying technique are commonly referred to as green screen and blue screen. Chroma keying may be performed with backgrounds of any color that are uniform and distinct, but green and blue backgrounds are more commonly used because they differ most distinctly in hue from most human skin colors. Commercially available computer software, such as Pinnacle Studio and Adobe Premiere, uses "chromakey" functionality with greenscreen and/or bluescreen kits.
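The subtraction approach of this paragraph can be sketched as a boolean mask, assuming the other regions are axis-aligned rectangles (a simplification; masks produced by chroma keying or a detector would plug in the same way):

```python
# Sketch: derive the second (background) region 222 by subtracting the
# other region rectangles from the full frame, per paragraph [0027].
import numpy as np

def background_mask(frame_height, frame_width, region_boxes):
    mask = np.ones((frame_height, frame_width), dtype=bool)  # whole frame 200
    for (x, y, w, h) in region_boxes:     # e.g. regions 212, 330, and 342
        mask[y:y + h, x:x + w] = False    # carve each foreground region out
    return mask                           # True wherever background remains
```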

[0028] FIG. 5 is a block diagram of another multiple region video conference encoder as per an aspect of an embodiment of the present invention. Specifically, this block diagram illustrates an example teleconferencing device 140 embodiment configured to process a video 515 with up to four regions (212, 222, 330 and 342; FIGs. 3A and 3B). Regional determination modules 520 may process video 515 with four regional determination modules (521, 522, 523 and 524), each configured to identify and process a different region before being encoded by encoder module(s) 540.

[0029] Region 1 may be an area 212 covering a primary participant such as an active speaker 210 (FIG. 2A). The region 1 determination module 521 may be configured to identify region 1 areas 212 in video frames 515 and generate region 1 data 531 for that identified region. The region 1 data 531 may be encoded by region 1 encoder module 541 at a first quality level.

[0030] Region 2 may be an area 222 covering a background (FIG. 2B). The region 2 determination module 522 may be configured to identify region 2 areas in video frames 515 and generate region 2 data 532 for that identified region. The region 2 data 532 may be encoded by region 2 encoder module 542 at a second quality level.

[0031] Region 3 may be an area 330 covering additional entities/participants in a teleconference. The region 3 determination module 523 may be configured to identify region 3 areas in video frames 515 and generate region 3 data 533 for that identified region. The region 3 data 533 may be encoded by region 3 encoder module 543 at a third quality level.

[0032] Region 4 may be an area 342 covering additional areas of the video frame 515 such as object(s) of interest 240 (FIG. 2A), a white board, a combination thereof, and/or the like. The region 4 determination module 524 may be configured to identify region 4 areas in video frames 515 and generate region 4 data 534 for that identified region. The region 4 data 534 may be encoded by region 4 encoder module 544 at a fourth quality level.

[0033] To reduce the bit-rate of the encoded video, the various region data (531, 532, 533, and 534) may be encoded using different quality levels. A quality level may be indicative of a level of compression. Generally, the lower the level of compression, the higher the quality of the output stream. Higher levels of compression generally produce a lower bit rate output, whereas a lower level of compression generally produces a higher bit rate output. In the example of FIG. 5, region 1 data 531 may be encoded at a higher quality than the region 2 data 532, the region 3 data 533, and region 4 data 534. In some of the various embodiments, the region 2 data 532 may be encoded at a higher quality than the region 3 data 533 and region 4 data 534. In some cases, the region 3 data may need to be encoded at a higher quality to show an important subject of the teleconference. Therefore, one skilled in the art will recognise that other combinations of quality encoding for different regions may be employed. Additionally, it may be that one or more of the region 1 encoder module 541, region 2 encoder module 542, region 3 encoder module 543, and/or region 4 encoder module 544 may encode at a similar and/or same quality level. In some of the various embodiments, one or more of the region 1 encoder module 541, region 2 encoder module 542, region 3 encoder module 543, and/or region 4 encoder module 544 may be the same encoder configured to process different regions at different quality levels.

[0034] FIG. 6 is a flow diagram of an example multiple region video conference encoding mechanism as per an aspect of an embodiment of the present invention. Blocks indicated with a dashed line are optional actions. The flow diagram may be implemented as a method using hardware and/or software in combination with digital hardware. Additionally, the flow diagram may be implemented as a series of one or more instructions on a non-transitory machine-readable medium, which, if executed by a processor, cause a computer to implement the flow diagram.

[0035] A first region of one or more frames containing a speaker's face may be located at 610. Additional regions may be located in the frame. For example: at 630, a third region of the one or more frames containing additional faces may be located; a fourth region of the one or more frames may be located by a user; and at 620, a second region of the one or more frames containing a background may be located. These areas may be located using techniques described earlier.

[0036] The first region may be identified employing face recognition techniques described earlier. Face tracking techniques may be employed to adjust the first region to track the speaker's face as the speaker moves around a video frame. Additionally, the first region may be periodically reassigned to a new speaker's face.

[0037] Each of the regions may be encoded at different qualities. For example, the first region may be encoded at a first quality at 650, the second region may be encoded at a second quality at 660, the third region may be encoded at a third quality at 670, and the fourth region may be encoded at a fourth quality at 680.

[0038] The quality levels may be set relative to each other. For example, the third quality may be lower than the second quality, the second quality may be lower than the first quality, and/or the fourth quality may be lower than the first quality. Various combinations are possible depending upon constraints such as desired final output bit rate, desired image quality of the various regions, combinations thereof, and/or the like. In some embodiments, one or more quality levels may be the same. Generally, in video conferencing applications, the quality level of region 1 will be set highest unless another area of the frame is deemed to be more important.

[0039] FIG. 7 through FIG. 9 are example flow diagrams of a video conference encoding mechanism as per an aspect of an embodiment of the present invention. Some of the various embodiments of the present invention may decrease the bit-rate of a video conference at the sacrifice of the image quality of unvalued information. Face detection and ROI (region of interest) recognition technology may be combined such that crucial information of a video frame, such as attendee faces or user-defined ROI parts, may be extracted out and encoded at a high quality level. Since the frame size may become smaller, the bit-rate of the video conference may decrease.

[0040] In some embodiments, information in a video frame may be classified into at least 3 types. Each type may be assigned a different quality value according to its importance. In many cases, the frame area which contains the speaker's face and the user-defined ROI may be assigned to be encoded with a highest priority quality level. A secondary level may be assigned to the faces of other attendees. A last level may be assigned to the background of the frame.
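A sketch of this three-type classification follows; the "kind" tags and the helper function are hypothetical labels a pre-encoding step might use, not terminology from the patent.

```python
# Sketch of the three-level classification in paragraph [0040].
def quality_level(kind, is_current_speaker=False):
    if kind == "user_defined_roi" or (kind == "face" and is_current_speaker):
        return 1          # highest priority quality level
    if kind == "face":
        return 2          # secondary level: faces of other attendees
    return 3              # background: minimum quality
```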

[0041] For this example, the classification strategy may be based on the typical scenario of a video conference application. The speaker and his action may be the focus of the video conference. The speaker may employ tools such as a blackboard or projection screen to help with a presentation. Correspondingly, some embodiments may detect the speaker's face automatically and give the speaker the privilege to define user-defined ROI(s). As audiences, other attendees may contribute less to the current video conference, so they may be assigned a second-level quality. Finally, information in the rest of the area may be roughly static, treated as background, and assigned a minimum quality.

[0042] An example embodiment may include three modules: an "ROI Demon", a "Pre-Encoding" module and a "Discriminated Encoding" module. FIG. 7 illustrates the flowchart of an "ROI Demon" module. At the conference local side, the "ROI Creation Event" may be defined as, for example, the constant movement of the mouse on the local view, while the "ROI Destroy Event" may be defined, for example, as a double click within a pre-defined ROI area. The demon may maintain the created ROIs; monitor and respond to the local view events; and provide the ROI creation and destroy service to the user. Specifically, in this example, at processing block 710, window event(s) may be locally monitored. When an ROI creation event is detected at block 720, a new ROI area may be added to an ROI pool. If an ROI destroy event is detected at block 750, the corresponding ROI area may be removed from the ROI pool.
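The demon's state can be sketched as below; the class and method names are assumptions, and the wiring to an actual windowing system (mouse-move and double-click events) is omitted.

```python
# Sketch of the "ROI Demon" state from FIG. 7; names are hypothetical.
class RoiDemon:
    def __init__(self):
        self.pool = []                    # the maintained user-defined ROIs

    def on_create_event(self, rect):
        self.pool.append(rect)            # block 720: add new ROI area to pool

    def on_destroy_event(self, click_point):
        # block 750: remove any ROI the user double-clicked inside
        self.pool = [r for r in self.pool if not _contains(r, click_point)]

def _contains(rect, point):
    x, y, w, h = rect
    px, py = point
    return x <= px < x + w and y <= py < y + h
```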

[0043] FIG. 8 is a flow diagram of a "Pre-Encoding" module 800 and FIG. 9 is a flow diagram of a "Discriminated Encoding" module 900. The Pre-Encoding module 800 may receive the raw frame from a camera at block 810. By using face analyzing technology, attendee faces may be extracted at block 820. A judgment as to whether the speaker has changed, through the tracking of lip movement or expression change, may be made at block 830. Besides the initiative change made by the current speaker, if the speaker changes, it may be expected that the speaker may have defined new ROIs, and so a check on whether the ROI has changed may be made at block 840. An "ROI Redefine" block 860 may send a request to the "ROI Demon" to ask for the latest user-defined ROIs. At block 850, faces and ROIs may be classified according to the three quality levels discussed earlier. Classified face and ROI areas from the "Pre-Encoding" module may be communicated to the "Discriminated Encoding" module, where the classified face and ROI areas may be encoded with the highest, middle and lowest quality respectively.
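A condensed sketch of this flow is shown below; detect_faces, speaker_changed, and the face dictionaries (with "box" and "is_speaker" keys) are hypothetical hooks standing in for blocks 820-860, not APIs from the patent.

```python
# Sketch of the "Pre-Encoding" flow of FIG. 8, with hypothetical hooks.
def pre_encode(raw_frame, detect_faces, speaker_changed, roi_demon, rois):
    faces = detect_faces(raw_frame)               # block 820: extract faces
    if speaker_changed(faces):                    # block 830: lip/expression cue
        rois = list(roi_demon.pool)               # blocks 840/860: ROI redefine
    areas = []
    for face in faces:                            # block 850: classify areas
        level = 1 if face.get("is_speaker") else 2
        areas.append((face["box"], level))
    areas += [(r, 1) for r in rois]               # user-defined ROIs: level 1
    return areas, rois
```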

[0044] Unencoded face and/or user-defined area(s) may be received at block 910. If the area is determined to be a level 1 area (e.g. highest priority quality level) at block 960, then it may be encoded at the highest quality level at block 930. If the area is determined to be ranked as a level 2 area (e.g. medium priority quality level) at block 970, then it may be encoded at the medium quality level at block 940. Otherwise, it may be encoded at a low quality level at block 950. This process continues until it is determined at block 920 that all of the faces and areas have been encoded. The encoded frame may then be packed and sent to the network at block 980.
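The loop of FIG. 9 reduces to mapping each classified area's level to an encoder quality, sketched below; encode_at is a hypothetical encoder hook, and the QP values are illustrative only (a lower QP keeps more detail, consistent with paragraph [0045]).

```python
# Sketch of the "Discriminated Encoding" loop of FIG. 9.
QP_BY_LEVEL = {1: 22, 2: 30, 3: 38}   # lower QP = less compression, higher quality

def discriminated_encode(classified_areas, encode_at):
    packets = []
    for area, level in classified_areas:          # loop until block 920 is done
        packets.append(encode_at(area, QP_BY_LEVEL[level]))
    return b"".join(packets)                      # block 980: pack and send
```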

[0045] This example embodiment may be implemented by modifying an H.264 encoding module to assign different QP (quantisation parameter) values to the three types of areas. Experimental results have shown that video output encoded by raw H.264 has a bit-rate of 187 kbps. However, the video output of a modified H.264 encoder, where the encoding quality of the face area was 1.4 times that of the background, had a decreased bit-rate from 187 kbps to 127 kbps. This result represents a roughly 32% improvement to the bit-rate, since (187 - 127)/187 ≈ 0.32.

[0046] Figure 10 illustrates an embodiment of a system 1000. In embodiments, system 1000 may be a media system although system 1000 is not limited to this context. For example, system 1000 may be incorporated into a personal computer (PC), laptop computer, ultra-laptop computer, tablet, touch pad, portable computer, handheld computer, palmtop computer, personal digital assistant (PDA), cellular telephone, combination cellular telephone/PDA, television, smart device (e.g., smart phone, smart tablet or smart television), mobile internet device (MID), messaging device, data communication device, and so forth.

[0047] In embodiments, system 1000 comprises a platform 1002 coupled to a display 1020. Platform 1002 may receive content from a content device such as content services device(s) 1030 or content delivery device(s) 1040 or other similar content sources. A navigation controller 1050 comprising one or more navigation features may be used to interact with, for example, platform 1002 and/or display 1020. Each of these components is described in more detail below.

[0048] In embodiments, platform 1002 may comprise any combination of a chipset 1005, processor 1010, memory 1012, storage 1014, graphics subsystem 1015, applications 1016 and/or radio 1018. Chipset 1005 may provide intercommunication among processor 1010, memory 1012, storage 1014, graphics subsystem 1015, applications 1016 and/or radio 1018. For example, chipset 1005 may include a storage adapter (not depicted) capable of providing intercommunication with storage 1014.

[0049] Processor 1010 may be implemented as Complex Instruction Set Computer (CISC) or Reduced Instruction Set Computer (RISC) processors, x86 instruction set compatible processors, multi-core, or any other microprocessor or central processing unit (CPU). In embodiments, processor 1010 may comprise dual-core processor(s), dual-core mobile processor(s), and so forth.

[0050] Memory 1012 may be implemented as a volatile memory device such as, but not limited to, a Random Access Memory (RAM), Dynamic Random Access Memory (DRAM), or Static RAM (SRAM).

[0051] Storage 1014 may be implemented as a non-volatile storage device such as, but not limited to, a magnetic disk drive, optical disk drive, tape drive, an internal storage device, an attached storage device, flash memory, battery backed-up SDRAM (synchronous DRAM), and/or a network accessible storage device. In embodiments, storage 1014 may comprise technology to increase the storage performance enhanced protection for valuable digital media when multiple hard drives are included, for example.

[0052] Graphics subsystem 1015 may perform processing of images such as still or video for display. Graphics subsystem 1015 may be a graphics processing unit (GPU) or a visual processing unit (VPU), for example. An analog or digital interface may be used to communicatively couple graphics subsystem 1015 and display 1020. For example, the interface may be any of a High-Definition Multimedia Interface, DisplayPort, wireless HDMI, and/or wireless HD compliant techniques. Graphics subsystem 1015 could be integrated into processor 1010 or chipset 1005. Graphics subsystem 1015 could be a stand-alone card communicatively coupled to chipset 1005.

[0053] The graphics and/or video processing techniques described herein may be implemented in various hardware architectures. For example, graphics and/or video functionality may be integrated within a chipset. Alternatively, a discrete graphics and/or video processor may be used. As still another embodiment, the graphics and/or video functions may be implemented by a general purpose processor, including a multi-core processor. In a further embodiment, the functions may be implemented in a consumer electronics device.

[0054] Radio 1018 may include one or more radios capable of transmitting and receiving signals using various suitable wireless communications techniques. Such techniques may involve communications across one or more wireless networks. Exemplary wireless networks include (but are not limited to) wireless local area networks (WLANs), wireless personal area networks (WPANs), wireless metropolitan area networks (WMANs), cellular networks, and satellite networks. In communicating across such networks, radio 1018 may operate in accordance with one or more applicable standards in any version.

[0055] In embodiments, display 1020 may comprise any television type monitor or display. Display 1020 may comprise, for example, a computer display screen, touch screen display, video monitor, television-like device, and/or a television. Display 1020 may be digital and/or analog. In embodiments, display 1020 may be a holographic display. Also, display 1020 may be a transparent surface that may receive a visual projection. Such projections may convey various forms of information, images, and/or objects. For example, such projections may be a visual overlay for a mobile augmented reality (MAR) application. Under the control of one or more software applications 1016, platform 1002 may display user interface 1022 on display 1020.

[0056] In embodiments, content services device(s) 1030 may be hosted by any national, international and/or independent service and thus accessible to platform 1002 via the Internet, for example. Content services device(s) 1030 may be coupled to platform 1002 and/or to display 1020. Platform 1002 and/or content services device(s) 1030 may be coupled to a network 1060 to communicate (e.g., send and/or receive) media information to and from network 1060. Content delivery device(s) 1040 also may be coupled to platform 1002 and/or to display 1020.

[0057] In embodiments, content services device(s) 1030 may comprise a cable television box, personal computer, network, telephone, Internet enabled devices or appliance capable of delivering digital information and/or content, and any other similar device capable of unidirectionally or bidirectionally communicating content between content providers and platform 1002 and/or display 1020, via network 1060 or directly. It will be appreciated that the content may be communicated unidirectionally and/or bidirectionally to and from any one of the components in system 1000 and a content provider via network 1060. Examples of content may include any media information including, for example, video, music, medical and gaming information, and so forth.

[0058] Content services device(s) 1030 receives content such as cable television programming including media information, digital information, and/or other content. Examples of content providers may include any cable or satellite television or radio or Internet content providers. The provided examples are not meant to limit embodiments of the invention.

[0059] In embodiments, platform 1002 may receive control signals from navigation controller 1050 having one or more navigation features. The navigation features of controller 1050 may be used to interact with user interface 1022, for example. In embodiments, navigation controller 1050 may be a pointing device that may be a computer hardware component (specifically a human interface device) that allows a user to input spatial (e.g., continuous and multi-dimensional) data into a computer. Many systems such as graphical user interfaces (GUI), and televisions and monitors, allow the user to control and provide data to the computer or television using physical gestures.

[0060] Movements of the navigation features of controller 1050 may be echoed on a display (e.g., display 1020) by movements of a pointer, cursor, focus ring, or other visual indicators displayed on the display. For example, under the control of software applications 1016, the navigation features located on navigation controller 1050 may be mapped to virtual navigation features displayed on user interface 1022, for example. In embodiments, controller 1050 may not be a separate component but integrated into platform 1002 and/or display 1020. Embodiments, however, are not limited to the elements or in the context shown or described herein.

[0061] In embodiments, drivers (not shown) may comprise technology to enable users to instantly turn on and off platform 1002 like a television with the touch of a button after initial boot-up, when enabled, for example. Program logic may allow platform 1002 to stream content to media adaptors or other content services device(s) 1030 or content delivery device(s) 1040 when the platform is turned "off." In addition, chipset 1005 may comprise hardware and/or software support for 5.1 surround sound audio and/or high definition 7.1 surround sound audio, for example. Drivers may include a graphics driver for integrated graphics platforms. In embodiments, the graphics driver may comprise a peripheral component interconnect (PCI) Express graphics card.

[0062] In various embodiments, any one or more of the components shown in system 700 may be integrated. For example, platform 702 and content services device(s) 730 may be integrated, or platform 702 and content delivery device(s) 740 may be integrated, or platform 702, content services device(s) 730, and content delivery device(s) 740 may be integrated, for example. In various embodiments, platform 702 and display 720 may be an integrated unit. Display 720 and content service device(s) 730 may be integrated, or display 720 and content delivery device(s) 740 may be integrated, for example. These examples are not meant to limit the invention.

[0063] In various embodiments, system 700 may be implemented as a wireless system, a wired system, or a combination of both. When implemented as a wireless system, system 700 may include components and interfaces suitable for communicating over a wireless shared media, such as one or more antennas, transmitters, receivers, transceivers, amplifiers, filters, control logic, and so forth. An example of wireless shared media may include portions of a wireless spectrum, such as the RF spectrum and so forth. When implemented as a wired system, system 700 may include components and interfaces suitable for communicating over wired communications media, such as input/output (I/O) adapters, physical connectors to connect the I/O adapter with a corresponding wired communications medium, a network interface card (NIC), disc controller, video controller, audio controller, and so forth. Examples of wired communications media may include a wire, cable, metal leads, printed circuit board (PCB), backplane, switch fabric, semiconductor material, twisted-pair wire, co-axial cable, fiber optics, and so forth.

[0064] Platform 1002 may establish one or more logical or physical channels to communicate information. The information may include media information and control information. Media information may refer to any data representing content meant for a user. Examples of content may include, for example, data from a voice conversation, videoconference, streaming video, electronic mail ("email") message, voice mail message, alphanumeric symbols, graphics, image, video, text and so forth. Data from a voice conversation may be, for example, speech information, silence periods, background noise, comfort noise, tones and so forth. Control information may refer to any data representing commands, instructions or control words meant for an automated system. For example, control information may be used to route media information through a system, or instruct a node to process the media information in a predetermined manner. The embodiments, however, are not limited to the elements or in the context shown or described in Figure 10.

[0065] As described above, system 1000 may be embodied in varying physical styles or form factors. Figure 11 illustrates embodiments of a small form factor device 1100 in which system 1000 may be embodied. In embodiments, for example, device 1100 may be implemented as a mobile computing device having wireless capabilities. A mobile computing device may refer to any device having a processing system and a mobile power source or supply, such as one or more batteries, for example.

[0066] As described above, examples of a mobile computing device may include a personal computer (PC), laptop computer, ultra-laptop computer, tablet, touch pad, portable computer, handheld computer, palmtop computer, personal digital assistant (PDA), cellular telephone, combination cellular telephone/PDA, television, smart device (e.g., smart phone, smart tablet or smart television), mobile internet device (MID), messaging device, data communication device, and so forth.

[0067] Examples of a mobile computing device also may include computers that are arranged to be worn by a person, such as a wrist computer, finger computer, ring computer, eyeglass computer, belt-clip computer, arm-band computer, shoe computers, clothing computers, and other wearable computers. In embodiments, for example, a mobile computing device may be implemented as a smart phone capable of executing computer applications, as well as voice communications and/or data communications. Although some embodiments may be described with a mobile computing device implemented as a smart phone by way of example, it may be appreciated that other embodiments may be implemented using other wireless mobile computing devices as well. The embodiments are not limited in this context.

[0068] As shown in Figure 11, device 1100 may comprise a housing 1102, a display 1104, an input/output (I/O) device 1106, and an antenna 1108. Device 1100 also may comprise navigation features 1112. Display 1104 may comprise any suitable display unit for displaying information appropriate for a mobile computing device. I/O device 1106 may comprise any suitable I/O device for entering information into a mobile computing device. Examples for I/O device 1106 may include an alphanumeric keyboard, a numeric keypad, a touch pad, input keys, buttons, switches, rocker switches, microphones, speakers, voice recognition device and software, and so forth. Information also may be entered into device 1100 by way of a microphone. Such information may be digitised by a voice recognition device. The embodiments are not limited in this context.

[0069] Various embodiments may be implemented using hardware elements, software elements, or a combination of both. Examples of hardware elements may include processors, microprocessors, circuits, circuit elements (e.g., transistors, resistors, capacitors, inductors, and so forth), integrated circuits, application specific integrated circuits (ASIC), programmable logic devices (PLD), digital signal processors (DSP), field programmable gate array (FPGA), logic gates, registers, semiconductor device, chips, microchips, chip sets, and so forth. Examples of software may include software components, programs, applications, computer programs, application programs, system programs, machine programs, operating system software, middleware, firmware, software modules, routines, subroutines, functions, methods, procedures, software interfaces, application program interfaces (API), instruction sets, computing code, computer code, code segments, computer code segments, words, values, symbols, or any combination thereof. Determining whether an embodiment is implemented using hardware elements and/or software elements may vary in accordance with any number of factors, such as desired computational rate, power levels, heat tolerances, processing cycle budget, input data rates, output data rates, memory resources, data bus speeds and other design or performance constraints.

[0070] One or more aspects of at least one embodiment may be implemented by representative instructions stored on a machine-readable medium which represents various logic within the processor, which when read by a machine causes the machine to fabricate logic to perform the techniques described herein. Such representations, known as "IP cores", may be stored on a tangible, machine readable medium and supplied to various customers or manufacturing facilities to load into the fabrication machines that actually make the logic or processor.

[0071] In this specification, "a" and "an" and similar phrases are to be interpreted as "at least one" and "one or more." References to "an" embodiment in this disclosure are not necessarily to the same embodiment.

[0072] Many of the elements described in the disclosed embodiments may be implemented as modules. A module is defined here as an isolatable element that performs a defined function and has a defined interface to other elements. The modules described in this disclosure may be implemented in hardware, a combination of hardware and software, firmware, or a combination thereof, all of which are behaviorally equivalent. For example, modules may be implemented using computer hardware in combination with software routine(s) written in a computer language (such as C, C++, Fortran, Java, Basic, Matlab or the like) or a modeling/simulation program such as Simulink, Stateflow, GNU Octave, or LabVIEW MathScript. Additionally, it may be possible to implement modules using physical hardware that incorporates discrete or programmable analog, digital and/or quantum hardware. Examples of programmable hardware include: computers, microcontrollers, microprocessors, application-specific integrated circuits (ASICs); field programmable gate arrays (FPGAs); and complex programmable logic devices (CPLDs). Computers, microcontrollers and microprocessors are programmed using languages such as assembly, C, C++ or the like. FPGAs, ASICs and CPLDs are often programmed using hardware description languages (HDL) such as VHSIC hardware description language (VHDL) or Verilog that configure connections between internal hardware modules with lesser functionality on a programmable device. Finally, it needs to be emphasised that the above mentioned technologies may be used in combination to achieve the result of a functional module.

[0073] In addition, it should be understood that any figures that highlight any functionality and/or advantages are presented for example purposes only. The disclosed architecture is sufficiently flexible and configurable, such that it may be utilised in ways other than that shown. For example, the steps listed in any flowchart may be re-ordered or only optionally used in some embodiments.

[0074] Further, the purpose of the Abstract of the Disclosure is to enable the U.S. Patent and Trademark Office and the public generally, and especially the scientists, engineers and practitioners in the art who are not familiar with patent or legal terms or phraseology, to determine quickly from a cursory inspection the nature and essence of the technical disclosure of the application. The Abstract of the Disclosure is not intended to be limiting as to the scope in any way.

[0075] It is the applicant's intent that only claims that include the express language "means for" or "step for" be interpreted under 35 U.S.C. 112, paragraph 6.

Claims that do not expressly include the phrase "means for" or "step for" are not to be interpreted under 35 U.S.C. 112, paragraph 6.

[0076] Unless specifically stated otherwise, it may be appreciated that terms such as "processing," "computing," "calculating," "determining," or the like, refer to the action and/or processes of a computer or computing system, or similar electronic computing device, that manipulates and/or transforms data represented as physical quantities (e.g., electronic) within the computing system's registers and/or memories into other data similarly represented as physical quantities within the computing system's memories, registers or other such information storage, transmission or display devices. The embodiments are not limited in this context.

[0077] The term "coupled" may be used herein to refer to any type of relationship, direct or indirect, between the components in question, and may apply to electrical, mechanical, fluid, optical, electromagnetic, electromechanical or other connections. In addition, the terms "first", "second", etc. may be used herein only to facilitate discussion, and carry no particular temporal or chronological significance unless otherwise indicated.

[0078] Those skilled in the art will appreciate from the foregoing description that the broad techniques of the embodiments of the present invention can be implemented in a variety of forms. Therefore, while the embodiments of this invention have been described in connection with particular examples thereof, the true scope of the embodiments of the invention should not be so limited since other modifications will become apparent to the skilled practitioner upon a study of the drawings, specification, and following claims.




 