Title:
METHOD AND DEVICE FOR VIDEO PROCESSING
Document Type and Number:
WIPO Patent Application WO/2014/180138
Kind Code:
A1
Abstract:
A method and a device for processing a video are provided. The method includes: fetching, from a buffer queue, an ith image frame of the video; calculating a sampling interval of the ith image frame; calculating a waiting time of the ith image frame; calculating a regulated waiting time of the ith image frame based on the waiting time of the ith image frame and a regulated waiting time of the (i-1)th image frame; determining a playing interval of the ith image frame based on the regulated waiting time of the ith image frame, the sampling interval of the ith image frame and a preset waiting delay; if the duration between a time point on which the (i-1)th image frame of the video starts to be played and a current time point is not shorter than the playing interval of the ith image frame, playing the ith image frame of the video on the current time point.

Inventors:
YIN CHENGGUO (CN)
Application Number:
PCT/CN2013/089239
Publication Date:
November 13, 2014
Filing Date:
December 12, 2013
Assignee:
TENCENT TECH SHENZHEN CO LTD (CN)
International Classes:
H04N7/14
Foreign References:
CN101378484A    2009-03-04
CN101668223A    2010-03-10
US20070263070A1    2007-11-15
JP2011066944A    2011-03-31
Attorney, Agent or Firm:
SHENPAT INTELLECTUAL PROPERTY AGENCY (West Block Guomao Buildin, Shenzhen Guangdong 4, CN)
Claims:
WHAT IS CLAIMED IS:

1. A method for processing a video, comprising:

fetching, from a buffer queue, an ith image frame of the video, wherein the variable i is a natural number;

calculating a sampling interval of the ith image frame between a time point on which the ith image frame of the video is sampled and another time point on which an (i-1)th image frame of the video is sampled, wherein the (i-1)th image frame of the video is the image frame of the video fetched from the buffer queue immediately before the ith image frame;

calculating a waiting time of the ith image frame, starting when the ith image frame of the video is put into the buffer queue and ending when the ith image frame of the video is fetched out of the buffer queue;

calculating a regulated waiting time of the ith image frame based on the waiting time of the ith image frame and a regulated waiting time of the (i-1)th image frame;

determining a playing interval of the ith image frame based on the regulated waiting time of the ith image frame, the sampling interval of the ith image frame and a preset waiting delay;

determining whether the duration between a time point on which the (i-1)th image frame of the video starts to be played and a current time point is shorter than the playing interval of the ith image frame; and

playing the ith image frame of the video on the current time point if the duration between the time point on which the (i-1)th image frame of the video starts to be played and the current time point is not shorter than the playing interval of the ith image frame.

2. The method according to claim 1, wherein calculating the regulated waiting time of the ith image frame based on the waiting time of the ith image frame and the regulated waiting time of the (i-1)th image frame comprises:

obtaining a preset adjustment factor α for the waiting time, wherein the adjustment factor α satisfies 0<α<1; and

calculating the regulated waiting time of the ith image frame with the following formula:

AVR_W_i = α × AVR_W_{i-1} + (1 - α) × W_i

wherein AVR_W_i represents the regulated waiting time of the ith image frame, AVR_W_{i-1} represents the regulated waiting time of the (i-1)th image frame, and W_i represents the waiting time of the ith image frame.

3. The method according to claim 1 or claim 2, wherein obtaining the playing interval of the ith image frame based on the regulated waiting time of the ith image frame, the sampling interval of the ith image frame and the preset waiting delay comprises:

determining whether the regulated waiting time of the ith image frame is longer than the preset waiting delay;

if the regulated waiting time of the ith image frame is longer than the preset waiting delay, obtaining a preset adjustment factor for the playing interval, and calculating the playing interval of the ith image frame based on the adjustment factor for the playing interval and the sampling interval of the ith image frame; or

if the regulated waiting time of the ith image frame is shorter than or equal to the preset waiting delay, taking the sampling interval of the ith image frame as the playing interval of the ith image frame.

4. The method according to claim 3, wherein calculating the playing interval of the ith image frame based on the adjustment factor for the playing interval and the sampling interval of the ith image frame comprises:

calculating the playing interval of the ith image frame with the following formula:

PlayInterval_i = SampleInterval_i × β

wherein PlayInterval_i represents the playing interval of the ith image frame, SampleInterval_i represents the sampling interval of the ith image frame, and β represents the adjustment factor for the playing interval and satisfies 0<β<1.

5. The method according to claim 1 or claim 2, wherein the method further comprises:

in the case that the duration between the time point on which the (i-1)th image frame of the video starts to be played and the current time point is shorter than the playing interval of the ith image frame, playing the ith image frame of the video at a time point when a duration equal to the playing interval of the ith image frame has elapsed since the (i-1)th image frame of the video starts to be played.

6. The method according to claim 1, wherein calculating the sampling interval of the ith image frame between one time point on which the ith image frame of the video is sampled and another time point on which the (i-1)th image frame of the video is sampled comprises:

fetching, from attribute information of the ith image frame of the video, a sampling time stamp marking a time point on which the ith image frame of the video is sampled by a sender; and

calculating the sampling interval of the ith image frame based on the sampling time stamp of the ith image frame of the video and a sampling time stamp of the (i-1)th image frame of the video.

7. The method according to claim 1, wherein the method further comprises: receiving, through a network, images of the video sent by a sender; and each time when receiving one image frame of the video, putting the received image frame of the video into the buffer queue.

8. The method according to claim 1, wherein fetching, from the buffer queue, the ith image frame of the video comprises:

in the case that the buffer queue comprises a plurality of image frames of the video, selecting, from the plurality of image frames of the video, an image frame which is put into the buffer queue earliest; and

fetching, from the buffer queue, the image frame which is put into the buffer queue earliest, as the ith image frame.

9. A device for processing a video, comprising:

a fetching module, configured to fetch, from a buffer queue, an ith image frame of the video, wherein the variable i is a natural number;

a sampling interval calculation module, configured to calculate a sampling interval of the ith image frame between a time point on which the ith image frame of the video is sampled and another time point on which an (i-1)th image frame of the video is sampled, wherein the (i-1)th image frame of the video is the image frame fetched from the buffer queue immediately before the ith image frame;

a first time calculation module, configured to calculate a waiting time of the ith image frame, starting when the ith image frame of the video is put into the buffer queue and ending when the ith image frame of the video is fetched out of the buffer queue;

a second time calculation module, configured to calculate a regulated waiting time of the ith image frame based on the waiting time of the ith image frame and a regulated waiting time of the (i-1)th image frame;

an obtaining module, configured to obtain a playing interval of the ith image frame based on the regulated waiting time of the ith image frame, the sampling interval of the ith image frame and a preset waiting delay;

a determination module, configured to determine whether the duration between a time point on which the (i-1)th image frame of the video starts to be played and a current time point is shorter than the playing interval of the ith image frame; and

a playing module, configured to play the ith image frame of the video on the current time point if the duration between the time point on which the (i-1)th image frame of the video starts to be played and the current time point is longer than or equal to the playing interval of the ith image frame.

10. The device according to claim 9, wherein the second time calculation module comprises:

a first obtaining sub-module, configured to obtain a preset adjustment factor α for the waiting time, wherein the adjustment factor α satisfies 0<α<1; and

a first time calculation sub-module, configured to calculate the regulated waiting time of the ith image frame with the following formula: AVR_W_i = α × AVR_W_{i-1} + (1 - α) × W_i, wherein AVR_W_i represents the regulated waiting time of the ith image frame, AVR_W_{i-1} represents the regulated waiting time of the (i-1)th image frame, and W_i represents the waiting time of the ith image frame.

11. The device according to claim 9 or claim 10, wherein the obtaining module comprises:

a determination sub-module, configured to determine whether the regulated waiting time of the ith image frame is longer than the preset waiting delay;

a second obtaining sub-module, configured to obtain a preset adjustment factor for the playing interval if the regulated waiting time of the ith image frame is longer than the preset waiting delay;

a playing interval calculation sub-module, configured to calculate the playing interval of the ith image frame based on the adjustment factor for the playing interval and the sampling interval of the ith image frame; and

a third obtaining sub-module, configured to take the sampling interval of the ith image frame as the playing interval of the ith image frame if the regulated waiting time of the ith image frame is shorter than or equal to the preset waiting delay.

12. The device according to claim 11, wherein the playing interval calculation sub-module is further configured to calculate the playing interval of the ith image frame with the following formula: PlayInterval_i = SampleInterval_i × β, wherein PlayInterval_i represents the playing interval of the ith image frame, SampleInterval_i represents the sampling interval of the ith image frame, β represents the adjustment factor for the playing interval, and β satisfies 0<β<1.

13. The device according to claim 9 or claim 10, wherein the playing module is further configured to, in the case that the duration between the time point on which the (i-1)th image frame of the video starts to be played and the current time point is shorter than the playing interval of the ith image frame, play the ith image frame of the video at a time point when a duration equal to the playing interval of the ith image frame has elapsed since the (i-1)th image frame of the video starts to be played.

14. The device according to claim 9, wherein the sampling interval calculation module comprises:

a time stamp extraction sub-module, configured to extract, from attribute information of the ith image frame of the video, a sampling time stamp marking a time point on which the ith image frame of the video is sampled by a sender; and

a sampling interval calculation sub-module, configured to calculate the sampling interval of the ith image frame based on the sampling time stamp of the ith image frame of the video and a sampling time stamp of the (i-1)th image frame of the video.

15. The device according to claim 9, further comprising:

a receiving module, configured to receive, through a network, image frames of the video sent by a sender; and

a storage module, configured to, each time when receiving one image frame of the video, put the received image frame of the video into the buffer queue.

16. The device according to claim 9, wherein the fetching module comprises:

a fourth obtaining sub-module, configured to, in the case that the buffer queue comprises a plurality of image frames of the video, select, from the plurality of image frames of the video, the image frame which is put into the buffer queue earliest; and

a fetching sub-module, configured to fetch, from the buffer queue, the image frame which is put into the buffer queue earliest, as the ith image frame.

17. A non-volatile storage medium comprising computer-executable instructions which, when executed by a processor, enable the processor to:

fetch, from a buffer queue, an ith image frame of the video, wherein the variable i is a natural number;

calculate a sampling interval of the ith image frame between a time point on which the ith image frame of the video is sampled and another time point on which an (i-1)th image frame of the video is sampled, wherein the (i-1)th image frame of the video is the image frame of the video fetched from the buffer queue immediately before the ith image frame;

calculate a waiting time of the ith image frame, starting when the ith image frame of the video is put into the buffer queue and ending when the ith image frame of the video is fetched out of the buffer queue;

calculate a regulated waiting time of the ith image frame based on the waiting time of the ith image frame and a regulated waiting time of the (i-1)th image frame;

determine a playing interval of the ith image frame based on the regulated waiting time of the ith image frame, the sampling interval of the ith image frame and a preset waiting delay;

determine whether the duration between a time point on which the (i-1)th image frame of the video starts to be played and a current time point is shorter than the playing interval of the ith image frame; and

play the ith image frame of the video on the current time point if the duration between the time point on which the (i-1)th image frame of the video starts to be played and the current time point is not shorter than the playing interval of the ith image frame.

Description:
METHOD AND DEVICE FOR VIDEO PROCESSING

[0001] The present application claims the priority to Chinese Patent Application No. 201310169074.3, entitled "METHOD AND DEVICE FOR VIDEO PROCESSING", filed on May 9, 2013 with State Intellectual Property Office of People's Republic of China, which is incorporated herein by reference in its entirety.

TECHNICAL FIELD

[0002] The disclosure relates to the video technology, and in particular, to a method and a device for processing a video.

BACKGROUND

[0003] With the rapid development of the network technology and the popularization of mobile phone terminals, more and more users install video call software in the mobile phone terminals, and through the video call software, a video of the counterpart in a phone call may be displayed in real time during the call.

[0004] A mobile phone terminal, acting as a receiver, may receive, through a network, the image frames of the video sent from a sender. After the receiver receives the image frames of the video, the image frames of the video may be played.

[0005] The existing approaches for playing the image frames of the video do not adapt to the network delay, which may impair the smoothness and the real-time performance of the video.

SUMMARY

[0006] A method and a device for processing a video are provided according to embodiments of the invention. With the method and the device, the playing of a current image frame of a video may be timed dynamically based on the network delay, and the real-time performance of the video call is ensured.

[0007] In order to solve at least one of the technical problems described above, the following technical solutions are provided according to embodiments of the invention.

[0008] In an aspect, a method for processing a video is provided according to an embodiment of the invention. The method includes:

[0009] fetching, from a buffer queue, an ith image frame of the video, where the variable i is a natural number;

[0010] calculating a sampling interval of the ith image frame between a time point on which the ith image frame of the video is sampled and another time point on which an (i-1)th image frame of the video is sampled, where the (i-1)th image frame of the video is the image frame fetched from the buffer queue immediately before the ith image frame;

[0011] calculating a waiting time of the ith image frame, starting when the ith image frame of the video is put into the buffer queue and ending when the ith image frame of the video is fetched out of the buffer queue;

[0012] calculating a regulated waiting time of the ith image frame based on the waiting time of the ith image frame and a regulated waiting time of the (i-1)th image frame;

[0013] determining a playing interval of the ith image frame based on the regulated waiting time of the ith image frame, the sampling interval of the ith image frame and a preset waiting delay;

[0014] determining whether the duration between a time point on which the (i-1)th image frame of the video starts to be played and a current time point is shorter than the playing interval of the ith image frame; and

[0015] if the duration is not shorter than the playing interval of the ith image frame, playing the ith image frame of the video on the current time point.

[0016] In another aspect, a device for processing a video is provided according to an embodiment of the invention. The device includes:

[0017] a fetching module, configured to fetch, from a buffer queue, an ith image frame of the video, where the variable i is a natural number;

[0018] a sampling interval calculation module, configured to calculate a sampling interval of the ith image frame between a time point on which the ith image frame of the video is sampled and another time point on which an (i-1)th image frame of the video is sampled, where the (i-1)th image frame of the video is the image frame fetched from the buffer queue immediately before the ith image frame;

[0019] a first time calculation module, configured to calculate a waiting time of the ith image frame, starting when the ith image frame of the video is put into the buffer queue and ending when the ith image frame of the video is fetched out of the buffer queue;

[0020] a second time calculation module, configured to calculate a regulated waiting time of the ith image frame based on the waiting time of the ith image frame and a regulated waiting time of the (i-1)th image frame;

[0021] an obtaining module, configured to obtain a playing interval of the ith image frame based on the regulated waiting time of the ith image frame, the sampling interval of the ith image frame and a preset waiting delay;

[0022] a determination module, configured to determine whether the duration between a time point on which the (i-1)th image frame of the video starts to be played and a current time point is shorter than the playing interval of the ith image frame; and

[0023] a playing module, configured to play the ith image frame of the video on the current time point if the duration is not shorter than the playing interval of the ith image frame.

[0024] In yet another aspect, a non-volatile storage medium including computer-executable instructions is provided, where the instructions, when executed by a processor, enable the processor to implement the foregoing method for processing a video.

[0025] According to the technical solutions, the embodiments of the invention have the following advantages.

[0026] According to the embodiments of the invention, firstly the ith image frame of the video is fetched from the buffer queue; the sampling interval of the ith image frame, between one time point on which the ith image frame of the video is sampled and another time point on which the (i-1)th image frame of the video is sampled, is calculated; the waiting time of the ith image frame, starting when the ith image frame of the video is put into the buffer queue and ending when the ith image frame of the video is fetched out of the buffer queue, is calculated; the regulated waiting time of the ith image frame is calculated based on the waiting time of the ith image frame and the regulated waiting time of the (i-1)th image frame; the playing interval of the ith image frame is obtained based on the regulated waiting time of the ith image frame, the sampling interval of the ith image frame and the preset waiting delay; then it is determined whether the duration between the time point on which the (i-1)th image frame of the video starts to be played and the current time point is shorter than the playing interval of the ith image frame; and the ith image frame of the video is played on the current time point if the duration is not shorter than the playing interval of the ith image frame.
According to the embodiments of the invention, a playing interval may be set for each image frame of a video based on the regulated waiting time, the sampling interval and the preset waiting delay. The current network delay may therefore be acquired through the regulated waiting time, and the buffer time of the current image frame of the video may be determined adaptively by setting the playing interval of each frame, so that the images of the video are displayed smoothly and stable play of the images of the video is ensured. In addition, after the playing interval is set, the duration between the time point on which the previous image frame of the video is played and the current time point is compared with the set playing interval. In the case that this duration is longer than or equal to the set playing interval, the current image frame of the video is played at the current time point. The timing for playing the current image frame of the video may therefore be dynamically adjusted, so that the received image of the video is played in time and the real-time performance of the video call is improved.

BRIEF DESCRIPTION OF THE DRAWINGS

[0027] For explaining technical solutions according to embodiments of the invention more clearly, drawings used in the description of the embodiments are explained briefly hereinafter. Obviously, the drawings in the description are merely some of the embodiments of the invention, and other drawings may be obtained by those skilled in the art based on these drawings.

[0028] Figure 1 is a flowchart of a method for processing a video according to an embodiment of the invention.

[0029] Figure 2 is a flowchart of a method for processing a video according to another embodiment of the invention.

[0030] Figure 3 is a flowchart of a method for processing a video according to yet another embodiment of the invention.

[0031] Figure 4-a is a schematic structure diagram of a device for processing a video according to an embodiment of the invention.

[0032] Figure 4-b is a schematic structure diagram of a device for processing a video according to another embodiment of the invention.

[0033] Figure 5 is a schematic structure diagram of a terminal in which the method for processing a video is applied according to an embodiment of the invention.

DETAILED DESCRIPTION OF THE EMBODIMENTS

[0034] A method and a device for processing a video are provided according to embodiments of the invention. The method and the device are configured to dynamically adjust the timing for playing a current image frame of the video based on the network delay, and ensure the real-time performance of a video call.

[0035] For making objectives, features and advantages of the disclosure more apparent, technical solutions according to embodiments of the invention are described clearly and completely hereinafter in conjunction with drawings according to the embodiments of the invention. Apparently, the described embodiments are merely part of the embodiments of the invention, rather than all the embodiments. Any other embodiment obtained by those skilled in the art based on the embodiments of the invention should fall into the scope of protection of the invention.

[0036] Terms such as "first" and "second", used in the specification, claims and the drawings, are intended to distinguish similar objects, rather than to describe a specific order or precedence. It should be understood that those terms are only applied, in describing the embodiments of the invention, to distinguish objects having the same attributes, and that those terms may be interchangeable under proper circumstances. In addition, the terms "include", "comprise" and any variation thereof are intended to be non-exclusive; procedures, methods, systems, products or devices including a series of units are not limited to those units, and may include inherent units or units which are not expressly listed.

[0037] The embodiments are described respectively as follows.

[0038] A method for processing a video is provided according to an embodiment of the invention. The method may include: fetching, from a buffer queue, an ith image frame of the video, where the variable i is a natural number; calculating a sampling interval of the ith image frame between a time point on which the ith image frame of the video is sampled and another time point on which an (i-1)th image frame of the video is sampled, where the (i-1)th image frame of the video is the image frame fetched from the buffer queue immediately before the ith image frame; calculating a waiting time of the ith image frame, starting when the ith image frame of the video is put into the buffer queue and ending when the ith image frame of the video is fetched out of the buffer queue; calculating a regulated waiting time of the ith image frame based on the waiting time of the ith image frame and a regulated waiting time of the (i-1)th image frame; determining a playing interval of the ith image frame based on the regulated waiting time of the ith image frame, the sampling interval of the ith image frame and a preset waiting delay; determining whether the duration between a time point on which the (i-1)th image frame of the video starts to be played and a current time point is shorter than the playing interval of the ith image frame; and if the duration is not shorter than the playing interval of the ith image frame, playing the ith image frame of the video on the current time point.
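
As an illustration only, the last operations of this cycle (determining the playing interval and deciding when to play) could be sketched in Python as follows, using the rules recited in claims 3 to 5. The function names, the millisecond unit and the default β of 0.8 are assumptions made for the example and are not taken from the application.

```python
def playing_interval(regulated_wait, sample_interval, preset_delay, beta=0.8):
    """Return the playing interval of the current frame (all times in ms).

    beta is the adjustment factor for the playing interval; 0.8 is an
    assumed example value, the claims only require 0 < beta < 1.
    """
    if not 0 < beta < 1:
        raise ValueError("beta must satisfy 0 < beta < 1")
    if regulated_wait > preset_delay:
        # Frames linger in the queue longer than the preset waiting delay:
        # shorten the interval so that playback catches up.
        return sample_interval * beta
    # Otherwise the sampling interval is taken as the playing interval.
    return sample_interval


def play_time(prev_play_start, now, interval):
    """Return the time point at which the current frame should be played."""
    if now - prev_play_start >= interval:
        # The duration is not shorter than the playing interval: play now.
        return now
    # Otherwise play once the playing interval has elapsed since the
    # previous frame started to be played.
    return prev_play_start + interval
```

With a 40 ms sampling interval, a regulated waiting time of 120 ms and a preset waiting delay of 100 ms, `playing_interval(120, 40, 100, beta=0.5)` yields a 20 ms playing interval; with a regulated waiting time of 80 ms the sampling interval of 40 ms is kept.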

[0039] Referring to Figure 1, a method for processing a video according to an embodiment of the invention is illustrated. The method may include steps 101-107.

[0040] The step 101 is to fetch, from a buffer queue, an ith image frame of the video;

[0041] where the variable i is a natural number.

[0042] According to the embodiment, the buffer queue is configured to store images of the video received by a receiver through a network. There may be only one image frame or multiple image frames of the video stored in the buffer queue, depending on the intervals between the sampling of the images at a sender and on the network delay. Furthermore, when one image frame of the video is fetched from the buffer queue, for convenient description of a current image frame of the video and an image frame of the video fetched immediately before the current image frame of the video, the current image frame fetched from the buffer queue is defined as the ith image frame, and the image frame fetched immediately before the current image frame is defined as an (i-1)th image frame of the video, where the variable i is a natural number. Variables i and i-1 are only used to represent the current frame and the frame fetched immediately before the current frame. Obviously, the representation may be achieved by defining other variables; for example, variables s and t (where s and t meet the condition s=t+1) may be used to represent the current frame and the frame fetched immediately before the current frame.

[0043] According to some embodiments of the invention, before the ith image frame of the video is fetched from the buffer queue, the images of the video may be stored in the buffer queue by: receiving through a network the images of the video sent by the sender; and each time when one image frame of the video is received, putting the received image frame of the video into the buffer queue. That is to say, each time when the receiver receives through a network one image frame of the video sent by the sender, the image frame of the video is put into the buffer queue. An entering moment is generated when one image frame of the video is put into the buffer queue, and a fetching moment is also generated when the image frame of the video is fetched from the buffer queue. In addition, when each image frame of the video is sent from the sender, the sender indicates a sampling time stamp in attribute information of the image frame of the video, where the sampling time stamp represents a time point on which the image frame is sampled.

[0044] According to some embodiments of the invention, in the case that the buffer queue includes a plurality of image frames of the video, the image frame which is put into the buffer queue earliest may be selected from the plurality of image frames of the video, and the image frame which is put into the buffer queue earliest is taken as the ith image frame of the video and is fetched from the buffer queue. When any image frame of the video is put into the buffer queue, an entering moment is generated, and it may be learned which image frame of the video among the plurality of images of the video stored in the buffer queue is put into the buffer queue earliest. According to some embodiments of the invention, a first-in-first-out principle is followed in fetching the images of the video from the buffer queue, i.e., the image frame which is put into the buffer queue earliest is required to be fetched first when fetching the images of the video from the buffer queue.
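
A minimal Python sketch of such a first-in-first-out buffer queue is given below for illustration. It assumes that the entering moment is stored next to each frame and that the fetching moment is read from the same clock; the class name `FrameBuffer`, its methods and the injectable `clock` parameter are hypothetical and do not appear in the application.

```python
import time
from collections import deque


class FrameBuffer:
    """FIFO buffer queue that stamps each frame with its entering moment."""

    def __init__(self, clock=time.monotonic):
        self._queue = deque()
        # The clock is injectable so that the behaviour can be checked
        # deterministically; time.monotonic is an assumed default.
        self._clock = clock

    def put(self, frame):
        # Generate the entering moment when the frame is put into the queue.
        self._queue.append((frame, self._clock()))

    def fetch(self):
        """Return (frame, entering_moment, fetching_moment), or None if empty."""
        if not self._queue:
            return None
        # The frame put into the queue earliest is fetched first.
        frame, entered_at = self._queue.popleft()
        return frame, entered_at, self._clock()
```

The difference between the returned fetching moment and entering moment is the time the frame spent waiting in the queue.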

[0045] It should be noted that step 101 and the following steps are the processes of fetching the ith image frame of the video from the buffer queue, and processing and playing the ith image frame. These steps implement only one cycle of the video processing procedure of the disclosure; the (i-1)th image frame of the video and the (i-2)th image frame of the video in the buffer queue are processed cyclically based on the same processing procedure as for the ith image frame of the video, and obviously, an (i+1)th image frame of the video in the buffer queue may also be processed cyclically with a similar processing procedure. For convenient understanding, according to the embodiment of the invention, only the processing procedure for the ith image frame of the video is explained as an example.

[0046] The step 102 is to calculate a sampling interval of the ith image frame between a time point on which the ith image frame of the video is sampled and another time point on which the (i-1)th image frame of the video is sampled;

[0047] where the (i-1)th image frame of the video is the image frame fetched from the buffer queue immediately before the ith image frame.

[0048] According to the embodiment of the invention, the (i-1)th image frame of the video is the image frame obtained during a previous processing procedure before processing the ith image frame of the video, and for convenient illustration of the order of the previous image frame and a current image frame, the image frame obtained during the previous processing procedure is defined as the (i-1)th image frame of the video.

[0049] It should be noted that, in the prior art, a fixed number of image frames of the video are buffered in advance and then played based on sampling intervals, and in the case that many image frames of the video are buffered in advance or the network delay is large, a large accumulated delay may be introduced, thus the real-time performance of a video call is affected. However, according to the embodiment, after calculating the sampling interval, the ith image frame of the video is not played directly with the sampling interval; the sampling interval is taken as a basis for setting a playing interval of the ith image frame.

[0050] The step 103 is to calculate a waiting time of the ith image frame, starting when the ith image frame of the video is put into the buffer queue and ending when the ith image frame of the video is fetched out of the buffer queue.

[0051] According to the embodiment, an entering moment is generated when an image frame of the video is put into the buffer queue, and a fetching moment is generated when the image frame of the video is fetched from the buffer queue; thus the waiting time, during which the image frame of the video waits in the buffer queue, may be obtained as the difference between the fetching moment and the entering moment. The current network delay may be learned from the waiting time of the ith image frame, since the current network delay is small if the waiting time of the ith image frame is short and large if the waiting time of the ith image frame is long.

[0052] The step 104 is to calculate a regulated waiting time of the ith image frame based on the waiting time of the ith image frame and a regulated waiting time of the (i-1)th image frame.

[0053] According to the embodiment, the regulated waiting time of the ith image frame may be calculated by performing the steps 101-104 on the ith image frame of the video, which is the processing procedure for the ith image frame of the video; similarly, the regulated waiting time of the (i-1)th image frame may be calculated for the (i-1)th image frame of the video. Provided that an initial regulated waiting time is assigned, the regulated waiting time of every frame, including those of the (i-2)th frame and the (i-3)th frame, may be calculated through successive iterations and updates. For example, when the variable i equals 1, an initial regulated waiting time is given, and a regulated waiting time of the 1st frame may be calculated based on a waiting time of the 1st frame and the initial regulated waiting time; when the variable i is updated to 2, a regulated waiting time of the 2nd frame may be calculated based on a waiting time of the 2nd frame and the regulated waiting time of the 1st frame.

[0054] In an exemplary implementation, the regulated waiting time may be a dynamically adjustable waiting time, referred to as a dynamic average waiting time, which may be a weighted average value of the waiting time of the current image frame and the waiting time(s) of previous frame(s). Of course, those skilled in the art may understand that the regulated waiting time may be determined in any other manner and the present disclosure is not limited in this aspect.

[0055] According to the embodiment of the invention, a trend of the current network delay may be preliminarily obtained from the waiting time of the ith image frame and the regulated waiting time of the (i-1)th image frame; the regulated waiting time of the ith image frame may then be calculated, and the current network delay may be learned from the regulated waiting time of the ith image frame. By profiling the trend of the current network delay based on the waiting time of the current frame and the regulated waiting time of the previous frame, and by taking the calculated regulated waiting time of the ith image frame as an input parameter for calculating the playing interval of the ith image frame, the playing interval of the ith image frame may be set to conform better to the current network delay.

[0056] The step 105 is to obtain the playing interval of the ith image frame based on the regulated waiting time of the ith image frame, the sampling interval of the ith image frame and a preset waiting delay.

[0057] According to the embodiment of the invention, the preset waiting delay is taken as an input parameter for calculating the playing interval of the ith image frame. The preset waiting delay reflects both a tolerable threshold of waiting time and an effect on the setting of the playing interval of the ith image frame caused by the hardware performance of a mobile device as the receiver. In practice, the value of the waiting delay may be set based on the tolerable threshold of waiting time and the hardware performance of the mobile device acting as the receiver, that is to say, the value of the waiting delay may be set small if the hardware performance of the mobile device is good, while the value of the waiting delay may be set large if the tolerable threshold of the waiting time is high.

[0058] The step 106 is to determine whether the duration between a time point on which the (i-1)th image frame of the video starts to be played and a current time point is shorter than the playing interval of the ith image frame.

[0059] The elapsed duration is obtained by timing from the time point on which the (i-1)th image frame of the video starts to be played to the current time point, and it is determined whether the duration is shorter than the playing interval of the ith image frame. If the duration between the time point on which the (i-1)th image frame of the video starts to be played and the current time point is longer than or equal to the playing interval of the ith image frame, step 107 is triggered to be executed.

[0060] According to some embodiments of the invention, if the duration between the time point on which the (i-1)th image frame of the video starts to be played and the current time point is shorter than the playing interval of the ith image frame, the time during which the ith image frame of the video has been waiting to be played does not exceed the playing interval of the ith image frame. In order to ensure that the images of the video are displayed smoothly, the ith image frame of the video may continue waiting and may be played at the time point when a duration equal to the playing interval of the ith image frame has elapsed since the (i-1)th image frame of the video started to be played.

[0061] The step 107 is to play the ith image frame of the video on the current time point, if the duration between the time point on which the (i-1)th image frame of the video starts to be played and the current time point is longer than or equal to the playing interval of the ith image frame.

[0062] If the duration between the time point on which the (i-1)th image frame of the video starts to be played and the current time point is longer than or equal to the playing interval of the ith image frame, the time during which the ith image frame of the video has been waiting to be played has reached or exceeded the playing interval of the ith image frame, and the ith image frame of the video is required to be played immediately. Therefore, according to the embodiment of the invention, the timing for playing a current image frame of the video may be dynamically adjusted so that the received images of the video are played in time and the real-time performance of the video call of the user is improved.

[0063] It should be noted that, according to the embodiment of the invention, the processing procedure for the ith image frame of the video fetched from the buffer queue is described with the steps 101-107. After the processing of the ith image frame of the video is finished, the steps 101-107 are repeated, except that the (i+1)th image frame of the video is fetched from the buffer queue, i.e., the variable i in the steps 101-107 is updated and replaced with i+1. The (i+2)th image frame of the video is fetched after the processing of the (i+1)th image frame of the video is finished; in this way, the images of the video are processed continuously.

[0064] Accordingly, the ith image frame of the video is first fetched from the buffer queue; the sampling interval of the ith image frame, between one time point on which the ith image frame of the video is sampled and another time point on which the (i-1)th image frame of the video is sampled, is calculated; the waiting time of the ith image frame, starting when the ith image frame of the video is put into the buffer queue and ending when the ith image frame of the video is fetched out of the buffer queue, is calculated; the regulated waiting time of the ith image frame is calculated based on the waiting time of the ith image frame and the regulated waiting time of the (i-1)th image frame; the playing interval of the ith image frame is obtained based on the regulated waiting time of the ith image frame, the sampling interval of the ith image frame and the preset waiting delay; it is then determined whether the duration between the time point on which the (i-1)th image frame of the video starts to be played and the current time point is shorter than the playing interval of the ith image frame; and the ith image frame of the video is played on the current time point if the duration is not shorter than the playing interval of the ith image frame. According to the embodiment of the invention, a playing interval may be set for each image frame of a video based on the regulated waiting time, the sampling interval and the preset waiting delay. The current network delay may therefore be learned through the regulated waiting time, and the buffer time of the current image frame of the video may be determined adaptively by setting the playing interval of each frame, so that the images of the video are displayed smoothly and stable playing of the images of the video is ensured. In addition, after the playing interval is set, the duration between the time point on which the previous image frame of the video is played and the current time point is compared with the set playing interval; in the case that this duration is longer than or equal to the set playing interval, the current image frame of the video is played at the current time point. The timing for playing the current image frame of the video may thus be dynamically adjusted so that the received images of the video are played in time and the real-time performance of the video call is improved.

[0065] Referring to Figure 2, a method for processing a video according to another embodiment of the invention is illustrated. The method may include steps 201-213.

[0066] The step 201 is to fetch, from a buffer queue, an ith image frame of the video;

[0067] where the variable i is a natural number.

[0068] According to the embodiment, the buffer queue is configured to store images of the video received by a receiver through a network. One image frame or multiple image frames of the video may be stored in the buffer queue, depending on the intervals at which the images are sampled at the sender and on the network delay.

[0069] The step 202 is to extract, from attribute information of the ith image frame of the video, a sampling time stamp marking a time point on which the ith image frame of the video is sampled by a sender.

[0070] The step 203 is to calculate a sampling interval of the ith image frame based on the sampling time stamp of the ith image frame of the video and a sampling time stamp of an (i-1)th image frame of the video.

[0071] When each image frame of the video is sent from the sender, the sender indicates one sampling time stamp in the attribute information of the image frame of the video, where the sampling time stamp represents a time point on which the image frame of the video is sampled. By subtracting the sampling time stamp of the (i-1)th image frame of the video from the sampling time stamp of the ith image frame of the video, the sampling interval of the ith image frame may be obtained.
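The subtraction of sampling time stamps may be sketched as follows. The attribute key `sampling_timestamp_ms`, the millisecond unit and the example values are hypothetical; the disclosure does not specify how the attribute information is laid out.

```python
def sampling_interval(ts_current_ms, ts_previous_ms):
    """Sampling interval of frame i: its sampling time stamp minus that of frame i-1."""
    return ts_current_ms - ts_previous_ms


# Hypothetical attribute information carried with frames 5 and 6.
frame_attrs = {
    5: {"sampling_timestamp_ms": 200},
    6: {"sampling_timestamp_ms": 240},
}

interval_6 = sampling_interval(
    frame_attrs[6]["sampling_timestamp_ms"],
    frame_attrs[5]["sampling_timestamp_ms"],
)
print(interval_6)
```

Since both time stamps are written by the sender, the interval in this sketch does not depend on the receiver's clock or on the network delay.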

[0072] The step 204 is to calculate a waiting time of the ith image frame, starting when the ith image frame of the video is put into the buffer queue and ending when the ith image frame of the video is fetched out of the buffer queue.

[0073] According to the embodiment of the invention, an entering moment is generated when each image frame of the video is put into the buffer queue, and a fetching moment is generated when the image frame of the video is fetched from the buffer queue; thus the waiting time, during which the image frame of the video waits in the buffer queue, may be obtained as the difference between the fetching moment and the entering moment.
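The waiting time calculation may be sketched as below. Taking both moments from a monotonic clock on the receiver is an assumption of this sketch, chosen so that the difference cannot become negative when the system clock is adjusted; the function name `waiting_time` is likewise hypothetical.

```python
import time


def waiting_time(entering_moment, fetching_moment):
    """Time a frame spent in the buffer queue: fetching moment minus entering moment."""
    if fetching_moment < entering_moment:
        raise ValueError("a frame cannot be fetched before it is put into the queue")
    return fetching_moment - entering_moment


# Both moments are read from the same receiver-side clock (an assumption).
entered = time.monotonic()   # recorded when the frame is put into the queue
fetched = time.monotonic()   # recorded when the frame is fetched from the queue
print(waiting_time(entered, fetched) >= 0.0)

# Deterministic example with explicit moments in seconds.
print(waiting_time(1.0, 1.5))
```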

[0074] The step 205 is to obtain a preset adjustment factor α for the waiting time;

[0075] where the adjustment factor α satisfies 0<α<1.

[0076] According to the embodiment of the invention, the preset adjustment factor α for the waiting time, taken as the weight of the regulated waiting time $\mathrm{AVR\_W}_{i-1}$ of the (i-1)th frame, is configured to correct the value of the regulated waiting time $\mathrm{AVR\_W}_i$ of the ith frame. The adjustment factor for the waiting time may be set based on the regulated waiting times and the waiting times of a plurality of frames previous to the current frame. The regulated waiting time would be significantly affected if the delay of a single frame among multiple frames were large; this problem may be avoided by setting the adjustment factor for the waiting time based on historical statistical information, so that the equilibrium of the system design is ensured.

[0077] The step 206 is to calculate a regulated waiting time of the ith image frame with the following formula:

[0078] $\mathrm{AVR\_W}_i = \alpha \times \mathrm{AVR\_W}_{i-1} + (1 - \alpha) \times W_i$;

[0079] where $\mathrm{AVR\_W}_i$ represents the regulated waiting time of the ith image frame, $\mathrm{AVR\_W}_{i-1}$ represents a regulated waiting time of the (i-1)th image frame, and $W_i$ represents the waiting time of the ith image frame.
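As a worked illustration, the recursion of step 206 may be unrolled down to an initial value. The symbol $\mathrm{AVR\_W}_0$ for the initial regulated waiting time is notation assumed here for the illustration only.

```latex
\begin{aligned}
\mathrm{AVR\_W}_i
  &= \alpha\,\mathrm{AVR\_W}_{i-1} + (1-\alpha)\,W_i \\
  &= \alpha^{2}\,\mathrm{AVR\_W}_{i-2} + (1-\alpha)\,\alpha\,W_{i-1} + (1-\alpha)\,W_i \\
  &= \alpha^{i}\,\mathrm{AVR\_W}_0 + (1-\alpha)\sum_{k=1}^{i} \alpha^{\,i-k}\,W_k ,
\qquad
\alpha^{i} + (1-\alpha)\sum_{k=1}^{i}\alpha^{\,i-k} = 1 .
\end{aligned}
```

The weights sum to 1, so the regulated waiting time is a weighted average of the initial value and all waiting times up to the ith frame, and because 0<α<1 the weight of an earlier frame decays geometrically with its distance from the current frame.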

[0080] In this embodiment of the disclosure, the regulated waiting time is a dynamically adjustable waiting time, referred to as a dynamic average waiting time, which is a weighted average value of the waiting time of the current image frame and the waiting time(s) of previous frame(s).
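The iterative update of the regulated waiting time may be sketched as follows. The function names, the value of the adjustment factor, the initial regulated waiting time and the waiting times (in milliseconds) are all hypothetical values chosen for the illustration.

```python
def regulated_waiting_time(prev_regulated, waiting, alpha):
    """One update: AVR_W_i = alpha * AVR_W_(i-1) + (1 - alpha) * W_i."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must satisfy 0 < alpha < 1")
    return alpha * prev_regulated + (1.0 - alpha) * waiting


def regulate_all(waiting_times, initial, alpha):
    """Apply the update frame by frame, starting from an initial regulated value."""
    regulated = initial
    history = []
    for waiting in waiting_times:
        regulated = regulated_waiting_time(regulated, waiting, alpha)
        history.append(regulated)
    return history


# The third frame is delayed (200 ms instead of 40 ms).
values = regulate_all([40.0, 40.0, 200.0, 40.0], initial=40.0, alpha=0.5)
print(values)
```

In this example the single delayed frame raises the regulated waiting time to 120 ms rather than 200 ms, and the value falls back to 80 ms on the next frame; a larger adjustment factor would damp the spike further at the cost of reacting more slowly to a sustained change in delay.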

[0081] The step 207 is to determine whether the regulated waiting time of the ith image frame is longer than the preset waiting delay.

[0082] If the regulated waiting time of the ith image frame is longer than the preset waiting delay, the steps 208 and 209 are executed; if the regulated waiting time of the ith image frame is shorter than or equal to the preset waiting delay, the step 210 is executed.

[0083] The step 208 is to obtain a preset adjustment factor for the playing interval.

[0084] The step 209 is to calculate a playing interval of the ith image frame based on the adjustment factor for the playing interval and the sampling interval of the ith image frame.

[0085] It is determined whether the regulated waiting time of the ith image frame is longer than the preset waiting delay, and the playing interval of the ith image frame is determined differently based on the result of the determination. If the regulated waiting time of the ith image frame is longer than the preset waiting delay, it is indicated that the time during which the ith image frame of the video is waiting to be played is too long, and the waiting time should be reduced in order to decrease an accumulated delay; the preset adjustment factor for the playing interval may be obtained, and the playing interval of the ith image frame is calculated based on the adjustment factor for the playing interval and the sampling interval of the ith image frame. The playing interval of the ith image frame may be calculated with the following formula:

[0086] $\mathrm{PlayInterval}_i = \mathrm{SampleInterval}_i \times \beta$;

[0087] where $\mathrm{PlayInterval}_i$ represents the playing interval of the ith image frame, $\mathrm{SampleInterval}_i$ represents the sampling interval of the ith image frame, β represents the adjustment factor for the playing interval, and β satisfies 0<β<1.

[0088] The adjustment factor β for the playing interval, acting as a weight of $\mathrm{SampleInterval}_i$, is configured to correct the value of $\mathrm{PlayInterval}_i$. The adjustment factor for the playing interval may be set based on the regulated waiting times and the waiting times of a plurality of frames previous to the current frame. The time during which an image frame of the video waits to be played becomes long if the delay of a single frame among multiple frames is large; this problem may be avoided by setting the adjustment factor for the playing interval based on historical statistical information, so that the image frames of the video are ensured to be played in time.

[0089] The step 210 is to take the sampling interval of the ith image frame as the playing interval of the ith image frame.

[0090] If the regulated waiting time of the ith image frame is shorter than or equal to the preset waiting delay, it is indicated that the time during which the ith image frame of the video is waiting to be played is short, it is unnecessary to correct the playing interval of the ith image frame with the adjustment factor for the playing interval, and the sampling interval of the ith image frame may be directly taken as the playing interval of the ith image frame.
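The selection of the playing interval in steps 207-210 may be sketched as below. The function name, the parameter names and the numerical values are hypothetical; only the comparison with the preset waiting delay and the scaling by β follow the description.

```python
def playing_interval(regulated_wait, sample_interval, preset_delay, beta):
    """Return the playing interval of frame i (steps 207-210)."""
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must satisfy 0 < beta < 1")
    if regulated_wait > preset_delay:
        # Steps 208-209: frames wait too long, so play faster than sampled.
        return sample_interval * beta
    # Step 210: the delay is tolerable, so keep the sampling interval.
    return sample_interval


# Hypothetical values in milliseconds.
print(playing_interval(120.0, 40.0, preset_delay=100.0, beta=0.5))
print(playing_interval(80.0, 40.0, preset_delay=100.0, beta=0.5))
```

Shortening the playing interval drains the buffer queue faster than frames were sampled, which is how the accumulated delay is reduced in this sketch.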

[0091] The step 211 is to determine whether the duration between a time point on which the (i-1)th image frame of the video starts to be played and the current time point is shorter than the playing interval of the ith image frame.

[0092] The elapsed duration is obtained by timing from the time point on which the (i-1)th image frame of the video starts to be played to the current time point, and it is determined whether the duration is shorter than the playing interval of the ith image frame. If the duration between the time point on which the (i-1)th image frame of the video starts to be played and the current time point is longer than or equal to the playing interval of the ith image frame, the step 212 is triggered to be executed; if the duration between the time point on which the (i-1)th image frame of the video starts to be played and the current time point is shorter than the playing interval of the ith image frame, the step 213 is triggered to be executed.

[0093] The step 212 is to play the ith image frame of the video on the current time point, if the duration between the time point on which the (i-1)th image frame of the video starts to be played and the current time point is longer than or equal to the playing interval of the ith image frame.

[0094] If the duration between the time point on which the (i-1)th image frame of the video starts to be played and the current time point is longer than or equal to the playing interval of the ith image frame, the time during which the ith image frame of the video has been waiting to be played has reached or exceeded the playing interval of the ith image frame, and the ith image frame of the video is required to be played immediately. Therefore, according to the embodiment of the invention, the timing for playing a current image frame of the video may be dynamically adjusted so that the received images of the video are played in time and the real-time performance of the video call of the user is improved.

[0095] In the step 213, in the case that the duration between the time point on which the (i-1)th image frame of the video starts to be played and the current time point is shorter than the playing interval of the ith image frame, the ith image frame of the video is played at the time point when a duration equal to the playing interval of the ith image frame has elapsed since the (i-1)th image frame of the video started to be played.

[0096] If the duration between the time point on which the (i-1)th image frame of the video starts to be played and the current time point is shorter than the playing interval of the ith image frame, the time during which the ith image frame of the video has been waiting to be played does not exceed the playing interval of the ith image frame. In order to ensure that the images of the video are displayed smoothly, the ith image frame of the video may continue waiting and may be played at the time point when a duration equal to the playing interval of the ith image frame has elapsed since the (i-1)th image frame of the video started to be played.
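The decision of steps 211-213 may be sketched as a function that returns the time point on which the ith image frame is to be played. The function name `play_time` and the millisecond values are hypothetical.

```python
def play_time(prev_play_start, now, play_interval):
    """Time point on which frame i is played (steps 211-213)."""
    elapsed = now - prev_play_start
    if elapsed >= play_interval:
        # Step 212: the frame has waited long enough, so play it now.
        return now
    # Step 213: keep waiting until a full playing interval has elapsed
    # since frame i-1 started to be played.
    return prev_play_start + play_interval


# Hypothetical values in milliseconds.
print(play_time(prev_play_start=1000, now=1050, play_interval=40))
print(play_time(prev_play_start=1000, now=1010, play_interval=40))
```

Under these assumptions the result equals the later of the current time point and the time point one playing interval after the previous frame started to be played.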

[0097] Accordingly, the ith image frame of the video is first fetched from the buffer queue; the sampling interval of the ith image frame, between one time point on which the ith image frame of the video is sampled and another time point on which the (i-1)th image frame of the video is sampled, is calculated; the waiting time of the ith image frame, starting when the ith image frame of the video is put into the buffer queue and ending when the ith image frame of the video is fetched out of the buffer queue, is calculated; the regulated waiting time of the ith image frame is calculated based on the waiting time of the ith image frame and the regulated waiting time of the (i-1)th image frame; the playing interval of the ith image frame is obtained based on the regulated waiting time of the ith image frame, the sampling interval of the ith image frame and the preset waiting delay; it is then determined whether the duration between the time point on which the (i-1)th image frame of the video starts to be played and the current time point is shorter than the playing interval of the ith image frame; and the ith image frame of the video is played on the current time point if the duration is not shorter than the playing interval of the ith image frame. According to the embodiment of the invention, a playing interval may be set for each image frame of a video based on the regulated waiting time, the sampling interval and the preset waiting delay. The current network delay may therefore be learned through the regulated waiting time, and the buffer time of the current image frame of the video may be determined adaptively by setting the playing interval of each frame, so that the images of the video are displayed smoothly and stable playing of the images of the video is ensured. In addition, after the playing interval is set, the duration between the time point on which the previous image frame of the video is played and the current time point is compared with the set playing interval; in the case that this duration is longer than or equal to the set playing interval, the current image frame of the video is played at the current time point. The timing for playing the current image frame of the video may thus be dynamically adjusted so that the received images of the video are played in time and the real-time performance of the video call is improved.

[0098] For better understanding of the previous solutions according to the embodiments of the invention, corresponding application scenarios are explained.

[0099] Referring to Figure 3, a schematic flow chart of a method for processing a video according to an embodiment of the invention is illustrated. The method may include steps 301-310.

[0100] In the step 301, image frames of the video sent by a sender are received through a network; each time one image frame of the video is received, the received image frame of the video is put into a buffer queue H.

[0101] Currently, the image frames of the video stored in the buffer queue H are $f_6$, $f_7$, $f_8$ and $f_9$, which correspond respectively to a sixth image frame of the video, a seventh image frame of the video, an eighth image frame of the video and a ninth image frame of the video.

[00100] In step 302, the sixth image frame of the video is fetched from the buffer queue, where the sixth image frame of the video is put into the buffer queue H earliest.

[0102] The step 303 may include step 3031 and step 3032.

[0103] In the step 3031, a sampling time stamp is extracted from attribute information of the sixth image frame $f_6$; and

[0104] in the step 3032, a sampling interval $\mathrm{SampleInterval}_6$ of the sixth image frame, i.e., the interval between one time point on which $f_6$ is sampled and another time point on which $f_5$ is sampled, is calculated.

[0105] The step 304 includes step 3041 and step 3042.

[0106] In the step 3041, a waiting time $W_6$ of the sixth image frame is calculated, where the waiting time $W_6$ starts when the sixth image frame of the video, $f_6$, is put into the buffer queue and ends when the sixth image frame of the video, $f_6$, is fetched out of the buffer queue; and

[0107] in the step 3042, a regulated waiting time of the sixth image frame is calculated with the following formula: $\mathrm{AVR\_W}_6 = \alpha \times \mathrm{AVR\_W}_5 + (1 - \alpha) \times W_6$, where 0<α<1.

[0108] In the step 305, it is determined whether the regulated waiting time $\mathrm{AVR\_W}_6$ of the sixth image frame is longer than a preset waiting delay noted as Delay. If $\mathrm{AVR\_W}_6$ is longer than Delay, the step 306 is triggered to be executed; if $\mathrm{AVR\_W}_6$ is not longer than Delay, the step 307 is triggered to be executed.

[0109] In the step 306, a playing interval $\mathrm{PlayInterval}_6$ of the sixth image frame may be calculated with the following formula: $\mathrm{PlayInterval}_6 = \mathrm{SampleInterval}_6 \times \beta$, where 0<β<1; then the step 308 is triggered to be executed.

[0110] In the step 307, the playing interval $\mathrm{PlayInterval}_6$ of the sixth image frame may be calculated with the following formula: $\mathrm{PlayInterval}_6 = \mathrm{SampleInterval}_6$; then the step 308 is triggered to be executed.

[0111] In the step 308, it is determined whether the duration between a time point on which the fifth image frame of the video, $f_5$, starts to be played and a current time point is shorter than the playing interval $\mathrm{PlayInterval}_6$ of the sixth image frame. If the duration is shorter than $\mathrm{PlayInterval}_6$, the step 309 is triggered to be executed; if the duration is not shorter than $\mathrm{PlayInterval}_6$, the step 310 is triggered to be executed.

[0112] In the step 309, $f_6$ is played at the time point when a duration equal to $\mathrm{PlayInterval}_6$ has elapsed since the fifth image frame of the video, $f_5$, started to be played.

[0113] In the step 310, the sixth image frame of the video, $f_6$, is played on the current time point.
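The scenario of steps 303-310 for the sixth image frame may be traced with concrete numbers as follows. Every value below (the adjustment factors, the preset waiting delay, the time stamps and the moments, all in milliseconds) is hypothetical and chosen only to exercise the branch of step 306 and step 309; the names `Decision` and `process_frame` are likewise assumptions of the sketch.

```python
from dataclasses import dataclass

ALPHA = 0.5    # hypothetical adjustment factor for the waiting time
BETA = 0.5     # hypothetical adjustment factor for the playing interval
DELAY = 100.0  # hypothetical preset waiting delay (ms)


@dataclass
class Decision:
    sample_interval: float
    waiting_time: float
    regulated_wait: float
    play_interval: float
    play_at: float


def process_frame(ts_prev, ts_cur, entered, fetched,
                  prev_regulated, prev_play_start, now):
    sample_interval = ts_cur - ts_prev                   # step 303
    waiting = fetched - entered                          # step 3041
    regulated = ALPHA * prev_regulated + (1 - ALPHA) * waiting  # step 3042
    if regulated > DELAY:                                # step 305
        play_interval = sample_interval * BETA           # step 306
    else:
        play_interval = sample_interval                  # step 307
    if now - prev_play_start < play_interval:            # step 308
        play_at = prev_play_start + play_interval        # step 309
    else:
        play_at = now                                    # step 310
    return Decision(sample_interval, waiting, regulated, play_interval, play_at)


# f5 was sampled at 200 ms and f6 at 240 ms (sender clock); f6 entered the
# queue at 1000 ms and is fetched at 1180 ms (receiver clock); AVR_W_5 is
# 60 ms and f5 started to be played at 1170 ms.
d6 = process_frame(ts_prev=200.0, ts_cur=240.0, entered=1000.0, fetched=1180.0,
                   prev_regulated=60.0, prev_play_start=1170.0, now=1180.0)
print(d6)
```

With these assumed values the regulated waiting time of the sixth frame is 120 ms, which exceeds the preset waiting delay, so the playing interval is halved to 20 ms and the sixth frame is played at 1190 ms instead of the 1210 ms that the unmodified sampling interval would give.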

[0114] Accordingly, a playing interval may be set for each image frame of a video based on the regulated waiting time, the sampling interval and the preset waiting delay. The current network delay may therefore be learned through the regulated waiting time, and the buffer time of the current image frame of the video may be determined adaptively by setting the playing interval of each frame, so that the images of the video are displayed smoothly and stable playing of the images of the video is ensured. In addition, after the playing interval is set, the duration between the time point on which the previous image frame of the video is played and the current time point is compared with the set playing interval; in the case that this duration is longer than or equal to the set playing interval, the current image frame of the video is played at the current time point. The timing for playing the current image frame of the video may thus be dynamically adjusted so that the received images of the video are played in time and the real-time performance of the video call is improved.

[0115] It should be noted that, for brevity of description, the previous method embodiments are all expressed as a combination of a series of operations; nevertheless, it should be understood by those skilled in the art that the invention is not limited by the described order of the operations, because some steps may be performed in other orders or simultaneously. It should further be understood by those skilled in the art that all the embodiments described in the specification are preferred embodiments, and the operations and modules referred to are not necessarily required by the invention.

[0116] To facilitate the implementing of the previous solutions according to the embodiments of the invention, a corresponding device for implementing the previous solutions is further provided.

[0117] Referring to Figure 4-a, a device 400 for processing a video is provided according to an embodiment of the invention. The device may include: a fetching module 401, a sampling interval calculation module 402, a first time calculation module 403, a second time calculation module 404, an obtaining module 405, a determination module 406 and a playing module 407; where

[0118] the fetching module 401 is configured to fetch, from a buffer queue, an ith image frame of the video, where the variable i is a natural number;

[0119] the sampling interval calculation module 402 is configured to calculate a sampling interval of the ith image frame between a time point on which the ith image frame of the video is sampled and another time point on which an (i-1)th image frame of the video is sampled, where the (i-1)th image frame of the video is the image frame fetched from the buffer queue immediately before the ith image frame;

[0120] the first time calculation module 403 is configured to calculate a waiting time of the ith image frame, starting when the ith image frame of the video is put into the buffer queue and ending when the ith image frame of the video is fetched out of the buffer queue;

[0121] the second time calculation module 404 is configured to calculate a regulated waiting time of the ith image frame based on the waiting time of the ith image frame calculated by the first time calculation module 403 and a regulated waiting time of the (i-1)th image frame;

[0122] the obtaining module 405 is configured to obtain a playing interval of the ith image frame based on the regulated waiting time of the ith image frame calculated by the second time calculation module 404, the sampling interval of the ith image frame calculated by the sampling interval calculation module 402 and a preset waiting delay;

[0123] the determination module 406 is configured to determine whether the duration between a time point on which the (i-1)th image frame of the video starts to be played and a current time point is shorter than the playing interval of the ith image frame obtained by the obtaining module 405; and

[0124] the playing module 407 is configured to play the ith image frame of the video on the current time point if it is determined, by the determination module 406, that the duration between the time point on which the (i-1)th image frame of the video starts to be played and the current time point is longer than or equal to the playing interval of the ith image frame.

[0125] According to some embodiments of the invention, the playing module 407 is further configured to, in the case that the duration between the time point on which the (i-1)th image frame of the video starts to be played and the current time point is shorter than the playing interval of the ith image frame, play the ith image frame of the video at the time point when a duration equal to the playing interval of the ith image frame has elapsed since the (i-1)th image frame of the video started to be played.

[0126] Referring to Figure 4-b, according to some embodiments of the invention, the second time calculation module 404 may include:

[0127] a first obtaining sub-module 4041, configured to obtain a preset adjustment factor α for the waiting time, where the adjustment factor α satisfies 0<α<1; and

[0128] a first time calculation sub-module 4042, configured to calculate the regulated waiting time of the ith image frame with the following formula:

[0129] AVR_Wi = axAVR_Wi- 1 + ( 1 - a)x Wi;

[0130] where the AVR Wi represents the regulated waiting time of the ith image frame, the AVR Wi i represents the regulated waiting time of the (i-l)th image frame, and the Wi represents the waiting time of the ith image frame.
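
The formula in [0129] is an exponentially weighted moving average of the waiting time. A minimal Python sketch of one update step is given below; the function name and the choice to raise an error for an out-of-range factor are illustrative assumptions, and the initial value used for the first frame is not specified in this passage and is left to the caller.

```python
def regulated_waiting_time(prev_regulated: float, waiting: float, alpha: float) -> float:
    """One update step of the regulated waiting time.

    prev_regulated: regulated waiting time of the (i-1)th frame
    waiting:        measured waiting time of the ith frame
    alpha:          preset adjustment factor for the waiting time, 0 < alpha < 1
    """
    if not 0.0 < alpha < 1.0:
        # Assumption: reject factors outside the range stated in the text.
        raise ValueError("alpha must satisfy 0 < alpha < 1")
    return alpha * prev_regulated + (1.0 - alpha) * waiting
```

A larger α gives more weight to the history and makes the regulated waiting time react more slowly to a single delayed frame.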

[0131] According to some embodiments of the invention, the obtaining module 405 may include:

[0132] a determination sub-module 4051, configured to determine whether the regulated waiting time of the ith image frame is longer than the preset waiting delay;

[0133] a second obtaining sub-module 4052, configured to obtain a preset adjustment factor for the playing interval, if the regulated waiting time of the ith image frame is longer than the preset waiting delay;

[0134] a playing interval calculation sub-module 4053, configured to calculate the playing interval of the ith image frame based on the adjustment factor for the playing interval and the sampling interval of the ith image frame; and

[0135] a third obtaining sub-module 4054, configured to take the sampling interval of the ith image frame as the playing interval of the ith image frame, if the regulated waiting time of the ith image frame is shorter than or equal to the preset waiting delay.
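
The branch implemented by sub-modules 4051 to 4054 can be sketched in Python as follows. The function and parameter names are hypothetical; the multiplication by the adjustment factor and the range check reflect the formula and the constraint 0<β<1 that the specification gives for the playing interval.

```python
def playing_interval(regulated_wait: float, sampling_interval: float,
                     preset_delay: float, beta: float) -> float:
    """Select the playing interval of the ith frame.

    regulated_wait:    regulated waiting time of the ith frame
    sampling_interval: sampling interval of the ith frame
    preset_delay:      preset waiting delay
    beta:              adjustment factor for the playing interval, 0 < beta < 1
    """
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must satisfy 0 < beta < 1")
    if regulated_wait > preset_delay:
        # Frames have been waiting too long: shorten the interval to catch up.
        return sampling_interval * beta
    # Otherwise play at the pace at which the sender sampled the frames.
    return sampling_interval
```

Because β is smaller than 1, the first branch plays frames faster than they were sampled, which tends to drain the buffer queue when the regulated waiting time exceeds the preset waiting delay.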

[0136] According to some embodiments of the invention, the playing interval calculation sub-module 4053 is configured to calculate the playing interval of the ith image frame with the following formula:

[0137] PlayInterval_i = SampleInterval_i × β;

[0138] where PlayInterval_i represents the playing interval of the ith image frame, SampleInterval_i represents the sampling interval of the ith image frame, β represents the adjustment factor for the playing interval, and β satisfies 0<β<1.

[0139] According to some embodiments of the invention, the sampling interval calculation module 402 may include:

[0140] a time stamp extraction sub-module 4021, configured to extract, from attribute information of the ith image frame of the video, a sampling time stamp marking a time point on which the ith image frame of the video is sampled by a sender; and

[0141] a sampling interval calculation sub-module 4022, configured to calculate the sampling interval of the ith image frame based on the sampling time stamp of the ith image frame of the video and a sampling time stamp of the (i-1)th image frame of the video.
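
A minimal Python sketch of sub-modules 4021 and 4022 is given below. The `Frame` type, its field name `sample_ts_ms` and the millisecond unit are assumptions for illustration; the specification only states that the sampling time stamp is carried in the attribute information of the frame.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    # Sampling time stamp set by the sender (hypothetical field name and unit).
    sample_ts_ms: int
    payload: bytes = b""


def sampling_interval(current: Frame, previous: Frame) -> int:
    """Sampling interval of the ith frame: the difference between its sampling
    time stamp and that of the (i-1)th frame."""
    return current.sample_ts_ms - previous.sample_ts_ms
```

Since both time stamps are taken on the sender's clock, the difference does not depend on any clock offset between the sender and the receiver.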

[0142] According to some embodiments of the invention, the device 400 for processing a video may further include:

[0143] a receiving module 408, configured to receive through a network, image frames of the video sent by the sender; and

[0144] a storage module 409, configured to, each time when receiving one image frame of the video, put the received image frame of the video into the buffer queue.

[0145] According to some embodiments of the invention, the fetching module 401 may include:

[0146] a fourth obtaining sub-module 4011, configured to, in the case that the buffer queue includes a plurality of image frames of the video, select, from the plurality of image frames of the video, the image frame which is put into the buffer queue earliest; and

[0147] a fetching sub-module 4012, configured to fetch, from the buffer queue, the image frame which is put into the buffer queue earliest, as the ith image frame.
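
The receiving, storing and fetching behaviour described above amounts to a first-in, first-out queue that remembers when each frame was enqueued, so that the waiting time can be measured when the frame is fetched. The following Python sketch is one possible arrangement; the class name `FrameBuffer` and the injectable `clock` parameter are assumptions for illustration.

```python
import time
from collections import deque


class FrameBuffer:
    """FIFO buffer queue that records the enqueue time of every frame."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so the sketch can be tested
        self._queue = deque()    # entries are (enqueue_time, frame)

    def put(self, frame):
        """Store a received frame together with the time it was enqueued."""
        self._queue.append((self._clock(), frame))

    def fetch(self):
        """Fetch the frame that was enqueued earliest.

        Returns (frame, waiting_time), or None if the queue is empty.
        waiting_time runs from the moment the frame was put into the queue
        until the moment it is fetched out.
        """
        if not self._queue:
            return None
        enqueued_at, frame = self._queue.popleft()
        return frame, self._clock() - enqueued_at
```

A monotonic clock is used by default so that the measured waiting time is not affected by adjustments of the system wall clock.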

[0148] It should be noted that, since the information interaction among the modules/units of the device, the execution procedures and other contents are based on the same conception as the method embodiments of the invention, the technical effects of the device embodiments of the invention are the same as those of the method embodiments of the invention. A detailed description of the technical effects is therefore not given here; reference may be made to the descriptions in the method embodiments of the invention.

[0149] Accordingly, firstly the ith image frame of the video is fetched from the buffer queue; the sampling interval of the ith image frame, between one time point on which the ith image frame of the video is sampled and another time point on which the (i-1)th image frame of the video is sampled, is calculated; the waiting time of the ith image frame, starting when the ith image frame of the video is put into the buffer queue and ending when the ith image frame of the video is fetched out of the buffer queue, is calculated; the regulated waiting time of the ith image frame is calculated based on the waiting time of the ith image frame and the regulated waiting time of the (i-1)th image frame; the playing interval of the ith image frame is obtained based on the regulated waiting time of the ith image frame, the sampling interval of the ith image frame and the preset waiting delay; then it is determined whether the duration between the time point on which the (i-1)th image frame of the video starts to be played and the current time point is shorter than the playing interval of the ith image frame; the ith image frame of the video is played on the current time point if the duration is not shorter than the playing interval of the ith image frame.
According to the embodiment of the invention, a playing interval may be set for each image frame of a video based on the regulated waiting time, the sampling interval and the preset waiting delay. The current network delay may therefore be acquired through the regulated waiting time, the buffer time of the current image frame of the video may be determined adaptively by setting the playing interval of each frame, the images of the video may be displayed smoothly, and stable play of the images of the video is ensured. In addition, after the playing interval is set, the duration between the time point on which the previous image frame of the video is played and the current time point is compared with the set playing interval; in the case that this duration is longer than or equal to the set playing interval, the current image frame of the video is played at the current time point. The timing for playing the current image frame of the video may therefore be dynamically adjusted so that the received image of the video is played in time and the real-time performance of the video call is improved.

[0150] In the following, the application of the method for processing a video according to an embodiment of the invention in a terminal is exemplified. The terminal may be a smart phone, a tablet computer, an e-book reader, a Moving Picture Experts Group Audio Layer III Player (MP3 Player), a Moving Picture Experts Group Audio Layer IV Player (MP4 Player), a laptop, a desktop, etc.

[0151] Referring to Figure 5, a schematic structure diagram of a terminal according to an embodiment of the invention is illustrated.

[0152] The terminal may include a Radio Frequency (RF) circuit 20, a storage device 21 including one or more computer readable storage mediums, an input unit 22, a display unit 23, a sensor 24, an audio circuit 25, a wireless fidelity (WiFi) module 26, a processor 27 including one or more processing cores, a power supply 28, etc. It should be understood by those skilled in the art that the structure of the terminal shown in Figure 5 is not intended to limit the terminal: more or fewer components than shown in Figure 5 may be included in the terminal, and some components may be combined or may be in another arrangement.

[0153] The RF circuit 20 may be configured to receive and send signals while messages are received and sent or during a phone call, and in particular, to deliver downlink information of a base station to one or more processors 27 for processing, and to send uplink data to the base station. Usually, the RF circuit 20 includes, but is not limited to, an antenna, at least one amplifier, a tuner, one or more oscillators, a Subscriber Identity Module (SIM) card, a transceiver, a coupler, a Low Noise Amplifier (LNA), a duplexer, etc. In addition, the RF circuit 20 may communicate with a network or other devices through wireless communication. Any communication standard or protocol may be adopted for the wireless communication; the communication standard or protocol includes: Global System of Mobile communication (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), email, Short Messaging Service (SMS), etc.

[0154] The storage device 21 may be configured to store a software program and module. The processor 27 executes different applications and processes data by running the software program and module stored in the storage device 21. The storage device 21 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, an application program for at least one function (e.g., a function of playing audio, a function of displaying image, etc.), etc.; the data storage area may store data (e.g., audio data, telephone book, etc.) created based on the usage of the terminal, etc. In addition, the storage device 21 may include a high speed random access memory, a nonvolatile storage such as at least one magnetic disk storage or flash disk, and any solid volatile storage. Correspondingly, the storage device 21 may include a storage controller, which is configured to enable the processor 27 and the input unit 22 to access the storage device 21.

[0155] The input unit 22 may be configured to receive an input number or input character information, and generate a signal input through a keyboard, a mouse, an operating rod, an optical input device or a trackball, where the signal is associated with user configuration and function control. The input unit 22 may include a touch-sensitive surface 221 and another input device 222. The touch-sensitive surface 221, also known as a touch screen or touch panel, may capture a touch operation performed on or near the surface (e.g., an operation performed on or near the touch-sensitive surface 221 by a user with a finger, a stylus or any suitable object or accessory), and drive a corresponding connection device based on a preset program. Optionally, the touch-sensitive surface 221 may include two components: a touch detection device and a touch controller. The touch detection device is configured to detect a touch location of the user, detect the signal caused by the touch operation, and send the signal to the touch controller; the touch controller is configured to receive touch information from the touch detection device, convert the touch information into coordinates of the touch position, send the coordinates to the processor 27, and receive and execute a command from the processor 27. In addition, the touch-sensitive surface 221 may be implemented in many types, e.g., a resistive type, an infrared type, a Surface Acoustic Wave type, etc. Besides the touch-sensitive surface 221, the input unit 22 may include another input device 222. The other input device 222 includes, but is not limited to, one or any combination of a physical keyboard, a function key (e.g., a key for controlling volume, an ON/OFF key, etc.), a trackball, a mouse and an operating rod.

[0156] The display unit 23 is configured to display information input by the user, information provided to the user and different graphic user interfaces of the terminal, where those graphic user interfaces may include one or any combination of image, text, icon and video. The display unit 23 may include a display panel 231, and optionally, the display panel 231 may be in forms of Liquid Crystal Display (LCD), Organic Light-Emitting Diode (OLED), etc. Furthermore, the touch-sensitive surface 221 may cover the display panel 231, and after the touch operation is detected on or near the touch-sensitive surface 221, the touch operation is sent to the processor 27 to determine the type of a touch event, then the processor 27 provides a corresponding visual output on the display panel 231 based on the type of the touch event. Although in Figure 5, the touch-sensitive surface 221 and the display panel 231 are implemented as two independent components to achieve input and output, the touch-sensitive surface 221 and the display panel 231 may be integrated together to achieve input and output according to some embodiments of the disclosure.

[0157] The terminal may further include at least one kind of sensor 24, e.g., an optical sensor, a motion sensor and any other sensors. The optical sensor may include an ambient light sensor and a proximity sensor, where the ambient light sensor may adjust the brightness of the display panel 231 based on the intensity of ambient light, and the proximity sensor may turn off the display panel and/or a backlight when the terminal is moved near to an ear. As one kind of motion sensor, a gravity acceleration sensor may detect values of accelerations in all directions (usually three axes), and detect the value and direction of gravity when remaining stationary; the gravity acceleration sensor may be applied in an application for recognizing the posture of a mobile phone (for example, switching between landscape and portrait, a relevant game, magnetometer pose calibration), a function related to vibration recognition (for example, a pedometer, knocking), etc. In addition, other sensors, e.g., a gyroscope, a barometer, a hygrometer, a thermometer, an infrared sensor, etc., may be further provided in the terminal, the description of which is omitted herein.

[0158] The audio circuit 25, a loudspeaker 251 and a microphone 252 may provide an audio interface between the user and the terminal. The audio circuit 25 may transmit an electric signal, converted from received audio data, to the loudspeaker 251, and the loudspeaker 251 converts the electric signal into a voice signal and outputs the voice signal; on the other hand, the microphone 252 converts a captured voice signal into an electric signal, and the electric signal is received by the audio circuit 25 and converted into audio data. The audio data is output to the processor 27 for processing and then sent to another terminal via the RF circuit 20; alternatively, the audio data is output to the storage device 21 for further processing. The audio circuit 25 may further include a headset jack through which an external earphone and the terminal may be connected.

[0159] WiFi is a technology for short-distance wireless transmission. With the WiFi module 26, the terminal may assist the user in receiving and sending an email, browsing a web page, accessing streaming media, etc., and wireless broadband Internet access is provided to the user. Although the WiFi module 26 is shown in Figure 5, it should be understood that the WiFi module is not indispensable in the terminal and may be omitted in practice without changing the essence of the disclosure.

[0160] The processor 27, as a control center of the terminal, is connected to all components of the whole mobile phone via different interfaces and wires, and executes different functions of the terminal and processes data by running or executing the software program and/or module stored in the storage device 21, and invoking the data stored in the storage device 21; therefore, the whole mobile phone is monitored. Optionally, the processor 27 may include one or more processing cores; preferably, the processor 27 may be integrated with an application processor and a modem processor, where the application processor is mainly responsible for processing involved with the operating system, the user interface, the applications, etc. and the modem processor is mainly responsible for processing involved with the wireless communication. It may be understood that the modem processor may not be integrated in the processor 27.

[0161] The terminal further includes the power supply 28 (such as a battery) providing power to all the components. Preferably, the power supply may be connected to the processor 27 logically through a power management system in order to implement functions of charging management, discharging management, power consumption management, etc. The power supply 28 may further include one or more direct-current or alternating current power supplies, a recharging system, a power failure detection circuit, a power adapter or inverter, a power status indicator, etc.

[0162] The terminal may include, although not shown in Figure 5, a camera, a Bluetooth module, etc., for which the description is omitted. According to the embodiment of the disclosure, the display unit of the terminal is a touch screen display, the storage device 21 of the terminal is similar to a database, and the storage device 21 may be configured to store sampling periods, sampling intervals and a statistical frame rate.

[0163] In addition, in the terminal according to the embodiment of the invention, one or more programs are stored in the storage device 21, and the one or more processors 27 are configured to execute the one or more programs, where the one or more programs include the following operating instructions:

[0164] fetching, from a buffer queue, an ith image frame of the video, where the variable i is a natural number;

[0165] calculating a sampling interval of the ith image frame between one time point on which the ith image frame of the video is sampled and another time point on which an (i-1)th image frame of the video is sampled, where the (i-1)th image frame of the video is the image frame fetched from the buffer queue immediately before the ith image frame;

[0166] calculating a waiting time of the ith image frame, starting when the ith image frame of the video is put into the buffer queue and ending when the ith image frame of the video is fetched out of the buffer queue;

[0167] calculating a regulated waiting time of the ith image frame based on the waiting time of the ith image frame and a regulated waiting time of the (i-1)th image frame;

[0168] determining a playing interval of the ith image frame based on the regulated waiting time of the ith image frame, the sampling interval of the ith image frame and a preset waiting delay;

[0169] determining whether the duration between a time point on which the (i-1)th image frame of the video starts to be played and a current time point is shorter than the playing interval of the ith image frame; and

[0170] if the duration is not shorter than the playing interval of the ith image frame, playing the ith image frame of the video on the current time point.
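
The operating instructions listed above can be combined into a single per-frame routine. The Python sketch below is one possible arrangement under stated assumptions: the names `PlayerState` and `process_frame` are hypothetical, all times are in milliseconds, the state for the frame before the first one must be supplied by the caller, and the case in which the duration is shorter than the playing interval is handled by waiting for the remainder of the interval, as described for the playing module 407.

```python
from dataclasses import dataclass


@dataclass
class PlayerState:
    prev_sample_ts: float    # sampling time stamp of the (i-1)th frame
    prev_regulated: float    # regulated waiting time of the (i-1)th frame
    prev_play_start: float   # time point at which the (i-1)th frame started to play


def process_frame(state: PlayerState, sample_ts: float, waiting: float, now: float,
                  *, alpha: float, beta: float, preset_delay: float) -> float:
    """Handle one fetched frame and return the time point at which to play it."""
    # Sampling interval from the sender's time stamps.
    sampling_interval = sample_ts - state.prev_sample_ts
    # Regulated (smoothed) waiting time.
    regulated = alpha * state.prev_regulated + (1.0 - alpha) * waiting
    # Playing interval: shortened by beta only when the queue is running late.
    if regulated > preset_delay:
        play_interval = sampling_interval * beta
    else:
        play_interval = sampling_interval
    # Play now if enough time has passed, otherwise wait out the interval.
    elapsed = now - state.prev_play_start
    if elapsed >= play_interval:
        play_at = now
    else:
        play_at = state.prev_play_start + play_interval
    # The ith frame becomes the (i-1)th frame for the next call.
    state.prev_sample_ts = sample_ts
    state.prev_regulated = regulated
    state.prev_play_start = play_at
    return play_at
```

The values of `alpha`, `beta` and `preset_delay` are deliberately left without defaults, since the passage describes them only as preset values.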

[0171] The operating instruction of calculating the regulated waiting time of the ith image frame based on the waiting time of the ith image frame and the regulated waiting time of the (i-1)th image frame includes:

[0172] obtaining a preset adjustment factor α for the waiting time, where the adjustment factor α satisfies 0<α<1; and

[0173] calculating the regulated waiting time of the ith image frame with the following formula:

[0174] AVR_W_i = α × AVR_W_{i-1} + (1 − α) × W_i;

[0175] where AVR_W_i represents the regulated waiting time of the ith image frame, AVR_W_{i-1} represents the regulated waiting time of the (i-1)th image frame, and W_i represents the waiting time of the ith image frame.
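
As a worked illustration of the formula in [0174], assume an adjustment factor of 0.9, a regulated waiting time of 80 ms for the previous frame and a measured waiting time of 200 ms for the current frame. These numbers are chosen purely for illustration and do not appear in the specification.

```latex
\[
AVR\_W_i = 0.9 \times 80\,\text{ms} + (1 - 0.9) \times 200\,\text{ms}
         = 72\,\text{ms} + 20\,\text{ms} = 92\,\text{ms}
\]
```

Under these assumed values, a single frame that waited 200 ms raises the regulated waiting time by only 12 ms, which suggests that a factor close to 1 makes the regulated value follow sustained queueing delay rather than isolated spikes.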

[0176] The operating instruction of obtaining the playing interval of the ith image frame based on the regulated waiting time of the ith image frame, the sampling interval of the ith image frame and the preset waiting delay includes:

[0177] determining whether the regulated waiting time of the ith image frame is longer than the preset waiting delay;

[0178] if the regulated waiting time of the ith image frame is longer than the preset waiting delay, obtaining a preset adjustment factor for the playing interval, and calculating the playing interval of the ith image frame based on the adjustment factor for the playing interval and the sampling interval of the ith image frame; or

[0179] if the regulated waiting time of the ith image frame is shorter than or equal to the preset waiting delay, taking the sampling interval of the ith image frame as the playing interval of the ith image frame.

[0180] The process of calculating the playing interval of the ith image frame based on the adjustment factor for the playing interval and the sampling interval of the ith image frame includes:

[0181] calculating the playing interval of the ith image frame with the following formula:

[0182] PlayInterval_i = SampleInterval_i × β;

[0183] where PlayInterval_i represents the playing interval of the ith image frame, SampleInterval_i represents the sampling interval of the ith image frame, β represents the adjustment factor for the playing interval, and β satisfies 0<β<1.
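
As a worked illustration of the formula in [0182], assume a sampling interval of 40 ms (a sender sampling at 25 frames per second) and an adjustment factor of 0.8. Both values are assumptions chosen for illustration only.

```latex
\[
PlayInterval_i = SampleInterval_i \times \beta = 40\,\text{ms} \times 0.8 = 32\,\text{ms}
\]
```

Under these assumed values each frame is played 8 ms sooner than the sampling pace would dictate, so a backlog of roughly 80 ms would be worked off after about ten frames, provided that frames keep arriving at the sampling rate.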

[0184] Furthermore, the one or more programs, executed by the processor 27, include the following operating instructions:

[0185] in the case that the duration between a time point on which the (i-1)th image frame of the video starts to be played and a current time point is shorter than the playing interval of the ith image frame, playing the ith image frame of the video at the time point at which a duration equal to the playing interval of the ith image frame has elapsed since the (i-1)th image frame of the video started to be played.

[0186] The operating instruction of calculating the sampling interval of the ith image frame between one time point on which the ith image frame of the video is sampled and another time point on which the (i-1)th image frame of the video is sampled includes:

[0187] extracting, from attribute information of the ith image frame of the video, a sampling time stamp marking a time point on which the ith image frame of the video is sampled by a sender; and

[0188] calculating the sampling interval of the ith image frame based on the sampling time stamp of the ith image frame of the video and a sampling time stamp of the (i-1)th image frame of the video.

[0189] Furthermore, the one or more programs further include:

[0190] receiving the image frames of the video sent by the sender through a network; and

[0191] each time when receiving one image frame of the video, putting the received image frame of the video into the buffer queue.

[0192] The operating instruction of fetching, from the buffer queue, the ith image frame of the video includes:

[0193] in the case that the buffer queue includes a plurality of image frames of the video, selecting, from the plurality of image frames of the video, the image frame which is put into the buffer queue earliest; and

[0194] fetching, from the buffer queue, the image frame which is put into the buffer queue earliest, as the ith image frame.

[0195] It should be understood by those skilled in the art that part or all of the steps in the methods according to the embodiments of the disclosure may be performed by corresponding hardware instructed by a program. The program may be stored in a computer readable storage medium, and the computer readable storage medium may include: Read Only Memory (ROM), magnetic disk, Compact Disk, etc.

[0196] The method and device for processing a video provided in the disclosure are described in detail. The implementation and the application scope of the disclosure may vary within the spirit of the disclosure and the specification is not intended to limit the disclosure.