

Title:
PROVIDING AND HANDLING INFORMATION ON A STATE OF A MEDIA STREAM
Document Type and Number:
WIPO Patent Application WO/2007/091207
Kind Code:
A1
Abstract:
A media stream is multiplexed with at least one media stream of another type to transmission packets of a multimedia session. For providing information on a state of the media stream to a receiver, a sender signals dedicated state information for the media stream in-band of the multimedia session to the receiver. The receiver receives and evaluates this state information.

Inventors:
VEDANTHAM RAMAKRISHNA (US)
CHANDRA UMESH (US)
LEON DAVID (FR)
Application Number:
PCT/IB2007/050371
Publication Date:
August 16, 2007
Filing Date:
February 05, 2007
Assignee:
NOKIA CORP (FI)
VEDANTHAM RAMAKRISHNA (US)
CHANDRA UMESH (US)
LEON DAVID (FR)
International Classes:
H04L29/06
Domestic Patent References:
WO2001030045A12001-04-26
WO2003041424A22003-05-15
Other References:
ROSENBERG J ET AL: "Issues and Options for RTP Multiplexing", INTERNET CITATION, 1 October 1998 (1998-10-01), XP002149446, Retrieved from the Internet [retrieved on 20000929]
MARK HANDLEY ISI: "GeRM: Generic RTP Multiplexing", IETF STANDARD-WORKING-DRAFT, INTERNET ENGINEERING TASK FORCE, IETF, CH, vol. avt, 11 November 1998 (1998-11-11), XP015015588, ISSN: 0000-0004
Attorney, Agent or Firm:
COHAUSZ & FLORACK (Düsseldorf, DE)
Claims:

What is claimed is:

1. A method of providing information on a state of a media stream to a receiver, wherein said media stream is multiplexed with at least one media stream of another type to transmission packets of a multimedia session, said method comprising at a sender: signaling dedicated state information for said media stream in-band of said multimedia session to said receiver.

2. The method according to claim 1, wherein said state information comprises an identification of said at least one media stream.

3. The method according to claim 2, wherein said state information further comprises associated to said identification of said at least one media stream an indication of a current state of said at least one media stream.

4. The method according to claim 2, wherein said state information further comprises a number of media streams for which state information is provided.

5. The method according to claim 1, wherein said state information is signaled within said transmission packets.

6. The method according to claim 5, wherein said state information is signaled in a payload section of said transmission packets, wherein a dedicated payload type is used in case said state information is signaled within said transmission packets, and wherein a respective use of said dedicated payload type is indicated in a fixed header field of said transmission packet.

7. The method according to claim 5, wherein said state information is signaled in an extension of a packet header of said transmission packet, and wherein a respective use of an extension is indicated in a fixed field of said packet header.

8. The method according to claim 1, wherein said state information is signaled in control packets of said multimedia session.

9. The method according to claim 1, wherein state information is signaled for a media stream in case said media stream is to be omitted from said transmission packets.

10. The method according to claim 1, wherein state information for said media stream is provided in addition using out-of-band signaling and wherein new state information is provided by said in-band signaling at least until said state information has been provided using out-of-band signaling.

11. The method according to claim 1, wherein said dedicated state information for said media stream is given by dedicated sender reports for said media stream, wherein said sender reports are transmitted in control packets of said multimedia session, and wherein said sender reports are transmitted or not depending on a state of said media stream.

12. The method according to claim 11, wherein said dedicated sender reports for said media stream are transmitted periodically until said media stream is to be omitted from said transmission packets for the rest of said media session.

13. The method according to claim 1, wherein each of said media streams is distributed to Real-time Transport Protocol packets and wherein said transmission packets are Generic Real-time Transport Protocol Multiplexing packets.

14. The method according to claim 8, wherein said control packets are application defined Real-time control protocol packets.

15. The method according to claim 11, wherein said control packets are Real-time Control Protocol packets.

16. The method according to claim 1, wherein said multimedia session is a packet switched video telephony session.

17. The method according to claim 1, wherein said media streams multiplexed to transmission packets comprise at least one of an audio stream and a video stream.

18. A method of handling information on a state of a media stream, said method comprising:

receiving an in-band signaling of a multimedia session with dedicated state information for a media stream, wherein said media stream is multiplexed in received transmission packets of said multimedia session with at least one media stream of another type; and evaluating said state information.

19. An electronic device comprising: multiplexing means adapted to multiplex a media stream with at least one media stream of another type to transmission packets of a multimedia session; and processing means adapted to cause a signaling of dedicated state information for said media stream in-band of said multimedia session to a receiver.

20. An electronic device comprising: de-multiplexing means adapted to de-multiplex media streams of different types in received transmission packets of a multimedia session; and processing means adapted to evaluate dedicated state information for said media stream received in an in-band of said multimedia session.

21. A multimedia system comprising a first electronic device and a second electronic device, said first electronic device comprising multiplexing means adapted to multiplex a media stream with at least one media stream of another type to transmission packets of a multimedia session, and processing means adapted to cause a signaling of dedicated state information for said media stream in-band of said multimedia session to said second electronic device; and said second electronic device comprising demultiplexing means adapted to de-multiplex media streams of different types in received transmission packets of a multimedia session, and processing means adapted to evaluate dedicated state information for said media stream received in an in-band of said multimedia session.

22. A software program product in which a software code for providing information on a state of a media stream to a receiver is stored in a readable medium, said media stream being multiplexed with at least one media stream of another type to transmission packets of a multimedia session, said software code realizing the following step when being executed by a processor: causing a signaling of dedicated state information for said media stream in-band of said multimedia session to said receiver.

23. A software program product in which a software code for handling information on a state of a media stream is stored in a readable medium, said software code realizing the following step when being executed by a processor: receiving an in-band signaling of a multimedia session with dedicated state information for a media stream, wherein said media stream is multiplexed in received transmission packets of said multimedia session with at least one media stream of another type; and evaluating said state information.

Description:

Providing and handling information on a state of a media stream

FIELD OF THE INVENTION

The invention relates to methods of providing and handling information on a state of a media stream. The invention relates equally to corresponding electronic devices, to a corresponding multimedia system and to corresponding software program products.

BACKGROUND OF THE INVENTION

Real-time Transport Protocol (RTP) is the preferred transport protocol for packet switched multimedia telephony applications.

RTP has been described for example in Request for Comments (RFC) 3550: "RTP: A Transport Protocol for Real-Time Applications", July 2003, by S. Casner, R. Frederick and V. Jacobson, which is incorporated by reference herein. The document describes more specifically the RTP, which is used for carrying data that has real-time properties, and the RTP control protocol (RTCP), which is used for monitoring the quality of service and for conveying information about the participants in an on-going session.

In the IETF Internet Draft draft-ietf-avt-aggregation-00.txt: "An RTP Payload Format for User Multiplexing", May 6, 1998, J. Rosenberg and H. Schulzrinne have described an RTP payload format for multiplexing data from multiple users into a single RTP packet. Such a multiplexing can be carried out in different ways. Among them, the approach described by B. Thompson, T. Koren and D. Wing in RFC 4170: "Tunneling Multiplexed Compressed RTP (TCRTP)", November 2005, is the only scheme being actively pursued.

If both audio and video data are used in a multimedia telephone conference, they are transmitted as separate RTP sessions, where each session is a separate association among a set of participants communicating with RTP. Thus, separate RTP and RTCP packets are transmitted for each medium.

Generic RTP multiplexing (GeRM) is another way of RTP multiplexing. With GeRM, RTP packets from various media sources are multiplexed to form a single large RTP packet stream. GeRM was originally proposed to multiplex RTP packets of the same media type but different sources, in IETF Internet Draft draft-ietf-avt-germ-00.txt: "GeRM: Generic RTP Multiplexing", November 11, 1998, by Mark Handley.

The basic approach of GeRM is that a single User Datagram Protocol (UDP) or Transmission Control Protocol (TCP) packet in an Internet Protocol (IP) packet contains multiple RTP headers each followed by its own payload. Such an RTP header followed by its payload is referred to as a sub-packet. Each GeRM packet has a full GeRM RTP header, which contains the synchronization source identifier (SSRC), the sequence number, the timestamp, etc., corresponding to the first sub-packet payload, but the RTP payload type field is set to a value indicating that this is a GeRM packet.

The first sub-packet header will encode only the differences to this full RTP header and the next sub-packet headers will encode only the differences to the respective preceding sub-packet header. Thus, each sub-packet header is compressed.

The use of GeRM for multiplexing RTP packets of all cooperating media streams in a packet switched video telephony (PSVT) session has been proposed in the 3GPP2 TSG-C1.2 contribution document PA C12-20030718-009: "Generic RTP Multiplexing (GeRM)" by Harinath Garudadri, Qualcomm Inc.

Examples of co-operating media streams include video and audio streams in a PSVT session. Consider a 10 fps video stream. A video frame is output every 100 ms. The audio frames, in general, have a length of 20 ms. Thus for every video frame, there are five audio frames. Each video frame may be compressed and encapsulated in one RTP packet and each audio frame may be compressed and encapsulated in one RTP packet. GeRM can then be used to multiplex the video RTP packet and five audio RTP packets to form a composite GeRM packet.

In such a composite GeRM packet that contains n RTP packets of different media streams, there is equally one GeRM RTP header for the first RTP packet and there are (n-1) sub-headers, one for each of the (n-1) multiplexed RTP packets. The details of various fields in the GeRM header and the sub-headers can be found in the above cited document "Generic RTP Multiplexing (GeRM)" and in the above cited document "GeRM: Generic RTP Multiplexing".
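The packet arithmetic of this example can be checked with a short sketch. The function names are illustrative assumptions and not part of any GeRM document; the sketch only assumes a constant video frame rate and a fixed audio frame duration.

```python
def audio_frames_per_video_frame(video_fps: int, audio_frame_ms: int) -> int:
    """Number of whole audio frames produced during one video frame interval."""
    video_interval_ms = 1000 // video_fps  # e.g. 10 fps gives 100 ms
    return video_interval_ms // audio_frame_ms


def germ_header_counts(rtp_packets: int) -> tuple[int, int]:
    """Return (full GeRM RTP headers, sub-headers) for n multiplexed RTP packets."""
    if rtp_packets < 1:
        raise ValueError("a GeRM packet carries at least one RTP packet")
    return 1, rtp_packets - 1


# One video RTP packet plus the audio RTP packets of the same interval.
audio = audio_frames_per_video_frame(video_fps=10, audio_frame_ms=20)
full_headers, sub_headers = germ_header_counts(audio + 1)
```

With the figures from the text, this yields five audio frames per video frame, and a composite packet of six RTP packets carries one full header and five sub-headers.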

The above cited document "Generic RTP Multiplexing (GeRM)" lists several advantages of GeRM. It is indicated that GeRM provides improved inter-media synchronization between the multiplexed media flows because variable jitter on different flows is avoided. It is further indicated that GeRM provides alternatives to classic header compression schemes like robust header compression (ROHC), because it compresses the RTP headers of the sub-packets and removes the need for separate UDP/IP headers for the sub-packets. It is further indicated that in a two-way PSVT call, GeRM provides a mechanism for a faster feedback than RTCP reports. This is achieved by including the feedback about received streams in the GeRM packets of the outgoing stream. Finally, it is indicated that support for GeRM is necessary only in the end terminals; no other network elements need to support it. GeRM support is also optional in the terminals. GeRM support can be negotiated by the terminals in a capability exchange phase before the PSVT session.

To support GeRM, certain parameters that describe the structure of GeRM packets, like payload types and default payload lengths, must be signaled out-of-band before the PSVT session begins. This signaling supports an efficient compression of the sub-packet headers. For example, a "payload length" field would be required in each sub-packet header for GeRM de-multiplexing at the receiver. This field would require additional 1-2 bytes for each sub-header. However, if the GeRM packet is structured such that all audio RTP packets in a GeRM packet have a fixed length (in bytes) and that the video packet of variable length is included at the end of the GeRM packet, then the field "payload length" can be omitted from the sub-packet headers. The fixed payload length of the audio RTP packets may be signaled out-of-band.
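A minimal sketch of why the length field becomes unnecessary under this layout is given below. It deliberately ignores the compressed sub-packet headers and treats the packet body as plain concatenated payloads; the function name and the assumption that the number of audio payloads is known to the receiver are illustrative only.

```python
def split_germ_payloads(body: bytes, audio_len: int, audio_count: int):
    """Split a body into fixed-length audio payloads and one trailing video payload.

    audio_len is the fixed audio payload length signaled out-of-band;
    audio_count is assumed to be known to the receiver (hypothetical).
    """
    if audio_len <= 0 or audio_count < 0:
        raise ValueError("invalid audio layout")
    audio_total = audio_len * audio_count
    if len(body) < audio_total:
        raise ValueError("body too short for the signaled audio layout")
    audio = [body[i * audio_len:(i + 1) * audio_len] for i in range(audio_count)]
    video = body[audio_total:]  # the remainder is the variable-length video payload
    return audio, video


audio_payloads, video_payload = split_germ_payloads(b"AAAABBBBvideo", 4, 2)
```

Because the video payload is placed last, its length follows from the total body length, so no per-sub-packet length field is needed in this simplified model.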

The required parameters can be included in a Session Description Protocol (SDP) file that is exchanged as a part of the Session Initiation Protocol (SIP) INVITE message initiating the PSVT session.

It has to be taken into account, however, that the structure of GeRM packets may change during an ongoing session for various reasons.

During a PSVT session, for instance, for which a sender originally multiplexed audio and video RTP packets using GeRM, the sender may enter for example any one of the following states:

(a) Video payloads are omitted by the sender due to downgraded Quality of Service (QoS) on its uplink.

(b) The video source is turned off by the sender for the rest of the PSVT session, for whatever reason.

(c) The video input is paused by the sender for a brief period and resumed after this period.

The receiver has been receiving GeRM packets that have both audio and video sub-packets. Suddenly, it starts to receive GeRM packets that have only audio RTP packets, and it can make various assumptions on the sender's state. The receiver's application behavior will change depending on the receiver's assumption of the sender's state.

For example, if the receiver assumes that the video source is turned off by the sender - as in cases (a) and (b) mentioned above -, then it can terminate the video decoder/renderer thread of the PSVT application. If it assumes, in contrast, that the video input is only paused by the sender - as in case (c) mentioned above -, then it can keep its video decoder/renderer thread open for a certain duration.

However, if the receiver makes a wrong assumption of the sender's state, then the user experience degrades severely. To avoid this problem, the sender's state should be signaled to the receiver. The user experience also degrades if this signaling arrives too late.

In conventional approaches adopted by 3GPP Packet Switched Conversational Services (PSC) applications, described in Technical Specification 3GPP TS 26.236 V6.4.0 (2005-09): "Packet switched conversational multimedia applications; Transport protocols", Release 6, GeRM is not used at all, and all constituent media in PSVT calls are transported as individual RTP packet streams. If there are media level changes during the PSVT session, then these are signaled by SIP UPDATE mechanisms.

For this mechanism, a session description protocol (SDP) file is signaled once at the beginning of the session, for example as a part of a SIP INVITE message. During the PSVT session, if the sender drops a media stream, then it has to signal this information to the receiver. This information is normally signaled by SIP UPDATE messages.

The technical specification 3GPP TS 24.228 V5.14.0 (2005-12): "Signalling flows for the IP multimedia call control based on Session Initiation Protocol (SIP) and Session Description Protocol (SDP); Stage 3", Release 5, describes various signaling flows for initiation, termination and modifications of IP Multimedia Subsystem (IMS) multimedia sessions. 3GPP TS 24.228 also discusses signaling flows for hold and resume of media flows.

The signaling of hold and resume of media in IMS based multimedia telephony sessions is performed by SIP UPDATE messages. For example, when a media flow is to be held, the SIP UPDATE (hold) message includes an SDP file with a media level line a=inactive corresponding to the held media. Similarly when the held media is to be resumed, another SIP UPDATE (resume) message includes another SDP file with media level line a=sendrecv corresponding to the resumed media.
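The effect of such a hold or resume on the SDP body can be sketched as follows. This is an illustrative helper, not taken from the cited specifications: the function name is hypothetical, the port numbers and payload types in the sample are made up, and real SIP stacks handle many more SDP details.

```python
DIRECTIONS = {"sendrecv", "sendonly", "recvonly", "inactive"}


def set_media_direction(sdp: str, media: str, direction: str) -> str:
    """Set the media-level direction attribute of every section of one media type."""
    if direction not in DIRECTIONS:
        raise ValueError(f"unknown direction: {direction}")
    new_attr = f"a={direction}"
    out = []
    in_target = False  # inside a section of the requested media type
    done = False       # direction already written for the current section
    for line in sdp.splitlines():
        if line.startswith("m="):
            if in_target and not done:
                out.append(new_attr)  # section had no direction attribute
            in_target = line.startswith(f"m={media} ")
            done = False
        elif in_target and line.startswith("a=") and line[2:] in DIRECTIONS:
            if done:
                continue  # drop duplicate direction attributes
            line = new_attr
            done = True
        out.append(line)
    if in_target and not done:
        out.append(new_attr)
    return "\r\n".join(out) + "\r\n"


offer = (
    "v=0\r\n"
    "m=audio 49170 RTP/AVP 0\r\n"
    "a=sendrecv\r\n"
    "m=video 51372 RTP/AVP 31\r\n"
    "a=sendrecv\r\n"
)
hold = set_media_direction(offer, "video", "inactive")
resume = set_media_direction(hold, "video", "sendrecv")
```

Here `hold` differs from `offer` only in the video section, which models the SDP carried by a SIP UPDATE (hold) message, and `resume` restores the original description.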

The signaling of media level changes through SIP UPDATE mechanisms involves long delays on the order of a few seconds, though. The in-band RTCP reports may help the receiver to determine the sender's state. For example, if the receiver does not get video RTP packets but keeps getting the corresponding RTCP sender reports, then it can safely assume that the sender is on hold or pause. The RTCP sender reports are thus used to signal link aliveness. The SIP UPDATE (hold) message that comes after some time may confirm the media hold. The sender can keep on sending RTCP sender reports corresponding to the video source as long as it intends to pause the video source. If the sender intends to shut down the video source, then it may stop sending the corresponding RTCP reports. Thus, when the receiver does not receive any RTCP reports for a certain duration, it can safely close the video decoder/renderer thread.
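The receiver-side reasoning described here can be sketched as a small classification function. The function name, the state labels and both timeout values are assumptions chosen for illustration; the text does not specify any durations.

```python
def infer_video_state(now: float, last_video_rtp: float, last_video_sr: float,
                      rtp_timeout: float = 1.0, sr_timeout: float = 15.0) -> str:
    """Guess the sender's video state from the arrival times of RTP packets and sender reports.

    All times are in seconds on the receiver's clock. The timeouts are
    hypothetical tuning values, not taken from any specification.
    """
    if now - last_video_rtp <= rtp_timeout:
        return "active"   # video RTP packets are still arriving
    if now - last_video_sr <= sr_timeout:
        return "paused"   # no media, but sender reports keep the stream alive
    return "stopped"      # neither media nor reports: close the decoder thread
```

For example, with the default timeouts, a receiver that last saw video media five seconds ago but a sender report two seconds ago would classify the stream as paused.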

In the case of GeRM, the RTCP behavior is not specified yet, though. If conventional RTCP sender reports were provided for a composite GeRM RTP stream, the receiver could not determine from the received RTCP sender reports which of the audio and video sources is on hold. The signaling of state changes of a media stream through SIP UPDATE mechanisms involves delays that are too long, as mentioned above.

Thus the existing mechanisms to signal the media level changes are not suitable for the case of GeRM.

SUMMARY OF THE INVENTION

It is an object of the invention to provide possibilities for informing a receiver in a multimedia session about a state of a particular media stream, which is multiplexed in the multimedia session with at least one media stream of another type into transmission packets.

For a sender, a method of providing information on a state of a media stream to a receiver is proposed. The media stream is multiplexed with at least one media stream of another type to transmission packets of a multimedia session. The method comprises signaling dedicated state information for the media stream in-band of the multimedia session to the receiver.

Moreover, a first electronic device is proposed. The electronic device comprises multiplexing means adapted to multiplex a media stream with at least one media stream of another type to transmission packets of a multimedia session. The electronic device further comprises processing means adapted to cause a signaling of dedicated state information for the media stream in-band of the multimedia session to a receiver.

Moreover, a first software program product is proposed, in which a software code for providing information on a state of a media stream to a receiver is stored in a readable medium. The media stream is multiplexed with at least one media stream of another type to transmission packets of a multimedia session. When being executed by a processor, the software code causes a signaling of dedicated state information for the media stream in-band of the multimedia session to the receiver.

For a receiver, a method is proposed, which comprises receiving an in-band signaling of a multimedia session with dedicated state information for a media stream. The media stream is multiplexed in received transmission packets of the multimedia session with at least one media stream of another type. The method further comprises extracting and evaluating the state information.

Moreover, a second electronic device is proposed. The electronic device comprises de-multiplexing means adapted to de-multiplex media streams of different types in received transmission packets of a multimedia session. The electronic device further comprises processing means adapted to evaluate dedicated state information for the media stream received in an in-band of the multimedia session.

Moreover, a second software program product is proposed, in which a software code for handling information on a state of a media stream is stored in a readable medium. When being executed by a processor, the software code realizes the proposed second method.

Finally, a multimedia system is proposed, which comprises the proposed first electronic device and the proposed second electronic device.

Multiplexing several media streams to transmission packets of a multimedia session means that each transmission packet may comprise data from all media streams of which data is currently to be provided to the receiver in the scope of the multimedia session. It is to be understood that the state information does not have to be provided for each transmission packet. It is further to be understood that it may be provided only for selected states of the media stream.

The invention proceeds from the consideration that an out-of-band signaling of the state of multiplexed media streams involves too large delays to ensure a good user experience. The invention proceeds further from the consideration that an in-band signaling which relates to the entire stream of transmission packets with multiplexed media streams does not provide sufficiently detailed information to a receiver to ensure a good user experience. It is therefore proposed that the sender's state is provided in an in-band signaling of a multimedia session, including dedicated state information for a respective media stream.

It is an advantage of the invention that it enables the receiver to know the sender's state immediately. The receiver can use this timely information to ensure good user experience.

The state information may be provided in a predetermined format whenever it is required. It may comprise an identification of at least one media stream. Further, it may optionally comprise, associated to a respective identification of at least one media stream, an indication of a current state of this at least one media stream. The state may be in particular, though not exclusively, a reason why the media stream is dropped. Further, the state information may optionally comprise an indication of the total length of included state information. If the state information for a respective media stream has a fixed length, the total length of the state information may be indicated for instance with the number of media streams for which state information is provided.
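One possible byte layout for such state information is sketched below: a count of media streams followed by a fixed-length entry (stream identification, state indication) per stream. The layout, the one-byte field sizes and the numeric state codes are assumptions made purely for illustration; they are not the format defined by the invention.

```python
from enum import IntEnum


class StreamState(IntEnum):
    # Hypothetical codes; the text does not assign numeric values.
    QOS_DROPPED = 1   # omitted due to downgraded QoS
    TURNED_OFF = 2    # source off for the rest of the session
    PAUSED = 3        # omitted temporarily


def encode_state_info(entries: list[tuple[int, StreamState]]) -> bytes:
    """Encode as: 1 byte stream count, then (1 byte stream id, 1 byte state) per stream."""
    if len(entries) > 255:
        raise ValueError("too many streams for a one-byte count")
    out = bytearray([len(entries)])
    for stream_id, state in entries:
        out += bytes([stream_id, int(state)])  # raises ValueError if id is not 0..255
    return bytes(out)


def decode_state_info(data: bytes) -> list[tuple[int, StreamState]]:
    """Inverse of encode_state_info; the count gives the total length."""
    if not data:
        raise ValueError("empty state information")
    count = data[0]
    if len(data) < 1 + 2 * count:
        raise ValueError("truncated state information")
    return [(data[1 + 2 * i], StreamState(data[2 + 2 * i])) for i in range(count)]


encoded = encode_state_info([(7, StreamState.PAUSED)])
decoded = decode_state_info(encoded)
```

Because each entry has a fixed length in this sketch, the stream count alone determines the total length of the state information, as the text suggests.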

Such state information may be signaled in-band within the transmission packets themselves or in control packets of the multimedia session.

If the state information is to be transmitted within the transmission packets, it may be included in a payload section or in a header section of the transmission packets.

If the state information is to be included in a payload section of transmission packets, a dedicated payload type may be used for the transmission packet whenever state information is actually signaled within the transmission packet. Whether the dedicated payload type is actually used for a respective transmission packet may then be indicated in a fixed header field of the transmission packet.

If the state information is to be included in a header section of the transmission packets, the state information may be signaled in an extension of a packet header of the transmission packet. Whether an extension is actually used for a respective transmission packet header may then be indicated in a fixed field of the transmission packet header.

State information may be signaled for a media stream in particular in case the media stream is to be omitted from the transmission packets.

State information on a new state of a media stream is advantageously provided by in-band signaling as proposed at least until a corresponding out-of-band signaling is provided. Such an out-of-band signaling may be for instance a SIP UPDATE message.

Instead of providing the state information explicitly using a dedicated format, the state information may also be provided implicitly. The dedicated state information for a media stream may be given for example by dedicated sender reports for each media stream, which are transmitted in control packets of the multimedia session. Whether or not a sender report is transmitted depends on a state of the media stream.

Such dedicated sender reports for a media stream may be transmitted periodically until the media stream is to be omitted from the transmission packets for the rest of the media session. That is, whenever the media stream is to be omitted only temporarily, the sender reports are still transmitted. When the receiver does not receive sender reports anymore for a particular media stream, it will thus know that transmission of this media stream has been stopped completely.
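The sender-side rule for this implicit signaling can be sketched as a simple filter over the per-stream states. The enum, its member names and the function name are hypothetical and serve only to illustrate the rule.

```python
from enum import Enum


class MediaState(Enum):
    ACTIVE = "active"    # stream is present in the transmission packets
    PAUSED = "paused"    # stream is omitted only temporarily
    STOPPED = "stopped"  # stream is omitted for the rest of the session


def streams_to_report(states: dict[str, MediaState]) -> list[str]:
    """Media streams for which a periodic sender report is still to be sent."""
    return sorted(name for name, state in states.items()
                  if state is not MediaState.STOPPED)
```

A paused video stream thus keeps its sender reports, whereas a stopped one no longer appears in the result, which is what lets the receiver tell the two cases apart.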

Signaling state information within the transmission packets has the advantage that it is more bandwidth efficient than the use of dedicated sender reports, since the additional fields in the transmission packet headers add only a few bits of overhead. In this case, sender reports may be transmitted for the transmission packet stream as a whole.

Signaling state information by employing sender reports for each constituent media stream or by employing other types of control packets for a respective media stream, in contrast, has the advantage that it does not require any header extensions.

At the receiver side, the signaled state information can be evaluated for example for deciding on whether a decoder thread for a dropped multimedia stream is to be kept open or not.

The multiplexing of media streams to transmission packets may be carried out according to any suitable protocol. In an exemplary embodiment, the media streams are distributed to RTP packets and the transmission packets are GeRM packets.

In case control packets are employed for signaling explicit state information, these control packets may be application-defined RTCP (APP) packets.

In case control packets are employed for signaling implicit state information, these control packets may be regular RTCP packets.

The multimedia session can be for instance, though not necessarily, a PSVT session.

In a PSVT session using GeRM for RTP multiplexing of co-operating media streams, for example, if one or more of the media streams are dropped from the PSVT session, the sender may thus signal this information within the GeRM packets that are affected by this change. Alternatively, the sender may send RTCP sender reports separately for each media source. On the sender side, RTCP sender reports are advantageously generated before GeRM multiplexing. On the receiver side, the RTCP sender reports are advantageously interpreted after demultiplexing of the GeRM packets.

The invention can be employed for example, though not exclusively, for GeRM applications that are used for PSVT applications in Third Generation Partnership Project 2 (3GPP2) networks.

The media streams multiplexed to transmission packets may comprise for instance an audio stream and/or a video stream.

It is to be understood that any of the presented exemplary embodiments of the proposed methods can be implemented in the proposed devices, system and software program products as well.

Other objects and features of the present invention will become apparent from the following detailed description considered in conjunction with the accompanying drawings. It is to be understood, however, that the drawings are designed solely for purposes of illustration and not as a definition of the limits of the invention, for which reference should be made to the appended claims. It should be further understood that the drawings are not drawn to scale and that they are merely intended to conceptually illustrate the structures and procedures described herein.

BRIEF DESCRIPTION OF THE FIGURES

Fig. 1 is a schematic block diagram of a data transmission system according to an embodiment of the invention;

Fig. 2 is a flow chart illustrating a first embodiment of a method according to the invention, which is employed at a sender side in the system of Figure 1;

Fig. 3 is a diagram illustrating an exemplary in-band signaling format that may be employed in the system of Figure 1;

Fig. 4 is a flow chart illustrating an embodiment of a method according to the invention, which is employed at a receiver side in the system of Figure 1; and

Fig. 5 is a flow chart illustrating a second embodiment of a method according to the invention, which is employed at a sender side in the system of Figure 1.

DETAILED DESCRIPTION OF THE INVENTION

Figure 1 is a schematic block diagram of a data transmission system, which enables a sender using GeRM to inform a receiver about a changed state of a media stream in accordance with an exemplary embodiment of the invention.

The system may be used for instance for a PSVT conference. It comprises a first electronic device 10 operating at least as a sender and a second electronic device 20 operating at least as a receiver.

The sender 10 comprises an audio source 11, a video source 12 and a GeRM multiplexer 13. The audio source 11 may comprise for instance a microphone, an audio encoder and an RTP packetizer generating RTP packets from captured and encoded audio signals. The video source 12 may comprise for instance a camera, a video encoder and an RTP packetizer generating RTP packets from captured and encoded video signals. The GeRM multiplexer 13 is adapted to multiplex audio and video RTP packets for a PSVT session into GeRM packets. The sender 10 moreover includes an RTCP packet handler 14 for generating RTCP sender reports and for receiving RTCP receiver reports.

Furthermore, the sender 10 includes a user interface 15 enabling a user to enter commands.

The sender 10 further includes a processing unit 16 that is enabled to cause an in-band signaling with separate state information for each media stream. The processing unit 16 can be implemented in hardware and/or software. It may include for instance a processor executing corresponding software code. Alternatively, it may include for instance a chip or a chipset with an integrated circuit realizing the required functions. The processing unit 16 may be linked to all other depicted components.

It is to be understood that the sender 10 includes a variety of additional components not shown. If each RTP packet is to be included in a UDP or TCP packet, which is further included in an IP packet, for instance, the sender 10 may include means for adding UDP or TCP and IP headers to each GeRM packet and to each RTCP packet. In addition, the sender 10 may include a transceiving component enabling a link to the receiver 20, like a radio transceiver or an interface to the Internet. In accordance with above cited RFC 3550, one port of an established link is used for media data, and another is used for control (RTCP) packets.

The receiver 20 comprises a GeRM demultiplexer 21. The GeRM demultiplexer 21 is linked on the one hand via an RTP de-packetizer 22 and an audio decoder 23 to an audio renderer 24, for instance loudspeakers. The GeRM demultiplexer 21 is linked on the other hand via an RTP de-packetizer 25 and a video decoder 26 to a video renderer 27, for instance a display.

The receiver 20 further includes an RTCP packet handler 28 for receiving RTCP sender reports and for generating RTCP receiver reports.

The receiver 20 moreover includes a processing unit 29 that is adapted to interpret an in-band signaling that is received. The processing unit 29 can be implemented in hardware and/or software. It may include for instance a processor executing corresponding software code. Alternatively, it may include for instance a chip or a chipset with an integrated circuit realizing the required functions. The processing unit 29 may be linked to all other depicted components.

It is to be understood that the receiver 20 includes a variety of additional components not shown. It includes for instance means for removing IP and UDP or TCP headers from each received GeRM packet and each RTCP packet, and a transceiving component enabling a link to a sender 10, like a radio transceiver or an interface to the Internet.

A first possible in-band signaling according to the invention in the system of Figure 1 will now be described with reference to Figures 2-4.

Figure 2 is a flow chart illustrating an operation at the sender 10.

For a PSVT session, the GeRM multiplexer 13 receives RTP audio packets from the audio source 11 and RTP video packets from the video source 12. The GeRM multiplexer 13 multiplexes the RTP packets to a single GeRM RTP stream for transmission to the receiver 20. It has to be noted that other media streams from other media sources could be included in the multiplexing as well. In parallel, the RTCP packet handler 14 periodically generates RTCP sender reports separately for each media stream and receives RTCP receiver reports. The RTCP receiver reports include a feedback from the receiver 20 about the QoS of received packets. The information for the RTCP sender reports may be provided by the processing unit 16, and the information contained in the RTCP receiver reports may be provided to the processing unit 16 for evaluation.

The structure of the GeRM packets may now change during the PSVT session upon request, as mentioned above. The video packets may be omitted by the sender 10 due to downgraded QoS on its uplink. This may be indicated for instance by a feedback from the receiver 20 in the RTCP receiver reports. The sender 10 may switch the video source off for the rest of the session. This may be initiated for instance by a user by means of the user interface 15. The sender 10 may interrupt the video input temporarily and resume it after this period. This may equally be initiated for instance by a user by means of the user interface 15. It is to be understood that other changes are possible as well. The same state changes could be caused for the audio stream or for any other included media stream.

The processing unit 16 monitors the arrival of state change requests from all possible kinds of origins, for instance from the user interface 15 or from the RTCP packet handler 14 (step 201).

In case the processing unit 16 detects that a state change is requested, it determines the type of the requested state change, which may be one of the types indicated above (step 202).

If the request involves stopping the transmission of one of the media streams (step 203), for instance of the video stream, the corresponding media packets are omitted from the next GeRM packets (step 204). This can be achieved by informing the GeRM multiplexer 13 accordingly. Alternatively or in addition, the media source 11, 12 providing the concerned media stream may be informed so that it stops capturing data and providing RTP packets to the GeRM multiplexer 13 in the first place.

In addition, information about the state change is transmitted in an in-band signaling to the receiver 20 (step 205).

This signaling includes the identity of the concerned media stream and optionally the state of the media stream, that is, the reason for dropping the media stream.

A possible format of the transmitted information including both the identity of the media stream and the state of the media stream is illustrated in Figure 3.

The information includes the identity of each stopped media stream, for example MediaID - 1, MediaID - 2, ... , and/or MediaID - N. The identity of each media stream may be described using three bits, since it is unlikely that more than N=8 RTP streams are GeRM multiplexed in a PSVT session. In the presented example, two RTP streams are GeRM multiplexed.

The information further includes the state of each stopped media stream, State - 1, State - 2, ... , and/or State - N. This information can equally be coded using three bits. Thus, a total of eight different states or reasons are possible for each media. In the presented example, three states are identified for a dropped media stream. As indicated above, the media payloads may be omitted by the sender due to downgraded QoS on its uplink, the media source may be turned off by the sender for the rest of the PSVT session, or the media input may be paused by the sender for a brief period and resumed later on. Additional states can be defined depending on the application.

Thus, six bits are needed to identify a particular media stream and to describe its state. Since up to eight different media streams are supported, a maximum of 6*8=48 bits are needed to describe the states of all media streams. Consequently an additional three bit length field can be used to describe the length of the header extension in multiples of six bits. The maximum total length of the conveyed information is thus 3 + 48 = 51 bits.

In the format of Figure 3, bits 1-3 indicate the number N of media streams for which information is included, bits 4-6 indicate a first media stream identity, bits 7-9 indicate the state of the first media stream, bits 10-12 indicate a second media stream identity, bits 13-15 indicate the state of the second media stream, etc.
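By way of illustration only, the bit layout of Figure 3 may be sketched as follows. The numeric state values, the function names, the treatment of bit 1 as the most significant bit and the zero padding to a whole number of octets are assumptions made for this sketch and are not taken from the description. The sketch also limits N to seven entries, since a plain three-bit count cannot represent the value eight.

```python
from enum import IntEnum


class StreamState(IntEnum):
    # Hypothetical code points: the description names three reasons
    # for a dropped stream but does not assign numeric values.
    QOS_DROP = 0    # payloads omitted due to downgraded uplink QoS
    SHUT_DOWN = 1   # source turned off for the rest of the session
    PAUSED = 2      # input paused briefly, to be resumed later


def pack_state_info(entries):
    """Pack (media_id, state) pairs into the Figure 3 layout.

    Bits 1-3 carry the number N of entries; each entry then takes
    three bits for the media identity and three bits for the state.
    """
    if len(entries) > 7:
        raise ValueError("a 3-bit count holds at most 7 in this sketch")
    bits = len(entries)
    for media_id, state in entries:
        if not 0 <= media_id <= 7 or not 0 <= int(state) <= 7:
            raise ValueError("media ID and state must each fit in 3 bits")
        bits = (bits << 3) | media_id
        bits = (bits << 3) | int(state)
    nbits = 3 + 6 * len(entries)
    pad = (-nbits) % 8  # assumed zero padding up to an octet boundary
    return (bits << pad).to_bytes((nbits + pad) // 8, "big")


def unpack_state_info(data):
    """Inverse of pack_state_info; returns a list of (media_id, state)."""
    total = len(data) * 8
    if total < 3:
        raise ValueError("no count field present")
    value = int.from_bytes(data, "big")

    def take(offset):
        # Read the 3-bit field starting `offset` bits from the top.
        return (value >> (total - offset - 3)) & 0b111

    count = take(0)
    if 3 + 6 * count > total:
        raise ValueError("truncated state information")
    return [(take(3 + 6 * i), take(6 + 6 * i)) for i in range(count)]
```

With one paused stream having the identity 1, the packed information occupies nine bits, which this sketch pads to two octets.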

The in-band signaling making use of this format can be realized in various ways.

The sender 10 may signal the information for instance inside the GeRM packets that are affected by a state change.

In a first embodiment of this approach, a first payload type (PT) is defined which indicates that the packet is a GeRM packet, while a second PT is defined which indicates that the GeRM packet additionally contains the format defined in Figure 3.

GeRM packets are expected to use a dynamic payload type, in the range of 96-127. The GeRM packet stream should be assigned a PT dynamically. This assignment can be signaled out-of-band, for example using SIP, Real Time Streaming Protocol (RTSP), Session Announcement Protocol (SAP) or H.323.

For a session using only normal GeRM packets, the following lines could be added to an SDP file that is exchanged during the session setup:

m=GERM 49170 RTP/AVP 98
a=rtpmap:98 xxxxxxxxxxxxxx

Similarly, a new dynamic PT could be assigned to the GeRM packets that contain the format defined in Figure 3. This PT should obviously be different from the PT of the normal GeRM packets. For a multimedia session that contains both types of GeRM packets, the SDP could contain for example the following assignment:

m=GERM 49170 RTP/AVP 98
a=rtpmap:98 xxxxxxxxxxxxxx
m=GERM 51372 RTP/AVP 105
a=rtpmap:105 yyyyyyyyyyyyy

The strings xxxxxxxxxxxxxx and yyyyyyyyyyyyy in the above examples are placeholders that may be custom defined depending on the constituent payloads of the GeRM packets.
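A receiver could, for instance, derive the payload type assignments from such an SDP description as sketched below. The function name is hypothetical, only the m= and a=rtpmap lines are handled, and the lines are assumed to be written without spaces around the "=" and ":" separators.

```python
def payload_types(sdp_text):
    """Return {payload_type: encoding_name} from m= and a=rtpmap lines."""
    mapping = {}
    for line in sdp_text.splitlines():
        line = line.strip()
        if line.startswith("m="):
            # Fields are: media, port, transport, then the format list.
            for pt in line[2:].split()[3:]:
                mapping.setdefault(int(pt), None)
        elif line.startswith("a=rtpmap:"):
            pt, _, encoding = line[len("a=rtpmap:"):].partition(" ")
            mapping[int(pt)] = encoding
    return mapping
```

Applied to the second SDP example, the result maps PT 98 and PT 105 to their respective placeholder strings, so that the GeRM demultiplexer can tell from the PT field which packets carry the format of Figure 3.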

The format defined in the Figure 3 is preferably included immediately after a GeRM RTP header including a corresponding PT indication and before any other payload headers.

In a second embodiment of the approach, in which the sender 10 signals the format of Figure 3 inside the GeRM packets, the format is included in a header extension.

RFC 3550 already defines the fourth bit of each RTP header to be an extension (X) field of one bit. If the extension bit is set, the fixed header must be followed by exactly one variable-length header extension. Such a field can also be made use of in a GeRM RTP header for the proposed in-band signaling.

In case the transmission of data for one media stream is stopped, the "X" bit in the GeRM RTP header is set to indicate the existence of a header extension.

The GeRM RTP header extension may then contain the information in the format presented in Figure 3.
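The use of the extension bit may be illustrated by the following sketch, which follows the fixed RTP header and header extension layout of RFC 3550. The function names and the profile identifier value 0x0000 are assumptions, since the description does not define an identifier for a GeRM header extension. The extension data is zero padded to a multiple of 32 bits, as the length field of RFC 3550 counts 32-bit words.

```python
import struct


def build_rtp_header(pt, seq, timestamp, ssrc, extension=None,
                     profile_id=0x0000):
    """Build a 12-octet RTP header, optionally followed by one extension.

    profile_id is a placeholder for the 16-bit "defined by profile" field.
    """
    x = 1 if extension is not None else 0
    byte0 = (2 << 6) | (x << 4)  # V=2, P=0, X as requested, CC=0
    byte1 = pt & 0x7F            # M=0, 7-bit payload type
    header = struct.pack("!BBHII", byte0, byte1, seq, timestamp, ssrc)
    if extension is not None:
        padded = extension + b"\x00" * ((-len(extension)) % 4)
        # The length field counts 32-bit words of extension data only.
        header += struct.pack("!HH", profile_id, len(padded) // 4) + padded
    return header


def parse_rtp_extension(packet):
    """Return the extension data, or None if the X bit is not set."""
    byte0 = packet[0]
    if not (byte0 >> 4) & 1:
        return None
    offset = 12 + 4 * (byte0 & 0x0F)  # skip fixed header and CSRC list
    _profile_id, words = struct.unpack_from("!HH", packet, offset)
    return packet[offset + 4:offset + 4 + 4 * words]
```

The returned extension data would then be interpreted according to the format of Figure 3, whose count field indicates how many of the padded bits are meaningful.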

Instead of inserting the format of Figure 3 into the GeRM packets, the sender 10 may signal the format in the RTCP stream of the PSVT session.

In RFC 3550, RTCP packets are defined to be control packets consisting of a fixed header part similar to that of RTP data packets, followed by structured elements that vary depending upon the RTCP packet type.

As one possible type, RTCP APP packets are defined. They are intended for experimental use as new applications and new features are developed, without requiring packet type value registration.

A new RTCP APP packet may thus be defined for a GeRM stream, which contains the format described in Figure 3 as its payload. The RTCP APP packets may then be sent for instance according to the packing and timing rules of RTCP reports defined in RFC 3550 or in the Internet draft draft-ietf-avt-rtcp-feedback-11.txt: "Extended RTP Profile for RTCP-based Feedback (RTP/AVPF)", 10 August 2004, by Joerg Ott, Stephan Wenger, Noriyuki Sato, Carsten Burmeister and Jose Rey.
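Such an RTCP APP packet might be assembled as in the following sketch, which uses the APP packet layout of RFC 3550 (packet type 204, a four-character name and application-dependent data in multiples of 32 bits). The name "GERM", the subtype value 0 and the function name are hypothetical; the description does not specify them.

```python
import struct

RTCP_APP = 204  # packet type of RTCP APP packets in RFC 3550


def build_rtcp_app(ssrc, payload, name=b"GERM", subtype=0):
    """Build an RTCP APP packet carrying `payload` as application data."""
    if len(name) != 4:
        raise ValueError("the APP name must be exactly four ASCII octets")
    # Application data must be a multiple of 32 bits; zero padding is
    # an assumption of this sketch.
    padded = payload + b"\x00" * ((-len(payload)) % 4)
    total_words = (12 + len(padded)) // 4
    byte0 = (2 << 6) | (subtype & 0x1F)  # V=2, P=0, 5-bit subtype
    # The length field is the packet length in 32-bit words minus one.
    return struct.pack("!BBHI4s", byte0, RTCP_APP, total_words - 1,
                       ssrc, name) + padded
```

The payload passed to this function would be the state information in the format of Figure 3.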

The processing unit 16 continues monitoring whether a new state change request is received (step 206).

If a new state change request is received, the operation continues with step 202.

As long as no new state change request is received, in contrast, the operation continues with step 204 until a SIP UPDATE message has been transmitted (step 207). The SIP UPDATE message is an out-of-band signaling that contains information on the state of the media streams.

Once a SIP UPDATE message has been transmitted, the in-band signaling can be stopped (step 208), until a new state change request is received (step 201).

In case a new state change request is received (step 201), and the detected type of state change (step 202) contains a request to resume providing media data of one type (step 203), the packets from the concerned media stream are included again in the GeRM packets (step 209). This can be achieved by the processing unit 16 by informing the GeRM multiplexer 13 accordingly. Alternatively or in addition, the media source 11, 12 providing the concerned media stream may be informed so that it resumes capturing data and providing corresponding RTP packets to the GeRM multiplexer 13.
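The sender-side flow of Figure 2 may be summarized by the following simplified sketch. The class and method names and the string values used for request kinds and states are hypothetical. The sketch only tracks which media streams are multiplexed and which state information is still to be signaled in-band.

```python
class SenderState:
    """Tracks multiplexed streams and pending in-band state information."""

    def __init__(self, streams):
        self.active = set(streams)
        self.pending = {}  # media identity -> state still signaled in-band

    def on_request(self, media_id, kind, reason=None):
        if kind == "stop":
            self.active.discard(media_id)    # step 204: omit media packets
            self.pending[media_id] = reason  # step 205: signal in-band
        elif kind == "resume":
            self.active.add(media_id)        # step 209: include again
            self.pending.pop(media_id, None)

    def on_sip_update_sent(self):
        # Step 208: the out-of-band SIP UPDATE has taken over, so the
        # in-band signaling can be stopped.
        self.pending.clear()

    def next_packet(self):
        """Return the streams to multiplex and the state info to attach."""
        return sorted(self.active), dict(self.pending)
```

Until on_sip_update_sent is called, every call of next_packet keeps returning the pending state information, which corresponds to repeating step 204 together with the in-band signaling.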

With the presented approach, the receiver 20 is informed immediately about a changed state of a media stream of the PSVT session so that it may act accordingly.

An exemplary operation of the receiver 20 for the case of in-band signaling within the GeRM packets is illustrated in the flow chart of Figure 4.

The receiver 20 receives the RTP stream and the RTCP stream of the PSVT session. After the IP and UDP or TCP headers have been stripped off a respective GeRM RTP packet, the GeRM demultiplexer 21 demultiplexes the GeRM packet.

Any included audio RTP packet is provided to the RTP de-packetizer 22. The RTP de-packetizer 22 de-packetizes the RTP packet and provides the included audio data to the audio decoder 23 for decoding. The decoded audio data is then provided to the audio renderer 24 for presentation to the user. A decoder/renderer thread is maintained to this end between the GeRM demultiplexer 21 and the audio renderer 24 as long as audio data is still expected for the ongoing session.

Any video RTP packet included in the GeRM stream is provided by the GeRM demultiplexer 21 to the RTP de-packetizer 25. The RTP de-packetizer 25 de-packetizes the RTP packet and provides the included video data to the video decoder 26 for decoding. The decoded video data is then provided to the video renderer 27 for presentation to the user. A decoder/renderer thread is maintained to this end between the GeRM demultiplexer 21 and the video renderer 27 as long as video data is still expected for the ongoing session.

Further, the GeRM demultiplexer 21 provides any in-band signaling on media stream states that is included in a received GeRM packet as well as an indication about the contained RTP packets to the processing unit 29. Depending on the employed embodiment, the GeRM demultiplexer 21 knows either from the PT field in the GeRM RTP header or from the extension field in the GeRM RTP header whether in-band signaling is included.

The processing unit 29 interprets provided in-band signaling as follows.

The processing unit 29 determines whether the GeRM packet comprises RTP packets for all media streams (step 401).

If RTP packets of all media streams are included, the RTP packets are processed for a presentation as described above (step 402).

If no RTP packets are included in the GeRM packet for a particular media stream, the processing unit 29 checks whether in-band signaling is provided for this media stream (step 403).

In case state information is included, the processing unit 29 determines whether the state information indicates a desired pause or a desired shut-down of the dropped media stream (step 404). The state information is associated to the concerned media stream identifier, as indicated in Figure 3. A pause is assumed to be desired in case the media stream state indicates a temporary interruption. A shut-down is assumed to be desired in case the media stream state indicates that the media stream has been interrupted due to a bad QoS or in case the media stream state indicates a permanent interruption.

In case the processing unit 29 determines a desired pause, the concerned decoder/renderer thread is kept open (step 405). In case it determines a desired shut-down, the concerned decoder/renderer thread is closed (step 406).

In case no state information is included in the GeRM packet, the processing unit 29 determines whether a SIP UPDATE (hold) message has been received (step 407). If a SIP UPDATE (hold) message has been received, the concerned decoder/renderer thread is kept open (step 408). Otherwise, the concerned decoder/renderer thread is closed (step 409).
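The decision taken by the processing unit 29 for one media stream in steps 401 to 409 may be sketched as follows. The enumeration, the function name and the string values used for the states are hypothetical and serve only to make the branches explicit.

```python
from enum import Enum


class Action(Enum):
    PRESENT = "present"      # step 402: process the RTP packets
    KEEP_OPEN = "keep_open"  # steps 405 and 408
    CLOSE = "close"          # steps 406 and 409


def decide(stream_present, state, sip_hold_received):
    """Decide what to do with one stream's decoder/renderer thread.

    state is None when no in-band state information is included,
    otherwise one of "paused", "qos_drop" or "shut_down".
    """
    if stream_present:          # step 401
        return Action.PRESENT
    if state is not None:       # step 403
        if state == "paused":   # step 404: temporary interruption
            return Action.KEEP_OPEN
        return Action.CLOSE     # bad QoS or permanent interruption
    if sip_hold_received:       # step 407
        return Action.KEEP_OPEN
    return Action.CLOSE
```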

The processing unit 29 also takes care of evaluating received RTCP sender reports and of providing information for the RTCP receiver reports back to the sender 10.

In case the in-band signaling is provided in RTCP APP packets, the processing unit 29 receives the required information from the RTCP packet handler 28 and acts accordingly.

A second possible in-band signaling according to the invention in the system of Figure 1 will now be described with reference to Figure 5.

Figure 5 is a flow chart illustrating the operation at the sender 10.

As a starting point, the GeRM multiplexer 13 multiplexes again packets from all media streams into GeRM packets (step 501). Further, the RTCP packet handler 14 generates and transmits RTCP sender reports periodically. More specifically, it generates separate RTCP sender reports for each media stream that is included in the GeRM RTP packets. The RTCP sender reports are generated under control of the processing unit 16 before multiplexing.

RFC 3550 defines RTCP sender report packets for transmission and reception statistics from participants that are active senders in a conventional single-media RTP session. The defined RTCP sender report packet comprises various parameters, including a version (V) field of 2 bits, a padding (P) field of 1 bit, a reception report count (RC) field of 5 bits, a packet type (PT) field of 8 bits, a length field of 16 bits, an SSRC field of 32 bits, an NTP timestamp field of 64 bits, an RTP timestamp field of 32 bits, a sender's packet count field of 32 bits, a sender's octet count field of 32 bits, an SSRC_n (source identifier) field of 32 bits, a fraction lost field of 8 bits, a cumulative number of packets lost field of 24 bits, an extended highest sequence number received field of 32 bits, an interarrival jitter field of 32 bits, a last SR timestamp (LSR) field of 32 bits, and a delay since last SR (DLSR) field of 32 bits. For details on these fields, reference is made to RFC 3550.

In the presented GeRM case, the fields "SSRC", "sender's packet count" and "sender's octet count" of an RTCP sender report that is associated to one media stream may correspond to the constituent media RTP packets before GeRM multiplexing on the sender's side. In RFC 3550, the SSRC is a synchronization source identifier for the originator of the sender report packet. In the present case, it is thus an identifier for the sender and the concerned media source. In RFC 3550, the sender's packet count is the total number of RTP data packets transmitted by the sender since starting transmission up until the time this sender report packet was generated. In the present case, it is thus the corresponding total number of transmitted RTP data packets originating from the concerned media source. In RFC 3550, the sender's octet count is the total number of payload octets transmitted in RTP data packets by the sender since starting transmission up until the time this sender report packet was generated. In the present case, it is thus the corresponding total number of payload octets transmitted in RTP data packets originating from the concerned media source. All other fields in the RTCP Sender Reports can be determined according to the guidelines in RFC 3550.
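The per-stream bookkeeping for these sender report fields may be sketched as follows, with the counters updated for each constituent RTP packet before GeRM multiplexing. The class and method names are hypothetical. The counters are masked to 32 bits in this sketch because the corresponding sender report fields are 32 bits wide.

```python
from dataclasses import dataclass


@dataclass
class StreamCounters:
    """Sender report counters for one media source before multiplexing."""
    ssrc: int
    packet_count: int = 0
    octet_count: int = 0

    def on_rtp_packet(self, payload_len):
        # Count the packet and its payload octets (headers excluded).
        self.packet_count = (self.packet_count + 1) & 0xFFFFFFFF
        self.octet_count = (self.octet_count + payload_len) & 0xFFFFFFFF
```

One such object would be kept per media source, for example one for the audio source 11 and one for the video source 12.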

The processing unit 16 moreover monitors whether it receives any state change request (step 502). Possible state changes correspond to those mentioned above with reference to Figure 2.

When a state change request is detected, the processing unit 16 determines the type of the requested state change (step 503).

When the state change request is a request to drop packets of one media stream (step 504), the processing unit 16 determines whether the drop request is a shut-down request (step 505).

If the drop request is not a shut-down request, it is a hold request. In this case, the packets of the concerned media stream are omitted from the GeRM packets (step 506). The RTCP sender reports for the concerned media stream are nevertheless transmitted as before, with the timing defined in RFC 3550 (step 507). As long as there is no new state change request (step 508), the processing unit 16 continues with step 506. Otherwise, it continues with step 503.

If the drop request is a shut-down request, the packets of the concerned media stream are equally omitted from the GeRM packets (step 509). In this case, however, the RTCP sender reports for the concerned media stream are stopped as well (step 510). As long as there is no new state change request (step 511), the processing unit 16 continues with step 509. Otherwise, it continues with step 503.

In case the detected type of a change request is not a drop request (step 504), the RTP packets of the concerned media stream are included in the GeRM packets again (step 512). Further, the RTCP sender reports that are associated to this media stream continue to be transmitted or are transmitted again.

In case a new state change request is detected in this situation (step 513), the processing unit 16 continues with step 503.

The sender 10 can thus, for example, keep on sending RTCP sender reports associated to the video source 12 as long as it intends to pause/hold the video source 12. If the sender 10 intends to shut down the video source 12, then it may stop sending the corresponding RTCP reports.
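The sender behavior of Figure 5 reduces to two decisions per media stream, namely whether its RTP packets are multiplexed and whether its RTCP sender reports are transmitted. A minimal sketch, with a hypothetical function name and hypothetical request names, could look as follows.

```python
def apply_request(request):
    """Map a state change request to (include_rtp, send_sender_reports)."""
    if request == "hold":
        return (False, True)    # steps 506 and 507
    if request == "shut_down":
        return (False, False)   # steps 509 and 510
    if request == "resume":
        return (True, True)     # step 512
    raise ValueError(f"unknown request: {request!r}")
```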

With this second possible operation of the sender 10, the receiver 20 operates basically as described with reference to Figure 4, except that no state information is received. Instead, the in-band RTCP reports support the processing unit 29 in determining the sender's state. They are interpreted after de-multiplexing the GeRM packets.

For example, if the receiver 20 does not get video RTP packets but keeps getting the corresponding RTCP sender reports, then it can safely assume that the sender is on hold or pause. The RTCP sender reports are thus used to signal link aliveness. A SIP UPDATE (hold) message that arrives at the receiver 20 after some time may confirm the media hold. When the receiver 20 does not receive any RTCP reports for a certain duration, the processing unit 29 can safely close the video decoder/renderer thread.
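The inference at the receiver 20 may be sketched with two timers per media stream. The function name, the returned labels and both threshold parameters are assumptions; the description leaves the duration after which the thread is closed unspecified.

```python
def infer_stream_state(now, last_rtp, last_sr, rtp_gap, sr_timeout):
    """Infer the sender's state for one stream from arrival times.

    All arguments are in seconds; rtp_gap and sr_timeout are
    hypothetical thresholds chosen by the receiver.
    """
    if now - last_rtp <= rtp_gap:
        return "active"    # media packets are still arriving
    if now - last_sr <= sr_timeout:
        return "on_hold"   # no media, but sender reports continue
    return "closed"        # neither media nor reports: close the thread
```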

While there have been shown and described and pointed out fundamental novel features of the invention as applied to preferred embodiments thereof, it will be understood that various omissions and substitutions and changes in the form and details of the devices and methods described may be made by those skilled in the art without departing from the spirit of the invention. For example, it is expressly intended that all combinations of those elements and/or method steps which perform substantially the same function in substantially the same way to achieve the same results are within the scope of the invention. Moreover, it should be recognized that structures and/or elements and/or method steps shown and/or described in connection with any disclosed form or embodiment of the invention may be incorporated in any other disclosed or described or suggested form or embodiment as a general matter of design choice. It is the intention, therefore, to be limited only as indicated by the scope of the claims appended hereto.




 