SYSTEM FOR LOST PACKET RECOVERY IN VOICE OVER INTERNET PROTOCOL BASED ON TIME DOMAIN INTERPOLATION

Title:

SYSTEM FOR LOST PACKET RECOVERY IN VOICE OVER INTERNET PROTOCOL BASED ON TIME DOMAIN INTERPOLATION

Document Type and Number:

WIPO Patent Application WO/2001/049005

Kind Code:

A1

Abstract:

A lost packet recovery device, method and computer program for use in a VoIP system in which lost packets containing voice information are replaced using time domain interpolation techniques. These time domain interpolation techniques employ two different approaches to interpolate missing data packets. The first approach relies on time domain harmonic scaling to interpolate a replacement frame for a missing frame using the frames that come before and after the missing frame. The second approach replicates a frame immediately prior to the missing frame. This replicated frame then has an energy reduction function applied to it to gradually reduce the energy output level of the data samples in the frame. This replicated frame is then used to replace the missing frame. In the second approach, the process of duplicating the prior frame and reducing its energy levels using an energy reduction function is repeated until no further missing frames are detected. Once no further missing frames are detected, an energy restoration function is applied to the next available frame to gradually increase its energy level and provide for a smooth transition. Using these techniques, missing frames of voice data may be replaced to mask the effects of missing frames to a listener.

Inventors:

TANG HAITAO
FLINK HANNU ISTO JUHANI

Application Number:

PCT/US2000/035348

Publication Date:

July 05, 2001

Filing Date:

December 27, 2000


Assignee:

NOKIA INC (US)

International Classes:

H04L12/64; H04L29/06; (IPC1-7): H04L29/14; H04L29/06; H04L12/64

Other References:

KOHLER M A ET AL: "Naturalness preserving transform for missing frame compensation", ORLANDO, FL, MAY 30 - JUNE 2, 1999,NEW YORK, NY: IEEE,US, 1999, pages 118 - 122, XP002162089, ISBN: 0-7803-5472-9
LEE I ET AL: "Tree coding combined with TDHS for speech coding at 6.4 and 4.8 kbps", SPEECH COMMUNICATION,ELSEVIER SCIENCE PUBLISHERS, AMSTERDAM,NL, vol. 29, no. 1, September 1999 (1999-09-01), pages 23 - 37, XP004179197, ISSN: 0167-6393

Attorney, Agent or Firm:

Stout, Donald E. (Terry Stout & Krau, LLP 1300 N. Seventeenth Street Suite 1800 Arlington VA, US)


Claims:

We claim:

1.	A method of lost frame recovery in a VoIP system, comprising: receiving a plurality of packets having at least one frame of data per packet in the VoIP system; detecting a missing frame of data; interpolating a frame of data using a prior frame of data; and presenting the frame of data interpolated to a user of the VoIP system.

2.	The method recited in claim 1, wherein a frame of data comprises a plurality of digitized sound samples taken in a predetermined time period.

3.	The method recited in claim 2, wherein interpolating the frame of data using a prior frame of data is accomplished using TDHS principles and a frame of data that occurs after the missing frame of data.

4.	The method recited in claim 2, wherein interpolating the frame of data using a prior frame of data is accomplished using an energy reduction function.

5.	The method recited in claim 4, wherein the energy reduction function decreases energy levels of each digitized sound sample starting with a first digitized sound sample in an ever increasing manner to a last digitized sound sample for the plurality of digitized sound samples in the frame.

6.	The method recited in claim 5, wherein the energy reduction function decreases energy levels of last digitized sound sample in a range from 5% to 50% over the first digitized sound sample.

7.	The method recited in claim 6, wherein the energy reduction function decreases energy levels of last digitized sound sample from 20% to 30% over the first digitized sound sample.

8.	The method recited in claim 4, further comprising: detecting the presence of a frame of data after presenting the interpolated frame of data to the user; applying an energy restoration function to the interpolated frame of data; and presenting the interpolated frame of data to the user once the energy restoration function has been applied.

9.	The method recited in claim 8, wherein the energy restoration function gradually restores energy levels of each digitized sound sample starting with a first digitized sound sample in an ever increasing manner to a last digitized sound sample for the plurality of digitized sound samples in the frame until the energy level of the last digitized sound sample is fully restored.

10.	The device recited in claim 8, further comprising: detecting another missing frame of data; interpolating another frame of data using the energy reduction function; and repeating the detecting of another missing frame and the interpolating another frame of data using the energy reduction function until no further missing frame of data is detected.

11.	The method as recited in claim 10, further comprising: detecting the presence of a frame of data after presenting the interpolated frame of data to the user; applying an energy restoration function to the interpolated frame of data; and presenting the interpolated frame of data to the user once the energy restoration function has been applied.

12.	A method of lost frame recovery in a VoIP system, comprising: receiving a plurality of packets having at least one frame of data per packet in the VoIP system, wherein a frame of data comprises a plurality of digitized sound samples taken in a predetermined time period; detecting a missing frame of data in the plurality of packets; creating a replacement frame of data to replace the missing frame of data by using a frame of data immediately prior to the missing frame of data; applying an energy reduction function to the replacement frame of data, wherein the energy reduction function decreases energy levels of each digitized sound sample starting with a first digitized sound sample in an ever increasing manner to a last digitized sound sample for the plurality of digitized sound samples in the replacement frame of data; presenting the replacement frame to a user of the VoIP system; repeating the detecting, creating, applying and presenting operations until the missing frame of data is not detected; applying an energy restoration function to the replacement frame of data, wherein the energy restoration function increases energy levels of each digitized sound sample starting with a first digitized sound sample in an ever increasing manner to a last digitized sound sample for the plurality of digitized sound samples in the replacement frame of data; and presenting the replacement frame of data to the user of the VoIP system.

13.	A device for lost frame recovery in a VoIP system, comprising: an input packet reception module to receive a plurality of packets having at least one frame of data per packet in the VoIP system; a lost packet interpolation module to detect a missing frame of data received from the input packet reception module and interpolate a frame of data using a prior frame of data; and a frame playback module to present the frame of data interpolated by the lost packet interpolation module to a user of the VoIP system.

14.	The device recited in claim 13, wherein a frame of data comprises a plurality of digitized sound samples taken in a predetermined time period.

15.	The device recited in claim 14, wherein the lost packet interpolation module uses a prior frame of data, a frame of data that occurs after the missing frame of data and TDHS principles to interpolate the missing frame of data.

16.	The device recited in claim 14, wherein the lost packet interpolation module further comprises: an energy reduction function to interpolate the missing frame of data using the prior frame of data.

17.	The device recited in claim 16, wherein the energy reduction function decreases energy levels of each digitized sound sample starting with a first digitized sound sample in an ever increasing manner to a last digitized sound sample for the plurality of digitized sound samples in the frame.

18.	The device recited in claim 17, wherein the energy reduction function decreases energy levels of last digitized sound sample in a range from 5% to 50% over the first digitized sound sample.

19.	The device recited in claim 18, wherein the energy reduction function decreases energy levels of last digitized sound sample from 20% to 30% over the first digitized sound sample.

20.	The device recited in claim 16, wherein the lost packet interpolation module further comprises: an energy restoration function to restore the energy level to the frame of data interpolated by the energy reduction function when a missing frame of data is no longer detected by the packet interpolation module.

21.	The device recited in claim 20, wherein the energy restoration function gradually restores energy levels of each digitized sound sample starting with a first digitized sound sample in an ever increasing manner to a last digitized sound sample for the plurality of digitized sound samples in the frame until the energy level of the last digitized sound sample is fully restored.

22.	The device recited in claim 17, wherein the lost packet interpolation module detects another missing frame of data and interpolates another frame of data using the energy reduction function, wherein the detection of another missing frame and the interpolation of another frame of data using the energy reduction function repeats until no further missing frame of data is detected.

23.	The device as recited in claim 22, wherein the lost packet interpolation module further comprises: an energy restoration function to gradually restore the energy to the frame of data reduced by the energy reduction module upon the detection of no further missing frame of data by the lost packet interpolation module.

24.	A device for lost frame recovery in a VoIP system, comprising: an input packet reception module to receive a plurality of packets having at least one frame of data per packet in the VoIP system, wherein a frame of data comprises a plurality of digitized sound samples taken in a predetermined time period; a lost packet interpolation module to detect and replace a missing frame of data in the plurality of packets; the lost packet interpolation module further comprising: a current frame unavailable function to create a replacement frame of data to replace the missing frame of data by using a frame of data immediately prior to the missing frame of data and an energy reduction function, wherein the energy reduction function decreases energy levels of each digitized sound sample starting with a first digitized sound sample in an ever increasing manner to a last digitized sound sample for the plurality of digitized sound samples in the replacement frame of data; and a current input frame available and at least one frame lost function to apply an energy restoration function to the replacement frame of data created by the energy reduction function, wherein the energy restoration function increases energy levels of each digitized sound sample starting with a first digitized sound sample in an ever increasing manner to a last digitized sound sample for the plurality of digitized sound samples in the replacement frame of data.

25.	A computer program embodied on a computer-readable medium to perform lost frame recovery in a VoIP system, comprising: an input packet reception module code segment to receive a plurality of packets having at least one frame of data per packet in the VoIP system; a lost packet interpolation module code segment to detect a missing frame of data received from the input packet reception module code segment and interpolate a frame of data using a prior frame of data; and a frame playback module code segment to present the frame of data interpolated by the lost packet interpolation module code segment to a user of the VoIP system.

26.	The computer program recited in claim 25, wherein a frame of data comprises a plurality of digitized sound samples taken in a predetermined time period.

27.	The computer program recited in claim 26, wherein the lost packet interpolation module code segment uses a prior frame of data, a frame of data that occurs after the missing frame of data and TDHS principles to interpolate the missing frame of data.

28.	The computer program recited in claim 26, wherein the lost packet interpolation module code segment further comprises: an energy reduction function code segment to interpolate the missing frame of data using the prior frame of data.

29.	The computer program recited in claim 28, wherein the energy reduction function code segment decreases energy levels of each digitized sound sample starting with a first digitized sound sample in an ever increasing manner to a last digitized sound sample for the plurality of digitized sound samples in the frame.

30.	The computer program recited in claim 29, wherein the energy reduction function code segment decreases energy levels of last digitized sound sample in a range from 5% to 50% over the first digitized sound sample.

31.	The computer program recited in claim 30, wherein the energy reduction function code segment decreases energy levels of last digitized sound sample from 20% to 30% over the first digitized sound sample.

32.	The computer program recited in claim 28, wherein the lost packet interpolation module code segment further comprises: an energy restoration function code segment to restore the energy level to the frame of data interpolated by the energy reduction function code segment when a missing frame of data is no longer detected by the packet interpolation module code segment.

33.	The computer program recited in claim 32, wherein the energy restoration function code segment gradually restores energy levels of each digitized sound sample starting with a first digitized sound sample in an ever increasing manner to a last digitized sound sample for the plurality of digitized sound samples in the frame until the energy level of the last digitized sound sample is fully restored.

34.	The computer program recited in claim 29, wherein the lost packet interpolation module code segment detects another missing frame of data and interpolates another frame of data using the energy reduction function code segment, wherein the detection of another missing frame and the interpolation of another frame of data using the energy reduction function code segment repeats until no further missing frame of data is detected.

35.	The computer program as recited in claim 34, wherein the lost packet interpolation module code segment further comprises: an energy restoration function code segment to gradually restore the energy to the frame of data reduced by the energy reduction module code segment upon the detection of no further missing frame of data by the lost packet interpolation module code segment.

36.	A computer program embodied on a computer-readable medium to perform lost frame recovery in a VoIP system, comprising: an input packet reception module code segment to receive a plurality of packets having at least one frame of data per packet in the VoIP system, wherein a frame of data comprises a plurality of digitized sound samples taken in a predetermined time period; a lost packet interpolation module code segment to detect and replace a missing frame of data in the plurality of packets; the lost packet interpolation module code segment further comprising: a current frame unavailable function code segment to create a replacement frame of data to replace the missing frame of data by using a frame of data immediately prior to the missing frame of data and an energy reduction function code segment, wherein the energy reduction function code segment decreases energy levels of each digitized sound sample starting with a first digitized sound sample in an ever increasing manner to a last digitized sound sample for the plurality of digitized sound samples in the replacement frame of data; and a current input frame available and at least one frame lost function code segment to apply an energy restoration function code segment to the replacement frame of data created by the energy reduction function code segment, wherein the energy restoration function code segment increases energy levels of each digitized sound sample starting with a first digitized sound sample in an ever increasing manner to a last digitized sound sample for the plurality of digitized sound samples in the replacement frame of data.

37.	A method of lost frame recovery in a VoIP system, comprising: receiving a plurality of packets having at least one frame of data per packet in the VoIP system; detecting a plurality of consecutively missing frames of data within the plurality of frames having data; replacing the plurality of consecutively missing frames of data by gradually reducing the energy level of a prior frame having data that appears immediately before the plurality of consecutively missing frames of data; increasing the energy level of a last missing frame of data of the plurality of consecutively missing frames of data to a full energy level when the last missing frame of data appears before a frame having data; and presenting the plurality of frames of data to a user of the VoIP system.

38.	The method recited in claim 37, wherein the gradual reducing of the energy level of the prior frame of data is accomplished using an energy reduction function, wherein the energy reduction function decreases the energy level of the prior frame of data in a range from 5% to 50%.

Description:

SYSTEM FOR LOST PACKET RECOVERY IN VOICE OVER INTERNET PROTOCOL BASED ON TIME DOMAIN INTERPOLATION

Technical Field

The invention relates to a system and method for cross-domain server association to form server groups. More particularly, the invention relates to a system and method for identifying and forming server associations so that a server or user may access servers located on the same or different local area network and utilize the resources and applications located on that local area network.

With the explosion in Internet and Intranet access and usage, a large volume of information and services is now available to users. However, it does a user little good to have a world of information at their fingertips and not have a simple means to find it.

All too often in the business and corporate environment a user may need to access a specific server device on a domain or across a domain. In order to access such a specific server, the user requires a simple means of looking up names of resources and groupings of resources available to him and a simple, direct means of accessing them.

Further, the administrator needs a simple means of defining all resources in a domain and grouping them in a logical fashion for the user who does not know exactly what he is looking for. For example, the marketing department may want access to the latest designs for a new product being created by engineering, but may not know where to look among the hundreds of computers the firm is tied to throughout the world. Further, a firm may wish to publish the availability of specific services on specific servers. At present the Internet and Intranet protocols do not allow for this feature.

Typically, an Internet user would have a browser installed in his local computer or server such as Internet Explorer or Netscape. Using this browser, the user would access an Internet service provider, such as America On Line (AOL), via a modem over the local public switched telephone network (PSTN). Once logged onto the Internet server, the user may utilize one of the many search engines, such as Yahoo or Lycos, to specify search terms. The user may also use a web crawler, spider or robot to attempt to find a service or information desired. However, the use of these search engines or web crawlers is time consuming and is not guaranteed to find a specific web site desired or a group of related web sites. Further, even if a user finds a web site he is interested in, this web site may simply be a gateway to a local area network (LAN) or wide area network (WAN). It would then be up to the web page creator to create a list of services and servers on the web page and links to services or servers available on the LAN or WAN. Therefore, using the aforementioned method of Internet searching there is no means provided to access a specific server directly within a domain, such as a LAN, or discover groups of related servers in a domain or outside it.

This limitation is due to the manner in which Internet domain names are maintained and located. Internet naming is accomplished by a Domain Name Server (DNS), which allows a computer that is registered to the Internet to be uniquely identified by that name wherever it may be located. The DNS serves to translate the unique host name into the appropriate Internet protocol (IP) address required to establish communications. The DNS may also provide limited identification of the types of applications that are available on a particular machine, such as Telnet or file transport protocol (FTP). As discussed above, the DNS may further identify gateways to networks and show which machines are capable of mail relay and to which network.

Therefore, DNS is unable to link directly to specific servers within a LAN or domain and is unable to associate servers in groups by function or other criteria.

Further, DNS is quite complex syntactically which makes it unsuitable for any but the largest domains, although its operation is straightforward. A host, given a name, asks the server for a name-to-address translation. If the name server does not possess the means to perform that translation directly, it will pass the request on to a server with a higher authority than itself. This process can be repeated until the request is satisfied.

This DNS approach is adequate for Internet surfing, but is not suitable for corporate Intranet access. One attempt to overcome the shortcomings of DNS is the use of host files, text-based files often used in TCP/IP (transmission control protocol/Internet protocol) systems that contain a simple list of IP addresses and the names that relate to them. Such a file can name common systems both inside and outside an organization, and each address can have several names, usually a "formal" name followed by a number of less formal "nicknames" or aliases. Although host files are workable in smaller LANs, they can have serious drawbacks in larger domains. The main problem is that a copy of the file must exist on each and every TCP/IP client which intends to refer to resources by name rather than IP address. Host files present the systems administrator with a potential nightmare in a network with hundreds or thousands of clients. Ensuring each and every HOSTS file is always completely up-to-date as network changes are made is bad enough, but there is also the temptation for users to create their own files with customized naming conventions, making it difficult for "hot desking" colleagues. Of course, it is possible to manage HOSTS files by keeping a master version on one of the central servers and downloading it to clients automatically on a regular basis, but this approach, too, can have its problems in large distributed networks.

Directory Access Protocol (DAP) is a further attempt to overcome the aforementioned problems. DAP is a portion of OSI standard X.500. DAP specifies how user applications access the directory information. Unfortunately, the DAP protocol as defined in the X.500 specification has significant overhead, resulting in a distinct lack of full DAP implementation by clients and applications. To overcome this overhead problem with DAP, Lightweight Directory Access Protocol (LDAP) was developed and is quickly gaining client acceptance on both the Internet and Intranet to access directory information. LDAP allows users to quickly and easily access directories of people and information such as user names, e-mail addresses, and telephone numbers. For example, using an LDAP client it would be possible to search for a user name and retrieve the e-mail address and perhaps the telephone number. It would also be possible to search all entries having a specific character string in their names.

However, even with the development of LDAP, no low overhead mechanism exists which allows a server or client to locate and directly access a specific server outside of its own domain. Further, no low overhead method or system exists which would allow associations of clients and servers to be formed and accessed. Finally, there is currently no method of publishing or advertising services available in a specific server in a domain without going through a web page.

Therefore, what is needed is a low overhead system and method that may be used in a domain of any size and across domains that can associate servers based on function or other criteria and further access those servers directly. Further, this system and method should widely publish or advertise services available on a server within a domain and allow authorized users to access that server directly.

Disclosure of the Invention

An embodiment of the present invention provides for a method of lost frame recovery in a VoIP system. This method receives several packets having at least one frame of data per packet. The frames are then examined to detect a missing frame of data. The method then interpolates a frame of data using a prior frame of data. Once a frame is interpolated, it is presented to a user of the VoIP system.
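The replicate-and-fade approach summarized in the abstract can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the implementation disclosed here: the function name `replace_missing_frame`, the linear shape of the gain ramp, and the 25% default (chosen from the 20% to 30% range recited in the claims) are all hypothetical, and the gain is applied to sample amplitude as a simple stand-in for the energy level.

```python
def replace_missing_frame(prior_frame, reduction=0.25):
    """Return a replacement frame built from the frame before the gap.

    The prior frame is replicated and a gain ramp is applied so that the
    first sample is unchanged and the last sample is reduced by
    `reduction` (a fraction between 0 and 1). The linear ramp is an
    assumption made for this sketch.
    """
    n = len(prior_frame)
    if n < 2:
        # Nothing to ramp over; return a plain copy.
        return list(prior_frame)
    replacement = []
    for i, sample in enumerate(prior_frame):
        gain = 1.0 - reduction * i / (n - 1)  # 1.0 at i=0, 1-reduction at the end
        replacement.append(sample * gain)
    return replacement
```

For a run of consecutive losses, the output of one call could be passed back in as `prior_frame` for the next missing frame, so that the level keeps falling until a real frame arrives.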

Further, an embodiment of the present invention creates a device for lost frame recovery in a VoIP system. This device has an input packet reception module to receive several packets having at least one frame of data per packet. A lost packet interpolation module is used to detect a missing frame of data received from the input packet reception module and interpolate a frame of data using a prior frame of data. Further, a frame playback module is used to present the frame of data interpolated by the lost packet interpolation module to a user of the VoIP system.

Still further, an embodiment of the present invention provides a computer program embodied on a computer-readable medium to perform lost frame recovery in a VoIP system. This computer program has an input packet reception module code segment to receive several packets having at least one frame of data per packet. It also has a lost packet interpolation module code segment to detect a missing frame of data received from the input packet reception module code segment and interpolate a frame of data using a prior frame of data. Further, it also has a frame playback module code segment to present the frame of data interpolated by the lost packet interpolation module code segment to a user of the VoIP system.

These and other features of this device and method will become more apparent from the following description when taken in connection with the accompanying drawings which show, for purposes of illustration only, examples in accordance with the present invention.

Brief Description of the Drawings

The foregoing and a better understanding of the present invention will become apparent from the following detailed description of exemplary embodiments and the claims when read in connection with the accompanying drawings, all forming a part of the disclosure of this invention. While the foregoing and following written and illustrated disclosure focuses on disclosing example embodiments of the invention, it should be clearly understood that the same is by way of illustration and example only and the invention is not limited thereto. The spirit and scope of the present invention are limited only by the terms of the appended claims.

The following represents brief descriptions of the drawings, wherein:

FIG. 1 is an example of an overall system diagram of an embodiment of the present invention;

FIG. 2 is a diagram showing an example of time domain harmonic scaling principles employed in pitch period decimation;

FIG. 3 is a diagram showing an example of time domain harmonic scaling principles employed in pitch period interpolation in an embodiment of the present invention;

FIG. 4 is a diagram showing an example of lost frame interpolation using an energy smoothing function in the preferred embodiment of the present invention;

FIG. 5 is a diagram showing examples of the energy smoothing function used in an embodiment of the present invention;

FIG. 6 is a diagram of the software modules used in an embodiment of the present invention; and

FIG. 7 is a flowchart of the lost packet recovery algorithm employed in an embodiment of the present invention.

BEST MODE FOR CARRYING OUT THE INVENTION

Before beginning a detailed description of the subject invention, mention of the following is in order. When appropriate, like reference numerals and characters may be used to designate identical, corresponding or similar components in differing figure drawings. Further, in the detailed description to follow, exemplary sizes/models/values/ranges may be given, although the present invention is not limited to the same.

FIG. 1 illustrates an example of an embodiment of the present invention in which phone conversations using a packet switched IP network 50 are enabled. In FIG. 1, a user employs a communications device 10 to communicate with a VoIP gateway 30 through PSTN 20. Communications device 10 may be a telephone, a voice-equipped PC (personal computer) or any other device capable of transmitting sound, or sound in conjunction with video. In the case where a voice-equipped PC is used, the PC would require a microphone, at least one speaker and the supporting software. Further, the user may either initiate the call or receive the call. Also, the user is not limited to contacting another human being when placing the call, but may instead contact any form of sound reproduction device including a computer.

Still referring to FIG. 1, the VoIP gateway 30 is interfaced to the packet switched IP network 50. This packet switched IP network 50 may be the Internet, a LAN or a WAN. The communications interface between the VoIP gateway 30 and communications device 10 is typically the PSTN 20 and may take the form of communications lines such as standard twisted pair phone lines, coax cable and fiber optics. These communications lines may be leased lines including: T1 lines capable of transmitting at 1.544 Mbits/sec; T3 lines capable of transmitting at 45 Mbits/sec; E1 lines capable of transmitting at 2.048 Mbits/sec; and E3 lines capable of transmitting at 34 Mbits/sec. Further, the communications device 10 may also take the form of a cellular phone, satellite phone, PC, lap top or palm computer interfaced to these communications devices. The packet switched IP network 50 uses a call processing server (CPS) 40 that provides call setup and tear down capability to other gateways. This CPS 40 also maintains an updated view of the call state and physical location of all gateway ports. CPS 40 can support thousands of simultaneous calls throughout a geographically distributed network. CPS 40 may be implemented in software running on a PC connected to the packet switched IP network 50 or any device where complex logic may be implemented such as firmware.

Referring to FIG. 2, an approach for lost packet recovery which may be used in an embodiment of the present invention relies on time domain harmonic scaling (TDHS) principles which are typically utilized for noise reduction and time scale modification of a speech signal. A detailed description of TDHS, incorporated herein by reference, is found on pages 549-551 of Discrete-Time Processing of Speech Signals by J. R. Deller, J. G. Proakis, and J. H. Hansen, Prentice Hall, Inc. 1987, ISBN 0-02-328301-7. TDHS is a time domain technique that accomplishes pitch-synchronous block reduction and interpolation. FIG. 2 is an example of TDHS in which a two to one decimation or reduction process is shown for two consecutive pitch periods to form a single pitch period output. In the two charts shown in FIG. 2, time is represented in the horizontal axis and pitch frequency is represented in the vertical axis. In FIG. 2, pitch chart 60 represents two pitch periods while pitch chart 70 represents reduction of pitch chart 60 to a single pitch period.

Referring to FIG. 3, interpolation works in a similar manner to the decimation discussed in reference to FIG. 2. A missing frame may be reconstructed as a linear combination of two adjacent neighboring frames as shown in FIG. 3. In the Deller et al. text mentioned above and incorporated by reference herein, TDHS is utilized for noise reduction and time scale modification of a speech signal and is often used in speech recognition. In this embodiment of the present invention, TDHS is implemented using any general purpose computer language and executes on the VoIP gateway 30.

Further, TDHS is employed as forward error correction and operates only on voice data at the receiving end of the transmission, not at the transmitting end, where it would create further overhead. TDHS in this embodiment is used to create a missing frame from two adjacent frames of voice data. In order to provide a smooth transition, each frame is multiplied by a saw wave function. TDHS may be executed by a lost packet interpolation module 320, shown in FIG. 6, running on the VoIP gateway 30 shown in FIG. 1.
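The reconstruction of a missing frame from its two neighbors can be sketched as follows. This is a minimal illustration and not the TDHS procedure of the Deller et al. text: the function name interpolate_missing_frame is hypothetical, the saw wave weights are assumed to be linear ramps running across the whole frame, and pitch-synchronous alignment is omitted.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the source): rebuild a missing frame as a
// sample-by-sample linear combination of the frames before and after it.
// Assumption: the "saw wave" weights are linear ramps, falling from 1 to 0
// for the previous frame and rising from 0 to 1 for the next frame.
// Assumption: pitch-period alignment, which TDHS relies on, is left out.
std::vector<int> interpolate_missing_frame(const std::vector<int>& prev,
                                           const std::vector<int>& next) {
    const std::size_t n = std::min(prev.size(), next.size());
    std::vector<int> out(n);
    for (std::size_t i = 0; i < n; ++i) {
        // Weight of the next frame grows linearly across the frame.
        const float w_next =
            (n > 1) ? static_cast<float>(i) / static_cast<float>(n - 1) : 0.5f;
        const float w_prev = 1.0f - w_next;
        out[i] = static_cast<int>(w_prev * static_cast<float>(prev[i]) +
                                  w_next * static_cast<float>(next[i]));
    }
    return out;
}
```

With these assumed weights the replacement frame begins as a copy of the preceding frame and ends as a copy of the following frame, which is one way to obtain the smooth transition mentioned above.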

In order to allow time for processing of a missing frame, all frames received by the VoIP gateway 30 are held for a time period equal to one frame, 15 milliseconds in this example, prior to playing the frame for the listener. Such a delay of a single frame is not noticeable by the human listener and therefore has no impact on the quality of the connection perceived by the participants in a conversation.
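The one-frame hold can be pictured as a single-slot buffer. The sketch below is an assumption about how such a buffer might look; the names Slot and OneFrameHold are illustrative and do not appear in the source. Each call to push() stands for one frame period.

```cpp
#include <utility>
#include <vector>

typedef std::vector<int> Frame;

// Hypothetical container: a frame plus a flag saying whether it arrived.
struct Slot {
    bool present;
    Frame samples;
};

// Hypothetical one-frame hold buffer (illustrative names, not from the source).
class OneFrameHold {
public:
    OneFrameHold() : held_() {}  // starts empty: present == false

    // `arrived` is the frame received in this period (present == false if it
    // is missing). Returns the frame held during the previous period, which
    // is the one now due for playback.
    Slot push(Slot arrived) {
        Slot due = std::move(held_);
        held_ = std::move(arrived);
        return due;
    }

    // True when the frame that follows the one just released did not arrive,
    // so the released frame can still be modified before it is played.
    bool next_is_missing() const { return !held_.present; }

private:
    Slot held_;
};
```

Because the frame due for playback is released one period late, the receiver already knows whether its successor arrived, which is what allows a frame to be altered before the listener hears it.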

The use of TDHS to interpolate a missing frame of voice data would in most cases create an accurate approximation of the missing frame. In the examples provided for TDHS in FIG. 2 and FIG. 3, a frame comprises 120 voice data samples taken in a 15-millisecond time frame. A packet of data may consist of one or more frames. Further, the computer time required to process two frames (240 samples in total) to create an interpolation of a missing frame is not significant in spite of the computationally intensive nature of TDHS. However, where the VoIP gateway 30 is simultaneously handling hundreds of VoIP calls over a busy packet switched IP network 50, the computations required would be prohibitive. Therefore, the usage of TDHS to interpolate missing frames of voice data in a packet switched IP network is not considered the preferred embodiment of the present invention.

FIG. 4 is an example of the preferred embodiment of the present invention. This example provides for four frames of voice data including: first frame 100; second frame 110; third frame 120; and fourth frame 130. As in the discussion of FIG. 2, the frame size is set at 15 milliseconds with 120 samples of digitized voice data in each frame.

However, the length of each frame may be altered with little if any impact on the operation of the present invention. The energy reduction function 170 and the energy restoration function 180, discussed in detail below, execute on the VoIP gateway 30 and operate in a forward error correction manner only on voice data received. Using this approach, additional delays and overhead are not added to the voice data transmitted and additional bandwidth on the packet switched IP network 50 is not required.

As shown in FIG. 4, the original signal 140 transmitted by communications device 10 to VoIP gateway 30 has no gaps or blank frames. However, upon receipt of the signal, the VoIP gateway 30 received a single lost frame signal 150 with third frame 120 missing or significantly delayed. The preferred embodiment of the present invention detects the missing third frame 120 in the VoIP gateway 30 at the receiving end of transmission and applies an energy reduction function 170, shown below and in FIG. 6, to the second frame 110 which is about to be played for the listener. It is possible to apply such an energy reduction function 170 to second frame 110 upon detection that third frame 120 is missing since, as in the case where TDHS interpolation is done, all frames are held for a time period equal to one frame prior to being played for the listener. Energy reduction function 170, shown below implemented in C++ programming language, gradually reduces the energy level of the signal until a 25% reduction in signal strength is achieved at the end of the frame. Thus, in the case where 120 samples are taken per frame, the 1st sample in the frame experiences no reduction in energy level as indicated in energy reduction function 170. In the 30th sample of the frame, a 6.25% reduction in energy level would be seen. Further, the 60th sample would see a 12.5% reduction, the 90th sample an 18.75% reduction, and the 120th sample a 25% reduction.

ENERGY REDUCTION FUNCTION 170

    /* FRAME_SIZE is the number of samples per frame (120 in this example).
       The gain starts at 1 - (seq - 1)/4 and falls linearly by 1/total over
       the frame, that is, by 25% when total is 4. */
    static inline void frame_interpolation_1to4_fall(int *source, int *destination, int seq, int total)
    {
        int i;
        float direc;
        for (i = 0; i < FRAME_SIZE; i++) {
            direc = (1.0f - (((float)seq - 1.0f) / 4))
                  - ((float)i / ((float)total * (float)(FRAME_SIZE - 1)));
            *destination++ = (int)(direc * (float)(*source++));
        }
    }

In the example illustrated in FIG. 4, only the third frame 120 is missing from original signal 140; therefore, missing third frame 120 is replaced by the second frame 110 at a 25% reduced energy level throughout the entire third frame 120. Fourth frame 130 is received by VoIP gateway 30 as shown in single lost frame signal 150. However, rather than an abrupt change in energy level being played for the listener, energy restoration function 180 is applied to the fourth frame 130 to create a smooth transition. Energy restoration function 180, shown below implemented in C++ programming language, starts at the energy level generated by the energy reduction function 170 and gradually increases the energy level of the signal until a 100% restoration in signal strength is achieved at the end of the frame. Thus, in the case where 120 samples are taken per frame, the 1st sample in the frame would experience a 25% reduction in energy level as indicated in energy restoration function 180. In the 30th sample of the frame an 18.75% reduction in energy level would be seen. Further, the 60th sample would see a 12.5% reduction, the 90th sample a 6.25% reduction, and the 120th sample would be played at 100% of its signal energy level.

ENERGY RESTORATION FUNCTION 180

    /* FRAME_SIZE is the number of samples per frame (120 in this example).
       The gain starts at (seq - 1)/4 and rises linearly by 1/total over the
       frame, that is, by 25% when total is 4. */
    static inline void frame_interpolation_1to4_rise(int *source, int *destination, int seq, int total)
    {
        int i;
        float direc;
        for (i = 0; i < FRAME_SIZE; i++) {
            direc = (((float)seq - 1.0f) / 4)
                  + ((float)i / ((float)total * (float)(FRAME_SIZE - 1)));
            *destination++ = (int)(direc * (float)(*source++));
        }
    }

The C++ code provided for energy reduction function 170 and energy restoration function 180 is merely supplied to illustrate the simple nature of the code used and, because of this simple nature, a large number of conversations may be simultaneously handled by a VoIP gateway 30. Further, any general purpose programming language may be used and the specific code may take any form suitable to the application. In addition, the reduction of 25% in the energy level after a single frame loss is dependent on the frame size used. In the case where the frame size is smaller, a smaller energy reduction level should also be used. In the case where a larger frame size is used, a larger energy reduction should also be used.

Thus, dependent on the frame size selected by the person of ordinary skill in the art, a reduction per frame of anywhere from 5 to 50% is appropriate.

So far in the discussion of the TDHS embodiment and the preferred embodiment, examples have been provided dealing with the loss of only a single frame of data. However, on occasion more than one frame of data may be lost in any sequence of a transmission. In the case where more than one frame is lost, the preferred embodiment may still be used to mask the loss. Referring to FIG. 5, this figure is an example of how the preferred embodiment of the present invention may be used to mask the loss of up to five frames of data.

The preferred embodiment may be employed for any number of missing frames and is only dependent on the frame size and percentage reduction employed by the energy reduction function 170.

FIG. 5 is a diagram representing a time line of a series of voice frames, referred to as time lines 210 through 270, received by the VoIP gateway 30. Each box represents a single frame containing, for illustrative purposes only, 120 samples of digitized voice data. The shaded boxes represent voice data received or played for the listener. The blank boxes represent a missing frame of voice data. Time lines 210 through 270 illustrate a progression of voice data received or created and played for the user. New frames of data appear on the right of FIG. 5 and, with each consecutive time line 210 through 270, old frames drop off and are not shown on the left. Each time line 220 through 270 represents a one frame addition from the prior time line. To illustrate this, current output frame 200 is arbitrarily marked with an X starting at time line 210 to show its progression historically as new frames come in or are replaced in time lines 220 through 270.

Referring to FIG. 5, in time line 210, current input frame 190 is received and held for a time period of one frame while current output frame 200 is played for the listener. In time line 210, all frames are received and played for the listener at full volume. In time line 220, current input frame 190 is missing and, as discussed in reference to FIG. 4, energy reduction function 170 is applied to current output frame 200, shown in time line 210, and the resulting decreasing energy frame is played as the current output frame 200 in time line 220. Further, as indicated in time line 220, the current input frame 190 is once again missing. Therefore, the process is repeated and energy reduction function 170 is applied to the current output frame 200 shown in time line 220 and played for the listener as the current output frame 200 in time line 230. As noted in time line 230, the current input frame 190 is again missing and the process of applying the energy reduction function is again repeated. This remains the case for time lines 240, 250 and 260. In each time line the current output frame 200 in the prior time line has the energy reduction function 170 applied against it and it is presented to the listener as the current output frame 200. As shown in time line 260, after four consecutive frames are missing the energy reduction function 170 has so decreased the level of the current output frame that silence is heard by the listener in that frame.

Still referring to FIG. 5, in time line 270 a current input frame 190 is received and when played for the listener, the energy restoration function 180 is applied to the current output of time line 260 so that the energy level of that frame increases gradually until it obtains 100% energy output. Then, assuming no further frames are missing, the preferred embodiment of the present invention plays each frame at a 100% energy level.

FIG. 6 is a modular configuration of the present invention being processed in VoIP gateway 30 shown in FIG. 1. Only the processing involved for the VoIP gateway 30 to process incoming data packets containing voice data is discussed. VoIP gateway 30 both transmits packets of voice data and receives them. As discussed above, since the embodiments of the present invention employ a forward error correction approach to avoid burdening the packet switched IP network 50, lost frame recovery only takes place for packets received. Therefore, in the discussion of the present invention, only the receiving of packets and the processing of frames not received is discussed.

Referring to FIG. 6, packets containing voice or other sound data are received from the packet switched IP network 50, shown in FIG. 1, and temporarily stored in memory or other mass storage device (not shown) of the VoIP gateway 30 by the input packet reception module 300. Packet disassembly module 310 then orders the packets according to the sequence number contained in the header of each packet and divides them into frames of equal size prior to the execution of lost packet interpolation module 320.
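The ordering and framing performed by packet disassembly module 310 might look like the following. The Packet layout and the function name disassemble are assumptions made for illustration; real packet headers carry more fields, and sequence number wrap-around is ignored here.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical packet layout holding only the fields needed for this sketch.
struct Packet {
    std::uint32_t sequence;    // sequence number taken from the packet header
    std::vector<int> samples;  // voice samples, one or more frames' worth
};

// Sort packets by sequence number, then cut each payload into frames of
// `frame_size` samples. Assumption: a trailing partial frame is dropped.
std::vector<std::vector<int> > disassemble(std::vector<Packet> packets,
                                           std::size_t frame_size) {
    std::sort(packets.begin(), packets.end(),
              [](const Packet& a, const Packet& b) { return a.sequence < b.sequence; });
    std::vector<std::vector<int> > frames;
    if (frame_size == 0) return frames;
    for (const Packet& p : packets) {
        for (std::size_t off = 0; off + frame_size <= p.samples.size(); off += frame_size) {
            const std::ptrdiff_t first = static_cast<std::ptrdiff_t>(off);
            const std::ptrdiff_t last = static_cast<std::ptrdiff_t>(off + frame_size);
            frames.emplace_back(p.samples.begin() + first, p.samples.begin() + last);
        }
    }
    return frames;
}
```

A gap in the sorted sequence numbers is what would signal a missing frame to the next stage; detecting that gap is left out of this sketch.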

In reference to the discussion of lost packet interpolation module 320, both FIG. 6 and FIG. 7 will be referred to simultaneously. Upon completion of packet disassembly module 310, lost packet interpolation module 320 starts execution in operation 400 shown in FIG. 7. In operation 410 of FIG. 7, lost packet interpolation module 320 determines if current input frame 190, shown in FIG. 5, is present. If the current input frame 190 is not present, then processing proceeds to current input frame unavailable function 330 containing operations 420 and 430 shown in FIG. 7. In operation 420 shown in FIG. 7, the lost frame counter is incremented by 1. Then in operation 430, a current output frame 200, shown in FIG. 5, is generated using energy reduction function 170 as discussed above in reference to FIG. 4 and FIG. 5.

Once the current output frame 200 is generated by the current input frame unavailable function 330 using energy reduction function 170, the current output frame is played for the listener in operation 440 in frame playback module 360 and a voice or sound is generated by output voice unit 370. This output voice unit 370 may be a speaker in communications device 10 discussed in reference to FIG. 1. The lost packet interpolation module 320 then halts execution in operation 530.

In the situation where it is determined by lost packet interpolation module 320 in operation 410 that a current input frame 190 is available, processing proceeds to operation 450 shown in FIG. 7. In operation 450, lost packet interpolation module 320 determines if the lost frame counter is greater than zero, indicating that a prior frame has been lost. Where a prior frame was lost as indicated by the lost frame counter, a current input frame available and at least one frame lost function 340 is executed. The current input frame available and at least one frame lost function 340, shown in FIG. 6, comprises operations 460, 480 and 490 shown in FIG. 7. In operation 460, a current output frame 200 is generated using energy restoration function 180 discussed above in reference to FIG. 4 and FIG. 5. As discussed above, energy restoration function 180 is employed to increase the energy output of the current output frame 200 as shown and discussed in reference to time line 270 of FIG. 5.

Still referring to FIG. 6 and FIG. 7, once the current input frame available and at least one frame lost function 340 using the energy restoration function 180 creates a current output frame 200, the frame playback module 360 plays the current output frame 200. The current input frame available and at least one frame lost function 340 in operation 480 sets the current output frame 200 to current input frame 190 and in operation 490 sets the lost frame counter to zero. The lost packet interpolation module 320 then terminates execution in operation 530 and a voice or sound is generated by output voice unit 370. This output voice unit 370 may be a speaker in communications device 10 discussed in reference to FIG. 1.

Still referring to FIG. 6 and FIG. 7, in the situation where it is determined by lost packet interpolation module 320 in operation 450 that no prior frames have been lost, since the lost frame counter is not greater than zero, processing proceeds to a current input frame available and no lost frame function 350 comprising operations 510 and 520. However, first in operation 500 of FIG. 7, the frame playback module 360 plays the current output frame 200. Then in operation 510, current input frame available and no lost frame function 350 sets the current output frame 200 equal to the current input frame 190, shown in FIG. 5. In operation 520 the lost frame counter is set to zero by the current input frame available and no lost frame function 350 and a voice or sound is generated by output voice unit 370. This output voice unit 370 may be a speaker in communications device 10 discussed in reference to FIG. 1. The lost packet interpolation module 320 then halts execution in operation 530.
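The three branches of FIG. 7 can be condensed into a single per-frame step. The sketch below is one possible reading and not the source's implementation: the class name LostFrameConcealer and the helper ramp are hypothetical, the last good frame is assumed to be kept unmodified as the source for every replacement, each loss is assumed to lower the level by 25%, and the restoration ramp is assumed to climb from the level reached back to 100% within one frame.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

typedef std::vector<int> Frame;

// Scale a frame by a gain moving linearly from g0 (first sample) to g1 (last).
Frame ramp(const Frame& src, float g0, float g1) {
    Frame out(src.size());
    const std::size_t n = src.size();
    for (std::size_t i = 0; i < n; ++i) {
        const float t = (n > 1) ? static_cast<float>(i) / static_cast<float>(n - 1) : 1.0f;
        out[i] = static_cast<int>((g0 + (g1 - g0) * t) * static_cast<float>(src[i]));
    }
    return out;
}

// Illustrative state for lost packet interpolation module 320 (assumed design).
class LostFrameConcealer {
public:
    explicit LostFrameConcealer(Frame first) : output_(std::move(first)), lost_(0) {}

    // One pass through operations 400 to 530. `input` is null when the current
    // input frame is missing. Returns the frame handed to playback.
    Frame step(const Frame* input) {
        if (input == nullptr) {                          // operation 410: absent
            ++lost_;                                     // operation 420
            return ramp(output_, level(lost_ - 1), level(lost_));  // 430, 440
        }
        Frame played = (lost_ > 0)                       // operation 450
            ? ramp(output_, level(lost_), 1.0f)          // operation 460
            : output_;                                   // operation 500
        output_ = *input;                                // operations 480 / 510
        lost_ = 0;                                       // operations 490 / 520
        return played;
    }

    int lost_count() const { return lost_; }

private:
    // Level reached after `losses` consecutive losses (assumed 25% per loss).
    static float level(int losses) {
        return std::max(0.0f, 1.0f - 0.25f * static_cast<float>(losses));
    }

    Frame output_;  // current output frame 200 (last good frame)
    int lost_;      // lost frame counter
};
```

Calling step once per frame period, with a null pointer whenever the input frame is absent, reproduces the fade, hold and restore pattern of time lines 210 through 270 under the assumptions listed above.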

Using the preferred embodiment of the present invention, lost frames of voice data can be replaced to create an excellent substitution of the missing frames which is acoustically pleasing to a human listener. The preferred embodiment of the present invention accomplishes this through a simple and fast executing algorithm which enables the handling of a large number of simultaneous conversations.

While we have shown and described only a few examples herein, it is understood that numerous changes and modifications as known to those skilled in the art could be made to the present invention. For example, reference has been made to the transmission and reception of voice information in the present invention; however, the present invention is not limited to voice information. The present invention may be used for any realtime sound transmission over a packet switched IP network. Further, the present invention may be used to receive sound data in conjunction with video data. Therefore, we do not wish to be limited to the details shown and described herein, but intend to cover all such changes and modifications as are encompassed by the scope of the appended claims.
