PROVIDE TEXT TRANSCRIBED FROM AN UNKNOWN CALLING PARTY'S SPOKEN IDENTIFICATION

Title:

PROVIDE TEXT TRANSCRIBED FROM AN UNKNOWN CALLING PARTY'S SPOKEN IDENTIFICATION

Document Type and Number:

WIPO Patent Application WO/2016/048203

Kind Code:

Abstract:

An enhanced identification service classifies (204) a calling party as a known or unknown party, such as by analysis of metadata (e.g., CDR) or consulting a network address book (119). If the calling party is classified (204) as unknown, it is prompted (208) to provide a brief spoken identification. This voice response is transcribed to text using voice recognition technology. The transcribed text is provided (210) to a device associated with the called party when it is "rung," (206) such as by including it in the "From" field of a SIP message header (from which it may be displayed in a Caller ID notice to the called party). In one embodiment, information relating to the reliability of the transcribed text as an identification of the calling party is maintained, and a recommendation is provided prior to ringing the device associated with the called party as to whether the calling party has a reputation for truthful/helpful identification via the voice response. The called party may update the database after each unknown call by indicating the accuracy of the transcribed text.

Inventors:

WANG KEVEN (SE)
VANDIKAS KONSTANTINOS (SE)
TSIATSIS VLASIOS (SE)

Application Number:

PCT/SE2014/051091

Publication Date:

March 31, 2016

Filing Date:

September 24, 2014

Assignee:

ERICSSON TELEFON AB L M (SE)

International Classes:

H04M3/436

Domestic Patent References:

WO2000005860A1	2000-02-03

Foreign References:

US20070047532A1	2007-03-01
EP1755324A1	2007-02-21

Other References:

KUMAR SRIVASTAVA AND HENNING SCHULZRINNE: "Preventing Spam For SIP-based Instant Messages and Sessions", 28 October 2004 (2004-10-28), Retrieved from the Internet
"Digital cellular telecommunications system (Phase 2+); Universal Mobile Telecommunications System (UMTS); LTE; IP Multimedia Subsystem (IMS); Stage 2 Release 12", 3GPP TS 23.228 VERSION 12.6.0, 22 September 2014 (2014-09-22), Retrieved from the Internet
MUHAMMAD AJMAL AZAD ET AL.: "Caller-REP: Detecting unwanted calls with caller social strength", COMPUTERS & SECURITY #39, November 2013 (2013-11-01)

Attorney, Agent or Firm:

EGRELIUS, Fredrik (Patent Unit Kista DSM, Stockholm, SE)

Claims:

CLAIMS

1. A method (200), performed by a network device (114) operative in a communication network (100), of detecting a potentially unknown calling party and providing identifying information about the calling party, comprising:

receiving (202) a call indication from a device associated with a calling party and initiated by the calling party;

classifying (204) the calling party as being a known or unknown party;

prompting (208) an unknown calling party to provide spoken identifying information; and

providing (210) text transcribed from the unknown calling party's spoken identification to a device associated with the called party (206).

2. The method (200) of claim 1 wherein classifying (204) the calling party as being a known or unknown party comprises inspecting metadata associated with the call, and classifying (204) the calling party as being unknown if no or null identifying information is provided in a calling party identification field.

3. The method (200) of any preceding claim wherein classifying (204) the calling party as being a known or unknown party comprises inspecting metadata associated with the call, and classifying (204) the calling party as being unknown if the metadata indicates the call originates from a geographical distance in excess of a predetermined range from a geographic region associated with the called party.

4. The method (200) of any preceding claim wherein classifying (204) the calling party as being a known or unknown party comprises:

accessing an historical database of metadata related to calls to and by the device associated with the called party;

analyzing the metadata to discover a communication pattern useful in classifying the calling party; and

classifying (204) the calling party as being a known or unknown party based on the communication pattern.

5. The method (200) of claim 4 wherein the communication pattern is the history of prior calls between the devices associated with the called and calling party, and wherein the calling party is classified as unknown if the device associated with the called party has had no calls to or from any device associated with the calling party within a predetermined call history duration.

6. The method (200) of either of claims 4 or 5 wherein the communication pattern comprises the geographical origin of the call and of prior calls to and from the device associated with the called party, and wherein the calling party is classified as unknown if the device associated with the called party has had no calls to or from the geographical origin of the call within a predetermined call history duration.

7. The method (200) of either of claims 5 or 6 wherein the call history duration is for a predetermined time period.

8. The method (200) of any of claims 4-7 wherein the communication pattern is the duration of calls between the devices associated with the called and calling party, and wherein the calling party is classified as unknown if no call from the device associated with the calling party has lasted more than a predefined call length duration.

9. The method (200) of claim 8 wherein the call length duration is thirty seconds.

10. The method (200) of claim 1 wherein classifying (204) the calling party as being a known or unknown party comprises accessing a network address book (119) associated with the called party, and classifying the calling party as being an unknown party if no metadata associated with the calling party is in the network address book (119).

11. The method (200) of claim 1 wherein prompting (208) an unknown calling party to provide spoken identifying information comprises:

sending a message to a multimedia processing function, the message including an identification of the call and directing the multimedia processing function to synthesize and send to a device associated with the calling party a voice prompt requesting the unknown calling party to provide spoken identifying information, and to transcribe the spoken identifying information to text using speech recognition; and

receiving from the multimedia processing function text transcribed from the unknown calling party's spoken identification.

12. The method (200) of claim 1 wherein providing (210) text transcribed from the unknown calling party's spoken identification to the device associated with the called party (206) comprises inserting the transcribed text into metadata associated with the call, and forwarding the call to the device associated with the called party.

13. The method (200) of claim 12 wherein inserting the transcribed text into metadata associated with the call comprises inserting the transcribed text into the From field of a Session Initiation Protocol (SIP) header.

14. The method (200) of claim 1 further comprising:

receiving from the device associated with the called party a request for a calling party identification service; and

registering the called party for calling party identification service; and

wherein the classifying (204), prompting (208), and providing (210) steps are only performed if the called party has previously registered for the calling party identification service.

15. The method (200) of claim 1 further comprising:

receiving from the device associated with the called party an indication of the accuracy of the transcribed text in identifying the calling party; and

updating a reputation database using the accuracy indication.

16. The method (200) of claim 15 further comprising, prior to prompting (208) the unknown calling party to provide spoken identifying information:

searching for the unknown calling party in the reputation database; and

either prompting the unknown calling party to provide spoken identifying information or rejecting the call, based on the results obtained from the reputation database.

17. A network device (114) configured to be operative in a communication network, comprising:

a network communication interface (12) operative to send and receive messages to and from other devices (115, 116, 118, 120, 122) in the network;

memory (16); and

a controller (14) operatively connected to the network communication interface (12) and the memory (16), and operative to:

receive (202) a call indication from a device associated with a calling party directed to a device associated with a called party;

classify (204) the calling party as being a known or unknown party;

prompt (208) an unknown calling party to provide spoken identifying information; and

provide (210) text transcribed from the unknown calling party's spoken identification to the device associated with the called party (206).

18. The network device (114) of claim 17 wherein the communication network (102) includes an IP Multimedia System (IMS), and wherein the network device (114) hosts one or more Call Session Control Functions (CSCF) providing call handling support to the called party.

19. The network device (114) of claim 17 wherein the controller (14) is operative to classify (204) the calling party as being a known or unknown party by:

accessing an historical database of metadata related to calls to and by the device associated with the called party;

analyzing the metadata to discover a communication pattern useful in classifying the calling party; and

classifying the calling party as being a known or unknown party based on the communication pattern.

20. The network device (114) of claim 19 wherein the communication pattern is the history of prior calls between the devices associated with the called and calling party, and wherein the calling party is classified (204) as unknown if the device associated with the called party has had no calls to or from the device associated with the calling party within a predetermined call history duration.

21. The network device (114) of either of claims 19 or 20 wherein the communication pattern comprises the geographical origin of the call and of prior calls to and from the device associated with the called party, and wherein the controller (14) is operative to classify (204) the calling party as unknown if the device associated with the called party has had no calls to or from the geographical origin of the call within a predetermined call history duration.

22. The network device (114) of any of claims 19-21 wherein the communication pattern is the duration of calls between the devices of the called and calling party, and wherein the controller (14) is operative to classify (204) the calling party as unknown if no call from the device associated with the calling party has lasted more than a predefined call length duration.

23. The network device (114) of claim 17 wherein the controller (14) is operative to classify (204) the calling party as being a known or unknown party by accessing a network address book (119) associated with the called party, and classify (204) the calling party as being an unknown party if no metadata associated with the calling party is in the network address book (119).

24. The network device (114) of claim 17 wherein the controller (14) is operative to prompt (208) an unknown calling party to provide spoken identifying information by:

sending a message to a multimedia processing function, the message including an identification of the call and directing the multimedia processing function to synthesize a voice prompt requesting the unknown caller to provide spoken identifying information and to transcribe the spoken identifying information to text using speech recognition; and

receiving from the multimedia processing function text transcribed from the unknown calling party's spoken identification.

25. The network device (114) of claim 24 wherein the multimedia processing function implements a Media Resource Function for an IP Multimedia System.

26. The network device (114) of claim 17 wherein the controller (14) is operative to provide (210) text transcribed from the unknown calling party's spoken identification to the device associated with the called party by inserting the transcribed text into metadata associated with the call, and forwarding the call to the device associated with the called party (206).

27. The network device (114) of claim 26 wherein the controller (14) is operative to insert (210) the transcribed text into metadata associated with the call by inserting the transcribed text into the From field of a Session Initiation Protocol (SIP) header.

28. The network device (114) of claim 17 wherein the controller (14) is further operative to:

receive from the device associated with the called party a request for a calling party identification service; and

wherein the controller is operative to perform the classifying, prompting, and providing steps if the called party has previously registered for the calling party identification service.

29. The network device (114) of claim 17 wherein the controller (14) is further operative to:

receive from the device associated with the called party an indication of the accuracy of the transcribed text in identifying the calling party; and

update a reputation database using the accuracy indication.

30. The network device (114) of claim 29 wherein the controller (14) is further operative to, prior to prompting (208) the unknown calling party to provide spoken identifying information:

search for the unknown calling party in the reputation database; and

either prompt the unknown calling party to provide spoken identifying information or reject the call, based on the results obtained from the reputation database.

31. The network device (114) of either of claims 29 or 30 wherein the reputation database is maintained in a Home Subscriber Server in the communication network (102).

32. A communication system (102) comprising:

a network device (114) hosting one or more Call Session Control Functions (CSCF) operative to provide call handling support to a called party;

a network device (118) hosting a Media Resource Function (MRF) operative to provide media related functions; and

one or more network devices (116) hosting application servers;

wherein the CSCF is operative to:

receive (202) an indication of a call from a device of a calling party directed to a device associated with the called party;

wherein the CSCF or one or more application servers is operative to:

classify (204) the calling party as being a known or unknown party;

if the calling party is classified as unknown, direct the MRF to prompt the calling party to provide spoken identifying information; and

receive from the MRF text transcribed from the unknown calling party's spoken identification; and

wherein the CSCF is further operative to:

provide (210) the device associated with the called party with the transcribed text upon alerting the called party of the call (206); and

wherein the MRF is operative to:

receive information identifying the called party from the CSCF;

play a voice prompt to the calling party requesting a brief voice identification;

receive a voice response from the device of the calling party;

transcribe the voice response to text using speech recognition technology; and

send the transcribed text to the CSCF.

33. The system (102) of claim 32 further comprising:

a network device (122) hosting a Home Subscriber Server (HSS); and

wherein the CSCF is operative to:

upon receiving (202) an indication of a call from the device associated with a calling party, contact the HSS to ascertain whether the called party is registered for enhanced identification service; and

if the called party is not registered for enhanced identification service, suppress the communication with the MRF; and

wherein the HSS is operative to:

maintain subscriber registrations for enhanced identification service; and

respond to a query from the CSCF whether the called party is registered for enhanced identification service.

34. The system (102) of claim 32 further comprising:

a network device (122) hosting a Home Subscriber Server (HSS); and

a network device (120) hosting a Policy Management function;

wherein the HSS is operative to:

maintain reputation information indicating the accuracy of identification information provided by devices associated with calling parties; and

provide the Policy Management function with reputation information associated with a requested party; and

wherein the Policy Management function is operative to:

retrieve reputation information from the HSS for an unknown calling party;

analyze the reputation information and make a recommendation regarding handling a call from the device associated with the calling party; and

send the recommendation to the CSCF; and

wherein the CSCF is further operative to:

receive reputation information from the device associated with a called party at the termination of a call with the device associated with an unknown calling party; and

send the reputation information and an indication of the calling party to the HSS.

35. A non-transitory computer readable medium (16) having stored thereon a computer program product (18) for detecting a potentially unknown calling party and providing identifying information about the calling party, the computer program product (18) comprising program instructions operative to cause a computing device (14) to perform the following steps:

receive (202) a call indication from a device associated with a calling party and initiated by the calling party;

classify (204) the calling party as being a known or unknown party;

prompt (208) an unknown calling party to provide spoken identifying information; and

provide (210) text transcribed from the unknown calling party's spoken identification to the device associated with the called party (206).

36. A network device (114) hosting one or more functional modules operative to provide call handling support to a called party in a communication network (102), comprising:

a call receiving module (130) for receiving a call from a device associated with a calling party and initiated by the calling party;

a known party determination module (132) determining whether the calling party is known or unknown;

a prompting module (134) for prompting an unknown calling party to provide spoken identifying information;

a call modification module (136) for inserting text transcribed from the unknown calling party's spoken identification into a header field associated with the call; and

a call forwarding module (138) for forwarding the call to a device associated with the called party.

37. A computer program (18) for detecting a potentially unknown calling party and providing identifying information about the calling party, the computer program (18) comprising program instructions which, when run on a network device (114), cause the network device (114) to perform the following steps:

receive (202) a call indication from a device associated with a calling party and initiated by the calling party;

classify (204) the calling party as being a known or unknown party;

Description:

PROVIDE TEXT TRANSCRIBED FROM AN UNKNOWN CALLING PARTY'S SPOKEN IDENTIFICATION

TECHNICAL FIELD

The invention relates generally to communication networks, and in particular to the detection of calls from devices that are associated with unknown callers and, e.g., to a network device that obtains spoken identification information from a device of an unknown caller and provides transcribed text of the information to a called party device.

BACKGROUND

The telephone is simultaneously one of the greatest communication tools in history, and one of the greatest sources of annoyance. Since deployment of the first telephone exchanges in the late 19th century, individuals have sought ways to screen incoming calls. Businesses accomplished this by having secretarial staff answer and screen all calls, but such screening remained beyond the means of most individuals outside the office setting. The advent of caller identification (caller ID) in the early 1970s provided the first generally available means for ordinary users to ascertain the source of a call prior to answering it. The technology to block or "spoof" caller ID quickly followed.

With the evolution of telephone technology to the digital realm and packet-based networks, e.g. Voice over IP (VoIP), a large amount of information is now associated with each call routed through a communication network. This information may comprise attributes of each specific, individual call (e.g., voice call, SMS message, multimedia session, or other communication transaction), such as the numbers of the calling and called parties' devices, an identification of the calling party (which may be a business or organization name), and the like. Information is also generated about each call in response to its traversal through the network, such as the type of call, the time and duration of the call, and the like. This information is referred to broadly herein as "metadata" - that is, data about data. Metadata is information about the call, and is distinct from the data that makes up the call itself, such as VoIP data, the text of an SMS message, or the like. Metadata may relate to the type or character of a call, such as an emergency E-112 or E-911 call; it may be used for the generation of telephone bills to subscribers; or it may identify and/or include information about the called and calling parties.

Modern telephone terminals, such as smartphones, include sophisticated caller ID features. For example, a terminal may index a local or network-based address book or contacts list using the calling party number, and retrieve a name, company, and even photograph of the calling party, which are displayed on the screen when the phone "rings." It is also known in the art for a calling party device to provide metadata indicating the purpose or context of the call. For example, U.S. Application Publication No. 2007/0047726 describes a dialing application used by a calling party device to place a call to a called party device. The calling party is prompted to enter a text or voice annotation describing the nature of the desired conversation. This information is delivered to the called party device upon the phone ringing. The called party may then accept the call, or defer it to voicemail, in response to the information about the nature of the call.

However, even with all the metadata associated with most calls in modern communication networks, and with increasingly sophisticated telephone terminal equipment, individuals are still beset by calls from devices associated with unknown or undesirable parties. Some individuals block or spoof calling party information with the express intent of hiding or falsifying the source of the call. However, calling party information may also be blocked or changed beyond a calling party's control. For example, some companies either block the calling party number and other calling party identification information, or substitute some corporate name for it. In other cases, multiple parties may use the same telephone (e.g., members of the same household), and an indication of which individual is placing the call would be beneficial.

Individuals are free to ignore incoming calls which lack proper identification; however, due to the many cases in which the lack of such identification is not necessarily nefarious, they may hesitate to reject all such calls. There exist in the art some client-side applications that enforce predefined rules, such as rejecting all calls not in specified address books, or displaying geographic information to indicate whether the call is local or long distance. However, these systems work only with metadata as it exists at the time a call is received, and such metadata may be incomplete, or may have been blocked or spoofed. Hence, the called party remains unsure of the identification of the calling party, and does not have sufficient information to intelligently decide whether to accept or reject the call. This decision is of particular import during inconvenient times, such as meetings, formal dinners, and the like.

The Background section of this document is provided to place embodiments of the present invention in technological and operational context, to assist those of skill in the art in understanding their scope and utility. Unless explicitly identified as such, no statement herein is admitted to be prior art merely by its inclusion in the Background section.

SUMMARY

The following presents a simplified summary of the disclosure in order to provide a basic understanding to those of skill in the art. This summary is not an extensive overview of the disclosure and is not intended to identify key/critical elements of embodiments of the invention or to delineate the scope of the invention. The sole purpose of this summary is to present some concepts disclosed herein in a simplified form as a prelude to the more detailed description that is presented later.

It is an object of the invention to overcome or limit at least some disadvantages of the prior art. In particular, although it is known to annotate a call with metadata provided by a device associated with a calling party, i.e., in the form of a brief text or voice message, there is still no known means to ascertain the identification of a calling party, which may in many cases be obscured or unhelpful in identifying an individual.

According to one or more embodiments disclosed and claimed herein, an enhanced identification service classifies a calling party as a known or unknown party, such as by analysis of metadata (e.g., CDR) or consulting other network functionality, such as a network address book. If the calling party is classified as unknown, he or she is prompted to provide a brief spoken identification. This voice response is transcribed to text using voice recognition technology. The transcribed text is provided to a device of the called party when it is "rung," such as by including it in the "From" field of a SIP message header (from which it may be displayed in a Caller ID notice to the device of the called party). In one embodiment, information relating to the reliability of the transcribed text as an identification of the calling party is maintained, and a recommendation is provided prior to ringing the device of the called party as to whether the calling party has a reputation for truthful/helpful identification via the voice response. The network device handling the call may update the database after each unknown call by indicating the accuracy of the transcribed text, based on feedback from the called party.
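As a non-limiting illustration of the classification step, the following minimal sketch combines three of the rules recited in the claims: a missing or null calling party identification, absence from the called party's address book, and absence of any sufficiently long recent call. The names CallRecord and classify_caller, the 90-day history window, and the way the rules are combined are assumptions made for illustration only; the thirty-second call length follows claim 9. For brevity the sketch considers only calls placed by the calling party.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class CallRecord:
    """One historical call to the called party (hypothetical CDR-like record)."""
    caller_id: str
    started_at: datetime
    duration_s: float


def classify_caller(
    caller_id: Optional[str],
    address_book: Set[str],
    history: Iterable[CallRecord],
    now: datetime,
    history_window: timedelta = timedelta(days=90),  # assumed value
    min_call_length_s: float = 30.0,  # thirty seconds, as in claim 9
) -> str:
    """Return "known" or "unknown" for an incoming call."""
    # No or null calling party identification: treat the caller as unknown.
    if not caller_id:
        return "unknown"
    # A caller listed in the called party's address book is known.
    if caller_id in address_book:
        return "known"
    # Otherwise require a recent call from this caller that lasted longer
    # than the minimum call length.
    cutoff = now - history_window
    for record in history:
        if (record.caller_id == caller_id
                and record.started_at >= cutoff
                and record.duration_s > min_call_length_s):
            return "known"
    return "unknown"
```

An actual embodiment may weigh these rules differently, or add others such as the geographical origin of the call.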

One embodiment relates to a method, performed by a network device, of detecting a potentially unknown calling party and providing identifying information about the calling party. A call is received from a device associated with a calling party and that was initiated by the calling party. The calling party is classified as being a known or unknown party. An unknown calling party is prompted to provide spoken identifying information. Text transcribed from the unknown calling party's spoken identification is provided to the device of the called party.
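The steps of this method can be pictured with the following minimal sketch. It is an illustration under stated assumptions, not the claimed implementation: the function name handle_incoming_call and its three callables are hypothetical, and voice prompting and speech recognition are represented by a stub supplied by the caller of the function.

```python
from typing import Callable, Optional


def handle_incoming_call(
    caller_id: Optional[str],
    is_known: Callable[[Optional[str]], bool],
    prompt_and_transcribe: Callable[[], str],
    ring_called_party: Callable[[str], None],
) -> str:
    """Run receive -> classify -> prompt -> provide for one call.

    Returns the identification text delivered to the called party's device.
    """
    if is_known(caller_id):
        # Known caller: pass the existing identification through unchanged.
        identification = caller_id or ""
    else:
        # Unknown caller: request a spoken identification and use its
        # transcription (speech recognition is stubbed by the callable).
        identification = prompt_and_transcribe()
    # Deliver the identification when the called party's device is rung.
    ring_called_party(identification)
    return identification
```

Passing the classification, prompting, and ringing behavior as callables mirrors the way the description allows these functions to reside in different network nodes.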

Another embodiment relates to a network device operative in a communication network. The device includes a network communication interface operative to send and receive messages to and from other devices in the network, memory, and a controller operatively connected to the network communication interface and the memory. The controller is operative to receive a call indication from the device of a calling party directed to the device of a called party; classify the calling party as being a known or unknown party; prompt an unknown calling party to provide spoken identifying information; and provide text transcribed from the unknown calling party's spoken identification to the device of the called party.

Yet another embodiment relates to a communication system including an IMS. The network includes a network device hosting one or more Call Session Control Functions (CSCF) for a called party, a network device hosting a Media Resource Function (MRF), and one or more network devices hosting application servers. The CSCF is operative to receive an indication of a call from the device of a calling party directed to the device of a called party. The CSCF or one or more application servers is operative to classify the calling party as being a known or unknown party; if the calling party is classified as unknown, direct the MRF to prompt the calling party to provide spoken identifying information; and receive from the MRF text transcribed from the unknown calling party's spoken identification. The CSCF is further operative to provide the device of the called party with the transcribed text upon alerting the called party of the call. The MRF is operative to receive information identifying the called party from the CSCF; play a voice prompt to the calling party requesting a brief voice identification; receive a voice response from the calling party; transcribe the voice response to text using speech recognition technology; and send the transcribed text to the CSCF.

Still another embodiment relates to a non-transitory computer readable medium having stored thereon a computer program product for detecting a potentially unknown calling party and providing identifying information about the calling party, the computer program product comprising program instructions operative to cause a computing device to perform the steps of: receiving a call indication from a device associated with a calling party and initiated by the calling party; classifying the calling party as being a known or unknown party; prompting an unknown calling party to provide spoken identifying information; and providing text transcribed from the unknown calling party's spoken identification to the device of the called party.

A still further embodiment relates to a network device hosting one or more functional modules operative to provide call handling support to a called party in a communication network. The functional modules include a call receiving module for receiving a call from a device associated with a calling party and initiated by the calling party; a known party determination module determining whether the calling party is known or unknown; a prompting module for prompting an unknown calling party to provide spoken identifying information; a call modification module for inserting text transcribed from the unknown calling party's spoken identification into a header field associated with the call; and a call forwarding module for forwarding the call to the device of the called party.
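One way the call modification module might place the transcribed text in a header field is sketched below, assuming the text is carried as the quoted display name of a SIP From header. That choice, the function name build_from_header, and the example URI and tag values are assumptions for illustration; the disclosure states only that the text is inserted into the From field.

```python
def build_from_header(transcribed_text: str, caller_uri: str, tag: str) -> str:
    """Build a SIP From header line carrying the text as the display name."""
    # Collapse all whitespace (including CR/LF) so the header stays on one line.
    text = " ".join(transcribed_text.split())
    # Escape backslashes and double quotes for use inside a quoted string.
    text = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'From: "{text}" <{caller_uri}>;tag={tag}'


if __name__ == "__main__":
    # Example values only; the URI and tag are placeholders.
    print(build_from_header("Anna from the dental clinic",
                            "sip:anonymous@anonymous.invalid", "a6c85cf"))
```

Normalizing whitespace and escaping quotes matters here because the text originates from speech recognition of an untrusted caller and must not be able to break or extend the header.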

Another disclosed embodiment is a computer program for detecting a potentially unknown calling party, and providing identifying information about the calling party to the device of a called party. In this embodiment, the computer program comprises program instructions which, when run on a network device, cause the network device to receive a call indication from a device associated with a calling party. The call is initiated by the calling party. The program instructions, when run on the network device, also cause the network device to classify the calling party as being a known or unknown party, prompt an unknown calling party to provide spoken identifying information, and provide text transcribed from the unknown calling party's spoken identification to the device of the called party.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention will now be described more fully hereinafter with reference to the accompanying drawings, in which embodiments of the invention are shown. However, this invention should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art. Depictions of the same or similar elements in different drawing figures are assigned the same reference numeral.

Figure 1 is a functional block diagram of relevant portions of interworking telecommunication networks.

Figures 2A and 2B are call diagrams of a method of enhanced identification.

Figure 3 is a sequence diagram of a method of detecting a potentially unknown caller and providing identifying information about the caller.

Figure 4 is a functional block diagram of a network device according to one embodiment, and

Figure 5 is a functional module diagram of a network device according to one embodiment.

DETAILED DESCRIPTION

It should be understood at the outset that although illustrative implementations of one or more embodiments of the disclosure are provided below, the disclosed systems and/or methods may be implemented using any number of techniques, whether currently known or in existence. The disclosure should in no way be limited to the illustrative implementations, drawings, and techniques illustrated below, including the exemplary designs and implementations illustrated and described herein, but may be modified within the scope of the appended claims along with their full scope of equivalents.

Although embodiments of the invention may be implemented in a variety of communication networks, one specific example of a network environment in which the invention may be advantageous is presented herein, to provide a fully enabling disclosure of the invention. However, it should be noted that the invention is not in any way limited to being implemented in this specific network type.

Figure 1 depicts a network 100, in which only relevant nodes and functions are depicted, for the purpose of discussion herein. The network 100 includes network devices 102 hosting functions that comprise a core network and peripheral services of a wireless communication network, such as UMTS. The wireless network devices 102 communicate via a Radio Access Network (RAN) 104, such as Long Term Evolution (LTE), to a plurality of subscriber User Equipment (UE) 106. As well known in the art, the RAN 104 includes various radio frequency (RF) devices (not shown) required to effect communication with mobile UE 106.

The network devices 102 also communicate via one or more telecommunication networks 108, and possibly through subsequent networks such as the Public Switched Telephone Network (PSTN) 110, the Internet (not shown), or the like, to a landline terminal 112. Of course this particular network configuration 100 is representative only.

The numerous network devices 102 that comprise the core network and associated peripheral devices and functions may implement the function of an IP Multimedia Subsystem (IMS), which is a general-purpose, open industry standard for voice and multimedia communications over packet-based IP networks. Communications between devices hosting functions within the IMS network utilize the Session Initiation Protocol (SIP). SIP is a signaling protocol for Internet conferencing, telephony, presence, events notification, instant messaging, and the like.

An IMS network implements multimedia sessions - referred to generically as "calls" - via Call Session Control Functions (CSCF), represented in Figure 1 by a single network device 114 hosting CSCF functionality. In any particular implementation, the CSCF function may comprise a number of functions, which may be implemented on different network devices. Such CSCF functions may comprise a Serving CSCF (S-CSCF) that initiates, manages, and terminates multimedia sessions; and an optional Interrogating-CSCF (I-CSCF), which is a SIP proxy located at the edge of an administrative domain, and which may be connected to a Proxy-CSCF (P-CSCF), which is a SIP proxy that is the first point of contact to the IMS. For simplicity of explanation and without loss of generality, all relevant CSCF functionality is represented as being hosted on network device 114. Furthermore, as described more fully herein, the CSCF functionality on the network device 114 may perform other functions than call handling, such as maintaining Call Data Records, ascertaining whether a calling party is unknown, prompting a calling party to provide spoken identification information, and the like. In embodiments described herein, the CSCF function on network device 114 performs these functions; however, in an IMS system, such functionality may be distributed. For example, one or more application servers 116 may perform one, some, or all of these functions.

Additional network devices host functions useful to the communication network 102. Various application servers (AS) hosted by network devices 116a, 116b, 116c, provide services such as email, file sharing, media services, location services, and the like. Additionally or alternatively, such application servers may provide telephony related functionality to augment the functions provided by the CSCF. For example, one particular AS function of relevance herein is a Media Resource Function (MRF), hosted by network device 118. The MRF is a service, defined in IMS, that provides media related functions such as media manipulation (e.g., voice stream mixing) and playing of tones and announcements. Another AS function of relevance is a Policy Management function, hosted by network device 120. As explained in greater detail herein, the Policy Management function provides guidance, in some cases, to a CSCF function as to how to handle an incoming call. The Policy Management function may access a Home Subscriber Server (HSS) database, hosted by network device 122. As well known, the HSS is a database that contains user-related and subscriber-related information. It also provides support functions in mobility management, call and session setup, user authentication and access authorization, and the like.

Another network device 115 hosts one or more functions that maintain address books 119 for subscribers. In one embodiment, the network device 115 may host a Presence and Group Management (PGM) function 117, which tracks and conveys the availability of subscribers for communication services. In other embodiments, different functions hosted on the network device 115 may maintain subscriber network address books 119.

Numerous other network devices, not depicted in Figure 1, may host numerous functions necessary or ancillary to the network devices 102.

According to one embodiment of the invention, upon receiving a call, an appropriate network function, such as a CSCF hosted on network device 114, determines whether the calling party is known or unknown. One straightforward way to make this determination is to inspect metadata associated with the incoming call. For example, if the calling party number or identifier (e.g., the From field of a SIP header) is empty or contains null data, the calling party may be classified as unknown.

Similarly, geographical information may be extracted, such as by accessing a location service hosted on a network device 116a-c. The geographical information about the calling party may be used in various ways to determine whether the calling party is unknown. For example, the calling party may be classified as unknown if the call originated from greater than a predetermined geographical distance from the called party's office, region, home network, or similar geographical classification. As another example, all calls from certain countries or regions may be deemed to originate from the device of an unknown calling party. As yet another example, an historic call list may be maintained for each called party, including the geographic location of the origin of the calls. In this case, a calling party calling from a country or region a predetermined distance from the called party's home area, that is additionally a country or region from which the called party has not received a call in a predetermined duration, may be deemed of unknown origin. Those of skill in the art may readily define other rules using geographical information that may indicate an unknown calling party.
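
By way of non-limiting illustration, the two simple metadata rules described above (an empty From field, and a call origin beyond a predetermined distance) may be sketched as follows. The function names, the 500 km threshold, and the use of latitude/longitude pairs are assumptions made for this example only; the disclosure does not prescribe any of them.

```python
import math
from typing import Optional, Tuple

# Hypothetical value; the text only requires "a predetermined geographical distance".
MAX_DISTANCE_KM = 500.0

Point = Tuple[float, float]  # (latitude, longitude) in degrees


def distance_km(a: Point, b: Point) -> float:
    """Great-circle (haversine) distance between two points, in kilometres."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def is_unknown_by_metadata(from_field: Optional[str],
                           caller_location: Optional[Point],
                           called_home: Point) -> bool:
    """Return True if either simple rule marks the calling party as unknown."""
    # Rule 1: the From field is empty or contains null data.
    if from_field is None or not from_field.strip():
        return True
    # Rule 2: the call originates farther away than the predetermined distance.
    if caller_location is not None:
        return distance_km(caller_location, called_home) > MAX_DISTANCE_KM
    return False
```

A deployment might equally apply country or region lists instead of a distance; the sketch covers only the distance rule.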

However, such simple inspections will, in general, fail to meaningfully determine whether the calling party is known to the called party. It may be that the called party regularly converses with one or more individuals whose calls emanate from a domain that routinely blocks caller ID information; these individuals are known to the called party, even if their identification is hidden. Conversely, a call may arrive from someone who appears to identify himself - that is, the From header contains a plausible name - but who is completely unknown to the called party. These so-called "cold calls" are often from telemarketers, and the called party may not wish to answer the call. Accordingly, in one embodiment, a determination is made whether a received call is from a device associated with a known or unknown calling party. As used herein the terms "known" and "unknown" mean known or unknown to the called party, respectively.

In one embodiment, the CSCF consults an historical contact list for an identifier of the device associated with the called party, and classifies the calling party as known or unknown. The historical contact list may be constructed from Call Data Records (CDR). As calls are propagated through the network, metadata associated with the call, which is necessary or useful in billing, is recorded. This metadata is maintained in one or more network devices, such as at each CSCF, in CDRs (also known as Charging Data Records). CDR information is a product of the operation of telephony exchange equipment, and is normally used for generating telephone bills to subscribers. As such, CDR information contains attributes of each specific, individual call, such as the identity of calling party, the identity of called party, the time and duration of the call, and the like. For example, the metadata stored in CDRs may include:

· Phone number associated with the device of the subscriber originating the call (calling party)
· Phone number associated with the device receiving the call (called party)
· Starting time of the call (date and time)
· Call duration
· Billing phone number that is charged for the call
· Identification of the telephone exchange or equipment writing the record
· A unique sequence number identifying the record
· Additional digits on the called number used to route or charge the call
· Disposition or the results of the call, indicating, for example, whether or not the call was connected
· Route by which the call entered the exchange
· Route by which the call left the exchange
· Call type (voice, SMS, etc.)
· Any fault condition encountered

In one embodiment, for each subscriber that registers for enhanced calling party identification service, the CSCF builds an historical contact list from CDRs. The historical contact list is a list comprising metadata regarding all calls that the subscriber has made, and all calls placed to the subscriber. In another embodiment, an historical contact list may be constructed "on the fly" by dynamically searching the CDRs and extracting metadata for the calls from or to a particular subscriber.
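
As a minimal sketch of the two construction strategies just described, the following example builds per-subscriber historical contact lists from a collection of CDRs, either in advance or "on the fly". The `CallRecord` type and its field names are hypothetical and carry only a few of the CDR attributes listed above; they do not reflect any standardized CDR layout.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class CallRecord:
    """A reduced, illustrative CDR (field names are assumptions)."""
    calling_number: str
    called_number: str
    start_time: datetime
    duration_s: int
    connected: bool


def build_contact_lists(cdrs: Iterable[CallRecord]) -> Dict[str, List[CallRecord]]:
    """Index each record under both parties, so that a subscriber's list holds
    calls the subscriber made as well as calls placed to the subscriber."""
    lists: Dict[str, List[CallRecord]] = defaultdict(list)
    for rec in cdrs:
        lists[rec.calling_number].append(rec)
        if rec.called_number != rec.calling_number:
            lists[rec.called_number].append(rec)
    return lists


def contact_list_on_the_fly(cdrs: Iterable[CallRecord], subscriber: str) -> List[CallRecord]:
    """Alternative: search the CDRs dynamically for a single subscriber."""
    return [r for r in cdrs if subscriber in (r.calling_number, r.called_number)]
```

The prebuilt index trades memory for lookup speed, while the dynamic search keeps no per-subscriber state; which is preferable would depend on the volume of CDRs held at the network device.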

However constructed, the CSCF analyzes the historical call list to discover a communication pattern useful for classifying a calling party as a known or unknown party. One representative communication pattern may be the history of prior communication with the called party. For example, if the called party has had no calls to or from the calling party device (as well as can be ascertained from the available metadata) within a predetermined call history duration, the calling party may be deemed unknown. As another example, if the called party has had no communication with any party in the calling party's country or other geographic region within a predetermined call history duration, the calling party may be deemed unknown. In both cases, the predetermined call history duration may be, for example, three months. As yet another example, a communication pattern may be the duration of past calls between the called and calling party. If no such past call has lasted more than a predefined call length duration - such as thirty seconds, for example - the calling party may be assumed to be a salesperson or the like, and may be deemed an unknown party for the purpose of obtaining further identification information according to embodiments of the invention.
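
The history-based patterns above may be illustrated by the following sketch, which applies the example values from the text (a three-month call history duration and a thirty-second call length duration). The `PastCall` type, the approximation of three months as 90 days, and the choice to apply the call-length rule only to calls inside the history window are assumptions of this example.

```python
from datetime import datetime, timedelta
from typing import List, NamedTuple


class PastCall(NamedTuple):
    """One entry of the called party's historical call list (illustrative)."""
    peer_number: str      # the other party to the call
    start_time: datetime
    duration_s: int


HISTORY_WINDOW = timedelta(days=90)  # "three months", approximated as 90 days
MIN_CALL_LENGTH_S = 30               # "thirty seconds"


def classify_by_history(history: List[PastCall], caller: str, now: datetime) -> str:
    """Return "known" or "unknown" for the caller, from the called party's history."""
    recent = [c for c in history
              if c.peer_number == caller and now - c.start_time <= HISTORY_WINDOW]
    # Pattern 1: no calls to or from the caller within the history window.
    if not recent:
        return "unknown"
    # Pattern 2: no past call lasted more than the predefined call length.
    if all(c.duration_s <= MIN_CALL_LENGTH_S for c in recent):
        return "unknown"
    return "known"
```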

Alternatively, in one embodiment, the CSCF classifies the calling party as known or unknown by accessing a network address book 119 associated with the called party. As one example, the CSCF may access a called party's network address book from a function hosted on network device 115 that maintains the address book 119. In one embodiment, such a function may comprise a PGM function 117. However accessed, metadata may be obtained from such a network address book 119 associated with the called party, which may be analyzed similarly to metadata from an historical call list, as described above.

Whether the metadata is obtained from an historical call list or from a network address book, the classification of a calling party is, in one embodiment, a binary decision: the calling party is known or unknown. In another embodiment, the classification takes the form of a probability, for example, expressed as a number from 0 to 1 indicating a confidence level of the classification as a known or unknown party.

If a calling party is classified with sufficient confidence as a known party, the CSCF forwards the call to the device of the called party. On the other hand, if the calling party is classified as unknown (or classified as known but with insufficient confidence), the calling party is prompted to provide additional identifying information. To provide the most natural and widely applicable interface, in one embodiment the prompt is a voice prompt (e.g., prerecorded or synthesized), and the calling party is prompted to speak its identifying information. This information is transcribed to text using speech recognition technology.
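
The confidence-based handling described above may be sketched as a simple threshold test. In this example the classification is assumed to be expressed as the probability that the calling party is known, and the threshold of 0.8 is a hypothetical value; the disclosure specifies neither.

```python
from enum import Enum


class Action(Enum):
    FORWARD = "forward"  # forward the call to the device of the called party
    PROMPT = "prompt"    # prompt the calling party for spoken identification


# Hypothetical value; the text only requires "sufficient confidence".
CONFIDENCE_THRESHOLD = 0.8


def route_call(p_known: float, threshold: float = CONFIDENCE_THRESHOLD) -> Action:
    """Forward only if the caller is classified as known with sufficient confidence."""
    if not 0.0 <= p_known <= 1.0:
        raise ValueError("p_known must lie between 0 and 1")
    return Action.FORWARD if p_known >= threshold else Action.PROMPT
```

A binary classification is the special case in which `p_known` is either 0 or 1.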

Accepting identifying information from the calling party as speech presents numerous advantages. Most calls are placed from terminals that do not have a keyboard. Even on tablet computers and smartphones, for example, that can display "soft" keyboards and capture text, modification to most voice call software programs (applications, or "apps") would be required to activate such functionality. While most telephone terminals do have a numeric keypad, generating text from such keypads is notoriously difficult, particularly for infrequent users of the techniques. A voice prompt and spoken reply, on the other hand, is a natural interface in the context of a voice telephone. Its only significant limitation is language, and it is well known in the art that both voice synthesis and speech recognition may be tailored to different languages.

Speech recognition is the automated (machine) understanding of human speech. One speech recognition technology of particular relevance is Speech-To-Text (STT), which is the translation of spoken words into text. Speech recognition is a very complex problem, and is an area of active research and technology development. Generally, the accuracy of speech recognition varies with the following factors:

· Vocabulary size and confusability
· Speaker dependence vs. independence
· Isolated, discontinuous, or continuous speech
· Task and language constraints
· Read vs. spontaneous speech
· Adverse conditions

In the particular application of transcribing identification information spoken by a calling party over a telephone call, some of these parameters are fixed in a way that is likely to yield high recognition accuracy and require fewer computational resources than is the case in many other speech recognition applications. For example, the vocabulary can be limited to only a subset of nouns and proper nouns (i.e., people's names). Also, telephones, including mobile phones, typically have very good isolation from surrounding ambient noise. Moreover, the duration of speech is necessarily very short (i.e., 2-5 words) so that it will fit into a SIP header, and can be displayed on a small screen. Additionally, the speech is continuous.

A number of speech recognition algorithms have been developed, and products are available that implement them. Examples include Hidden Markov models, Dynamic time warping (DTW)-based speech recognition, and neural networks. Given the teachings of the present disclosure, those of skill in the art may select and implement a particular speech recognition approach, as required or desired for any particular implementation. Accordingly, further details of speech recognition are not germane to explication of embodiments of the invention, and are not discussed further herein.

Although, as discussed above, an individual's caller ID information can be blocked by means beyond their control (e.g., company-wide systems), it is clear that others deliberately hide or spoof the calling party information that attempts to identify them. It can be easily predicted that some individuals will simply speak a lie when prompted for identifying information, effectively misrepresenting themselves as someone else. While this cannot be prevented, in one embodiment, a reputation system attempts to identify such individuals, and provide for future calls an indication of their reputation for truthfully identifying themselves. The reputation system collects feedback from the called party following each call for which an unknown calling party was prompted to provide identifying information. The called party can indicate whether, or to what extent, the additional identifying information was truthful and accurate.

In one embodiment, a mobile application on the called party phone prompts the called party, after the call terminates, to add the calling party to a local address book. The application may automatically populate the calling party number, and rather than ask the called party for the caller's name (as may comprise current state of the art) the calling party's name may be populated from the text that was transcribed from the calling party's speech. At this time, the mobile application may also prompt the called party to provide feedback indicating whether identifying information spoken by the calling party aligned with that party's actual identity. Such feedback may be forwarded to, e.g., the Policy Management function hosted on network device 120, or directly to a reputation database, such as may form part of HSS on network device 122.

In one embodiment, the Policy Management function hosted on network device 120 is consulted during call setup. The Policy Management function is responsible for proposing an action by utilizing the unknown calling party's reputation data, in addition to information such as the called party's predefined preferences. In one embodiment, the Policy Management function accesses the calling party's reputation data from a reputation database, which may form part of the HSS hosted on network device 122. In other embodiments, the reputation database may be maintained by an AS function hosted on another network device 116a-c, or the Policy Management function may maintain the reputation database directly on the network device 120. In general, the reputation database may be implemented using any known database management system (e.g., SQL, dBASE, DB2, Access, or the like). At a minimum, the reputation database should maintain, for each calling party, a calling party ID and called-party-provided feedback as to the calling party's truthfulness in identifying himself or herself prior to the call. For example, if the reputation of a calling party is low - meaning the individual's identification speech did not align closely with the called party's assessment of his identity - then the function handling call setup, such as CSCF, may block the call.
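
As one possible illustration of the minimum content of such a reputation database, the following sketch uses Python's standard `sqlite3` module. The table name, the column names, and the reduction of feedback to a single truthful/untruthful flag are assumptions of this example; the disclosure leaves the schema and the database management system open.

```python
import sqlite3
from typing import Optional


def open_reputation_db(path: str = ":memory:") -> sqlite3.Connection:
    """Create (if needed) a table holding a calling party ID and the called
    party's feedback on the truthfulness of the spoken identification."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS feedback ("
        " calling_party_id TEXT NOT NULL,"
        " truthful INTEGER NOT NULL CHECK (truthful IN (0, 1)))"
    )
    return conn


def record_feedback(conn: sqlite3.Connection, calling_party_id: str, truthful: bool) -> None:
    """Store one item of called-party feedback after a call."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO feedback (calling_party_id, truthful) VALUES (?, ?)",
            (calling_party_id, int(truthful)),
        )


def reputation(conn: sqlite3.Connection, calling_party_id: str) -> Optional[float]:
    """Fraction of feedback marked truthful, or None if no feedback exists."""
    avg, count = conn.execute(
        "SELECT AVG(truthful), COUNT(*) FROM feedback WHERE calling_party_id = ?",
        (calling_party_id,),
    ).fetchone()
    return avg if count else None
```

Returning `None` for a calling party without feedback keeps "no reputation" distinct from "low reputation", which a function handling call setup may wish to treat differently.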

Figure 2 depicts a representative call flow in one embodiment of the invention. This example is presented in the context of an IMS network, and some of the messages sent between network entities are SIP messages. Other messages may utilize other known network messaging protocols, such as Diameter, an authentication, authorization, and accounting (AAA) protocol. This example is for illustrative purposes only; the invention is not limited to this environment, and may be mapped to any communication network by those of skill in the art.

Initially, the called party registers for an enhanced identification service (event 1), and the CSCF updates the user's subscription (event 2), for example in the Home Subscriber Server (HSS). In an alternate embodiment, in which the enhanced identification service is implemented throughout the network and is not a subscription service, these steps may be omitted.

A calling party places a call towards the device of the called party, which may for example comprise a SIP Invite message (event 3), and the CSCF responds with a SIP 100 Trying message (event 4). The CSCF classifies the calling party as being a known or unknown party (event 5). As discussed above, in one embodiment this comprises accessing an historical database of metadata related to calls to and by the called party device. The database may be constructed from metadata stored as CDR. In another embodiment, the classification may comprise accessing a network address book 119 associated with the called party. Such access may be performed using well-known messaging formats, e.g., SIP in an IMS network.

If the calling party is classified as being known, the call is forwarded to the device of the called party, using standard SIP messaging well known in the art.

If the calling party is classified as being unknown, then in one embodiment the CSCF checks to ascertain if the called party subscribes to the enhanced identification service (event 6), for example accessing the HSS. If the called party does not subscribe to the service, the call may be refused, routed to voicemail, or otherwise handled per predetermined preferences of the called party.

In one embodiment, if the calling party is classified as being unknown and the called party subscribes to enhanced identification service (or if enhanced identification service is implemented throughout the network, without the need for separate subscriptions), the CSCF searches for the unknown calling party in a reputation database. In general, a reputation database may comprise any suitable database functionality. For example, it may be implemented according to known protocols, such as an SQL database. Alternatively, the reputation database may comprise a simple key-value storage structure, such as a hash table. The database should store calling party identifying information and one or more historical evaluations as to the veracity of his or her spoken identity responses.

In the embodiment depicted in Figure 2, the CSCF accomplishes this by sending a message to a Policy Manager identifying the unknown caller (to the extent possible by the metadata available), and requesting a recommendation based on the calling party's identification accuracy reputation (event 7). The Policy Manager may access one or more databases, such as the HSS (event 8) to retrieve information relating to the calling party's reputation. The Policy Manager analyzes the retrieved information (event 9) and sends a recommendation to the CSCF (event 10), based on the calling party's reputation. The recommendation may be, e.g., to proceed with enhanced identification service processing or to reject the call. Events 7-10 may be implemented via any suitable network messaging protocol, such as for example, SIP in IMS, or Hypertext Transfer Protocol Representational State Transfer (HTTP REST).
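
The analysis and recommendation of events 8-10 may be sketched as follows, here using a key-value (hash table) store as the reputation database. The class and function names, the scoring of reputation as the fraction of truthful evaluations, the 0.5 cut-off, and the choice to let a calling party with no recorded reputation proceed are all assumptions of this example rather than requirements of the disclosure.

```python
from typing import Dict, List, Optional


class ReputationStore:
    """Key-value store: calling party ID -> historical veracity evaluations."""

    def __init__(self) -> None:
        self._evaluations: Dict[str, List[bool]] = {}

    def add_evaluation(self, calling_party_id: str, truthful: bool) -> None:
        self._evaluations.setdefault(calling_party_id, []).append(truthful)

    def score(self, calling_party_id: str) -> Optional[float]:
        """Fraction of truthful evaluations, or None if none are recorded."""
        evals = self._evaluations.get(calling_party_id)
        if not evals:
            return None
        return sum(evals) / len(evals)


def recommend(store: ReputationStore, calling_party_id: str,
              min_score: float = 0.5) -> str:
    """Return "proceed" (continue with the spoken prompt) or "reject"."""
    score = store.score(calling_party_id)
    if score is None or score >= min_score:
        return "proceed"
    return "reject"
```

In practice the cut-off might be taken from the called party's predefined preferences rather than fixed network-wide.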

Assuming the Policy Manager recommends proceeding, or if the calling party has no ascertainable reputation in the database, the CSCF prompts the unknown calling party to provide spoken identifying information. Although in some networks, the CSCF (or other network function handling calls for the called party) may do this directly, Figure 1 depicts an embodiment in which the CSCF prompts the calling party by use of the MRF. The CSCF prepares and sends to the MRF a message identifying the calling party, including a prompt to be voice synthesized, and instructing the MRF to capture and transcribe to text a voice response spoken by the calling party (event 11). This message may comprise a SIP Invite message, with the address of the calling party in the To field of the SIP header.

The MRF then synthesizes a voice prompt, or retrieves a prerecorded one (event 12), and acknowledges the CSCF with a SIP 200 OK message (event 13). The CSCF then sends a SIP 200 OK message to the device of the calling party, which includes the Session Description Protocol (SDP) information identifying the MRF (event 14). The MRF then plays the synthesized/prerecorded voice prompt to the calling party (event 15), and receives a spoken response (event 16). The MRF transcribes the spoken response to text using speech recognition (event 17), and sends the transcribed text to the CSCF in a SIP BYE message (event 18). In one embodiment, the method of including the transcribed text in the SIP BYE message is similar to a method described in the IETF RFC 5552, "SIP interface to VoiceXML Media Services." The CSCF acknowledges with a SIP 200 OK message (event 19).

The CSCF then inserts the transcribed text from the unknown calling party's spoken identification into the From header of a SIP Invite message, and sends it to the device of the called party (event 20). In accordance with normal SIP procedure, the called party device responds with a SIP 180 Ringing message (event 21), and a SIP 200 OK message when the called party answers (event 22). The CSCF then sends a SIP Invite message to the calling party device, including the SDP of the called party (event 23). The calling party device acknowledges with a SIP 200 OK message (event 24), and the called and calling party devices are then connected by the call (event 25).
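
The insertion of the transcribed text into the From header (event 20) may be sketched as below, placing the text in the display-name portion of the header as a quoted string. The function names, the five-word limit, and the example URI and tag are assumptions of this illustration; an actual CSCF would construct the header through its SIP stack rather than by string formatting.

```python
MAX_WORDS = 5  # the text suggests a spoken identification of roughly 2-5 words


def display_name_from_transcript(transcript: str, max_words: int = MAX_WORDS) -> str:
    """Reduce the transcript to a short, single-line SIP quoted-string body."""
    # split() discards all whitespace, including CR/LF that could break the header.
    words = transcript.split()
    name = " ".join(words[:max_words])
    # Inside a SIP quoted-string, backslash and double quote must be escaped.
    return name.replace("\\", "\\\\").replace('"', '\\"')


def build_from_header(transcript: str, caller_uri: str, tag: str) -> str:
    """Build a From header whose display name is the transcribed identification."""
    return f'From: "{display_name_from_transcript(transcript)}" <{caller_uri}>;tag={tag}'


if __name__ == "__main__":
    print(build_from_header("Kevin from Acme Plumbing",
                            "sip:+46700000000@example.com", "a1b2c3"))
```

Truncating to a few words keeps the text suitable for display in a Caller ID notice on a small screen.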

In an embodiment that supports the reputation database, at the termination of the call, the called party device provides feedback as to the accuracy of the identification information (event 26). The CSCF sends this information to the reputation database, such as may be maintained by the HSS (event 23).

Figure 3 depicts a method 200, performed by a network device 114 hosting a function for handling calls for a called party, such as a CSCF in an IMS network, of detecting potentially unknown callers and providing identifying information about the callers. The CSCF receives a call indication from a calling party device (block 202). The CSCF classifies the calling party as being a known or unknown party (block 204). The CSCF may perform this classification by consulting CDR data, a network address book, or the like. If the calling party is classified as being known (block 204), the CSCF forwards the call to the called party device (block 206). However, if the CSCF classifies the calling party as being unknown (block 204), it prompts the unknown calling party to provide spoken identifying information (block 208). The network device may do this directly, or may engage a media function, such as a MRF. When the calling party provides spoken identification information that is transcribed to text by speech recognition, the CSCF inserts the text transcribed from the unknown calling party's spoken identification into the call, such as into the From field of a SIP header (block 210). The CSCF then forwards the call, including the transcribed text identification information, to the called party device (block 206).
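
The control flow of the method 200 may be summarized by the following sketch, in which the classification, prompting, and forwarding steps are supplied as callables. The function and parameter names are hypothetical, and the callables stand in for whatever network functions (e.g., a CSCF and a MRF) perform those steps in a given implementation.

```python
from typing import Callable, Optional


def handle_call(caller_id: str,
                is_known: Callable[[str], bool],
                prompt_and_transcribe: Callable[[str], Optional[str]],
                forward: Callable[[str, Optional[str]], None]) -> None:
    """Process one received call indication (block 202) per method 200."""
    if is_known(caller_id):                  # block 204: known party
        forward(caller_id, None)             # block 206: forward unchanged
        return
    text = prompt_and_transcribe(caller_id)  # block 208: spoken identification
    forward(caller_id, text)                 # blocks 210 and 206: insert text, forward


if __name__ == "__main__":
    known_callers = {"sip:alice@example.com"}
    handle_call(
        "sip:unknown@example.org",
        is_known=lambda c: c in known_callers,
        prompt_and_transcribe=lambda c: "Kevin from Acme",
        forward=lambda c, t: print(f"forwarding {c} with identification {t!r}"),
    )
```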

Those of skill in the art will readily recognize that numerous variations on the method 200 are possible, given the teachings of the present disclosure. For example, the CSCF may, upon classifying the calling party as unknown, consult a reputation database to determine how to proceed. In one embodiment, it may delegate such analysis to a Policy Manager, as described above with respect to Figure 2. Other variations, and additional processing, all of which fall within the broad scope of the invention, may be readily devised by those of skill in the art.

Figure 4 depicts a network device 114 hosting a function operative to implement the method 200 and enhanced identification service described above. The network device 114 comprises a network communication interface 12, a controller 14 operatively connected to a computer program storage product 16 in the form of a memory storing a computer program 18, and a wireless communication interface 20 operatively connected to one or more antennas 22.

The network communication interface 12 may comprise a receiver and transmitter interface used to communicate with one or more other network devices, such as an HSS, MRF, or AS implementing a Policy Management function, over a communication network according to one or more communication protocols known in the art or that may be developed, such as IMS/SIP, Diameter, HTTP, RTP, RTCP, HTTPS, SRTP, CAP, DCCP, Ethernet, TCP/IP, SONET, ATM, or the like. The network communication interface 12 implements receiver and transmitter functionality appropriate to the communication network links (e.g., optical, electrical, and the like). The transmitter and receiver functions may share circuit components and/or software, or alternatively may be implemented separately.

The controller 14 may comprise any sequential state machine operative to execute machine instructions stored as one or more machine-readable computer programs 18 in the memory 16, such as one or more hardware-implemented state machines (e.g., in discrete logic, FPGA, ASIC, etc.); programmable logic together with appropriate firmware; one or more stored-program, general-purpose processors, such as a microprocessor or Digital Signal Processor (DSP), together with appropriate software; or any combination of the above.

The memory 16 may comprise any non-transitory machine-readable media known in the art or that may be developed, including but not limited to magnetic media (e.g., floppy disc, hard disc drive, etc.), optical media (e.g., CD-ROM, DVD-ROM, etc.), solid state media (e.g., SRAM, DRAM, DDRAM, ROM, PROM, EPROM, Flash memory, solid state disc, etc.), or the like. The memory 16 is operative to store, and the controller 14 is operative to execute, computer program 18. When executed, the computer program 18 is operative to implement the enhanced identification function described herein, such as the method 200. The memory 16 may store, and the controller 14 may execute, other computer programs, such as operating system functions and computer programs implementing other network server functions. The memory 16 may also store metadata, such as CDR.

The network device 114 is primarily contemplated as residing in the core network, for example hosting a CSCF. However, in one embodiment, the network device 114 providing the enhanced identification functionality may comprise a base station (referred to as NodeB or eNodeB in the LTE network) in the RAN 104, operative to communicate with User Equipment (UE) 106 within its geographic area, or cell. In this case, the device 114 may include a transceiver 20 operatively connected to one or more antennas 22. The transceiver 20 is operative to communicate with one or more other transceivers via a Radio Access Network 104 according to one or more communication protocols known in the art or that may be developed, such as IEEE 802.xx, CDMA, WCDMA, GSM, LTE, UTRAN, WiMax, or the like. The transceiver 20 implements transmitter and receiver functionality appropriate to the Radio Access Network 104 links (e.g., (de)coding, (de)modulation, amplification, interference reduction, and the like). The transmitter and receiver functions may share circuit components and/or software, or alternatively may be implemented separately.

Figure 5 depicts, in one embodiment, a hardware functional module diagram of the network device 114. In this embodiment, each module 130-138 comprises dedicated hardware, or programmable hardware together with appropriate firmware. In an alternative embodiment, Figure 5 depicts functional modules implemented as one or more processors together with one or more appropriate computer programs, such as computer program 18. The functional modules include at least a call receiving module 130, a known party determination module 132, a prompting module 134, a call modification module 136, and a call forwarding module 138. The call receiving module 130 is operative to receive a call from a calling party device, according to known networking protocols, such as, for example, SIP over IMS. The known party determination module 132 is operative to determine whether a calling party is known or unknown to the called party. The known party determination module 132 may employ numerous strategies to make this determination, as described herein, for example considering any information in a From field of the call header, metadata such as geographical information, CDR information, information from network address books, and the like.
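As a rough illustration of the determination made by the known party determination module 132, the following Python sketch treats a calling party as known if the URI in the From field appears in the called party's network address book or among the peers found in recent CDR entries. The names `CalledPartyContext` and `is_known_party`, and the reduction of the decision to a set lookup, are assumptions made for illustration only; the strategies described herein are not limited to this form.

```python
from dataclasses import dataclass, field


@dataclass
class CalledPartyContext:
    """Hypothetical per-subscriber data the module might consult."""
    # URIs taken from a network address book (assumed already lowercased).
    address_book: set = field(default_factory=set)
    # URIs of peers seen in recent CDR entries (assumed already lowercased).
    recent_cdr_peers: set = field(default_factory=set)


def is_known_party(from_uri, ctx):
    """Return True if the calling party appears known to the called party."""
    if not from_uri:
        # No usable identity in the From field: classify as unknown.
        return False
    uri = from_uri.strip().lower()
    return uri in ctx.address_book or uri in ctx.recent_cdr_peers
```

A calling party classified as unknown by such a check would then be handed to the prompting module 134.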

The prompting module 134 is operative, in the event a call is determined to originate from the device of an unknown calling party, to prompt the calling party to provide spoken identifying information. The prompting module 134 may execute this function by preparing and sending network messages to other functional modules (including those hosted on other network devices) which implement the prompting on behalf of the prompting module 134. For example, the other functional module (e.g., an MRF hosted on network device 115), in response to messages from the prompting module 134, may play to the calling party a synthesized or recorded prompt for identifying information, receive a spoken response, perform speech to text transcription of the spoken response, and return the transcribed text to the prompting module 134.
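The delegation described above can be sketched as follows. The `MediaResource` interface, its three method names, the prompt wording, and the five-second limit are all hypothetical stand-ins for whatever messages the prompting module 134 would exchange with a media resource such as an MRF; they do not correspond to any defined IMS interface.

```python
from typing import Optional, Protocol


class MediaResource(Protocol):
    """Hypothetical interface to a media resource (e.g., an MRF)."""

    def play_prompt(self, call_id: str, prompt: str) -> None: ...

    def record_response(self, call_id: str, max_seconds: int) -> bytes: ...

    def transcribe(self, audio: bytes) -> str: ...


def prompt_for_identification(mrf, call_id, max_seconds=5) -> Optional[str]:
    """Prompt the caller, then return the transcribed reply (or None)."""
    # Prompt wording is an assumption, not taken from the application.
    mrf.play_prompt(call_id, "Please say your name and the reason for your call.")
    audio = mrf.record_response(call_id, max_seconds)
    if not audio:
        # The caller stayed silent or hung up: nothing to transcribe.
        return None
    text = mrf.transcribe(audio).strip()
    return text or None
```

Returning `None` when no usable response is obtained leaves it to the caller of this function to decide how such a call is handled.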

The call modification module 136 is operative to insert text transcribed from the unknown calling party's spoken identification (which may, for example, be received by the prompting module 134 from another network functional module) into a header field associated with the call, prior to delivering the call to the device of the called party. The call forwarding module 138 is operative to forward the call to the called party device, with the transcribed text in the header. In this manner, the called party may view the transcribed text upon being notified of the call, and may use information gleaned from the transcribed text to decide whether to take the call.

In the embodiment where the modules 130-138 are implemented with the help of software, the modules may be software modules, but they may also be seen as instructions (depending on the computer language used, e.g., for the source code) forming part of the computer program 18 without being divided into specific software modules.

The enhanced identification service of the invention presents numerous advantages over the prior art. First, a more accurate assessment of which calling parties are known or unknown to the called party is performed. As described above, an unknown party is not simply one that does not provide "From" information; the classification may also be deduced from the called party's recent call history. Second, the unknown calling party is prompted for spoken identification information, which may be provided without access to a keyboard and without resort to entering text on a numeric keypad. The information is transcribed to text and presented to the called party. This assists the called party in determining whether to take the call. Finally, in one embodiment a reputation database and analysis system provides additional guidance in handling the call, based on the particular unknown calling party's past history of truthfulness and helpfulness regarding its spoken identifier.
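The header insertion performed by the call modification module 136 can be illustrated with a short Python sketch that places the transcribed text in the display-name portion of a SIP From header value, keeping the URI and any header parameters (such as the tag). The function name `set_from_display_name` is hypothetical, and the parsing is deliberately simplified: it assumes any existing display name contains no `<` character. It is a sketch of one possible approach, not the implementation of the invention.

```python
import re


def set_from_display_name(from_value, transcribed_text):
    """Return a From header value whose display name is the transcribed text."""
    # Collapse whitespace (including CR/LF) so the text stays on one header line.
    text = " ".join(transcribed_text.split())
    # Escape backslashes and double quotes for use inside a quoted string.
    safe = text.replace("\\", "\\\\").replace('"', '\\"')

    match = re.search(r"<[^>]*>.*$", from_value)
    if match:
        # name-addr form: keep "<uri>" plus trailing parameters (e.g. ;tag=...).
        addr_and_params = match.group(0)
    else:
        # Bare addr-spec form: anything after the first ';' is a header parameter.
        uri, sep, params = from_value.strip().partition(";")
        addr_and_params = f"<{uri}>{sep}{params}"
    return f'"{safe}" {addr_and_params}'
```

For example, applying the function to `"Anonymous" <sip:anonymous@anonymous.invalid>;tag=1928` with the transcribed text `Keven from the bank` yields `"Keven from the bank" <sip:anonymous@anonymous.invalid>;tag=1928`, which the called party device may display in its Caller ID notice.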

As used herein, network functions, such as CSCF, PGM, HSS, Policy Management, MRF, and the like, are understood to be implemented in network devices 114, 115, 116, 118, 120, 122 operative to communicate in the IMS network 102. Some such devices 116a, 116b, 116c may host application servers (AS) implementing one or more such functions. Such network devices 115, 116, 120, 122 may substantially resemble the network device of Figure 3.

The invention may, of course, be carried out in other ways than those specifically set forth herein without departing from essential characteristics of the invention. The embodiments are to be considered in all respects as illustrative and not restrictive, and all changes coming within the meaning and equivalency range of the appended claims are intended to be embraced therein.
