


Title:
A TRANSLATION AND TRANSMISSION SYSTEM
Document Type and Number:
WIPO Patent Application WO/2006/012655
Kind Code:
A2
Abstract:
A Universal Communications System comprises The Lexicon, The Code, The Voice Recognition System, and The Method of Transmitting the Code; it is simple to use and allows communication between speakers of any languages.

Inventors:
ATTECK LOUIS AUGUSTUS GEORGE (AT)
Application Number:
PCT/AT2005/000277
Publication Date:
February 09, 2006
Filing Date:
July 18, 2005
Assignee:
ATTECK LOUIS AUGUSTUS GEORGE (AT)
International Classes:
G06F17/28
Foreign References:
US 6438524 B1 (2002-08-20)
EP 0327408 A2 (1989-08-09)
US 6236963 B1 (2001-05-22)
US 4852172 A (1989-07-25)
US 2004/0078195 A1 (2004-04-22)
US 2004/0148161 A1 (2004-07-29)
US 5839099 A (1998-11-17)
Claims:
1) A universal communication system comprises in combination, Lexicons in a plurality of languages, and each Lexicon includes sentences, phrases and/or words in one language only and which have the same common meaning in all the Lexicons of the plurality of languages, an identification Marker associated only with the common meaning that is allocated and which acts as a link between all of the common-meaning sentences, phrases and/or words in all the Lexicons in all the plurality of languages, a Speech Recognition Means in each language that has been taught the common-meaning utterances of its own language and which can recognise the speaker's utterance in his own language and means is provided for only the associated Marker to be transmitted to the receiver, Triggering Means to use the transmitted Marker to replay a prerecorded utterance or text of the common-meaning sentence, phrase and/or words in the receiver's language, and Transmitting Means for sending the Marker to the receiver, wherein the Speech Recognition Means further comprises means to adjust the speaker's voice waveform to a fixed fundamental frequency that stabilises the speaker's voice-pattern formants.

2) A universal communication system as in Claim 1, in which the Lexicons include normal everyday telephone and mobile chat and conversations that can be understood by a majority of people, through to more advanced conversations, broadcast speeches, lectures and specialised topics such as science and medicine and the like.

3) A universal communications system as in Claims 1 and 2, in which the common meaning is translated faithfully into any other language or dialect regardless of the length or complexity of the word sequences to achieve a true translation, and comprises discrete blocks common to the culture and language form that convey the essence of the conversation.

4) A universal communication system as in Claims 1 to 3, in which the contents of the Lexicons in each language wherever possible comprise the main stem of the sentence or phrase, and the key words that change the meaning are then added with the proviso that the aggregate stem and key words form a coherent sentence or phrase that does not require any further mental translation input, and each component can have its own Marker that can be sent in the proper sequence to compose a complete sentence or phrase that can be understood in all languages.

5) A universal communication system as in Claims 1 to 4, in which all the Lexicons in every language are periodically updated at the same time with new material to meet the requirements of its users, and eventually the system takes on the semblance of a compromise language form with its own style which takes into account cultural differences and the like when selecting the common meanings, the aim being to achieve a goal in which all the discrete blocks selected in all the Lexicons represent a common meaning that can be transposed into any language.

6) A universal communication system as in Claims 1 to 5, in which a predominant language such as English is selected as the lead language from which all the utterances in all the Lexicons are derived and which can be modified by feedback to incorporate the equivalent utterances in the other languages.
7) A universal communication system as in Claim 1, in which the Marker can be a code, number or other identifying label that is arbitrarily allocated and has no relationship with the utterance's voice-pattern characteristics but is permanently associated only with the individual common meanings in all the Lexicons and represents a link between all the common-meaning utterances in the plurality of languages, and when the utterance is recognised only its allocated Marker is transmitted in digital or analogue form over any communication system.

8) A universal communication system as in Claims 1 and 7, in which, for example, in English the Marker for "good morning" is arbitrarily allocated the number "1" and this number, when transmitted to any user or between all users of the universal system, will trigger the common-meaning utterance in any language so that, for example, a German receiver will hear "guten Morgen".

9) A universal communication system as in Claims 1, 7 and 8, in which the Marker is encrypted by mutual consent between the parties for security reasons.

10) A universal communications system as in Claim 1, in which the Speech Recognition Means may be speaker dependent or speaker independent and does not require the correspondents to learn a foreign language or dialect; all the work required is to teach their speech recognition systems as required in their instructions, or alternatively with all the sentences, phrases and/or words of their own language Lexicon, and the systems have means to attach the relevant Markers so that as soon as the speaker's utterance is identified, the translation part of the whole exercise is complete except that means is provided for the Marker to be sent automatically to the receiver.

11) A universal communication system as in Claims 1 and 10, in which the Speech Recognition Means is not required to perform a mental act but is only required to match the Lexicon utterances using any commercial system that may include only matching of templates or other identifying parameters of the voice pattern of the discrete blocks of the utterances taught to the system, inasmuch as the Lexicons in all languages have already translated the words, phrases and sentences, solving the identification difficulties presented by domain dependence and ethnic speaker variability.

12) A universal communications system as in Claims 1 and 11, in which same-speaker variability is eliminated by replacing the single microphone with two microphones, one of which is positioned in the region of the larynx (Adam's apple) to pick up the fundamental-frequency waveform and the other is positioned near the mouth to pick up the complex spoken waveform that includes the fundamental frequency.

13) A universal communication system as in Claims 1 and 12, in which the sound from the larynx, which comprises the fundamental frequency that changes up and down as we speak normally and is in the form of a simple sine wave, is constantly monitored, and any deviation of this frequency from a fixed predetermined frequency is determined continuously.
14) A universal communication system as in Claims 1, 12 and 13, in which the deviation from the fixed frequency produces a feedback current that is used to continuously bring the ever-changing fundamental frequency of the complex wave pattern (formants) of the spoken waveform, which includes the harmonics, back to the level of the fixed fundamental frequency, using any suitable feedback technique such as a phase-locked loop (PLL) detector as used in radios, and the adjusted waveform of the utterance is fed to the analogue-to-digital converter (ADC) of the Speech Recognition Means.

15) A universal communication system as in Claims 1 and 12 to 14, in which means is provided to cause the feedback current to vary the sampling speed of the ADC, effectively causing the complex waveform to assume the monotone profile.

16) A universal communication system as in Claims 1 and 12 to 15, in which mechanical means is provided which includes a constantly spinning magnetic disk, or a continuous-loop magnetic tape or wire, that has recording means which, in the case of the disk, is fixed to an arm that rotates around the spindle of the disk and is capable of rotating forward or back under the control of the feedback from the PLL to record the adjusted voice waveform, followed by a fixed pickup means to extract the monotone utterance that is passed on to the ADC, and then immediately by a fixed erase means to clear the recording from the disk for reuse.

17) A universal communication system as in Claims 1, 10 and 11, in which there is provided means to override the recognition system and insert utterances transmitted by normal communication means that are not in the Lexicons, such as names of people, specialist words and universal words such as "Coca-Cola" and the like.

18) A universal communication system as in Claims 1, 10 and 11, in which fewer parameters are required for recognising the utterance, so that simple independent-speaker recognition programs using less memory are now a reality and the speech recognition computing means can therefore be portable or transferred to a special ISP or third-party computing service and the like.

19) A universal communication system as in Claim 1, in which the Triggering Means prerecords into any suitable audio storage system all the utterances of the words, phrases and sentences in its own language Lexicon in a retrievable form together with their associated Markers, and has means to use the Marker that is transmitted to the receiver by the sender, being the same as the associated Marker, such that when the Markers match each other it triggers the replaying of the prerecorded utterance of its equivalent word, phrase or sentence back to the receiver in his own language.

20) A universal communication system as in Claims 1 and 19, in which the receiver has the option, when getting additions to the Lexicon for teaching his speech recognition system, to also use it to manually record his own voice together with the Marker into the audio storage system.

21) A universal communication system as in Claims 1, 19 and 20, in which the Marker transmitted to the receiver can trigger its equivalent Marker that has been downloaded into the Triggering Means, causing the writing of a preprogrammed text in his own language of the common-meaning utterances on to a typewriter, a computer screen or any type of telephone with video or text link and the like.
22) A universal communication system as in Claim 1, in which the Transmitting Means comprises any communications system that uses any medium efficiently and includes telephone wire or cable or optical fibre or wireless or satellite communication, in digital or analogue form, directly or in packets over the Internet, including multiplexing and 3G and WAP technology and the like, and in the case of a video link or TV broadcast the utterance latency period between speaker and receiver can be synchronised by delaying the picture to make it more realistic.

23) A universal communication system as in Claims 1 and 22, in which the Transmitting Means comprise hardware that includes any type of computer such as desktop, laptop or mini computers, fixed or mobile telephones, broadcast wireless and television or any other person-to-person electronic equipment that can use the universal communication system.

24) A universal communication system as in any of the preceding Claims, in which all the participating persons worldwide have a similar type of system and the common-meaning Marker associated with any utterance in any of the common Lexicons can trigger the equivalent utterance in any other language.

25) A universal communication system as in any of the preceding Claims, in which, when used for lectures, conferences or large meetings and the like, the Marker optionally is not transmitted but is sent to a receiving system nearby to trigger a prerecorded voice of an ideal speaker of any chosen language, and if the audience is multilingual, individual receivers programmed in any language can be provided for each recipient.

26) A universal communication system as in any of the preceding Claims, in which any additions to the Lexicon, including their Markers, are centrally or regionally prepared and are sent to all users by any means, preferably in text form, so that users may teach their own Speech Recognition Means and record into their Triggering Means their own voice, and optionally the utterance additions may be downloaded in the form of a prerecorded voice of an ideal speaker together with the equivalent text, including their common-meaning Marker that has also been centrally prepared, directly via the Internet or other medium.

27) A universal communication system as in any of the preceding Claims, that comprises a portable version of the system such as a minicomputer that has the Triggering Means which can be preloaded from any portable storage system, which may include a CD or floppy disk, with the Lexicon together with the Marker in any foreign language for speaking only, such as for tourists or otherwise, and if the receiving person has a similar system in his own language equipped with an electrical wire contact or a short-distance wireless transmitter between sender and receiver, they may converse directly with each other.

28) A universal communication system as in any of the preceding Claims, for use in cases where there is no available physical communications system except wireless contact and the like, in which the receiving person has only a receiver such as a radio/Triggering Means combination programmed by any suitable means with their own language Lexicon and Markers, and they may listen to any news or other communication transmitted in any language, or means is provided to allow any other use such as receiving road directions and warnings and the like in towns and the like.
29) A universal communication system as in any of the previous Claims, substantially as described in the Text and Claims with reference to the Diagrams, and which in no way limits the applications described herein.

Fig. 1 (sheet 1/1)
Description:
A translation and transmission system

A Universal Communication System

Introduction

In a world which is drawing closer together due to fast transport and electronic communication between distant parts of the planet, one of the last great problems to be solved is how to enable people speaking many diverse languages to communicate with each other as easily as they do among themselves. At present the problem is not yet acute, since the main industrial trading nations have a reasonable compromise in the English language and only simple sentences and phrases are needed when a foreign language is required for tourism. However, as the Far East and Eastern European countries and the like become economically important, it will be imperative to have either a universal language or a translation system capable of operating in real time. Additionally, within a vast country that contains many ethnic sub-languages and dialects, it is difficult for people to communicate with each other either by speech or text.

Problems.

If a person wishes to talk to another person who speaks a different language, he either has to be able to speak the other language or learn a common language such as Esperanto, but both options require him to learn a completely new language, or else he must speak in his own tongue and expect the other person to understand him. It is possible to speak through an interpreter, but this is disjointed, takes time, and it is difficult to know whether your thoughts have been correctly translated. Even with the advent of computers with large processing capacity, and even if real-time translation into a single foreign language were possible, doing so for many languages is at present too costly for the ordinary person, as it would require a very powerful computer and the time taken to programme it would be too great. This scenario holds for both parties in a conversation between two different language speakers. Another problem is the amount of bandwidth required to carry the increased information on an increasingly overstretched telephone system, even using digital transmission. The Invention would go some way towards alleviating these problems and has the advantage that the person does not have to learn a new language, which usually takes years to achieve and for some people is impossible; he must only teach his computer sentences, phrases and specialist words in his own language, which can easily be done incrementally over a period of time.

Background to the Invention

The basis of the Invention is first to create a list of sentences, phrases and words in one predominant language such as English that conveys the essence of universal conversational usage, to create a common Lexicon, and to match it to the equivalent sentences, phrases and words in all languages of the world such that the same meaning is faithfully conveyed to all users of the Lexicon. During the course of a conversation between two people, the common objective is for both persons to understand what is said, and even though a person can speak another language he may not be understood because of his accent. For instance, Far Eastern speakers cannot pronounce certain words or letters because there is no need in their own language to cultivate those sounds, making it difficult for them to be understood by a foreign listener. The Invention may be applied to all types of communication, but to explain the principles we shall concentrate on languages that can be transmitted by fixed wires or optical cable and the like, such as the telephone or computer networks, or by radio waves including satellite communications.

When people converse with each other, they compose set sentences in discrete blocks common to their language and culture to convey the essence of the conversation, even though certain key words of the dictionary may be substituted to change the meaning conveyed. A simple analogy is the tourist phrase book. It is also quite common nowadays to transpose certain well-known words or phrases untranslated from one language into another, such as trade names like Coca-Cola and facilities such as Toilet and Exit. When speaking any language the meaning is composed as a sentence, but the order of the words in the sentence is not always in the same sequence; for instance, the verb is placed at the end of the sentence in German as opposed to the English order, or many words may have to be used in one language to translate one word in another. A translator would recognise this and transmit it in the proper order that can be recognised by both parties.

If all the sentences and phrases used in speech common to all languages were compiled into a Lexicon, which may or may not include all the specialist words of the dictionary that may be used, for example, by doctors or scientists, then it is possible to teach this Lexicon to a voice recognition system that can recognise speech in the form of sentences and phrases, but it need not be able to recognise the meaning, as the meaning is already in the sentence. The sentences in the Lexicon in one language would differ from one another in the actual words, but the meaning of each sentence would be the same for all languages. In certain cases, if someone directly translates the word equivalents into another language, the other person may not understand what he wants to say. This system would require a much longer training time on the computer, but would not require the more complicated programme necessary for word recognition and interpretation, as it does not also have to guess the meaning of what is being said. At the very least it would be much easier than learning one language, let alone many. It is envisaged that the individual will be constantly adding new material to his voice training programme, and eventually he will also become very proficient in the style of composition of the sentences consistent with his specific Lexicon.
In order to keep up to date with the speaker's voice, which alters with time, the computer updates the latest sentences used in its database and adds any necessary information such as changes in the voice pattern. All new additions to the Lexicon in any language must be common to all languages, and when new additions become necessary there must be a central co-ordinating entity that adds the new material in a form compatible with all languages. The Invention now calls for each of the individual sentences and phrases in all the common Lexicons of each language that have the same meaning to be assigned the same Code, number or other identifying Marker that is common to all the languages. The Code is preferably the only information transmitted; it may be in digital or analogue form and takes up a fraction of the bandwidth compared to the equivalent voice transmission of the whole sentence, allowing more economical use of the medium. There is also provision for the Code to be encrypted, by mutual consent of the parties in contact, for security reasons.

It is also necessary to record the actual voice sample of the sentence, phrase or word as spoken by the speaker into his own computer memory during the teaching process, and to be able to use its digital Code to play back the same recorded sentence in the speaker's pre-recorded voice. The end product is the ability of the digital Code to trigger its equivalent sentence in any language simultaneously, using the receiving party's own voice, including his accent and other attributes. This makes the conversation easier to understand: the transmission medium may carry static and be imperfect for normal voice communication, but the receiving party hears his own voice recording, triggered by the digital Code transmitted by the other party, and the Code is not easily corrupted or distorted during transmission. It will now be possible for someone to speak in one language and be understood in all the languages of the world simultaneously, and all the work is done in each separate language by a speaker or speakers, each taking on the task of teaching and recording all the common sentences of the Lexicon in his own language together with the equivalent Code that is used to communicate with the other language speakers.

Even when real-time conversations take place there will be little observed time discrepancy, even if the digital number is sent in "packets" via any medium including the Internet instead of a more expensive dedicated telephone line. This packet system is ideal, especially when there is disruption in the transmission due to heavy traffic, since back-up packets can be sent, for example over the Internet, and the route changes automatically if there is congestion or a break in the link. Ideally a fast link is required, such as cable or optical systems including broadband to the Internet, for instant communication, as there could be annoying time delays in the conversation due to the slow copper cables of the local loop and the international cables. A key element in the Invention is that any two or more persons in the world, once they have programmed and trained their voice recognition systems using the common Lexicon sentences and phrases, can now communicate with each other using any communications medium in use today.
One drawback is that the personal touch of hearing the actual voice of the other party is absent, although the tonal touches inherent in expressing the language are present. In any case, even if the other speaker can actually speak your language it is often difficult to understand him because of his accent, such as a Chinese speaker speaking in English. The played-back voice may also be programmed electronically with an accent of the other foreign speaker, or even changed from male to female, to lessen the impact. It is also possible to trigger the voice of an ideal speaker of any one language in the case of speeches, lectures or a newscast delivered to an audience in their own language by a foreign speaker whose voice and computer trigger the digital Code; in this case either the Code or the already translated voice can be transmitted. Another application is that the transmitted digital Code can easily be used to simplify the print-out of meaningful text in the required language, which is currently done using a commercial voice recognition dictation application. In this case the sentences and phrases are typed out during the voice recognition teaching session, or this can be done separately by the distributor of the centralised Lexicon system, who types the sentences and phrases, assigns the related digital Code used to trigger the typing, and posts the data or presents it for downloading on the Internet by clients registered as users of the System to load into their memory systems. This print-out may have to be edited by the recipient manually, or programmed to link the sentences into a literary style for proper reading.

The Invention.

A Universal Communications System comprises in combination, Lexicons in a plurality of languages, and each Lexicon includes sentences, phrases and/or words in one language only and which have the same common meaning in all the Lexicons of the plurality of languages, an identification Marker associated only with the common meaning that is allocated and which acts as a link between all of the common-meaning sentences, phrases and/or words in all the Lexicons in all the plurality of languages, a Speech Recognition Means in each language that has been taught the common-meaning utterances of its own language and which can recognise the speaker's utterance in his own language and means is provided for only the associated Marker to be transmitted to the receiver, Triggering Means to use the transmitted Marker to replay a pre-recorded utterance or text of the common-meaning sentence, phrase and/or words in the receiver's language, and Transmitting Means for sending the Marker to the receiver, wherein the Speech Recognition Means further comprises means to adjust the speaker's voice waveform to a fixed fundamental frequency that stabilises the speaker's voice-pattern formants.

The Lexicon.

The Lexicon is made up of a compilation of all the possible sentences and phrases that are required in a conversation in two or more different languages, and is updated continuously as new sentences come into use. The composition of these sentences is based on all the variations possible within the vocabulary and dictionary of the language, and the Lexicon expands as new phrases are added. Eventually the whole system takes on a language style of its own as the learning curve creates its own distinctive form. It is essential that the translation from one language to another be by means of whole sentences, parts of sentences, phrases or words that have the same meaning, although specialist words may be used as required, each with its own code, to indicate certain things or technical terms. To achieve a high level of co-ordination between the languages, the creators of the Lexicon must therefore be excellent linguists in all the languages, to give the best matching sentences, and they must co-operate with other specialist linguists in creating the Lexicon language.
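To make the Lexicon and Marker relationship concrete, the following minimal sketch (in Python, using invented example phrases and arbitrary marker numbers that are not from the specification) shows one possible way to hold common-meaning entries keyed by a Marker, with one utterance per language:

```python
# Minimal sketch of the Lexicon/Marker idea (hypothetical data, not the patent's actual tables).
# Each arbitrary Marker maps to the same common meaning expressed once per language.
LEXICON = {
    1: {"en": "Good morning", "de": "Guten Morgen", "fr": "Bonjour"},
    2: {"en": "How are you?", "de": "Wie geht es Ihnen?", "fr": "Comment allez-vous ?"},
}

def marker_for(utterance: str, language: str) -> int | None:
    """Return the Marker whose entry in `language` matches the recognised utterance."""
    for marker, meanings in LEXICON.items():
        if meanings.get(language, "").casefold() == utterance.casefold():
            return marker
    return None  # not in the Lexicon; the speaker would be asked to rephrase

def utterance_for(marker: int, language: str) -> str:
    """Return the common-meaning text to replay (or type out) in the receiver's language."""
    return LEXICON[marker][language]

# Example: an English "Good morning" becomes Marker 1, which a German receiver renders as "Guten Morgen".
assert utterance_for(marker_for("Good morning", "en"), "de") == "Guten Morgen"
```

Only the Marker ever needs to leave the sender's system; each party keeps the full table for his own language locally.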

The Code

The sentences, phrases and specialist words within the Lexicon are each allocated a specific Code that is the only information transmitted between the sender and receiver. The Code is preferably in digital form, which can easily be sent through the normal communication channels or through the Internet as packets that can take the various routing options available at any one time. The digital Code can be transposed into analogue form, or vice versa, for some transmissions. It may be that during a conversation we utter a sentence that is not within the Lexicon and therefore no digital Code has been allocated. In such an instance the voice recognition system would prompt the speaker to repeat the meaning of his utterance in a different form of words as a sentence. The computer, or even a mobile phone, may be able to bring up on screen a drop-down list of sentences to choose from, based on the words in the rejected sentence that it can recognise, and search for alternatives, as in the sketch below. At an early stage in the training process many such mistakes will happen until we eventually conform to the Lexicon language style that was originally taught to the computer, or it may be that eventually, with close co-operation, our mistakes are incorporated in the universal Lexicon. In the course of time the Lexicon will comprise a language system of its own which is common to all people. The Lexicon can be updated by revising or adding the sentences, phrases and words that are necessary to improve the system, and it is therefore necessary to centralise this function. This can be done by a newsletter system sent out to subscribers or by downloading through the Internet. The recipient is then required to teach his recognition system while allocating the corresponding digital Code and recording his voice as he teaches the system, and this recording is triggered by the Code when receiving telephone conversations or announcements and the like.
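The fallback described above might be sketched as follows; it assumes the LEXICON structure from the earlier example, and the word-overlap ranking is an illustrative heuristic only, not a prescribed part of the system:

```python
# Sketch of the fallback: if an utterance has no allocated Code, suggest
# Lexicon sentences that share the most words with it for the speaker to pick from.
def suggest_alternatives(utterance: str, language: str, limit: int = 5) -> list[str]:
    spoken = set(utterance.casefold().split())
    scored = []
    for meanings in LEXICON.values():
        candidate = meanings.get(language)
        if candidate:
            overlap = len(spoken & set(candidate.casefold().split()))
            if overlap:
                scored.append((overlap, candidate))
    scored.sort(reverse=True)  # most shared words first
    return [text for _, text in scored[:limit]]

# The caller would display this list on screen and ask the speaker to choose or rephrase.
```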

The Voice Recognition System

There are many automatic speech recognition technologies that convert speech into written form in real time; IBM and Dragon are two examples on the market today. They all use basically the same technology with some slight variations to achieve a commercial advantage. The aim is to enable a person's speech, as it is being spoken, to be converted into text with a high degree of success, and this therefore requires a very complicated computer programme. All these systems have to overcome the problems of data rate, ethnic speaker variability, same-speaker variation and domain dependence; they analyse the digital information of the spoken word and print out the equivalent text in the order that it is spoken, and each is therefore limited to one language. The voice recognition programme samples the speaker's analogue voice wave in digital form as a number, many times per second, in order to minimise the data rate to the computer. This cuts the computing power required down to a reasonable level using inexpensive hardware. Domain dependence is covered to a great extent in the Invention, as it uses whole sentences and phrases that convey the whole meaning. Ethnic speaker variability is also a problem, since within a language there are different accents and small differences between areas of the country; this can usually be solved during the teaching program for speaker-dependent systems, but not easily for speaker-independent systems. A big problem is how to solve the variation in the same utterance from the same speaker, because every utterance is unique with regard to pitch, speed of delivery and the like. When a speaker repeats a word or sentence after a long elapsed period of time, there will be slight differences in the fundamental frequencies due to emphasis or stress, and it is essential to find a solution to this problem. The Invention proposes, by way of example, two preferred solutions to this last problem that give an idea of the scope of the invention.

When we speak, air from the lungs passes over the vocal folds, in the area of the Adam's apple, which vibrate to produce the fundamental frequency of that person's voice at the time it is spoken. This vibration is modified as it passes through the throat, nasal passages and mouth, including the tongue, producing harmonics and other vibrations that make up the speech sound. Whether we speak normally using our vocal folds, or whisper, in which the folds are silent, or even mime the words, the same physical alterations occur in the throat, mouth and nasal passages for the same words. The harmonics are whole multiples of the fundamental frequency and are used by speech scientists to analyse speech. These harmonics can be observed using a sound spectrogram, which produces a two-dimensional picture of the harmonics called formants. The sound emitted by a person is in the form of a complex wave pattern that comprises all the frequencies of these harmonics, and it is this sound that has to be sampled and analysed by the computer programme to identify the spoken words. The speaker may also, on different occasions, emphasise certain words by raising or lowering the pitch of his voice, thus varying the fundamental frequency and hence the harmonics. This variation makes it more difficult to carry out, for example, a Fourier analysis on the complex wave pattern to derive the simple sine waves associated with each harmonic mentioned above.
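Since the fundamental frequency is central to what follows, here is a minimal sketch, in Python with NumPy, of one conventional way to estimate it from a short frame of audio by autocorrelation. The patent itself derives the fundamental from a throat microphone and a PLL; this purely software estimate, and the assumed 60-400 Hz search range, are offered for illustration only.

```python
import numpy as np

def estimate_f0(frame: np.ndarray, sample_rate: int,
                f0_min: float = 60.0, f0_max: float = 400.0) -> float:
    """Rough fundamental-frequency estimate of one audio frame via autocorrelation.

    A software stand-in for the throat-microphone/PLL tracking described in the text;
    the 60-400 Hz search range is an assumption covering typical speaking voices.
    """
    frame = frame - frame.mean()
    corr = np.correlate(frame, frame, mode="full")[len(frame) - 1:]  # lags 0..N-1
    lag_min = int(sample_rate / f0_max)
    lag_max = int(sample_rate / f0_min)
    lag = lag_min + int(np.argmax(corr[lag_min:lag_max]))
    return sample_rate / lag
```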
If, however, we can be made to speak in a monotone voice, in which the fundamental frequency remains relatively constant, the formants, being harmonics inherent in the spoken sound, are easier to analyse, as the complex waveform is constant for each sentence, phrase or word spoken, making sampling by the analogue-to-digital converter (ADC) more reproducible. Only the intervals remain, which pose no problem for the programme to overcome. To achieve this monotone voice electronically, the Invention essentially uses two microphones that are fed into any suitable electronic system as described below. One microphone is placed as normal in front of the mouth and the other is placed over the region of the vocal folds in the throat, to pick up only the fundamental frequency. This fundamental frequency from the vocal-folds microphone is processed to filter out extraneous sounds and is essentially a pure sine wave. This wave is monitored for changes in frequency and speed of delivery, and a feedback system is used to track the frequency change, for example using a PLL (phase-locked loop) detector of the kind used in radio receivers. Fig. 1 shows a basic PLL circuit. The change in fundamental frequency is analysed and an adjustment is constantly made with reference to a predetermined constant frequency for that person. The adjustment feedback current from the PLL is used to modify the fundamental frequency of the mouth microphone signal to bring it in line with the predetermined constant fundamental frequency for that person. It is possible to physically stretch or compress the harmonic waveform after it is created and recorded and still hear the same meaning, just as a gramophone voice recording played at different speeds can easily be understood. A monotone fundamental frequency passing through the mouth and nose passages creates harmonics at the time of speaking; if the fundamental frequency alters, the harmonics will change, but the same meaning is received by the listener.

Using this PLL technique it is now possible to examine the speech-sound complex waveform for a word or sentence and match the profile against many such repeat recordings for an individual in a speaker-dependent mode, or against a sample from one group of a population in a speaker-independent mode. By analysing the profiles it is also possible to select the parts of the profile that are common to the individual or group to create a template that can be used in the speech recognition programme for either mode. Another preferred system using the monotone approach is to train the recognition system by programming in a definite nose, mouth or tongue profile for a speech sound (comparable to sign language for deaf people), so that the speaker only has to utter a limited sequence of sounds to cover his total voice profile; if accents, dialects and other characteristics of a population can also be programmed in, then a truly foolproof speaker-independent system can be devised. This new approach can deliver both speech recognition and speaker identification systems, leading to a wide set of applications. Another preferred method is to use a simple sound-profile matching system that accesses the data bank of sentences or phrases to get the best match. It may also be possible to sequence the matching process by accessing the initial words and, if a match is found, proceeding to the next word in the sentence and so on, until a good match is achieved.
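As a rough software analogue of this PLL correction, the sketch below (Python/NumPy, reusing the estimate_f0 helper above) measures each frame's fundamental and resamples the frame by the ratio of measured to target frequency, so that its pitch and harmonics land on a fixed value. The 120 Hz target is an assumed figure, and simple resampling also changes frame duration, which the analogue scheme described in the text avoids, so this is illustrative only.

```python
import numpy as np

TARGET_F0 = 120.0  # the fixed, predetermined fundamental frequency (an assumed value)

def monotonise(frames: list[np.ndarray], sample_rate: int) -> np.ndarray:
    """Pull each frame's fundamental frequency towards TARGET_F0 by resampling.

    A crude digital analogue of the PLL feedback: a frame spoken at 150 Hz is
    stretched by 150/120 so that its pitch (and harmonics) fall to the target.
    """
    out = []
    for frame in frames:
        f0 = estimate_f0(frame, sample_rate)   # from the earlier sketch
        ratio = f0 / TARGET_F0                 # >1 means the voice is pitched too high
        n_out = int(round(len(frame) * ratio)) # stretch (or compress) the frame
        resampled = np.interp(np.linspace(0, len(frame) - 1, n_out),
                              np.arange(len(frame)), frame)
        out.append(resampled)
    return np.concatenate(out)
```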
Another preferred method of producing the monotone speech is to use the feedback current of the PLL detector to alter the sampling rate of the analogue-to-digital converter (ADC), such that when the frequency rises the sampling rate increases, which is analogous to lowering the frequency to the fixed level. This system requires very accurate sampling techniques and feedback accuracy. To achieve this goal, two preferred solutions using the feedback from the PLL system will now be described. One preferred method of obtaining the monotone voice pattern is described in Figs 2 and 3, in which the analogue signal from the mouth microphone is recorded on to any suitable recording medium such as a continuous-loop magnetic tape or wire (Fig 2), or a recording disk of suitable diameter (Fig 3), all running at constant speed. In common with all the media, a movable recording head (a), driven by the feedback from the PLL, moves between points (x) and (y), while the playback head (b) and the erase head (c) are fixed in position. In the example of the disk, the recording head (a) pivots around the spindle (d) of the disk and is driven by a servo mechanism (e) that is controlled by the feedback current from the PLL. A fixed playback head (b) and erase head (c) are also positioned as far ahead of the recording head as possible, as shown. As the fundamental frequency in the throat microphone rises in the PLL system, the feedback current drives the recording head in the opposite direction to the spin of the disk, effectively depositing a voice recording on the disk from the mouth microphone at the constant predetermined fundamental frequency. The recorded sound is effectively stretched out a little on the disk and is picked up by the playback head as a monotone voice; this is analogous to the gramophone record run at different speeds. If the fundamental frequency drops, the recording head moves in the direction of the spin, effectively compressing the recorded sound to compensate for the lowering of the frequency. If the fundamental frequency equals the predetermined frequency, the recording head remains stationary. This monotone recording is picked up, as the tape or wire moves or the disk spins under the playback head, and fed to the ADC and thence to the voice recognition system. The recording is then immediately erased, as the playback and erase heads are fixed in position and close to each other, and the disk is free to continue recording continuously. As the recording head moves forward to compensate for the increased frequency, it is essential that the disk diameter is large enough to accommodate whole sentences and phrases before the recording reaches the playback and erase heads. At the end of a sentence there is a pause for an answer and the recording head moves back to its original start point ahead of the playback/erase heads, unless the speaker starts to talk again, causing it to move forward or backward again. Another preferred method is to record the voice as emitted and cause the pickup and erase heads to move back and forward in response to the PLL current, as in the previous method and analogous to the gramophone reference. Another simple method, which does not require the throat microphone, is to isolate the lowest frequency of a voice microphone and use it as the fundamental frequency, which can be applied to any of the above methods.
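For that last variant, a minimal sketch of isolating the low-frequency content of the mouth-microphone signal so that it can stand in for the throat microphone's fundamental; the FFT-based low-pass and the 300 Hz cutoff are assumptions chosen only to illustrate the idea, not a prescribed filter.

```python
import numpy as np

def fundamental_band(signal: np.ndarray, sample_rate: int,
                     cutoff_hz: float = 300.0) -> np.ndarray:
    """Keep only the lowest-frequency content of the mouth signal.

    This stands in for the throat microphone's fundamental in the 'simple method'
    above; the 300 Hz cutoff is an assumed figure for typical speaking fundamentals.
    """
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    spectrum[freqs > cutoff_hz] = 0.0          # zero everything above the cutoff
    return np.fft.irfft(spectrum, n=len(signal))
```

The output of such a filter could then be fed to the same f0 estimation and correction steps sketched earlier.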
The utterance that proceeds to the ADC for analysis is therefore essentially constant whenever the same speaker repeats the same sentence, and can be more easily processed by any commercial word recognition technology, or by an even less complicated one programmed exclusively to recognise sentences using templates or neural networks, or any other easily achievable alternative previously developed but found to be impracticable when domain dependence posed an even greater problem. The monotone voice is not heard by the persons sending or receiving the message, since the digital Code only triggers the spoken words originally taught to and stored in the memory of the computer, which contain all the variations and emphasis of the language. If required, the system can be used to identify the caller by accessing his voice pattern and matching it against a database in your computer.
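To illustrate how such a simplified, sentence-level recogniser might match a monotone utterance against the taught Lexicon entries, here is a hedged sketch in Python/NumPy. The spectral-profile feature, the cosine-similarity score and the 0.8 acceptance threshold are all assumptions for the example; a practical system would use richer features and time alignment, but the structure of "match against stored templates, return the Marker" is what the text describes.

```python
import numpy as np

def spectral_profile(signal: np.ndarray, n_bins: int = 256) -> np.ndarray:
    """Reduce an utterance to a fixed-length magnitude-spectrum profile (illustrative feature)."""
    mag = np.abs(np.fft.rfft(signal))
    idx = np.linspace(0, len(mag) - 1, n_bins)       # resample to a fixed number of bins
    profile = np.interp(idx, np.arange(len(mag)), mag)
    norm = np.linalg.norm(profile)
    return profile / norm if norm else profile

def best_marker(utterance: np.ndarray, templates: dict[int, np.ndarray],
                threshold: float = 0.8) -> int | None:
    """Return the Marker whose stored template profile best matches the utterance.

    `templates` maps Marker -> spectral_profile of the taught monotone utterance.
    If no template scores above the (assumed) threshold, the speaker is asked to rephrase.
    """
    probe = spectral_profile(utterance)
    scores = {m: float(np.dot(probe, t)) for m, t in templates.items()}
    marker, score = max(scores.items(), key=lambda kv: kv[1])
    return marker if score >= threshold else None
```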

Transmission of the Code

The Code may be transmitted over any medium, including telephone landlines, optical cables, radio, satellite or the Internet. The Code or other identification data may be encoded for security, or sent in packets via the Internet into the receiver's computer to trigger the message. Since the information sent is small compared to a conversation and involves only intermittent use of the medium, bandwidth management and multiplexing can be used to exploit the available capacity of the medium efficiently. The Invention can also be used in WAP-based access systems such as mobile phones and similar applications, including 3G technology, preferably using the Internet or another communication system. The speaker must have access to a microphone system that can optionally pick up the throat signal together with the mouth signal, which can be processed in the recording/playback system referred to above; the modified voice signal is fed to the ADC, and the sampled digital data can then be processed on a mini computer or transmitted through a mobile telephone to a central computer that holds the speaker's voice training, downloaded from the speaker's computer. The Internet service provider (ISP) is one ideal place to conduct this service, and the memory required is similar to that allocated for web space by the provider. The data is processed at the ISP and the digital code is sent onward to the receiver's ISP, which is similarly programmed to play back the receiver's voice sound, which is then transmitted to his mobile phone. The cost will be a local call plus the Internet charges of the ISP to any part of the world. Since the Invention is able to cut down on the processing workload, it is now possible to use the released memory to create a speaker-independent voice recognition system that does not require individual programming but can achieve a common programme for a language irrespective of dialect or other pronunciation differences, by amalgamating the sounds resulting from the physical profile of the mouth, tongue and nose and the like that are common to most speakers of a population, together with the smaller differences such as accents. Such a system can be created in the laboratory based on the data fed in, to create a viable recognition programme and eliminate the tedious task of individual programming.
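Because only the Code travels between the parties, the payload can be a handful of bytes. The sketch below (Python standard library only) sends the Marker as a 4-byte integer over UDP and has the receiver look up a locally prerecorded file named after the Marker; the port number, packet format and "<marker>.wav" naming convention are assumptions made for the example, not part of the specification.

```python
import socket
import struct

PORT = 50007  # arbitrary port chosen for this sketch

def send_marker(marker: int, host: str) -> None:
    """Transmit only the Marker (a 4-byte unsigned integer) - never the voice itself."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(struct.pack("!I", marker), (host, PORT))

def receive_and_trigger() -> None:
    """Wait for one Marker and trigger the receiver's own prerecorded utterance.

    The filename convention `<marker>.wav` and printing rather than playing audio are
    conveniences for the sketch; any Triggering Means mapping Marker -> stored recording would do.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("", PORT))
        data, _addr = sock.recvfrom(4)
        (marker,) = struct.unpack("!I", data)
        print(f"Marker {marker} received; play back recordings/{marker}.wav")
```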