

Title:
A SYNTHETIC SPEECH GENERATION METHOD FOR GENERATING VOICE MESSAGES
Document Type and Number:
WIPO Patent Application WO/2018/225048
Kind Code:
A1
Abstract:
A method for synthetic generation of voice messages consisting of a sequence of words, which method comprises the following steps: a) visualizing a first set of icons, each icon being matched to at least one word, b) selecting a first icon from the first set by a user, c) visualizing an additional set of icons, the additional set being generated based on the icon selected at the previous step, d) selecting a further icon from the additional set by the user, steps c) and d) being iteratively repeated in order to form an orderly list of selected icons, e) synthetically generating a voice message consisting of the sequence of words matched to each icon in the list under user command, wherein the first set of icons is related to subjects, one or more additional sets are related to verbs, and the remaining sets are related to complements.

Inventors:
FLORIS FEDERICA (IT)
PIAGGIO ALESSANDRA (IT)
Application Number:
PCT/IB2018/054190
Publication Date:
December 13, 2018
Filing Date:
June 11, 2018
Assignee:
FLORIS FEDERICA (IT)
PIAGGIO ALESSANDRA (IT)
International Classes:
G10L13/04; G09B5/06
Foreign References:
US20130065204A12013-03-14
Other References:
SAMUEL SENNOTT ET AL: "Proloquo2Go Basics Training Tutorial for Proloquo2Go", 26 May 2010 (2010-05-26), XP055446595, Retrieved from the Internet [retrieved on 20180131]
ANONYMOUS: "Road to Proloquo2Go 4 - Yo hablo español | AssistiveWare", 10 May 2017 (2017-05-10), XP055446590, Retrieved from the Internet [retrieved on 20180131]
AUTISM ADVENTURES: "Nova Chat Device Tutorial", YOUTUBE, 11 November 2013 (2013-11-11), pages 2 pp., XP054978070, Retrieved from the Internet [retrieved on 20180201]
EARL: "NOVA Chat User's Guide", 8 August 2012 (2012-08-08), pages 1 - 81, XP055446571, Retrieved from the Internet [retrieved on 20180131]
Attorney, Agent or Firm:
ARECCO, Andrea (IT)
Claims:
CLAIMS

1. A method for synthetic generation of voice messages consisting of a sequence of words, which method comprises the following steps:

a) visualizing a first set of icons, each icon being matched to at least one word,

b) selecting a first icon from the first set by a user,

c) visualizing an additional set of icons, the additional set being generated based on the icon selected at the previous step,

d) selecting a further icon from the additional set by the user, steps c) and d) being iteratively repeated in order to form an orderly list of selected icons,

e) synthetically generating a voice message consisting of the sequence of words matched to each icon in the list under user command,

characterized in that

the first set of icons is related to subjects, one or more additional sets are related to verbs, and the remaining sets are related to complements.

2. The method according to claim 1, wherein the word matched to each icon in the list is automatically inflected or conjugated based on the word matched to the previous icon in the list.

3. The method according to one or more of the preceding claims, wherein one or more sets comprise icons which are exclusively matched to an additional set.

4. The method according to one or more of the preceding claims, wherein each subject can be changed in gender, person and number in advance before step a).

5. The method according to one or more of the preceding claims, wherein icons can be changed to photos or images in advance.

6. The method according to one or more of the preceding claims, wherein two fixed and user-selectable icons are visualized, which fixed icons are matched to words "YES" and "NO" respectively.

7. The method according to one or more of the preceding claims, wherein the icons in each set comprise a visual element which is different in colour from the additional sets.

8. A device for synthetic generation of voice messages consisting of a sequence of words, which device comprises

a display for visualizing a set of icons, each icon being matched to at least one word,

a user interface for sequentially selecting a plurality of icons by a user to form an orderly list of selected icons,

a voice synthesizer for synthetically generating a voice message consisting of the sequence of words matched to each icon in the list, processing means for controlling the display, the user interface and the voice synthesizer,

characterized in that

said processing means are configured so as to carry out the method according to one or more of the preceding claims.

9. The device according to claim 8, wherein the device is a mobile device, preferably a tablet.

Description:
"A synthetic speech generation method for generating voice messages"

The object of the present invention is a method for synthetic generation of voice messages. The method comprises the following steps:

a) visualizing a first set of icons, each icon being matched to at least one word,

b) selecting a first icon from the first set by a user,

c) visualizing an additional set of icons, the additional set being generated based on the icon selected at the previous step,

d) selecting a further icon from the additional set by the user, steps c) and d) being iteratively repeated in order to form an orderly list of selected icons,

e) synthetically generating a voice message consisting of the sequence of words matched to each icon in the list under user command.
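Steps a) to e) above can be sketched as a short loop over icon sets. All set names and contents below are illustrative assumptions, not the application's actual data; automatic inflection and actual voice synthesis are omitted.

```python
# Sketch of steps a)-e): each selected icon determines which set of icons
# is visualized next. Set names and contents are illustrative only.

ICON_SETS = {
    "subjects": {"I": "verbs", "mum": "verbs", "dad": "verbs"},
    "verbs": {"want": "complements", "eat": "complements"},
    "complements": {"tea": None, "bread": None},
}

def build_sentence(selections):
    """Walk the sets in order, collecting the word matched to each icon."""
    current_set = "subjects"                 # step a): first set relates to subjects
    words = []
    for icon in selections:                  # steps b)-d): iterative selection
        next_set = ICON_SETS[current_set][icon]
        words.append(icon)                   # here each icon maps to its own label
        if next_set is None:                 # no further set: the list is complete
            break
        current_set = next_set
    return " ".join(words)                   # step e): text handed to a synthesizer

print(build_sentence(["I", "want", "tea"]))  # -> I want tea
```

The key design point is that the user never sees all icons at once: each selection determines the single set visualized next.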

Such methods are currently known in the art and used in the context of so-called Augmentative and Alternative Communication (AAC), i.e. the body of knowledge, techniques, strategies and technologies aimed at facilitating and increasing communication for individuals who experience difficulties in using the most common communication channels, especially oral language and writing.

Communication usually takes place through speech, writing and body language; however, individuals affected by cognitive, sensory or motor disabilities are often unable to communicate through their body, facial expressions, voice or writing. Communication deficits result in serious negative consequences at relational, linguistic, cognitive and social level, and they always involve a dramatic life situation.

Augmentative and Alternative Communication refers to the body of knowledge, techniques, strategies and technologies which facilitate and increase communication for individuals who experience difficulties in using the most common communication channels, especially oral language and/or writing, by supplementing or replacing natural language and/or writing with alternative communication means, such as electronic communication devices, gestures or graphic symbols.

The communication modes used in this field are intended not to replace but rather to augment natural communication: indeed, the aim of the intervention should be to improve communication skills through any available mode and channel. Therefore, AAC is neither a substitute for oral language nor, where development is possible, an inhibitor of that development; instead, it always supports relationships, understanding and thought.

The purpose of AAC is to augment communication skills both in disabled individuals and in the people surrounding them, in order to enable each individual with complex communication needs to make decisions, express refusal or agreement, tell about his/her experiences, express his/her moods, and influence his/her own environment.

Problems in the fields of communication and language have been shown to be so important among people suffering from an autism spectrum disorder that they are one of the criteria used when diagnosing the syndrome.

DSM-V (Diagnostic and Statistical Manual of Mental Disorders) identifies specific features which are useful in the diagnosis of autism spectrum disorders:

A. Persistent deficits in social communication and social interaction across multiple contexts - not explainable by general developmental delays - as manifested by the following:

1. Deficits in social-emotional reciprocity: abnormal social approach and difficulties in conversation, and/or reduced interest in sharing passions, emotions and affections, and/or failure to initiate social interactions.

2. Deficits in nonverbal communicative behaviours used for social interaction, ranging from poorly integrated verbal and nonverbal communication, to abnormalities in eye contact and body language, to deficits in understanding and use of non-verbal communication, to a total absence of facial expressions and gestures.

3. Deficits in developing and maintaining relationships appropriate to the development stage (excluding those with parents and caregivers): difficulties in adjusting behaviour to suit various social contexts, and/or difficulties in sharing imaginative play and making friends, and/or apparent lack of interest in people.

B. Restricted, repetitive patterns of behaviour and/or interests and/or activities as manifested by at least 2 of the following:

1. Stereotyped and/or repetitive language and/or motor movements and/or use of objects: such as simple motor stereotypies, echolalia, repetitive use of objects, idiosyncratic phrases.

2. Excessive adherence to routine, ritualized patterns of verbal or nonverbal behaviour, and/or excessive resistance to changes (motor rituals, insistence in taking the same route or eating the same food every day, relentless questioning or arguing, or extreme distress at small changes).

3. Highly restricted, fixated interests that are abnormal in intensity or focus: strong attachment to or preoccupation with unusual objects, excessively circumscribed or perseverative interests.

4. Hyper- or hypo-reactivity to sensory input, or unusual interest in certain aspects of the environment: apparent indifference to hot/cold/pain, adverse response to specific sounds or textures, excessive smelling or touching of objects, fascination with lights or moving objects.

C. Symptoms must be present in early childhood (however, they may not become fully manifest until social demands exceed limited capacities).

D. Collectively, the symptoms must impair daily functioning.

E. These disturbances are not better explained by intellectual disability or global developmental delay. Intellectual disability and autism spectrum disorder frequently co-occur.

Difficulties related to a disturbance attributable to autism spectrum disorder often arise in the fields of communication and language.

In most cases, individuals affected by autism spectrum disorder are unable to use appropriate and effective gestures, and they struggle to understand abstract concepts.

The most common features in verbalization of these individuals are:

1) Echolalia: it can be immediate, when a word or sentence is repeated immediately after it is heard from another person, or delayed, when the repetition relates to something heard some time before.

2) Idiosyncratic language, characterized by the use of unusual, odd expressions apparently unrelated either to the context in which verbal interaction occurs or to the contents thereof. This language is defined as 'idiosyncratic' because it refers to words that the individual may have related to previous contexts or experiences and uses without worrying about being understood by the interlocutor.

3) Pronominal reversal or change: characterized by a frequent tendency to replace the pronoun "I" with "you" and by poor aptitude for using the other pronouns. These difficulties arise from poor flexibility in understanding the need to designate roles during a conversation; due to this, it is difficult for the individual to understand why a person is sometimes referred to as "I", "you" or "he/she". While the origin of this problem is not clear, it is believed that it may result from an imitative behaviour which consists of repeating a sentence that was addressed to the individual.

4) Poor knowledge of the meanings of words and of how to connect them to other words.

Therefore, the consequences are:

- syntactic difficulties, i.e. in the understanding and use of the syntactic elements of language;

- semantic difficulties, i.e. in the size of the individual's personal dictionary, including concepts and words as well as the meanings they represent.

Individuals who are affected by autism spectrum disorder and who have not acquired appropriate communication modes can use alternative systems, such as sign language, exchange of images and pictograms, communication schemes based on written words, and computer aids. AAC provides for the possibility to use tools and different types of special communication aids with the aim of actually improving communication opportunities. Among these aids, devices are known which are provided with voice output, called Speech Generating Devices (SGD). Such devices can be special-purpose tablets or general-purpose tablets, such as iPads, in which a special software application is run. In both cases, an SGD allows the method described at the beginning of the present specification to be carried out by visualizing icons, allowing a user to select a sequence of icons, and synthetically generating a voice output for the sentence corresponding to the sequence of icons.

However, in current methods, all icons are visualized at the same time, and the user has to recognize and choose them autonomously in order to form the desired sequence. Clinical observations have shown that this approach has drawbacks in a number of cases: the user is subjected to a large amount of information, especially if the method involves visualizing many juxtaposed icons in order to cover a wide range of communication opportunities. Dealing with such an amount of information to create a coherent sentence can be difficult and, in most cases, the user gets confused, chooses icons wrongly or randomly, and fails to obtain an effective communication aid.

For example, document US2013/065204A1 relates to the software application "SpeakForYourself", which visualizes a large number of icons on a single screen. This software application allows a user to select certain icons in order to sequentially visualize sets of icons, wherein a subsequent set is generated based on the icon selected in a previous set, thereby guiding the user through a hierarchically ordered menu of icons.

This is also described for other software applications, such as for example "Proloquo2Go" (see for example document "Proloquo2Go Basics Training Tutorial for Proloquo2Go") and "NOVAchat" (see for example document "NOVA Chat User's Guide").

However, in all the documents referred to above, the elements of the sentence, i.e. subjects, verbs and complements, are visualized on the same screen, thereby making it difficult for a user to create a sentence.

Therefore, there is a currently unmet need to actively guide a user in generating a list of icons, and therefore a sentence to be output as a voice message, in order to obtain a meaningful sentence which can meet the communication needs of the user.

The present invention aims to overcome these disadvantages of the currently known methods by providing a method as described at the beginning of the present specification wherein, in addition, the first set of icons is related to subjects, one or more additional sets are related to verbs, and the remaining sets are related to complements.

The method can guide a user having difficulties in communication and language, particularly an individual affected by autism spectrum disorders, by structuring the sentence in a pre-established manner so as to enable the creation of complete, meaningful sentences which can facilitate the user in expressing his/her needs and desires.

The method is provided with an algorithmic architecture which follows the same logics and rules as spoken language in order to replicate the characteristics thereof: each element in the sentence is clearly distinct from the other elements through the use of dedicated screens. This can mitigate any attention difficulty and promote the development of syntactic skills. Accordingly, the method is not only a tool which enables and promotes communication but also an effective method for training a user in order to develop and - if possible - enhance an autonomous verbal production through continuous learning of language formulation mechanisms.

Thus, the method helps the user to internalize:

1) syntactic rules, i.e. the understanding and use of syntactic elements of language;

2) semantic skills, by increasing the number of words known by the individual and facilitating the comprehension of concepts and words as well as meanings they represent.

Therefore, the method clearly has a strong and functional rehabilitation value with a view to an increasingly enhanced autonomy.

The visualized set of icons related to verbs consistently and exclusively refers to the subject selected at the previous step. Likewise, the visualized set of icons related to complements consistently and exclusively refers to the verb selected at the previous step.

In contrast to the methods known in the art, the method according to the present invention guides the user in the construction of meaningful sentences.

In a preferred embodiment, a set is only related to subjects, an additional set is only related to verbs, an additional set is only related to object complements, and an additional set is only related to other complements. The various sets are visualized sequentially and mutually exclusively.

In a further exemplary embodiment, the word matched to each icon in the list is automatically inflected or conjugated based on the word matched to the previous icon in the list.

Firstly, this feature can improve communication because it can lead to the construction of a grammatically correct sentence which cannot be wrongly interpreted, and secondly, it assists the user in learning correct modes of language. Therefore, the method can allocate the elements of a sentence to specific screens according to a tree-architecture based on government, and it can establish consistent connections among the various words in the sentence.

Government is the phenomenon in which the presence of a specific word in a phrase forces other words in that phrase to take a given form: a verb mood, a specific case (for languages with cases, such as Latin or German) or a specific preposition (for languages without cases, such as Italian).

Government is primarily syntactic in function, since it indicates the existence of a connection between the governor and the governee without necessarily giving indications about the semantic value of the connection itself. However, when a governor can have different mutually alternative governees, such alternatives usually result in different semantic values.

Therefore, based on the concept of government, the method is not only an expression tool but also a rehabilitation tool because it leads to syntactic structuring.

According to an embodiment, one or more sets comprise icons which are exclusively matched to an additional set.

In this case, the icons forming the visualized set are not connected to a word and, consequently, cannot be selected to populate the list of icons corresponding to the sentence; rather, they are only conceptual connections to a set of icons related to words. Selecting one of these conceptually higher-order icons allows the user to visualize icons which are effectively connected to words. The user is thus shown screens of conceptually higher-order icons from which, depending on his/her selection, he/she can access the sets of icons related to words.
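This distinction between word icons and higher-order "connector" icons can be sketched minimally as follows; all set names, icon names and the division into connectors are illustrative assumptions.

```python
# Sketch of higher-order "connector" icons: icons matched exclusively to an
# additional set rather than to a word. Selecting one only changes which set
# is visualized; it never adds a word to the sentence. Names are illustrative.

SETS = {
    "want-complements": ["drinks", "foods", "objects"],  # connector icons
    "drinks": ["water", "coffee", "tea"],                # word icons
    "foods": ["bread", "pasta"],
}
CONNECTORS = {"drinks", "foods", "objects"}              # icons with no word

def select(icon, sentence):
    """Return (icons to visualize next, updated sentence)."""
    if icon in CONNECTORS:
        return SETS.get(icon, []), sentence      # only open the linked set
    return [], sentence + [icon]                 # a word icon joins the list

shown, sentence = select("drinks", ["I", "want"])   # connector: sentence unchanged
shown, sentence = select("tea", sentence)           # word icon: sentence grows
print(sentence)  # -> ['I', 'want', 'tea']
```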

In a further embodiment, each subject can be changed in gender, person and number in advance before step a). This allows the user to edit information related to subjects in advance in order to create sentences which are automatically inflected in a correct manner before performing the steps of the method.

In an exemplary embodiment, icons can be replaced with photos or images in advance.

This allows the user to customize the appearance of the icons both to improve the recognition thereof, thereby assisting the user during the selection procedure, and to create a familiar environment for the user and thus increase his/her comfort and encourage him/her to utilize the method.

According to a further exemplary embodiment, two fixed and user-selectable icons are visualized, which fixed icons are matched to words "YES" and "NO" respectively.

This makes it possible to discriminate between the sequential visualization of the sets of icons and a static visualization of icons with more basic meanings, such as those related to agreement and to dissent or refusal. In this way, these icons are always available and immediately accessible to the user.

In an exemplary embodiment, the icons in each set comprise a visual element which is different in colour from the additional sets.

Indeed, in language disorders, one of the difficulties that arise may relate to the ability to discriminate among the various elements in a sentence. The present method works on the categorization of these elements not only by allocating a specific screen to the individual elements but also by discriminating among them with the use of specific colours.

Establishing relationships between colours and parts of speech has led to important outcomes in the rehabilitation of individuals with language disorders by assisting individuals in the appropriate use of elements. This technique is defined as "sentence-colour" by Dr. Adriana De Filippis (Nuovo manuale di logopedia, Edizioni Centro Studi Erickson). In a preferred embodiment, icons for subjects are yellow in colour, icons for verbs are green in colour, icons for object complements are blue in colour, and icons for other complements are red in colour.

The visual element can be, for example, the background, a frame or portion thereof, or the like.

A further object of the present invention is a device for synthetic generation of voice messages consisting of a sequence of words, which device comprises

a display for visualizing a set of icons, each icon being matched to at least one word,

a user interface for sequentially selecting a plurality of icons by a user to form an orderly list of selected icons,

a voice synthesizer for synthetically generating a voice message consisting of the sequence of words matched to each icon in the list, processing means for controlling the display, the user interface and the voice synthesizer,

wherein said processing means are configured so as to carry out the above-described method.

In a preferred exemplary embodiment, the device is a mobile device, preferably a tablet.

Accordingly, the device of the present invention is a speech generating device (SGD) which is adapted to guide a user in the creation of a meaningful sentence by separately and sequentially visualizing screens containing sets of icons, wherein each additional set is generated based on the selection made in the previous screen.

These and other features and advantages of the present invention will appear more clearly from the following description of certain embodiments illustrated in the accompanying drawings, wherein:

Figs. 1 to 4 show different screens;

Fig. 5 shows a general block diagram of the method;

Fig. 6 shows a detailed block diagram of the method;

Fig. 7 shows a diagram illustrating the use of the method by the user;

Fig. 8 shows a block diagram of the device.

Figure 1 shows an exemplary embodiment of the device of the present invention which is configured to carry out the method of synthetic generation of voice messages consisting of a sequence of words.

According to the exemplary embodiment shown in the figure, the device is a tablet 1 provided with a touchscreen display 10 which allows the user to visualize information and interact therewith in order to carry out the method.

The display 10 visualizes a set of icons 2, each icon being matched to at least one word, and the user can sequentially select a plurality of icons 2 to form an orderly list of selected icons. A collecting area 4 is arranged at the top of the graphical interface to collect the orderly list of selected icons, and a visualizing area is arranged at the centre of the interface to visualize the icons 2. Icons are associated with a text string 3 which may show one or more words the icon refers to.

When the device is turned on or the software application is started, an initial or homepage screen is visualized, followed by a first screen which shows a first set 22 of icons as illustrated in Figure 1. Such first set is advantageously related to subjects, such as "I", "dad", "mum", "girlfriend/boyfriend", "child", "grandfather", and others not specifically indicated in the figure. Both in this set and in the other sets described hereinbelow, icons 2 can show pictograms representing the objects of the icons, symbols or pictures. Each icon can be customized and modified by the user or a caregiver.

The user selects the desired icon, i.e. the icon related to "I", as illustrated by a dashed line.

Then a further screen is visualized, as illustrated in Figure 2, wherein the icon selected at the previous step appears within the collecting area 4 and wherein a second set 23 related to verbs is visualized in place of the first set 22. Such additional set is generated based on the icon selected in the previous screen. For example, verbs can be "go", "love", "have a shower", "write", "give", "want", and others not specifically shown in the figure.

Verbs are inflected by person: first and third singular, third plural.

Once the user selects the icon 2 corresponding to the desired verb, this icon appears in the collecting area 4 in juxtaposition with the icon already visualized in that area, and a further screen is visualized which shows an additional set, for example a set related to complements, which is generated based on the selected icon.

This process is repeated until the creation of the list of icons in the collecting area 4 is completed.

In the simplest cases, the process may end after only two steps (subject and verb), for example for the sentence "I have a shower", or after three steps (subject, verb and complement), for example for the sentence "I love my girlfriend". The selection of the verb "eat" by the user results in the visualization of a screen related to foods, the selection of the verb "feel sick" results in the visualization of a screen related to body parts, and so on for each selected verb.
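The verb-dependent screen generation described above can be sketched as a simple lookup; the mapping and set names are hypothetical, not the application's actual data.

```python
# Sketch of verb-dependent screen generation: the selected verb determines
# which set is visualized next ("eat" -> foods, "feel sick" -> body parts),
# or None when the sentence may already be complete. Mapping is hypothetical.

NEXT_SCREEN = {
    "eat": "foods",
    "feel sick": "body parts",
    "want": "want-complements",
    "have a shower": None,     # two-step sentence: subject and verb suffice
}

def screen_after_verb(verb):
    """Return the name of the set to visualize after the verb, or None."""
    return NEXT_SCREEN.get(verb)

print(screen_after_verb("eat"))  # -> foods
```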

The elements of the sentence can be customized according to the gender of the user.

The method can even deal with more complex cases by virtue of the sequential architecture of the screens and because each screen is generated based on the selection made in the previous screen.

For example, if the user selects the word "want", as indicated in Figure 2 by a dashed line, then the screen shown in Figure 3 will be visualized.

The screen of Figure 3 shows a set 24 comprising conceptually higher-order icons which are exclusively matched to an additional set. Such icons are conceptual connections to sets of icons which are related to words, and their selection only allows for visualizing the additional set they refer to. In the example of the figure, the screen visualized when the verb "want" is selected allows the user to choose among a number of complements from set 24: infinitive verbs, foods, drinks, objects and people. If the user selects the icon related to drinks, as illustrated in Figure 3 by a dashed line, a screen is visualized which shows a set of drinks as viewed in Figure 4, including e.g. "water", "coffee", "milk", "soft drink", "tea", and others not specifically illustrated in the figure.

In the example of Figure 4, the user has selected the icon related to "tea" which is then visualized within the collecting area 4 in juxtaposition with the previously selected icons.

The user can decide either that the sentence is complete or that it requires other details to be added with the use of further complements. For example, for the sentence "mom listens", the user can select the icon "music" and then specify where and with whom his/her mother is listening to music through a screen similar to the screen illustrated in Figure 3. This allows even more complex sentences to be constructed, such as e.g. "I play volleyball at school with my friends".

Once the sentence is completed, each element thereof can be viewed in the collecting area 4.

The graphical interface comprises a voice playback icon 50 which can be selected by the user to synthetically generate a voice message consisting of the sequence of words matched to each icon in the list.

Any word can be added or deleted as desired so as to enlarge or simplify expression opportunities.

The graphical interface also comprises two fixed icons which can be selected by the user at any time: an agreement icon 20 matched to the word "YES", and a refusal icon 21 matched to the word "NO".

Other fixed icons include a deletion icon 53 to delete the entire sentence, a home icon 51 to return to the initial or homepage screen, and a back icon 52 to return to the previous screen.

The initial screen contains various icons or buttons allowing the user to start the method, ask for a break, ask for help or access the settings.

In the settings, an item (a subject, verb or complement) can be added, changed or deleted. For example, for a subject, it is possible to change its gender (male or female), person (first or third), number (singular or plural) and visualization hierarchy.

Figure 5 shows a block diagram of the method in which a first set of icons is initially visualized, 60. Typically, this first set comprises subjects as described above. Then, the user performs a step of selecting a first icon from the first set, 61.

Next, the method automatically proceeds in one of two modes, depending on the selection made by the user. A first mode involves simply visualizing an additional set of icons, for example a set related to verbs, 66. Subsequently, again, the user performs a step of selecting a further icon from the additional set, 67.

In a second mode, a set of higher-order icons is visualized, 62, as illustrated in Figure 3. The user performs a step of selecting a higher-order icon, 63. This results in the visualization of an additional set of icons, 64, followed by a step of selecting a further icon from the additional set, 65.

These steps are repeated until the creation of the list of selected icons is completed.

Once the sentence is completed, the user can activate the voice playback function, 68.

Figure 6 illustrates the operational algorithm which uses a plurality of government steps 82 to establish a coherent connection among the various words in a sentence when said sentence is created.

In linguistics, government refers to the influence exerted by one grammatical element upon another. The dominant element activates certain morphemes in the subordinate elements to which it is syntagmatically related.

A typical example of government is case government, in which a verb or preposition establishes the case for which the noun to which it relates has to be inflected.

For example, the German prepositions "zu", "mit", "von", "aus" and "zwischen" govern the dative case. The German verb "helfen" is always followed by the dative case of the "helped person", which is instead expressed by an object complement in Italian ("ich helfe meinem Bruder" = "aiuto mio fratello", i.e. "I help my brother").
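As an illustration only, the case government described above can be modelled with a small lookup. The noun forms are standard German, but the table, the function and its name are a hypothetical sketch, not linguistic software.

```python
# Illustrative model of case government: a governing preposition forces the
# dative form of the noun phrase that follows it.

DATIVE_PREPOSITIONS = {"zu", "mit", "von", "aus", "zwischen"}

# Nominative -> dative forms for a few noun phrases (illustrative table).
DATIVE_FORMS = {"der Bruder": "dem Bruder", "die Mutter": "der Mutter"}

def govern(preposition, noun_phrase):
    """Return the noun phrase in the form governed by the preposition."""
    if preposition in DATIVE_PREPOSITIONS:
        return f"{preposition} {DATIVE_FORMS[noun_phrase]}"
    return f"{preposition} {noun_phrase}"   # other cases omitted in this sketch

print(govern("mit", "der Bruder"))  # -> mit dem Bruder
```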

In Italian, the adjective "sensibile" can only be followed by the preposition "a": e.g. "sono sensibile alle lusinghe" ("I am susceptible to flattery").

More specifically, government also refers to phrases wherein the head of the phrase (i.e. the fundamental element on which the other elements depend) influences the modifier (the element which depends on the head) in order to determine the grammatical characteristics thereof.

There are two types of government: free government and mandatory government.

In the first case, the dominant word can also be used without a subordinate word (e.g., in Italian, "sono sensibile" is not necessarily followed by "a ..."), while in the second case, the dominant word cannot stand on its own without the subordinate element (e.g., in Italian, "fare a meno" has no meaning unless it is followed by "di" + noun).
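Such government rules lend themselves to a simple lookup table. The sketch below is a hypothetical representation (the table structure, field names and `allowed_without_subordinate` helper are assumptions, not the patent's data model), using the examples given above.

```python
# Each rule names what the dominant word governs and whether government
# is mandatory (the word cannot stand alone) or free (it can).
GOVERNMENT_RULES = {
    # German prepositions and verbs governing the dative case
    "zu":     {"governs": "dative", "mandatory": True},
    "mit":    {"governs": "dative", "mandatory": True},
    "helfen": {"governs": "dative", "mandatory": True},
    # Italian adjective followed only by "a"; free government,
    # since "sono sensibile" can stand alone
    "sensibile": {"governs": "a", "mandatory": False},
    # Italian "fare a meno" is meaningless without "di" + noun
    "fare a meno": {"governs": "di", "mandatory": True},
}

def allowed_without_subordinate(word):
    """Return True if the word may appear without a subordinate element."""
    rule = GOVERNMENT_RULES.get(word)
    return rule is None or not rule["mandatory"]
```

A government step in the sentence-building algorithm can then consult such a table to decide which subordinate elements must, may, or may not follow the word just selected.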

When the creation of a sentence is started, 80, a subject is selected and inflected for person chosen from the group consisting of 1st, 2nd, 3rd person singular and 1st, 2nd, 3rd person plural, 81. Based on the selection of the user, a government step 82 is performed, followed by a step 83 in which verbs are inflected for person chosen from the group consisting of 1st, 2nd, 3rd person singular and 1st, 2nd, 3rd person plural. Therefore, when combined with government step 82, the selection of subject 81 results in the visualization of a set of verbs which are visualized according to their meanings and inflected for the correct person, so that the user can continue with the selection of a verb, 83.
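Government step 82 can be illustrated with a minimal sketch for regular Italian first-conjugation (-are) verbs. The subject-to-person mapping and the ending table below are standard Italian grammar, but the function names and the restriction to regular -are verbs are assumptions made here for brevity.

```python
# Map each subject pronoun to its grammatical person.
SUBJECT_PERSON = {"io": "1sg", "tu": "2sg", "lui": "3sg",
                  "noi": "1pl", "voi": "2pl", "loro": "3pl"}

# Present-tense endings for regular Italian -are verbs.
ARE_ENDINGS = {"1sg": "o", "2sg": "i", "3sg": "a",
               "1pl": "iamo", "2pl": "ate", "3pl": "ano"}

def inflect_are_verb(infinitive, person):
    """Inflect a regular -are verb (e.g. 'parlare') for the given person."""
    stem = infinitive[:-3]               # drop the -are ending
    return stem + ARE_ENDINGS[person]

def verbs_for_subject(subject, infinitives):
    """Government step 82: return the verb set already inflected for the
    person determined by the selected subject (step 81)."""
    person = SUBJECT_PERSON[subject]
    return [inflect_are_verb(v, person) for v in infinitives]
```

Selecting the subject "noi", for instance, would cause the verb icons to be visualized with forms such as "parliamo" rather than the bare infinitive.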

Subsequently, the user can optionally select a category according to the modes described above with reference to higher-order icons, 84, and then select a complement, 85. The selection of category 84 and the selection of complement 85 can be iteratively repeated several times. In the example of the figure, the selection of category 84 and the selection of complement 85 are repeated three times so as to select one object complement and two further complements. After each step of selecting a subject 81, inflecting a verb 83 or selecting a complement 85, the selected word is output as a synthetic voice message, 86.

Once the sentence is completed, the user can activate the voice playback function. The entire sentence is then output as a synthetic voice message, 87.

Figure 7 shows a diagram for the use of the method by the user. When the creation of a sentence is started, 80, a subject is selected and inflected for person chosen from the group consisting of 1st, 2nd, 3rd person singular and 1st, 2nd, 3rd person plural, 81. If the user has not selected the correct subject, then the user can use an appropriate button or icon to delete the choice and return to step 81 in order to select the subject.

If the user has selected the correct subject, then an inflection step 83 is performed in which verbs are inflected for person chosen from the group consisting of 1st, 2nd, 3rd person singular and 1st, 2nd, 3rd person plural. Again, if the user has not selected the correct verb, then the user can use an appropriate button or icon to delete the choice and return to step 83 in order to select the verb.

If the user has completed the sentence, then the entire sentence is output as a synthetic voice message, 87. Otherwise, the user can select a complement 85. Again, if the user has not selected the correct complement, then the user can use an appropriate button or icon to delete the choice and return to step 85 in order to select the complement.

Also in the example of Figure 7, as in the example of Figure 6, three complements (one object complement and two further complements) can be selected, 85. Once each complement is selected, 85, if the sentence is completed, then the user can output the sentence as a synthetic voice message, 87.

After the output of the synthetic voice message, 87, if the needs of the user are all met, then the user can terminate the process. Alternatively, the user can return to step 81 in order to select a subject, thereby iteratively repeating the above-described method.
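The Figure 7 interaction, with its delete-and-reselect corrections and per-word playback, can be sketched as an event loop. This is a hedged illustration: the event tuples and the `speak` callback (standing in for the synthetic voice generation unit 71) are hypothetical names, not the patent's interface.

```python
def sentence_loop(events, speak):
    """Process user events to build and speak a sentence.

    events: iterable of ('select', word), ('delete',) or ('done',) tuples.
    speak:  callback standing in for the synthetic voice output.
    """
    sentence = []
    for event in events:
        if event[0] == "select":
            sentence.append(event[1])
            speak(event[1])              # step 86: speak each selected word
        elif event[0] == "delete":       # wrong choice: delete and reselect
            if sentence:
                sentence.pop()
        elif event[0] == "done":         # step 87: speak the whole sentence
            speak(" ".join(sentence))
            break
    return sentence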

Figure 8 illustrates a diagram of an exemplary embodiment of the device which comprises a touchscreen display 10 to visualize and select icons, to control the voice playback and to edit the settings, and a processing unit 70 configured to control the entire method.

The processing unit 70 is in turn connected to a storage unit 73 in which all the icons, images thereof and matching words are stored. The storage unit 73 is divided in a plurality of storage sub-units, each sub- unit being adapted to store specific sets of icons. Certain icons (e.g. people) can be stored in specific hybrid units which can be accessed both as a set of subjects and as a set of complements.

The processing unit 70 is also connected to a synthetic voice generation unit 71 which is adapted to translate the words in the list of icons into sounds. The synthetic voice generation unit 71 is in turn connected to a loudspeaker 72.

The processing unit 74 can be connected to a network for remote data transmission and reception, update, monitor and use purposes.