BRANDT, Christian (Bjergegade 24st, Odense C, DK-5000, DK)
HVASS SCHMIDT, Jesper (Kløvervænget 26A, Odense C, DK-5000, DK)
1. A system for conducting a hearing test using a computer program interacting with a test person, said system comprising a computer program running a routine to manage interaction with the test person, and adaptively select sound stimuli based upon said interaction according to a convergent process to determine a hearing threshold for the test person, wherein the hearing threshold is determined through an Alternative Forced Choice paradigm in conjunction with maximum-likelihood fitting of a most probable psychometric function.
2. The system of claim 1, wherein the hearing threshold is determined by a Two Alternative Forced Choice paradigm (2AFC) in combination with a maximum-likelihood and up-down method.
3. The system of claim 1 or 2, wherein the routine to manage interaction with the test person includes: logic providing graphic constructs for display at the computer corresponding to each of the alternative stimulus intervals, the graphic constructs being aligned in an up and down relationship, causing generation of a selected stimulus during one of the alternative stimulus intervals, and prompting the test person to make a choice by selecting a graphic construct using an input device indicating the test person's perception of the stimulus during said alternative stimulus intervals.
4. The system of any one of the claims 1 to 3, wherein automatic control of the generated hearing threshold is performed by running a routine checking the possibility of false positive or negative hearing threshold.
5. A method for conducting a hearing test using a computer program having interaction with a test person, by adaptively selecting sound stimuli based upon the interaction according to a convergent process to determine a hearing threshold for the test person, wherein the hearing threshold is determined through an Alternative Forced Choice paradigm in conjunction with maximum-likelihood fitting of a most probable psychometric function.
6. The method of claim 5, wherein the hearing threshold is determined by a Two Alternative Forced Choice paradigm (2AFC) in combination with a maximum-likelihood and up-down method.
7. The method of claim 5 or 6, wherein the routine to manage interaction with the test person includes: logic providing graphic constructs for display at the computer corresponding to each of the alternative stimulus intervals, the graphic constructs being aligned in an up and down relationship, causing generation of a selected stimulus during one of the alternative stimulus intervals, and prompting the test person to make a choice by selecting a graphic construct using an input device indicating the test person's perception of the stimulus during said alternative stimulus intervals.
8. The method of any one of the claims 5 to 7, wherein automatic control of the generated hearing threshold is performed by running a routine checking the possibility of false positive or negative hearing threshold.
FIELD OF THE INVENTION
The present invention relates to a computer implemented method and system for conducting reliable hearing tests, in which a visual effect, such as displaying a graphic construct on a computer screen corresponding to each of two or more alternative stimulus intervals, causes generation of a selected audio stimulus during one of the two or more alternative stimulus intervals, and prompts the test subject to make a choice by selecting a visual effect indicating the user's perception of the stimulus during the chosen one of said two or more alternative stimulus intervals.
BACKGROUND OF THE INVENTION
Hearing tests are used to develop hearing profiles of persons, which can be used for fitting hearing aids and for other diagnostic purposes. Professional audiologists are typically required for conducting the tests needed to provide a hearing profile, because of the large number of factors involved in making an assessment necessary for generating a reliable hearing profile. An audiologist is able to set up a controlled environment, and conduct the test according to a testing protocol involving a number of stimuli and response steps that is adapted based on the responses gathered during the test.
The hearing profiles of individuals vary in a number of ways. The ability to hear sounds varies with frequency among individuals across the normal audio frequency range. Also, the dynamic range varies among individuals so that levels of an audio stimulus that are perceived as soft sounds and levels of an audio stimulus that are perceived as loud sounds differ from person to person. Standard hearing tests are designed to produce an audiogram that characterizes such factors as frequency, sensitivity and dynamic range in the hearing profiles of individuals.
There are also other factors that affect a hearing profile. For example, psycho-acoustic factors concerning the manner in which a person perceives combinations of normal sounds affect the ability to hear in ways that can vary from person to person. Also, environmental factors such as the usual listening environment of a person (library, conference room, concert hall) and the equipment on which the sound is produced (loud speakers, ear phones, telephone hand set) are important. In persons wearing hearing aids or using other assistive hearing devices, the type of aid or device affects the hearing profile.
The physiology of an impairment suffered by the individual may also be an important factor in the hearing profile.
The hearing profiles of individuals have been applied in the hearing aid field for customizing and fitting hearing aids for individuals. See, for example, U.S. Patent No. 4,731,850 and U.S. Patent No. 5,848,171. Thus, techniques for processing sound to offset variations in hearing are well known. However, these techniques are unavailable to persons not using hearing aids.
Because of the difficulty in obtaining a hearing assessment test, and for a variety of other reasons, many persons who could benefit from devices that would assist their hearing do not follow through with obtaining a prescription for such devices. Thus, it is desirable to simplify the procedures involved in obtaining a reliable hearing assessment.
Traditional audiometry is normally conducted by a technically skilled person, often an audiologist, while user operated audiometry has mainly been used in research. Measurement of correct and reliable hearing thresholds depends on correct measurement techniques and subject compliance. Correct measurement techniques are particularly important for the audiologist conducting the audiometry. Several sources of measurement bias are related to the compliance of the test subject and influence both traditional and user operated audiometry procedures. Subjects have different response criteria for what they regard as the faintest sounds they can hear (Marshall, 1991; Marshall and Jesteadt, 1986). These response criteria depend on the method used for obtaining hearing thresholds and also on the instructions given to the subject (Marshall, 1991; Marvit et al., 2003). Other sources of bias are related to the operator of the equipment and the different headphones used for audiometry (Flottorp, 1995; Shaw, 1966). A reduction of these sources of bias will significantly improve the results obtained from audiometry. User operated audiometry controlled by computer procedures can be performed without the involvement of an operator. Apart from the saving in manpower, automatic procedures can have a clear advantage, both by making measurements more objective and by eliminating bias related to the observer. However, a disadvantage of computer controlled procedures is that some test subjects have difficulties using a computer.
The different sources of measurement bias lead, even in optimal clinical settings, to standard deviations of 3-4 dB for frequencies up to 4000 Hz and larger values for higher frequencies.
Furthermore, this will require normally hearing and well motivated subjects. In industrial audiometry the test-retest standard deviations may range from 6-10 dB (Dobie, 1983;Dobie,
The determination of auditory threshold is a complex psychophysical process, where responses determine the next step in the procedure. The method of clinical audiometry has been standardised and is either the ascending method or the bracketing method (ISO 8253-1, 1989). Other psychophysical methods based on different paradigms have been used widely in research, but they are typically very time consuming and the results may not be comparable to clinical measures.
Two well-known psychophysical paradigms are the Yes-No paradigm and the Two-Alternative Forced Choice (2AFC) paradigm (Green, 1993; Gu and Green, 1994; Marvit et al., 2003; Shelton and Scarrow, 1984). These paradigms must be combined with a strategy for the course of the stimulus presentation. Typical strategies could be based on the method of maximum-likelihood (MML), the simple up-down method or the transformed up-down methods (Green, 1993; Levitt, 1971). MML is an adaptive method, based on searching for the most probable psychometric function taken from a set of possible candidate psychometric functions, i.e. the function best fitting the data generated by the test subject (Green, 1995; Green, 1993; Marvit et al., 2003). The psychometric function of pure tone detection has been shown to span approximately 8 dB from a near-chance to a near-perfect response (Watson et al., 1972). Usually, MML has been combined with the Yes-No paradigm (Leek et al., 2000; Marvit et al., 2003). MML can be very efficient and a reliable threshold can often be obtained with only 15 trials (Green, 1993; Leek et al., 2000).
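The search over candidate psychometric functions described above can be sketched as follows. This is a minimal illustration only: the logistic shape, the fixed slope, the one-parameter candidate set and the names `p_yes`, `log_likelihood` and `most_probable_midpoint` are assumptions made for the example and are not taken from the cited literature or from the invention.

```python
import math

def p_yes(level_db, midpoint_db, slope=1.0, false_alarm=0.0):
    """Assumed logistic psychometric function for a Yes-No task."""
    core = 1.0 / (1.0 + math.exp(-slope * (level_db - midpoint_db)))
    return false_alarm + (1.0 - false_alarm) * core

def log_likelihood(trials, midpoint_db, slope=1.0, false_alarm=0.0):
    """Log-likelihood of (level_db, said_yes) trials under one candidate."""
    total = 0.0
    for level_db, said_yes in trials:
        p = p_yes(level_db, midpoint_db, slope, false_alarm)
        p = min(max(p, 1e-9), 1.0 - 1e-9)  # guard against log(0)
        total += math.log(p if said_yes else 1.0 - p)
    return total

def most_probable_midpoint(trials, candidates):
    """Return the candidate midpoint with the highest likelihood."""
    return max(candidates, key=lambda m: log_likelihood(trials, m))

# Hypothetical responses: (stimulus level in dB, subject answered "yes")
trials = [(40, True), (30, True), (20, True), (10, False), (15, True), (12, False)]
best = most_probable_midpoint(trials, range(-10, 61))
print(best)
```

In an adaptive procedure the next stimulus level would typically be placed at or near the current best estimate, so that each new trial is informative about the threshold.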
A disadvantage of MML used with the Yes-No paradigm for clinical use is that it requires a stable response criterion from the patient (i.e., a constant false alarm rate), which is not always the case (Green, 1995; Marshall, 1991; Marvit et al., 2003). Also, the subjects will occasionally be inattentive and the answers will not always be consistent. If the stimulus level is significantly above the threshold level the responses will in general be consistent and all stimuli will be detected. Inconsistent responses at high levels will introduce a large bias to the threshold estimate (Green, 1995; Marvit et al., 2003; Hall, 1983). At low stimulus levels the subject's responses will be guesses.
The rate at which the subject reports a signal as present when there is in fact no signal is called the false alarm rate. The thresholds measured will depend on this false alarm rate (Green, 1995; Marvit et al., 2003). False alarms will influence threshold estimation no matter which method is used. In the psychometric functions used to describe the sensory process of obtaining a hearing threshold, the generation of false alarms can be treated as a separate process and modelled into the equation describing the most probable psychometric function (Green, 1993). However, the false alarm rates as well as the psychometric functions may change throughout the testing procedure and this can introduce bias in the measured thresholds (Marvit et al., 2003).
Other strategies in widespread use are the simple up-down and the transformed up-down methods (Levitt, 1971). The simple up-down method uses a staircase in which the stimulus is decreased after a correct response and increased after an incorrect response. In the transformed up-down methods the stimulus is decreased after e.g. two consecutive correct responses at the same level and increased after just one incorrect response. This two-down, one-up strategy will determine the 70.7% point on the psychometric function. One of the big advantages of the transformed methods is high reliability (Levitt, 1971; Marvit et al., 2003). The up-down methods are often combined with the 2AFC paradigm, but they require a large number of trials to measure an accurate threshold (Leek et al., 1992; Marvit et al., 2003).
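The 70.7% figure follows from the equilibrium of the transformed staircase: under a rule that lowers the level only after n consecutive correct responses and raises it after any error, the track settles where a downward step is as likely as an upward one, i.e. where p^n = 0.5. The sketch below illustrates this; the function names and the 5 dB step size are hypothetical choices for the example.

```python
def convergence_point(n_down):
    """Proportion correct targeted by an n-down, 1-up rule (p**n = 0.5)."""
    return 0.5 ** (1.0 / n_down)

def next_level(level_db, correct_run, was_correct, n_down=2, step_db=5.0):
    """Apply one trial of an n-down, 1-up staircase.

    Returns (new_level_db, new_correct_run).
    """
    if not was_correct:
        return level_db + step_db, 0      # any error raises the level
    correct_run += 1
    if correct_run >= n_down:
        return level_db - step_db, 0      # n correct in a row lowers the level
    return level_db, correct_run          # otherwise repeat the same level

print(round(convergence_point(2), 3))
```

With `n_down=1` the same function describes the simple up-down method, which targets the 50% point.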
The 2AFC paradigm is less sensitive to a change in the patient's response criterion than e.g. a Yes/No paradigm, because it forces the subject to guess, i.e. to produce false alarms (Marvit et al., 2003). The psychometric function will now range from 50% detection (chance) to 100% detection. 2AFC paradigms are faster than the more sensitive 3AFC paradigms (where the psychometric function ranges from 33% to 100% detection), but measurements based on these paradigms are still time-consuming (Marshall and Jesteadt, 1986; Marshall et al., 1996). However, inattentive subjects can lead to unstable threshold estimates no matter which psychophysical paradigm is used. If the subject's performance changes during the testing procedure, this can lead to misleading results (Hall, 1983).
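The chance floors quoted above (50% for two alternatives, 33% for three) can be illustrated with a forced-choice psychometric function whose lower asymptote is 1/N. The logistic detection term, the slope and the name `p_correct_nafc` are assumptions made for this sketch.

```python
import math

def p_correct_nafc(level_db, midpoint_db, n_alternatives=2, slope=1.0):
    """Probability of a correct choice in an N-alternative forced choice task.

    The subject detects the tone with an assumed logistic probability and
    otherwise guesses, which is correct with probability 1/N.
    """
    guess = 1.0 / n_alternatives
    detect = 1.0 / (1.0 + math.exp(-slope * (level_db - midpoint_db)))
    return guess + (1.0 - guess) * detect

for n in (2, 3):
    floor = p_correct_nafc(-100.0, 0.0, n)   # far below threshold
    ceiling = p_correct_nafc(100.0, 0.0, n)  # far above threshold
    print(n, round(floor, 3), round(ceiling, 3))
```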
Several different approaches have been taken to construct user operated automatic hearing test systems. These systems are typically based on the ascending method (a modified Hughson-Westlake technique), the von Bekesy tracking technique and comparable techniques (Harris, 1979b; Henry et al., 1999; Henry et al., 2001; Laroche and Hetu, 1997; Zhao et al., 2002). The method of maximum-likelihood with a modified Yes/No paradigm has also been used as an automated user operated audiometry system (Formby et al., 1996). Response bias may contribute to the results obtained with these automated techniques, as results are dependent on patients who have stable response criteria and understand the procedure (Zhao et al., 2002; Harris, 1979a). Mean hearing thresholds from automated hearing threshold procedures and from manually performed audiometry can show significant differences (Henry et al., 2001; Henry et al., 2003; Formby et al., 1996).
United States Patent No. 5,928,160 describes a home hearing test system and method based on the use of calibrated headphones specially manufactured to support the hearing test using home audio equipment. In addition, reference is made to this patent for its discussion of background concerning hearing assessment tests in general. However, home hearing assessment tests have not achieved commercial acceptance.
Some efforts have been made to develop a technique for allowing a web site visitor to measure their hearing loss in an efficient and consistent way that is self-administered. (See web sites: "www.handtronix.com", "www.onlinehearing.com", and "www.didyouhearme.com.") Some of these attempts have implemented procedures that are similar if not identical to a clinical audiogram, where a tone is presented and the listener responds if they heard the sound, in a type of yes-no threshold test. Other attempts implement a screening procedure where tones are presented and results are based on whether or not the listener heard those tones, with no adjustment of sound presentation based on user response.
The yes-no procedures of the prior art are not well suited for self-administered testing, and web implementation of a hearing test demands self-administration. One reason is that the listener can fake a threshold and pretend that their hearing is better than it really is, and yes-no procedures are susceptible to user bias. The prior art tests that do not adaptively find a hearing threshold are crude screeners that do not provide significant information about the person's hearing loss. The prior art tests that adapt the stimulus based on user input also use basic yes-no procedures. Thus the result is determined based on analysis of yes responses and no responses to a sequence of queries.
It is widely understood that hearing levels vary widely among individuals, and it is also known that signal processing techniques can condition audio content to fit an individual's hearing response. Individual hearing ability varies across a number of variables, including thresholds of hearing, or hearing sensitivity (differences in hearing based on the pitch, or frequency, of the sound), dynamic response (differences in hearing based on the loudness of the sound, or relative loudness of closely paired sounds), and psychoacoustical factors such as the nature and context of the sound.
Actual injury or impairment, physical or mental, can also affect hearing in a number of ways. The most widely used gauge of hearing ability is a profile showing relative hearing sensitivity as a function of frequency, generally called a hearing profile, discussed in more detail below. Yet, it remains true that the art has not succeeded in providing a system for effectively and rapidly generating individual hearing profiles.
The most widespread employment of individual hearing profiles remains in the hearing aid field, where some degree of hearing impairment makes intervention a necessity. This application entails detailed testing in an audiologist's or otologist's office, employing sophisticated equipment and highly trained technicians. The result is an individually-tailored hearing aid, utilizing multiband compression to deliver audio content exactly matched to the user's hearing response. It will be understood that this process is expensive, time-consuming and cumbersome, and it plainly is not suitable for mass personalization efforts.
The rise of the Internet has offered the possibility for the development of personalization techniques that flow from on-line testing. Efforts in that direction have sought to generate user hearing profiles by presenting the user with a questionnaire, often running several questions, and using the user input to build a hearing profile. Such tests have encountered problems in two areas, however. First, user input to such questionnaires has proved unreliable. Asked about their age alone, without asking for personal information, for example, users tend to be less than completely truthful. To the extent such tests can be psychologically constructed to filter out such bias, the test becomes complex and cumbersome, so that users simply do not finish the test.
Another testing regime is set out in WO03030619A2, which presents a number of techniques for such testing, most particularly a technique called "N-Alternative Forced Choice," in which a user is offered a number of audio choices among which to select one that sounds best to her. Also known as "sound flavours," based on the notion of presenting sound and asking the user which one is preferred, this method can lack sufficient detail to enable the analyst to build a profile.
In sum, various forms of test procedures have been employed by the art, without arriving at a method that produces accurate results in a way that makes mass deployment possible.
SUMMARY OF THE INVENTION
The claimed invention relates to a personalized hearing test system, and more particularly to an effective method for determining the hearing threshold of a test person.
Specifically the present invention provides a system for conducting a hearing test using a computer program interacting with a test person, said system comprising a computer program running a routine to manage interaction with the test person, and adaptively select sound stimuli based upon said interaction according to a convergent process to determine a hearing threshold for the test person, wherein the hearing threshold is determined through an Alternative Forced Choice paradigm in conjunction with maximum-likelihood fitting of a most probable psychometric function.
In a preferred embodiment of the present invention the hearing threshold is determined by a Two Alternative Forced Choice paradigm (2AFC) in combination with a maximum-likelihood and up-down method.
In a particularly preferred embodiment of the present invention the routine to manage interaction with the test person includes: logic providing graphic constructs for display at the computer corresponding to each of the alternative stimulus intervals, the graphic constructs being aligned in an up and down relationship, causing generation of a selected stimulus during one of the alternative stimulus intervals, and prompting the test person to make a choice by selecting a graphic construct using an input device indicating the test person's perception of the stimulus during said alternative stimulus intervals.
In another embodiment the invention is a method for conducting a hearing test using a computer program. The computer program according to the invention comprises a routine to manage interaction via an interface coupled to the computer, and adaptively select stimuli based upon user interaction with the interface according to a convergent process to determine a hearing characteristic, wherein the interaction comprises an N-alternative forced choice interaction in conjunction with psychometric function analysis. The convergent, adaptive process comprises a staircase function or a maximum likelihood function in alternative embodiments of the invention.
In one embodiment, the routine to manage the interaction includes a process that causes a visual effect, such as displaying a graphic construct on the computer, which corresponds to each of N alternative stimulus intervals, causes generation of a selected audio stimulus during one of the N alternative stimulus intervals, and prompts the test subject to make a choice by selecting a visual effect indicating the user's perception of the stimulus during the chosen one of said N alternative stimulus intervals. In various embodiments, the number N falls in the range of 2-4, for example.
The present process also includes establishing a baseline threshold for a control signal supplied via the communication device which causes the device to generate a sound. Also, the process involves managing an N-alternative forced choice stimulus and response interaction with the test subject. Also, the method includes adaptively producing signals to produce selected stimuli at the device for said interaction according to the convergent process that is based upon said baseline threshold and said interaction to determine a hearing characteristic.
Thus, the present invention enables self-administered hearing tests managed using a personal computer.
In another aspect the present invention provides a method for conducting a hearing test using a computer program having interaction with a test person, by adaptively selecting sound stimuli based upon the interaction according to a convergent process to determine a hearing threshold for the test person, wherein the hearing threshold is determined through an Alternative Forced Choice paradigm in conjunction with maximum-likelihood fitting of a most probable psychometric function.
Preferably the hearing threshold is determined by a Two Alternative Forced Choice paradigm (2AFC) in combination with a maximum-likelihood and up-down method. It is particularly preferred that the routine to manage interaction with the test person includes: logic providing graphic constructs for display at the computer corresponding to each of the alternative stimulus intervals, the graphic constructs being aligned in an up and down relationship, causing generation of a selected stimulus during one of the alternative stimulus intervals, and prompting the test person to make a choice by selecting a graphic construct using an input device indicating the test person's perception of the stimulus during said alternative stimulus intervals.
Preferably the automatic control of the generated hearing threshold is performed by running a routine checking the possibility of false positive or negative hearing threshold.
DESCRIPTION OF THE DRAWINGS
Figure 1 A shows the algorithm for estimating the threshold. The figure shows a typical testing sequence as an example of the threshold seeking algorithm: white arrow down (correct answer), black arrow up (wrong answer). Large horizontal black arrows symbolise the different phases of the threshold seeking algorithm. The calculated threshold (black squares) is based on maximum-likelihood fitting of the most probable psychometric function to the previous answers.
Steps 1-4 comprise the progression rule, while Step 5 marks the stopping rule.
Step 1 comprises the trials until the first error occurs (marked with a black downward arrow). This first error results in a 30 dB increase in intensity.
Step 2 runs until the calculated hearing threshold is crossed.
Step 3 runs until the second error is made on trials testing the upper limit. The upper limit is defined as trials above the previously calculated thresholds. This error (black downward arrow) marks the end of step 3 and a new threshold calculation. The intensity is increased by 5 dB in addition to the small 1 dB intensity increase that results from the maximum-likelihood calculation of the expected threshold.
Step 4 runs as step 3 until the second error (black downward arrow) is made on trials testing the upper limit. A new threshold is calculated in the same way as in step 3.
Step 5 runs until the stopping rule has been fulfilled, which requires at least 6 consecutive correct responses at the upper limit and at least 2 errors at the lower limit.
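The stopping rule of Step 5 can be illustrated with a small check over the trial history. This is a sketch under stated assumptions, not the routine used in the study: it assumes that the lower limit means trials at or below the calculated threshold, that the 6 correct responses must form the most recent uninterrupted run among upper-limit trials, and the name `stopping_rule_met` is hypothetical.

```python
def stopping_rule_met(trials, threshold_db, min_correct_run=6, min_low_errors=2):
    """Check the stopping rule on a list of (level_db, was_correct) trials.

    Upper limit: trials above threshold_db.
    Lower limit (assumed): trials at or below threshold_db.
    """
    upper_run = 0    # current run of correct answers at the upper limit
    low_errors = 0   # errors accumulated at the lower limit
    for level_db, was_correct in trials:
        if level_db > threshold_db:
            upper_run = upper_run + 1 if was_correct else 0
        elif not was_correct:
            low_errors += 1
    return upper_run >= min_correct_run and low_errors >= min_low_errors

# Hypothetical history around a calculated threshold of 15 dB
history = [(20, True)] * 6 + [(10, False), (10, False)]
print(stopping_rule_met(history, threshold_db=15))
```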
Figure 1 B shows the fitted psychometric function based on all answers in Figure 1 A. This represents the last threshold calculation (black square in A). Answers to the left of the psychometric function represent answers below the hearing threshold. If such answers lie on the line representing 100% correct responses, this is a result of false positive answers, because the 2AFC paradigm allows subjects to guess the correct stimulus interval. The threshold in this procedure is defined as the point marking 95% correct responses. If false negative answers occur as a result of occasional inattention from the subject, these answers are placed at the 0% correct response level. One such false negative answer occurs in the present example. It is the answer representing trial 22 in A.
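Reading the 95% point off a fitted 2AFC psychometric function amounts to inverting the function. As a hedged worked example, assume a logistic detection term with midpoint m and slope s and a guess rate of 0.5 (the functional form and the name `level_at_proportion` are assumptions, not taken from the study). Then 95% correct corresponds to a detection probability of (0.95 - 0.5) / 0.5 = 0.9, and the level is m + ln(9) / s.

```python
import math

def level_at_proportion(p_target, midpoint_db, slope=1.0, guess=0.5):
    """Stimulus level at which an assumed logistic N-AFC function equals p_target."""
    detect = (p_target - guess) / (1.0 - guess)  # remove the guessing floor
    if not 0.0 < detect < 1.0:
        raise ValueError("p_target must lie strictly between guess rate and 1")
    return midpoint_db + math.log(detect / (1.0 - detect)) / slope

# Hypothetical fit: midpoint 10 dB HL, slope 1 per dB
print(round(level_at_proportion(0.95, 10.0), 2))
```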
Figure 2 A-H shows Bland-Altman plots of threshold differences obtained from 2AFC audiometry and traditional audiometry for 30 subjects.
Figure 3 A shows the average audiogram of test subjects tested with traditionally performed audiometry. Test subjects have been tested in the range 0-100 dBHL. Standard octaves from 0.25-8 kHz including 3 and 6 kHz are on the abscissa and hearing thresholds in dBHL on the ordinate. O marks hearing thresholds at the left ear and X marks hearing thresholds at the right ear.
Figure 3 B shows the average audiogram of test subjects tested with user operated 2AFC audiometry. Test subjects have been tested in the range -20 to 100 dBHL.
Figure 4 shows results from a traditionally performed audiogram (A) and the user operated 2AFC procedure on the same subject (B). The large black circle around the threshold point at 3000 Hz on the right side is the result of the threshold calculation after the procedure in Figure 1.
Figure 5 shows the test-retest average audiogram of test 1. Test 1 is the first test sequence from the subjects in the test-retest experiment. Differences between test 1 and test 2 can be seen in table 1. Standard octaves from 0.25-8 kHz including 3 and 6 kHz are on the abscissa and hearing thresholds in dBHL on the ordinate. O marks hearing thresholds at the left ear and X marks hearing thresholds at the right ear.
Figure 6 shows mean differences between traditional audiometry and user operated 2AFC audiometry after comparison of 41 subjects. A negative value indicates that user operated hearing thresholds are more acute (lower) than the corresponding hearing thresholds obtained with traditional audiometry. 95% confidence intervals of the test differences are indicated with error bars. The standard deviation of differences between traditional audiometry and user operated audiometry is indicated above the error bars. Standard octaves from 0.25-8 kHz including 3 and 6 kHz are on the abscissa and the hearing threshold difference between the two tests in dB is on the ordinate.
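The quantities plotted in Figures 2 and 6 (mean difference, standard deviation of differences, 95% confidence interval and Bland-Altman limits of agreement) can be computed as sketched below. The threshold values are made up for illustration, the name `difference_summary` is hypothetical, and the factor 1.96 assumes a normal approximation; for small samples a t-distribution quantile would give a wider interval.

```python
import math
import statistics

def difference_summary(user_db, traditional_db):
    """Summarise paired threshold differences (user operated minus traditional)."""
    diffs = [u - t for u, t in zip(user_db, traditional_db)]
    mean = statistics.mean(diffs)
    sd = statistics.stdev(diffs)                      # sample standard deviation
    half_width = 1.96 * sd / math.sqrt(len(diffs))    # normal approximation
    return {
        "mean": mean,
        "sd": sd,
        "ci95": (mean - half_width, mean + half_width),
        "limits_of_agreement": (mean - 1.96 * sd, mean + 1.96 * sd),
    }

# Made-up thresholds in dB HL for four subjects at one frequency
summary = difference_summary([5, 10, 15, 20], [10, 10, 20, 25])
print(summary["mean"], summary["sd"])
```

A negative mean here has the same reading as in Figure 6: the user operated thresholds are lower than the traditional ones.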
DETAILED DESCRIPTION OF THE INVENTION
A detailed description of various embodiments of the present invention is herewith provided with reference to a study performed with 30 test subjects.
30 males and females were recruited to participate in the validation project. Approximately half of the subjects were recruited from ordinary patient examination in the Department of Audiology, Odense University Hospital. The reasons for referral to audiologic examination varied (including tinnitus, hearing loss and control for ototoxicity). The remaining subjects were persons without any known hearing loss. In general the patients were not trained for better performance prior to the hearing tests. The majority of the patients had no records of previous hearing tests. The patients fell roughly into 3 age groups: 9 below 30 years, 12 from 30-50 years and 9 above 50 years. The age range of the tested patients was 20-69 years. All subjects had normal otoscopic inspections, which also ensured the absence of obstructing cerumen prior to hearing testing. Patients with asymmetric hearing loss with asymmetries >30 dB at any frequency were excluded from the project. Patients with known hearing thresholds >70 dB at any frequency (0.25-8 kHz) were also excluded from the study.
All patients underwent the same testing program. The testing program comprises 2 audiological tests. The first test is a traditional audiometry. The second test is a newly developed audiometric procedure, hereafter named automatic 2AFC audiometry. All patients took both tests on the same day. The tests were separated by a short break of typically 10 minutes.
The order varied so approximately half of the patients took the traditional test first followed by the automatic 2AFC audiometry. The other half of the patients started with the automatic 2AFC test followed by the traditional audiometry. The allocation to the different groups was not completely random as the groups were more balanced in size due to practical reasons.
12 other persons participated in a test-retest study with the 2AFC audiometry alone. These test persons took two tests separated by a time interval from an hour up to several days.
Traditional Audiometry
This standard audiometry was performed within a sound-treated booth according to the standards described in ISO 8253-1 (International Organization for Standardization, 1989).
The ambient noise level was below the requirements in the ISO 8253-1 standard.
Audiometry was done with MADSEN audiometers and Telephonics TDH-39 headsets.
The procedure was a modified Hughson-Westlake technique involving the following frequencies (250, 500, 1000, 2000, 3000, 4000, 6000, 8000 Hz), tested by air conduction.
Both ears were tested starting out with the right ear as the first ear to be tested.
Automatic 2AFC Audiometry
This procedure is conducted with a computer (Compaq nx6310) coupled to a transportable mobile device (Mobile Processor RM2, Tucker-Davis Technologies) through a USB connection. A Sennheiser HDA200 headphone is connected to the mobile device. In order to measure low hearing thresholds, possibly below 0 dB HL, an extra attenuator of 400 ohm is connected between the mobile processor and the headphones. All 2AFC audiometry tests have been conducted outside a sound-treated booth in a quiet room, with the attenuation from the HDA200 headphones as the only primary sound attenuation. The software controlling the mobile processor is developed as an automatically running routine written in Delphi, enabling the test person to respond to the test tones unaided. The routine combines the 2 Alternative Forced Choice paradigm without feedback with a combination of the maximum-likelihood method and a modification of the up-down methods. Each stimulus is played as 3 test tones of 200 ms length separated by intervals of 300 ms. Rise/fall times (cosine ramps) of the test tones are 15 ms. Thus the total signal length including rise/fall times is 215 ms. The test tones are presented in 1 of 2 intervals. Each interval is marked by a coloured box on the computer screen: interval 1 by a red box and interval 2 by a blue box. Test tones are presented randomly in one of the two intervals. The task is to select the interval (red or blue) whose box is lit while the test tones are presented. The following frequencies (250, 500, 1000, 2000, 3000, 4000, 6000, 8000 Hz) are tested by air conduction. The left ear is tested first. After finishing all selected frequencies on one ear, the procedure automatically switches to the second ear, where the selected frequencies are tested as well. To avoid cross-hearing, the system uses automatic masking of the non-test ear at a level 40 dB below the testing level.
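The stimulus timing described above can be sketched in a few lines. This is an illustration only and not the Delphi routine used in the study: the 48 kHz sample rate, the unit amplitude and the function names are assumptions, and for simplicity the 15 ms cosine ramps are placed inside the 200 ms tone, whereas the text reports a total length of 215 ms.

```python
import math

def tone_pulse(freq_hz, duration_s=0.200, ramp_s=0.015, sample_rate=48000):
    """One pure-tone pulse with raised-cosine onset and offset ramps."""
    n = int(round(duration_s * sample_rate))
    n_ramp = int(round(ramp_s * sample_rate))
    samples = []
    for i in range(n):
        gain = 1.0
        if i < n_ramp:                       # rising cosine ramp
            gain = 0.5 * (1.0 - math.cos(math.pi * i / n_ramp))
        elif i >= n - n_ramp:                # falling cosine ramp
            gain = 0.5 * (1.0 - math.cos(math.pi * (n - 1 - i) / n_ramp))
        samples.append(gain * math.sin(2.0 * math.pi * freq_hz * i / sample_rate))
    return samples

def pulse_train(freq_hz, n_pulses=3, gap_s=0.300, sample_rate=48000):
    """Three pulses separated by silent gaps, as one stimulus interval."""
    gap = [0.0] * int(round(gap_s * sample_rate))
    out = []
    for k in range(n_pulses):
        if k:
            out.extend(gap)
        out.extend(tone_pulse(freq_hz, sample_rate=sample_rate))
    return out

def masking_level_db(test_level_db, offset_db=40.0):
    """Masking level for the non-test ear, 40 dB below the test level."""
    return test_level_db - offset_db

stimulus = pulse_train(1000)
print(len(stimulus), masking_level_db(60.0))
```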
According to Marvit et al., the measurement strategy can be described as a set of three rules that govern the process of a psychometric procedure (Marvit et al., 2003):
The starting level was chosen to be significantly higher than the assumed hearing threshold. In most cases a starting level corresponding to 40 dB HL was chosen, but in cases of known hearing loss the starting level was set to 60 dB HL or 70 dB HL.
The progression rule was developed from a modified 2-down 1-up paradigm in a 2 Alternative Forced Choice setting. In fact the modification was extensive, as the procedure tries to control for the well-known false positive answers that are expected to occur. Experience from these experiments showed that some patients occasionally picked the wrong interval even in situations where the test tones obviously should have been heard. To minimise the influence of these unwanted errors, patients had the opportunity to correct mistakes they noticed themselves. In the beginning of the test, every correct answer results in a lowering of the test tone by 10 dB. With this method the real hearing threshold is reached after a few trials. When the hearing threshold is crossed and the patients no longer hear the test tone, they are forced to guess which interval most probably contains the test tone. As test tones are allocated randomly to one of the two intervals, a false answer will eventually occur by chance. The first error made by the patient results in an increase of the intensity by 30 dB. The reason for this large increase is that it brings the intensity back to a level significantly above the hearing threshold in cases where the patient has by chance "guessed" the right interval in trials where the test tones were not heard. This rule also serves to increase familiarization with the procedure, as an obviously wrong answer in the beginning is corrected in this way and the patient gets a second chance to cooperate with the testing procedure.
After the 30 dB increase in intensity level the test tones are lowered by 10 dB at each correct response. The steps are now controlled by the maximum-likelihood psychometric function based on the previous answers. The psychometric function is based on a logistic regression model that calculates the most probable psychometric function from the previous responses for the tested frequency. The lowering of the intensity thus progresses until a level 5 dB above the calculated threshold is reached. This level is then tested intensively. If two errors occur at this testing level, the test threshold is raised by 5 dB. The intensity of the test tones is raised only after the second error because this strategy minimises the effect of a false negative answer from the patient in cases where the testing level is above the hearing threshold. The patients had the opportunity to correct obviously wrong answers, but they do not notice them in all cases.
Testing now progresses at this new testing level until 2 more errors have been made, and the threshold is raised by another 5 dB.
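The first phase of the progression rule (10 dB down after each correct answer, 30 dB up after the first error, then 10 dB down again) can be sketched as follows. This is only an illustration under stated assumptions: the function name and parameter names are hypothetical, and the later phase governed by the maximum-likelihood estimate is deliberately left out.

```python
def initial_descent(responses, start_db=40, step_down=10, recovery_up=30):
    """Trace the presented levels during the first phase of a run.

    responses: iterable of bools, True meaning the correct interval was chosen.
    Returns (levels_presented, next_level).
    """
    level = start_db
    levels = []
    first_error_seen = False
    for correct in responses:
        levels.append(level)
        if correct:
            level -= step_down           # every correct answer lowers the tone
        elif not first_error_seen:
            level += recovery_up         # first error: jump well above threshold
            first_error_seen = True
        else:
            break                        # later errors belong to the ML-controlled phase
    return levels, level
```

For example, four correct answers from 40 dB HL followed by an error at 0 dB HL would bring the next presentation back to 30 dB HL.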
The progression of the test can be divided into several steps (fig 1A).
Step 1. Detection of the tone (hits) lowers the test tone by 10 dB, and thereby the hearing threshold is crossed after a few trials. Step 1 ends when the first error occurs. Errors occur when the test subject guesses and picks the interval that does not contain the signal. Either the signal is not heard, or the response is wrong, e.g. due to lack of concentration.
Step 2. The initial trial presented at step 2 corresponds to a 30 dB increase in sound intensity relative to the sound level at the end of step 1. This increases familiarization with the procedure. Correct responses again result in a 10 dB decrease in intensity. Step 2 progresses until the calculated threshold is crossed, which ends step 2. The calculated threshold is based on a set of logistic maximum-likelihood candidate psychometric functions as described by Gu and Green (1994). The candidate psychometric function that best fits the previous answers is chosen as the most probable one. The threshold is defined as the 95% correct point on this psychometric function.
Step 3. The calculated psychometric function is used to place a test interval of 10 dB around the most probable psychometric function. This interval is hypothesised to mark the extremes of the proposed psychometric function. Intensive testing occurs above and below the expected threshold point, at levels called the upper and lower limits. The upper limit is hypothesised to be close to 100% hits on the psychometric function, and the lower limit is hypothesised to be close to 50% hits in a 2AFC paradigm. Stimuli at the upper limit are presented slightly above the calculated threshold; this makes the task easier, and subjects make fewer mistakes at the upper limit than with stimuli presented closer to the calculated threshold. The difference between the upper and lower limits is 10 dB. This test interval is adjusted in 1 dB steps after each threshold calculation. Step 3 ends when an error occurs at the upper limit. This error results in a new threshold calculation based on the MML calculation as described under step 2. In addition to the MML threshold calculation, a 5 dB intensity increase is added to the threshold as a new adjustment.
Step 4. Continues step 3 at the increased stimulus level that ended step 3. Step 4 ends with the second error at the upper limit. This also results in a new threshold calculation and a further 5 dB addition to the threshold, similar to the end of step 3. Because the intensity is only increased after the second error, the effect of false negatives is reduced.
Step 5. It is identical to step 4, however step 5 is ended by the stopping rule. The stopping rule can conclude the test as early as after step 3. The stopping rule will ensure that the calculated psychometric function will contain the final hearing threshold.
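The threshold calculation used in steps 2 to 5 can be sketched as a grid search over candidate logistic psychometric functions. This is a minimal illustration under assumptions, not the routine described here: the 2AFC form (0.5 at chance, 1.0 at ceiling), the fixed slope parameter `slope_db` and all function names are hypothetical choices made for the example.

```python
import math

def p_correct(level_db, midpoint_db, slope_db=1.0):
    """2AFC logistic psychometric function: 0.5 far below, 1.0 far above the midpoint."""
    return 0.5 + 0.5 / (1.0 + math.exp(-(level_db - midpoint_db) / slope_db))

def log_likelihood(trials, midpoint_db, slope_db=1.0):
    """trials: list of (level_db, correct) pairs for one frequency."""
    total = 0.0
    for level_db, correct in trials:
        p = p_correct(level_db, midpoint_db, slope_db)
        p = min(max(p, 1e-9), 1.0 - 1e-9)   # keep log() finite
        total += math.log(p if correct else 1.0 - p)
    return total

def most_probable_midpoint(trials, candidates, slope_db=1.0):
    """Pick the candidate function that best explains the responses so far."""
    return max(candidates, key=lambda m: log_likelihood(trials, m, slope_db))

def threshold_at(midpoint_db, target=0.95, slope_db=1.0):
    """Level at which the fitted function predicts `target` proportion correct."""
    q = 2.0 * target - 1.0               # value the logistic term must reach
    return midpoint_db + slope_db * math.log(q / (1.0 - q))
```

With these assumptions the 95% correct point lies `slope_db * ln(9)` dB above the midpoint of the fitted function, so a shallower assumed slope moves the reported threshold further above the midpoint.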
If subjects give a wrong answer by mistake, they can click a field on the computer screen labelled "erase previous response". In this case the last trial is presented one more time. This can limit the number of false negative responses, which makes threshold estimation easier and faster. Subjects are instructed to correct obvious mistakes if they notice that they have chosen the wrong interval. However, subjects do not notice mistakes in all cases.
The above-identified steps 3 to 5 constitute an automatic control of the generated hearing threshold, which is performed by running a routine checking the possibility of a false positive or false negative hearing threshold.
Testing proceeds until at least 6 consecutive correct responses have been made at the tested level. This is called the upper limit. If at least 6 consecutive correct responses can be made, it is assumed that the values at the upper limit are close to 1 on the psychometric function. It has previously been estimated that the psychometric function typically spans 8 dB. If the patient responds with 6 correct answers, the risk of correct guesses is low, and furthermore the risk of a falsely low hearing threshold is low. Testing 10 dB below this defined upper limit will be at the lowest part of the psychometric function (lower limit). In a 2 Alternative Forced Choice paradigm this value corresponds to 0.50, or close to 0.50, on the psychometric function. Stopping the test also requires a minimum of 2 incorrect responses at this lower limit.
Too many correct answers at the lower limit will return the test to the beginning of the progression rule, as the patient, due to a lack of concentration, has moved the testing level to at least 10 dB above the hearing threshold. The procedure runs until at least 30 trials have been completed for each frequency. If the stopping rule is fulfilled at that point, the test proceeds to the next frequency. If the stopping rule is not fulfilled, the test continues until it is. The number of trials required to determine the hearing threshold therefore varies.
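The stopping rule can be summarised in a short sketch. It assumes that "6 consecutive correct responses" refers to the longest run observed at the upper limit; the function and parameter names are hypothetical. The second function shows why 6 consecutive correct answers make lucky guessing unlikely in a 2AFC task.

```python
def stopping_rule_met(upper_responses, lower_responses, n_trials,
                      min_run=6, min_lower_errors=2, min_trials=30):
    """Check the three stopping conditions for one frequency.

    upper_responses / lower_responses: lists of bools (True = correct)
    collected at the upper limit and at the lower limit (10 dB below it).
    """
    longest = run = 0
    for correct in upper_responses:
        run = run + 1 if correct else 0   # an error resets the run
        longest = max(longest, run)
    lower_errors = sum(1 for correct in lower_responses if not correct)
    return (n_trials >= min_trials
            and longest >= min_run
            and lower_errors >= min_lower_errors)

def chance_of_lucky_run(n=6, p_guess=0.5):
    """Probability of n correct answers in a row by pure guessing."""
    return p_guess ** n
```

With two intervals, `chance_of_lucky_run()` gives 1/64, i.e. about 1.6%, which is the sense in which the risk of correct guesses at the upper limit is low.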
The threshold criterion is set rather conservatively at 95% correct responses. The threshold was estimated at the signal level corresponding to the 95% point on the most likely psychometric function after at least 30 trials. Another study using a combination of 2AFC and maximum likelihood tracked the 94% point (Dai and Green, 1992).
Calibration
Calibration of headphones is done according to the standards described in ISO 389-8 by using a coupler from Brüel & Kjær as specified in IEC 60318-3 and specified for the Sennheiser HDA-200 headphone (International Organization for Standardization, 2004).
The two test procedures are compared by using the method of limits of agreement as described by Bland and Altman (Bland and Altman, 1986). This method is favoured over the use of correlation coefficients, which have been used in previous method-comparison studies in the audiological literature. As systematic bias between two different methods is likely to occur, especially in audiological testing, a more reliable comparison of two test methods can be made with the method of Bland and Altman (Bland and Altman, 1986).
Validity of 2 Alternative Forced Choice audiometry
The two test methods are compared in the Bland-Altman plot (Figure 2). This figure shows the 8 tested frequencies in 8 different diagrams. The dashed lines mark the observed 95% confidence intervals, the solid line is the observed mean difference between the two test methods, and the dotted line marks the situation in which no difference is observed between the two tests. For all frequencies, thresholds obtained with 2AFC audiometry are slightly lower than or closely equal to thresholds obtained with traditional audiometry. The mean audiograms for the two test groups are shown in fig. 3. Examples of audiograms measured with traditional audiometry and 2AFC audiometry from the same subject are shown in fig. 4.
The observed standard deviations and corresponding confidence intervals from figure 2 are listed in table 1. This table includes the results from 30 subjects (60 ears). A summarizing figure of data from 41 subjects (82 ears) is presented as figure 6. Two potential outliers, one representing one ear at 3000 Hz and one representing one ear at 8000 Hz, have been excluded from the data analysis (data not shown). The observed standard deviations are 3.2-4.5 dB for 250-4000 Hz and 6.4-6.7 dB for 6 and 8 kHz. For the outlier at 3000 Hz, the measurement was repeated with the 2AFC procedure, and the 2 measurements deviated by only 4 dB. It is therefore assumed that an error was made in the traditional audiometry at this specific point (data not shown). The corresponding confidence intervals tell us that 2 measurements obtained with traditional audiometry and 2AFC audiometry deviate, with 95% probability, by less than +/- 8.8 dB for the frequencies 250-4000 Hz and +/- 13.1 dB for 6-8 kHz.
Table 1:
Frequency    Average difference (dB)    Std.dev. (dB)    95% limits of agreement (dB), Bland and Altman 1986
                                                         Lower       Upper
250 Hz       -4.9                       4.5              -13.7       3.9
500 Hz       -2.6                       3.6              -9.7        4.6
1000 Hz      -0.9                       3.2              -7.3        5.4
2000 Hz      -1.9                       3.6              -9.0        5.2
3000 Hz      -0.5                       4.2              -8.9        7.8
4000 Hz      -1.1                       4.1              -9.0        6.9
6000 Hz      -3.4                       6.4              -15.8       9.1
8000 Hz      -0.6                       6.7              -13.7       12.5
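The limits of agreement in the table follow the Bland and Altman (1986) construction: the mean difference plus and minus 1.96 standard deviations of the differences. A minimal sketch is given below; the function names are hypothetical, and the multiplier 1.96 is the usual normal-approximation value.

```python
import math

def limits_of_agreement(differences, z=1.96):
    """Mean, sample SD and 95% limits of agreement from paired differences (dB)."""
    n = len(differences)
    mean = sum(differences) / n
    var = sum((d - mean) ** 2 for d in differences) / (n - 1)
    sd = math.sqrt(var)
    return mean, sd, (mean - z * sd, mean + z * sd)

def limits_from_summary(mean_diff, sd_diff, z=1.96):
    """Limits of agreement when only the mean and SD of the differences are known."""
    return (mean_diff - z * sd_diff, mean_diff + z * sd_diff)
```

Applied to the tabulated 250 Hz values, `limits_from_summary(-4.9, 4.5)` gives approximately -13.7 and 3.9 dB. Small deviations at other frequencies are to be expected because the tabulated means and standard deviations are rounded.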
Repeatability of 2 Alternative Forced Choice audiometry
The 2AFC audiometry test allowed patients to be tested below 0 dB, down to the absolute minimum they could possibly hear. This will of course add a larger uncertainty, especially when the measurement is below 0 dB. Standard deviations from the repeated measurements were calculated, as well as the standard deviations based on the pooled variances for the different frequencies. The results are shown in table 2. The mean audiogram describing the test-retest population is shown in figure 5.
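A standard deviation based on pooled variances can be computed as sketched below. This is one common way to pool within-ear variances from repeated measurements and is given under that assumption; the function name and the data layout are hypothetical.

```python
import math

def pooled_sd(groups):
    """Pooled standard deviation across ears.

    groups: list of lists, each holding the repeated thresholds (dB HL)
    measured for one ear at one frequency.
    """
    sum_squares = 0.0
    dof = 0
    for values in groups:
        if len(values) < 2:
            continue                      # a single measurement carries no variance
        mean = sum(values) / len(values)
        sum_squares += sum((v - mean) ** 2 for v in values)
        dof += len(values) - 1
    if dof == 0:
        raise ValueError("need at least one ear with repeated measurements")
    return math.sqrt(sum_squares / dof)
```

Each ear contributes its own squared deviations and degrees of freedom, so ears with different mean thresholds do not inflate the estimate.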
Table 2:
Frequency    No. of ears    Mean Std.dev. (dB HL)
250 Hz       44             3.0
500 Hz       44             3.5
1000 Hz      44             2.8
2000 Hz      44             3.1
3000 Hz      38             2.8
4000 Hz      44             2.3
6000 Hz      38             2.8
8000 Hz      44             3.5
Perspective:
The 2AFC audiometry test system has been used in an occupational hearing conservation study to test a group of 146 professional musicians. The musicians were tested in groups of up to eight people at a time. This gives a high efficiency, as one person is sufficient to instruct the subjects in the test. It makes it possible to obtain at least 8 audiograms every hour, including the time used for instruction of the subjects and the test itself. A schematic overview of the diagnostic findings using 2AFC audiometry is shown in table 3.
Table 3:
Type of hearing loss            No. of left ears    No. of right ears    Total
Noise induced hearing loss      31                  18                   49
Presbyacusis                    4                   4                    8
Other types of hearing loss     6                   7                    13
Normal hearing                  105                 117                  222
The validity of automatic, self-operated 2 Alternative Forced Choice audiometry has been tested by comparing the test to the known standard procedure of traditional audiometry. The two test methods differ in many important aspects, but the objective of both methods is to estimate hearing thresholds. A previous study comparing thresholds obtained from a procedure based on 2 Interval Forced Choice (2IFC) with traditionally obtained thresholds found mean differences of 6.5 dB, with the lowest thresholds observed with the 2IFC method. In that study the 71% correct point was estimated on the psychometric function (Marshall and Jesteadt, 1986). In another study 2AFC thresholds were found to be 2.9 dB lower when the 79% correct response point was bracketed (Marshall et al., 1996). Others have also found increased differences between a three-interval forced-choice procedure and clinical thresholds related to hearing loss and age (Gatehouse and Davis, 1992). The objective of the present study was not to estimate the exact low threshold corresponding to 70.7% correct responses on the psychometric function. Instead, the 95% correct response point of the most probable psychometric function was chosen, which of course makes the results of the present study deviate a little from previous studies, as the methods differ. The purpose was to use the known theories in an alternative way to create a reliable self-testing audiometry system. One of the main differences between the previous studies and the present study is the use of a combination of the maximum-likelihood method and an alternative up-down method together with the 2 Alternative Forced Choice paradigm. In the validation process it was important that thresholds obtained with 2AFC audiometry would not be poorer than thresholds obtained with traditional audiometry. This was achieved at all measured frequencies, as no mean difference was larger than 0 dB.
Furthermore, the standard deviations of the differences between traditional audiometry and 2AFC audiometry are at most 4.5 dB for 250-4000 Hz, as is evident from Table 1. This corresponds well with the known uncertainties of traditional audiometry. The known uncertainty for the frequencies 250-4000 Hz has been given, in terms of standard deviation, as 4.9 dB (International Organization for Standardization, 1989). Furthermore, ISO 8253-1 states that 2 measurements from the same person deviate, with 95% probability, by less than +/-10 dB. For frequencies above 4000 Hz the uncertainty is even larger. In light of this information it is evident that the two methods should not differ significantly in terms of repeatability, and it should be possible to use both methods interchangeably. However, it should be taken into account that 2AFC audiometry thresholds tend to be slightly lower than traditional, clinically obtained thresholds. There are many reasons for these differences, but one of them is clearly the reduced response bias achieved with the 2AFC paradigm. Patients tend to respond differently, and some patients want to be more sure when they answer "yes", the tone was heard (Marshall and Jesteadt, 1986; Marshall, 1991). In a 2 Alternative Forced Choice paradigm it is "acceptable" to guess, and patients can pick the right interval even in situations where they are in doubt.
It was noticed that, especially at low frequencies (250 Hz and 500 Hz), a larger difference between the 2 methods could be seen (see figures 3 and 6). One main reason for the differences seen at 250 Hz is of a more technical character. Two different headphones were used, and the one used for traditional audiometry (TDH-39) is known to be more dependent on correct placement, especially when measuring low frequencies and when results are compared to circumaural headphones such as the HDA-200 (Riedner, 1980; Shaw, 1966). 2 Alternative Forced Choice audiometry is very repeatable if the repeatability coefficients listed in table 5 and 6 are compared to the known uncertainties from the ISO 8253-1 standard (International Organization for Standardization, 1989). Even for thresholds measured below 0 dB, the repeatability is comparable to the repeatability expected from standard audiometry according to ISO 8253-1 (International Organization for Standardization, 1989).
The clinical potential of Automatic 2 Alternative Forced Choice audiometry will be to lower biases related to the used methods under certain circumstances. This computer and patient operated procedure tends to lower biases related to the operator as well as the patient.
These biases are known contributors to uncertainty in traditional audiometry (International Organization for Standardization, 1989). Thus the method will be valid to use under circumstances where it is difficult to keep up to the required standards. This could be as part of an occupational hearing conservation programme. As is evident from table 3, it is possible to use the 2AFC audiometry system for occupational hearing tests. The audiometries from these programmes often have a much larger uncertainty than that observed in the present study (Dobie, 1983). Furthermore, if there is a lack of qualified operating personnel, this method of audiometry can be used to give audiometries as reliable as clinically obtained ones. It should also be noted that the persons selected for this study by no means represent a perfect test group, as at least half of the test persons cannot be considered otologically normal. The group would be more comparable to, e.g., a group of industrial workers where hearing losses are likely to occur.
CONCLUSION
Automatic 2 Alternative Forced Choice audiometry is a valid alternative to traditional audiometry. When using 2AFC audiometry it is important to note that it gives thresholds that are typically 1-2 dB lower at most frequencies. In general this small difference will have only minor clinical consequences when thresholds obtained with the two methods are compared.
Furthermore, the reliability of automatic 2AFC audiometry is comparable to that of traditional audiometry.
REFERENCES
Atherley GR, Dingwall-Fordyce I (1963) The reliability of repeated auditory threshold determination. Br J Ind Med 20:231-235.
Bland JM, Altman DG (1986) Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1:307-310.
Brown RE (1948) Experimental studies on the reliability of audiometry. J Laryngol Otol
Burk MH, Wiley TL (2004) Continuous versus pulsed tones in audiometry. Am J Audiol 13:54-61.
Carhart R, Jerger JF (1959) Preferred Method for Clinical Determination of Pure-Tone Thresholds. Journal of Speech and Hearing Disorders 24:330-345.
Dai H, Green DM (1992) Auditory intensity perception: successive versus simultaneous, across-channel discriminations. J Acoust Soc Am 91:2845-2854.
Dobie RA (1983) Reliability and validity of industrial audiometry: implications for hearing conservation program design. Laryngoscope 93:906-927.
Dobie RA (1985) Industrial audiometry and the otologist. Laryngoscope 95:382-385.
Erlandsson B, Hakanson H, Ivarsson A, Nilsson P (1979) Comparison of the hearing threshold measured by manual pure-tone and by self-recording (Bekesy) audiometry. Audiology 18:414-429.
Flottorp G (1995) Improving audiometric thresholds by changing the headphone position at the ear. Audiology 34:221-231.
Formby C, Sherlock LP, Green DM (1996) Evaluation of a maximum likelihood procedure for measuring pure-tone thresholds under computer control. J Am Acad Audiol 7:125-129.
Gatehouse S, Davis A (1992) Clinical pure-tone versus three-interval forced-choice thresholds: effects of hearing level and age. Audiology 31:31-44.
Green DM (1993) A maximum-likelihood method for estimating thresholds in a yes-no task. J Acoust Soc Am 93:2096-2105.
Green DM (1995) Maximum-likelihood procedures and the inattentive observer. J Acoust Soc Am 97:3749-3760.
Gu X, Green DM (1994) Further studies of a maximum-likelihood yes-no procedure. J Acoust Soc Am 96:93-101.
Hall JL (1983) A procedure for detecting variability of psychophysical thresholds. J Acoust Soc Am 73:663-667.
Harris DA (1979a) Detecting non-valid hearing tests in industry. J Occup Med 21:814-820.
Harris DA (1979b) Microprocessor versus self-recording audiometry in industry. J Aud Res
Harris DA (1979c) Microprocessor, self-recording and manual audiometry. J Aud Res 19:159-166.
Henry JA, Flick CL, Gilbert A, Ellingson RM, Fausti SA (1999) Reliability of tinnitus loudness matches under procedural variation. J Am Acad Audiol 10:502-520.
Henry JA, Flick CL, Gilbert A, Ellingson RM, Fausti SA (2001) Reliability of hearing thresholds: Computer-automated testing with ER-4B Canal Phone (TM) earphones. Journal of Rehabilitation Research and Development 38:567-581.
Henry JA, Flick CL, Gilbert A, Ellingson RM, Fausti SA (2003) Reliability of computer-automated hearing thresholds in cochlear-impaired listeners using ER-4B Canal Phone earphones. J Rehabil Res Dev 40:253-264.
International Organization for Standardization (1989) Acoustics - Audiometric test methods - Part 1: Basic pure tone air and bone conduction threshold audiometry. ISO 8253-1. Geneva: ISO.
International Organization for Standardization (2004) Acoustics - Reference zero for the calibration of audiometric equipment - Part 8: Reference equivalent threshold sound pressure levels for pure tones and circumaural earphones. ISO 389-8. Geneva: ISO.
Jerlvall L, Dryselius H, Arlinger S (1983) Comparison of manual and computer-controlled audiometry using identical procedures. Scand Audiol 12:209-213.
Laroche C, Hetu R (1997) A study of the reliability of automatic audiometry by the frequency scanning method (Audioscan). Audiology 36:1-18.
Leek MR, Dubno JR, He N, Ahlstrom JB (2000) Experience with a yes-no single-interval maximum-likelihood procedure. J Acoust Soc Am 107:2674-2684.
Leek MR, Hanna TE, Marshall L (1992) Estimation of psychometric functions from adaptive tracking procedures. Percept Psychophys 51:247-256.
Levitt H (1971) Transformed up-down methods in psychoacoustics. J Acoust Soc Am
Marshall L (1991) Decision criteria for pure-tone detection used by two age groups of normal-hearing and hearing-impaired listeners. J Gerontol 46:67-70.
Marshall L, Hanna TE, Wilson RH (1996) Effect of step size on clinical and adaptive 2IFC procedures in quiet and in a noise background. J Speech Hear Res 39:687-696.
Marshall L, Jesteadt W (1986) Comparison of pure-tone audibility thresholds obtained with audiological and two-interval forced-choice procedures. J Speech Hear Res 29:82-91.
Marvit P, Florentine M, Buus S (2003) A comparison of psychophysical procedures for level-discrimination thresholds. J Acoust Soc Am 113:3348-3361.
O'Regan JK, Humbert R (1989) Estimating psychometric functions in forced-choice situations: significant biases found in threshold and slope estimations when small samples are used. Percept Psychophys 46:434-442.
Rabinowitz PM, Galusha D, Ernst CD, Slade MD (2007) Audiometric "early flags" for occupational hearing loss. J Occup Environ Med 49:1310-1316.
Riedner ED (1980) Collapsing ears and the use of circumaural ear cushions at 3000 Hz. Ear Hear 1:117-118.
Robinson DW, Shipton MS, Hinchcliffe R (1981) Audiometric zero for air conduction. A verification and critique of international standards. Audiology 20:409-431.
Saberi K, Green DM (1996) Adaptive psychophysical procedures and imbalance in the psychometric function. J Acoust Soc Am 100:528-536.
Schlauch RS, Rose RM (1990) Two-, three-, and four-interval forced-choice staircase procedures: estimator bias and efficiency. J Acoust Soc Am 88:732-740.
Shaw EA (1966) Earcanal pressure generated by circumaural and supraaural earphones. J Acoust Soc Am 39:471-479.
Shelton BR, Scarrow I (1984) Two-alternative versus three-alternative procedures for threshold estimation. Percept Psychophys 35:385-392.
Sinclair A, Smith TA (1984) A three-frequency audiogram for use in industry. Br J Ind Med
Watson CS, Franks JR, Hood DC (1972) Detection of Tones in Absence of External Masking Noise. 1. Effects of Signal Intensity and Signal Frequency. Journal of the Acoustical Society of America 52:633-643.
Zhao F, Stephens D, Meyer-Bisch C (2002) The Audioscan: a high frequency resolution audiometric technique and its clinical applications. Clin Otolaryngol Allied Sci 27:4-10.