Research Proposal (1): Quantitative orientation

Tara L. Whitehill, Dept. of Speech and Hearing Sciences, University of Hong Kong
Date of Submission: December 1993.  Awarded Ph.D. in 1997 under title:
Speech intelligibility in Cantonese speakers with congenital dysarthria.
University of Hong Kong, 1997. In Library as PhD 98 W1 [Special Collections]

I. Provisional title
II. Field of research/background
III. Identification of the research problem
IV. Literature Review
V. Data Collection and Analysis
VI. Possible outcomes
VII. References
VIII. Timetable



I. Provisional title

Intelligibility in Cantonese-speakers with dysarthria:
Perceptual, phonetic and acoustic features.


II. Field of research/background

    Intelligibility has been defined as the degree to which a speaker's intended message is recovered by a listener (Kent, Weismer, Kent, & Rosenbek, 1989). Reduced intelligibility severely compromises communication and social interaction for affected individuals. Dysarthria refers to a group of motor speech disorders caused by damage to the central or peripheral nervous system which can affect respiration, phonation, resonation, articulation, and/or prosody (Nicolosi, Harryman & Kresheck, 1978) and which can result in reduced intelligibility.

    A reliable measure of speech intelligibility is required for several reasons: to provide an index of the severity of a disorder, to assist in management and treatment decisions, and to quantify changes which may result from spontaneous recovery, treatment, or deterioration of a progressive disease (Ansel & Kent, 1992). While perceptual tests provide a measure of patient intelligibility, phonetic and acoustic analyses attempt to identify which features contribute to reduced intelligibility, both for individual patients and for different types of dysarthria (See Kent et al., 1989).

    In Hong Kong, principles for the diagnosis and management of dysarthria are based on the English-language literature. This is unsuitable, given the great differences between the two languages. Aspects of the Cantonese phonological system which distinguish it from English include lexical tone, unreleased final stops, and the contrast between aspirated and unaspirated stops. This study will develop a reliable intelligibility procedure which will serve several purposes: (a) it will provide the first database for Cantonese dysarthria; (b) it will contribute to our understanding of dysarthria in Cantonese-speaking persons; and (c) it will contribute to the assessment, management and monitoring of Cantonese-speaking patients, as discussed above. Finally, this study will contribute to the understanding of intelligibility, through a comparison of language-specific and language-universal features of intelligibility.

    This study will focus on adult males with spastic-type dysarthria. It is anticipated that the findings will be extended to include other age groups, other types of dysarthria (for example, the dysarthria associated with cerebellar disorders), other speech disorders (for example, phonological impairment), and other tonal languages (such as Putonghua).


III. Identification of the research problem

    A. Research questions

    1. What are the common speech error patterns made by Cantonese-speaking adult male dysarthrics?
    2. How well can we predict the single-word speech intelligibility of adult Cantonese-speaking dysarthric males by (a) phonetic contrast errors (b) acoustic variables?

    B. Further objectives

    1. To develop a single-word intelligibility test for use with adult Cantonese-speaking dysarthric males.
    2. To describe the acoustic features of key phonetic features (i.e. those which have been identified as being important to speech intelligibility) for both normal and dysarthric Cantonese speakers.


IV. Literature Review

The most widely used method for assessing speech intelligibility has been interval scaling, where listeners assign a rating to a speech sample on an equal-appearing interval scale. One well-known example is the landmark study of Darley, Aronson and Brown (1975). However, recent work by Schiavetti and his colleagues (Schiavetti, Metz & Sitler, 1981) indicated validity problems with this procedure. Direct magnitude estimation is another scaling procedure. Here, listeners assign a number to a speech sample. The number represents the ratio of each sample heard to a standard or module, with no fixed end points provided (See Schiavetti, 1992 for a detailed review). This procedure is reliable but can be impractical to administer, particularly in a clinical setting (Kent et al., 1989).

A second general method of assessing intelligibility is identification tasks, where listeners hear a speech sample and write down or select what they have heard. Identification tasks have been used extensively in work with the hearing impaired (see Weismer & Kent, 1992, for a review) but have been less popular in work with speech disorders. Identification tasks using longer speech samples such as conversational speech or reading passages have an important face validity but may be difficult for severely affected patients to produce, and are not easily quantified. Single word identification tasks have the advantages of reliability, quantifiability, ease of administration, and allowing phonetic error analysis (Kent et al., 1989). Several studies have reported good correlations between single word and sentence measures (see, for example, Kennedy, Pring & Fawcus, 1993, for a discussion). Chan (1993) found no significant differences between articulation in single words and connected speech in Cantonese-speaking phonologically disordered children.

Relatively few identification tests have been developed specifically for the assessment of intelligibility in dysarthric speakers. Kent et al. (1989) review the available tests, pointing out that, while the tests provide an index or estimate of severity, they do not provide any explanation or interpretation of the speech problem.

Recent work by Kent and his colleagues has focused on the development of "explanatory" tests of intelligibility (see, for example, Kent et al., 1989; Weismer & Martin, 1992). Through the addition of phonetic and acoustic analyses to traditional perceptual measures, they have begun to explore specifically which phonetic, acoustic and even physiological features may be contributing to reductions in intelligibility for a given speech disorder. Such explanations can be useful for individual patients as well as in understanding intelligibility deficits for a type of disorder. For example, in their study of a group of men with amyotrophic lateral sclerosis (ALS), Kent et al. (1990) found that two factors (stop-nasal contrast and glottal-null contrast) were the most affected for this group. They also described how three men with similar overall intelligibility scores showed different error profiles. Use of this approach in languages other than English has been scant. Ziegler and colleagues have reported work with German-speaking dysarthrics (for example, Ziegler, Hartmann & von Cramon, 1988).

Since Cantonese is a lexical tonal language (that is, a change in tone alone can change meaning), we can expect that tone will play a significant role in intelligibility in Cantonese. The role of tone in intelligibility in Cantonese dysarthrics has not been studied. However, several studies have investigated the perception and production of tone in normal and other disordered populations. Yiu (1989) reported on tone comprehension in a group of Cantonese aphasics. Fok (1984; 1987) studied tone in Cantonese hearing-impaired speakers. Kwok et al. (1991) reported on tone perception in patients who had single-channel cochlear implants. So and Varley (1991) developed a test for lexical comprehension in Cantonese which includes tone comprehension. Gandour and colleagues have reported on the production and perception of lexical tone in Thai, particularly in aphasics (Gandour et al., 1992; Gandour & Dardarananda, 1983).

There are other aspects of the Cantonese phonological system which distinguish it from English, for example, unreleased final stops, aspirated vs. unaspirated stops. The role of these features in the speech intelligibility of dysarthria and other speech disorders needs investigation. The unique features of Cantonese as a language make it imperative that tests and databases be developed specifically for the Cantonese-speaking population.


V. Data Collection and Analysis

A. Research Plan

Stage 1: Identification of error patterns.

Rationale: An essential criterion when designing a single-word intelligibility test is that the test incorporate errors known to be common to the particular disorder group (see, for example, Kent, Weismer, Kent & Rosenbek, 1989). No such information exists for Cantonese-speaking populations. Therefore, the first part of this study will analyze the speech of spastic dysarthrics on a variety of speech tasks in order to determine common error patterns for this group.

Subjects: Subjects will be 20 adult males with spastic-type dysarthria. The study will be limited to men, due to known gender differences in speech (see, for example, Kent et al., 1992; Colton & Casper, 1990). The focus will be on spastic-type dysarthria, as it is one of the most common types. Subject selection criteria will include use of speech as the sole method of interpersonal communication, some degree of intelligibility deficit, IQ scores of 70 or above, normal hearing, normal oral-peripheral structure, and no evidence of aphasia (Ansel & Kent, 1992).

Materials: The speech materials will consist of a list of words, sentences and a passage. The materials will test all the sounds in Cantonese and will also examine performance in connected speech. Materials will be based on the Cantonese Segmental Phonology Test (So, 1992).

Procedure: The procedure is based on that used by Platt, Andrews, Young, and Quinn (1980). Subjects will be seated in a sound-treated booth. All speech materials will be read aloud by the subjects from stimuli presented on a computer screen. The words will be read at a rate of one every five seconds. If a subject is not able to read a word, he will repeat the word after the examiner. For the reading passage, one oral practice will be permitted. If the subject cannot read the passage spontaneously, short phrases will be modelled by the examiner and repeated by the subject. The subjects' speech will be recorded on a high-quality digital tape recorder. Speech samples will be phonetically transcribed by the applicant using standard International Phonetic Alphabet (IPA) symbols with narrow transcription markers as necessary (for example, to denote nasalization, lateralization, dentalization). The speech samples of 5 subjects will be transcribed by a second examiner as a reliability measure. Errors will be analyzed for each subject as well as for group trends.

Results: The results will provide an initial database of common error patterns in adult Cantonese-speaking men with spastic dysarthria. While it can be hypothesized that many errors made in English will also be found in Cantonese, others will be unique to Cantonese. Of particular interest will be tone, aspiration, and final stops.


Stage 2: Intelligibility testing.

Subjects: Speakers will be as described in stage one. A control group will be added consisting of 20 adult Cantonese-speaking males with no speech, language or hearing disorder, matched for age with the experimental speakers. The listeners will consist of 10 native Cantonese-speaking adults with normal hearing.

Materials: A single word intelligibility test will be developed based on the procedures outlined by Kent et al. (1989) for multiple choice format. Test items will consist of real monosyllabic words. Stimulus words will focus on several phonemic contrasts. Based on previous work in English, probable contrasts will include: front-back vowels, high-low vowels, stop-nasal place of articulation, stop-fricative, initial consonant-null (Kent et al., 1989). Additional contrasts unique to Cantonese will be included, based on the results obtained in the preliminary study above (for example, tone contrasts, aspirated-unaspirated initial consonants, and unreleased final stops). The test items will be arranged in groups of four words, in order to provide adequate foils for listeners and to allow for random selection of test words from a large master list (Kent et al., 1989). The group of four words might be, for example, (fu) [fire], (fu) [trousers], (fu) [subject] and (fui) [ash]. The test word (fu) [fire] contrasts with each of the three foil words in this group by one phonetic contrast (contrast of tone, contrast of high-low vowel, contrast of vowel-diphthong, respectively).
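The grouping of test items into sets of four can be sketched as follows. This is only an illustration: the master list, the placeholder word labels, and the assumption that any word in a group may be drawn as the target are hypothetical and are not taken from Kent et al. (1989) or from actual Cantonese test items.

```python
import random

# Hypothetical master list: each group holds four words.
# The labels are placeholders, not real test items.
MASTER_LIST = [
    ["word_a1", "word_a2", "word_a3", "word_a4"],
    ["word_b1", "word_b2", "word_b3", "word_b4"],
    ["word_c1", "word_c2", "word_c3", "word_c4"],
]


def draw_test_form(master_list, rng):
    """Pick one target per group; the other three words serve as foils."""
    form = []
    for group in master_list:
        target = rng.choice(group)
        options = group[:]      # copy so the master list is left untouched
        rng.shuffle(options)    # randomize where the target appears in the row
        form.append({"target": target, "options": options})
    return form


# A seeded generator makes a given test form reproducible.
form = draw_test_form(MASTER_LIST, random.Random(0))
```

Drawing the target at random from each group is one way in which different test forms could be generated from the same master list.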

Procedures: Speakers will read each of the stimulus words once. Ten percent of test items will be repeated as a reliability measure. Word production will be tape-recorded on a high-quality DAT recorder. Listening tapes will be made consisting of sequences of words randomized within and across subjects, with an interword interval of 5 seconds. The recorded speech samples will be presented through high-quality headphones in a sound-proofed room. Listeners will be shown a response format on a computer screen with four words in each numbered row (one target test-word item and three alternative foils). The listeners will select the word in each row that most closely approximates the subjects' word productions (Kent et al., 1990; Kent, Weismer, Kent & Rosenbek, 1989).
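The construction of a randomized listening sequence might be sketched as below. The sketch assumes that the 10% of repeated items are repeated presentations to the listeners and that recordings are pooled across speakers before shuffling; both are assumptions made for illustration, and the speaker and item numbers are hypothetical.

```python
import random


def build_listening_sequence(recordings, repeat_fraction, rng):
    """Return a shuffled playback order with a share of items repeated."""
    n_repeats = round(len(recordings) * repeat_fraction)
    repeats = rng.sample(recordings, n_repeats)   # items to be heard twice
    sequence = recordings + repeats               # new list; input is unchanged
    rng.shuffle(sequence)                         # randomize across speakers and items
    return sequence


# Hypothetical pool: 3 speakers x 10 items, each recording tagged (speaker, item).
recordings = [(speaker, item) for speaker in range(1, 4) for item in range(1, 11)]
sequence = build_listening_sequence(recordings, 0.10, random.Random(1))
```

Agreement between a listener's first and second response to a repeated item could then serve as the reliability measure.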

Analysis: A. Overall Intelligibility. Computer analysis will provide an overall intelligibility score for each speaker, based upon contrast correctness as scored by all listeners. The score will give a measure of speakers' overall accuracy in single word transmission.
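A minimal sketch of the overall score, assuming it is computed as the percentage of listener responses that match the target word, pooled over all listeners. The data layout and the listener and word labels are hypothetical.

```python
def overall_intelligibility(responses):
    """Percentage of listener responses that match the target word.

    `responses` maps a listener ID to a list of (target, chosen) pairs
    for one speaker.
    """
    total = 0
    correct = 0
    for pairs in responses.values():
        for target, chosen in pairs:
            total += 1
            if target == chosen:
                correct += 1
    return 100.0 * correct / total


# Hypothetical responses from two listeners to four items by one speaker.
responses = {
    "listener_1": [("w1", "w1"), ("w2", "w2"), ("w3", "w5"), ("w4", "w4")],
    "listener_2": [("w1", "w1"), ("w2", "w6"), ("w3", "w3"), ("w4", "w4")],
}
score = overall_intelligibility(responses)  # 6 of 8 responses correct: 75.0
```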

B. Phonetic feature analysis. The response forms will be scored by computer to determine the group profile of feature errors according to the phonetic features used in test construction. Individual patterns for feature profiles will also be calculated, by ranking of error proportions for individual subjects. It may be shown, for example, that subjects with similar overall intelligibility scores show different error profiles. A further level of analysis allows us to compare phonetic errors by overall intelligibility scores. For example, do subjects with low intelligibility scores have a different profile than those with high overall scores?
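The ranking of error proportions could be sketched as follows. The sketch assumes that each foil is tagged with the contrast it tests, and that the error proportion for a contrast is the number of times its foil was chosen divided by the number of times such a foil was offered. The trial data are hypothetical.

```python
from collections import Counter


def error_profile(trials):
    """Rank phonetic contrasts by the proportion of foil choices.

    Each trial is (offered, chosen): `offered` lists the contrasts tested
    by the three foils, and `chosen` is the contrast of the foil the
    listener selected, or None if the target was selected.
    """
    opportunities = Counter()
    errors = Counter()
    for offered, chosen in trials:
        opportunities.update(offered)
        if chosen is not None:
            errors[chosen] += 1
    proportions = {c: errors[c] / n for c, n in opportunities.items()}
    # Highest error proportion first.
    return sorted(proportions.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical trials for one speaker.
trials = [
    (("tone", "high-low vowel", "vowel-diphthong"), "tone"),
    (("tone", "high-low vowel", "vowel-diphthong"), "tone"),
    (("tone", "aspiration", "final stop"), "tone"),
    (("tone", "aspiration", "final stop"), "aspiration"),
]
profile = error_profile(trials)
```

Computing one such profile per speaker would allow speakers with similar overall scores to be compared contrast by contrast.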

C. Acoustic analysis. Acoustic analysis will be conducted for several features identified as important to speech intelligibility. These may include features previously studied in English, for example, stop-nasal contrast, fricative-affricate contrast, front-back vowel contrast and high-low vowel contrast (Ansel & Kent, 1992; Kent et al., 1992; Weismer & Martin, 1992), as well as variables found to be significant in Cantonese, such as tone contrasts and aspirated vs. unaspirated stops. The acoustic analysis procedures may include amplitude waveform, spectrographic and linear predictive coding (LPC) analysis. Analyses for individual features will be as described in the above-listed studies: for example, first formant frequency (F1) for the high-low vowel contrast, the relative frequency of the first and second formants for the front-back vowel contrast, and duration of fricative noise and rise time for the fricative-affricate contrast.

D. Statistical analysis. Multiple regression analysis will be used to determine the strength of the relationship between overall intelligibility scores and phonetic contrast errors, and between overall intelligibility scores and acoustic variables. Results are expected to reveal what percentage of the variance in overall intelligibility scores can be predicted by phonetic features and by acoustic variables respectively (see, for example, the results of Weismer & Martin, 1992; Ansel & Kent, 1992).
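The proportion of variance accounted for can be illustrated with a small ordinary least squares sketch. It is written in plain Python so that it is self-contained; in practice a statistical package would be used. The predictor values and scores below are invented for illustration and are not data from any study.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(predictors, outcome):
    """Fit outcome = b0 + b1*x1 + ... by least squares and return R^2."""
    rows = [[1.0] + list(p) for p in predictors]       # add intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, outcome)) for i in range(k)]
    beta = solve(xtx, xty)                             # normal equations
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(outcome) / len(outcome)
    ss_res = sum((y - f) ** 2 for y, f in zip(outcome, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in outcome)
    return 1.0 - ss_res / ss_tot


# Hypothetical data: error proportions on two contrasts for six speakers,
# and each speaker's overall intelligibility score (percent correct).
predictors = [(0.10, 0.05), (0.20, 0.10), (0.35, 0.05),
              (0.40, 0.30), (0.55, 0.20), (0.60, 0.45)]
scores = [88.0, 80.0, 74.0, 62.0, 58.0, 45.0]
r2 = r_squared(predictors, scores)
```

Multiplying the returned value by 100 gives the percentage of variance in overall intelligibility scores accounted for by the predictors in the model.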


B. Instrument development

As indicated above, several instruments need to be developed:

a. Speech test (Stage 1)

This will be based upon the Cantonese Segmental Phonology Test (So, 1992) and other existing Cantonese speech tests/materials. To date, I have (a) conducted a review of the literature to determine the type of material needed and (b) investigated existing tests. Still to be done: (a) refine the test materials; (b) prepare the materials for presentation and recording. This may be done on computer or using printed materials.

b. Intelligibility test (Stage 2)

As indicated, the development of this test depends upon the results of Stage 1. I have (a) investigated similar tests for English speakers and (b) developed a hypothesis about the key features in Cantonese. To be done: (a) determine key features (from the results of Stage 1); (b) select appropriate items; (c) prepare materials for presentation and screening. As above, this may be done on computer or using printed materials.


C. Data Collection

Stage 1

Collection of speech samples from 20 dysarthric males.

Stage 2

    1. Collection of speech samples from 20 dysarthric males (may be the same or a different group from stage 1).
    2. Collection of speech samples from 20 normal adult Cantonese males.
    3. Collection of perceptual judgements from 10 Cantonese adults.

Dysarthric subjects

I plan to approach Cheshire Home (Shatin), other residential facilities, speech therapy out-patient clinics, and sheltered workshops for possible subjects.

Normal subjects

I will recruit HKU students, HKU staff, and other friends/acquaintances.

Means of data collection

As described above (Section V. A). Ideally, subjects' speech will be recorded in the Department of Speech and Hearing Sciences, as sound-proof rooms and all necessary equipment are easily available there. However, I am prepared to collect the data in the subjects' places of work/residence or homes.

Possible problems

My plan calls for 20 or 40 Cantonese-speaking males with spastic dysarthria. It is possible I will not be able to enlist enough suitable subjects. If so, I plan to (i) use a different type of dysarthria (i.e. ataxic, athetoid, mixed) (ii) use smaller numbers (some studies have used, for example, five or ten subjects).


D. Data analysis

Method: As described above (Section V. A)

Problems: I will need to gain expertise in this area. No other problems are anticipated.


VI. Possible outcomes

    1. Determination of common speech errors in adult Cantonese-speaking spastic dysarthric males.
    2. Development of a single-word speech intelligibility test for adult Cantonese-speaking dysarthrics which (a) provides an overall intelligibility score (b) allows identification of errors.
    3. Description of key acoustic features (i.e. those which have been identified as being important to speech intelligibility in Cantonese dysarthrics) in both normal and dysarthric speakers.
    4. Determination of how well we can predict overall intelligibility score by (a) phonetic error patterns and (b) acoustic features in adult Cantonese-speaking spastic dysarthrics.

Practical implications

Some of the practical implications have been described above (Section V A). To summarize:

    1. There is a lack of data on both normal and disordered Cantonese speech. This study will provide a valuable database on Cantonese dysarthric speech. It will also contribute data on acoustic patterns of normal Cantonese adult speech.
    2. There are no speech intelligibility tests for Cantonese (except one or two speech audiometry tests, which are unsuitable for this purpose). The development of an intelligibility test will be valuable to clinicians as well as researchers both in Hong Kong and other countries with Cantonese-speakers.


VII. References

Ansel, B. M. & Kent, R. D. (1992). Acoustic-phonetic contrasts and intelligibility in the dysarthria associated with mixed cerebral palsy. Journal of Speech and Hearing Research, 35, 296-308.

Chan, Y. T. (1993). A comparison of the articulation of Cantonese-speaking phonologically disordered children in single words and connected speech. In Bachelor of Science (Speech & Hearing Sciences) First Degree Dissertations, Volume 1. Department of Speech and Hearing Sciences, University of Hong Kong.

Colton, R. H. & Casper, J. K. (1990). Understanding voice problems: A physiological perspective for diagnosis and treatment. Baltimore: Williams & Wilkins.

Darley, F. L., Aronson, A. E. & Brown, J. R. (1975). Motor Speech Disorders. Philadelphia: W. B. Saunders.

Fok, A. Y. Y. (1984). The teaching of tones to children with profound hearing impairment. British Journal of Disorders of Communication, 19, 225-236.

Fok, A. Y. Y. (1987, November). A study of the part tones play in determining the comprehension threshold of Cantonese. Paper presented to the First International Conference on Cantonese and Yue dialects. Chinese University of Hong Kong.

Gandour, J. & Dardarananda, R. (1983). Identification of tonal contrasts in Thai aphasic patients. Brain and Language, 18, 98-114.

Gandour, J., Ponglorpisit, S., Khunadorn, F., Dechongkit, S., Boongird, P., Boonklam, R. & Potisuk, S. (1992). Lexical tones in Thai after unilateral brain damage. Brain and Language, 43, 275-307.

Kennedy, G., Pring, T. & Fawcus, R. (1993). No place for motor speech acts in the assessment of dysphagia? Intelligibility and swallowing difficulties in stroke and Parkinson's disease patients. European Journal of Disorders of Communication, 28, 213-226.

Kent, J. F., Kent, R. D., Rosenbek, J. C., Weismer, G., Martin, R., Sufit, R. & Brooks, B. R. (1992). Quantitative description of the dysarthria in women with amyotrophic lateral sclerosis. Journal of Speech and Hearing Disorders.

Kent, R. D., Kent, J. F., Weismer, G., Sufit, R. L., Rosenbek, J. C., Martin, R. E. & Brooks, B. R. (1990). Impairment of speech intelligibility in men with amyotrophic lateral sclerosis. Journal of Speech and Hearing Disorders, 55, 721-728.

Kent, R. D., Weismer, G., Kent, J. F. & Rosenbek, J. C. (1989). Toward phonetic intelligibility testing in dysarthria. Journal of Speech and Hearing Disorders, 54, 482-499.

Kwok, C. L., Wong, C. M., So, K. W., Yiu, M. L., Lau, C. C., Luk, W. S. & Tang, S. O. (1991). Speech and lexical-tone perception in Cantonese-speaking cochlear implant patients. Australian Journal of Human Communication Disorders, 19, 77-90.

Nicolosi, L., Harryman, E. & Kresheck, J. (1978). Terminology of communication disorders: Speech, language, hearing. Baltimore: Williams & Wilkins.

Platt, L. J., Andrews, G., Young, M. & Quinn, P. T. (1980). Dysarthria of adult cerebral palsy: I. Intelligibility and articulatory impairment. Journal of Speech and Hearing Research, 23, 28-40.

Schiavetti, N., Metz, D. E. & Sitler, R. W. (1981). Construct validity of direct magnitude estimation and interval scaling of speech intelligibility: Evidence from a study of the hearing impaired. Journal of Speech and Hearing Research, 24, 441-445.

Schiavetti, N. (1992). Scaling procedures for the measurement of speech intelligibility. In R. D. Kent (Ed.), Intelligibility in speech disorders: Theory, measurement, and management (pp. 11-34). Philadelphia: John Benjamins.

So, L. K. H. (1992). Cantonese Segmental Phonology Test. Hong Kong: Department of Speech and Hearing Sciences, University of Hong Kong.

So, L. K. H. & Varley, R. (1991). Cantonese Lexical Comprehension Test. Hong Kong: Department of Speech and Hearing Sciences, University of Hong Kong.

Weismer, G. & Martin, R. E. (1992). Acoustic and perceptual approaches to the study of intelligibility. In R. D. Kent (Ed.), Intelligibility in speech disorders: Theory, measurement, and management (pp. 67-118). Philadelphia: John Benjamins.

Yiu, E. (1989). Tone perception in Cantonese aphasics. Unpublished master's thesis, University of Hong Kong.

Ziegler, W., Hartmann, E. & von Cramon, D. (1988). Word identification testing in the diagnostic evaluation of dysarthric speech. Clinical Linguistics and Phonetics, 2, 291-308.

[An extensive Bibliography was also added but has not been included here]


VIII. Timetable


Submit Proposal
Estimated dates: December 1993

Stage 1 - Error patterns
- prepare test materials
- screen and enlist subjects
- collect data
- analyze data
Estimated duration: 6 months
Estimated dates: January 1 - June 1994

Stage 2 - Intelligibility test
- prepare test materials
- screen and enlist contact subjects
- collect data
Estimated duration: 4 months
Estimated dates: July - October 1994

Stage 3 - Listening task
- prepare test materials
- enlist subjects
- collect data
Estimated duration: 4 months
Estimated dates: Nov. 1994 - February 1995

Data entry
Analysis (for stages 2 & 3)
- Perceptual analysis
- Phonetic analysis
- Acoustic analysis
- Statistical analysis
Estimated duration: 12 months
Estimated dates: March 1995 - February 1996

Editing, Revisions, Rewriting
Estimated duration: 12 months
Estimated dates: March 1996 - February 1997

Total Duration: 38 months
Estimated dates: March 1997

Note: 1993/94 - Part-time only on Ph.D. (approx. 2 days per week)

1994/95 - Full-time on Ph.D.

1995/96 - Part-time.
