Doctoral Dissertation of Chih-Hao Tsai

July 2001

Tsai, C.-H. (2001). Word identification and eye movements in reading Chinese: A modeling approach. Doctoral dissertation, University of Illinois at Urbana-Champaign.


Chapter 4
Word Identification During Online Reading

What can one conclude from the psycholinguistic literature on lexical representation and access in Chinese? It appears that the evidence for the involvement of phonological, morphological, and semantic factors in lexical representation and access is minimal. Words appear to be identified directly by matching the encoded visual input with entries in the mental lexicon. This is not to say that there are no phonological, morphological, and semantic processes during word identification and the entire lexical access process. There are; they just do not seem to mediate word identification.

Having reviewed research on the identification and access of words in the mental lexicon, I will now turn to past research on the identification of words during reading. Surprisingly, there are only a few related psycholinguistic studies. The field's lack of interest in this issue is intriguing, and there appear to be two main reasons for it. First, the issue is much more complicated than typical psychological or psycholinguistic issues. Perhaps the complexity of the word identification problem has intimidated most psychologists, because it demands research methodologies and perspectives beyond traditional experimentation. Second, as discussed earlier, many psychologists studying Chinese reading have felt uncertain about the existence of the word unit (e.g., H.-C. Chen & Zhou, 1999, pp. 425-426). Hoosain (1991, 1992) even questioned the "psycholinguistic reality" of the word in Chinese.

As another example, Perfetti and Tan (1999) said in a recent article discussing Chinese word identification:

These issues [of word finding in Chinese reading], although beyond our present purpose, are important in the long run if a theory of word reading in texts is to be developed. In our narrower context, the question is the definition of word, to set the scope for a theory of word identification. There are two possibilities. One is to take word finding as part of what a theory has to account for. We think this is more than can be accomplished at present [emphasis added]. It also works against comparisons with the research on word identification in alphabetic writing systems, which focused on single monomorphemic words in the development of models of identification. Thus, we take the second option, which is to take single-character words as the primary model of word identification [emphasis added] (p. 120).

It is clear how uncertainty about the word unit and the complexity of the word identification problem may have driven them (and many other psychologists as well) to define words as characters and to make the single-character identification process their primary scope of research.

Word Spacing and Reading

Perhaps one of the easiest experimental manipulations that can be used in studying the word identification problem is adding word boundaries to the text to see whether reading performance changes. Liu, Yeh, Wang, and Chang (1974) conducted exactly this kind of experiment. They assumed that the lack of word boundaries makes word identification more difficult, and that this difficulty increases reading time. Liu et al. had participants read sentences in which word boundaries were marked by inserting a space between consecutive words, and predicted that reading would be easier when word boundaries were provided. Contrary to their prediction, reading speed (measured as the recognition threshold for a sentence presented in a tachistoscope) in that condition was slower than in the control condition. They concluded that readers must have adapted to conventionally arranged text, so that artificial insertion of word boundaries changes the visual environment and disturbs readers' well-established eye movement habits. Of course, it is also possible that the unusual method of measuring recognition thresholds in the tachistoscope interfered with reading, or that introducing spaces pushed the outermost characters further into the periphery, making them harder to see.

More recently, Hsu and Huang (2000) conducted a set of similar experiments. They had participants read either conventionally arranged text or text with word boundaries inserted, on a computer screen. What they found was the reverse of Liu et al.'s finding: The reading time for text with word boundaries was shorter than that for conventionally arranged text. However, their experiment used a between-subjects design with only three participants in each condition. Moreover, each participant read only three passages, each of which contained only about 250 words. Therefore, Hsu and Huang's results may not be very reliable.

Perceptual Units in Reading

H.-C. Chen (1987) and J.-Y. Chen (1999) adopted the letter detection paradigm (Healy, 1976, 1980; Healy & Drewnowski, 1983) to investigate whether reading units can be larger than single characters in Chinese. In a typical experimental setting used by Healy and her colleagues, participants were given a target letter and asked to proofread a text and circle all instances of the target letter. It was found that letter detection tended to be less accurate when the target letter was a constituent of a word than when it was a constituent of a misspelled word. This finding is called the word inferiority effect. H.-C. Chen used the rapid serial visual presentation (RSVP) paradigm to present the text, while J.-Y. Chen presented the text normally while participants searched for a target character. The common pattern in their findings was that when the target character was a constituent of a word, detection accuracy was higher than when it was a constituent of a nonword. That is, they obtained a word superiority effect rather than a word inferiority effect. The contradiction between the two sets of results is hard to explain. Nevertheless, it demonstrates that the word has an effect on a task that does not require word recognition, suggesting that the tendency of the word to be perceived and processed as a unit is quite strong.

Disambiguation Heuristics

Perfetti and Tan (1996) found that Chinese lexical garden-path sentences were read more slowly than their controls. The garden-path portion of their sentences was a tri-gram in which the first and second characters formed a two-character word, while the second and third characters also formed a two-character word. The latter was the correct identification in the sentential context in their experiment. Based on this finding, Perfetti and Tan concluded that Chinese readers read the text word by word using a "two-character assembly strategy", rather than character by character; otherwise, lexical garden-path sentences would not have troubled them. Perfetti and Tan's finding is interesting, and it does reflect some strategy, or heuristic. However, the strategy or heuristic is probably not the "two-character assembly strategy" they proposed. In fact, it seems that a left-to-right heuristic was at work. Nevertheless, other factors (word frequency, to mention just one) could also have had the same effect. It is therefore hard to determine what actually happened in their experiment.
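The tri-gram structure described above can be made concrete with a minimal sketch. It is only an illustration under stated assumptions: the function names, the toy lexicon, and the placeholder string are hypothetical, Latin letters stand in for Chinese characters, and none of it reproduces Perfetti and Tan's stimuli or any model proposed in this dissertation. The first function locates tri-grams whose first two and last two characters both form lexicon entries; the second applies a simple greedy left-to-right rule that commits to the first two-character word it encounters.

```python
def find_overlap_trigrams(text, lexicon):
    """Return (index, trigram) pairs in which characters 1-2 and
    characters 2-3 of the trigram are both entries in the lexicon."""
    hits = []
    for i in range(len(text) - 2):
        tri = text[i:i + 3]
        if tri[:2] in lexicon and tri[1:] in lexicon:
            hits.append((i, tri))
    return hits


def left_to_right_two_char(text, lexicon):
    """Greedy left-to-right segmentation (an assumed, simplified rule):
    take a two-character word if one starts at the current position,
    otherwise take a single character."""
    words = []
    i = 0
    while i < len(text):
        if text[i:i + 2] in lexicon:
            words.append(text[i:i + 2])
            i += 2
        else:
            words.append(text[i])
            i += 1
    return words


# Hypothetical toy data: "AB" and "BC" are both two-character words.
TOY_LEXICON = {"AB", "BC"}
TOY_TEXT = "XABCY"

ambiguous = find_overlap_trigrams(TOY_TEXT, TOY_LEXICON)
segmented = left_to_right_two_char(TOY_TEXT, TOY_LEXICON)
print(ambiguous)
print(segmented)
```

In this toy case the greedy rule commits to "AB" and strands "C", which corresponds to the garden-path reading; a context in which "BC" is the correct word would then require reanalysis. The sketch ignores word frequency and sentential context, both of which could matter, as noted above.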

© Copyright by Chih-Hao Tsai, 2001