Doctoral Dissertation of Chih-Hao Tsai

July 2001

Tsai, C.-H. (2001). Word identification and eye movements in reading Chinese: A modeling approach. Doctoral dissertation, University of Illinois at Urbana-Champaign.


Abstract

In Chinese text, characters are spaced evenly. There are no extra spaces separating words. Since every character boundary is a possible word boundary, identifying words should be quite difficult. However, Chinese and English readers read equally fast and have comparable eye movement patterns. Investigating how Chinese readers have solved the word identification problem so efficiently and effectively can contribute to an understanding of the nature of reading and cognitive processes.

This two-part, corpus-based study analyzed the problem of word identification (that is, of identifying which characters constitute a word) in Chinese and developed a model of word identification and eye movements in reading. A large, representative sample of segmented Chinese text was used throughout the study.

In Part 1, the corpus was analyzed systematically to understand the characteristics of different types of ambiguity in word identification. A set of psychologically plausible heuristics of disambiguation was also developed and tested. Part 1 replicated Guo's (1997) finding that many unambiguous word boundaries in a sentence can be found by using a lexicon to match words in the sentence. The properties of critical fragments--strings separated by those unambiguous word boundaries--were also verified. Finally, a set of disambiguation heuristics was found to be very effective.
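The lexicon-matching idea described above can be illustrated with a minimal Python sketch. It is not the procedure used in the dissertation. It assumes that every single character may stand alone as a word, so a character boundary counts as unambiguous exactly when no multi-character lexicon entry found in the sentence spans it. The function names, the `max_len` parameter, and the toy lexicon are hypothetical and chosen only for illustration.

```python
def unambiguous_boundaries(sentence, lexicon, max_len=4):
    """Return boundary positions that no multi-character lexicon match crosses.

    Position p is the boundary between sentence[p-1] and sentence[p];
    positions 0 and len(sentence) are the sentence edges.
    Assumption: every single character can be a word on its own.
    """
    n = len(sentence)
    crossed = [False] * (n + 1)
    for start in range(n):
        # Try every candidate word of 2..max_len characters starting here.
        for end in range(start + 2, min(n, start + max_len) + 1):
            if sentence[start:end] in lexicon:
                # A matched word rules out every boundary strictly inside it.
                for p in range(start + 1, end):
                    crossed[p] = True
    return [p for p in range(n + 1) if not crossed[p]]


def critical_fragments(sentence, lexicon, max_len=4):
    """Split the sentence at its unambiguous boundaries."""
    cuts = unambiguous_boundaries(sentence, lexicon, max_len)
    return [sentence[a:b] for a, b in zip(cuts, cuts[1:])]


# Toy lexicon invented for this example; it is not drawn from the corpus.
toy_lexicon = {"研究", "研究生", "生命", "起源"}

print(unambiguous_boundaries("研究生命起源", toy_lexicon))  # [0, 4, 6]
print(critical_fragments("研究生命起源", toy_lexicon))      # ['研究生命', '起源']
```

In this toy sentence, the overlapping candidates 研究, 研究生, and 生命 leave the first four characters as a single ambiguous fragment, while the boundary before 起源 is crossed by no lexicon entry and is therefore unambiguous.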

In Part 2, a model of word identification and eye movements in reading Chinese was developed. Perceptual constraints and an uncertainty-driven heuristic of eye movement control were implemented, based on the results of word identification and assumptions about the perceptual span. The corpus was used to test the model. The model identified 94% of the words correctly and captured many characteristics of real eye movement data, including distributions of saccade lengths, skipping rates, and landing positions. An effect of the fixated word's frequency on fixation duration was also observed.

Findings from this study have demonstrated the usefulness of creative methodologies and interdisciplinary perspectives in investigating complicated cognitive processes such as word identification and eye movements in reading Chinese. This study has also set a new direction and established a new framework for reading research in Chinese.

© Copyright by Chih-Hao Tsai, 2001