Doctoral Dissertation of Chih-Hao Tsai

July 2001

Tsai, C.-H. (2001). Word identification and eye movements in reading Chinese: A modeling approach. Doctoral dissertation, University of Illinois at Urbana-Champaign.



Chapter 5
Some Problems in Chinese Reading Research

It seems clear that the research trend in Chinese reading has been one that follows the "mainstream" of research in psychology of reading. Psychologists studying Chinese reading have been predominantly concentrating on lexical processing issues involving single character or word reading, and have paid rather little attention to the problem of word identification in Chinese text reading. The progress has not been rapid, and our knowledge of word identification in Chinese text processing has been slow to accumulate. There are a number of factors that have led to this situation.

Lack of Interdisciplinary Perspective

Psychologists studying Chinese reading have often not been informed by advances in Chinese linguistics. Those who study Chinese reading tend to rely on their own intuitions to develop theories of the language, and these theories are often incompatible with linguistic analyses. For example, Taft, Liu, and Zhu (1999) and Taft and Zhu (1997) equated morphemes with characters and used terms such as "morphemic processing" and "sub-morphemic processing" which really meant "character processing" and "sub-character processing". Many other researchers, such as Peng, Liu, and Wang (1999) and Zhang and Peng (1992), share this kind of conceptualization. In general, psychologists have not made use of findings in linguistics, which has prevented them from utilizing linguistically-defined structures such as the word or the morpheme. The lack of input from linguistics has clearly hindered the development of knowledge regarding the cognitive processes of Chinese reading. Furthermore, psychologists studying Chinese reading and computer scientists doing natural language processing research seldom interact. The bulk of the tokenization literature in computer science does not seem to have had much effect on psychologists interested in what would appear to be a very similar subject. This seems strange, but it is not without reasons. Computational methodology is something that many psychologists, especially experimental psychologists, are not conversant with, and while psychologists are usually very concerned about "psychological reality", most computational works are difficult to equate with psychological processing models.

Limitation of Experimentation

Experimentation is the primary research tool of most cognitive psychologists. It is, as Bower and Clapper (1991) nicely described, "a conceptual prosthetic, an intellectual tool that allows us to create in the laboratory possible microworlds never seen before and then observe how specific cognitive subsystems operate in those microworlds. Experimentation provides a generate-and-test heuristic for checking the validity of our causal theories, for testing theoretical predictions." (p. 294). There are two crucial points here. First, experimentation is a theory validation tool rather than a theory construction tool. Experimentation can only be meaningful in the context of a good theory. Second, it is a heuristic, which means it improves the average-case performance on a theory validation task but does not necessarily improve the worst-case performance. In other words, it can resolve most but not all research problems. In light of these two characteristics of experimentation, and especially taking into account the problem of word identification, the limitation of experimentation in this paradigm manifests itself. Word identification in Chinese reading is a complicated computation that does not yet have a good theoretical model that can be used to generate hypotheses. As to the theory construction itself, experimentation can be of help, especially in testing hypotheses and ruling out possibilities. However, for such a complicated mechanism with so many possibilities, experimentation alone may not be sufficient to exclude the myriad possibilities.

Lack of Analytical Works

Experimentation is most useful when it complements a fully fleshed-out theory. To form a good theory, intense analytical work is indispensable. After all, we are more likely to discover true cognitive mechanisms if we first outline the details of the problem they are designed to solve (Marr, 1982). This is because human minds are adaptive, and have evolved to solve adaptive problems rather than arbitrary tasks (Cosmides, 1989; Pinker & Bloom, 1990).

The word frequency effect can be viewed as the result of adaptation or optimization. Since normal reading materials consist mostly of high frequency words, to optimize reading efficiency one need not recognize every word in uniform time. Instead, the recognition of high frequency words is optimized, because it is most cost-effective to reduce recognition time for those high frequency words. For example, a reduction of 50 milliseconds in recognizing a word in principle results in a reduction of 5 seconds of total reading time if the word occurs 100 times in the text. The reduction of reading time would be only 0.25 seconds if the word were to occur only 5 times in the text. It is easy to see that to optimize the total reading time, the higher the frequency of a word, the larger the reduction in its recognition time should be. In fact, the above example of optimization is almost a mirror image of the Huffman code, a variable-length code aimed at optimizing the total length of encoded information. The encoding scheme assigns shorter codes to high frequency words and longer codes to low frequency words. (See Hamming, 1986, or textbooks on coding and information theory or data structures, for a formal introduction to the Huffman code.)
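The arithmetic and the Huffman analogy in the preceding paragraph can be illustrated with a short sketch. It is offered only as an illustration under stated assumptions, not as part of the model developed in this dissertation: the word labels and their counts are hypothetical toy values (only the counts 100 and 5 and the 50 ms saving come from the example in the text), and the function names huffman_code_lengths and time_saved_seconds are invented here for convenience.

```python
import heapq
from itertools import count


def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a Huffman code over freqs."""
    if len(freqs) == 1:
        # A single symbol still needs one bit to be written down.
        return {sym: 1 for sym in freqs}
    tiebreak = count()  # unique counter so the heap never compares dicts
    # Each heap entry: (total frequency, tiebreaker, {symbol: depth so far})
    heap = [(f, next(tiebreak), {sym: 0}) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees; every symbol inside
        # them moves one level deeper, i.e. its code grows by one bit.
        f1, _, d1 = heapq.heappop(heap)
        f2, _, d2 = heapq.heappop(heap)
        merged = {s: d + 1 for s, d in {**d1, **d2}.items()}
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]


def time_saved_seconds(saving_ms, occurrences):
    """Total reading time saved when one word is recognized faster."""
    return saving_ms * occurrences / 1000.0


# Hypothetical toy text: four word types with made-up occurrence counts.
toy_counts = {"word_a": 100, "word_b": 40, "word_c": 15, "word_d": 5}
lengths = huffman_code_lengths(toy_counts)

for word, n in sorted(toy_counts.items(), key=lambda kv: -kv[1]):
    print(word, "count:", n, "code length:", lengths[word],
          "saved by a 50 ms speed-up:", time_saved_seconds(50, n), "s")
```

In this toy case the most frequent word receives the shortest code and the two rarest words receive the longest, which mirrors the claim that the payoff from speeding up recognition grows with a word's frequency.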

The word identification problem in Chinese reading is much more complicated than the simple word frequency effect, but it is still an adaptation/optimization problem. By carefully probing the nature of the problem, solid inferences can be drawn. These inferences, in turn, can form a very robust theoretical framework that can be used to guide further research. Although the adaptive or optimization perspective has been frequently articulated by recent researchers viewing cognitive problems from an evolutionary or computational perspective (e.g., Barkow, Cosmides, & Tooby, 1992), it is worth noting that such a perspective is not new in psychology. Petrinovich (1979) in his adaptation of Brunswik's (1952) original lens model had already made this clear. He emphasized discovering the trustworthiness or reliability of cues in the environment that may be used by the organism to achieve some behavioral goals. Doing so can "aid the scientist's attempts to construct a theory of behavior.... [and] can give some insights into the problems that one is faced with in general and should influence the choice of variables and the specific testing and analytic strategies to be used" (p. 378). Gibson's (1979) ecological approach to perception is another example. His emphasis on discovering invariant information in the optical array coincides with Petrinovich's or Brunswik's "cues". Although Gibson had a simplified, unrealistic view of how perception should occur, insisting that there is enough information in the optical array to make computation or processing unnecessary, the emphasis on reliable cues was nevertheless insightful and remains so in the present day. In fact, paying attention to cues (or constraints) present in the input is a common characteristic of some of the most successful modern approaches to cognition, among which Parallel Distributed Processing (PDP; Rumelhart, McClelland, & the PDP Research Group, 1986) is the best known.


© Copyright by Chih-Hao Tsai, 2001