Tsai, C.-H. (2001). Word identification and eye movements in reading Chinese: A modeling approach. Doctoral dissertation, University of Illinois at Urbana-Champaign.
Research on Chinese reading has largely followed the "mainstream" of the psychology of reading. Psychologists studying Chinese reading have concentrated predominantly on lexical processing issues involving single-character or single-word reading, and have paid little attention to the problem of word identification in Chinese text reading. As a result, progress has not been rapid, and our knowledge of word identification in Chinese text processing has accumulated slowly. A number of factors have led to this situation.
Psychologists studying Chinese reading have often not been
informed by advances in Chinese linguistics. Those who study
Chinese reading tend to rely on their own intuitions in
developing theories of the language, and these intuitions are
often incompatible with linguistic analysis. For example, Taft,
Liu, and Zhu (1999) and Taft and Zhu (1997) equated morphemes
with characters and used terms such as "morphemic processing" and
"sub-morphemic processing" when they really meant "character
processing" and "sub-character processing". Many other
researchers, such as Peng, Liu, and Wang (1999) and Zhang and
Peng (1992), share this conceptualization. In general,
psychologists have not made use of findings in linguistics, and
this has prevented them from working with linguistically defined
structures such as the word or the morpheme. The lack of input
from linguistics has clearly hindered the development of
knowledge about the cognitive processes of Chinese reading.
Furthermore, psychologists studying Chinese reading and computer
scientists doing natural language processing research rarely
interact. The large tokenization literature in computer science
appears to have had little effect on psychologists interested in
what would seem to be a very similar problem. This may seem
strange, but it is not without reasons. Computational methodology
is something many psychologists, especially experimental
psychologists, are not conversant with; and while psychologists
are usually very concerned about "psychological reality", most
computational work is difficult to relate to psychological
processing models.
Experimentation is the primary research tool of most cognitive
psychologists. It is, as Bower and Clapper (1991) nicely
described, "a conceptual prosthetic, an intellectual
tool that allows us to create in the laboratory possible
microworlds never seen before and then observe how specific
cognitive subsystems operate in those microworlds.
Experimentation provides a generate-and-test heuristic for
checking the validity of our causal theories, for testing
theoretical predictions." (p. 294). There are two crucial points
here. First, experimentation is a theory validation tool rather
than a theory construction tool. Experimentation can only be
meaningful in the context of a good theory. Second, it is a
heuristic, which means it improves the average-case performance
on a theory validation task but does not necessarily improve the
worst-case performance. In other words, it can resolve most but
not all research problems. In light of these two characteristics
of experimentation, and especially given the problem of word
identification, the limitations of experimentation in this
paradigm become apparent. Word identification in Chinese reading
is a complicated computation for which there is not yet a good
theoretical model to generate hypotheses. Experimentation can
assist theory construction, especially in testing hypotheses and
ruling out possibilities. For a mechanism this complicated,
however, experimentation alone may not be sufficient to exclude
the myriad possibilities.
Experimentation is most useful when it complements a fully fleshed-out theory. To form a good theory, intense analytical work is indispensable. After all, we are more likely to discover true cognitive mechanisms if we first outline the details of the problem they are designed to solve (Marr, 1982). This is because human minds are adaptive, and have evolved to solve adaptive problems rather than arbitrary tasks (Cosmides, 1989; Pinker & Bloom, 1990).
The word frequency effect can be viewed as the result of adaptation or optimization. Since normal reading materials consist of mostly high frequency words, to optimize reading efficiency one need not recognize every word in uniform time. Instead, the recognition of high frequency words is optimized, because it is most cost-effective to reduce recognition time for those high frequency words. For example, a reduction of 50 milliseconds in recognizing a word in principle results in a reduction of 5 seconds of total reading time if the word occurs 100 times in the text. The reduction of reading time would be only 0.25 seconds if the word were to occur only 5 times in the text. It is easy to see that to optimize the total reading time, the higher the frequency of a word, the larger the reduction in its recognition time should be. In fact, the above example of optimization is almost a mirror image of the Huffman code, a variable-length code aimed at optimizing the total length of encoded information. The encoding scheme assigns shorter codes to high frequency words and longer codes to low frequency words. (See Hamming, 1986, or textbooks on coding and information theory or data structures, for a formal introduction to the Huffman code.)
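The analogy can be made concrete with a minimal sketch of the Huffman construction: repeatedly merge the two least frequent subtrees, prepending one bit to each side's codes, until one tree remains. The word list and frequency counts below are invented purely for illustration.

```python
import heapq

def huffman_code(frequencies):
    """Build a Huffman code: a prefix-free binary code that
    minimizes the expected code length for the given frequencies."""
    # Heap entries are (frequency, tiebreaker, {symbol: code});
    # the unique tiebreaker keeps the dicts out of tuple comparisons.
    heap = [(f, i, {sym: ""}) for i, (sym, f) in enumerate(frequencies.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        f1, _, codes1 = heapq.heappop(heap)  # lightest subtree
        f2, _, codes2 = heapq.heappop(heap)  # next lightest
        # Merge: prepend 0 to one subtree's codes and 1 to the other's.
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (f1 + f2, next_id, merged))
        next_id += 1
    return heap[0][2]

# Invented counts, standing in for word frequencies in a text.
freqs = {"the": 500, "of": 300, "word": 50, "reading": 20, "zygote": 1}
codes = huffman_code(freqs)
for word in sorted(freqs, key=freqs.get, reverse=True):
    print(word, codes[word])
```

Running the sketch shows the property the text describes: the most frequent word receives the shortest code, and code length grows as frequency falls, so the total encoded length of the text is minimized.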
The word
identification problem in Chinese reading is much more
complicated than the simple word frequency effect, but it is
still an adaptation/optimization problem. By carefully probing
the nature of the problem, one can draw solid inferences. These
inferences, in turn, can form a robust theoretical framework to
guide further research. Although the adaptive or optimization
perspective has been frequently articulated by recent researchers
viewing cognitive problems from an evolutionary or computational
perspective (e.g., Barkow, Cosmides, & Tooby, 1992), such a
perspective is not new in psychology. Petrinovich (1979), in his
adaptation of Brunswik's (1952) original lens model, had already
made this clear. He emphasized discovering the trustworthiness or
reliability of cues in the environment that the organism may use
to achieve behavioral goals. Doing so can "aid the scientist's
attempts to construct a theory of behavior.... [and] can give
some insights into the problems that one is faced with in general
and should influence the choice of variables and the specific
testing and analytic strategies to be used" (p. 378). Gibson's
(1979) ecological approach to perception
is another example. His emphasis on discovering invariant
information in the optical array coincides with Petrinovich's or
Brunswik's "cues". Although Gibson had a simplified, unrealistic
view of how perception should occur insisting that there is
enough information in the optical array to make computation or
processing unnecessary, the emphasis on reliable cues was
nevertheless insightful and remains so in the present day. In
fact, paying attention to cues (or constraints) present in the
input is a common characteristic of some of the most successful,
modern approaches to cognition, among which Parallel Distributed
Processing (PDP; Rumelhart, McClelland, & the PDP Research
Group, 1986) is the best known.
© Copyright by Chih-Hao Tsai, 2001