Tsai, C.-H. (2001). Word identification and eye movements in reading Chinese: A modeling approach. Doctoral dissertation, University of Illinois at Urbana-Champaign.
Appendix C lists examples of errors made in disambiguating critical fragments with disjunctive ambiguity in Part 1 (Chapter 8), including errors made by GMM, FMM, AWF, and MI. Only fragments whose correct tokenization is itself one of the critical tokenizations are listed. Fragments whose correct tokenization is merely covered by at least one of the critical tokenizations (the covering relationship is defined in Guo, 1997) are not listed. For example, the character string gou tong cai neng "ditch-connect-talent-ability" has two critical tokenizations: goutong caineng "communication-talent" and gou tongcai neng "ditch-versatile person-ability". However, the correct tokenization (in all three contexts where the critical fragment appears in ASBC) is goutong cai neng "communication-then and only then-can; '(something is) possible only via communication'", which is covered by the critical tokenization goutong caineng. Errors of this kind are not included in Appendix C.
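The covering check used in the example above can be sketched in a few lines of Python. This is only an illustration under a stated assumption: it reads the covering relationship of Guo (1997) as "every word boundary of the covering tokenization is also a word boundary of the covered one". The function names and the representation (each word as a tuple of pinyin syllables, one per character) are hypothetical and are not taken from the dissertation.

```python
def flatten(tokenization):
    """Return the underlying character sequence of a tokenization."""
    return [syllable for word in tokenization for syllable in word]


def boundaries(tokenization):
    """Return the set of internal word-boundary positions (in characters)."""
    positions, offset = set(), 0
    for word in tokenization[:-1]:
        offset += len(word)
        positions.add(offset)
    return positions


def covers(coarse, fine):
    """Assumed reading of 'covers': same string, and every boundary of
    `coarse` is also a boundary of `fine`."""
    if flatten(coarse) != flatten(fine):
        return False
    return boundaries(coarse) <= boundaries(fine)


# The gou tong cai neng example from the text above.
goutong_caineng = [("gou", "tong"), ("cai", "neng")]
gou_tongcai_neng = [("gou",), ("tong", "cai"), ("neng",)]
correct = [("gou", "tong"), ("cai",), ("neng",)]

print(covers(goutong_caineng, correct))   # True
print(covers(gou_tongcai_neng, correct))  # False
```

Under this reading a tokenization trivially covers itself, so a fragment counts as excluded from Appendix C only when its correct tokenization is covered by a critical tokenization without being identical to one.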
Note also that critical fragments are character strings delimited by critical points, that is, unambiguous word boundaries defined mechanically. Consequently, they do not necessarily correspond to any linguistic structure and may not have a comprehensible meaning.
This section lists examples of GMM errors in disambiguating critical fragments for which a single critical tokenization with the maximum average word length (AWL) can be identified (that is, there are no ties in AWL), but the correct tokenization does not have the maximum AWL.
(1) yi jiu shi ren "recall-old-time-people"
    1. yi jiushi ren "recall-old times-people"; AWL = 1.33
    2. *yijiu shiren "cherish memory of-contemporaries"; AWL = 2.00

(2) li zhi shou shi "leave-job-keep-time"
    1. li zhishou shi "leave-duty-when"; AWL = 1.33
    2. *lizhi shoushi "leave office-show up on time"; AWL = 2.00

(3) ji gu tou deng "chicken-bone-head-class"
    1. ji gutou deng "chicken-bone-and so on"; AWL = 1.33
    2. *jigu toudeng "chicken bone-first class"; AWL = 2.00

(4) yi shu tuan ti neng "art-skill-group-body-can"
    1. yishu tuanti neng "art-group-can"; AWL = 1.67
    2. *yishutuan tineng "art group-physical strength"; AWL = 2.50

(5) xue qi zhong xue sheng "learning-period-center-learning-life"
    1. xueqi zhong xuesheng "semester-halfway between-students"; AWL = 1.67
    2. *xueqi zhongxuesheng "semester-high school students"; AWL = 2.50
    3. *xue qizhong xuesheng "learning-midterm-students"; AWL = 1.67
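The AWL values in this section can be reproduced with a short Python sketch. It assumes that AWL is the number of characters in the fragment divided by the number of words in the tokenization, which agrees with the figures listed above. The function names and the representation (each word as a tuple of pinyin syllables, one per character) are hypothetical, and the selection rule is only a reading of the description here, not the implementation used in Chapter 8.

```python
from fractions import Fraction


def average_word_length(tokenization):
    """AWL = total characters / number of words (kept exact to detect ties)."""
    n_chars = sum(len(word) for word in tokenization)
    return Fraction(n_chars, len(tokenization))


def gmm_choice(candidates):
    """Return the single candidate with the maximum AWL, or None on a tie."""
    scores = [average_word_length(t) for t in candidates]
    best = max(scores)
    winners = [t for t, s in zip(candidates, scores) if s == best]
    return winners[0] if len(winners) == 1 else None


# Example (5): xue qi zhong xue sheng; the first candidate is the correct one.
example_5 = [
    [("xue", "qi"), ("zhong",), ("xue", "sheng")],   # xueqi zhong xuesheng
    [("xue", "qi"), ("zhong", "xue", "sheng")],      # xueqi zhongxuesheng
    [("xue",), ("qi", "zhong"), ("xue", "sheng")],   # xue qizhong xuesheng
]

for t in example_5:
    label = " ".join("".join(word) for word in t)
    print(label, f"AWL = {float(average_word_length(t)):.2f}")

chosen = gmm_choice(example_5)
print("GMM picks:", " ".join("".join(word) for word in chosen))
```

For example (5) the sketch selects xueqi zhongxuesheng, the tokenization with the largest AWL, rather than the correct xueqi zhong xuesheng, which is the kind of error this section documents.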
This section lists examples of errors resulting from at least one of the following heuristics: FMM, AWF, and MI. For these heuristics to be applied, there must be a tie in AWL after the application of GMM. The three heuristics were applied and evaluated independently, as described in Chapter 8. Since each heuristic could either succeed or fail, there are eight possible outcome combinations. Excluding the case where all three heuristics succeed, there are seven combinations in which at least one heuristic fails to pick the correct tokenization.
Each sub-section lists examples of errors for a particular outcome combination, and the sub-section heading denotes that combination. Heuristics marked with a "(+)" sign succeeded in picking the correct tokenization, and those marked with a "(-)" sign failed. The FMM score ranges from 1 to the number of competing tokenizations, and the FMM heuristic chooses the tokenization with the highest FMM score. To make them easier to read, the averages of logarithmically transformed word frequencies (AWF) are scaled up by a factor of 10^6, and the sums of mutual information for characters (MI) are scaled up by a factor of 10^9.
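The outcome combinations described above can be illustrated with a small Python sketch. It assumes that a heuristic "succeeds" when the correct tokenization has the strictly highest score for that heuristic; treating a tied top score as a failure is an assumption made here, not something stated in the text. The names `picks_correct` and `outcome_pattern` and the dictionary layout are hypothetical.

```python
from itertools import product

HEURISTICS = ("FMM", "AWF", "MI")


def picks_correct(scores, correct_index=0):
    """True if the correct tokenization has the strictly highest score."""
    best = max(scores)
    return scores[correct_index] == best and scores.count(best) == 1


def outcome_pattern(candidates, correct_index=0):
    """Build a label such as 'FMM(+) AWF(+) MI(-)' for one critical fragment."""
    parts = []
    for name in HEURISTICS:
        scores = [candidate[name] for candidate in candidates]
        sign = "+" if picks_correct(scores, correct_index) else "-"
        parts.append(f"{name}({sign})")
    return " ".join(parts)


# Example (6): waiguo xue (correct) versus wai guoxue, scores as listed.
example_6 = [
    {"FMM": 2, "AWF": 9_435_103, "MI": -2_692_968},  # waiguo xue
    {"FMM": 1, "AWF": 7_554_792, "MI": 152_131},     # wai guoxue
]
print(outcome_pattern(example_6))

# The seven combinations in which at least one heuristic fails.
failing = [p for p in product((True, False), repeat=3) if not all(p)]
for pattern in failing:
    print(" ".join(f"{name}({'+' if ok else '-'})"
                   for name, ok in zip(HEURISTICS, pattern)))
```

For example (6) the label comes out as FMM(+) AWF(+) MI(-): the correct tokenization has the higher FMM and AWF scores but the lower MI sum.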
FMM(+) AWF(+) MI(-)

(6) wai guo xue "outside-nation-learning"
    1. waiguo xue "foreign country-learning"; FMM = 2; AWF = 9,435,103; MI = -2,692,968
    2. *wai guoxue "outside-studies of ancient Chinese civilization"; FMM = 1; AWF = 7,554,792; MI = 152,131

(7) chang di zu "field-land-rent"
    1. changdi zu "place-rent"; FMM = 2; AWF = 7,714,539; MI = -2,533,996
    2. *chang dizu "field-land rent"; FMM = 1; AWF = 6,563,432; MI = -1,024,677

(8) lai zi jia ren "come-self-family-people"
    1. laizi jiaren "come from-family members"; FMM = 2; AWF = 9,166,129; MI = -4,998,085
    2. *lai zijiaren "come-people on our side"; FMM = 2; AWF = 8,333,056; MI = 1,253,631

(9) chou bei chu yu "prepare-prepare-place-in/at"
    1. choubeichu yu "preparatory office-in/at"; FMM = 2; AWF = 9,986,222; MI = 2,600,155
    2. *choubei chuyu "prepare-be (in a certain condition)"; FMM = 1; AWF = 7,514,168; MI = 3,766,979
FMM(+) AWF(-) MI(+)

(10) bo chang duan "wave-long-short"
    1. bochang duan "wavelength-short"; FMM = 2; AWF = 6,462,777; MI = 1,007,424
    2. *bo changduan "wave-length"; FMM = 1; AWF = 7,141,182; MI = -1,851,483

(11) shuo fa ze "speak-law-rule/in that case"
    1. shuofa ze "statement-in that case"; FMM = 2; AWF = 10,683,251; MI = 1,109,060
    2. *shuo faze "speak-standard method"; FMM = 1; AWF = 10,918,540; MI = 330,736

(12) zuo wei shen me "make-do-what-suffix for interrogatives and adverbs"
    1. zuowei shenme "serve as-what"; FMM = 2; AWF = 10,929,141; MI = 2,849,043
    2. *zuo weishenme "make-why"; FMM = 1; AWF = 11,817,185; MI = 2,841,067

(13) bi xia gong fu "pen-down-attack-husband"
    1. bixia gongfu "ability to write-skill"; FMM = 2; AWF = 6,178,638; MI = 2,043,163
    2. *bi xiagongfu "pen-put in time and energy"; FMM = 1; AWF = 6,272,119; MI = 253,391
FMM(+) AWF(-) MI(-)

(14) di zhu yao "land-master-want"
    1. dizhu yao "landlord-want"; FMM = 2; AWF = 10,676,433; MI = 121,782
    2. *di zhuyao "land-main"; FMM = 1; AWF = 12,134,349; MI = 1,613,268

(15) xie xia shan "write-down-mountain"
    1. xiexia shan "write down-mountain"; FMM = 2; AWF = 8,109,442; MI = -1,151,904
    2. *xie xiashan "write-descend hill"; FMM = 1; AWF = 8,366,070; MI = 446,429

(16) bao zhuang he zhuang "wrap-load-box-load"
    1. baozhuanghe zhuang "package box-load"; FMM = 2; AWF = 4,438,257; MI = 285,822
    2. *baozhuang hezhuang "pack-boxed"; FMM = 1; AWF = 4,628,693; MI = 1,579,818

(17) hua dong hai an "flower-east-sea-shore"
    1. Huadong haian "Huadong-coast"; FMM = 2; AWF = 5,208,926; MI = 1,810,962
    2. *hua donghai'an "flower-east coast"; FMM = 1; AWF = 7,168,357; MI = 2,614,353
FMM(-) AWF(+) MI(+)

(18) ke ai qing "but/may-love-affection"
    1. ke aiqing "but-love"; FMM = 1; AWF = 10,507,047; MI = 216,744
    2. *ke'ai qing "lovely-affection"; FMM = 2; AWF = 8,392,087; MI = -2,811,633

(19) cai mi yu "guess-riddle-language"
    1. cai miyu "guess-riddle"; FMM = 1; AWF = 5,810,796; MI = 1,561,359
    2. *caimi yu "guess riddle-language"; FMM = 2; AWF = 5,435,182; MI = -3,751,759

(20) tai yang guang xian "too-sun-light-string"
    1. taiyang guangxian "sun-ray"; FMM = 1; AWF = 7,825,861; MI = 1,608,593
    2. *taiyangguang xian "sunlight-string"; FMM = 2; AWF = 5,798,094; MI = 413,021

(21) diao cha biao shi "transfer-inspect-form/indicate-indicate"
    1. diaocha biaoshi "investigate-indicate"; FMM = 1; AWF = 8,454,764; MI = 3,309,182
    2. *diaochabiao shi "questionnaire-indicate"; FMM = 2; AWF = 6,437,682; MI = -2,595,684
FMM(-) AWF(+) MI(-)

(22) yuan zuo zhe "original-writings-nominal suffix"
    1. yuan zuozhe "original-author"; FMM = 1; AWF = 8,933,609; MI = -1,219,412
    2. *yuanzuo zhe "original work-nominal suffix"; FMM = 2; AWF = 8,259,210; MI = 2,718,467

(23) na shou qiang "to take-hand-gun"
    1. na shouqiang "to take-pistol"; FMM = 1; AWF = 8,251,819; MI = 628,786
    2. *nashou qiang "good at-gun"; FMM = 2; AWF = 6,235,329; MI = 854,392

(24) zi da du hui "from-large-metropolis-meeting"
    1. zi daduhui "from-metropolis"; FMM = 1; AWF = 7,987,699; MI = -3,808,184
    2. *zida duhui "arrogant-metropolis"; FMM = 2; AWF = 5,681,793; MI = -1,258,353

(25) dang ri ben ren "undertake-day-foundation-people"
    1. dang ribenren "when-Japanese"; FMM = 1; AWF = 10,034,895; MI = 383,639
    2. *dangri benren "the same day-oneself"; FMM = 2; AWF = 6,935,471; MI = 602,685
FMM(-) AWF(-) MI(+)

(26) yi ding zhi "one-fixed-value"
    1. yi dingzhi "one-constant"; FMM = 1; AWF = 7,966,095; MI = 149,282
    2. *yiding zhi "must-value"; FMM = 2; AWF = 9,364,571; MI = -1,183,193

(27) you xiao yong "have-effect-use"
    1. you xiaoyong "have-effectiveness"; FMM = 1; AWF = 10,637,599; MI = 2,639,721
    2. *youxiao yong "effective-use"; FMM = 2; AWF = 10,969,295; MI = -1,216,556

(28) yi lan xian min "suitable-orchid-county-people"
    1. Yilan xianmin "Yilan-county resident"; FMM = 1; AWF = 6,328,658; MI = 905,642
    2. *Yilanxian min "Yilan county-people"; FMM = 2; AWF = 6,738,372; MI = -2,393,206

(29) wu li xue hui "matter-law-learning-meeting"
    1. wuli xuehui "physics-association"; FMM = 1; AWF = 8,121,930; MI = 1,182,448
    2. *wulixue hui "physics-meeting"; FMM = 2; AWF = 9,168,069; MI = 295,382
FMM(-) AWF(-) MI(-)

(30) ren kou cai "people-mouth-talent"
    1. ren koucai "people-eloquence"; FMM = 1; AWF = 9,888,140; MI = -1,483,541
    2. *renkou cai "population-talent"; FMM = 2; AWF = 11,184,133; MI = 1,770,649

(31) lao shi fu "old-teacher-father"
    1. lao shifu "old-master"; FMM = 1; AWF = 8,710,946; MI = -3,404,322
    2. *laoshi fu "teacher-father"; FMM = 2; AWF = 9,121,112; MI = -670,914

(32) te shu xing neng "special-unique-character-ability"
    1. teshu xingneng "special-capability"; FMM = 1; AWF = 8,771,001; MI = 1,639,140
    2. *teshuxing neng "specificity-ability"; FMM = 2; AWF = 9,084,085; MI = 2,641,945

(33) ting che chang di "stop-car-field-land"
    1. tingche changdi "parking-place"; FMM = 1; AWF = 7,565,084; MI = -493,641
    2. *tingchechang di "parking lot-land"; FMM = 2; AWF = 10,108,360; MI = 2,995,229
© Copyright by Chih-Hao Tsai, 2001