Tsai, C.-H. (2001). Word identification and eye movements in reading Chinese: A modeling approach. Doctoral dissertation, University of Illinois at Urbana-Champaign.
Table A1
Distribution of Word Lengths
Length | Unique words | Percentage of unique words | Word tokens | Percentage of word tokens | Percentage of character tokens |
---|---|---|---|---|---|
1 | 3,863 | 2.94 | 2,242,590 | 46.18 | 28.50 |
2 | 66,785 | 50.74 | 2,291,738 | 47.19 | 58.26 |
3 | 45,381 | 34.48 | 258,173 | 5.32 | 9.84 |
4 | 12,297 | 9.34 | 55,922 | 1.15 | 2.84 |
5 | 1,878 | 1.43 | 5,563 | 0.12 | 0.36 |
6 | 698 | 0.53 | 1,411 | 0.03 | 0.11 |
7 | 385 | 0.29 | 481 | 0.01 | 0.04 |
8 | 174 | 0.13 | 200 | < 0.01 | 0.02 |
9 | 90 | 0.07 | 106 | < 0.01 | 0.01 |
10 | 26 | 0.02 | 28 | < 0.01 | < 0.01 |
11 | 3 | 0.01 | 13 | < 0.01 | < 0.01 |
12 | 11 | 0.01 | 12 | < 0.01 | < 0.01 |
13 | 7 | 0.01 | 8 | < 0.01 | < 0.01 |
14 | 7 | 0.01 | 7 | < 0.01 | < 0.01 |
15 | 1 | < 0.01 | 1 | < 0.01 | < 0.01 |
Note. Percentage of character tokens = percentage of all character tokens in the corpus that occur in words of the given length.
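
The character-token column can be approximately recovered from the word-token column. The sketch below is an illustration, not the author's procedure. It assumes that a word of length n contributes n character tokens each time it occurs, and that the corpus character total equals the sum of these contributions. The names `word_tokens` and `character_token_percentages` are hypothetical.

```python
# Word-token counts by word length (in characters), copied from Table A1.
word_tokens = {
    1: 2_242_590, 2: 2_291_738, 3: 258_173, 4: 55_922, 5: 5_563,
    6: 1_411, 7: 481, 8: 200, 9: 106, 10: 28,
    11: 13, 12: 12, 13: 8, 14: 7, 15: 1,
}


def character_token_percentages(tokens_by_length):
    """Return, for each word length, the share (in percent) of all
    character tokens that occur in words of that length."""
    # Assumption: each occurrence of a length-n word adds n character tokens.
    char_tokens = {n: n * count for n, count in tokens_by_length.items()}
    total = sum(char_tokens.values())
    return {n: 100 * c / total for n, c in char_tokens.items()}


percentages = character_token_percentages(word_tokens)
for length in sorted(percentages):
    print(f"{length:>2}  {percentages[length]:.2f}")
```

Under these assumptions the computed values should agree with the table to within about 0.01 percentage points. A difference in the last digit may reflect rounding or a slightly different character total in the original computation.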
© Copyright by Chih-Hao Tsai, 2001