As a small private research project, I calculated the frequency of words and expressions in a collection of English text. Here are the results.


AA.txt (4177 entries) - list of single words


BB.txt (4278 entries) - list of expressions of two or more words


CC.txt (3667 entries) - list of expressions of three or more words


The English text was collected at random from various sources. It is not very balanced and not very clean, but it is sufficient for my research (430,000 space-separated tokens).
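
For anyone who wants to try something similar, here is a rough sketch of how such counts could be made in Python. This is not the exact procedure used for the lists above; the function name `count_ngrams`, the lowercasing step, and the sample sentence are all assumptions for illustration. It splits the text on whitespace and counts every run of n consecutive tokens.

```python
from collections import Counter


def count_ngrams(text, n):
    """Count every run of n consecutive whitespace-separated tokens.

    Hypothetical helper: lowercasing is an assumption, not something
    the lists above are known to apply.
    """
    tokens = text.lower().split()
    # Slide a window of n tokens across the text and join each window.
    grams = (" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return Counter(grams)


# Made-up sample text, only to show the calls.
sample = "the cat sat on the mat and the cat slept"
words = count_ngrams(sample, 1)   # single words
pairs = count_ngrams(sample, 2)   # two-word expressions

print(words.most_common(2))
print(pairs.most_common(1))
```

Note that this counts expressions of exactly n words. A "two or more words" list would need the counts for several values of n merged together, plus some cutoff to drop rare entries.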


Just for fun.


Posted by nomota multilingual
