Articles on Learning Languages by Reading
Learn Language Through Reading: https://linguapath.com/learn-language-through-reading/
Love this article. It includes info from Paul Nation’s research.
I refer to Paul Nation’s 2014 corpus-based study, in which he calculated the amount of input necessary to learn the most frequent 9000 words. Here are his findings:
Let’s imagine that you already know the first 2000 most frequent words. In this case, you would have to read just about 300 000 words (or 3 books) to pick up another thousand words at the 3000-level.
For a rough estimate: 120,000 words is roughly 240 pages in English. This will equal one “token” from the table above.
For comparison (see this article: https://chinese.stackexchange.com/questions/1685/for-magazines-or-books-how-much-does-text-volume-differ-when-comparing-chines): “An average result will be that 1,000 Chinese characters can be translated into about 600-700 English words, or 1,000 English words can be translated into about 1,500-1,700 Chinese characters, varying depending on the natures of the source contents and target writing styles.” (This is what they referenced: https://www.actranslation.com/chinese/chinese-wordcount.htm.)
So 120,000 English words divided by 0.6-0.7 English words per character is roughly 171,000-200,000 Chinese characters per ‘token.’
Or, going the other way: “each 1000 English words will be translated into about 1500-1700 Chinese characters” (https://www.actranslation.com/chinese/chinese-wordcount.htm).
So 120,000 English words will mean 120,000*1.6 = 192,000 Chinese characters per ‘token.’
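The conversion above can be sketched in a few lines. This is only a rough illustration: the numbers are the ones quoted in the text, and the function and constant names are made up for the example.

```python
# Figures come from the text above; all names here are hypothetical.
ENGLISH_WORDS_PER_TOKEN = 120_000          # one 'token' (one book) in English words
CHARS_PER_1000_WORDS_RANGE = (1500, 1700)  # range quoted from actranslation.com
CHARS_PER_1000_WORDS_MID = 1600            # midpoint, i.e. the 1.6 factor used in the text


def english_words_to_chinese_chars(words, chars_per_1000_words=CHARS_PER_1000_WORDS_MID):
    """Approximate the Chinese character count for a given English word count."""
    return words * chars_per_1000_words // 1000


low_chars, high_chars = (
    english_words_to_chinese_chars(ENGLISH_WORDS_PER_TOKEN, rate)
    for rate in CHARS_PER_1000_WORDS_RANGE
)
chars_per_token = english_words_to_chinese_chars(ENGLISH_WORDS_PER_TOKEN)

print(f"one token is roughly {low_chars:,} to {high_chars:,} characters")
print(f"midpoint estimate: {chars_per_token:,} characters")
```

The midpoint works out to the same 192,000 characters per ‘token’ used in the rest of these notes; the low and high ends show how much the estimate could move depending on the translation ratio.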
So, if I know the word count or character count of a novel, I can estimate how many novels/characters it will take to get to X level.
(I’ve read 69,315 characters of Guardian so far; that’s the length of the first arc.)
163,913 characters as of sundial arc.
53 chapters of Guardian is ~191,000 characters.
54 chapters of Guardian is ~195,000 characters.
One chapter is on average ~3600 characters.
So since Guardian has about twice that number of chapters (106 plus extras), the whole novel would be ‘2 tokens’ and count as that much.
So I would estimate that, at least at priest’s novel length, every 53-54 chapters or so would count as 1 token on the table above.
For future reference, that’s about 20-21 pages per chapter in Pleco for on average ~3600 characters.
I’d guess a 10-page chapter in Pleco to be half that number of characters, 1800 characters. So for those ‘shorter’ chaptered novels, I would need ~106 chapters to reach ‘1 token’ (192,000/1800).
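As a quick check on the chapter math, here is a small sketch. It assumes the 192,000-characters-per-token figure and the two rough chapter sizes from the text; the names are just for illustration.

```python
CHARS_PER_TOKEN = 192_000  # from the 120,000 * 1.6 estimate in the text

# Rough chapter sizes from the text (the Pleco page counts are personal observations).
CHARS_PER_LONG_CHAPTER = 3600   # ~20-21 Pleco pages
CHARS_PER_SHORT_CHAPTER = 1800  # ~10 Pleco pages


def chapters_per_token(chars_per_chapter, chars_per_token=CHARS_PER_TOKEN):
    """Number of chapters (possibly fractional) that add up to one 'token'."""
    return chars_per_token / chars_per_chapter


long_chapters = chapters_per_token(CHARS_PER_LONG_CHAPTER)
short_chapters = chapters_per_token(CHARS_PER_SHORT_CHAPTER)
print(f"{long_chapters:.1f} long chapters or {short_chapters:.1f} short chapters per token")
```

Any other chapter length can be plugged into `chapters_per_token` the same way, for novels that fall between these two sizes.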
Out of curiosity, I’ve read 66 chapters of hanshe at 10 pleco pages each, 66*1800= 118,800 characters.
Plus how much I’ve read of Guardian, 69,315+118,800= 188,115
Plus Xiao Wangzi and the chapters of Tamendegushi I’ve read: 14,718 (小王子) and 19*1800 = 34,200 (他们的故事).
188115+14718+34200= 237,033 is at least how many chinese characters I’ve read so far (I’ve also read a few chapters of various things but didn’t count them well).
237,033/192,000= ~1.23 ‘tokens.’
And, if I want to go easy on myself? I literally re-read the sundial arc a second and third time ToT (because I really love Guardian).
So I may well be at:
Read sundial arc twice: 188115+14718+34200+69315 = 306,348, or ~1.60 tokens.
Read sundial arc three times: 188115+14718+34200+(69315*2)=375,663 or 1.96 tokens.
***So I might be in the ‘learning the 3000 most common words’ section, which is going to take 576,000 more characters to get through (3 tokens at 192,000 characters each). That is 160 more priest-novel 20-Pleco-page chapters, or 320 more 10-Pleco-page chapters.
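The running tally above is easy to keep as a tiny script. This is a minimal sketch using the counts from the text; the dictionary layout and names are my own, and the per-chapter sizes are the same rough guesses as before.

```python
CHARS_PER_TOKEN = 192_000   # characters per 'token', from the text
SUNDIAL_ARC_CHARS = 69_315  # length of the arc that was re-read

# Characters read so far, using the counts and estimates from the text.
reading_log = {
    "Guardian (first arc)": SUNDIAL_ARC_CHARS,
    "hanshe (66 chapters x ~1800)": 66 * 1800,
    "Xiao Wangzi": 14_718,
    "Tamendegushi (19 chapters x ~1800)": 19 * 1800,
}


def tokens(chars):
    """Convert a character count into 'tokens' (book-equivalents)."""
    return chars / CHARS_PER_TOKEN


base_total = sum(reading_log.values())

# Count the re-reads of the arc as extra input: 0, 1, or 2 additional passes.
for extra_passes in range(3):
    total = base_total + extra_passes * SUNDIAL_ARC_CHARS
    print(f"{extra_passes} re-reads: {total:,} characters = {tokens(total):.2f} tokens")
```

Adding a new book is just one more entry in `reading_log`. Whether re-reads should count at full value is an open question, so the loop simply shows all three cases.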
Qi Ye ~79 chapters (~284,400 characters, 1.48 tokens), Tian Ya Ke ~81 chapters (~291,600 characters, 1.52 tokens), Mo Du ~185 chapters (~666,000 characters, 3.47 tokens).
So: Qi Ye + Tian Ya Ke = finishing the 3000-level (or just reading Mo Du on its own).
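To see which novels would cover the 3000-level, something like the following works. It assumes every chapter is ~3600 characters (the Guardian average), which may not hold for these novels, and the chapter counts are the approximate ones listed above.

```python
CHARS_PER_TOKEN = 192_000
CHARS_PER_CHAPTER = 3600                # assumption: Guardian-like chapter length
LEVEL_3000_CHARS = 3 * CHARS_PER_TOKEN  # 3 'books' for the 3000-level, per the article

# Approximate chapter counts from the text.
novel_chapters = {"Qi Ye": 79, "Tian Ya Ke": 81, "Mo Du": 185}

novel_chars = {title: n * CHARS_PER_CHAPTER for title, n in novel_chapters.items()}

for title, chars in novel_chars.items():
    print(f"{title}: ~{chars:,} characters = {chars / CHARS_PER_TOKEN:.2f} tokens")

pair_total = novel_chars["Qi Ye"] + novel_chars["Tian Ya Ke"]
print("Qi Ye + Tian Ya Ke covers the level:", pair_total >= LEVEL_3000_CHARS)
print("Mo Du alone covers the level:", novel_chars["Mo Du"] >= LEVEL_3000_CHARS)
```

With these estimates the two shorter novels together land exactly on 576,000 characters, so a slightly shorter real chapter length would leave the pair a bit under the target.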
The article summarizes what it takes to learn all 9000 common words in a language from reading: “This equals to something like 92 books. With the reading speed of 200 words per minutes (which is comfortable enough: I tend to read English at 295 wpm, and English is far from being my native language), it would take you 150 minutes a day (2.5 hours) to process all this beauty in a year.”
So I am guessing, if I read 150-200 books (more, unfortunately, but I read a lot of much shorter books when I push into extensive reading) at 30 min to 1.5 hours a day on average, I may get to that point in, idk, 3 years?
Any activity you do in your target language will contribute to that cherished number of 11 million tokens you need to upload to your brain. You may listen to podcasts, watch movies and YouTube, talk to strangers, read magazines, play games… When it comes to vocabulary acquisition, anything goes, seriously.
Diversify your input, and you won’t ever feel bored. But keep in mind that spoken input is often way slower than written. Read a book for two hours and you will process approximately 24 000 words. Spend the same two hours watching a movie and you will get just about 10 000 words (with a typical speech rate for movies of 83 words per minute).
So an hour of reading (at speaking speed) is ~12,000 words, 30 minutes is 6,000 words, 15 minutes is 3,000 words (that tracks as Guardian chapters take me 15-20 minutes to read being 3000-4000 characters and I read at around speaking speed).
An hour of a movie is ~5,000 words. A drama episode is 40 min, 1/3 of a 120-minute movie, so idk, probably on average ~3,333 words per drama episode (watched in the target language, obviously).
I’m not counting how much time I spend on those, because who knows, and I obviously spent lots of time listening to the Guardian audiobook (though Idk if this person’s article thinks ‘repeated reading/listening of a single material’ counts as multiple things or not).
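For anyone who does want to count input by time spent, here is a rough sketch. It assumes the article's two rates (200 words per minute for reading, 83 words per minute for movie speech); the function name is made up.

```python
READING_WPM = 200  # reading speed used in the article
MOVIE_WPM = 83     # typical movie speech rate quoted in the article


def words_processed(minutes, words_per_minute):
    """Estimate how many words of input a session of the given length provides."""
    return minutes * words_per_minute


for minutes in (15, 30, 60, 120):
    print(f"reading {minutes} min: ~{words_processed(minutes, READING_WPM):,} words")

print(f"120-minute movie: ~{words_processed(120, MOVIE_WPM):,} words")
print(f"40-minute drama episode: ~{words_processed(40, MOVIE_WPM):,} words")
```

The movie figure comes out at 9,960 words rather than a round 10,000, and a 40-minute episode at 3,320, which is close enough to the ~3,333 estimate above.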
The article suggests: Instead, aim to cover, say, 1000-5000 levels in the first year, 6000-7000 in the next, and gradually work your way through 8000-9000 levels during the following 18 months. This way, you’ll consistently read 20-30 foreign language books a year, which is more than manageable.
(That’s 15 Chinese books in the first year, or 15*192,000 = 2,880,000 characters... I feel like I’ve heard of a book easily that long in Chinese... anyway, that would be 15 ‘chunks’ of 54 chapters at 20 Pleco pages, or 15 ‘chunks’ of 106 chapters at 10 Pleco pages... which, for typical Chinese novel lengths of ~100 chapters, would mean more like ~7.5 actual novels read in total, and possibly fewer if the novels are significantly longer.)
Another interesting article: https://puroh.it/reading-for-a-fine-vocabulary/
Did a lot of these calculations for me (too bad I didn’t read it first).
They said that at about 11.6 million words you will have learned a 35,000-word English vocabulary. So 11,600,000*1.6 (the English-words-to-Chinese-characters scale) = 18,560,000 Chinese characters, about 18.5 million.
That is ~96 ‘chunks’ of 54 chapters of 20 Pleco pages (or ~5155 chapters to reach all 18.5 million characters). Because Chinese web novels are often 100+ chapters, it would be around ~48 novels at 100-ish chapters each.
So to get back on track: this article estimates 11.6 million English words to be 137 novels. For Chinese, the equivalent 18.5 million characters is either ~96 chunks (of 54 long chapters each) or ~48 novels (100-ish chapter novels).
48/137 = 0.35, so you need to read only about 0.35 times as many ‘novels’ as you’d need to read in English to reach the same level.
In this article, it says 100 books in english would get you to 25,000 words learned. So that would be 100*.35 = 35 (100 chapter) novels read in chinese to get to the same level maybe? And 80 books in english to get to 20,000 words learned, which is 28 (100 chapter) novels read in chinese.
I would guess, based on the two articles combined, yeah, 28-35 novels read would probably get you to a comfortable reading level, possibly fewer novels than that.
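The scaling from English books to Chinese novels can be written out like this. It is only a sketch of the estimate above: it assumes the 1.6 conversion factor, 192,000 characters per ‘chunk,’ and that a ~100-chapter novel with long chapters is worth 2 chunks (as with Guardian). The names are hypothetical.

```python
ENGLISH_WORDS_TARGET = 11_600_000  # words for a ~35,000-word vocabulary, per the article
ENGLISH_NOVELS_TARGET = 137        # the article's equivalent in English novels
CHARS_PER_1000_WORDS = 1600        # the 1.6 English-to-Chinese factor
CHARS_PER_TOKEN = 192_000          # one 'chunk' of ~54 long chapters
TOKENS_PER_NOVEL = 2               # assumption: a ~100-chapter novel, like Guardian

total_chars = ENGLISH_WORDS_TARGET * CHARS_PER_1000_WORDS // 1000
chunks = total_chars // CHARS_PER_TOKEN       # whole chunks, rounded down
chinese_novels = chunks // TOKENS_PER_NOVEL   # ~100-chapter novels
ratio = chinese_novels / ENGLISH_NOVELS_TARGET


def english_books_to_chinese_novels(english_books):
    """Scale a number of English books to ~100-chapter Chinese novels."""
    return round(english_books * ratio)


print(f"{total_chars:,} characters = ~{chunks} chunks = ~{chinese_novels} novels")
print(f"ratio: {ratio:.2f}")
for books in (80, 100):
    print(f"{books} English books -> ~{english_books_to_chinese_novels(books)} Chinese novels")
```

The ratio depends heavily on the 2-chunks-per-novel assumption; novels with shorter chapters would push the count up toward the ~96 figure.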
Things I want to read: hanshe, guardian, modu, sha po lang, can ci pin, poyun, liuli, qi ye, tian ya ke, xin xiao shi yi lang, dao mu bi ji 1-9, dmbj ten years, peach blossom debt, san ye.... that’s 22 novels by name (likely more if i were to count in 100-chapter segments), plus some other novels and fanfic... i think if i just keep following my plan of reading books i want to read i will eventually hit those benchmarks.
Anyway, those articles above are interesting if you’re curious just how many books it may take you to read (or words) to hit certain levels of reading skill in a language.