Enjoyers of this concept would probably like this wonderful talk about programming language design by Guy Steele (Sun, Java Language): Growing a Language
It is so inspiring. Recently, I've been thinking of making a side project using LLMs for learning new languages too. Transformers were originally designed for machine translation, and now we have much better ones. My idea is to write a mobile app, which is something I have zero experience with.
Even for Chinese people, Journey to the West is a somewhat difficult text because it belongs to classical literature. Using some children's books published in recent years, and progressing gradually, might be a better approach?
I’m trying to learn to speak Chinese and not read it yet. The issue is most of the language learning apps have a focus on characters. I feel like I just want to see the pinyin. Maybe I don’t know what I need, but I haven’t found the right tool.
There's a language learning method where you just listen to audio, until you develop a basic familiarity with the language. (Then learn reading and writing later.)
You listen to audio you don't understand yet, and over time your brain begins to pick up the patterns. It takes a lot of time but you can do it in the background, because that processing happens subconsciously. So you can get that time "for free".
But he got it from linguist Stephen Krashen and his Input Hypothesis of language acquisition (i.e. that the way babies and kids learn languages, through osmosis, works for adults too).
I think the ideal solution is somewhere in the middle, starting with something like Pimsleur which is the same idea (audio and repetition) but more structured and focused, to give you that "seed" of vocabulary and grammar, before you flesh it out with the "long tail" of the language.
To add a bit more to this: AJATT (all Japanese all the time) later evolved into MIA (mass input approach), which then became Refold.
The gist of those methods is mass input + create SRS cards for sentences where only one word or grammar pattern is unfamiliar to you.
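For anyone who wants to automate the "only one unfamiliar word" filter, here is a minimal sketch in Python. It is not how AJATT, MIA, or Refold tooling actually works; the function name `mine_i_plus_one` and the sample data are made up for illustration. It also assumes the sentences are already split into words, which for Chinese or Japanese means running a segmenter first, and it only handles unknown words, not unknown grammar patterns.

```python
def mine_i_plus_one(sentences, known_words):
    """Return (sentence, new_word) pairs for sentences with exactly one unknown word.

    sentences: list of token lists (assumed to be pre-segmented).
    known_words: set of words the learner already knows (hypothetical input).
    """
    cards = []
    for tokens in sentences:
        # Collect the distinct words in this sentence that the learner does not know.
        missing = {token for token in tokens if token not in known_words}
        if len(missing) == 1:
            cards.append((" ".join(tokens), next(iter(missing))))
    return cards


# Illustrative data only.
known = {"我", "喜欢", "喝", "茶"}
sentences = [
    ["我", "喜欢", "喝", "咖啡"],  # one unknown word: a card candidate
    ["他", "不", "喝", "茶"],      # two unknown words: skipped
    ["我", "喝", "茶"],            # nothing new: skipped
]

for sentence, new_word in mine_i_plus_one(sentences, known):
    print(f"{sentence}  (new: {new_word})")
```

Sentences with zero unknown words are skipped because they teach nothing new, and sentences with two or more are skipped because they are too hard to be a clean card. Every time you learn a card you would add its word to `known` and run the filter again.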
A similar but more relaxed approach is ALG (automatic language growth), where you start from very basic input with lots of visual aids and let the language “wash over you”: no taking notes, no creating flashcards, no dictionary lookups. Sounds crazy, but it works for a lot of people. It’s the method behind Dreaming Spanish, which was inspired by the teaching method at the AUA language school in Bangkok, where Dr. J Marvin Brown used Stephen Krashen’s ideas to create a Natural Approach course to teach foreigners Thai from zero to fluency.
Thanks! I think getting comfortable with characters fairly early is important, as it helps shift your mindset into the right place. That said, I don’t think this project really works until you’re comfortable with at least ~60 characters.
I recently changed all my language flashcards to be like this. Anki is probably the best option. I have the field with the Hanzi, but I just configure my cards not to show it at the moment, so I break the habit of translating everything to characters in my head when I'm trying to listen. It's worked well, and the characters will be there when I decide to do something with them again.
Cool idea! You mentioned the model struggling with Chinese a bit. Have you tried any Chinese models, e.g. DeepSeek or GLM? I imagine they probably have a lot more Chinese in the pretraining. (And their English is certainly fine too!)
https://youtu.be/_ahvzDzKdB0
More: https://triviumpursuit.com/childrens-books-in-words-of-one-s...
My first concerns though:
1. How can the system know which words I already know?
2. To what degree will I misunderstand the meaning of words?
3. Somewhat related to 2, how inaccurate will the descriptions / explanations of words be?
I learned it from this guy https://alljapanesealltheti.me/index.html