The article does a decent job of explaining how Chinese characters work, but it falls short of explaining why.
The reason Chinese continues to use a logographic writing system is both tradition and practicality. English has grossly grouped together Chinese as one unified language, when in actuality it is not. In fact, many "dialects" are mutually unintelligible--one speaker cannot understand another speaker. If all of China switched to using a phonetic writing system, everyone would write everything differently. It'd be very difficult--impossible at some points--to read and write materials from other "dialects". However, with a logographic approach, everyone can understand that the character 工 means "work" even if I pronounce it like [wirk] and someone else pronounces it like [wak], for example. It's one of the reasons why subtitles are so prevalent in Chinese media. Obviously, this problem can be eliminated by eliminating individual "dialects", which is sort of promoted through the adoption of Mandarin Chinese. Much Chinese media is also dubbed in the standard dialect so that actors with regional dialects can be understood.
As for Chinese characters in other languages, Japanese becomes a lot easier to read with the addition of Chinese characters. Kanji allows sentences to be shorter, less ambiguous, and easier to parse. Unlike Chinese, each character is not just a single syllable, and there are many homonyms in Japanese because there's a smaller set of sounds.
As far as I know it's not English or any Western entity that has grouped the Chinese languages together as one, but the Chinese government, for political reasons. Western linguists recognize the variants of "Chinese" as different languages.
There's a saying in linguistics that "a language is a dialect with an army and a navy" [0]. Linguists recognize that where you draw boundaries between languages is essentially arbitrary, even more so than boundaries between biological species. It tends to be that a language is a language if and only if some sovereign state declares it to be so, otherwise it's a dialect.
This is also how you get Portuguese as a distinct language from Spanish even though the two are more mutually intelligible than Scots (a "dialect") and American English are. Portugal has the sovereign government to back up its claim to having its own language where Scotland does not.
I like the expression, but the example of Portuguese/Spanish is absurd IMO. As a Portuguese speaker, the amount of effort required to communicate with Spanish speakers is very, very high, to the point where I avoid trying at all costs. Here in Texas, it is almost always more effective for my family to communicate with Spanish speakers using very broken English and hand gestures on both sides than trying to get any Portuguese-Spanish mutual intelligibility to work.
But the comparison was to Scots, which is (sometimes, not universally) considered a dialect of English rather than a separate language, but is hard for standard English speakers to understand. It's not just English with a Scottish accent. I have no idea how Portuguese feels to Spanish speakers or vice versa, but here's an example of modern Scots from Wikipedia. I'm curious how it compares.
> Noo the nativitie o' Jesus Christ was this gate: whan his mither Mary was mairry't till Joseph, 'or they cam thegither, she was fund wi' bairn o' the Holie Spirit.
> Than her guidman, Joseph, bein an upricht man, and no desirin her name sud be i' the mooth o' the public, was ettlin to pit her awa' hidlins.
> But as he had thir things in his mind, see! an Angel o' the Lord appear't to him by a dream, sayin, "Joseph, son o' Dauvid, binna feared to tak till ye yere wife, Mary; for that whilk is begotten in her is by the Holie Spirit.
> "And she sall bring forth a son, and ye sal ca' his name Jesus; for he sal save his folk frae their sins."
> Noo, a' this was dune, that it micht come to pass what was said by the Lord throwe the prophet,
> "Tak tent! a maiden sal be wi' bairn, and sal bring forth a son; and they wull ca' his name Emmanuel," whilk is translatit, "God wi' us."
> Sae Joseph, comin oot o' his sleep, did as the Angel had bidden him, and took till him his wife.
> And leev'd in continence wi' her till she had brocht forth her firstborn son; and ca'd his name Jesus.
As a native speaker of English and having conversational ability in Spanish I would describe both Scots and Portuguese as separate languages. Portuguese feels like it has as much in common with Spanish as Italian or French to me, and I can't remotely carry on a conversation in Portuguese. (Or Scots really, though with the somewhat mutual intelligibility I can speak English or Spanish and maybe that's workable, but I'm definitely not going to understand the Portuguese.)
I'm a fluent second-language Spanish speaker and have had a lot of success communicating with native Portuguese speakers, but only Portugal Portuguese and various African Portugueses. I can't understand Brazilian Portuguese at all.
I am Italian and whenever I go to Spain I usually don't really need to speak English because the languages are close enough that you can get by just knowing a handful of basic words (and the Spaniards I meet usually prefer it that way). This is both a blessing and a curse; all Italians I met living in Spain (and vice versa, all Spanish-speakers I met in Italy) tend to have a hard time learning the other language "properly" because the threshold for being understood is extremely low. If, perchance, someone speaks an Italian with Spanish grammar, people will still understand them perfectly.
Given that Castilian and Portuguese are even closer (both Western Romance, part of a linguistic continuum, ...) I find that very hard to believe, honestly. I am only familiar with the European variants, though. Maybe the issues you faced are due to how the Latin American variants have diverged significantly over the years?
The big difference between ES and PT is the accent/pronunciation of letters and matching words. Secondarily, there is the differing vocabulary. But a lot of these words are still understood as archaic/uncommon alternatives.
(see shoen's post below.)
So, if you learn the accent of the other language, all of a sudden a large portion of the language is unlocked. This happened to me, almost like a light switch.
I don't have a lot of experience with Italian but it seems like the pronunciation is closer to Spanish.
I speak fluent second-language Spanish and have had next to no difficulty communicating with native Portuguese speakers from Mozambique, Cabo Verde, and Portugal. What variety of Portuguese do you speak?
I'll concede that it's possible that I actually have an advantage as a second-language speaker, since my Spanish is probably slower than a native's and when I'm listening I'm already doing more work than a native is accustomed to.
Brazilian Portuguese has some phonological differences that I think confuse people in both directions more than other varieties of Portuguese, like the /tʃ/ and /dʒ/ for <t> and <d> in various contexts. For example a Spanish speaker would probably have a hard time recognizing that Brazilian Portuguese /dʒi'abu/ is cognate with Spanish <diablo>. A Brazilian Portuguese speaker who was less familiar with Spanish might similarly have a hard time recognizing /ˈdjablo/ as cognate with Portuguese <diabo> 'devil'.
Or Brazilian /'sedʒi/ is cognate with Spanish <sed> 'thirst'. A Spanish speaker will have to know to effectively ignore the /ʒi/ in order to recognize the word easily!
Maybe more extreme, Brazilian /'hedʒi/ (written <rede>) is cognate with Spanish <red> 'net, network'.
You might also be familiar with a greater variety of Spanish pronunciations as a non-native speaker... if you know Argentine /'ʃubja/ and /'ʃabe/, then you have a better chance to recognize Brazilian /'ʃuvɐ/ and /'ʃavi/ ('rain' and 'key', respectively).
Yeah, I suspect that OP speaks Brazilian Portuguese but I didn't want to assume.
I should have specified in my original post, but I only meant that Portugal Portuguese (and at least a few of the African varieties that are still very close to Portugal's) are mutually intelligible with Spanish. Which actually just further illustrates the complexity of categorizing speech into discrete languages...
a friend of mine who grew up in argentina went for an interview at a university in brasil.
they reported that on the first day, portuguese was just gibberish. on the second day they realized they could read a solid chunk of a portuguese newspaper. on the third day they felt they were beginning to understand what people were saying to them.
This is really what "mutually intelligible" means, I'd say: that the languages are so close you can sort of work it out without explicit instruction. You still need some experience with the other language - and quite often there'll be a geographical and cultural proximity that means almost all speakers have that.
I grew up in a Scandinavian country and visited the others a lot when I was young, and I find I understand most of what I hear in the other languages, but it's quite common for my peers who don't have that experience to understand nothing.
Probably not as absurd as you think. I reckon if you dropped an American in a random town in Scotland (or even a northern English town, for that matter), they would also need to use very broken English and hand gestures to communicate. Glaswegian or Geordie is near incomprehensible to RP-speaking Brits, let alone to an American whose only exposure to Scottish is Mel Gibson as William Wallace.
I may be way off here, and happy to be corrected.
My experience is that Texas-Spanish is difficult to use in Spain, and I would guess the inverse is true. From that I would deduce that Portuguese-Spanish is a non-starter in the state.
I know a very limited amount from having grown up and played soccer in the "Mexican" rec leagues in Tx. While traveling to Spain, English is perfectly fine in cities. But on day trips to smaller towns/villages they had more trouble understanding my attempts to communicate with the basic texas-spanish I had picked up, than they did the hand gestures and single english word here and there. I understood next to nothing in Portugal (it might as well have been Dutch to my ears; I had no idea until now that they are kinda similar in the way Spanish/Italian is). Of course, this could be that I'm simply horrible at Spanish. But I have heard Texas-Spanish is even weird for Baja-California-Spanish speakers.
Spain Spanish and <pick-latam-country> Spanish are the same language with very different vocabulary.
(Well, not quite, because Spain Spanish has loismo and what not that <pick-latam-country> Spanish almost certainly does not, and there's other variations as well, like Argentine Spanish having very different imperatives, Argentine and Colombian voseo vs. tuteo everywhere else, etc.)
Given the context, you'd probably have an easier time talking Portuguese with someone from Vigo than, say, Juarez, but even then, that might depend on you not having a Brazilian dialect...
After all, Spanish & Brazilian speakers in the new world have their own dialects (not languages).
I don't think the intention was to paint Spanish and Portuguese as incredibly similar, only to say that they're more similar than Scots and English, which are still considered the same language.
A classic example of this is Hindi and Urdu -- the two languages are largely mutually intelligible when spoken, which is the main criterion for being the same language, but are written with different scripts and of course used in separate and adversarial states.
I'm saying that because China has one single army and navy and at the same time a huge narrative wrapped up in the idea that it's all one China, those "dialects" don't get to be languages because the army and the navy say otherwise.
"A language is a dialect with an army and a navy" implies its corollary, which is that "a dialect is a language without an army or a navy".
(In fact, that's likely what was originally meant by the person who coined the phrase—he was a specialist in Yiddish linguistics writing during WW2.)
It's perhaps how it is seen and used in English. But in China, Chinese languages tend to be referred to as such, with Mandarin referred to as the "common language", etc., though the character used has an oral connotation.
I think our disagreement is in whether there can be fault lines of mutual intelligibility between dialects. If linguists and Chinese language speakers are to be believed (no particular reason not to), there are in China.
I don't disagree that there can be fault lines of mutual intelligibility between dialects. I'm not even commenting on how we define dialects at all—all I'm saying is that the distinction between a dialect and a language is an arbitrary one that is made for political reasons more than linguistic ones, and that's something that even the sources for that Wikipedia page agree with me on. For example (emphasis added) [0]:
> The debate as to whether or not the varieties of speech used by the Chinese should be classified as separate languages or dialects of one language is a difficult one, with reasons on both sides. The main criterion according to which some scholars tend to use the English term 'language' for the varieties of Chinese, is the lack of mutual intelligibility between the various forms of speech, the fact that the "various 'Chinese dialects' are as diverse as the several Romance languages". On the other hand, since there are no extra-linguistic (political, historical, geographical, cultural) reasons to treat these dialects as individual languages, the tradition is to call them dialects of Chinese.
In the absence of nation states I suspect that we'd mostly talk about dialects and dialect continuums. Discrete languages are only really relevant as a concept because of the non-linguistic ties that bind a nation together.
Reminds me of when I went on a business trip of several weeks to Sweden from California. The Swedes spoke English reasonably well. I then went to Scotland for vacation and had a much harder time understanding their dialect than I did with the Swedes.
For me as a non-native speaker of English and German, this is quite normal - I mostly have an easier time understanding other non-native speakers since they usually use "international" dialect/pidgin, speak slower and usually articulate more distinctly.
I could believe this is true if you’re only comparing languages that have the same root or parent language such as Latin languages, etc.
But I don’t see how anyone could describe the difference between Chinese and English as arbitrary or as two dialects even if the apocalyptic collapse of all major nations which spoke such languages occurred tomorrow.
My understanding is that there's something called lexical similarity and if it’s over a certain percentage it’s a dialect.
What's arbitrary isn't that languages are different from each other, what's arbitrary is where you draw the line. When you take two languages on opposite sides of the world they're unquestionably different languages. But as you transition slowly from one language to another, how many languages you spin off and which dialects fall under which languages is arbitrary.
> My understanding is that there's something called lexical similarity and if it’s over a certain percentage it’s a dialect.
Even if you tried to use a method like this to draw lines, it requires you to pick a "center" dialect that you compare all other prospective dialects/languages to. Which dialect you pick as your "center" dialect will determine which dialects end up under your umbrella language, and picking a different center would yield very different results. Which language you pick as your center is inherently a political question, one which would be settled by a sovereign state.
And aside from that problem, lexical similarity is not used to define languages. All it measures is how similar word sets are, and language variations are way more complicated than just vocabulary. No serious linguist would ever try to use a single metric like that to draw lines between languages (and again, most serious linguists aren't actually interested in drawing general-purpose lines because they understand that the lines are not real).
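To make the "center" problem above concrete, here is a toy sketch in Python. Everything in it is made up for illustration: the five dialects A to E, the rule that similarity drops 15 points per step along the chain, and the 70% cutoff are all assumptions, not real linguistic data or a method anyone actually uses.

```python
# Hypothetical dialect chain: neighbours are similar, distant ends are not.
CHAIN = ["A", "B", "C", "D", "E"]

def similarity(x: str, y: str) -> int:
    """Toy lexical similarity in percent: drops 15 points per step along the chain."""
    steps = abs(CHAIN.index(x) - CHAIN.index(y))
    return 100 - 15 * steps

def dialects_of(center: str, cutoff: int = 70) -> list[str]:
    """Everything at or above the cutoff counts as a 'dialect' of the chosen center."""
    return [d for d in CHAIN if similarity(center, d) >= cutoff]

for center in ("A", "C", "E"):
    print(center, dialects_of(center))
```

With C as the center, all five varieties end up as dialects of one language. With A as the center, D and E fall outside it, and with E as the center, A and B do. Same data, same cutoff, different "languages", depending only on which variety is picked as the reference point.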
How does that work with e.g. French Creole which has French, Caribbean, and English in it. What if this feels like a dialect but the percentage of any given parent is less than your cut-off percentage? You make the rule sound very easy to interpret but I think the general principle is that language classification is nuanced and the irony of the "navy and army" language requirement is that it kind of has nothing to do with the actual language spoken.
The "navy and army" argument is usually employed when the question arises whether something is a dialect or a separate language. IMHO such Creoles should also be classified as languages, with the caveat of dialect continuums.
Creole is a weird case IMHO because English itself is pretty much a creole between Old English, Norman French, Norse, and some Gaelic and Pictish languages.
On a recent holiday to Hong Kong, I noted the Metro announcements were in Cantonese, Mandarin and then English. I also noted that apologies and other minor social skirmishes and interactions, eg after bumping into someone, were mostly delivered in English.
When in Rome, speak English: it's the French language.
Hong Kong is a special case where English still has higher prestige to begin with. Locals sometimes code-switch to English even if everyone present would understand Cantonese.
Here in China, linguists consider the different Western "languages" to be dialects, and believe that the Western governments, for political reasons, make people think they speak different languages than their neighbors, so that they cannot unite.
I'm just joking, but what you say is as absurd as my joke. Western linguists don't consider the dialects different languages. If they do, they do it for political reasons. Accept that there are different ways of thinking and the real world never has to submit to how you define concepts like "a language", and not everything China surprises you with has something to do with politics.
Open WALS or Glottolog or any other language catalog and you will see they categorize Chinese as a language family, consisting of multiple languages like Mandarin, Wu/Hui, Min, etc. You are free to disagree of course, but "Western linguists don't consider the dialects different languages" is simply not true to my knowledge.
> Western linguists don't consider the dialects different languages. If they do, they do it for political reasons.
Western linguists generally view the concept of a "language" as being a political one more than a linguistic one, and so rather than quibble about definitions they just use whichever word the people who speak the language/dialect would use. For example, from a book about Chinese dialects [0]:
> The debate as to whether or not the varieties of speech used by the Chinese should be classified as separate languages or dialects of one language is a difficult one, with reasons on both sides. The main criterion according to which some scholars tend to use the English term 'language' for the varieties of Chinese, is the lack of mutual intelligibility between the various forms of speech, the fact that the "various 'Chinese dialects' are as diverse as the several Romance languages". On the other hand, since there are no extra-linguistic (political, historical, geographical, cultural) reasons to treat these dialects as individual languages, the tradition is to call them dialects of Chinese.
The Chinese language varieties are dialects because it is politically expedient for them to be so. The Western Romance languages are languages because it is politically expedient for them to be. Linguists shrug and move on to more interesting (to them) questions.
They are different languages in the sense that they are not mutually intelligible, but the prestige of Mandarin is not a CCP one. Mandarin as it evolved has always been the language of government and of the plains. Other Sinitic languages routinely borrow readings and terms from Mandarin. There's more that I want to say about the topic, but it's less relevant.
They're written the same, and have basically the same grammar. The characters have hugely different readings, but people can communicate easily in writing. You can call that different languages, but that's certainly a different kind of different than people would expect when you say different.
If there were a version of English where all of the letters designated completely different sounds, but was written exactly the same way, would it be a different language? Would people who said that they were dialects of the same language have to be saying this for political reasons?
edit: I mean, Chinese is how you would expect it to be. How would two people living extremely far apart in China even know how each other would pronounce a particular character? How would they have communicated those sounds 500 years ago? The wide variance in the pronunciation of words even in English is also due to our dogshit orthography (largely imposed by the French), which often fails to give a decent hint for how to say something. Chinese characters are symbols of concepts that usually have a hint of what it's meant to sound like in the northern dialect, 1500 years ago, by referring to another character that there's no reason one would know what it sounded like.
> How would two people living extremely far apart in China even know how each other would pronounce a particular character?
China had an imperial bureaucracy for over 2000 years, which sent officials from one end of the country to the next. In fact, a predecessor of Standard Chinese (a.k.a. "Mandarin") was called "the language of officials" (官話).
The phonology differs. Vocabulary differs. Grammar differs. Speaking Cantonese and Mandarin natively, I have no idea what Hokkien or Sichuan people say, whether or not you write it down.
This is especially apparent when speaking to less educated people with less exposure to the standardised, official Chinese language, which is what people do actually write down when intending for a broader audience, of course. Diglossia is real.
Yep. Anybody who’s ever read written Cantonese or Shanghainese would realise they are often unintelligible unless you speak those languages and understand how they’re written. eg 「佢冇做乜嘢」
And yet the incorrect parent comment has been voted to the top of the thread by those who think it’s helped them.
> The wide variance in the pronunciation of words even in English is also due to our dogshit orthography (largely imposed by the French), which often fails to give a decent hint for how to say something.
Others have already corrected your other misunderstandings, but this is also false. Spanish has at least as much variance in pronunciation as English and has an orthography that is extremely regular. Brazilian Portuguese and Portugal Portuguese likewise have the same, highly regular orthography and are barely mutually intelligible.
To the best of my knowledge you actually have the causality mostly reversed: English's orthography is useless largely because the pronunciation changed but the spelling didn't, and English has a variety of pronunciations because the pronunciations changed differently in different regions. English has a messier orthography than other languages because of our complicated history of borrowing words, but the evidence shows that even people who start with a highly consistent orthography don't use it to keep their pronunciation static and shared.
I'm not entirely certain that's a good explanation.
For most of history, literacy isn't exactly common. I can't easily find any estimates of literacy rates for early (say, Qin dynasty) China, but numbers for medieval Europe suggest something like 10-30% for relatively broad definitions of literacy, which seem to be commensurate with estimates for Qing dynasty China. Especially if you look at the period at which Chinese characters essentially ossify into their modern form, it's not clear to me that there's a wide diversity of topolects that it has to approximate, almost certainly nothing to the degree of modern Chinese.
For another thing, mediating among linguistic diversity is something that all of our other scripts have had to do. Cuneiform was used to write the administrative languages of different language families (Semitic languages like Akkadian, Indo-European like Persian, and who-knows-what-language-family-these-are like Elamite), and yet it was a syllabary. Even Chinese script itself starts devolving into a syllabary when Japanese adapts it.
The reason I think Chinese resisted becoming a syllabary was because Chinese was poorly suited for such a transition: my understanding is that words in Chinese are largely monosyllabic and involve a decently high degree of homophones. Furthermore, reconstructions of Old Chinese also suggest a relatively complex phonotactic structure, which means a syllabary that largely covers a CV-syllable scheme is hard to adapt. In other words, Chinese may have been a rare language in that conversion from a logography to a syllabary would not have dramatically reduced the amount of characters one would have had to have learned. (Note also that the reduction of a syllabary to an alphabet, abjad, or abugida happened effectively twice, with Phoenician (or some ancestor) and Korean Hangul).
I don't see how your explanation and GP's explanations are mutually exclusive.
To add some detail to this point:
> my understanding is that words in Chinese are largely monosyllabic and involve a decently high degree of homophones.
A syllabary would have had to represent phonetics as well as tones, which would have multiplied the required syllabary by n number of tones. For instance, Mandarin has 4 (or arguably 5) tones. The "ah" sound has four pronunciations: ā, á, ǎ, and à. Hong Kong Cantonese has at least 6 tones, having purportedly lost a few. Different dialects of Chinese have different numbers of tones, and some have been lost or gained within the same dialect throughout history.
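A rough Python sketch of the multiplication described above, for anyone who wants to see it spelled out. The base of 100 toneless signs is a purely hypothetical placeholder, and the helper names are mine; the only things taken from the comment are the four pinyin tone marks and the tone counts of 4 and 6.

```python
import unicodedata

# Unicode combining marks used for Mandarin tones 1-4 in pinyin:
# macron, acute, caron, grave.
TONE_MARKS = {1: "\u0304", 2: "\u0301", 3: "\u030C", 4: "\u0300"}

def toned(vowel: str) -> list[str]:
    """Return the vowel carrying each of the four tone marks, in tone order."""
    return [unicodedata.normalize("NFC", vowel + mark) for mark in TONE_MARKS.values()]

def signs_needed(base_syllables: int, tones: int) -> int:
    """Upper bound for a syllabary with one distinct sign per (syllable, tone) pair."""
    return base_syllables * tones

print(toned("a"))             # the four toned variants of "a"
print(signs_needed(100, 4))   # hypothetical 100 base signs, 4 tones
print(signs_needed(100, 6))   # same base, 6 tones
```

This is only an upper bound, since not every syllable occurs in every tone, and a script could instead mark tone with a separate diacritic rather than a whole new sign.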
> It's one of the reasons why subtitles are so prevalent in Chinese media
I'd love to hear other people's take on this. I heard this many times when I lived in China, however living in Taiwan - people still always use subtitles. In Taiwan there are vanishingly few people that don't speak Mandarin, so it's not inserted for people that are bad at Mandarin. You will see that both in China and Taiwan people that are fluent in Mandarin watching a Mandarin movie will never turn off the subtitles.
Talking to native-speaking friends I've pieced together that it seems Chinese is actively hard to make out (compared to English). Without the subtitles they will miss sections of dialogue in movies/tv-shows. Maybe because it's so tonal and contextual? I've asked people "Okay, but when you talk to people day-to-day, you don't have subtitles - so how are you dealing with it?" and the responses seem to boil down to "often we have to guess what the other person is saying"
I'd love to hear some thoughts from someone who is 100% bilingual and able to make the comparison
This jibes with my experience as well. Chinese has a ton of near homophones that are distinguished by tone. One interesting result of this is that Chinese speakers seem to hate hate hate accents, even just from other parts of China. I hear it's because it's just mentally taxing to listen to.
Though I'm certainly not 100% bilingual. I like to think it isn't just that I'm annoying to listen to. I have heard other speakers get put down as sounding like 'birds chirping' which seems to be a popular way to describe accents.
I’m not 100% bilingual, but my take is that there’s a lot more to be gained from the subtitles due to homophones, wordplay, and literary allusions. It’s like getting genius.com rap annotations in real time for any metaphors or references.
Not Chinese, but I might call myself trilingual and this is exactly something I have wondered about. Language "pair": ru>de>en
In very general terms, the longer you listen to a particular voice, the better you get at reconstructing the acoustically lost parts of the words. Imagine hearing low bitrate radio transmission for the first time. It's very hard to understand the words, but over time you'll get accustomed to the noise and the voice, it'll become intelligible to you. Quite literally training your neural net.
It's easy for me to read/listen to RU (native). I have no problem whatsoever listening to DE, but mentally it must be more taxing to also follow the text/train of thought (beyond very short-term memory) because I notice myself drifting off more often unless I stay concentrated. So it's a little harder to memorize stuff in DE or read long thoughtful texts.
The walkie-talkie example from above is not arbitrary. I have experienced exactly that when learning DE and going to game servers with voice. Back then the microphones were all poor and noisy, it was much tougher to understand people. The adaptation took me about a week or two.
English is very specific to me. Remember what a radar chart is? That's it, one aspect is skilled to 100% and others are closer to zero. As you can ~see~ read, my writing is doing well. Listening to cleanly recorded and spoken content such as Youtube presents no issues. As with DE, I might not follow the content very well, but I can hear and understand all of it. Movies? Oh dear, subtitles please! Not only because the voice isn't recorded in a studio (sometimes whispering, sometimes mumbling etc.) but the variety of language (vocabulary, slang) and pronunciations overload me to the point of not understanding the spoken language. Yet because my general comprehension of English is good, I'd be mostly reading subtitles to not miss anything. As it stands, I would prefer either DE or RU in movies.
Years ago I had the pleasure to finally subscribe to AdoredTV on Youtube because of his content. Previously, I had skipped a video or two due to his Scottish accent. Slowly I have got used to it though. Admittedly, he had been trying to have an understandable pronunciation; some of the funny "Scottish English" videos I still can't understand. But when we had shown Adored's videos to friends, it was them who couldn't understand a word, because they were new to his accent.
Therefore I can second that subtitles are helpful, especially if you are distracted from listening or need assistance to understand spoken language (thus serves as a "gap filler"). Now knowing that the Chinese have THIS much trouble with their own tongue, I feel pity for them.
I'm nowhere close to being bilingual, and there are without doubt many factors involved in this, but I can think of a few fairly easily.
Chinese has relatively few possible syllable sounds compared to Western languages. There are about 400 possible initial-final combinations in Mandarin, and 4 tones they can be said in (5 if you include the neutral tone that can only appear at the end of a word), but not all combinations exist, and most estimates place it at about 1200 for Mandarin Chinese. This compares to about 15,000 odd for English, as a syllable has more flexibility in terms of initials, vowels and finals, and English does in fact have tones - even though we don't think of the tone as being syntactically significant, it is very hard to interpret someone if they deliberately change the pitch in unusual ways. But anyway, Chinese therefore has about 10% of the syllable range of English, but each syllable is in fact a "meaning unit" in its own right, whereas English will frequently use multiple syllables to express one concept, which makes it more redundant / less ambiguous. BTW, I say "meaning unit" because in older dialects of Chinese, each syllable was exactly a word and this was possible because there was more information in a syllable and so it was easier to disambiguate, but at some point things became confusing and Chinese began using pairs of "meaning units" to represent concepts - for instance "speech" in Mandarin is 说话 "shuo"+"hua" where the first word was an older verb for speaking and the second was an older noun for speech. Modern Chinese will still tend to revert back to the simpler forms if it's unambiguous and various grammar forms show that they still think of the words as being separate even if they are usually used together, e.g. in the positive+negative question form, 你喜不喜欢 where 喜欢 is normally considered a single word.
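For what it's worth, the arithmetic in the paragraph above can be checked in a few lines of Python. All figures are the rough estimates quoted there (about 400 combinations, 4 tones, about 1200 attested syllables, about 15,000 for English), not measured values, and the variable names are just mine.

```python
# Rough estimates quoted in the comment above, not measured values.
initial_final_combos = 400   # Mandarin initial+final combinations
tones = 4                    # ignoring the neutral tone
mandarin_attested = 1200     # syllables that actually occur
english_syllables = 15_000   # "15,000 odd"

upper_bound = initial_final_combos * tones
fill_rate = mandarin_attested / upper_bound
ratio_to_english = mandarin_attested / english_syllables

print(f"theoretical maximum: {upper_bound}")
print(f"share of combinations actually used: {fill_rate:.0%}")
print(f"Mandarin syllables relative to English: {ratio_to_english:.0%}")
```

On these numbers the ratio comes out at 8%, so "about 10%" is a fair rounding, and roughly three quarters of the theoretically possible toned syllables are in use.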
As a learner with a vocabulary of maybe 5000 words (compared to a native with maybe 10k+), I've already encountered a lot of homonyms and sometimes when you're watching a drama it's easier to look at the Chinese subtitles than trying to guess which word they meant. If you were having an actual conversation with someone, you could figure it out from context or ask for clarification, but that doesn't really work for one-way communication. One example: 这个合同是gong1zheng4的 "This contract is 'gongzheng'". Does this gongzheng mean 公正 (fair or equitable) or 公证 (notarised)? Just from speech alone, it's impossible to tell without further clarification or rewording, as both would be perfectly plausible.
In mainland China, there are a lot of people for whom Mandarin isn't their first language. Most people will speak it to varying levels of ability, sure, but regional dialects often have completely different words, pronunciations or even just different tones to Mandarin. In all those cases, being able to read along while listening helps comprehension. I'm not sure about Taiwan, but I'm sure there's a reasonable number of people who primarily speak Hokkien and only use Mandarin when they have to interact with people outside their community / town.
Finally, you also assume it's a choice... In most cases the subtitles are baked into the original broadcast (for TV). Back in the days of analogue TV, closed captions came along fairly late and required an expensive box, and so were only purchased by deaf people; subtitling on the broadcast was an easy way to ensure that everybody could get them, much as foreign TV shown in the West almost always has hard-coded subtitles. So for many people, it might not actually be an active choice - they might just only have sources that have hard-coded subtitles. I find it interesting that on platforms where the content is intended for consumption by native Chinese speakers, e.g. TV dramas on YouTube, the Chinese subs are usually hard-coded, but when they are sold to foreign platforms they usually don't have them. Personally, I find it quite frustrating that e.g. Viki doesn't often have Chinese subtitles when I want to know what exact phrase was just said, as then I usually have to find the show elsewhere, e.g. on YouTube.
There might also be the cases where the subtitles could be turned off, but they just don't bother. This might seem strange, I know I hate English subtitles on English shows, but many Chinese will watch TV shows with so many scrolling comments on top of the screen (often called "bullet text") that it's almost impossible to see what's underneath. Some of my Chinese friends actually watch most shows twice - once for the show, and once for all the comments. They probably aren't in the slightest bit concerned about the single line of subtitles at the bottom, especially if it makes watching a little bit easier.
When I was studying Mandarin Chinese at a school in Shanghai, I borrowed a book on Shanghainese. The reason why everyone in Shanghai isn’t writing “everything differently” is that they are not writing in Shanghainese. They are writing in Mandarin.
I disagree on your assessment of Japanese. I would argue that Japanese is the most difficult written language in common usage / not artificial.
Moreover, one of the greatest literary achievements of Japan, “The Pillow Book”, is written entirely in hiragana. Today so much text leans on the disambiguation that kanji provides that you’d lose a lot of writing were everyone to unlearn kanji, but I disagree that it’s an aid, and had Japan developed its own writing system, it would have felt a lot more like hiragana than kanji.
The analogy to numerals is great to get a western-language speaker to grok the basic mechanism by which a symbol can be unrelated to a sound, reused across completely different languages, and even have different ‘readings’ in different contexts (2, 2nd, 12, 1/2…)
The use of 2nd is a better example of the limitations of this system. The reality of written Standard Chinese is that you'd use "2nd" to write not only "second" but also "deuxieme", instead of writing "2eme".
Right, that case is a little more useful to analogize the way Japanese uses kanji (Han characters) with local Japanese inflections (okurigana) to adapt the Chinese writing system to their local inflected language. We write the word ‘second’ using the Arabic symbol that connotes the concept of ‘two’, followed by an irregular English inflection to make it ordinal. Is ‘seco’ a ‘reading’ of the 2 symbol? Kinda sorta?
Also helps you appreciate that Japan is not completely insane for having seemingly completely unrelated number words for different contexts, even though they write them with the same numeral. Turns out, so do many western languages (although generally only for ordinal/cardinal, not for the endless range of counters Japanese has)
The fact that as a native English speaker this seems like not how you actually read or write numerals at all - no, of course 3 doesn’t ’read’ as ‘thi(r)’ - also suggests that there is a less mechanistic way to understand the relationships between hanzi and words than learners often try to apply (we want to find the ‘rules’ that must underpin these things), and that the way native speakers of Mandarin, Cantonese or Japanese think of the relationship between these symbols and the words they are writing is much more organic - and that’s okay.
> seemingly completely unrelated number words for different context
I don't know much Japanese but I don't see it as that weird. I see it as something like "murder of crows" or "pile of sand". Something cultural that was there for a reason (or a monk somewhere) and now we have to memorize it.
I think the GP is referring to a different part of the Japanese numbering system: the two different numerals used with different counting words for 1-10; e.g. the fact that you say "muttsu no koto" [6つのこと] for "six things" but "rokko no retasu" [6個のレタス] for "six heads of lettuce".
This is what they are comparing to the difference between "first" and "one" in English, which are obviously two different origin words for the number 1 (unlike sixth and six, where sixth is clearly just derived from six).
Ah, so it's about onyomi and kunyomi (Chinese and Japanese readings)? It screws me up sometimes too, but there are only two choices. What I hear lots of people complain about is the counter words, of which there are indeed many, so I was referring to those instead.
Yeah, measure words aren't really that strange a concept in English, it's just that we don't normally consider them. But we have e.g. "a pinch of salt", "a lashing of ginger beer" [1], "a lump of coal", "a slice of cake", "a piece of music", "a spit of rain", "a drop of water", "a bucket of water", "a sheet of paper", "a ream of paper", "a glass of beer", "a bottle of beer", "a sip/mouthful of beer", etc. The last few sets of words were to show an even more useful correspondence with Chinese - as they would be 一杯酒,一瓶酒, 一口酒 respectively, and suddenly they seem a lot more reasonable.
We also have a LOT of these for plurals and take great pride in knowing all the stupid ones - "a gaggle of geese", "a murder of crows", "a herd of cows", "a fleet of ships". These are all really just the same as the measure words in Chinese with many specialisations for large groups of specific things.
We also have plenty of measure words in English when we take something from the realm of uncountable to countable, as I showed in the examples above, and in many ways Chinese tends to have fewer weird single-use ones like "gaggle of geese", just a larger variety of the common ones. So there's a special word for "long thin things", "vertical long thin things", "flat things on a (mathematical) plane", "a lump", "a slice", "a vehicle", "a machine" as well as the fallback 个/個 when none of the more specific ones are appropriate. I think it's only because we generally don't use any measure word for small groups of countable nouns in English that we think it's somehow strange that other languages do.
There also aren't so many counter words as to be a problem. Sure, the absolute number is high (some give very precise numbers like 214, 232, etc., some people lowball it at 100, some say over 900 if you count all the old obscure words), but in practice, about a dozen of them will cover almost all the cases you think are hard for English speakers. So, 个,条,只,支,本,件,张, 家,座,部,辆,头 are probably the most common ones that don't usually have an English translation; almost every other measure word would also require a measure word in English, e.g. 条=strip, 张=sheet, 些=few, 双=pair, 样=kind, 种=type, 头=head, 片=slice, 分/份=piece, 点=bit, 块=lump, etc.
As with most things about learning a new language, it's also important to embrace the differences - they give you a new insight into how other people think about things, and seeing when someone else's internal model for something differs to yours can help you better understand your own model as well as the new one you're learning.
[1] You'll understand this if you ever read The Famous Five!
I don't think any "common way" of this kind really exists. In my country, the common way to write "second" ("a doua") is either II or II-a (so using roman numerals). In French, either 2eme or 2e are the common ways. I've never even seen this "2." spelling for ordinals.
Regardless, my point was that Chinese spelling is not as universal as it is made out to be, that different Chinese dialects/languages just use the Standard Chinese spelling, even when it doesn't match their own spoken language, just like a French person using "2nd" to spell "deuxieme".
Unlike with numbers, day to day language needs to convey a larger variety of concepts hence why logograms are still needed. Much like how more advanced math requires different symbols and operators that are akin to mathematical logograms to convey additional concepts beyond the fundamental quantities of 0 to 9.
Alphabets are often linked to sound, and it would be a tremendous challenge to create an "alphabet" analogue that links to a set of fundamental concepts that you can somehow rearrange to form higher level concepts and can still be universally understood without it linking to sound.
All of which native speakers have no trouble distinguishing between in conversation when there are no Kanji anywhere. It's the same with homophones in any language, usually the context makes it clear because the alternatives don't fit.
The homophones in Japanese and Korean pretty much all come from the vocabulary they share with Chinese which makes up the bulk of the vocabulary for both those languages.
One doesn't use Kanji anymore, and no one seems to struggle to read it?
Japanese on the other hand I have seen even natives struggle to read. Heck even the existence of furigana in novels is an admission of this.
Written kana drops intonation information that's present in speech. Writing with kanji makes up for this, and also allows for more complex sentences that aren't as common in spoken Japanese.
I personally find the most difficult part of reading kana-only text to be detecting word boundaries. It's much easier when kanji is used, and I'm not even a native speaker.
An English analogy isthatyoucouldwritewithoutspacesandbeunderstood but it's more difficult to read and unnatural.
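To make the "word boundaries" point concrete, here is a minimal sketch of what a reader (or a program) has to do with unspaced text: a greedy longest-match segmenter over a tiny, hand-picked vocabulary. The function name `segment` and the vocabulary are made up for this illustration, and greedy matching is only one simple approach; it can pick the wrong split on other inputs.

```python
def segment(text, vocabulary):
    """Greedy longest-match segmentation; unknown characters are emitted one at a time."""
    longest = max(len(word) for word in vocabulary)
    words = []
    i = 0
    while i < len(text):
        # Try the longest possible chunk first, then shrink until something matches.
        for size in range(min(longest, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in vocabulary or size == 1:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical toy vocabulary, including shorter words that compete with longer ones.
VOCAB = {
    "is", "that", "you", "could", "write", "without", "with", "out",
    "spaces", "and", "an", "be", "understood", "under", "stood",
}

unspaced = "isthatyoucouldwritewithoutspacesandbeunderstood"
print(" ".join(segment(unspaced, VOCAB)))
```

Even in this toy case the segmenter has to choose between "with" + "out" and "without", or "under" + "stood" and "understood", which is the extra work spaces (or kanji) save the reader.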
Young gen-z types on Japanese Twitter abbreviate everything, but even they don't drop kanji.
Adding whitespace is a pretty simple solution. Heck, if you really, really absolutely needed to resolve tonal ambiguity in kana you could add something to kana to do that. That'd enhance the readability even further, since it's basically impossible for foreigners to learn correct intonation in Japanese unless they explicitly study it, and that's on top of memorizing all that Kanji; this way it would become explicit. I can recall exactly once in the last 10 years having a conversation where there was ambiguity between two homonyms and someone asked a clarifying question to resolve it. The vast majority of the time it's just clear from context.
So.. I would say even that ambiguity isn't something people would actually have much of a problem with.
> One doesn't use Kanji anymore, and no one seems to struggle to read it?
Chinese/Japanese has a level of written mutual intelligibility. Korean lost it.
> Japanese on the other hand I have seen even natives struggle to read.
It's like a native English speaker encountering new vocabulary. Happens quite often.
> Heck even the existence of furigana in novels is an admission of this.
I'd agree that manga use of furigana helps reading (perhaps for school-aged readers), but furigana in novels are standard tools in the language that authors can use to achieve effects that are hard to describe to non-speakers.
Sometimes furigana can be used artistically, sure, but that's the exception to the rule. It's by and large a reading aid in the vast majority of cases, and its inclusion in novels aimed at adults indicates that the author expects a certain percentage of readers would otherwise struggle with how to read the Kanji.
Why does this tool in the language need to exist? The answer cannot be because Kanji make things easier to read, else you wouldn't need tools to help you read Kanji you at times otherwise wouldn't be able to.
If you come across a word you don't know as either a native speaker of English or Korean, you can at least sound it out, which ups the probability you can connect it with a word you've heard before, otherwise since you know how to type it out it's trivial to look it up in a dictionary. If you come across a word you don't know in Japanese as a native speaker and there's no furigana it's a guessing game. The meaning is slightly more obvious to you, so you might be able to guess, but if you can't guess and you care to know and the word is in print then looking it up becomes a bit more of a pain.
Korean didn't completely lose the mutual intelligibility aspect, since the underlying pronunciation of the words still remains and can be used to correctly guess the word in a lot of cases. 시간 and 時間 are one example, but there are many, many words I've been able to guess in Korean based on knowing Japanese. I was able to score 50% on the TOPIK II reading exam after only having studied Korean for 4 months, in large part because of this.
> Why does this tool in the language need to exist? The answer cannot be because Kanji make things easier to read, else you wouldn't need tools to help you read Kanji
This just isn't true. Even most native JP speakers agree that kanji are oppressively hard to learn and remember, so if it were feasible to get by with kana alone, then at least some native speakers would do it in some contexts. But outside of language learning it's virtually never done, and there's a reason for that.
Also, I think you're overlooking that Chinese and Korean have a lot more vowels/tones to work with than Japanese. There are a lot of Chinese-derived compound words that are homophones in Japanese but not elsewhere.
"Science" and "chemistry" are homophones in Japanese: We have special disambiguation reading for "chemistry", bake gaku, used only when misunderstanding is suspected.
There are numerous other examples. Those are all unnatural sounding, mostly industry/field specific, and not replacing the main homophone readings.
It really isn't.
In conversation fluent JP speakers tend to avoid compounds that would be ambiguous, or add distinguishers like "学校の校歌". Honestly, try converting a paragraph of text from a newspaper to all kana, and having a native speaker read it.
Try have them be educated in a kana only system and then have them try read it. They'd probably do just fine. You'd expect anything you've spent a decade doing to be easier than the thing you've spent much less time doing.
かんちょうが かんちょうで かんちょうに かんちょう された。 Is probably a sentence that definitely requires Kanji to understand the precise meaning given how many homophones かんちょう has, but it's a toy example.
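As a rough illustration of how much ambiguity that toy sentence carries on paper, here is a sketch that counts the possible kanji assignments. The homophone list is a small hand-picked sample of real かんちょう words (館長 director, 艦長 ship's captain, 官庁 government office, 浣腸 enema, 干潮 low tide), not a full dictionary, and it ignores the grammar and context that rule most combinations out in practice.

```python
from itertools import product

# Hand-picked sample of words read "かんちょう"; a real dictionary lists more.
HOMOPHONES = {
    "かんちょう": ["館長", "艦長", "官庁", "浣腸", "干潮"],
}

# The toy sentence contains the same kana word in four slots.
sentence_slots = ["かんちょう"] * 4

# Every way of assigning one kanji spelling to each slot.
candidates = [HOMOPHONES[slot] for slot in sentence_slots]
readings = list(product(*candidates))

print(len(readings))           # 5 choices in each of 4 slots
print("".join(readings[0]))    # one (nonsensical) assignment
```

Five candidates in four slots gives 5 ** 4 = 625 combinations, almost all of them nonsense, which is also why context resolves most of this in real sentences.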
> Try have them be educated in a kana only system and then have them try read it.
I can't, because there are no native speakers who learned that way, as I'm sure you know :D
But there are many learners like that, and my experience in Japan is that anyone who doesn't learn kanji has a very low ceiling on their vocabulary, even if they use the language daily for decades.
Because the reality is that it's hard to memorize 1K kanji, but if you do it then it's relatively easy to learn 10K+ compound words. Without kanji, to reach fluency somebody would have to memorize 5-10 completely unrelated meanings for "kouka", then 5 more for kakou, and so on for every combination of common single-kanji readings.
I mean - if you're in Japan, you surely know people who try to get by without kanji. Do you know any who've reached fluency? Like who could use and understand 5-6 different "kouka"s without any idea of the kanji they use? If your premise here is true then people like that should be the norm, since learning would be so much easier for them compared to those wasting their time on kanji?
So... how do Koreans and Korean learners do it? Not to mention other languages that used to use Kanji but dropped it?
You have to imagine the entire education system and everyone in it got changed to Kana only and Kanji was subsequently removed from modern literature from that point onward. That's the thought experiment. It was even tried successfully in places...
I don't know anything about Korean. It's a different language, and I'd imagine that some of the things I'm saying here don't apply to it, but that's a guess.
For thought experiments, you're assuming your consequent - obviously a society raised without kanji would get by without kanji, though I think the language would have to change somewhat. But for non-hypothetical people that exist now, I think the things you've been saying about kanji not making it easier to read Japanese just aren't true.
As someone who isn't even fluent in Japanese, I find it easier to read text with Kanji (in contexts where I am relatively fluent) than without.
Japanese without Kanji is like English (or any Latin alphabet language) without punctuation or spaces or capitalization. And also if English had a ton more homophones. You basically need to word-split and disambiguate as part of the reading processing; it's painful.
Usually a parenthesis with the corresponding character is used to be explicit or to avoid confusion, but it's strictly for Chinese loan words like:
ex) 시간 (時間)
compared to furigana (note that it's not even possible to display the phonetic hiragana here): 第二巻
The younger generation is no longer learning Chinese, so more English/European loan words are used directly, which ironically fixes the issue.
It is impossible to converse in Korean without using English/European loan words:
ex) 아이러니 (irony)
This allows new ideas/concepts to quickly disseminate in the collective Korean psyche. New words are constantly being invented, and the slang used by primary/middle schoolers is unknowable.
Abbreviation/concat combos to create totally new words:
ex) 씹상타치 (way **ing above average) also written as ㅆㅅㅌㅊ (wfaa) literally means
If you look at old Japanese video games before the hardware could do Kanji, they used spaces to separate words. But when the game was capable of Kanji, the spaces went away.
Without Kanji, it severely degrades readability. One has to reconstruct the word from syllables, which introduces another layer of cognitive load.
It works similarly in Korean, though: after multiple decades, most people are so used to not incorporating Hanja in sentences that it would be impractical to mingle Hanja into Korean.
> Without Kanji, it severely degrades readability. One has to reconstruct the word from syllables, which introduces another layer of cognitive load.
Which is what every language with an alphabetic writing does, and it works just fine. It is not "another layer of cognitive load", it is just a different layer, one that can be said to be much lighter, or other languages would not have switched centuries/millennia ago.
The real problem of Japanese is the massive amount of homophones coming from Chinese. It is already a problem in Chinese, but even worse in Japanese due to the smaller phonetic repertoire.
> The real problem of Japanese is the massive amount of homophones coming from Chinese
So you're saying that verbal communication doesn't work in Japan and everyone just texts each other? "I'm sorry Mr Honda Kawasaki but due to how Japanese language works it's impossible for me to tell whether you want to buy three oranges or cook prostate cancer, please send me a letter" "Okay I will fight the sky colander"
Most countries at some point had to simplify their languages in order to promote literacy. Korea didn't ditch hanja just for shits and giggles, it did so in order to make it easier for schools. Japan never really had to face this problem at a scale that required complete removal of kanji because by the time people got such ideas Japan was already quite literate, so kanji stuck around. Plus, Japan is an extremely conservative society, they only ever change anything once all other options have been exhausted.
Same reason why English spelling is so ridiculous. It's not that English is such a unique language that it absolutely requires a spelling system that doesn't make sense and effectively forces everyone to memorize each word's spelling aside from its pronunciation (wow, just like kanji), it's just that English spelling has never been a problem to a degree that required a systematic solution, so now we're stuck with what we have. If we suddenly decided to make a giant reform of English spelling to have it reflect actual pronunciation, the resistance would be equally giant.
I actually live in Japan, I am trying to learn the language, and it is a royal PITA. Heck, every single Japanese person I have asked has complained about their language being so ridiculously difficult. They wish their language was easier, but as you say it is also such a conservative society that it will never change.
In comparison, my mother language is Spanish, a language with a highly phonemic spelling. My girlfriend is trying to learn it, and she always remarks on how, once you learn a few basic rules, you can read anything.
> Which is what every language with an alphabetic writing does, and it works just fine. It is not "another layer of cognitive load",
Disagree. Once you get accustomed to reading kanji, having learned (except very briefly as a young grade school child) to visually parse nearly all of the words that you see regularly as logograms first and groups of sounds second, the experience of reading them in kana would be akin to reading English while afflicted with a strangely selective amnesia hole for entire words. Reading a word like 'shoe' would not instantly evoke an association with a piece of footwear but would have to be (admittedly very rapidly) sounded out letter for letter each time, instead of scanning the entire unit as a whole.
That's what reading a word normally represented by a familiar kanji character but "expanded" into hiragana feels like, and slightly more pronounced if it's, for some hipster reason, written in katakana.
> Disagree. Once you get accustomed to reading kanji
And how many years does it take to? What about words that you have never seen before? What about ambiguous or uncommon readings that require furigana even for fully educated adults?
As I said in the sibling comment, I live in Japan and every Japanese person I have met complains about the massive effort it took them to learn how to read and write.
> every Japanese person I have met complains about the massive effort it took them to learn how to read and write
This stupid phenomenon is due to the fact that the Japanese government decided to teach only an arbitrary 1000 kanji to school kids, and this number decreases every 10 years.
> People complained that kanji used in prefecture names were not included in the list (not included, so no obligation to teach them in school), e.g. 阪鹿奈岡熊梨阜埼茨栃媛, so the government finally added those in the next revision.
Meanwhile Chinese people are learning 3X more. No one ever complains about the difficulty of Chinese characters, after all.
Sounds like an education problem. Traditional Chinese, which is the de facto written language used in Taiwan, doesn't even have any comparable phonetic script such as hiragana and katakana (beyond a phonetic pronunciation alphabet in the form of bopomofo).
It's effectively "all kanji" as it were, and yet Taiwan has one of the highest literacy rates in the world, and I never met any Taiwanese when I lived there (for years) that complained that Chinese was too difficult.
In Chinese tho with extremely rare exceptions all the characters have only one reading.
Japanese has onyomi and kunyomi. The onyomi also come from different periods in Chinese so there's multiple onyomi for most Kanji.
Then you get two Kanji words that come in all varieties. Most are onyomi + onyomi, but you get some that are onyomi + kunyomi or kunyomi + kunyomi or kunyomi + onyomi.
There's also not really any solid rules to it, and when there are, there are plenty of exceptions.
It's a real nightmare of a system. A fun one though.
And then you have nanori, the non-standard readings used for names of people and places that are impossible to read without furigana or already knowing the name. One that really surprised me was a village called 愛子 (a common female name read as "Aiko") near Sendai, but in this case read as "Ayashi".
Yes, so basically the arguments around lack of Kanji leading to worse readability are actually hitting upon the fact that readability suffers short-term not because Kanji enhances readability, but because readers are simply not used to processing the language only through kana, and that were they to acclimatize to that, it would become readable again and in fact easier to read than before.
Kana would be slightly easier to read if we spent as much time reading in Kana as we have in Kanji.
Hangul has some funny rules around patchim that need to be memorized. Kana does a great job avoiding this, so on balance kana is probably just fine compared to Hangul.
I don't think so. Kana just don't have enough entropy compared to Kanji. A kanji can be composed of up to twenty strokes with a high variety of stroke patterns. That excess complexity makes it identifiable even in extreme situations (blurry, stained, or whatever). In some cases, a kanji with half of its area masked can still be decoded without any ambiguity. But this will never work with a sound-based script.
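A back-of-the-envelope way to put numbers on the "entropy" claim: the sketch below computes the maximum information one symbol can carry if all symbols were equally likely. That uniformity is an assumption made purely for illustration (real text is far from uniform), and the inventories used are the 46 basic hiragana and the 2,136 jōyō kanji; the function name is made up.

```python
import math

BASIC_HIRAGANA = 46   # basic gojūon set, excluding diacritics and small kana
JOYO_KANJI = 2136     # current jōyō kanji list


def max_bits_per_symbol(inventory_size):
    """Upper bound on bits per symbol, assuming every symbol is equally likely."""
    return math.log2(inventory_size)


kana_bits = max_bits_per_symbol(BASIC_HIRAGANA)
kanji_bits = max_bits_per_symbol(JOYO_KANJI)

print(f"kana:  {kana_bits:.2f} bits per symbol at most")
print(f"kanji: {kanji_bits:.2f} bits per symbol at most")
print(f"ratio: {kanji_bits / kana_bits:.1f}x")
```

On these assumptions a single kanji can carry roughly twice the information of a single kana. That says nothing about reading speed or visual robustness, which is what the comment above is really arguing about.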
Most HN users who are coders should be able to read and write Korean within a few hours...
Good luck with any other language. Japanese is tougher since you have to memorize thousands of what are basically hieroglyphics and learn two separate alphabets (one for native Japanese, another for foreign).
With Korean you can express many sounds phonetically (with some abstractions for non-korean sounds)
I don’t speak or read Korean but I am studying Japanese.
I think GP was trying to say that kanji helps:
たまねぎ
玉ねぎ
いつつ
五つ
In both of these examples the words are the same. I’m still early enough in my studies that I don’t know the rules of when someone might choose to write one way or the other, but I’ve seen examples of ads that “spell it out” with hiragana. (Which is harder for me to read, which is what GP was trying to convey imo)
I've been fluent in Japanese for over a decade and am about 6 months into studying Korean.
I understand the issues of Kanji vs no Kanji well. Korean successfully ditched it, isn't painful to read, is far more accessible to read for beginners, and doesn't suffer from an extreme long tail of ambiguous, difficult readings like Japanese does.
With Japanese, no matter how much vocab you learn, you hit new words like 仲人, think you know how to read it correctly, can never quite be sure, look it up every time as a consequence and are surprised often enough at the reading that you never really settle into a sense of confidently being able to read new words correctly. It sucks.
In contrast I was able to score 50% on the reading section of TOPIK II after only 4 months of study.
So, on balance I'd say reading Korean is way easier because they ditched Kanji.
仲 go-between (which I had never seen before; I knew the radicals as "person" and "middle" but not what they meant combined, and this one made great logical sense)
人 person
So while I can't "read" it (in Japanese) I can know what it means pretty confidently as kanji very regularly mean the same thing in compound words.
If I saw "なこうど" I'd have no idea because those hiragana don't mean anything to me until I learn the meaning.
Am I making sense? Like the first time I saw 花火 I knew "flower fire" and was able to guess firework.
same with 大人 being adult.
I'm not saying you are wrong that Korean is easier -- I'm saying, learning kanji can make it easier to understand a lot of meaning with never actually being able to "read" the words. and the reading is absolutely hard because of kunyomi and onyomi etc etc.
Being able to guess the meaning of new words was neat earlier on in the Japanese journey, but in the end the problem of "gah, but how the hell do you actually read this?" was a greater detriment than that was a benefit.
In contrast if I saw なこうど I could at least be perfectly confident I was reading it correctly even if I didn't know what it meant. Sometimes I may be able to guess from the context at least partially what it means, but if not, then I could simply opt to move on having collected an instance of seeing the word. I might then later hear it elsewhere, or perhaps see it again and if I encounter it enough times I can get curious and look it up.
I could do the same thing with Kanji except I'd have to look it up anyway to be confident I was reading it correctly. Else I just don't know what the word is, so it's harder to mentally file it anywhere in my brain. I found this led to a very long tail of pain when reading Japanese that didn't abate even when I got up to around 17k vocab in Anki, after which I just said bugger it.
So, on balance I prefer the set of problems that no Kanji poses over the set of problems that Kanji poses.
I vastly prefer the ability to potentially infer the rough meaning of an unrecognized word over the ability to pronounce it.
As an ESL CELTA certified teacher for years, their rubric also seems to back this up in order of relative importance: it's meaning, then form, then finally pronunciation.
> I've been fluent in Japanese for over a decade and am about 6 months into studying Korean.
Me too! Nice to talk with someone with a similar background :)
> you never really settle into a sense of confidently being able to read new words correctly. It sucks.
Native speakers don't magically gain the ability to read correctly every new word. So it is fine to hit the dictionary every now and then!
In the case of 仲人, I can guesstimate 仲 (naka) and 人 (hito), but recall that 素人 and 玄人 are pronounced <long vowel>~uto. I would try to pronounce it as nakouto (the correct reading is nakoudo). So people do gain a heuristic for reading.
Also Kanji provides a mnemonic device after learning the meaning of the word. (One who makes/improves 仲).
>So, on balance I'd say reading Korean is way easier because they ditched Kanji.
I think kanji can make the meaning clearer, if the kanji is understood, but not the pronunciation. Kana is the other way around. (In case it is difficult, it is also possible to add furigana.)
I think it is beneficial to have both in Japanese writing.
As far as I understand this, this is quite an oversimplification. The differences between different dialects of Chinese are huge, especially in terms of vocabulary. The writing system isn't as purely logographic as it is often touted to be. There are only ~4000 characters in common use (university level literacy), but many more common words. So, lots of words are written with multiple characters. In Standard Chinese (corresponding mostly to the dialect of Beijing), each of the characters in a word represents a syllable in that word. This correspondence doesn't hold for other dialects.
Overall, people speaking dialects of Chinese other than the standard essentially write in a different language than they speak, unless they also adopt a different variety of written Chinese and lose any mutual intelligibility (a lot of such varieties exist, though few are standardized). It is in some ways like writing English words with the Latin spelling of their etymology, say writing the English phrase "Jules appreciates art" and the French phrase "Jules apprecie l'art" both as "Iulius appretio ars".
>It'd be very difficult--impossible at some points--to read and write materials from other "dialects". However, with a logographic approach, everyone can understand that the character 工 means "work" even if I pronounce it like [wirk] and someone else pronounces it like [wak], for example.
I'd like to point out this isn't unique to CJK (Chinese/Japanese/Korean). Languages descending from or based on Latin can be understood, at least very generally, by each other because the equivalent words in each language usually have similar spellings or appearances.
It's cute, and gets at something real, but it's overstated. The first question is "are these things mutually intelligible", with an answer on a spectrum between "yes, perfectly, obviously" and "no, not at all, obviously". There is a huge gray area that stretches pretty far from the middle in which contingency, identity, community, and nation building projects (both military and literary) move the lines around quite a bit... but as we near either pole that factual question dominates. I think. I'd be interested to learn of counterexamples.
There are German "dialects" that are completely mutually unintelligible. Take someone who speaks the standard German dialect and drop them into the Swiss Alps, and they will understand next to nothing. It will be much easier for them to learn to understand Swiss German (linguists call it "Alemannic") than it would be for someone who doesn't speak German, but it will still take time and effort to adapt. Why are they both called German?
There is Luxembourgish, which is basically the local dialect codified as one of the official languages of Luxembourg. It's otherwise perfectly understandable for people from adjacent parts of Belgium and Germany. But I guess the locals would see that it really is almost the same.
Similarly to Chinese, Germans see themselves as mostly the same culture. Standarddeutsch is pretty much a fusion between the different varieties and has evolved along with them for a long time; differently from 普通話, which is much younger and the standardized form of a northern variety of Chinese. Germans also really cling to their dialects, and Switzerland and Austria both use slightly different versions of Standarddeutsch.
The opposite example is Serbian, Croatian, Bosnian, and Montenegrin: these varieties are quite mutually intelligible, but their speakers consider them different languages.
> Standarddeutsch is pretty much a fusion between the different varieties and has evolved along with them for a long time; differently from 普通話, which is much younger and the standardized form of a northern variety of Chinese.
Standard German is not that old. It was largely developed in the 19th Century, and it was not until the 20th Century that most people in Germany were able to speak it. Standard German is also heavily based on a regional dialect of German (in particular, central German dialects).
Standard Chinese is a product of the early-to-mid 20th Century, so about 50-100 years younger than Standard German. This just reflects the fact that German unification was in the mid-1800s, while China's modernization occurred in the early 20th Century.
The origins of Standarddeutsch trace back to Martin Luther's bible translation that exposed a huge audience to a very important text, which was most importantly also written to be approachable to that audience.
The German dialect it used was a colonial dialect that already contained mixed features from multiple dialect areas and was thus suitable for a wide audience.
Of course the emerging standard underwent development since then, and low literacy rates meant that few people actually spoke and wrote it.
> The origins of Standarddeutsch trace back to Martin Luther's bible translation that exposed a huge audience to a very important text, which was most importantly also written to be approachable to that audience.
If you want to go this route, then you could also say that the origins of Standard Chinese date back hundreds of years, to the language spoken by the imperial bureaucracy, or to massively popular works written in vernacular Chinese, like the Dream of the Red Chamber (late 1700s). There are always ancient antecedents that you can trace, but they become less and less directly related to the development of the modern language.
Yes, the Luther Bible was an important influence on the development of a standard literary German, but if you want to trace the development of a standardized spoken dialect of German, you have to go to the 19th Century and the development of Bühnendeutsch ("stage German"), which because of its use in theater had to have a standard pronunciation.
Those vernacular versions of written Chinese always existed, but they had very little prestige compared to the Classical Chinese (文言文) mainly used by the imperial bureaucracy, which was based on literature from the Han dynasty and earlier. It was elegant and concise, but it required dedicated education, which was tested in the imperial examinations, and it differed starkly from the vernacular versions in both grammar and vocabulary. Modern Written Chinese was standardised only after the fall of the Qing dynasty by the successive governments. That makes it younger again than Bühnendeutsch, and so recent that spoken Mandarin has not yet managed to supplant the other languages of China. German has only managed to do so in the big cities, where people often neither speak nor understand the dialects of the surrounding rural areas anymore.
intermediate take: there are many Chinese languages
expert take: there is one Chinese language
There are also plenty of languages in China that are not Chinese or a dialect of Chinese. Tibetan and Mongolian (and their many dialects) are obviously not Chinese. Chinese written language is used as a phonetic script for some minority languages (although scripts based on Uighur script are used a lot as well; Uighur itself uses Arabic script).
Tibet is a part of China today. Whether Tibet should be a part of China is up for debate, but the fact that it has been completely integrated into the PRC for at least 70 years now isn't really. I'll be the first one to argue that the PRC's moral claim to the territory, which rests on it having been conquered by the Mongols during the Yuan dynasty, is lame. But that doesn't change anything when China controls it today. And that's just the TAR; there are plenty of Tibetans living in areas of China whose status as part of China isn't seriously contested.
If it is nearly impossible to switch the US from imperial to metric, I cannot imagine what it would take to unify a massive population under a single dialect. I think the answer is measured in generations.
Americans use metric where it's required by law (e.g., food and drug packaging), it just takes government force. Government force can also change a population's language. See, e.g.,
> I cannot imagine what it would take to unify a massive population under a single dialect
Ask the French how they've managed to (almost) eradicate Occitan and Breton and spread the Parisian dialect as the official variant of the French language throughout the country.
Eh, it’s “hard” for the U.S. to switch from customary to metric units because nobody really cares enough to do so. The forces causing language standardization (in many countries, not just China) are much more powerful.
The dialects of English (such as Scots) most certainly do not spell words the same as a general rule. The fact that there are slight pronunciation differences between different accents of standard English is not at all the same as writing entirely different words with the same sequence of characters.
For example, the word for "enemy" in Mandarin is "dírén", in Cantonese it is "dik6 jan4", in Shanghainese it is "dih nyin", and in Hakka it is "tit5 ngin11" (including different tone markers, I'm taking these translations from different sites). All of them would write it as "敵人". These words are far more different though than the difference in how an American and Englishman would pronounce "four".
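To make the "one spelling, many pronunciations" point concrete, here is a minimal sketch that stores the readings quoted above in a lookup table. The function names are made up for illustration, and the romanization schemes differ per variety, so the strings are only roughly comparable.

```python
# Readings of 敵人 ("enemy") exactly as quoted in the comment above.
# Each variety uses its own romanization scheme, so this is illustrative only.
READINGS = {
    "敵人": {
        "Mandarin": "dírén",
        "Cantonese": "dik6 jan4",
        "Shanghainese": "dih nyin",
        "Hakka": "tit5 ngin11",
    },
}


def read_aloud(word: str, variety: str) -> str:
    """Return how a speaker of `variety` would pronounce the written `word`."""
    return READINGS[word][variety]


def same_spelling_different_sound(word: str) -> bool:
    """True when one written form maps to more than one distinct reading."""
    return len(set(READINGS[word].values())) > 1


if __name__ == "__main__":
    for variety in READINGS["敵人"]:
        print(variety, "->", read_aloud("敵人", variety))
```

The written key never changes; only the value looked up per variety does, which is the asymmetry the comment is describing.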
The influence of loan words on English gives vastly different pronunciations of words even regionally. For extreme examples see "shibboleth", especially in the UK.
What I've found with some cursory googling is mostly place names, which I agree often have major differences between spelling and pronunciation. Even there, many of the examples on the Wikipedia page are still plausible spellings for the pronunciation, especially given how ambiguous English spelling is in the first place. Others seem more like nicknames that have essentially replaced the original full name of the place, while the full name is conserved in the spelling.
It’s worth noting that the first emperor of China was the one that unified the language. The country was at war for about 250 years during the Warring States period. Among the main pushes to maintain unification were a standardized writing system throughout the country, increased commerce, and a unified monetary system.
Also somewhat disputed, but the first emperor of China killed all the scholars from every other nation they conquered to facilitate the language unification.
This explanation doesn't make any sense. Your example is two words which are barely different at all in their pronunciation, certainly not sufficiently to cause unintelligibility (i.e. [wirk] vs [wak]). Differences in pronunciation of this kind are everywhere in English.
I studied Chinese for 2 years in University and hitchhiked mainland China in 2019.
A common misconception is that Chinese "makes more sense" because many characters look like what they mean. So you can guess what a new character means just by looking at it.
A downside is that for many Chinese characters it becomes impossible to know how to pronounce a new word. I've seen adult native speakers ask how to pronounce a new word many times. Oftentimes there are hints in the characters (the "phonetics" mentioned by the writer), but usually not enough to guess correctly.
English is also bad at this, ironically.
Spanish is really good at this, if not the best. When you come across a new word, it's 99.99% of the time pronounced how it's written.
I'm not who you replied to, but I have such a story.
I was in southern China for a couple weeks, using Google Translate to try to talk to locals sometimes. The voice-to-voice feature seemed useful and more approachable than asking someone to type on my phone. I'd often start by saying "Talk into the phone and it will translate."
This did not go well. People seemed bewildered, utterly unwilling to try it out. Some people already knew what I was doing but the rest seemed to think I was crazy. One or two tried taking the phone and holding it to their ear like they were going to have a normal phone conversation with someone far away.
I later found out that Google had translated my phrase as something like "There is a phone call for you."
> I later found out that Google had translated my phrase as something like "There is a phone call for you."
Language learning stories are always amazing because the perpetrator of the comedy is almost always entirely oblivious. Love your story.
My shining moment when I was learning Portuguese... I went to the pharmacy because I'd hurt my ankles hiking. They kept everything behind the counter, so I asked for some painkillers that would work for ankle pain.
The cashier gave me the weirdest look and the rest of the line started giggling. Eventually, after several attempts at repetition and variation, they gave me some ibuprofen - perfect. As I was leaving I processed my mistake - I'd accidentally swapped calcanhares (ankles) for calcinhas (panties), so my series of requests (as a male) had come out as: "My panties are killing me, do you have anything to help? ... Sorry, my Portuguese...I'll try again: My panties. Pan-ties. Pain. Pain in my panties. Panties pain. Medicine?"
Fun fact: the quality of Google Translate and Google Search in Simplified Chinese, and of Google Maps in mainland China, has plummeted over the past decade. Barely usable, if at all. I would suggest avoiding them at all costs for your peace of mind.
I've found ChatGPT 4o (the voice one) to be useful, I just need to prompt it first by saying `Whatever you hear in Chinese, say in English, whatever you hear in English, say in Chinese`. It refuses to cooperate if you tell it to be a live translator.
Anyways, it seems to understand context really well, I tried it out with a Chinese person and they said it worked really well.
It even works for Croatian which is awesome as it is a really small language that anything voice-related usually does not work with.
My next goal was Panzhihua, in Sichuan province. I hitched a ride (I can't remember where) with a group who invited me to a party. I was 21 and they were like late 20s. I graciously accepted their invitation, on the condition that they would take me back to the highway a few hours before sunset.
After a fun few hours in the car, we arrive at the party destination: a beautiful plantation of rice paddies and fruit trees. I see Escalades and Porsches parked along the long, dirt driveway.
Rural China is often extremely poor, even without running water. This is what made me come back from China less confident in their ability to take over America as #1.
But this place was in the middle of nowhere and very well kept, with a pool and some nice cabins.
We drink a lot at this girl's birthday party. I was used to being the center of attention by now and was quite fluent in Chinese at this time. We drank a lot of baijiu ("white alcohol"), which you must spill on the ground a little before you yell "clean cup!" and then drink.
We go out picking lychees and mangos in the mountains. I pack a bunch in my backpack for later.
Then the group graciously brings me back to the highway so I can hitch a ride to my destination.
I get picked up by an old farmer and his son. I hop in the truck a little drunk. The old farmer starts talking to me about George Washington, so I fumble a quarter out of my pocket and give it to him. He takes me straight to the center of Panzhihua.
By this time, I'm hungry, drunk and tired. I wander around the center and find a place to eat. After finishing my meal at the restaurant, the owner sits next to me and shows me his phone. My face is on the screen, smiling alongside all the others at the party. I had travelled at least 100 miles, but small world anyways.
I left the restaurant, looking for a place to sleep and it was raining. Finding a hostel or just a place to stay is notoriously difficult for foreigners in China. This time I wasn't even gonna try to find a hostel.
I see an almost finished construction job. Prime location to sleep, protected from the rain. Just as I'm about to put down my sleeping bag, I hear a guard yell "HEY, what are you doing!". He runs at me with his flashlight. I grab my things and sneak off into the night.
I wander the city, unable to find even just an awning to sleep under. I'm soaked. I decide to sleep on some sidewalk steps under a few trees.
A couple out on a date talk enthusiastically until they see me in the fetal position on their sidewalk. They hush down the steps.
I hope to finally get some shut-eye. Just as I'm about to fall asleep, I feel a slight itch... then fire throughout my body.
I leap up and take off all my clothes. I find I'm covered in fire ants. They were after the lychees and mangoes I had in my bag.
So there I am, in the middle of Communist China, NAKED, on the side of the road, soaking wet and covered in ants.
I really did stand in the rain and contemplate my existence with my balls hanging out.
I swipe the ants off my bag, put on my clothes and leave those stairs.
I had enough. I walked right across the street and knocked on the door of a security guard's station.
"Hi, I just wanted to let you know I'm standing here under your awning"
The security guard gaped in amazement. After a few questions we just stared at each other. I wasn't going to take no for an answer.
Then he graciously invites me into his guard-station, and lets me sleep in his chair.
The next day, I bought him breakfast. I am eternally grateful to him.
This was just 24 hours in a 90 day trip. Crazy experiences immediately precede and follow this story.
I hitch-hiked around Europe in the mid-nineties (when English wasn’t commonly spoken in Central/Eastern Europe) so I appreciated this story. I stayed in a hostel most nights (unless I was offered a place to stay) so my experience was nowhere near as adventurous as yours. Thanks for the anecdote.
Japanese is great for this also, at least for signage written in Katakana or Hiragana. If you learn those alphabets you will always know how to pronounce words written in them.
There are exceptions, certainly in colloquial speech variants, 雰囲気 and 原因 come to mind. Both have an annoying ん ("n") in the middle that makes them a pain to pronounce even for natives, so the first gets changed from ふんいき to ふいんき and the latter from げんいん to げいいん. So, if written properly in hiragana you'd know how to pronounce them "properly" but not necessarily as they're actually used by a significant amount of people when spoken. Even the auto-kanji-conversion of my keyboard respects the variants.
I can accept ふいんき as an exception, but for 原因 that's just a consequence of the rules for how to pronounce ん, which depends on the surrounding sounds (it's not quite い either). I wouldn't really call it an exception. Consider 一千円, 禁煙 etc all of which are pronounced like that.
There are two distinct pronunciations, げんいん and げいいん, the first is the standard, the second is a variant. The only consequence of the ん in the second one is that it has been replaced because it's a pain to pronounce in that position. That there are other words with ん in a similar position (一千円 not being one of them) is by the by, 原因 and 雰囲気 are two of the most frequently used words in the language (in the first 800), 禁煙 is used far less (in the first 5000) so under less pressure for adaptation.
This was a big reason why the Korean alphabet was invented: literacy was so poor for the reasons you mentioned.
A lot of terms/loanwords from the Chinese language can be found in all neighbouring countries, but you'd have to be part of the aristocracy to get the schooling.
Japan still uses it but North Korea banned it out of the gate. South Korea slowly phased out use of traditional Chinese characters. It was common to see Chinese characters up until the late 00s, but they were definitely used a lot more sparingly.
Curiously, French is also good at being "pronounced how it's written", but horrible the other way around, writing down what you hear.
With only a few exceptions, you can always read out loud a written word. There are quite a lot of rules, but they are rather strict. Once you know them, you are good to go and read everything out loud.
But if you want to learn French by "listening and looking up the words in a dictionary" - good luck with that. There are multiple ways to write down the same sounds. You hear [ku] and it can be "coup", "cou", "coût"...
> Oftentimes there are hints in the characters (the "phonetics" mentioned by the writer), but usually not enough to guess correctly.
Yeah, it's unfortunately not enough to sightread purely from radicals. My girlfriend has been trying to teach me and it's the biggest frustration. The discussion keeps going "this radical is how you know it's pronounced ___, oh, but not in this one, that's pronounced entirely differently".
If I don't know a character, the phonetic radicals might let me guess close but not correctly or it might not have anything in common. The semantic radicals are a little better IMO, but not enough to guess more than the category something might be in sometimes. I'm not sure if the rules are just full of exceptions or if it's simply change over the past millennia, but it means rote learning has been the only way for me to learn hanzi.
Yes, anyone can sightread pinyin after an hour or two, but turning everything into it won't really help me learn to read Chinese as the language is actually used.
> Winston Churchill would be represented by hanzi that would be transliterated Wensuteng Chuerqilu.
reminds me of one of my favorite throw-away gags in George Alec Effinger’s A Fire in the Sun, a cyberpunk novel set in future Arabia, a character quotes “the great English shahrir, Wilyam al-Shaykh Sābir”
I've spent just enough time studying that language in the last few months that I am calling it "Zhongwen" in my head and find it hard to write "Chinese" instead of 中文.
Certainly if Chinese people met English speakers when English speakers didn't have a writing system they'd find a way to write English in Chinese characters the same way they did for Japanese circa 950AD and that they've done for several languages unrelated to "Chinese" that are written with those characters.
The effort in that article goes in the direction of making something regular that works a lot like "writing Chinese in Chinese characters" but it seems to me more likely to go in the more complex direction of preserving Chinese semantics at the expense of phonetics that happens when you "write Japanese with Chinese characters".
Note that modern Chinese is heavily influenced by modern English. During the May Fourth Movement[1], prominent authors like Lu Xun diligently explored how to write modern Chinese in a "westernized" style. The experiment largely failed, but modern Chinese did get influenced a lot, to the point that multiple authors wrote books or articles pleading people not to write "westernized" Chinese. A typical example is nounification of verbs, something that traditional Chinese never had. In contrast, younger generations love to say the Chinese equivalent of something like "do improvement" instead of "improve" (进行改进 instead of 改进), even though it is still considered bad writing style.
I haven't seen any credible claims of how Chinese was actually "westernized" aside of Chinese writers seemingly randomly pointing fingers and blaming European influence for a change of style initiated by the younger generation.
Even your example shouldn't convince anyone, since in English we don't say "do improvement", and generally English doesn't overuse "do <noun>" much.
Yup, 余光中 is one such author who urges people to write authentic instead of "westernized" Chinese. I wouldn't call it "blame", though. Just like in English, there's a certain standard for good writing. In Chinese, 白话文 is really the way to go. That's why among the authors in the May 4th movement, 钱钟书's modern Chinese is just so much more pleasant to read than Lu Xun's. If you don't believe me, try reading his translation of Nikolai Gogol's Dead Souls. Oh my...
Joan Pinkham wrote a book titled The Translator's Guide to Chinglish[1]. It's a fun book to read and really teaches native Chinese speakers how to write good English. Interestingly, a lot of the lessons in that book can be applied to Chinese writing too, such as avoiding nounification, and removing duplication. For instance, just say "dislike" instead of "to have a dislike for". And instead of saying "give guidance to", just say "guide".
I maintain that "blame" is the right word. He doesn't establish any real causal connections between poor Chinese writing and European linguistic influence, so it seems to me that he is just pointing fingers at some convenient villain (European/English influence) instead of admitting that maybe people are just bad at writing.
Or it occurred at a time when literacy rates massively increased, and thus there was some vulgarization of the language to some extent.
I think he explains how to write better Chinese pretty well, I just don't think pointing fingers at other languages is helpful especially if it isn't grounded in fact or evidence.
I can feel a sort of double translation shim hiding in that example. Chinese "改进" would work as a verb on its own, but dictionary-wise it tends to map mentally to the noun form "improvement" rather than the verb form "[to] improve", and that necessitates a "do" verb, which plausibly results in bloated forms such as "进行改进" (forward going [with] improving forward).
usually this effort happened in the opposite direction, where Japanese people adopted the writing system of China since they didn't have one yet and needed to communicate diplomatically with China.
The hanzi approach is the most historical one. The problem is that it is generally not intuitive for vernacular languages. Even non-Mandarin Sinitic languages like Cantonese look wildly different between the standard writing form (which is just Mandarin) vs. writing the spoken vernacular form. The closest Western equivalent would be everybody in the European Dark and Middle Ages using Latin.
> I've spent just enough time studying that language in the last few months that I am calling it "Zhongwen" in my head and find it hard to write "Chinese" instead of 中文.
I'm rather curious about this actually - do other bilinguals experience this? I first acquired French as a child and spent several years taking formal French lessons, and yet I simply call it French when I think, speak or write in English. A learner finding it hard to write "French" instead of "Le français" in an English sentence would come off as more than a little overzealous to me.
i'm bilingual and have absolutely not experienced this. that being said, i think being bilingual - rather than a learner - does make it a lot easier to code-switch instinctively between different languages.
It wouldn't sound like English if they did. Both Japanese and Korean share a huge amount of vocab with Chinese, and you can tell because all the words sound similar too. So, they didn't just take the characters, they also took an adapted version of how to pronounce those characters.
The amount of Chinese words is usually much greater in written Japanese or in legal/technical/scientific jargon than in casual spoken Japanese.
The proportion of Chinese words also depends on the manner of accounting the compound words and the inflected words.
I have seen the claim that the modern Japanese vocabulary consists roughly of 45% Chinese words, 45% native Japanese words, 8% English words and 2% words borrowed from other languages, mainly from Portuguese or other European languages.
I do not know what assumptions have been used to compute these percentages, but they are similar to the frequencies of the words encountered in many modern Japanese texts.
There are a lot of modern Japanese words that are composed of two, three, four or even more Chinese words. If all the compounds were counted separately, the proportion of Chinese words in the Japanese vocabulary could increase a lot.
btw this is true for historical arguments, but actual size of workable shared vocabulary in common use is tiny - famously "手紙(hand paper)" means postal letters in Japanese, and toilet paper in Chinese.
I never considered how a Chinese speaker (writer?) would deal with a foreign language that doesn't have a written form. What did they do in 950AD? Surely there's some way of transcribing sounds in languages like Chinese for foreign languages.
then in the next 30-50 years or so they developed the system that we know today, which uses the kana in a secondary role. In Japanese, for instance, you tend to put the verb at the end of the sentence and the “stem” of the verb is usually written in Chinese characters which often mean the same thing they would in Chinese, but a few kana are added at the end to specify the tense of the verb and similar attributes. I think a Chinese speaker would recognize many characters which basically mean the same thing as in Chinese but Japanese adds new characters which are important grammatically.
The character の for instance can be used in spelling out bigger words phonetically but it is usually used for the word “no” which roughly means “of”. (It’s good to know because any substantial Japanese text will use it so it’s an easy tell of what language you’re looking at)
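The "easy tell" can be sketched in a few lines: kana such as の live in their own Unicode blocks (Hiragana U+3040 to U+309F, Katakana U+30A0 to U+30FF), so checking for them is a cheap first guess at whether a text is Japanese. The function names are made up for illustration, and this is only a heuristic: a Japanese phrase written entirely in kanji would be misread as Chinese.

```python
def has_kana(text: str) -> bool:
    """True if any character falls in the Unicode Hiragana or Katakana blocks."""
    return any(
        "\u3040" <= ch <= "\u309f"      # Hiragana block
        or "\u30a0" <= ch <= "\u30ff"   # Katakana block
        for ch in text
    )


def guess_language(text: str) -> str:
    """Rough guess: kana present -> Japanese; Han characters only -> Chinese."""
    if has_kana(text):
        return "Japanese (probably)"
    # CJK Unified Ideographs, the main block shared by hanzi and kanji.
    if any("\u4e00" <= ch <= "\u9fff" for ch in text):
        return "Chinese (probably)"
    return "unknown"


if __name__ == "__main__":
    for sample in ("私の本", "我的书", "hello"):
        print(sample, "->", guess_language(sample))
```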
Chinese does have its own characters that play a similar particle role though
the one that sticks out to me is 了 which is pronounced “le” and is used in sentences that are describing a change in a situation as opposed to describing an unchanged situation.
"Its first origins can be traced back to a Proto-Sinaitic script developed in Ancient Egypt to represent the language of Semitic-speaking workers and slaves in Egypt. Unskilled in the complex hieroglyphic system used to write the Egyptian language, which required a large number of pictograms, they selected a small number of those commonly seen in their surroundings to describe the sounds, as opposed to the semantic values, of their own Canaanite language."
Hiragana is used for phonetically spelling Japanese words, Katakana is used for phonetically spelling foreign words.
For instance there is the popular anime titled “Sailor Moon” which is written in Katakana like セーラームーン and stranger still the second season is called “Sailor Moon R” (I think for “return”) and is written セーラームーン R.
My understanding is that Bopomofo never caught on particularly well
it is still used in education in Taiwan but in Taiwan, the mainland and the rest of the sinosphere people who want to spell out Chinese words are more likely to use Pinyin
As I understand it, they incorporate foreign words by matching up with Chinese characters bearing the matching syllables with some preference for preserving semantics, repetition, and an implied semantic anyway that is in the consciousness of the reader for either the native or foreign word: it must matter that a person’s name is spelled “stream creek creek” but in the end it is best kept in 中文.
This is the modern usage of hiragana and katakana.
Earlier, e.g. in the books written a few hundred years ago, hiragana was used like today, i.e. inlined with kanji, for writing verb and adjective terminations or various grammatical particles, but katakana was used as furigana, for showing the pronunciation of the more seldom used or ambiguous kanji, especially when the Chinese reading was used for those kanji.
The use of katakana for the writing of foreign words has developed from its use for writing the approximate pronunciation of Chinese words.
The writing reform enforced after WWII has introduced significant changes in the use of hiragana and in the correspondences between kanji and hiragana (and also many kanji have been substituted with simplified variants), so reading a pre-war book can be challenging for someone familiar only with modern Japanese.
Unless things have changed in the last 15 years, when I lived there most of my Taiwanese peers used bopomofo aka 注音 as their primary IME for smartphones / laptops / etc.
The few English ex-pats that I knew used the 搜狗 software which defaults to pinyin I think.
I mostly used pinyin on my laptop and bopomofo on my phone (old model that didn't support pinyin) which was mildly annoying. I constantly got it confused with my Japanese since I also read hiragana/katakana and some of the symbols are highly similar.
This is still mostly true. Kids books also have bopomofo rubies, like the kana rubies in Japanese. And occasionally you'll see bopomofo as a typographic choice to represent a sound that feels more natural in Taiwanese amidst an otherwise Mandarin sentence.
This is just my personal experience, but I think the big change in the past 15 years isn't Bopomofo -> Pinyin, but rather Wade Giles -> Pinyin. Bopomofo seems equally prevalent, but the Wade Giles romanizations on street signs have begun to be replaced with Pinyin for the sake of non-native speakers who are almost certainly more familiar with Pinyin than WG.
Ah, my source was a co-worker from Hong Kong, which is apparently a stronghold of Bopomofo. A little ironic that the more westernized island would prefer non-romanized spelling... politics with the mainland I guess. He mentioned Pinyin when talking about input methods though.
He was definitely an exceptional and somewhat eccentric person, but it's probably my recollection that's muddled. He gave a presentation on Chinese character sets and encodings like big5 vs gb2312 vs unicode (we were an antispam shop, so very useful info) and also covered input methods, mentioning bopomofo and pinyin. The bopomofo topic segued into the four tones of Mandarin, so we spent the day saying "mā má mǎ mà" a lot, probably sounding like a Chinese kindergarten :)
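For anyone who hasn't run into the big5 vs gb2312 vs unicode problem, here is a minimal Python sketch of why it mattered. It assumes only the codecs that ship with Python under the names big5, gb2312 and utf-8; the sample string is my own pick, not something from that presentation.

```python
# The same two characters, serialized under three different encodings.
text = "中文"

encoded = {name: text.encode(name) for name in ("big5", "gb2312", "utf-8")}

for name, raw in encoded.items():
    # Identical text, different byte sequence under each codec.
    print(f"{name:7} {raw.hex(' ')}")

# Decoding with the codec that wrote the bytes gets the text back...
assert all(raw.decode(name) == text for name, raw in encoded.items())

# ...while reading Big5 bytes as UTF-8 fails outright.
try:
    encoded["big5"].decode("utf-8")
except UnicodeDecodeError as err:
    print("wrong codec:", err.reason)
```

Without a reliable label on the bytes, a filter has to guess the codec first, which is presumably why the topic was useful for antispam work.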
> Surely there's some way of transcribing sounds in languages like Chinese for foreign languages.
Same as in all other languages, with an extra step: first, sounds from the source language that are missing from the target language are mapped onto their closest phonetic counterparts in the target language according to the phonetic rules of the target language.
The extra step: a Chinese character is assigned to each syllable on the «it sounds like this character» principle, multiple characters are pieced together, and a new borrowed word is born. The assigned Chinese characters bear no relevance to their semantic or well established meaning.
For example, 保羅 is the Chinese for the English name of «Paul» which is pronounced as:
Mandarin: Bǎoluó
Cantonese: bou2 lo4
Hakka: Pó-lò
Hokkien: Pó-lô
Teochew: bao2 lo5
The meanings of 保 (to defend; to protect; to keep; to guarantee; to ensure etc) and 羅 (to collect; to gather; to catch etc) have nothing to do with the word being transcribed, nor with the original Latin meaning of Paulus («small», «humble», «least» or «little»); the characters are picked for their sound alone.
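The two steps can be sketched in a few lines of Python. This is only a toy under stated assumptions: both lookup tables are hypothetical and filled in by hand for the one name used here, and a real transcription relies on convention and taste rather than a lookup.

```python
# Step 1 (hand-made assumption): the closest Mandarin syllables for the
# sounds of the foreign name.
NEAREST_SYLLABLES = {"Paul": ["bao", "luo"]}

# Step 2: one character per syllable, chosen for its sound only.
SOUND_CHARACTER = {"bao": "保", "luo": "羅"}


def transliterate(name: str) -> str:
    """Piece together the sound-alike characters for a known name."""
    return "".join(SOUND_CHARACTER[s] for s in NEAREST_SYLLABLES[name])


print(transliterate("Paul"))  # 保羅
```

Note that nothing in either table records what the characters mean, which is the whole point of the example.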
Just use the closest Chinese characters to represent a syllable / sound, and slightly modify them if they have to be distinguished. Actually this is not just for foreign languages. It was how Chinese itself evolved.
An example from around then or a bit before, arising from the spread of Buddhism in China.
The name of Vairocana Buddha, whose name in Sanskrit is वैरोचन and means 'solar', is given in Chinese in two forms:
1. 大日如来 Dàrì Rúlái
This form is a loan translation, where the meaning of the Sanskrit is translated bit by bit into Chinese (日 means 'sun').
2. 毘盧遮那佛 Pílúzhēnà Fó
This form approximates the sound of the Sanskrit word using Chinese characters. It's typical of Chinese phonetic translations, which still today largely just use characters, and thus often don't sound much like the word in the original language.
To add to this, Chinese (and Japanese) Buddhist scholars' emphasis on retaining the original pronunciation of chants led them to hang on to the phonetically written Siddham script in which their masters had learned the scriptural texts in universities in India.
I guess it depends on whether Chinese speakers collectively like you or not.
E.g., Germany (Deutschland) => take the initial sound "De" => find a good character with that sound => Germany becomes "De Guo" ("De" nation), or 德国, which could also mean "virtuous nation."
But the once great nomadic tribe of Xiongnu, who rivaled the Han Empire, will be forever known as "Barbarian (匈) Slaves (奴)" - it doesn't help the first character contains 凶 (unlucky, horrible, evil).
Turkish used to be written using Arabic characters before the Turkish Republic. Now, the Latin alphabet is used. So, it was fairly easy (probably not) to switch alphabets for the same underlying language.
And 29 characters are sufficient to represent the sounds of the language with a couple of controversial accents like '^'.
The sounds in Chinese are probably very different and nuanced, but for Turkish, I am always surprised that sounds were very similar to European languages so switching alphabets was possible.
The issue is the opposite. Chinese has a comparatively small phonetic palette (depending on how you view tones). Chinese written completely phonetically can easily become incomprehensible.
The functional load of tones (that is, the importance of a pronunciation difference for distinguishing words) in Chinese is comparable to that of vowels[1]
"Depending on how you view tones" dismisses the important phonemic value of tones. Writing Chinese completely phonetically includes writing the tones.
Curious to know how you think it's possible that Chinese people are able to speak with each other if you think writing their language phonetically would render it incomprehensible.
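Writing the tones is also mechanical. A rough Python sketch that turns a syllable plus tone number into pinyin with a tone mark; the placement rule used here (a or e takes the mark, then the o of "ou", otherwise the last vowel) is the usual textbook one, and the function name and the lowercase-only input are my own assumptions.

```python
import unicodedata

# Combining marks for tones 1-4: macron, acute, caron, grave.
TONE_MARKS = {1: "\u0304", 2: "\u0301", 3: "\u030c", 4: "\u0300"}
VOWELS = "aeiouü"


def mark_tone(syllable: str, tone: int) -> str:
    """Return a lowercase pinyin syllable with its tone mark (5 or 0 = neutral)."""
    if tone not in TONE_MARKS:
        return syllable  # neutral tone is written without a mark
    if "a" in syllable:
        idx = syllable.index("a")
    elif "e" in syllable:
        idx = syllable.index("e")
    elif "ou" in syllable:
        idx = syllable.index("o")
    else:
        # Last vowel; raises ValueError if the syllable has no vowel at all.
        idx = max(i for i, ch in enumerate(syllable) if ch in VOWELS)
    marked = syllable[: idx + 1] + TONE_MARKS[tone] + syllable[idx + 1 :]
    # Compose letter + combining mark into a single code point where one exists.
    return unicodedata.normalize("NFC", marked)


print(" ".join(mark_tone("ma", t) for t in (1, 2, 3, 4)))  # mā má mǎ mà
```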
I find it hard to believe that every language on the planet can be successfully expressed using some sort of alphabetic system, except for Japanese/Chinese (and local variants).
They could be written alphabetically, of course. The question is just what you lose, given that the characters are a massive part of Chinese and Japanese culture.
Chinese has evolved alongside its writing system for about 3000 years, and switching over to pinyin (the standard Latin transliteration) would be a complete revolution in Chinese culture.
I was responding to the assertion that writing Chinese using an alphabet would render the text incomprehensible, not arguing that China or Japan should switch over.
Then you might be even more surprised to hear the Latin alphabet is the official way to write Chinese phonetically (in mainland China), and is used to write on computers and phones. It's called pinyin.
Whether any language could be written like Chinese has the same answer: the written form of Chinese was not necessarily meant to be phonetic, although there are portions of it that have evolved to be phonetic. The characters have meanings and the grammar is very fluid, to the point where a sequence of characters strung together (such as in poetry) can be interpreted and debated.
Cantonese and Mandarin are considered dialects, so I won't use that as an example, but this problem has already been solved in Korean. For a long time, Hangul did not exist and Korean scholars used Chinese as the written system despite speaking in a completely different language. This is obviously an old article (1999), but the fact that it doesn't consider how this is a solved problem from a real historical use case makes the musing incomplete.
Ah, I perhaps should have read all the comments before posting here, because it seems that you're answering my question, and confirm this idea that phonetic interpretation of written Chinese is a "recent" development.
This idea seems to be foreign to all native Chinese speakers I've encountered, and this seems to be in contradiction with what I can grasp from research.
If I may, I've got another related question: Chinese speakers all parrot this idea that literary Chinese is to modern (let's set aside character simplification) Chinese what (ancient?) Greek is to English.
But it's not my impression, at all; my intuition is that they properly understand neither Greek nor literary Chinese. For example, a modern Chinese speaker can be expected to read literary Chinese and at least make some sense out of it, but a modern English speaker won't even be able to read (ancient?) Greek, let alone interpret it.
>confirm this idea that phonetic interpretation of written Chinese is a "recent" development.
As I understand it, this is a recent development as in the science of language is a recent development. We might not have known about it but it was always there.
I think the comparison with Greek or Latin is a good one. I can read modern French and Chinese, and my understanding of Latin and Classical Chinese is about the same: virtually nonexistent, at most a word here and there. The reason why Chinese people understand it is because they learn it at school.
There's no great analogy where the modern language is English, because English does not position itself as a successor of a long, linguistically-continuous literary tradition. That said, there is certainly an Anglo tradition of education in which well-educated schoolboys were expected to be able to puzzle through a smattering of horribly butchered Greek and Latin.
I'm not sure we all that much need a (1999) on an article inflecting "bodacious" without apparent irony. I suppose it helps in the "before" direction, but I certainly wasn't going to assume this was written much after...
I've recently read the "Remembrance of Earth's Past" trilogy in English and the first thing that struck me was how different the dialogue (both internal and external) felt compared to novels that were originally written in English.
Been wondering if it had anything to do with the way language structures differ between Chinese and English.
I can't entirely buy this line of reasoning because it depends on rhyming/sounds-like reasoning, but Mandarin and Cantonese do not sound alike. And I would not expect all sound-alike root terms to work in both. I mean, French and English mostly sound alike, but in no way does sun/son translate to soleil/fils.
> Instead we'll use it only for king, which will be the phonetic for this set, and add little signs called radicals to distinguish the rest
This is something I would dearly enjoy to have a genuine expert opinion on. I've looked at some research[0] (§1.3 in particular), and as far as I understand, the idea that radicals are essentially/purely phonetic doesn't match with historical records.
Meaning, if I understand correctly, characters used to be systematically considered as semantic combinations, with authors "debating" about the proper way to interpret some characters (again, see §1.3 where Xu Shen proposes different interpretations than Confucius's).
I've always learned that there are sound components and meaning components; radicals are just an afterthought to make dictionaries work.
I don't think we need to look too much about what ancient scholars thought about characters, they were not around when the characters were created, were not trained to do science and didn't know as much about the evolution of languages as us.
> I've always learned that there are sound components and meaning components
The Shuowen indeed already mentions sounds. But the fact that there are sound components isn't contradictory with them being also semantic.
> I don't think we need to look too much about what ancient scholars thought about characters
Let's say that I enjoy the study of ancient thought processes, and that some of it seems to be recorded in etymology & the like.
> were not trained to do science
I believe their approaches were different, more "naive" perhaps, which might be the source of confusion. Let me give you an example where we currently fall short, and where they did not.
Consider political organisation: from Plato (The Republic) we know that they studied the way one political system changes to another, and studied the causes of this evolution. Yet, the overwhelming majority of contemporary people have been "in-doctrinated" to think that democracy is paramount.
Not only are we disregarding history and previous data, but most businesses, where we all spend a considerable amount of time, for subsistence, are fundamentally non-democratic: we're also severely inconsistent.
Perhaps ancient Greeks didn't have modern means (e.g. peer-reviewed papers, social "sciences"), but they nevertheless captured the gist of it, while we essentially do not.
>But the fact that there are sound components isn't contradictory with them being also semantic
No, I mean only sound, not semantic.
>Perhaps ancient Greeks didn't have modern means (e.g. peer-reviewed papers, social "sciences"), but they nevertheless captured the gist of it, while we essentially do not.
You're making a very unfair comparison; the average ancient Greek knew even less about politics than we do. I don't follow your conclusion at all.
But that's my point regarding the paper mentioned above: I have the impression that this viewpoint doesn't match with the way old beards used the language:
> In other words in each of these quotations, the author uses the graphic structure of a character to represent a key notion in the discursive reasoning to support or confirm a reality or a fact. The meaning of the character can be systematically related to the meaning of the graphic components
> [...]
> According to these texts, characters are analysed into pure semantic components: “west” and “rice” for “grain”; “cereal,” “entering” “rice” for “broomcorn millet”; “eight” and “ten” for “tree,” and the choice of the components is essentially explained in terms of the Yin/Yang and Five Elements theories.
Of course, I am well aware of the modern viewpoint, which disregards such systematic semantic interpretations, but I am wondering whether I understand the paper correctly: « were character components really always understood semantically in the past? »
And I am cautiously wondering about it, precisely because all the Chinese speakers I've talked to shrug the idea off (« am I missing something? »).
> You're making a very unfair comparison
Alright, but we were talking about ancient scholars, it seems fair to compare them to modern scholars. Let me try to rephrase myself. My point was multiple:
(1) ancient scholars were trained to do science, just not the way we do
(2) not being "trained to do science" in the modern sense isn't sufficient to disregard their viewpoint;
(2') the converse is also true: being "trained to do science" in the modern sense isn't sufficient to validate a viewpoint. Example: many trained scientists or engineers will hold democracy in high-esteem despite the previous "research" & history.
That's to say, I believe it's well worth studying their viewpoints, despite the fact that they "weren't trained to do science," as commonly understood today.
> The basic principle will be, one yingzi for a syllable with a particular meaning
The problem is that the English language does not have this structure, whereas Chinese does.
The individual syllables in Chinese words have their own independent meanings. If you break apart the English word "random," "ran" and "dom" don't independently mean anything. The reason why Chinese characters work well for Chinese is because the fundamental unit of Chinese is really the syllable, rather than the word. English isn't like that, so a Chinese-like writing system would be a very poor fit to the English language.
That being said, I sometimes wonder whether this property of Chinese comes from the writing system, or whether it preexisted writing. In other words, to what extent has Chinese been influenced by its writing system?
> It's as if the US had its own versions of a large fraction of English yingzi.
English is a foreign language for me. I don't know how native speakers see it, but to me it does sometimes feel like US English is the "simplified" one compared to British.
This isn't language, it's accent, but in America the vast majority of speakers merge lots of vowels - cot/caught are merged, Mary/marry/merry are merged.
> I've attempted in this sketch to lay out, by analogy, the nature and structure of the Chinese writing system. All of the concepts apply
Do they? I think there is one section that has nothing whatsoever to do with how Chinese works, namely "Inflections". Chinese does not have them, at all. I guess the author felt compelled to at least give a token acknowledgement to the concept, even while it was irrelevant to what he was really going for (a parable about the Chinese writing system).
Interestingly Chinese did have things like this at the time the writing system formed, and they simply weren't written. You can see remnants of the old morphological system today in characters that alternate in tone.
For example there used to be a suffix -s that turned a verb into a noun. During the Han Dynasty this turned into -h, and then later turned into the falling/departing/fourth tone. Hence 教 jiāo is the verb "to teach", from Old Chinese *kraw, and 教 jiào is the noun "teaching" from Old Chinese *kraws.
> Winston Churchill would be represented by hanzi that would be transliterated Wensuteng Chuerqilu.
If we actually used Chinese characters, we would write Churchill with meaningful hanzi and not strictly transliterate. Though there'd of course be variations as there are variations in spelling.
The only Chinese I know is through Japanese, but I imagine Churchill would look something like 教丘
Or English writing could go the other way and become phonetic like Italian and Romanian, where each letter denotes a sound, and "to", "too" and "two" would all become "tu".
> So two, to, and too will each have their own yingzi..... We can simplify the task enormously with one more principle: syllables that rhyme can have yingzi that are variations on a theme.
Could anyone find a live link to the "Belorussian translation" mentioned at the beginning?
I wasn't able to find another version with search engines or the Wayback Machine.
Can someone explain the linguistic logic to me like I'm five? I've read it through twice and I don't understand how the connection is made between the characters and the English language.
It's an imagining of how the principles used to construct Chinese characters could be applied to English. The author names these hypothetical English characters yingzi, after hanzi.
As for the article, I believe one of the reasons English, and by extension the US, ended up "owning" the computer revolution was that it was a large language with a simple alphabet. It has fewer letters than many other large languages and was easily coded into the tiny computers of the 40s and 50s.
I think there were a lot of historical contingencies involved. The nature of the language itself (small vs large alphabet, etc) is probably one of the least important.
There was an interesting BBC article a while back about the decline of German usage in science: https://www.bbc.co.uk/news/magazine-29543708 They put it down to anti-German sentiment during WWI (not WWII as I would have assumed!)
It’s pretty easy to imagine an alternative world where German was the common international language of science and became the basis of most programming languages too.
The importance of German as a scientific language was even greater before WWI than claimed in that BBC article.
While the BBC article estimates that of the scientific literature of the 19th century and of the pre-WWI 20th century a third was published in French, a third in English and a third in German, I believe that an estimate much closer to reality would be a quarter in French, a quarter in English and a half in German.
That doesn’t make them American any more than Amazon is Japanese because they have a Japanese division.
The companies I listed are Asian businesses that make hardware in Asian regions for Asian consumers and also have a massive presence in the west too.
I could also list multiple Asian companies that don’t have a large presence in the US, but then you wouldn’t have heard of them, so what would the point of that be?
Some of the comments in this thread smack of “I’ve never needed to use anything outside of America so I just assume English-speaking businesses are the only thriving industries”.
All 3 languages have a few more letters than English. Some of those languages' letters are marked with diacritics.
Plus the first computers used only upper-case letters, which were designed similarly to how Ancient Latin letters were carved in stone. So it was far easier to design printers, storage, and punch cards when you only care about 26 letters, 10 digits and a few punctuation marks.
The “first computers” (which can actually mean a lot of different things depending on what you’re defining as a computer) didn’t output text at all.
Even in the digital era, they output binary rather than text. In the 50s it would have been machine code in and machine code out.
Punch cards would have been binary: initially machine code, but later text encodings in binary. Given that machines back then weren’t fixed at 8-bit bytes, you could have larger or smaller character sets.
There were plenty of Japanese, Chinese and Russian computers using non-Latin characters. There were also plenty of European computers that supported native characters outside of the standard 26 English letters too. That’s why character encodings were such a nightmare to work with prior to Unicode’s adoption (and frankly, Unicode creates a new set of problems, but that’s a different topic entirely).
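The relation between character width and character set size is just powers of two. A small Python sketch of the arithmetic; the helper name is mine, and the symbol counts are the ones quoted upthread rather than the spec of any particular machine.

```python
def bits_needed(symbols: int) -> int:
    """Smallest number of bits that gives at least `symbols` distinct codes."""
    if symbols < 1:
        raise ValueError("need at least one symbol")
    return (symbols - 1).bit_length()


# 26 letters + 10 digits, before any punctuation is added
print(bits_needed(26 + 10))  # 6
# How many codes a 6-bit character actually provides
print(2 ** 6)                # 64
```

So a 6-bit character leaves 28 spare codes after upper-case letters and digits, and each extra bit doubles the room for lower case, accented letters or another script.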
Just because you haven’t had to work with non-Latin, or even non-American, encodings doesn’t mean that all machines were English-centric.
Most of the lowercase Cyrillic letters are just smaller versions of the uppercase ones. And neither the German nor the Russian alphabet is larger to any extent that matters.
CJKV languages are a somewhat subtler subject. Vietnamese nowadays uses a Romance-inspired alphabet, where only the tone marks are slightly difficult to typeset. Japanese and Korean could have gotten rid of Kanji/Hanja if they really wanted. But in any case printing technology for Chinese characters existed and was in wide use at the turn of the 20th century.
French and German are both completely intelligible when written in all caps with no diacritics (in French by simply omitting them, in German by replacing e.g. Ü with UE).
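Both conventions are easy to sketch in Python with the standard unicodedata module. The function names and the sample phrases are mine; the German table covers only the umlauts and the capital sharp s.

```python
import unicodedata

# German convention: spell each umlaut out as the base vowel plus E.
GERMAN_CAPS = {"Ä": "AE", "Ö": "OE", "Ü": "UE", "ẞ": "SS"}


def german_all_caps(text: str) -> str:
    upper = text.upper()  # str.upper() already turns ß into SS
    return "".join(GERMAN_CAPS.get(ch, ch) for ch in upper)


def french_all_caps(text: str) -> str:
    # Decompose accented letters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFD", text.upper())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(german_all_caps("Müller fährt über die Straße"))
# MUELLER FAEHRT UEBER DIE STRASSE
print(french_all_caps("élève à Noël"))
# ELEVE A NOEL
```

Ligatures such as Œ have no Unicode decomposition, so this French version leaves them untouched.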
I think English spelling is a pain but overall I think 中文 is worse. There are an awful lot of characters to know for one thing. My understanding is that you have to work pretty hard to be literate in 中文. Also I understand that the horizon where 中文 gets hard to read is closer in time than it is in English, that is, even 150 year old texts are difficult to read and reading Confucius is harder than reading Chaucer. Contrast the problem of memorizing irregular spellings to the problem of remembering how to draw thousands of characters.
which is kind of a cleaned-up Latin. As a native English speaker who knows a lot of Latin roots from science and literature and who also has gotten good doses of Spanish and French, I can read the sample texts in that Wikipedia page right off the bat.
It is a strange thing though that you can go through the first few chapters of a book on 中文 grammar and get the feeling there is something refreshingly regular about that language, almost as if it was an artificial language. When you get towards the end you find there are many things that will strain this sense of quick familiarity and I am told if you actually try to use the language it's harder than you think.
The curse of Chinese characters is that they unite the country. You can't get rid of them because Chinese then becomes 10 languages, and they're suddenly broken up like Europe. If everything were Mandarin, you could Romanize or Bopomofize no problem (although with much grumbling of course.) But it's not, and probably not going to be as far as I know; it's like Hindi. It's the writing that makes China one country.
For a non-native speaker, sure, that's not obvious. For a native, especially if they're well read and have a decent vocabulary or have studied any Latin it's just fine. The many styles of englishizations of foreign words (romanization just doesn't quite fit here) and obsolete-in-terms-of-pronunciation spellings provide valuable context for pronunciation and meaning to people who have sufficient familiarity.
https://history.stackexchange.com/questions/46658/did-china-...