When borrowing from Chinese isn't "Sinoxenic"
7 February 2016
1. What is Sinoxenic?
In around October last year, a rather curious question appeared on Quora:
If the Ainu language were to borrow words directly from Chinese, how would they differ from the other Sino-Xenic vocabulary while taking both certain Proto-Ainu sounds into account (like certain vowels) & modern survivors?
This interesting hypothetical question contains a flawed assumption: that vocabulary directly borrowed from Chinese is automatically "Sinoxenic vocabulary". This is not, in fact, the case. So-called "Sinoxenic vocabulary" is the result of a very particular mode of borrowing involving the wholesale adoption of the Chinese script. This is a very striking, possibly unique, mode of borrowing, and plays a large part in giving the major languages of East Asia their peculiar flavour and remarkable commonalities.
Sinoxenic vocabulary reflects systems of reading Chinese characters that were developed by the Vietnamese, Koreans, and Japanese well over a millennium ago. It is a product of the process of gaining literacy in Chinese at a time when it was the most prestigious language and dominant cultural vehicle of East Asia. It involved a wholesale attempt to master the Chinese written language, including virtually the entire Chinese literary vocabulary. This has left these languages in modern times with broad swathes of vocabulary, or building blocks of vocabulary, borrowed from Chinese. Japanese and, to a lesser extent, Korean, still use Chinese characters as a key element of their writing systems, and even in Vietnamese, which has abolished the use of Chinese characters, they loom as an unseen presence behind a latinised script.
There is a great deal that is hazy about the circumstances and details of how these vocabulary systems developed. Some of their distinctive features are as follows:
1. Sinoxenic vocabulary is the result of formal language learning. At least in the early stages, it involved the active transmission of literacy from teachers to students, either directly from Chinese teachers, or in some cases through intermediaries (notably in Japan, where Korean teachers may have been involved). Learning how to read Chinese involved not only the memorisation of characters, but also the recitation, reading, and (ideally) composition of Classical Chinese texts. (The concept that Sino-Vietnamese resulted from contact between Chinese-speaking teachers and Vietnamese-speaking learners has been questioned recently by John Duong Phan, who posits the existence of a bilingual situation in Vietnam.)
2. Characters were read by the Koreans, Japanese, and Vietnamese in as close an approximation to the Chinese pronunciation as was possible. However, pronunciations were inevitably modified to fit the different phonological systems of these languages. For instance, the Chinese character 六, which is believed to have been pronounced /luwk̚/, became two syllables, /ro-ku/, in Japanese pronunciation.
3. The adoption of Chinese characters involved borrowing the system as a whole, comprising literally thousands of characters. The main systems of readings for Chinese characters date from the second half of the first millennium -- Late Old Chinese to Late Middle Chinese in the case of Korean, and southern varieties of Late Middle Chinese in the case of Vietnamese. In Japanese, the wholesale borrowing of characters and their accompanying sets of pronunciations took place on two occasions, with the result that Japanese now has two main sets of readings: the go-on system (Late Old Chinese - Early Middle Chinese), which was probably adopted from a more southerly area of the northern dialect or possibly involved Korean teachers; and the later kan-on system (Late Middle Chinese), which was borrowed from the prestigious Chang'an dialect in the northwest. So systematised did these two sets of readings become that later dictionary-makers even tried to fill in gaps where the go-on or kan-on reading for a particular character was missing.
4. Systems of pronunciation eventually became conventionalised ways of reading texts, divorced from the contemporary Chinese pronunciation and probably passed on by local teachers. In Japan, which is geographically isolated from the Mainland, it's quite likely that many teachers and students of Chinese texts never heard Chinese spoken during their entire lives. Despite changes in pronunciation, character readings still generally run in parallel in Modern Standard Mandarin (MSM), Chinese dialects, and Sinoxenic languages, even where the actual physical pronunciation has diverged. The parallels become clear when characters are placed in lists:
(Table 1)
Characters | English | Chinese (MSM) | Japanese go-on | Japanese kan-on | Korean | Vietnamese |
京 | 'capital' | jīng | きょう kyō | けい kei | 경 gyeong | kinh |
形 | 'shape' | xíng | ぎょう gyō | けい kei | 형 hyeong | hình |
情 | 'feeling' | qíng | じょう jō | せい sei | 정 jeong | tình |
明 | 'bright' | míng | みょう myō | めい mei | 명 myeong | minh |
停 | 'stop' | tíng | じょう jō | てい tei | 정 jeong | đình |
丁 | 'situation' | dīng | ちょう chō | てい tei | 정 jeong | đinh |
平 | 'flat, peaceful' | píng | びょう byō | へい hei | 평 pyeong | bình |
A second series:
(Table 2)
Characters | English | Chinese (MSM) | Japanese go-on | Japanese kan-on | Korean | Vietnamese |
筆 | 'writing implement' | bǐ | ひち hichi | ひつ hitsu | 필 pil | bút |
蜜 | 'honey' | mì | みち michi | びつ bitsu | 밀 mil | mật |
質 | 'quality' | zhì | しち shichi | しつ shitsu | 질 jil | chất |
実 | 'actual' | shí | じち jichi | じつ jitsu | 실 sil | thực, thật |
日 | 'sun' | rì | にち nichi | じつ jitsu | 일 il | nhật |
Despite irregularities (including a few that I've omitted from the list), the parallels are remarkable. A study of these "Sinoxenic" readings of Chinese characters has been instrumental in modern scholars' attempts to elucidate the phonology of Middle Chinese.
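The regularity of these parallels can be shown mechanically. The following is a minimal Python sketch, not a phonological analysis: it takes the romanised readings from Tables 1 and 2 and finds the ending shared by every reading in a column. The function name `common_suffix` and the dictionary layout are my own choices for illustration, and for 実 only the Vietnamese reading thật is used.

```python
import unicodedata


def common_suffix(words):
    """Return the longest ending shared by every word in the list."""
    # Normalise so that a letter plus its diacritic counts as one character.
    words = [unicodedata.normalize("NFC", w) for w in words]
    suffix = []
    # Walk all words from the last character backwards, in step.
    for chars in zip(*(reversed(w) for w in words)):
        if len(set(chars)) != 1:
            break
        suffix.append(chars[0])
    return "".join(reversed(suffix))


# Romanised readings copied from Table 1 (京 形 情 明 停 丁 平).
TABLE_1 = {
    "Japanese go-on": ["kyō", "gyō", "jō", "myō", "jō", "chō", "byō"],
    "Japanese kan-on": ["kei", "kei", "sei", "mei", "tei", "tei", "hei"],
    "Korean": ["gyeong", "hyeong", "jeong", "myeong", "jeong", "jeong", "pyeong"],
    "Vietnamese": ["kinh", "hình", "tình", "minh", "đình", "đinh", "bình"],
}

# Romanised readings copied from Table 2 (筆 蜜 質 実 日).
TABLE_2 = {
    "Japanese go-on": ["hichi", "michi", "shichi", "jichi", "nichi"],
    "Japanese kan-on": ["hitsu", "bitsu", "shitsu", "jitsu", "jitsu"],
    "Korean": ["pil", "mil", "jil", "sil", "il"],
    "Vietnamese": ["bút", "mật", "chất", "thật", "nhật"],
}

for name, table in (("Table 1", TABLE_1), ("Table 2", TABLE_2)):
    for column, readings in table.items():
        print(name, column, "-" + common_suffix(readings))
```

Each column comes out with a single shared ending (for instance -ō against -ei for the two Japanese systems in Table 1, and -ichi against -itsu in Table 2), which is the kind of regular correspondence that makes these readings useful for reconstructing Middle Chinese.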
5. Since Chinese characters were borrowed as a combination of form (the written character), meaning, and pronunciation (the 'reading' of the character), they were eventually taken in as morphemes in the host languages. These borrowed morphemes could then be used to create new Sinitic-style vocabulary by reading them with their respective conventionalised readings. Such vocabulary was available for borrowing between the different Sinoxenic languages. The Japanese, in particular, coined large amounts of Sinoxenic vocabulary in the modern era to render new concepts that had come from the West, and this has been widely borrowed into Korean, Chinese, and Vietnamese.
For example, the following words are shared between the four languages, based on the same combinations of morphemes:
(Table 3)
Characters | English | Chinese (MSM) | Japanese | Korean | Vietnamese |
形態 | 'form' | xíngtài | けいたい keitai | 형태 hyeongtae | hình thái |
形狀 | 'shape' | xíngzhuàng | けいじょう keijō | 형상 hyeongsang | -- |
狀態 | 'state' | zhuàngtài | じょうたい jōtai | 상태 sangtae | trạng thái |
狀況 | 'situation' | zhuàngkuàng | じょうきょう jōkyō | 상황 sanghwang | trạng huống |
形成 | 'formation' | xíngchéng | けいせい keisei | 형성 hyeongseong | hình thành |
(There are, however, limits, as not all languages have borrowed or use all combinations, and meaning and usage can differ.)
As a final note, Sinoxenic languages also have vocabulary that doesn't cleanly belong within the Sinoxenic framework. Vietnamese has a few hundred words that were borrowed before the systematic borrowing of Sinoxenic vocabulary took place. As a result, there are many forms or words that were borrowed twice, once as "Old Sino-Vietnamese", and later on a much more comprehensive scale as "Sino-Vietnamese". In general, the earlier borrowings tend to be regarded by speakers as native Vietnamese forms, while the later Sinoxenic forms make up a separate stratum of the Vietnamese vocabulary. A few examples include:
(Table 4)
Chinese (MSM) | Meaning | Old Sino-Vietnamese | Sino-Vietnamese |
味 wèi | 'flavour' | mùi | vị |
黃 huáng | 'yellow' | vàng | hoàng |
鶯 yīng | 'oriole' | anh | oanh |
Japanese also has some vocabulary of this type, notably 梅 ume 'plum' and 馬 uma 'horse', from Middle Chinese /mwəj/ and /maɨX/ respectively. Note that an additional vowel /u/ has been added at the start of the word in both cases. Both words are regarded by speakers as 'native Japanese' vocabulary. In Korean, older layers of borrowed vocabulary do not form a separate stratum and are subsumed under Sino-Korean as a whole. [But see Disqus note from "Geisendorf" below.]
2. Non-Sinoxenic borrowing
While the Sinoxenic model has traditionally held the limelight as the most distinctive and influential model for borrowing Chinese vocabulary, it is not the only model. Here I'll look at a language that has borrowed a considerable amount of vocabulary from Chinese but isn't a "Sinoxenic" language.
Chronologically, Mongolian borrowing of Chinese vocabulary took place later than that of Sinoxenic languages. The vast majority of words have been borrowed in the past 800 years, sourced from Early, Middle, and Modern Mandarin as spoken in northern China.
Modes of borrowing are not uniform. Some vocabulary was borrowed indirectly, such as the word for 'writing', ᠪᠢᠴᠢᠭ᠌ бичиг bichig, which appears to have entered from Turkic in ancient times but ultimately derives from Chinese 筆 (MSM bǐ 'writing brush'). The same Chinese word was later borrowed via Tibetan as ᠪᠢᠷ бийр biir 'writing instrument'.
In more recent times, by far the greater proportion of words has been borrowed directly. Some are significantly different from the Chinese pronunciation, a result either of great age or of impressionistic auditory borrowing. One example is the word ᠴᠣᠩᠬᠣ цонх tsoŋx 'window', from Chinese 窗戶 chuānghu 'window'.
Another likely example is the word ᠲᠠᠢᠢᠪᠣᠩ тайван taivaŋ 'peace', which is supposedly from Chinese 太平 MSM tàipíng 'peace'. The traditional spelling (which equates to taibuŋ) makes no attempt to reproduce the original vowel in 平 píng.
Since the Qing dynasty, however, spellings in the traditional Mongolian script have generally come closer to the Chinese pronunciation. For example, ᠶᠠᠩᠬᠠᠨ янхан yaŋxaŋ 'prostitute', spelt yaŋxan in the traditional script, accurately reproduces the vowels and consonants of Chinese 養漢 yǎnghàn. Whether yaŋxan is how the Mongolians heard the word (auditory loan), or whether this represents an auditory loan that has been rectified to conform to the Chinese pronunciation in its written form, is not clear. It is likely, however, that spelling conventions in the traditional script reflect familiarity with the Chinese written language among at least part of the Mongolian intellectual elite. This accords with the actual situation during the Qing, in which there was considerable intimate interaction between the Mongols (particularly in the south) and the Chinese cosmopolis.
The tendency for the written form to accurately reflect the Chinese pronunciation has become even more marked in modern times, to the extent that the transliteration of Chinese words into Mongolian has now been fully standardised in China.
While transliteration from Chinese is now carried out according to rigid rules, there is a crucial difference between Mongolian and Sinoxenic languages: Mongolian borrowing of Chinese vocabulary is not the result of an attempt to adopt Chinese as the literary language of the Mongols, and has never involved adopting the Chinese writing system as a whole. Mongolian has had its own scripts since the time of Genghis Khan and no consistent attempt has been made to use Chinese characters to write Chinese loan words in Mongolian. One of the most important documents in Mongolian history, the Secret History of the Mongols, survives only in a transliteration into Chinese characters, but these characters are used purely for their phonetic value, similar to but clumsier than Japanese kana. (In modern Inner Mongolia, however, with the teaching of Chinese in the educational system, bilingual Mongolians at times informally use Chinese characters to write Chinese words in a Mongolian context.)
Words also tend to be borrowed as single forms, without being analysed into constituent morphemes for use in further word-building. Borrowings are 'one-off' in nature rather than systematic. There are, of course, monosyllabic words that do not need further analysis and can be used as independent morphemes, such as ᠵᠢᠩ жин ǰiŋ 'weight, catty', which is from the Chinese word 斤 jīn 'catty'. But bisyllabic words from Chinese tend to be perceived as single units, much like native Mongolian words, which are generally composed of more than one syllable.
For example, both the archaic term ᠲᠠᠢᠢᠭᠠᠨ тайган taigaŋ 'eunuch' (written taigan) and ᠲᠠᠢᠢᠪᠣᠩ тайван taivaŋ 'peace' (written taibuŋ) contain the Chinese morpheme 太 tài, but this is not perceived as an identifiable morpheme in Mongolian. Similarly, ᠬᠣᠪᠣᠩ хоовон xoovoŋ 'brazier' (traditionally spelt xoboŋ) and ᠲᠦᠩᠫᠦᠩ төмпөн tömpöŋ 'pots and pans' (traditionally spelt töŋpöŋ) both contain the Chinese morpheme 盆 pén 'bowl', but it is unlikely that a Mongolian speaker would be conscious of a connection between -вон -voŋ and -пөн -pöŋ in the two words.
There are, of course, some morphemes that are clearly identifiable in borrowed vocabulary. One is the morpheme 匠 jiàng, which is used in a number of words relating to tradesmen, including the very common word мужаан muǰaaŋ 'carpenter'. The traditional spelling of this morpheme in Mongolian is -ᠵᠢᠶᠠᠩ (ǰiyaŋ), and even in Cyrillic it is normally -жаан -ǰaaŋ, making it an easily recognisable form. This is reinforced in words like түнжаан tünǰaaŋ 'brazier (copper worker)', which ignores vowel harmony and strengthens the perception of -жаан -ǰaaŋ as a productive morpheme. However, there are also forms where vowel harmony detracts from this unity of form, e.g., the word for a stone mason, which is pronounced as шожоон šoǰooŋ.
(Table 5)
Mongolian | Trad script | Meaning | Chinese characters | Chinese (MSM) | Japanese | Korean | Vietnamese |
мужаан (muǰaan) | ᠮᠤᠵᠢᠶᠠᠩ (muǰiyaŋ) | 'carpenter' | 木匠 | mùjiang | ぼくしょう, もくしょう bokushō*, mokushō* | 목장 mokjang* | mộc tượng* |
инжаан (inǰaan) | ᠢᠨᠵᠢᠶᠠᠩ (inǰiyaŋ) | 'silversmith' | 銀匠 | yínjiang | ぎんしょう ginshō* | 은장 eunjang* | ngân tượng* |
тижаан (tiǰaan) | ᠲᠢᠵᠢᠶᠠᠩ? (tiǰiyaŋ?) | 'ironworker' | 鐵匠 | tiějiang | てつしょう tetsushō* | 철장 cheoljang* | thiết tượng* |
түнжаан (tünǰaan) | ᠳᠥᠨᠵᠢᠶᠠᠩ (tünǰiyaŋ) | 'brazier' | 銅匠 | tóngjiang | どうしょう dōshō* | 동장 dongjang* | đồng tượng* |
шожоон (šoǰoon) | ᠱᠤᠵᠢᠶᠠᠩ (šoǰiyaŋ) | 'stone-mason' | 石匠 | shíjiang | せきしょう sekishō* | 석장 seokjang* | thạch tượng* |
While the traditional script often tends to highlight the original Chinese pronunciation, the Cyrillic orthography, which spells words as they are pronounced, serves to obscure any connection with Chinese. For example, the word ᠶᠠᠨᠲᠣᠩ (yantuŋ), from Chinese 煙筒 yāntong, is spelt яандан (yaandan) in Cyrillic, considerably removed from the traditional spelling of yantuŋ.
Moreover, syllable-final н in the Cyrillic script is pronounced /ŋ/ in Mongolia, thus neutralising the earlier distinction between /ŋ/ and /n/ in this position and further obscuring the regularity of relationships with Chinese. This distinction between /ŋ/ and /n/ is maintained in Inner Mongolian.
A few examples of the way in which Mongolian words are a poor reflection of the Chinese original will make the situation clear. The main source is Монгол хэлний Хятад ормол үгийн судалгаа by Н. Балжинням, and I'm assuming for the sake of this exercise that the Chinese etymologies given are accurate. I've picked two sets of words that fall into typical patterns to illustrate the point. What is noteworthy is the way that Mongolian vocabulary follows set patterns that tend to obscure the patterns of the original Chinese.
The Chinese pronunciation given is that of the modern standard language, although many Mongolian borrowings may actually have been from local northern dialects of Chinese. Sinoxenic pronunciations are given for comparison. Asterisks indicate words that do not actually occur in the Sinoxenic language in question. There are also a few words where I don't have a reliable source for the traditional spelling. Note that final н in Cyrillic, which I've transliterated as 'n', is actually read ŋ.
Pattern One
(Table 6)
Mongolian | Trad script | Meaning | Chinese characters | Chinese (MSM) | Japanese | Korean | Vietnamese |
байван (baivan) | ᠪᠠᠢᠢᠪᠠᠩ (baibaŋ) | 'vitriol' | 白礬 | báifán | はくばん hakuban* | 백반 baegban | bạch phèn |
тайган (taigan) | ᠲᠠᠢᠢᠭᠠᠨ (taigan) | 'eunuch' | 太監 | tàijiàn | たいかん taikan | 태감 taegam | thái giám |
янхан (yanxan) | ᠶᠠᠩᠬᠠᠨ (yaŋxan) | 'prostitute' | 養漢 | yǎnghàn | ようかん yōkan* | 양한 yanghan | dưỡng hán* |
зондон (zondon) | ᠵᠣᠩᠳᠠᠨ (ǰoŋdan) | 'satin, silk' | 妝緞 | zhuāngduàn | そうどん sōdon* | 장단 jangdan | trang đoạn* |
хоовон (xoovon) | ᠬᠣᠪᠣᠩ (xoboŋ) | 'brazier' | 火盆 | huǒpén | かぼん kabon* | 화분 hwabun | hỏa bồn* |
төмпөн (tömpön) | ᠲᠦᠩᠫᠦᠩ (töŋpöŋ) | 'bowl, basin' | 銅盆 | tóngpén | どうぼん dōbon* | 동분 dongbun | đồng bồn* |
зайсан (zaisan) | ᠵᠠᠢᠢᠰᠠᠩ (ǰaisaŋ) | 'clan or religious title' | 宰相 | zǎixiàng | さいしょう saishō | 재상 jaesang | tể tương |
паалан (paalan) | ᠫᠠᠯᠠᠩ (palaŋ) | 'enamel' | 珐琅 / 琺瑯 | fàláng | ほうろう hōrō | 법랑 beoblang | pháp lang |
яандан (yaandan) | ᠶᠠᠨᠲᠣᠩ (yantuŋ) | 'pipe, chimney' | 煙筒 | yāntong | えんとう entō* | 연통 yeontong | yên đồng* |
тайван (taivan) | ᠲᠠᠢᠢᠪᠣᠩ (taibuŋ) | 'calmness, peace' | 太平 | tàipíng | たいへい taihei | 태평 taepyeong | thái bình |
лууван (luuvan) | ᠯᠣᠣᠪᠠᠩ (luubaŋ) | 'carrot' | 蘿蔔 | luóbo | らふく, らほく rafuku*, rahoku* | 나복, 라복 nabog, labog | la bặc |
цүүвэн (tsüüven) | ᠴᠦᠦᠪᠦᠩ (čüübüŋ) | 'rent' | 租費 | zūfèi | そひ sohi* | 조비 jobi* | to phí* |
дааман (daaman) | ᠳᠠ ᠮᠧᠨ (da mEn) | 'gates' | 大門 | dàmén | だいもん daimon | 대문 daemun | đại môn* |
хуасан (xuasan) | ᠬᠣᠸᠠᠱᠧᠩ (xuwašEŋ) | 'peanut' | 花生 | huāshēng | かせい kasei* | 화생 hwasaeng* | hoa sinh* |
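The merger of final /n/ and /ŋ/ mentioned earlier can be seen directly in the forms of Table 6. The following is a rough Python sketch under my own assumptions: it uses only the romanised transliterations (not the scripts themselves), takes a subset of the rows, and the names `PAIRS` and `final_nasals` are hypothetical. It groups the final letter of each traditional-script form under the final letter of its Cyrillic counterpart.

```python
from collections import defaultdict

# (traditional-script transliteration, Cyrillic transliteration),
# a subset of the rows in Table 6.
PAIRS = [
    ("baibaŋ", "baivan"),
    ("taigan", "taigan"),
    ("yaŋxan", "yanxan"),
    ("ǰoŋdan", "zondon"),
    ("xoboŋ", "xoovon"),
    ("töŋpöŋ", "tömpön"),
    ("ǰaisaŋ", "zaisan"),
    ("palaŋ", "paalan"),
    ("taibuŋ", "taivan"),
]


def final_nasals(pairs):
    """Map each Cyrillic final letter to the traditional finals it covers."""
    merged = defaultdict(set)
    for trad, cyr in pairs:
        merged[cyr[-1]].add(trad[-1])
    return dict(merged)


result = final_nasals(PAIRS)
print(result)
```

The single Cyrillic final covers both traditional finals, so the Cyrillic spelling alone does not tell the reader whether the Chinese source syllable ended in -n or -ng.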
Pattern Two
(Table 7)
Mongolian | Trad script | Meaning | Chinese characters | Chinese (MSM) | Japanese | Korean | Vietnamese |
хулуу (xuluu) | ᠬᠣᠯᠣ (xulu) | 'pumpkin' | 葫蘆 | húlu | ころ koro | 호로 holo | hồ lô* |
саахуу (saaxuu) | ᠰᠠᠬᠣᠣ (saxuu) | 'teapot' | 茶壺 | cháhú | ちゃこ chako* | 다호, 차호 daho, chaho | trà ho*, chè ho* |
санхүү (sanxüü) | ᠰᠠᠩᠬᠦᠦ (saŋxüü) | 'warehouse' | 倉庫 | cāngkù | そうこ sōko | 창고 chang-go | thương kho* |
пүнлүү (pünlüü) | ᠫᠦᠩᠯᠦ (püŋlü) | 'salary' | 俸祿 | fènglù | ほうろく hōroku | 봉록 bonglog | bổng lục* |
цууянбуу (tsuuyanbuu) | ᠴᠣᠣᠶᠠᠩᠪᠣ (čuuyaŋbu) | 'calico' | 粗洋布 | cūyángbù | そようふ soyōfu* | 조양포 joyangpo* | to dương bố* |
мантуу (mantuu) | ᠮᠠᠨᠲᠠᠣ (mantau) | 'flat cake' | 饅頭 | mántou | まんとう mantō* | 만두 mandu | man đầu |
лантуу (lantuu) | ᠯᠠᠩᠲᠣᠣ (laŋtuu) | 'sledgehammer' | 榔頭 | lángtou | ろうとう rōtō* | 랑두, 낭두 rangdu, nangdu | lang đầu* |
жинтүү (ǰintüü) | ᠵᠢᠨᠲᠦᠦ (ǰintüü) | 'pillow' | 枕頭 | zhěntou | ちんとう chintō | 침두 chimdu | chấm đầu* |
пямбуу (pyambuu) | ᠫᠢᠩᠪᠣᠣ? (piŋbuu?) | 'carpenter's plane' | 平刨 | píngbǎo | へいほう heihō* | 평포 pyeongpo* | bình bào*, bình bao* |
ембүү (yembüü) | ᠶᠣᠠᠮᠪᠣᠣ (yuwambuu), ᠶᠣᠠᠨᠪᠣᠣ (yuwanbuu) | 'ingot' | 元寶 | yuánbǎo | げんぽう genpō | 원보 wonbo | nguyên bảo |
тэмбүү (tembüü) | ᠲᠡᠮᠪᠦᠦ (tembüü) | 'syphilis' | 天疱 | tiānpào | てんぽう tenpō | 천포 cheonpo | thiên bào* |
хуажуу (xuaǰuu) | ᠬᠣᠸᠠᠵᠣᠣ (xuwaǰuu) | 'pepper' | 花椒 | huājiāo | かしょう kashō* | 화초 hwacho | hoa tiêu* |
чинжүү (činjüü) | ᠴᠢᠨᠵᠦᠦ (činǰüü), ᠴᠢᠩ ᠵᠢᠶᠣᠣ (čiŋ ǰiyuu) | 'green pepper' | 青椒 | qīngjiāo | せいしょう seishō* | 청초 cheongcho | thanh tiêu* |
поолуу (pooluu) | ᠫᠣᠣᠯᠣᠣ (pooluu) | 'wicker basket' | 笸籮 | pǒluó | はら hara* | -- | -- |
дэнлүү (deŋlüü) | ᠳᠧᠩᠯᠦ (dEŋlüü) | 'lantern' | 燈籠 | dēnglóng | とうろう tōrō | 등롱, 등농 deungrong, deungnong | đèn lồng |
With one exception, санхүү saŋxüü, all pronunciations follow the Mongolian requirement for vowel harmony. In modern times, Mongolian has adopted the phoneme /f/ in borrowing vocabulary from Russian, but at the time this vocabulary was borrowed /f/ was alien to the Mongolian sound system and /p/ was used. In modern Inner Mongolian usage, ᠪᠠᠢᠢᠹᠠᠩ (baifaŋ) is also used instead of ᠪᠠᠢᠢᠪᠠᠩ (baibaŋ).
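The vowel-harmony observation can be checked mechanically against the Cyrillic forms in Table 7. What follows is a minimal Python sketch that rests on a deliberately simplified assumption about vowel classes: а, о, у, я, ё, ы count as back; э, ө, ү, е count as front; и and й are treated as neutral; and ю, which is ambiguous between yu and yü, is ignored. The names `BACK`, `FRONT` and `is_harmonic` are my own, and real Mongolian vowel harmony has more detail than this.

```python
# Simplified vowel classes for Cyrillic Mongolian (an assumption of this sketch).
BACK = set("аоуяёы")
FRONT = set("эөүе")


def is_harmonic(word):
    """True if the word does not mix back and front vowel letters."""
    letters = set(word.lower())
    return not (letters & BACK and letters & FRONT)


# Cyrillic forms copied from Table 7.
TABLE_7 = [
    "хулуу", "саахуу", "санхүү", "пүнлүү", "цууянбуу",
    "мантуу", "лантуу", "жинтүү", "пямбуу", "ембүү",
    "тэмбүү", "хуажуу", "чинжүү", "поолуу", "дэнлүү",
]

exceptions = [word for word in TABLE_7 if not is_harmonic(word)]
print(exceptions)
```

Under these assumptions the only word flagged is санхүү. The same check also flags түнжаан from Table 5, the form noted earlier as ignoring vowel harmony, while шожоон passes.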
The above suggests that Mongolian borrowing of Chinese vocabulary has involved both auditory borrowing of an impressionistic type and, increasingly, more systematic borrowing based on a close acquaintance with Chinese phonology.
However, Mongolian has largely avoided the borrowing of Sinoxenic morphemes that is a strong feature of Sinoxenic languages, as seen in Table 3 above. Instead, Mongolian has tended to form compounds of native Mongolian words based on the Chinese model, i.e., calques. But this is the stuff of a separate post...
(This post is a patchy revision of my answer at Quora. In writing this post I've become increasingly aware of shortcomings in that answer, which was based on a superficial inspection of borrowed forms in the Cyrillic script. In the process of writing the Quora answer and this post, I've had reference to Н. Балжинням's Монгол хэлний Хятад ормол үгийн судалгаа; Marc Hideo Miyake's Old Japanese: a phonetic reconstruction; several dictionaries featuring the traditional Mongolian script; John Duong Phan's Dissertation preview of "Lacquered Words: the evolution of Vietnamese under Sinitic influences from the 1st century BCE to the 17th century CE"; Wikipedia; and various flavours of Wiktionary. However, a deeper understanding of the process of borrowing vocabulary from Chinese requires a far greater familiarity with the history and nature of borrowing than I currently possess.)