My theory is...

a) That Japanese and Tamil are sister languages, even though they do not share close vocabulary correspondences in their current forms [I].

b) That there was once a Proto-Japonic-Dravidian language (PJD) that later diverged into Proto-Japonic (or Proto-Japonic-Ryukyuan) and Proto-Dravidian. Proto-Dravidian later diverged into the languages of southern India (including Tamil, Malayalam, Telugu and Kannada). Proto-Japonic (or Proto-Japonic-Ryukyuan) diverged into modern-day Japanese (and perhaps the Ryukyuan languages) [II].

c) That over the centuries, Japanese was strongly influenced in turn by other language groups (Old Korean, Austronesian, Altaic and Chinese) and ended up absorbing a lot of their vocabulary. However, it largely retained its native grammar, which it inherited from PJD [III].

d) That a nearly similar fate befell Tamil. With the growing influence of Sanskrit in northern India, and the increasing patronage of Vedic Brahmins by southern Indian rulers [IV], Tamil began to absorb a huge corpus of Sanskrit words, while retaining its underlying grammar (which remains distinctly different from Sanskrit even today).

e) That, if one is to look for vocabulary correspondences, one should look not in old/classical texts (either of Japanese or Tamil) but in the regional dialects in the heartlands. In an attempt to standardize the Japanese language (and weed out native dialect differences) during the time classical texts were being written, it is possible that Japanese rulers imported several Chinese words into Japanese [V]. So vocabulary correspondences between Japanese and Dravidian, if preserved at all, would be preserved in local dialects, and not in the standardized (textual) form of the language.

Clearly, a lot of this is speculation, but in the next sections I will present evidence to lend weight to the Japanese-Dravidian hypothesis.

[I]: Although they do share very similar (agglutinative) grammars, with very strong word-order similarities, as noted before.

[II]: There exists the possibility of a Proto-Japonic-Altaic-Dravidian language, but as I have no knowledge of any Altaic language, I cannot bring any new ideas to that hypothesis.

[III]: It is well established that Chinese and Japanese have completely different grammars, and are not part of the same language family [1]. For further evidence, see the next section (Japanese and Chinese).

[IV]: The Pallava kings (c. 600 CE, date uncertain) wooed Brahmins from northern India with land grants and settled them in the Kaveri basin in southern India [12].

[V]: China has been the economic and intellectual powerhouse of East Asia for much of history. So it is plausible that the Japanese rulers used Chinese words (instead of a native dialect) to standardize the language for texts (after all, the writing system in these texts is the Chinese system!).

This situation is not without parallel in history. The Mughals, who ruled India (16th-18th century CE), were essentially of Turko-Mongol origin (the word Mughal is, in fact, cognate with the word Mongol). Now, both Turkic and Mongolic belong to the Altaic family of languages. However, the language of the Mughal court, and of a lot of beautiful literature composed at this time, was neither the language of the rulers (Turkic/Mongolic) nor the language of the natives (Sanskrit/Prakrit). Instead the Mughals adopted Persian, an Indo-Iranian language and (ironically) a sister language of Sanskrit, which was the lingua franca in adjoining Persia, the political and intellectual powerhouse of the Near East under the Safavid dynasty from the 16th century [14,15].

Japanese & Chinese: Not all that looks like Chinese is Chinese

Although Japanese writing shares a lot in common with Chinese, Japanese and Chinese are, in fact, completely unrelated languages! Japanese has two (in fact three) scripts: (a) the Chinese-borrowed script (Kanji), with related sounds, to represent vocabulary/ideas (such as verb roots), and (b) (Kanji-derived) Kana to represent the grammatical conjugations specific to Japanese. Each Japanese word is, thus, a composite of a Chinese kanji (to express a core idea) with Japanese-specific grammar endings (written with Kana) tacked on.

For example: 食べられなかった (taberarenakatta) means "could not eat". However, the root of the idea, eat, is the Chinese-borrowed Kanji 食 (ta). The rest of the characters are Kana suffixes indicating the mood or tense of the word. In this example, られ (rare) indicates potential/ability (can), な (na) indicates negative (not), and かった (katta) is a past tense marker. There is no equivalent grammatical structure in the Chinese language.
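The script split in this example can be checked mechanically. The following is only a minimal sketch in Python, assuming the standard Unicode block ranges for hiragana, katakana and the basic CJK ideographs; the function names `script_of` and `split_root_and_endings` are made up for illustration. It looks only at which script each character is written in and says nothing about where the word itself comes from.

```python
def script_of(ch: str) -> str:
    """Classify one character by its Unicode block."""
    cp = ord(ch)
    if 0x3040 <= cp <= 0x309F:   # Hiragana block
        return "hiragana"
    if 0x30A0 <= cp <= 0x30FF:   # Katakana block
        return "katakana"
    if 0x4E00 <= cp <= 0x9FFF:   # CJK Unified Ideographs (basic block only)
        return "kanji"
    return "other"


def split_root_and_endings(word: str) -> tuple[str, str]:
    """Split a word into its leading run of kanji and whatever follows it."""
    i = 0
    while i < len(word) and script_of(word[i]) == "kanji":
        i += 1
    return word[:i], word[i:]


word = "食べられなかった"  # taberarenakatta, the example from the text
scripts = [script_of(ch) for ch in word]
root, endings = split_root_and_endings(word)

for ch, script in zip(word, scripts):
    print(ch, script)
print("kanji part:", root)
print("kana part:", endings)
```

For this word the split is a single kanji followed by seven hiragana characters, which is the "core idea plus grammatical endings" layout described above.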

But why did Japanese assimilate only Chinese vocabulary and not the entire grammatical structure of the language? Perhaps because grammar is more resilient to change than vocabulary (any linguists?). To take a simple example, a child learning a foreign language often tries to add the vocabulary of the new language (as superstratum) over the basic structure of his/her native language grammar. For instance, native Tamil speakers not proficient in English can be heard to say "why you are leaving?" (instead of "why are you..."). This is probably because in Tamil the word that naturally follows the question word ("why") is the subject of the sentence ("you"). [The situation is actually a little more complicated, because modern Tamil has lost the copula ("are"). However, even if it were to appear, the copula would appear at the end of the sentence, exactly as in Japanese and as in modern Malayalam (which has retained the copula).]

  1. "Perhaps because grammar is more resilient to change than vocabulary". I agree, though I am not a linguist. Proto-JD looks implausible, as the two cultures/countries have completely different histories. How about this: prehistoric Dravidian rulers tried to invade the deep east and several groups settled there. Or put it like this: Dravidian nomads began settling in the deep east over thousands of years. Sea routes: as Dravidian groups settled in other parts of the world like Mauritius, Fiji, Malaysia, the Andaman Islands, etc., I opine that the settlements (though history says they happened some hundred years ago) could have happened in prehistoric ages for reasons like climatic disaster or invasion/exploration. Who knows! Maybe, maybe not.

  2. Your example of Taberarenakatta is misleading.

    Japan didn't borrow the vocabulary or the grammar, but just the character (the script). "taberu" is the native word for "to eat" in its infinitive form.

    Actually it is basic knowledge in linguistics that basic vocabulary is more resilient to change than grammar.

    For example, 食べられなかった is only the standard form. In dialects there are countless ways of saying it. But the root "taber" would be the same. The same goes for Old Japanese.

    Another word for eat is "kuu" and the same Chinese character is also used for this. 食う "kuu" is used more around Kansai and also in older forms of Japanese.

    So 食べられなかった would have been expressed quite differently in different periods of Japan. 食えず ("kuezu") would be one way of saying it in OJ, perhaps. "ku" (root for eat) "e" (possibility) "zu" (negative-past).

    You should study more kanji and look into the difference between Onyomi and Kunyomi. Onyomi is the Chinese way of reading a character. It certainly *added* a lot of new vocabulary to Japanese, but very few of these words have completely *replaced* native Japanese vocab.

    As for the connection with Tamil there is really no evidence at the moment. Some earlier works by the likes of Susumu Ono are fascinating but he's not really a comparative linguist and his methodologies are sometimes dodgy.

    But I'm interested in Japonic connections with other language families so if you want to prove something, good luck!

  3. Ken,

    Thank you for the detailed and thoughtful response. I would like to address a key point you raise here. Due to the word limit imposed by the blog, my reply is in three parts.

    Part 1:

    A main point in your response seems to be that (I quote): "... it is basic knowledge in linguistics that basic vocabulary is more resilient to change than grammar."

    I would be grateful if you could provide some reference or study that backs up this claim, so I can pursue this further for myself. I have heard this claim from several sources, but in my own first-hand experience the opposite, viz. grammar being more resilient to change than vocabulary, is definitely more common.

    Let me elaborate: In my own native language, Tamil (a Dravidian language), a large part of the basic vocabulary has been replaced with Sanskrit-borrowed words (common to Indo-European/Indo-Aryan languages); these include words for everyday items like water, fire, food etc. The replacement perhaps started out as an addition as you point out, but the non-native terms have come into such common use that the native terms have all but vanished.

    To prove my point, I’ve made up a game that I would like to invite native Dravidian speakers to play (at the end of this reply, Part III).

    The vocabulary swap is even more dramatic and obvious in its sister languages such as Malayalam and Telugu, indicating a greater influence of Sanskrit on these languages. Any native speaker of these Dravidian languages (who has some knowledge of Tamil and Sanskrit) will be able to attest to this. Yet all of these languages share a virtually identical grammatical structure.

    More recently, with the spread of Western culture and ideas, many Tamil words in day-to-day use have been or are being rapidly substituted with English terms. So much so that there are contests on Tamil TV shows challenging an audience member to engage in conversation with the host without using a single English word -- as proof of the pervasiveness of English in everyday Tamil, a majority fail to carry on a successful, English-free conversation, even for a couple of minutes!

    I hear that this is also happening with Japanese. In fact, here's a recent news story about a Japanese viewer who is suing NHK (Japanese news corp.) for excessive use of English-borrowed words, even when perfectly good native equivalents exist:


    1. Part II:

      Now one may argue that importing words is simply more convenient when referring to imported ideas or items, such as "terebi" (TV) or "rajio" (radio) in Japanese, for which native equivalents could make for clumsy terms. I have two responses to such a claim. For one, the list of replaced words is by no means exclusive to imported items: basic words like "work", "trouble" (Japanese), or "food", "brother", "water" (Tamil/Malayalam) have been systematically replaced. Secondly, novel, imported ideas may be the very reason driving wholesale replacement of vocabulary!

      The advent of ideas and technologies that powerfully influenced society and culture could have served as a catalyst for importing, and in most cases replacing, even basic native vocabulary, especially among the intelligentsia. History is rife with such events: the spread of Western culture and modernism across the world (and into Japan) in the 20th century, or the advent of Persian culture in northern India following the Mughal conquest c. 1500 CE, or perhaps the arrival of the Chinese script and culture in Japan in the early years of the millennium! A (presumed) sense of superiority by association with the foreign culture may also have contributed to the diminishing use of native vocabulary alternatives.

      I have experienced first-hand the consequences of such social dynamics driving vocabulary change in my native language with the advent of the English-based education system. And history documents the Sanskritization of Tamil with the advent of Indo-Aryan religion and beliefs into Southern India (see the list above).

      Again, what changed in each case is *not* the grammar, but the vocabulary. And (in the case of Western influences) these changes have happened within the last 100-200 years. These facts fly in the face of any claim that vocabulary is more resilient to change than grammar. Perhaps the linguistic community is making this sweeping claim on the basis of one or a few language families alone (such as Indo-European), which have perhaps never (or rarely) undergone the turmoil of being repressed or dominated by other linguistic groups?

      Your thoughts?


  4. Part 3 A:

    The borrowing of Sanskrit words into Tamil is so pervasive that it has spawned various movements over many centuries to purify Tamil, and to expunge the Sanskrit-borrowed terms.

    This phenomenon is apparent in other languages too (cf. the “diglossic” vocabulary of English).

    A key difference between Dravidian and these other languages, however, is that what perhaps started out as distinct “high” (Sanskritic) and “low” (Dravidian) layers of vocabulary eventually merged into a unified vocabulary, with the native (“low”) vocabulary being completely replaced by the foreign (“high”) vocabulary in various instances.

    To demonstrate what I mean, let’s play a game (for native Tamil speakers).

    Provided below is a small sample (20) of common English words. Think of the first Tamil word that comes to mind when attempting to translate these words.

    Village – City/Town – Animal – Blood – Farmer – Cow – Studies – Clean/Pure – Anger – Sorrow – Happiness – Health – Bravery – Story – Beginning – Cloud – Island – Earth – Star – World

    If you, like me, are a native Tamil speaker growing up in an urban environment, your translations should match the following Tamil words. Amazingly, all of these words of fairly common usage are directly Sanskrit derived (Sanskrit word/root in parentheses).

    1. Village – Gramam (GrAma)
    2. City/Town – Nagaram (Nagara)
    3. Animal – Mirugam (Mruga; also specifically, “deer”)
    4. Blood – Rattham (Rakta)
    5. Farmer – Vivasayi (VyavasAyi; also generally, “industrious”)
    6. Cow – Pasu (Pashu; also generally, “animal”)
    7. Studies – Padippu (PaTh-; to read)
    8. Clean/Pure – Suttham (Shuddha)
    9. Anger – Kovam (Kup/Kop-)
    10. Sorrow – Dukkam/Sogam (Du:kha, Shoka)
    11. Happiness – Santhosam (Santhosha)
    12. Health – Aarogyam (Arogya)
    13. Bravery – Veeram, Dhairiyam (VIr/Dhairya)
    14. Story – Kadhai (Katha)
    15. Beginning – Aarambam (Arambha)
    16. Cloud – Megam (Megha)
    17. Island – Theevu (Dweepa)
    18. Earth – Bumi (Bhumi)
    19. Star – Natchathiram (Nakshatra)
    20. World – Ulagam (Loka)

    (end-of-part-3 A)

  5. Part 3 B:

    Now, here’s the real challenge. Take your time, and try to come up with the “pure” Tamil equivalents. I could come up with these for only a few; for many others I had to consult a dictionary. For several of these, even the Tamil dictionary could only provide approximate “pure” equivalents, often consisting of compound words (*, below) or obscure terms that the average native speaker would not easily recognize (#, below)!

    1. Village – Sitroor* (lit. small town; siru=small; oor=town)
    2. City/Town – Oor
    3. Animal – Vilangu
    4. Blood – Kurudhi (#)
    5. Farmer – Uzhavar
    6. Cow – Pasu-maadu*
    7. Studies – Kalvi
    8. Clean – Thooya/Thooymai
    9. Anger – Sinam
    10. Sorrow – Varuttham/Thuyar
    11. Happiness – Magizhchi or Uvappu(#)
    12. Health – Udalnalam* (lit. udal=body + nalam=well-being)
    13. Bravery – Thunivu
    14. Story – Seydi* (lit. news) or Nigazhvu (#) (lit. happening)
    15. Beginning – Thodakkam
    16. Cloud – Mugil (#)
    17. Island – Neersoozhnilam* (lit. neer=water + soozh=surrounded + nilam=land)
    18. Earth – ?* (not available!)
    19. Star – Vinmeen* (lit. vin=sky/space + meen=fish)
    20. World – Oozhi (#)

    These examples of common words should make it abundantly clear that vocabulary can be readily displaced by a dominant language, and they fly in the face of any argument that basic vocabulary is resilient to change. And while the “kun”/“on” yomi of Japanese catalogue the words that were added to Japanese from Chinese, there is perhaps little to indicate which words were replaced wholesale due to borrowing from Chinese or other local languages. Perhaps I could use the term displacement (instead of replacement), with the idea that borrowed words do not entirely replace the native term, but displace it to such an extent that it falls out of common use? I would like to reiterate my amazement at the commonly accepted claim that vocabulary is more resilient to change than grammar.

    Incidentally, I used the example of Kanji (“taberarenakatta”) simply to point out that people shouldn't be deceived by the physical similarity of the Chinese and Japanese writing (see title: “Not all that looks like Chinese is Chinese”). I didn’t intend for this to be a demonstration of vocabulary displacement (or replacement). Which is perhaps the actual answer to your question :)

    (end-of-part 3 B).

    1. Interesting, Dr(!). Sridharan, a fascinating game. "Puvi" may be used as a pure Tamil word for earth. But I am not sure about the origin, though.

      Thanks for the enjoyable experience ,

      Dr.K.Jayasundaram , Fujairah United Arab Emirates.

    2. cow- aa- ஆ
      the planet on which we live; the world.
      "the diversity of life on earth"
      synonyms: world, globe, planet, sphere, "

      if this explanation is considered we can find some other words in Tamil.
      ulagam- உலகம்
      they suit for world as well,
      if it says about the ground on which we stand then we can use
      for clouds we have plenty of words
      mangul, konmoo, kondal...#
      angry- veguli- வெகுளி#
      but for the word kathai I still doubt that it has its origin in Sanskrit. It can be the other way around, because kathaithal "கதைத்தல்" is used in Eelam Tamil, where it means chatting. Yet I am not sure.