Yates on the origin of Urdu

William Yates was an Orientalist scholar in British India. In the preface to his dictionary, he summarizes the Islamic origins of Urdū (which he calls Hindūstānī) from Perso-Arabic and how it differs from Hindī (which he calls Hindūī). His book, Dictionary, Hindūstānī and English, was published in 1847.


Yates asserts that Hindī is often confused with Urdū. In his account, Urdū was formed by mixing the Perso-Arabic vocabulary of the Islamic invaders with the Hindī of the natives of North West India. In other words, Hindī already existed as a language of India, whereas Urdū arose as a mixture (reḳhta). He doesn’t discuss the origin of Hindī itself, but does acknowledge its relationship with Sanskrit. In Urdū, nouns and adjectives are mostly of Perso-Arabic origin, whereas the rest (pronouns, auxiliary verbs, numerals, most adverbs, many prepositions, etc.) are of Hindī origin. He states that he intentionally excluded from his dictionary those words of Sanskrit or Hindī origin that are never used in good Urdū, so that his book contains only “pure” Urdū words. He also claims that the majority of the words in Urdū are of Arabic origin (not Persian directly), and hence provides an Appendix of Arabic roots in his dictionary.

The following text is from the Preface of his book, reproduced verbatim here. I’ve replaced the acute accents for long vowels with macrons, keeping in line with modern IAST conventions. I’ve added descriptive headings for paragraphs, but not altered them otherwise.

Source: Dictionary, Hindūstānī and English

Urdu is a mixture of Persian and Hindi

"The Hindūstānī or Urdū is peculiarly the language of the Muhammadan population of Hindūstān…

The origin and structure of this dialect may be briefly explained. The language spoken by the Hindū population of the North West Provinces at time of the first Muhammadan invasion, was the Hindūī or Hindī, a language entirely distinct from the Urdū, although often confounded with it. That of the conquerors was the Persian. As the Musalmāns settled in India, their descendants adopted the grammatical forms of the Hindūī, retaining, in great measure, their own Persian and Arabic words. Hence the Urdū is often called the Reḳhta, or the Mixed language.

Nouns and adjectives are mostly from Perso-Arabic

Let the reader examine any good Urdū work, (as the Ḳhirad Afroz for example,) and he will find that almost all the Nouns and Adjectives are Arabic or Persian. The Pronouns and Post-positions, the Auxiliary Verbs as Karnā (करना), Honā (होना), &c., the Numerals, most of the Adverbs, many of the Prepositions, with a few nouns and adjectives, and a small number of simple Verbs, are from the Hindūī. From this statement it is obvious that the relation of the Urdū to the Persian is precisely the same as that of the Persian to the Arabic.

Analogy with English

We may further illustrate the matter by reference to the English. The language spoken in England at the time of the Norman invasion was the Anglo-Saxon, that of the conquerors, the Norman-French. From the mixture of these two the English was formed, which in its grammatical forms is entirely Anglo-Saxon, but in the greater part of its words, is of French and Latin origin. We have therefore, the following parallel:–

    Latin.    French.     Anglo-Saxon.    English.
    Arabic.   Persian.    Hindūī.         Urdū.

Hindi-Sanskrit words are intentionally excluded

A few words are necessary to explain the plan of the present work. It was the object of the compiler to present the students of this language with a work more suited to their wants than the ponderous and expensive quarto of Mr. Shakespear. To this effect he has been guided by the following rules:–

  1. All those words of Sanscrit or Hindūī origin which are peculiar to Hindūī and are never used in good Urdū writing or conversation have been excluded. Such words ought never to have been admitted into a Dictionary of this language, and their insertion has been owing to its being confounded with the Hindūī.

  2. The derivations of Sanscrit and Hindūī words have been omitted. They are of little use or interest except to the Sanscrit student, and may be found generally in Sanscrit dictionaries. The spelling of the words in Deb-nāgarī has also been omitted, as occupying too much space and being quite unnecessary. The Urdū is seldom written in Deb-nāgarī, that character being peculiar to the Sanscrit and Hindūī.

  3. The derivations of the Arabic words have been given in an Appendix at the end of the work. This arrangement offers some advantages. A complete list is given of all the Arabic roots whose derivatives are used in Urdū, together with the rules for the formation of those derivatives. In looking out a word in the Dictionary, one wishes to get at the meaning at once, and not be delayed by the derivation. As the majority of the words in Urdū are of Arabic origin, the utility of this appendix is obvious, since anyone by mastering the list of roots and understanding the rules of derivation, may easily remember the meaning of every Arabic word in the language."

– Calcutta, January, 1847, W. Yates, Dictionary, Hindūstānī and English.