* Before starting, please check our Platform Guidelines.
1. Post-edition at Unbabel
At Unbabel we have a unique approach to translation: each text submitted by a customer is translated by our Machine Translation system, and then corrected by our community of editors in an online platform. By editing the output of the software, the editors ensure the quality of the translations and confirm that the message is accurate (i.e., has the same meaning as the original), fluent (i.e., can be easily understood and sounds natural) and is in line with the style requested by the clients (i.e. respects their register and terminology). In order to help editors do the best job possible, we provide various types of information:
1. Customer instructions, which identify the client and list their requests for personalizing the translation, such as the register that must be used to address the recipient of the message. Following these instructions is vital to deliver translations that match the client’s expectations.
2. Glossaries, which correspond to specific vocabulary and expressions used by the client, and that must be respected by the editors.
3. Translation Memories, which correspond to stored segments (expressions, sentences or paragraphs) that have previously been translated and accepted for customer usage. They are useful for ensuring consistency across translations.
We also have Smartcheck, which is an application that checks the grammar, morphology, orthography and style of the translations while being edited. By using a large set of rules, Smartcheck flags words or groups of words that may present some kind of issue.
Finally, in order to deliver the best possible translation, we also provide these guidelines about the specifics of your language. Please read them carefully and always follow them in your edits.
In Hindi, case markers agree with the case-assigning nouns, adjectives agree with the nouns they qualify, and verbs agree with the subject or object in person, number and gender, and are marked for tense, aspect and mood. Sometimes a verb may also inflect to mark honorificity for the agreeing subject.
Given below are some common grammatical errors. You should have a keen eye for details and must ensure that the translation reads clearly and is free of grammatical errors.
Do not use आपने (second person ergative pronoun) in place of आपको (second person accusative/dative pronoun), as this usage is dialectal and uncommon among native Hindi speakers.
Source text: You have to come to my house party tomorrow.
✘ आपने कल मेरे घर पार्टी में आना है। [Delhi, Punjab dialect]
✓ आपको कल मेरे घर पार्टी में आना है।
In the same way, don’t use the accusative/dative case marker को with the first person possessive pronoun मेरे in place of the first person accusative pronoun मुझे, as it is also a dialectal variation.
Source text: I love playing cricket.
✘ मेरे को क्रिकेट खेलना बहुत पसंद है। [Mumbai, Hyderabad dialect]
✓ मुझे क्रिकेट खेलना बहुत पसंद है।
Note: Keep your postpositions simple and in line with everyday conversation. Avoid complex postpositions when a simpler variant is available.
In Hindi, nouns have either masculine or feminine gender. You must ensure that the agreeing possessive pronoun, adjective and/or verb form are in line with that (i.e. agree in gender).
Source text: That's my ball.
✘ वह मेरा गेंद है।
✓ वह मेरी गेंद है।
Note: Unless the context calls for an exception, use the neutral masculine form to refer to nouns that designate a class of beings. Also use the neutral masculine form for the adjectives and pronouns that accompany such nouns.
Do not use suffixes/inflections in parentheses to cover the masculine and feminine forms together. Instead, use a hyphen between the two forms, which is the norm in Hindi.
Source text: actor - actress
An honorific pronoun should always get an honorific verb agreement.
Source text: What are you doing?
✘ आप क्या कर रहे हो?
✓ आप क्या कर रहे हैं?
Source text: Will you come to the party tomorrow?
✘ आप कल पार्टी में आओगे?
✓ आप कल पार्टी में आएँगे?
Note: While conjugating verbs, the third-person-honorific form of the verb should be avoided, especially while referring to persons who have had a negative impact on society as per Indian government sources or Indian mythology.
Source text: Ravana was the king of Lanka.
✘ रावण लंका के राजा थे।
✓ रावण लंका का राजा था।
Source text: Colonel Dyer was responsible for the Jallianwala Bagh massacre.
✘ कर्नल डायर जलियाँवाला बाग़ नरसंहार के लिए जिम्मेदार थे।
✓ कर्नल डायर जलियाँवाला बाग़ नरसंहार के लिए जिम्मेदार था।
2.1.4. Verb agreement
Make sure that the verb form agrees with the subject or object in person, number and gender. Also, in serial verb constructions, all subsidiary verbs should convey tense, aspect and mood information in a grammatically correct way.
Source text: The king gave the watch to the doctor.
✘ राजा ने डॉक्टर को घड़ी दिया।
✓ राजा ने डॉक्टर को घड़ी दी।
Source text: Whom is this man writing a letter to?
✘ ये आदमी किसको चिट्ठी लिख रही है?
✓ ये आदमी किसको चिट्ठी लिख रहा है?
Source text: It has been raining.
✘ बारिश हुई रही होगी।
✓ बारिश हुई होगी।
✓ बारिश हो रही होगी।
2.2. Passive voice
Keep the voice consistent with the source sentence as much as possible. When the use of passive voice has a negative readability effect, use active voice or passive reflexive.
2.3. Transitive verbs
When a transitive verb has an implicit direct object that is evident from the context, there is no need to add it. This is common in user interface text.
Source text: Click the screen to see more
✓ और देखने के लिए क्लिक करें
✓ और देखने के लिए स्क्रीन क्लिक करें
Note: If there is a clear grammatical mistake in the source text, please provide a correct translation in the target language without repeating the error.
This section helps maintain consistency across different styles of writing. It follows the standard laid down by the Expert Committee appointed by the Government of India for this purpose and approved by the Government.
3.1. Case Markers
The case markers in Hindi should always be written as separate words, except in case of pronouns where they should be tagged on to the stems.
Source text: Ram killed Ravan.
✘ रामने रावण को मारा।
✓ राम ने रावण को मारा।
Source text: He killed Ravan.
✘ उस ने रावण को मारा।
✓ उसने रावण को मारा।
When a pronoun takes a complex (multiword) postposition, the first element should be attached to the stem while the second should be written separately.
Source text: John bought flowers for her.
✘ जॉन उस के लिए फूल लाया
✓ जॉन उसके लिए फूल लाया
Source text: Which is good among these?
✘ इस में से कौन अच्छा है?
✓ इसमें से कौन अच्छा है?
When particles such as ही, तक etc. come between a pronoun and its case marker, the case marker should be written as a separate word.
Source text: For you only
✘ आपही के लिए
✓ आप ही के लिए
Source text: Even to me
✘ मुझतक को
✓ मुझ तक को
In case of coordinating compounds, viz. adverbials, copulatives, echo-words, fractional ordinals, and onomatopoeic words, you should place a hyphen between the constituent words.
Source text: Slowly [adverbial]
✘ धीरे धीरे
✓ धीरे-धीरे
Source text: Mother-father [copulative]
✘ माता पिता
✓ माता-पिता
Source text: Tea-Shea [echo words]
✘ चाय वाय
✓ चाय-वाय
Source text: One third [fractions]
✘ एक तिहाई
✓ एक-तिहाई
Source text: Woof-woof [onomatopoeic]
✘ भौ भौ
✓ भौ-भौ
The plural of a copulative compound should be formed by adding the inflection to both constituents.
Source text: Boys and girls
This section is about standard and non-standard spelling conventions used in Hindi language.
- Glidal य, व
Where the use of glidal य, व is optional, it may be avoided, i.e., in the words like गए-गये, नई-नयी, हुआ-हुवा etc., using only the former (vowel) forms. This rule is applicable in all cases viz., adjectival and indeclinable forms. The following examples with ✓ are the standard.
Source text: went
Source text: rupees
Source text: What happened?
✘ क्या हुवा
✓ क्या हुआ
Source text: New Delhi
✘ नयी दिल्ली
✓ नई दिल्ली
- Halant & Visarga
Tatsam words borrowed from Sanskrit should ordinarily be written in their original Sanskrit form. But where the halant sign (्) and the visarga (:) have dropped out of use, they need not be revived.
Source text: Equal
Source text: Sadness
Indeclinables are words without grammatical inflection. There are several types of indeclinables in Hindi that denote various types of feelings and senses, e.g. आह, ओह, ही, तो, सो, भी, न, तब, कब, यहाँ, वहाँ, सदा, क्या, श्री, जी, तक, भर, मात्र, साथ, कि, लेकिन, या, अथवा, और.
The rule is that indeclinables are written as separate words and not attached to the neighbouring word.
Source text: Whole night
✓ रात भर
The honorific indeclinables श्री and जी should also be written as separate words.
Source text: Modi Ji
✓ मोदी जी
- Participial verbs
All subsidiaries in a serial verb should always be written as separate words, but participial verbs, also called verbal modifiers, should always be written as a single word.
Source text: After adding sugar
✘ शक्कर मिला कर
✓ शक्कर मिलाकर
- Parallel forms
Some Hindi words have two parallel forms in currency, both of which have been generally recognized by scholars in the field, e.g. गरदन-गर्दन, गरमी-गर्मी, बरतन-बर्तन, बिलकुल-बिल्कुल, भरती-भर्ती, दोबारा-दुबारा etc. Uniformity in the spelling of such words is not considered essential.
3.4. Foreign sounds
Foreign sounds which have been adopted in the Hindi language should continue to be used.
Nuqtā is a diacritic mark which was introduced in Devanāgarī to represent some Arabo-Persian sounds that do not have a native character in the script. Apply nuqtā wherever it is required.
Source text: Ghazal
Source text: Paper
Source text: Speed
You should be able to differentiate between the words forming a minimal pair and these foreign sounds to avoid their incorrect use, and deliver a semantically correct translation.
Also, avoid any incorrect application of the diacritic nuqtā.
Source text: then
- Vowel /ɒ/
To represent the half-open /ɒ/ vowel of the English language, ऑ has been introduced in Hindi. It must therefore be used when writing English words that contain this vowel sound.
Source text: Hostel
Source text: Pause
- Grapheme <अ>
Use <अ> in words adopted from Turkish, Persian and Arabic where it comes in the middle.
Source text: Beginning
Source text: Actually
Source text: The Quraan
Regarding number format (written in numbers or words), you must always respect the source text: if they are written as digits in the source text, this should be maintained in the translation; on the other hand, if they are written as words, they should be translated to the target language.
Use Arabic numerals (0, 1, 2, 3…), not Hindi (Devanāgarī) numerals (०, १, २, ३…), in line with the standard rule.
Source text: 0123456789
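The numeral rule above can be checked mechanically. The following is a minimal Python sketch, not part of any Unbabel tool; the function name `normalize_digits` is a hypothetical helper introduced here for illustration. It maps each Devanāgarī digit to its Arabic counterpart and leaves all other characters untouched.

```python
# Translation table from Devanagari digits (U+0966..U+096F) to ASCII digits.
DEVANAGARI_TO_ASCII = str.maketrans("०१२३४५६७८९", "0123456789")


def normalize_digits(text: str) -> str:
    """Return text with Devanagari digits replaced by Arabic numerals (hypothetical helper)."""
    return text.translate(DEVANAGARI_TO_ASCII)


if __name__ == "__main__":
    print(normalize_digits("कुल ४२ रुपए"))
```

Numbers written out as words are not affected, so the rule about respecting the source number format still has to be checked by the editor.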
Symbols are used in Hindi in the same way as in English. Just ensure that there is no whitespace before any symbol.
Source text: 74%
✘ 74 %
✓ 74%
Source text: 81/5 Indiranagar
✘ 81 / 5 इंदिरानगर
✓ 81/5 इंदिरानगर
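As an illustration of the whitespace rule for symbols, here is a small Python sketch. It covers only the two symbols shown in the examples (% and /), and the function name `tighten_symbols` is an assumption made for this example rather than an existing tool.

```python
import re


def tighten_symbols(text: str) -> str:
    """Remove whitespace before '%' and around '/' between digits (hypothetical helper)."""
    # "74 %" becomes "74%": drop whitespace between a digit and the percent sign.
    text = re.sub(r"(?<=\d)\s+%", "%", text)
    # "81 / 5" becomes "81/5": drop whitespace around a slash that sits between digits.
    text = re.sub(r"(?<=\d)\s*/\s*(?=\d)", "/", text)
    return text


if __name__ == "__main__":
    print(tighten_symbols("81 / 5 इंदिरानगर"))
```

The patterns are deliberately narrow (digits on the relevant side of the symbol) so that slashes or percent signs in other contexts are left for the editor to judge.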
Note: If the source contains stray symbols (e.g. broken HTML), remove them in the translation.
Except for the full stop, all punctuation marks in Hindi are the same as in English. Avoid whitespace before punctuation marks as well.
Hindi generally doesn’t use a comma between phrases. Do not use an unnatural comma in Hindi even if it is present in the source text.
Source text: Colonel Dyer was an officer of the British Indian Army who, as a temporary brigadier general, was responsible for the Jallianwala Bagh massacre.
✘ कर्नल डायर ब्रिटिश भारतीय सेना का एक अधिकारी था, जो एक अस्थायी ब्रिगेडियर जनरल के रूप में, जलियाँवाला बाग़ नरसंहार के लिए जिम्मेदार था।
✓ कर्नल डायर ब्रिटिश भारतीय सेना का एक अधिकारी था जो एक अस्थायी ब्रिगेडियर जनरल के रूप में जलियाँवाला बाग़ नरसंहार के लिए जिम्मेदार था।
Also, avoid the wrong use of comma before और, या, अथवा, तथा in coordinating nouns in a phrase.
Source text: grapes, apples, oranges and sapodillas
✘ अंगूर, सेब, संतरे, और चीकू
✓ अंगूर, सेब, संतरे और चीकू
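The conjunction rule above lends itself to a simple pattern check. The Python sketch below is an illustration only; `drop_comma_before_conjunction` is a hypothetical name. It removes a comma that directly precedes और, या, अथवा or तथा. It cannot tell a list of nouns from two coordinated clauses, so its output should be treated as a suggestion for the editor to review, not as an automatic fix.

```python
import re

# A comma (with optional surrounding whitespace) directly before one of the
# listed conjunctions, where the conjunction is followed by whitespace.
COMMA_BEFORE_CONJUNCTION = re.compile(r"\s*,\s*(और|या|अथवा|तथा)(?=\s)")


def drop_comma_before_conjunction(text: str) -> str:
    """Remove the comma before और, या, अथवा, तथा (hypothetical helper)."""
    return COMMA_BEFORE_CONJUNCTION.sub(r" \1", text)


if __name__ == "__main__":
    print(drop_comma_before_conjunction("अंगूर, सेब, संतरे, और चीकू"))
```

The lookahead for whitespace keeps the pattern from firing on longer words that merely begin with one of these conjunctions, such as याद.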
If needed, commas can be added in the target text to provide a more readable translation.
The Hindi full stop, called pūrṇavirām, should be represented by (।) and not by (.)
Source text: This is the train to Borivali.
✘ यह बोरीवली जाने वाली गाड़ी है.
✓ यह बोरीवली जाने वाली गाड़ी है।
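A check for the full stop rule can be sketched in a few lines of Python. This is a minimal example under stated assumptions: the name `fix_full_stop` is hypothetical, and the pattern only targets a "." that follows a Devanāgarī character and is followed by whitespace or the end of the text. Decimal points such as 5,000.00 are left alone, but a dotted Hindi initial at the end of a word would also match, so the result is best used to flag candidates for review.

```python
import re

# A "." immediately after a Devanagari character (U+0900..U+097F) and
# followed by whitespace or the end of the text.
LATIN_STOP_AFTER_DEVANAGARI = re.compile(r"(?<=[\u0900-\u097F])\.(?=\s|$)")


def fix_full_stop(text: str) -> str:
    """Replace a sentence-final '.' after Hindi text with the purnaviram (hypothetical helper)."""
    return LATIN_STOP_AFTER_DEVANAGARI.sub("।", text)


if __name__ == "__main__":
    print(fix_full_stop("यह बोरीवली जाने वाली गाड़ी है."))
```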
- Other Punctuation
Extra punctuation (commas, quotation marks, question marks, exclamation marks, etc.) not present in the source could be added in the translation only if it is really needed to provide a more fluent translation.
Source text: Translate “I love you” in Hindi
✘ I love you का हिंदी में अनुवाद करें।
✓ “I love you” का हिंदी में अनुवाद करें।
5.1. Grammatical and Lexical Registers
Register refers to the level of formality used in the text. It shows how our clients address their customers and contributes to the voice of the brand itself. Register may vary depending on the company, the brand, the service they offer, the customers, and the target language.
We make a first main distinction between grammatical and lexical register: the first one regards the use of pronouns and verb person (for the languages to which this morphological feature is applied), while the latter is related to lexical choices, since some words and expressions also have a degree of formality or colloquialism.
Both these registers are also divided into formal and informal, as shown below.
5.2. Formal Register
Formal register is used in formal domains like administration, profession, law, etc. You must understand your source thoroughly to deliver a translation in the desired register and style.
Type of Register
के संदर्भ में, की अपेक्षा
कोर्ट ने मेरी छुट्टी मंज़ूर की थी।
कृपया, बटन दबाएँ।
5.3. Informal Register
Nowadays, the general trend in media and marketing is the use of modern Hindi. If the client requires informal register, you should avoid any formal translation as a general guideline. The concern is that Hindi translations tend to be so formal that some users cannot understand them.
Type of Register
के बारे में, की बजाए
गुड मॉर्निंग!, शुक्रिया!
कोर्ट ने मुझे छुट्टी दी थी।
प्लीज़, बटन दबाएँ।
In general, informal register is open to English words and code-mixing, and focuses on the daily spoken language, while the formal register leans towards written and traditional form.
6. Localization Challenges
6.1. Proper Nouns
Proper nouns refer to unique entities, such as persons, places, organizations, brands, events, etc. As far as foreign proper nouns are concerned, languages may adopt different rules regarding whether they should be translated or kept in the original language. When editing a text, you should follow your language rules for all different types of proper nouns listed below. However, please note that if there is a glossary provided by the client that includes these types of units, you should always apply the glossary items.
While transliterating non-Hindi person names, check for existing spellings in standard media sources. Do not just go by their spellings, as pronunciation may differ in the target language.
Source text: François Hollande
✘ फ्रेंकोइस हॉलैंड
✓ फ्रांस्वा ओलांद
If the name is not available, transliterate it to the closest sounds in the Hindi language.
On the other hand, when transliterating Hindi person names from roman script, do not be misled by the roman spelling of the name in English.
Source text: Shankaraditya
Source text: Gupta Empire
✘ गुप्ता साम्राज्य
✓ गुप्त साम्राज्य
Translate place names using their established Hindi names where available, rather than transliterating everything. Place names should be transliterated only if no local term is available.
Source text: Persia
Source text: Houston
Always check if the client wants organizations’ names to be left in roman script only. If not, please do not translate unless they have an established name in Hindi language.
Source text: FBI
✘ संघीय जाँच विभाग
Source text: World Health Organization
✘ वर्ल्ड हेल्थ ऑर्गनाइजेशन
✓ विश्व स्वास्थ्य संगठन
6.1.4. Brands and products
Brand and product names should be left in English. Do not translate proprietary nouns unless the firm has a localised name for the proprietary noun. Please double check if the client wants them left in roman script only.
Source text: Microsoft Word
✘ माइक्रोसॉफ्ट शब्द
✓ माइक्रोसॉफ्ट वर्ड [in case client doesn’t want transliteration, leave this in roman script Microsoft Word]
6.1.5. Other Entities
For titles of movies, songs, books etc., if the title is well known and has an existing translation in your target language, please provide it. If there is no existing translation in your target language, please transliterate the title.
Source text: My Experiments With Truth
✘ माय एक्सपेरिमेंट्स विद ट्रुथ
✓ सत्य के प्रयोग
6.2. Acronyms and initials
Acronyms are abbreviations formed from the first letters of a multiword term, with those letters pronounced as one word.
Acronyms, whether in English or Hindi, should be transliterated and written as single words.
Source text: UNESCO
Source text: UNICEF
Note: Translate abbreviations and acronyms only when the translated version is currently more common for target language speakers.
Initials are abbreviations that are pronounced one letter at a time. Check the client’s instructions to see if they want English initialisms to be left in roman script only. If transliterated, English initials are generally written without any space or dot.
Source text: HDFC
✘ एच डी एफ सी
✓ एचडीएफसी
If it is a Hindi initial written in Devanāgarī alphasyllabary system, and it is pronounced fully despite shortening in written form, it should be written with the dot symbol without spaces.
Source text: U.P. [Uttar Pradesh]
✘ उ प्र
6.3. Date format
The English format mm/dd/yyyy is not used in Hindi, so please convert it to the dd/mm/yyyy format.
Source text: 03/22/2018
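The date conversion can be illustrated with a short Python sketch using the standard `datetime` module. The function name `us_to_hindi_date` is a hypothetical helper, and the sketch assumes the source date really is in mm/dd/yyyy order; a date such as 03/04/2018 is ambiguous on its face, so the editor still has to confirm the source convention from context.

```python
from datetime import datetime


def us_to_hindi_date(date_text: str) -> str:
    """Convert an mm/dd/yyyy date string to dd/mm/yyyy (hypothetical helper)."""
    # strptime raises ValueError if the text is not a valid mm/dd/yyyy date,
    # e.g. when the month field is greater than 12.
    parsed = datetime.strptime(date_text, "%m/%d/%Y")
    return parsed.strftime("%d/%m/%Y")


if __name__ == "__main__":
    print(us_to_hindi_date("03/22/2018"))
```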
Also, remember that the short forms of month names are not widely used in Hindi.
Source text: Jan. 3, 2018
✘ 03 जन 2018
✓ 03 जनवरी 2018
The above statement is not applicable to the names of days viz. सोमवार, मंगलवार, बुधवार etc.
6.4. Time format
The equivalent terms for ‘a.m.’ and ‘p.m.’ in Hindi are पूर्वाह्न and अपराह्न respectively. You may translate time written in a 12-hour format with am/pm using one of these equivalents.
Source text: 9:00 a.m.
✘ 9:00 am
✓ पूर्वाह्न 9:00 बजे
However, if the context allows, you may use सुबह, दोपहर, शाम or रात to increase the readability.
Measures should always keep the format of the source text and should never be converted. Measurement units should be written in full or short depending on the source text.
Source text: 10 km
The currency unit symbol for the Indian Rupee (INR) is ₹, and it should be used in place of रु.
Source text: Rs. 5,000.00
✘ रु 5,000.00
✓ ₹ 5,000.00
Currency symbols (€, $, £, etc.) should be placed before the number, with a whitespace.
Source text: £ 500.98
✓ £ 500.98
Currency initials (USD, GBP, RUB, INR, DKK, NOK, etc.) should not be translated, as they are a convention accepted worldwide. You just need to place the initial after the number, with a whitespace.
Source text: USD 5,000
✘ USD 5,000
✓ 5,000 USD
For full currency names (e.g. 100 dollars, 100 pounds, etc.), transliterate them in Devanāgarī script.
6.7. Technical terminology
You should adhere to the glossary or instructions provided by the clients for technical terms.
If no glossary or instruction is provided, you are encouraged to research the term online, preferably in standard sources. Where possible, provide the transliterated term in parentheses.
Source text: Ratio range (geometric progression)
✘ जियोमेट्रिकल प्रोग्रेशन
✘ जियोमेट्रिकल श्रेणी
✘ गुणोत्तर प्रोग्रेशन
✘ गुणोत्तर श्रेणी
श्रेणी (जियोमेट्रिकल प्रोग्रेशन)
Note: Only if the translation of a term is unavailable, it should be transliterated.
The rules of Sandhi should be applied to the Indic translation and not the transliterated terms.
7. Tricky cases
Do not always phrase the Hindi sentence on the English sentence structure.
Source text: India got its independence on:
✘ भारत स्वतंत्र हुआ:
✓ भारत कब स्वतंत्र
In Hindi, the verb carries gender information. Where gender-neutral wording is desired (e.g. in a user interface where a string is used for both male and female users), make sure that your choice of suffixes and sandhi does not introduce gender information.
Source text: What is the relation between X and Y?
✘ प्रश्न: X, Y की क्या लगेगी?
✓ प्रश्न: X का Y से क्या संबंध है?
In the case of idioms and proverbs, if a parallel idiom or proverb exists in Hindi, please use it. Otherwise, you are encouraged to use your creativity and translate it into something that accurately conveys the original meaning.
Source text: Birds of a feather flock together.
✓ चोर-चोर मौसेरे भाई / एक ही थैली के चट्टे-बट्टे
Source text: Diamonds cut diamonds.
✓ लोहा लोहे को काटता है।
8. Most frequent errors
It is critical that before you start editing the translations you read the client’s instructions and pay attention to glossaries and translation memories. Please note that this document is a general guide; the rules it sets out follow the standards of the Hindi language.
As a general rule, try to avoid the following orthographical errors, which are frequent and consume most of the time in the post-editing process:
- use chandrabindu (ाँ)
- use nuqta for क़, ख़, ग़, ज़ and फ़
- use Hindi full stop (।) not English full stop (.)
Also, try not to sound too literal. Never impose the source language syntax on the target language, which may make it sound unnatural. Ensure that the items do not portray bias against any caste, gender, differently-abled group or religious community.
Source text: Railways administration is installing escalators for visually impaired people and senior citizens.
✘ रेलवे प्रशासन अपाहिजों और बूढ़ों के लिए स्टेशनों पर चलती सीढ़ियाँ लगा रहा है।
✓ रेलवे प्रशासन दिव्यांग जनों और वरिष्ठ नागरिकों के लिए स्टेशनों पर चलती सीढ़ियाँ लगा रहा है।
9. Useful online resources
Hindi Verb Conjugator
English verb conjugator
English grammar guide