Resolving Pattern Ambiguity for English to Hindi
Machine Translation Using WordNet
Niladri Chatterjee Shailly Goyal Anjali Naithani
Department of Mathematics
Indian Institute of Technology Delhi
Hauz Khas, New Delhi - 110 016, India
{niladri iitd, shailly goyal}@yahoo.com
Abstract

A common belief about natural language translation is that sentences of similar structure in the source language have translations that are similar in structure in the target language too. However, with respect to English to Hindi translation, this assumption does not always hold. At least eleven different patterns can be found in the Hindi translation of English sentences in which the main verb is “have” or any of its declensions. This poses a serious problem for designing any English to Hindi translation system. Traditionally such variations are termed “translation divergence”. Typically a study of divergence considers some standard translation pattern for a given input sentence structure. A translation is said to be a divergence if it deviates from this standard pattern. However, this is not the case with the above-mentioned sentence structures. We term this ambiguity “pattern ambiguity”. In this ongoing work we propose a rule-based scheme to resolve the ambiguity using word senses given by WordNet.

1 Introduction

Natural language translation between any two languages almost inevitably suffers from ambiguities of various types, such as lexical ambiguity, semantic ambiguity and syntactic ambiguity (Dorr et al. 99). Typically, all these ambiguities are related to deciphering the inherent meaning of the source language sentence. Normally these ambiguities can be resolved by considering the part-of-speech of the word concerned, or from other words of the sentence, or from the context of the sentence. Once the ambiguity is resolved, obtaining the correct translation in the target language becomes simpler.

However, with respect to English to Hindi translation a different type of ambiguity is observed (Goyal et al. 04). The problem here is not in understanding the sense of the sentence; rather, the difficulty is in deciding the correct structure of the Hindi translation. The following sentences and their Hindi translations illustrate this point:

Ram has a pen ∼ ram (Ram) ke paas (near to) ek (one) kalam (pen) hai (is).

Ram has fever ∼ ram (Ram) ko (to) bukhaar (fever) hai (is).

Although the structures of the above two English sentences are very similar, the structures of their Hindi translations are visibly very different. This creates a different type of ambiguity for the translator, which we term “pattern ambiguity”. Typically, such variations in translations are considered under the study of “translation divergence” (Dorr 93), (Gupta & Chatterjee 03). However, a subtle difference between pattern ambiguity and divergence can be observed easily. The study of divergence assumes some typical translation pattern (P, say) for a given source language sentence structure S. A translation divergence is said to occur if a source language sentence having the structure S assumes a pattern P1 that is different from P, upon translation into the target language. On the other hand, pattern ambiguity does not assume any standard translation pattern. Rather, corresponding to different input sentences of the same structure, different translation patterns are observed, leading to “pattern ambiguity”. Handling this ambiguity requires deep semantic analysis of source language sentences to find answers to:

(a) How serious is pattern ambiguity in English to Hindi translation?

(b) How to find ways to resolve this ambiguity while translating from English to Hindi?

With respect to (a) we notice that the presence of pattern ambiguity is most prominent in dealing with English verbs. In particular, we observe that as many as eleven different translation patterns may be obtained in the translation of English sentences where the main verb is “have”, or some of its declensions.
To provide an answer to (b), we suggest a rule-based scheme that takes into account the senses of the underlying English verbs, and other constituent words of a sentence, to resolve the ambiguity.

In framing the above-mentioned rules we make significant use of WordNet 2.0¹. In WordNet, English nouns, verbs, adjectives and adverbs are organized into synonym sets, each representing one underlying lexical concept. In the proposed scheme semantic information about the constituents of the sentence under consideration is extracted using WordNet, and this information is then processed to resolve the ambiguity.

2 Translation Patterns of Different English Verbs to Hindi

One interesting aspect of English is that a single verb is used to convey different senses. However, for almost each of these senses, a specific verb exists in Hindi. Table 1 shows some of the Hindi equivalents for the verb “run” when used in different senses.

Sentence                                  Translation of verb
They run an N.G.O.                        chalaanaa
The army runs from one end to another.    failnaa
The river ran into the sea.               milnaa
He runs for treasurer.                    khadaa honaa
Wax runs in sun.                          galnaa
We ran the ad three times.                prakaashit karnaa

Table 1: Different translations of “run”

The same observations have been made with respect to different English verbs, such as be, go, take, let, give. All these English verbs can be used to convey different senses in different contexts. WordNet 2.0 provides the different senses in which the above-mentioned verbs can be used. For example, the verb “run” has 41 senses, “call” has 28 senses, and “take” has 42 senses. Since the use of the appropriate Hindi verb can be determined by identifying the sense in which the English verb is used, resolving pattern ambiguity for these verbs is relatively simple.

The most interesting observation in this regard can be made with respect to the English verb “have”. Although the number of possible senses for “have” is relatively small (only 19, as per WordNet 2.0), we have obtained as many as 11 translation patterns for sentences where “have” (or its declensions) is the main verb of the sentence. Further, depending upon the situation, there are variations in the verb used, or the case-ending used, or sometimes even in the overall sentence structure. This makes pattern ambiguity a serious problem for English to Hindi translation while translating sentences of this type. Below we describe the different translation patterns that we observed in dealing with the English verb “have”.

Translation Pattern P1: Here, a genitive case ending (kaa, kii, ke) is used to convey the sense of the “have” verb. For example,

The school has good name ∼ vidyaalay (school) kaa (of) achchhaa (good) naam (name) hai (is).

Which of the genitive case endings (i.e. kaa, kii, ke) will be used in a given case depends upon the number and gender of the object. It is “kaa” if the object is masculine singular, “kii” if the object is feminine (irrespective of the number of the object), and “ke” for masculine plural.

Translation Pattern P2: In this pattern the object and its pre-modifying adjective in the English sentence are realized as the subject and subjective complement (SC), respectively, in the Hindi translation. The subject of the English sentence is realized as the possessive case of the subject of the Hindi translation. For example,²

Gita has beautiful hair ∼ Gita (Gita) ke (of) baal (hair) sundar (beautiful) hain (are).

Translation Pattern P3: Here a locative case ending “ke paas” is used instead of a genitive postposition. For illustration, consider the following:

Mohan has a book ∼ Mohan (Mohan) ke paas (near to) ek (a) kitaab (book) hai (is).

Translation Pattern P4: In this pattern the postposition “ko” is used in the Hindi translation of the given sentence. For example,

My uncle has asthma ∼ mere (my) chaachaa (uncle) ko (to) asthamaa (asthma) hai (is).

¹ http://wordnet.princeton.edu/
² Note that according to P1 it should have been Gita ke sundar baal hain.
Translation Pattern P5: Here the postposition “mein” is used for conveying the sense of the verb “have”. For example,

This city has a museum ∼ iss (This) shahar (city) mein (in) ek (a) sangrahaalay (museum) hai (is).

Translation Pattern P6: This translation pattern is similar to pattern P5, except that the postposition “mein” is replaced with another postposition, “par”. For example, consider the following:

The tiger has stripes ∼ baagh (tiger) par (on) dhaariyan (stripes) hain (are).

Translation Pattern P7: Here, upon translation into Hindi, the object of the English sentence is realized as an SC which is an adjective. The following translation illustrates this pattern.

She has grace ∼ wah (She) aakarshak (graceful) hai (is).

Despite the obvious differences, all the above-mentioned patterns have one common feature: the main verb of the Hindi sentence is “hai”, which means “to be”, or any of its declensions (hain, thaa, the, thii, thiin). But patterns P8 and P9, given below, illustrate cases where some other verb is used as the main verb instead of “hai” (or its declensions).

Translation Pattern P8: This pattern occurs if the main verb of the Hindi translation is obtained from the object of the English sentence. For illustration, consider the following example:

Gita has regards for old men ∼ Gita (Gita) buzurgon (old men) kii (of) izzat (respect) kartii hai (does).

The main verb of the Hindi sentence is izzat karnaa, which comes from the object “regards”. In this respect one may note that Hindi verbs are often made of a noun followed by a commonly used verb. The verb “izzat karnaa” is an example of this type.

Translation Pattern P9: This pattern is similar to translation pattern P8, but here the verb is not obtained from the object. Rather, a completely new verb is introduced in the Hindi translation. For example,

I had tea ∼ maine (I) chai (tea) pee (drank).

But,

I had rice ∼ maine (I) chaawal (rice) khaaye (ate).

Evidently, the verb of the translated sentence is obtained from the “sense” in which the verb “have” is used in the English sentence.

Translation Pattern P10: In all the above cases the structure of the English sentences considered has been subject-verb-object. But if the sentence has an additional component in the form of an adjunct, then a variation in the translation may be noticed. For illustration, consider the two sentences:

(a) Ram has two rupees

(b) Ram has two rupees in his pocket.

While the translation of the first one is “Ram ke paas do rupayaa hain”, the translation of the second one is “Ram ki (Ram’s) zeb (pocket) mein (in) do (two) rupay (Rupees) hain (are)”.

Under this pattern the following changes take place:

(a) The object and the adjunct (PP) in the English sentence are realized as the subject and the predicative adjunct, respectively, in the Hindi translation.

(b) The subject of the English sentence contributes as the possessive case to the predicative adjunct.

Translation Pattern P11: This pattern is observed if, along with the subject, verb and object, the sentence has an infinitive verb phrase. For example,

My children had me buy the car ∼ mere (my) bachchon ne (children) mujhse (me) gaadi (car) kharidvaayai (buy).

Further, we have found instances where the Hindi translation follows patterns pertaining to two or more classes. We term them “mixed patterns”. Due to page limitations we keep mixed patterns out of the present discussion.

Such a large variety of translation patterns poses great difficulty for any MT system, as the system needs to take a decision regarding the pattern that will be most suitable for a given input sentence. In this work we study whether a rule-based scheme can be developed to resolve this ambiguity.
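The surface shape of the patterns in Section 2 that keep a form of “hai” as the main verb and differ only in the marker placed after the subject (P1, P3, P4, P5 and P6) can be sketched in a few lines of Python. This is only an illustrative sketch, not the scheme proposed in the paper. The names `Gender`, `Number`, `FIXED_MARKERS`, `genitive_ending` and `render` are hypothetical. The sketch also assumes present tense only, and that the caller already knows the pattern and supplies an object phrase that is already inflected. Choosing the pattern is the actual pattern-ambiguity problem and is not attempted here.

```python
from enum import Enum


class Gender(Enum):
    MASCULINE = "masculine"
    FEMININE = "feminine"


class Number(Enum):
    SINGULAR = "singular"
    PLURAL = "plural"


# Markers quoted in the descriptions of P3 to P6 (hypothetical table name).
# P1 is not listed because its marker depends on the object (see below).
FIXED_MARKERS = {"P3": "ke paas", "P4": "ko", "P5": "mein", "P6": "par"}


def genitive_ending(gender: Gender, number: Number) -> str:
    """Genitive ending for P1, chosen from the gender and number of the object."""
    if gender is Gender.FEMININE:
        return "kii"  # feminine, irrespective of number
    return "kaa" if number is Number.SINGULAR else "ke"


def render(pattern: str, subject: str, object_phrase: str,
           gender: Gender, number: Number) -> str:
    """Assemble '<subject> <marker> <object phrase> <copula>'.

    Assumption: present tense only, so the copula is 'hai' for a singular
    object and 'hain' for a plural one, as in the glossed examples.
    """
    if pattern == "P1":
        marker = genitive_ending(gender, number)
    else:
        marker = FIXED_MARKERS[pattern]  # raises KeyError for P2, P7 to P11
    copula = "hai" if number is Number.SINGULAR else "hain"
    return f"{subject} {marker} {object_phrase} {copula}"


if __name__ == "__main__":
    # Examples taken from the pattern descriptions in Section 2.
    print(render("P1", "vidyaalay", "achchhaa naam",
                 Gender.MASCULINE, Number.SINGULAR))
    print(render("P3", "Mohan", "ek kitaab",
                 Gender.FEMININE, Number.SINGULAR))
    print(render("P6", "baagh", "dhaariyan",
                 Gender.FEMININE, Number.PLURAL))
```

Keeping the marker table separate from `genitive_ending` mirrors the text: P3 to P6 each use a fixed marker, whereas the P1 ending varies with the gender and number of the object.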
3 How to Design Rules?

We first attempted to frame rules based on sentence structures. We observed that translation patterns P10 and P11 are associated with specific sentence structures. The sentence structure for the rest of the patterns is subject-verb-object. The rules for P10 and P11 that we could frame on the basis of studying translations of sentences of these structures are given below:

Rule for P11: If the input sentence structure is such that the object of the verb (which is typically a noun or pronoun) is followed by another verb, then Translation Pattern P11 is observed.

I had Rama write a letter ∼ maine (I) rama (Rama) se (by) patr (letter) likhvaayaa (write).

Rule for P10: If the given sentence structure is of the type subject-verb-object-PP, and the PP satisfies the following two conditions, then the translation of the concerned sentence will have pattern P10:

(a) The head noun of the PP is not animate.

(b) The head of the PP has a genitive pre-modifier that refers to the subject of the sentence.

For example, consider the following sentences:

1. The table has dust on its surface ∼ mej ki (table’s) satah (surface) par (on) dhool (dust) hai (is).

2. Sita has vermillion on her forehead ∼ Sita ke (Sita’s) maathe (forehead) par (on) sindoor (vermillion) hai (is).

However, the pattern may not be appropriate if one of the two conditions given above is not satisfied. Consider, for instance, the following translations:

1. She has regards for her uncle ∼ wah (she) apne (her) chaachaa (uncle) ki izzat kartii hai (respects). Note that the head noun of the PP is animate. Thus it violates condition (a), and one can observe that the translation pattern is P8, i.e. it is different from P10.

2. Sita has degree from IIT ∼ Sita (Sita) ke paas (near to) IIT (IIT) ki (from) degree (degree) hai (is). This sentence violates condition (b) above and the translation pattern is P3.

3. I have two dogs at home ∼ mere (my) ghar (home) par (at) do (two) kutte (dogs) hain (are). Although this sentence also violates condition (b), the translation pattern is still P10.

Thus we notice that if the input sentence violates either of the above two conditions, then a variety of translation patterns may be obtained.

The above rules, however, exclude the majority of the sentences, as these are relevant to some special structures only. The majority of the patterns are related to sentences having the simple subject-verb-object structure. Hence we needed to investigate them further. In this respect the following is observed.

3.1 Inadequacy of Subject/Object

We next attempted to frame rules on the basis of the subject and/or object of the sentence. However, we found that the subject of the sentence alone is not sufficient to determine the translation pattern of the sentence. For illustration, all the sentences given in Table 2 have the same subject, yet they differ in their translation patterns.

English sentence           Hindi translation                      Pattern
Mohan has a good brain     Mohan kaa dimaag achchhaa hai          P1
Mohan has a good pen       Mohan ke paas ek achchhii kalam hai    P3
Mohan has high fever       Mohan ko tej bukhaar hai               P4
Mohan had a sweet apple    Mohan ne meethaa seb khaayaa           P9

Table 2: Translation patterns for same subject

In a similar vein, one can see that the translation pattern is not determined by the object either. The sentences given in Table 3 have the same object, yet their translation patterns are different. These examples highlight the inadequacy of the subject/object in determining the translation pattern. In the next step we considered the senses of the nouns used as subject/object as given in WordNet 2.0. We have been able to frame a few rules in this way. For illustration: