                               Resolving Pattern Ambiguity for English to Hindi
                                           Machine Translation Using WordNet
                                   Niladri Chatterjee               Shailly Goyal             Anjali Naithani
                                                          Department of Mathematics
                                                    Indian Institute of Technology Delhi
                                                   Hauz Khas, New Delhi - 110 016, India
                                              {niladri iitd, shailly goyal}@yahoo.com
Abstract

A common belief about natural language translation is that sentences of similar structure in the source language have translations that are similar in structure in the target language too. However, with respect to English to Hindi translation, this assumption does not always hold. At least eleven different patterns can be found in the Hindi translation of English sentences in which the main verb is “have” or any of its declensions. This poses a serious problem for designing any English to Hindi translation system. Traditionally such variations are termed “translation divergence”. Typically a study of divergence considers some standard translation pattern for a given input sentence structure. A translation is said to be a divergence if it deviates from this standard pattern. However, this is not the case with the above-mentioned sentence structures. We term this ambiguity “pattern ambiguity”. In this ongoing work we propose a rule-based scheme to resolve the ambiguity using word senses given by WordNet.

1 Introduction

Natural language translation between any two languages almost inevitably suffers from ambiguities of various types, such as lexical ambiguity, semantic ambiguity and syntactic ambiguity (Dorr et al. 99). Typically, all these ambiguities are related to deciphering the inherent meaning of the source language sentence. Normally these ambiguities can be resolved by considering the part-of-speech of the word concerned, or from other words of the sentence, or from the context of the sentence. Once the ambiguity is resolved, obtaining the correct translation in the target language becomes simpler.

However, with respect to English to Hindi translation a different type of ambiguity is observed (Goyal et al. 04). The problem here is not in understanding the sense of the sentence; rather, the difficulty is in deciding the correct structure of the Hindi translation. The following sentences and their Hindi translations illustrate this point:

    Ram has a pen ∼ ram (Ram) ke paas (near to) ek (one) kalam (pen) hai (is).

    Ram has fever ∼ ram (Ram) ko (to) bukhaar (fever) hai (is).

Although the structures of the above two English sentences are very similar, the structures of their Hindi translations are visibly very different. This creates a different type of ambiguity for the translator, which we term “pattern ambiguity”. Typically, such variations in translations are considered under the study of “translation divergence” (Dorr 93), (Gupta & Chatterjee 03). However, a subtle difference between pattern ambiguity and divergence can be observed easily. Study of divergence assumes some typical translation pattern (P, say) for a given source language sentence structure S. A translation divergence is said to occur if a source language sentence having the structure S assumes a pattern P1 that is different from P, upon translation into the target language. On the other hand, pattern ambiguity does not assume any standard translation pattern. Rather, corresponding to different input sentences of the same structure different translation patterns are observed, leading to “pattern ambiguity”. Handling this ambiguity requires deep semantic analysis of source language sentences to find answers to:

    (a) How serious is pattern ambiguity in English to Hindi translation?

    (b) How to find ways to resolve this ambiguity while translating from English to Hindi?

With respect to (a) we notice that the presence of pattern ambiguity is most prominent in dealing with English verbs. In particular, we observe that as many as eleven different translation patterns may be obtained in the translation of English sentences where the main verb is “have”, or some of its declensions.
To provide an answer to (b), we suggest a rule-based scheme that takes into account the senses of the underlying English verbs, and other constituent words of a sentence, to resolve the ambiguity.

In framing the above-mentioned rules we make significant use of WordNet 2.0¹. In WordNet, English nouns, verbs, adjectives and adverbs are organized into synonym sets, each representing one underlying lexical concept. In the proposed scheme semantic information about the constituents of the sentence under consideration is extracted using WordNet, and this information is then processed to resolve the ambiguity.

2 Translation Patterns of Different English Verbs to Hindi

One interesting aspect of English is that a single verb is used to convey different senses. However, for almost each of these senses, a specific verb exists in Hindi. Table 1 shows some of the Hindi equivalents for the verb “run” when used in different senses.

    Sentences                                  Translation of Verb
    They run an N.G.O.                         chalaanaa
    The army runs from one end to another.     failnaa
    The river ran into the sea.                milnaa
    He runs for treasurer.                     khadaa honaa
    Wax runs in sun.                           galnaa
    We ran the ad three times.                 prakaashit karnaa

    Table 1: Different translations of “run”

The same observations have been made with respect to different English verbs, such as be, go, take, let, give. All these English verbs can be used to convey different senses in different contexts. WordNet 2.0 provides different senses in which the above-mentioned verbs can be used. For example, the verb “run” has 41 senses, “call” has 28 senses, “take” has 42 senses. Since the use of the appropriate Hindi verb can be determined by identifying the sense in which the English verb is used, resolving pattern ambiguity for these verbs is relatively simple.

The most interesting observation in this regard can be made with respect to the English verb “have”. Although the number of possible senses for “have” is relatively small (only 19, as per WordNet 2.0), we have obtained as many as 11 translation patterns for sentences where “have” (or its declensions) is the main verb of the sentence. Further, depending upon the situation, there are variations in the verb used, or the case-ending used, or sometimes even in the overall sentence structure. This makes pattern ambiguity a serious problem for English to Hindi translation while translating sentences of this type. Below we describe the different translation patterns that we observed in dealing with the English verb “have”.

Translation Pattern P1: Here, a genitive case ending (kaa, kii, ke) is used to convey the sense of the “have” verb. For example,

    The school has good name ∼ vidyaalay (school) kaa (of) achchhaa (good) naam (name) hai (is).

Which of the genitive case endings (i.e. kaa, kii, ke) will be used in a given case depends upon the number and gender of the object. It is “kaa” if the object is masculine singular, “kii” if the object is feminine (irrespective of the number of the object), and “ke” for masculine plural.

Translation Pattern P2: In this pattern the object and its pre-modifying adjective in the English sentence are realized as the subject and subjective complement (SC), respectively, in the Hindi translation. The subject of the English sentence is realized as the possessive case of the subject of the Hindi translation. For example,

    Gita has beautiful hair² ∼ Gita (Gita) ke (of) baal (hair) sundar (beautiful) hain (are).

Translation Pattern P3: Here a locative case ending “ke paas” is used instead of a genitive postposition. For illustration, consider the following,

    Mohan has a book ∼ Mohan (Mohan) ke paas (near to) ek (a) kitaab (book) hai (is).

Translation Pattern P4: In this pattern a postposition “ko” is used in the Hindi translation of the given sentence. For example,

    My uncle has asthma ∼ mere (my) chaachaa (uncle) ko (to) asthamaa (asthma) hai (is).

    ¹ http://wordnet.princeton.edu/
    ² Note that according to P1 it should have been Gita ke sundar baal hain.
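
The choice among the genitive case endings described under Translation Pattern P1 can be sketched as a small function. This is only an illustrative sketch, not the authors' implementation: the names `Gender`, `Number` and `genitive_ending` are hypothetical, and the gender and number of the object noun are assumed to be supplied by the caller (for example from a Hindi lexicon, which the paper does not describe).

```python
from enum import Enum


class Gender(Enum):
    MASCULINE = "masculine"
    FEMININE = "feminine"


class Number(Enum):
    SINGULAR = "singular"
    PLURAL = "plural"


def genitive_ending(gender: Gender, number: Number) -> str:
    """Pick kaa / kii / ke from the gender and number of the object noun.

    Follows the rule stated for pattern P1: "kii" for a feminine object
    regardless of number, "kaa" for masculine singular, "ke" for
    masculine plural.
    """
    if gender is Gender.FEMININE:
        return "kii"
    if number is Number.SINGULAR:
        return "kaa"
    return "ke"


# "naam" (name) is treated here as masculine singular, matching the
# paper's example "vidyaalay kaa achchhaa naam hai".
print(genitive_ending(Gender.MASCULINE, Number.SINGULAR))
# "baal" (hair) is treated as masculine plural, matching "Gita ke ... baal".
print(genitive_ending(Gender.MASCULINE, Number.PLURAL))
```

The feminine case is checked first because, per the rule, number is irrelevant once the object is feminine.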
Translation Pattern P5: Here the postposition “mein” is used for conveying the sense of the verb “have”. For example,

    This city has a museum ∼ iss (This) shahar (city) mein (in) ek (a) sangrahaalay (museum) hai (is).

Translation Pattern P6: This translation pattern is similar to the pattern P5, except for the fact that the postposition “mein” is replaced with another postposition “par”. For example, consider the following:

    The tiger has stripes ∼ baagh (tiger) par (on) dhaariyan (stripes) hain (are).

Translation Pattern P7: Here, upon translation into Hindi, the object of the English sentence is realized as an SC which is an adjective. The following translation illustrates this pattern.

    She has grace ∼ wah (She) aakarshak (graceful) hai (is).

Despite the obvious differences all the above-mentioned patterns have one common feature: the main verb of the Hindi sentence is “hai”, which means “to be”, or any of its declensions (hain, thaa, the, thii, thiin). But patterns P8 and P9, given below, illustrate cases when some other verb is used as the main verb instead of “hai” (or its declensions).

Translation Pattern P8: This pattern occurs if the main verb of the Hindi translation is obtained from the object of the English sentence. For illustration, consider the following example:

    Gita has regards for old men ∼ Gita (Gita) buzurgon (old men) kii (of) izzat (respect) kartii hai (does).

The main verb of the Hindi sentence is izzat karnaa, which comes from the object “regards”. In this respect one may note that Hindi verbs are often made of a noun followed by a commonly-used verb. The verb “izzat karnaa” is an example of this type.

Translation Pattern P9: This pattern is similar to the translation pattern P8, but here the verb is not obtained from the object. Rather, a completely new verb is introduced in the Hindi translation. For example,

    I had tea ∼ maine (I) chai (tea) pee (drank).

But,

    I had rice ∼ maine (I) chaawal (rice) khaaye (ate).

Evidently, the verb of the translated sentence is obtained from the “sense” in which the verb “have” is used in the English sentence.

Translation Pattern P10: In all the above cases the structure of the English sentences considered has been subject, verb, object. But, if the sentence has an additional component in the form of an adjunct, then a variation in the translation may be noticed. For illustration, consider the two sentences:

    (a) Ram has two rupees

    (b) Ram has two rupees in his pocket.

While the translation of the first one is “Ram ke paas do rupayaa hain”, the translation of the second one is “Ram ki (Ram’s) zeb (pocket) mein (in) do (two) rupay (Rupees) hain (are)”.

Under this pattern the following changes take place:

    (a) The object and the adjunct (PP) in the English sentence are realized as the subject and the predicative adjunct, respectively, in the Hindi translation.

    (b) The subject of the English sentence contributes as the possessive case to the predicative adjunct.

Translation Pattern P11: This pattern is observed if, along with the subject, verb and object, the sentence has an infinitive verb phrase. For example,

    My children had me buy the car ∼ mere (my) bachchon ne (children) mujhse (me) gaadi (car) kharidvaayai (buy).

Further, we have found instances where the Hindi translation follows patterns pertaining to two or more classes. We term them “mixed patterns”. Due to page limitation we keep mixed patterns out of the present discussion.

Such a large variety of translation patterns poses great difficulty for any MT system, as the system needs to take a decision regarding the pattern that will be most suitable for a given input sentence. In this work we study whether a rule-based scheme can be developed to resolve this ambiguity.
3 How to Design Rules?

We first attempted to frame rules based on sentence structures. We observed that translation patterns P10 and P11 are associated with specific sentence structures. The sentence structure for the rest of the patterns is subject, verb, object. The rules for P10 and P11 that we could frame on the basis of studying translations of sentences of these structures are given below:

Rule for P11: If the input sentence structure is such that the object of the verb (which is typically a noun or pronoun) is followed by another verb, then Translation Pattern P11 is observed.

    I had Rama write a letter ∼ maine (I) rama (Rama) se (by) patr (letter) likhvaayaa (write).

Rule for P10: If the given sentence structure is of the type subject, verb, object, PP, and the PP satisfies the following two conditions, then the translation of the concerned sentence will have pattern P10:

    (a) The head noun of PP is not animate.

    (b) Head of the PP has a genitive pre-modifier that refers to the subject of the sentence.

For example, consider the following sentences:

    1. The table has dust on its surface ∼ mej ki (table’s) satah (surface) par (on) dhool (dust) hai (is).

    2. Sita has vermillion on her forehead ∼ Sita ke (Sita’s) maathe (forehead) par (on) sindoor (vermillion) hai (is).

However, the pattern may not be appropriate if one of the two conditions given above is not satisfied. Consider, for instance, the following translations:

    1. She has regards for her uncle ∼ wah (she) apne (her) chaachaa (uncle) ki izzat kartii hai (respects). Note that the head noun of the PP is animate. Thus it violates condition (a) and one can observe that the translation pattern is P8, i.e. it is different from P10.

    2. Sita has degree from IIT ∼ Sita (Sita) ke paas (near to) IIT (IIT) ki (from) degree (degree) hai (is). This sentence violates condition (b) above and the translation pattern is P3.

    3. I have two dogs at home ∼ mere (my) ghar (home) par (at) do (two) kutte (dogs) hain (are). Although this sentence also violates condition (b), the translation pattern is still P10.

Thus we notice that if the input sentence violates any of the above two conditions, then a variety of translation patterns may be obtained.

The above rules, however, exclude the majority of the sentences, as these are relevant to some special structures only. The majority of the patterns are related to sentences having the simple subject, verb, object structure. Hence we needed to investigate them further. In this respect the following is observed.

3.1 Inadequacy of Subject/Object

[...] the basis of the subject and/or object of the sentence. However, we found that the subject of the sentence alone is not sufficient to determine the translation pattern of the sentence. For illustration, all the sentences given in Table 2 have the same subject, yet they differ in their translation patterns.

    English sentence            Hindi Translation                        Pattern
    Mohan has a good brain      Mohan kaa dimaag achchhaa hai            P1
    Mohan has a good pen        Mohan ke paas ek achchhii kalam hai      P3
    Mohan has high fever        Mohan ko tej bukhaar hai                 P4
    Mohan had a sweet apple     Mohan ne meethaa seb khaayaa             P9

    Table 2: Translation patterns for same subject

In a similar vein, one can see that the translation pattern does not depend on the object either. The sentences given in Table 3 have the same object, yet their translation patterns are different.

These examples highlight the inadequacy of the subject/object in determining the translation pattern. In the next step we considered the senses of the nouns used as subject/object as given in WordNet 2.0. We have been able to frame a few rules in this way. For illustration:
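
Returning to the two structural rules for P10 and P11 stated at the start of Section 3 (not the WordNet sense-based rules just mentioned), their logic can be sketched as below. This is a hedged illustration only: the classes `PP` and `HaveSentence`, their field names, and the function `structural_pattern` are hypothetical, and the sentence is assumed to be already parsed, with animacy and the referent of the genitive pre-modifier supplied by some earlier analysis step.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PP:
    head_is_animate: bool              # condition (a) requires this to be False
    genitive_refers_to_subject: bool   # condition (b) requires this to be True


@dataclass
class HaveSentence:
    subject: str
    obj: str
    verb_after_object: Optional[str] = None  # e.g. "write" in "had Rama write"
    pp: Optional[PP] = None


def structural_pattern(s: HaveSentence) -> Optional[str]:
    """Return "P11" or "P10" when a structural rule fires, else None.

    None means only that these two rules do not decide the pattern;
    it does not rule out any pattern.
    """
    # Rule for P11: the object is followed by another verb.
    if s.verb_after_object is not None:
        return "P11"
    # Rule for P10: a PP whose head is inanimate and whose genitive
    # pre-modifier refers back to the subject.
    if s.pp is not None and not s.pp.head_is_animate and s.pp.genitive_refers_to_subject:
        return "P10"
    return None


# "I had Rama write a letter"
print(structural_pattern(HaveSentence("I", "Rama", verb_after_object="write")))
# "The table has dust on its surface"
print(structural_pattern(HaveSentence("table", "dust", pp=PP(False, True))))
# "She has regards for her uncle" (animate PP head, so condition (a) fails)
print(structural_pattern(HaveSentence("she", "regards", pp=PP(True, True))))
```

Because the rule for P10 is stated as a sufficient condition, a result of None leaves the pattern open: “I have two dogs at home” fails condition (b) here and returns None, yet its observed pattern is P10.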