86 Chapter 5. Improving Statistical Machine Translation with Paraphrases

Spanish phrase          English translations
arma política           political weapon, political tool
recurso político        political weapon, political asset
instrumento político    political instrument, instrument of policy, policy instrument, policy tool, political implement, political tool
arma                    weapon, arm, arms
palanca política        political lever
herramienta política    political tool, political instrument

Table: Example of paraphrases for the Spanish phrase arma política and their English translations

Increasing coverage of parallel corpora with parallel corpora

Our technique extracts paraphrases from parallel corpora. While it may seem circular to try to alleviate the problems associated with small parallel corpora using paraphrases generated from parallel corpora, it is not. The reason is that paraphrases can be generated from parallel corpora between the source language and languages other than the target language. For example, when translating from English into a minority language like Maltese, we will have only a very limited English-Maltese parallel corpus from which to train our translation model, and will therefore have learned translations for only a relatively small set of English phrases. However, we can use many other parallel corpora to train our paraphrasing model. We can generate English paraphrases using the English-Danish, English-Dutch, English-Finnish, English-French, English-German, English-Italian, English-Portuguese, English-Spanish, and English-Swedish parallel corpora from the Europarl corpus. The English side of the parallel corpora does not have to be identical, so we could also use the English-Arabic and English-Chinese parallel corpora from the DARPA GALE program.
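The idea of drawing paraphrases from several unrelated parallel corpora can be illustrated with a minimal sketch. It assumes that paraphrases are found by pivoting through shared foreign-language translations and that scores from different corpora are combined by a simple average; both choices are assumptions made for illustration. The function names, the toy phrase tables, their probabilities, and the set of phrases covered by the English-Maltese model are all hypothetical.

```python
from collections import defaultdict


def pivot_paraphrases(phrase, e2f, f2e):
    """Score paraphrases of `phrase` by pivoting through foreign phrases.

    e2f maps an English phrase to {foreign phrase: p(foreign | english)}.
    f2e maps a foreign phrase to {english phrase: p(english | foreign)}.
    """
    scores = defaultdict(float)
    for foreign, p_f in e2f.get(phrase, {}).items():
        for candidate, p_e in f2e.get(foreign, {}).items():
            if candidate != phrase:  # a phrase is not its own paraphrase
                scores[candidate] += p_f * p_e
    return dict(scores)


def pooled_paraphrases(phrase, tables):
    """Average pivot scores over several (e2f, f2e) phrase-table pairs."""
    pooled = defaultdict(float)
    for e2f, f2e in tables:
        for candidate, score in pivot_paraphrases(phrase, e2f, f2e).items():
            pooled[candidate] += score / len(tables)
    return dict(pooled)


def covered_paraphrases(phrase, tables, known_phrases):
    """Return paraphrases the small translation model already covers, best first."""
    pooled = pooled_paraphrases(phrase, tables)
    usable = [(score, cand) for cand, score in pooled.items() if cand in known_phrases]
    return [cand for score, cand in sorted(usable, reverse=True)]


# Toy phrase tables with invented probabilities (illustration only).
EN_ES = (
    {"political weapon": {"arma política": 1.0}},
    {"arma política": {"political weapon": 0.5, "political tool": 0.5}},
)
EN_DE = (
    {"political weapon": {"politische Waffe": 1.0}},
    {"politische Waffe": {"political weapon": 0.75, "political instrument": 0.25}},
)

# Hypothetical set of English phrases covered by a small English-Maltese model.
MALTESE_KNOWN = {"political tool"}

if __name__ == "__main__":
    print(pooled_paraphrases("political weapon", [EN_ES, EN_DE]))
    print(covered_paraphrases("political weapon", [EN_ES, EN_DE], MALTESE_KNOWN))
```

In this toy setting the English-Maltese model has no entry for "political weapon", but the paraphrasing model, trained on corpora that involve no Maltese at all, proposes "political tool", which the translation model does cover.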
Thus, translation from English to Maltese can potentially be improved using parallel corpora between English and any other language. Note that there is an imbalance, since translation is only improved when translating from the resource-rich language into the resource-poor