A new feature to improve Moore’s sentence alignment method

VNU Journal of Science: Comp. Science & Com. Eng., Vol. 31, No. 1 (2015) 32–44

Hai-Long Trieu¹, Phuong-Thai Nguyen², Le-Minh Nguyen¹
¹ Japan Advanced Institute of Science and Technology, Ishikawa, Japan
² VNU University of Engineering and Technology, Hanoi, Vietnam

Abstract

The sentence alignment approach proposed by Moore (2002), referred to here as M-Align, is an effective method that achieves relatively high performance by combining length-based and word-correspondence features. Nevertheless, despite its high precision, M-Align usually has low recall, especially when dealing with the sparse data problem. We propose an algorithm that exploits the advantages of M-Align while overcoming this weakness of the baseline by using a new feature for sentence alignment: word clustering. Experiments show an improvement in recall of up to 30% over the baseline, while precision remains reasonable.

© 2015 Published by VNU Journal of Science.
Manuscript communication: received 17 June 2014, revised 4 January 2015, accepted 19 January 2015
Corresponding author: Trieu Hai Long, trieulh@jaist.ac.jp
Keywords: Sentence Alignment, Parallel Corpora, Word Clustering, Natural Language Processing

1. Introduction

Online parallel texts are abundant and substantial resources today. To use these materials in applications such as machine translation, they need to be aligned at the sentence level. This task, known as sentence alignment, maps sentences in the source-language text to their corresponding units in the target-language text.
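To make the task and the precision and recall figures concrete, the following is a minimal sketch rather than anything taken from the paper. It assumes an alignment is represented as a set of (source sentence index, target sentence index) pairs; that representation, the function name `precision_recall`, and the toy data are all illustrative assumptions.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) of predicted alignment links against gold links.

    Both arguments are iterables of (source_index, target_index) pairs.
    This pair-based representation is an assumption made for illustration.
    """
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    # Precision: share of predicted links that are correct.
    precision = correct / len(predicted) if predicted else 0.0
    # Recall: share of gold links that were found.
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Toy example: four source sentences, each with one gold target sentence.
gold = {(0, 0), (1, 1), (2, 2), (3, 3)}
# A cautious aligner that outputs only the pairs it is confident about.
predicted = {(0, 0), (2, 2)}

p, r = precision_recall(predicted, gold)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=1.00 recall=0.50
```

The toy aligner mirrors the pattern reported for M-Align: every link it outputs is correct, so precision is high, but it misses half of the gold links, so recall is low.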
Once aligned at the sentence level, bilingual corpora are highly useful in many important applications. Efficient and powerful sentence alignment algorithms are therefore increasingly important. The sentence alignment approach proposed by Moore (2002) [14] is an effective method that achieves relatively high performance, especially in precision. Nonetheless, this method
