Abstract of doctoral dissertation in Computer Science: Enhancing performance of mathematical expression detection in scientific document images

The thesis addresses three main tasks. First, it extensively analyzes a wide range of existing approaches to mathematical expression (ME) detection in scientific document images. Second, it investigates and proposes novel methods to improve the detection accuracy of MEs. Third, building on the improved detection accuracy, it investigates and proposes a framework to improve the accuracy of ME recognition in scientific document images.

MINISTRY OF EDUCATION AND TRAINING
UNIVERSITY OF SCIENCE AND TECHNOLOGY

BUI HAI PHONG

ENHANCING PERFORMANCE OF MATHEMATICAL EXPRESSION DETECTION IN SCIENTIFIC DOCUMENT IMAGES

Major: Computer Science
Code: 9480101

ABSTRACT OF DOCTORAL DISSERTATION
COMPUTER SCIENCE

Hanoi 2021

This study was completed at Hanoi University of Science and Technology.

Supervisors:
1. Assoc. Prof. Hoang Manh Thang
2. Assoc. Prof. Le Thi Lan

Reviewer 1
Reviewer 2
Reviewer 3

This dissertation will be defended before the approval committee at Hanoi University of Science and Technology.
Time: date month year 2021

This dissertation can be found at:
1. Ta Quang Buu Library - Hanoi University of Science and Technology
2. Vietnam National Library

INTRODUCTION

Motivation

A huge number of scientific documents have been produced to date, and they provide valuable information for the research community. These documents need to be digitized so that users can retrieve information efficiently. Most recent documents are published in PDF format, but a large number of documents are still available only in raster format. PDF processing techniques cannot be applied to such raster document images, so image processing is needed to digitize them. The key steps of document digitization are document analysis, optical character recognition, and content searching [2]. The digitization of standard text-rich documents is considered a solved problem.
However, the digitization of scientific documents that contain many MEs is a non-trivial task. Scientific documents usually consist of heterogeneous components: tables, figures, text, and MEs. In such documents, MEs may be mixed with various other components, and the sizes and styles of MEs may vary frequently. Therefore, improving the accuracy of ME detection and recognition is an important step in the digitization of scientific documents. Inspired by the above ideas, the thesis ...
