Lecture "Natural Language Processing, Chapter 4: Computational Linguistics" covers: what computational linguistics is, corpus definitions, corpus categories, applications of parallel corpora, alignment methods, normalization, lemmatization, and tokenization.

Trường Đại học Công nghiệp Tp. HCM (Industrial University of Ho Chi Minh City)
Khoa Công nghệ thông tin (Faculty of Information Technology)
NATURAL LANGUAGE PROCESSING
Teacher: Lê Ngọc Tấn

Chapter 4: Computational Linguistics

What is computational linguistics?
It is an interdisciplinary field concerned with the statistical or rule-based modeling of natural language from a computational perspective.
Topics in this chapter:
– Corpus, corpora
– Pre-processing: normalization, tokenization
– Alignment methods
– Programming

Corpus Definitions
What is a corpus?
– A corpus is a large collection of texts
– Corpora: the plural of corpus, i.e. several such collections
Gold-standard corpora:
– Brown Corpus
– Susanne Corpus
– EUROPARL Corpus
A corpus can be annotated, for example POS-tagged.

Corpus Categories (1)
[Figure: schema of corpus evolution]
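
The pre-processing steps named in the outline (normalization, tokenization) can be illustrated with a minimal Python sketch. The function names `normalize` and `tokenize`, the specific rules (Unicode NFC, lowercasing, whitespace collapsing, a simple regular expression), and the sample sentence are assumptions chosen for illustration and are not taken from the lecture. Lemmatization is left out because it typically needs a dictionary or a trained model.

```python
import re
import unicodedata


def normalize(text: str) -> str:
    """Apply a simple normalization: Unicode NFC, lowercase, collapsed whitespace."""
    text = unicodedata.normalize("NFC", text)  # unify composed/decomposed characters
    text = text.lower()                        # case folding (an assumed design choice)
    return re.sub(r"\s+", " ", text).strip()   # collapse runs of whitespace


def tokenize(text: str) -> list[str]:
    """Split text into word tokens and single punctuation tokens."""
    # \w+ matches a run of letters/digits/underscores; [^\w\s] matches one punctuation mark.
    return re.findall(r"\w+|[^\w\s]", text)


if __name__ == "__main__":
    sample = "  The Brown  Corpus is POS-tagged. "  # hypothetical example sentence
    print(tokenize(normalize(sample)))
    # prints ['the', 'brown', 'corpus', 'is', 'pos', '-', 'tagged', '.']
```

Note that this naive tokenizer splits the hyphenated word "POS-tagged" into three tokens. Whether that is desirable depends on the task, which is one reason real corpora document their tokenization conventions.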