The lecture "Natural Language Processing - Chapter 2: Part-of-Speech Tagging" covers the PennTreebank, hidden Markov models, conditional random fields, and evaluation.

Chapter 2: Part-of-Speech Tagging
IT4772 Natural Language Processing
School of Information and Communication Technology (Viện CNTT-TT), Hanoi University of Science and Technology (ĐHBKHN)
● PennTreebank
● Hidden Markov model
● Conditional Random Fields
● Evaluation

PennTreebank
● Created by the University of Pennsylvania
● Eight-year project: 1989–1996
● 7 million words of POS-tagged text
● POS tagset is based on the Brown Corpus

[Figure: NLP overview diagram. Labels: Information Extraction, Natural Language Understanding, Natural Language Generation, End-to-End Applications, Data + Linguistics + Machine Learning]

Penn POS tagset
● CC: He bought a car and a house.
● CD: Five years later, autocar will be popular.
● DT: Pierre Vinken will join the board.
● EX: There is no asbestos in our product now.
● IN: Mr Vinken is chairman of Elsevier.
● JJ: Rudolph Agnew was named an executive director.
● JJR: The number of deaths was higher than expected.
● JJS: The percentage of lung cancer appears to be highest.
● MD: US should regulate the class of asbestos.
● NN: It’s more than three times the expected number.
● NNS: Portfolio managers expect further declines in interest rates.
● NNP: Alexis Sanchez joined Manchester United yesterday.
● NNPS: the Japan Automobile Dealers’ Association
● POS: at Monday’s auction
● PRP: It expects to obtain regulatory approval.
● PP$: Shareholders approve its acquisition by Royal Trustco Ltd.
● RB: depends heavily on creativity
● RBR: worked for the project for more than six years
● RBS: the most mundane aspect of its workers
● TO: to return home
● VB: He decided to stay
● VBD: the executives joined Mayor William
● VBG: before boarding the buses again
● VBN: A buffet breakfast was held in the museum
● VBP: Plans that give advertisers discounts
● VBZ: The plan is not an attempt
● WDT: a project that did not include Seymor
● WP: who couldn’t be reached for comment
● WRB: where employees are assigned lunch partners
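To make the tagset concrete, the sketch below shows one possible way to represent a POS-tagged sentence in Python. It is only an illustration, not part of the lecture: the names `PENN_TAGS`, `to_slash_format`, and `words_with_tag` are hypothetical, the tag descriptions come from the standard Penn Treebank tagset rather than from the slides, and the sentence (the CC example above) is tagged by hand.

```python
# A minimal sketch, assuming a tagged sentence is stored as (word, tag) pairs.
# Tag descriptions follow the standard Penn Treebank tagset; only the tags
# needed for this one sentence are listed.
PENN_TAGS = {
    "CC": "coordinating conjunction",
    "DT": "determiner",
    "NN": "noun, singular or mass",
    "PRP": "personal pronoun",
    "VBD": "verb, past tense",
    ".": "sentence-final punctuation",
}

# Hand-tagged version of the CC example sentence (illustrative, not corpus data).
tagged = [
    ("He", "PRP"), ("bought", "VBD"), ("a", "DT"), ("car", "NN"),
    ("and", "CC"), ("a", "DT"), ("house", "NN"), (".", "."),
]


def to_slash_format(pairs):
    """Render (word, tag) pairs in the common word/TAG notation."""
    return " ".join(f"{word}/{tag}" for word, tag in pairs)


def words_with_tag(pairs, tag):
    """Return the words that carry the given tag, in sentence order."""
    return [word for word, t in pairs if t == tag]


if __name__ == "__main__":
    print(to_slash_format(tagged))
    print(words_with_tag(tagged, "CC"))
    for _, tag in tagged:
        print(tag, "=", PENN_TAGS[tag])
```

Storing a sentence as a list of pairs keeps each word aligned with its tag, which is the same alignment that sequence models such as HMMs and CRFs predict one position at a time.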