Data Structures and Algorithms: Text Processing! presents about Pattern Matching, Strings, Brute-Force Pattern Matching, Boyer-Moore Heuristics, Last-Occurrence Function, The Boyer-Moore Algorithm. | Data Structures and Algorithms
Text Processing! Outline" • Pattern Matching! • Tries! • The Greedy Method and Text Compression! • Dynamic Programming! Phạm Bảo Sơn - DSA 2 Pattern Matching " Strings " • A string is a sequence of characters! • Examples of strings:! – – – – Java program! HTML document! DNA sequence! Digitized image! • An alphabet Σ is the set of possible characters for a family of strings! • Example of alphabets:! – ASCII! – Unicode! – {0, 1}! – Phạm C, G, T}! DSA {A, Bảo Sơn - • Let P be a string of size m ! – A substring P[i j] of P is the subsequence of P consisting of the characters with ranks between i and j – A prefix of P is a substring of the type P[0 i] – A suffix of P is a substring of the type P[i m - 1] • Given strings T (text) and P (pattern), the pattern matching problem consists of finding a substring of T equal to P • Applications:! – Text editors! – Search engines! – Biological research! 4 Brute-Force Pattern Matching " • The brute-force pattern matching algorithm compares the pattern P with the text T for each possible shift of P relative to T, until either! – a match is found, or! – all placements of the pattern have been tried! Phạm Bảo Sơn - .