This chapter provides knowledge of pattern matching. Data structures and algorithms in java provides an introduction to data structures and algorithms, including their design, analysis, and implementation. | Pattern Matching 3/25/14 15:54 Presentation for use with the textbook Data Structures and Algorithms in Java, 6th edition, by M. T. Goodrich, R. Tamassia, and M. H. Goldwasser, Wiley, 2014 Pattern Matching a b a c a a a b a c a b a b a b 1 4 © 2014 Goodrich, Tamassia, Goldwasser 3 2 c a b Pattern Matching 1 Strings ! A string is a sequence of ! ! Let P be a string of size m characters Examples of strings: n n n n n Python program HTML document DNA sequence Digitized image n n ! An alphabet Σ is the set of ! possible characters for a family of strings Example of alphabets: n n n n ASCII Unicode {0, 1} {A, C, G, T} ! Given strings T (text) and P ! (pattern), the pattern matching problem consists of finding a substring of T equal to P Applications: n n n © 2014 Goodrich, Tamassia, Goldwasser A substring P[i j] of P is the subsequence of P consisting of the characters with ranks between i and j A prefix of P is a substring of the type P[0 i] A suffix of P is a substring of the type P[i m - 1] Pattern Matching Text editors Search engines Biological research 2 1 Pattern Matching 3/25/14 15:54 Brute-Force Pattern Matching ! The brute-force pattern matching algorithm compares the pattern P with the text T for each possible shift of P relative to T, until either n n a match is found, or all placements of the pattern have been tried ! Brute-force pattern matching runs in time O(nm) ! Example of worst case: n n n n T = aaa ah P = aaah may occur in images and DNA sequences unlikely in English text © 2014 Goodrich, Tamassia, Goldwasser Algorithm BruteForceMatch(T, P) Input text T of size n and pattern P of size m Output starting index of a substring of T equal to P or -1 if no such substring exists for i ← 0 to n - m { test shift i of the pattern } j←0 while j n - 1 return -1 { no match } © 2014 Goodrich, Tamassia, Goldwasser Case 1: j ≤ 1 + l . . . a . i . . . b