RabinKarp Algorithm
RabinKarp Algorithm is the important topic of the Analysis & Design of Algorithm. Moreover, freestudy9 has all kind of important information and topic related to it.
 This algorithm makes use of elementary numbertheoretic notions such as the equivalence of two numbers modulo a third number.
 Let us assume that Σ = {0, 1, 2… 9}, so that each character is a decimal digit. (In the general case, we can assume that each character is a digit in radixd notation, where d = Σ).
 We can then view a string of k consecutive characters as representing a lengthk decimal number. The character string 31415 thus corresponds to the decimal number 31,415.
 Given a pattern P [1…m], let p denotes its corresponding decimal value.
 In a similar manner, given a text T [1…n], let t_{s} denote the decimal value of the lengthm substring T[s + 1…s + m], for s = 0, 1. . . n – m.
 Certainly, t_{s} = p if and only if T [s + 1…s + m] = P [1…m]; thus, s is a valid shift if and only if t_{s} = p.
 We can compute p in time θ(m) using Horner’s rule: P = P[m] + 10 (P[m – 1] + 10(P[m – 2] + · · · + 10(P[2] + 10P[1])…))

The value t_{0 }can be similarly computed from T [1…m] in time θ(m).
 To compute the remaining values t_{1}, t_{2}, . . . , t_{nm} in time θ(nm), it suffices to observe that t_{s+1 }can be computed from t_{s} in constant time, since t_{s+1} = 10(t_{s} – 10^{m1}T[s + 1]) + T[s + m + 1]
 Subtracting 10^{m1 }T[s + 1] removes the highorder digit from t_{s}, multiplying the result by 10 shifts the number left one position, and adding T [s + m + 1] brings in the appropriate lower order digit.
 For example, if m = 5 and t_{s} = 31415 then we wish to remove the high order digit T [s + 1] = 3 and bring in the new lower order digit (suppose it is T [s + 5 + 1] = 2) to obtain t_{s+1} = 10 (31415 – 10000 *3) + 2 = 14152
 The only difficulty with this procedure is that p and t_{s} may be too large to work with conveniently.
 There is a simple cure for this problem, compute p and the t_{s}‘s modulo a suitable modulus q. t_{s+1} = (d(t_{s} – T[s + 1]h) + T[s + m + 1]) mod q Where h = d^{m1} (mod q) is the value of the digit “1” in the highorder position of an mdigit text window.
 The solution of working modulo q is not perfect, however, t_{s }= p (mod q) does not imply that t_{s }= p but if t_{s }≠ p (mod q) definitely implies t_{s }≠ p, so that shift s is invalid.
 Any shift s for which t_{s }= p (mod q) must be tested further to see whether s is really valid or it is just a spurious hit.
 This additional test explicitly checks the condition T [s + 1…s + m] = P [1…m].
Algorithm RABINKARPMATCHER(T, P, d, q)
n ← length[T];
m ← length[P];
h ← d^{m1} mod q;
p ← 0;
t_{0} ← 0;
for i ← 1 to m do
p ← (dp + P[i]) mod q;
t_{0} ← (dt_{0} + P[i]) mod q
for s ← 0 to n – m do
if p == t_{s} then
if P[1..m] == T[s+1..s+m] then
print “pattern occurs with shift s”
if s<nm then
t_{s+1} ← (d(t_{s} – T[s+1]h) + T[s+m+1]) mod q
Analysis for RabinKarp Algorithm
RABINKARPMATCHER takes Θ(m) preprocessing time and it matching time is Θ(m(n – m + 1)) in the worst case.
Related Terms
ADA Algorithms, Travelling Salesman Problem, Naive String Matching Algorithm, Minmax Principle
Leave a Reply