**Rabin-Karp Algorithm**

Rabin-Karp Algorithm is the important topic of the Analysis & Design of Algorithm. Moreover, freestudy9 has all kind of important information and topic related to it.

- This algorithm makes use of elementary number-theoretic notions such as the equivalence of two numbers modulo a third number.
- Let us assume that Σ = {0, 1, 2… 9}, so that each character is a decimal digit. (In the general case, we can assume that each character is a digit in radix-d notation, where d = |Σ|).
- We can then view a string of k consecutive characters as representing a length-k decimal number. The character string 31415 thus corresponds to the decimal number 31,415.
- Given a pattern P [1…m], let p denotes its corresponding decimal value.
- In a similar manner, given a text T [1…n], let t
_{s}denote the decimal value of the length-m substring T[s + 1…s + m], for s = 0, 1. . . n – m. - Certainly, t
_{s}= p if and only if T [s + 1…s + m] = P [1…m]; thus, s is a valid shift if and only if t_{s}= p. - We can compute p in time θ(m) using Horner’s rule: P = P[m] + 10 (P[m – 1] + 10(P[m – 2] + · · · + 10(P[2] + 10P[1])…))

##### The value t

_{0 }can be similarly computed from T [1…m] in time θ(m).- To compute the remaining values t
_{1}, t_{2}, . . . , t_{n-m}in time θ(n-m), it suffices to observe that t_{s+1 }can be computed from t_{s}in constant time, since t_{s+1}= 10(t_{s}– 10^{m-1}T[s + 1]) + T[s + m + 1]

- Subtracting 10
^{m-1 }T[s + 1] removes the high-order digit from t_{s}, multiplying the result by 10 shifts the number left one position, and adding T [s + m + 1] brings in the appropriate lower order digit. - For example, if m = 5 and t
_{s}= 31415 then we wish to remove the high order digit T [s + 1] = 3 and bring in the new lower order digit (suppose it is T [s + 5 + 1] = 2) to obtain t_{s+1}= 10 (31415 – 10000 *3) + 2 = 14152 - The only difficulty with this procedure is that p and t
_{s}may be too large to work with conveniently. - There is a simple cure for this problem, compute p and the t
_{s}‘s modulo a suitable modulus q. t_{s+1}= (d(t_{s}– T[s + 1]h) + T[s + m + 1]) mod q Where h = d^{m-1}(mod q) is the value of the digit “1” in the high-order position of an m-digit text window. - The solution of working modulo q is not perfect, however, t
_{s }= p (mod q) does not imply that t_{s }= p but if t_{s }≠ p (mod q) definitely implies t_{s }≠ p, so that shift s is invalid. - Any shift s for which t
_{s }= p (mod q) must be tested further to see whether s is really valid or it is just a spurious hit. - This additional test explicitly checks the condition T [s + 1…s + m] = P [1…m].

**Algorithm RABIN-KARP-MATCHER(T, P, d, q)**

n ← length[T];

m ← length[P];

h ← d^{m-1} mod q;

p ← 0;

t_{0} ← 0;

**for** i ← 1 **to** m **do**

p ← (dp + P[i]) mod q;

t_{0} ← (dt_{0} + P[i]) mod q

**for** s ← 0 **to** n – m **do**

**if** p == t_{s} **then**

** if **P[1..m] == T[s+1..s+m] **then**

**print** “pattern occurs with shift s”

**if** s<n-m **then**

t_{s+1} ← (d(t_{s} – T[s+1]h) + T[s+m+1]) mod q

**Analysis for Rabin-Karp Algorithm**

RABIN-KARP-MATCHER takes **Θ(m)** preprocessing time and it matching time is **Θ(m(n – m + 1)) **in the worst case.

**Related Terms**

ADA Algorithms, Travelling Salesman Problem, Naive String Matching Algorithm, Minmax Principle

## Leave a Reply