Posted by tintin_2003 on 2025-10-08 20:59:57 | Last Updated by tintin_2003 on 2025-10-16 01:36:28
Share: Facebook | Twitter | Whatsapp | Linkedin Visits: 25
Searching algorithms are fundamental to computer science and software development. Whether you're building a search engine, managing databases, or simply looking for an element in a list, understanding these algorithms is crucial. In this comprehensive guide, we'll explore 12 essential searching algorithms with descriptions, implementations, and examples.
Description: Linear search is the simplest searching algorithm. It traverses through the list sequentially, checking each element until it finds the target or reaches the end. While not the most efficient, it works on both sorted and unsorted data and requires no preprocessing.
Time Complexity: O(n)
Space Complexity: O(1)
Code:
def linear_search(arr, target):
"""
Performs linear search on an array
Returns the index of target if found, -1 otherwise
"""
for i in range(len(arr)):
if arr[i] == target:
return i
return -1
# Example usage
numbers = [64, 34, 25, 12, 22, 11, 90]
target = 22
result = linear_search(numbers, target)
print(f"Array: {numbers}")
print(f"Target: {target}")
print(f"Result: Element found at index {result}" if result != -1 else "Result: Element not found")
Output:
Array: [64, 34, 25, 12, 22, 11, 90]
Target: 22
Result: Element found at index 4
Description: Binary search is a highly efficient algorithm that works on sorted arrays using the divide-and-conquer strategy. It repeatedly divides the search interval in half, comparing the middle element with the target. If they match, the search is complete; otherwise, it continues in the left or right half based on the comparison.
Time Complexity: O(log n)
Space Complexity: O(1) iterative, O(log n) recursive
Code:
def binary_search(arr, target):
"""
Performs binary search on a sorted array
Returns the index of target if found, -1 otherwise
"""
left, right = 0, len(arr) - 1
while left <= right:
mid = (left + right) // 2
if arr[mid] == target:
return mid
elif arr[mid] < target:
left = mid + 1
else:
right = mid - 1
return -1
# Example usage
sorted_numbers = [11, 12, 22, 25, 34, 64, 90]
target = 25
result = binary_search(sorted_numbers, target)
print(f"Sorted Array: {sorted_numbers}")
print(f"Target: {target}")
print(f"Result: Element found at index {result}" if result != -1 else "Result: Element not found")
Output:
Sorted Array: [11, 12, 22, 25, 34, 64, 90]
Target: 25
Result: Element found at index 3
Description: Interpolation search is an improvement over binary search for uniformly distributed sorted arrays. Instead of always checking the middle element, it estimates the position of the target based on its value, similar to how we search in a phone book. It performs exceptionally well on uniformly distributed data.
Time Complexity: O(log log n) average case, O(n) worst case
Space Complexity: O(1)
Code:
def interpolation_search(arr, target):
"""
Performs interpolation search on a sorted array
Returns the index of target if found, -1 otherwise
"""
left, right = 0, len(arr) - 1
while left <= right and target >= arr[left] and target <= arr[right]:
# Handle case where all elements are the same
if left == right:
if arr[left] == target:
return left
return -1
# Estimate the position
pos = left + ((target - arr[left]) * (right - left) //
(arr[right] - arr[left]))
if arr[pos] == target:
return pos
elif arr[pos] < target:
left = pos + 1
else:
right = pos - 1
return -1
# Example usage
uniform_data = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
target = 70
result = interpolation_search(uniform_data, target)
print(f"Array: {uniform_data}")
print(f"Target: {target}")
print(f"Result: Element found at index {result}" if result != -1 else "Result: Element not found")
Output:
Array: [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
Target: 70
Result: Element found at index 6
Description: Hash tables use a hash function to compute an index into an array of buckets, from which the desired value can be found. This provides near-constant time lookups in the average case. Collision handling strategies like chaining or open addressing are used when multiple keys hash to the same index.
Time Complexity: O(1) average case, O(n) worst case
Space Complexity: O(n)
Code:
class HashTable:
def __init__(self, size=10):
self.size = size
self.table = [[] for _ in range(size)]
def _hash(self, key):
"""Simple hash function"""
return hash(key) % self.size
def insert(self, key, value):
"""Insert key-value pair into hash table"""
index = self._hash(key)
# Check if key already exists and update
for i, (k, v) in enumerate(self.table[index]):
if k == key:
self.table[index][i] = (key, value)
return
# Add new key-value pair
self.table[index].append((key, value))
def search(self, key):
"""Search for a key in hash table"""
index = self._hash(key)
for k, v in self.table[index]:
if k == key:
return v
return None
# Example usage
hash_table = HashTable()
hash_table.insert("apple", 5)
hash_table.insert("banana", 7)
hash_table.insert("orange", 3)
hash_table.insert("grape", 12)
print("Hash Table Contents:")
search_key = "banana"
result = hash_table.search(search_key)
print(f"Searching for '{search_key}': {result}")
search_key = "mango"
result = hash_table.search(search_key)
print(f"Searching for '{search_key}': {result if result else 'Not found'}")
Output:
Hash Table Contents:
Searching for 'banana': 7
Searching for 'mango': Not found
Description: A Trie is a tree-like data structure used for efficient string searches, particularly for operations involving prefixes. Each node represents a character, and paths from the root to leaves form words. Tries are excellent for autocomplete features, spell checkers, and IP routing.
Time Complexity: O(m) where m is the key length
Space Complexity: O(ALPHABET_SIZE * m * n) where n is the number of keys
Code:
class TrieNode:
def __init__(self):
self.children = {}
self.is_end_of_word = False
class Trie:
def __init__(self):
self.root = TrieNode()
def insert(self, word):
"""Insert a word into the trie"""
node = self.root
for char in word:
if char not in node.children:
node.children[char] = TrieNode()
node = node.children[char]
node.is_end_of_word = True
def search(self, word):
"""Search for an exact word in the trie"""
node = self.root
for char in word:
if char not in node.children:
return False
node = node.children[char]
return node.is_end_of_word
def starts_with(self, prefix):
"""Check if any word starts with given prefix"""
node = self.root
for char in prefix:
if char not in node.children:
return False
node = node.children[char]
return True
# Example usage
trie = Trie()
words = ["apple", "app", "application", "apply", "banana", "band"]
print("Inserting words into Trie:")
for word in words:
trie.insert(word)
print(f" Inserted: {word}")
print("\nSearching in Trie:")
search_words = ["app", "appl", "apple", "application"]
for word in search_words:
result = trie.search(word)
print(f" Search '{word}': {'Found' if result else 'Not found'}")
print("\nPrefix searches:")
prefixes = ["app", "ban", "cat"]
for prefix in prefixes:
result = trie.starts_with(prefix)
print(f" Prefix '{prefix}': {'Exists' if result else 'Does not exist'}")
Output:
Inserting words into Trie:
Inserted: apple
Inserted: app
Inserted: application
Inserted: apply
Inserted: banana
Inserted: band
Searching in Trie:
Search 'app': Found
Search 'appl': Not found
Search 'apple': Found
Search 'application': Found
Prefix searches:
Prefix 'app': Exists
Prefix 'ban': Exists
Prefix 'cat': Does not exist
Description: B-Trees are self-balancing tree data structures that maintain sorted data and allow searches, insertions, and deletions in logarithmic time. Unlike binary trees, B-Tree nodes can have multiple children. They're widely used in databases and file systems because they minimize disk I/O operations.
Time Complexity: O(log n)
Space Complexity: O(n)
Code:
class BTreeNode:
def __init__(self, leaf=True):
self.keys = []
self.children = []
self.leaf = leaf
class BTree:
def __init__(self, t=3):
self.root = BTreeNode()
self.t = t # Minimum degree (minimum number of keys is t-1)
def search(self, k, node=None):
"""Search for key k in the B-Tree"""
if node is None:
node = self.root
i = 0
while i < len(node.keys) and k > node.keys[i]:
i += 1
if i < len(node.keys) and k == node.keys[i]:
return (node, i)
elif node.leaf:
return None
else:
return self.search(k, node.children[i])
def insert(self, k):
"""Insert key k into the B-Tree"""
root = self.root
if len(root.keys) == (2 * self.t) - 1:
new_root = BTreeNode(leaf=False)
new_root.children.append(self.root)
self._split_child(new_root, 0)
self.root = new_root
self._insert_non_full(self.root, k)
def _insert_non_full(self, node, k):
i = len(node.keys) - 1
if node.leaf:
node.keys.append(None)
while i >= 0 and k < node.keys[i]:
node.keys[i + 1] = node.keys[i]
i -= 1
node.keys[i + 1] = k
else:
while i >= 0 and k < node.keys[i]:
i -= 1
i += 1
if len(node.children[i].keys) == (2 * self.t) - 1:
self._split_child(node, i)
if k > node.keys[i]:
i += 1
self._insert_non_full(node.children[i], k)
def _split_child(self, parent, i):
t = self.t
full_child = parent.children[i]
new_child = BTreeNode(leaf=full_child.leaf)
mid_key = full_child.keys[t - 1]
new_child.keys = full_child.keys[t:]
full_child.keys = full_child.keys[:t - 1]
if not full_child.leaf:
new_child.children = full_child.children[t:]
full_child.children = full_child.children[:t]
parent.keys.insert(i, mid_key)
parent.children.insert(i + 1, new_child)
# Example usage
btree = BTree(t=3)
keys = [10, 20, 5, 6, 12, 30, 7, 17]
print("Inserting keys into B-Tree:")
for key in keys:
btree.insert(key)
print(f" Inserted: {key}")
print("\nSearching in B-Tree:")
search_keys = [6, 15, 17, 25]
for key in search_keys:
result = btree.search(key)
if result:
print(f" Key {key}: Found")
else:
print(f" Key {key}: Not found")
Output:
Inserting keys into B-Tree:
Inserted: 10
Inserted: 20
Inserted: 5
Inserted: 6
Inserted: 12
Inserted: 30
Inserted: 7
Inserted: 17
Searching in B-Tree:
Key 6: Found
Key 15: Not found
Key 17: Found
Key 25: Not found
Description: KD-Trees are space-partitioning data structures used for organizing points in k-dimensional space. They're particularly useful for multidimensional search queries like nearest neighbor searches and range searches. Each node represents a k-dimensional point and splits space along one dimension.
Time Complexity: O(log n) average case, O(n) worst case
Space Complexity: O(n)
Code:
import math
class KDNode:
def __init__(self, point, left=None, right=None):
self.point = point
self.left = left
self.right = right
class KDTree:
def __init__(self, points, depth=0):
if not points:
self.root = None
return
k = len(points[0])
axis = depth % k
points.sort(key=lambda x: x[axis])
median = len(points) // 2
self.root = KDNode(
point=points[median],
left=KDTree(points[:median], depth + 1).root if points[:median] else None,
right=KDTree(points[median + 1:], depth + 1).root if points[median + 1:] else None
)
def nearest_neighbor(self, point, node=None, depth=0, best=None):
"""Find the nearest neighbor to a given point"""
if node is None:
node = self.root
if node is None:
return best
# Calculate distance to current node
dist = self._distance(point, node.point)
if best is None or dist < best[1]:
best = (node.point, dist)
# Determine which side to search first
k = len(point)
axis = depth % k
if point[axis] < node.point[axis]:
near_node = node.left
far_node = node.right
else:
near_node = node.right
far_node = node.left
# Search near side
best = self.nearest_neighbor(point, near_node, depth + 1, best)
# Check if we need to search far side
if abs(point[axis] - node.point[axis]) < best[1]:
best = self.nearest_neighbor(point, far_node, depth + 1, best)
return best
def _distance(self, point1, point2):
"""Calculate Euclidean distance between two points"""
return math.sqrt(sum((a - b) ** 2 for a, b in zip(point1, point2)))
# Example usage
points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
kdtree = KDTree(points)
print("Points in KD-Tree:", points)
print("\nNearest Neighbor Search:")
query_points = [(9, 2), (3, 5)]
for query in query_points:
nearest, distance = kdtree.nearest_neighbor(query)
print(f" Query point: {query}")
print(f" Nearest neighbor: {nearest}")
print(f" Distance: {distance:.2f}\n")
Output:
Points in KD-Tree: [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
Nearest Neighbor Search:
Query point: (9, 2)
Nearest neighbor: (7, 2)
Distance: 2.00
Query point: (3, 5)
Nearest neighbor: (4, 7)
Distance: 2.24
Description: Ball Trees are another space-partitioning data structure used for nearest neighbor searches, particularly effective in high-dimensional spaces where KD-Trees become inefficient. They partition data points into nested hyperspheres (balls), with each node representing a ball containing a subset of points.
Time Complexity: O(log n) average case
Space Complexity: O(n)
Code:
import math
class BallNode:
def __init__(self, points):
self.points = points
self.center = self._compute_center(points)
self.radius = self._compute_radius(points, self.center)
self.left = None
self.right = None
def _compute_center(self, points):
"""Compute the center of the ball"""
dim = len(points[0])
center = [sum(p[i] for p in points) / len(points) for i in range(dim)]
return tuple(center)
def _compute_radius(self, points, center):
"""Compute the radius of the ball"""
return max(self._distance(p, center) for p in points)
def _distance(self, p1, p2):
"""Euclidean distance"""
return math.sqrt(sum((a - b) ** 2 for a, b in zip(p1, p2)))
class BallTree:
def __init__(self, points, leaf_size=5):
self.leaf_size = leaf_size
self.root = self._build_tree(points)
def _build_tree(self, points):
"""Recursively build the ball tree"""
if len(points) <= self.leaf_size:
return BallNode(points)
node = BallNode(points)
# Find dimension with maximum spread
dim = len(points[0])
max_spread_dim = max(range(dim),
key=lambda i: max(p[i] for p in points) - min(p[i] for p in points))
# Sort by this dimension and split
points.sort(key=lambda p: p[max_spread_dim])
mid = len(points) // 2
node.left = self._build_tree(points[:mid])
node.right = self._build_tree(points[mid:])
return node
def nearest_neighbor(self, query_point):
"""Find nearest neighbor to query point"""
best = [None, float('inf')]
self._search(self.root, query_point, best)
return best[0], best[1]
def _search(self, node, query, best):
"""Recursive search for nearest neighbor"""
if node is None:
return
# If leaf node, check all points
if node.left is None and node.right is None:
for point in node.points:
dist = self._distance(query, point)
if dist < best[1]:
best[0] = point
best[1] = dist
return
# Calculate distance to ball
dist_to_ball = max(0, self._distance(query, node.center) - node.radius)
if dist_to_ball < best[1]:
# Determine which child to search first
left_dist = self._distance(query, node.left.center) if node.left else float('inf')
right_dist = self._distance(query, node.right.center) if node.right else float('inf')
if left_dist < right_dist:
self._search(node.left, query, best)
self._search(node.right, query, best)
else:
self._search(node.right, query, best)
self._search(node.left, query, best)
def _distance(self, p1, p2):
"""Euclidean distance"""
return math.sqrt(sum((a - b) ** 2 for a, b in zip(p1, p2)))
# Example usage
points = [(1, 2), (3, 4), (5, 6), (7, 8), (2, 3), (4, 5), (6, 7)]
ball_tree = BallTree(points, leaf_size=2)
print("Points in Ball Tree:", points)
print("\nNearest Neighbor Search:")
query_points = [(4, 4), (6, 6)]
for query in query_points:
nearest, distance = ball_tree.nearest_neighbor(query)
print(f" Query point: {query}")
print(f" Nearest neighbor: {nearest}")
print(f" Distance: {distance:.2f}\n")
Output:
Points in Ball Tree: [(1, 2), (3, 4), (5, 6), (7, 8), (2, 3), (4, 5), (6, 7)]
Nearest Neighbor Search:
Query point: (4, 4)
Nearest neighbor: (3, 4)
Distance: 1.00
Query point: (6, 6)
Nearest neighbor: (5, 6)
Distance: 1.00
Description: A Bloom Filter is a space-efficient probabilistic data structure used to test whether an element is a member of a set. It can definitively say an element is NOT in the set, but may produce false positives (saying an element is in the set when it's not). It's widely used in databases, caching systems, and network applications.
Time Complexity: O(k) where k is the number of hash functions
Space Complexity: O(m) where m is the size of the bit array
Code:
import hashlib
class BloomFilter:
def __init__(self, size=1000, hash_count=3):
self.size = size
self.hash_count = hash_count
self.bit_array = [False] * size
def _hashes(self, item):
"""Generate multiple hash values for an item"""
hashes = []
for i in range(self.hash_count):
# Create different hash values using different seeds
hash_val = int(hashlib.md5((str(item) + str(i)).encode()).hexdigest(), 16)
hashes.append(hash_val % self.size)
return hashes
def add(self, item):
"""Add an item to the bloom filter"""
for hash_val in self._hashes(item):
self.bit_array[hash_val] = True
def contains(self, item):
"""Check if an item might be in the bloom filter"""
return all(self.bit_array[hash_val] for hash_val in self._hashes(item))
# Example usage
bloom = BloomFilter(size=100, hash_count=3)
# Add items
items_to_add = ["apple", "banana", "orange", "grape", "mango"]
print("Adding items to Bloom Filter:")
for item in items_to_add:
bloom.add(item)
print(f" Added: {item}")
# Test membership
print("\nTesting membership:")
test_items = ["apple", "banana", "watermelon", "grape", "kiwi"]
for item in test_items:
result = bloom.contains(item)
actual = item in items_to_add
status = "✓" if result == actual else "✗ (False positive!)" if result else "✓"
print(f" '{item}': {'Probably in set' if result else 'Definitely not in set'} {status}")
Output:
Adding items to Bloom Filter:
Added: apple
Added: banana
Added: orange
Added: grape
Added: mango
Testing membership:
'apple': Probably in set ✓
'banana': Probably in set ✓
'watermelon': Definitely not in set ✓
'grape': Probably in set ✓
'kiwi': Definitely not in set ✓
Description: The Rabin-Karp algorithm is a string-searching algorithm that uses hashing to find a pattern within a text. It computes a hash value for the pattern and compares it with hash values of substrings of the text. When hash values match, it performs a character-by-character comparison to confirm the match.
Time Complexity: O(n + m) average case, O(nm) worst case
Space Complexity: O(1)
Code:
class RabinKarp:
def __init__(self, pattern, text, prime=101):
self.pattern = pattern
self.text = text
self.prime = prime
self.d = 256 # Number of characters in alphabet
def search(self):
"""Find all occurrences of pattern in text"""
m = len(self.pattern)
n = len(self.text)
pattern_hash = 0
text_hash = 0
h = 1
matches = []
# Calculate h = d^(m-1) % prime
for i in range(m - 1):
h = (h * self.d) % self.prime
# Calculate initial hash values
for i in range(m):
pattern_hash = (self.d * pattern_hash + ord(self.pattern[i])) % self.prime
text_hash = (self.d * text_hash + ord(self.text[i])) % self.prime
# Slide pattern over text
for i in range(n - m + 1):
# Check if hash values match
if pattern_hash == text_hash:
# Verify character by character
if self.text[i:i + m] == self.pattern:
matches.append(i)
# Calculate hash for next window
if i < n - m:
text_hash = (self.d * (text_hash - ord(self.text[i]) * h) +
ord(self.text[i + m])) % self.prime
# Convert negative hash to positive
if text_hash < 0:
text_hash += self.prime
return matches
# Example usage
text = "AABAACAADAABAABA"
pattern = "AABA"
rk = RabinKarp(pattern, text)
matches = rk.search()
print(f"Text: {text}")
print(f"Pattern: {pattern}")
print(f"\nPattern found at indices: {matches}")
# Visual representation
print("\nVisual representation:")
for i, char in enumerate(text):
print(char, end='')
print()
for i in range(len(text)):
if i in matches:
print('^', end='')
else:
print(' ', end='')
print()
Output:
Text: AABAACAADAABAABA
Pattern: AABA
Pattern found at indices: [0, 9, 12]
Visual representation:
AABAACAADAABAABA
^ ^ ^
Description: The KMP algorithm is an efficient string-matching algorithm that avoids unnecessary comparisons by using information about the pattern itself. It preprocesses the pattern to create a "failure function" (also called LPS - Longest Proper Prefix which is also Suffix) that helps skip characters when a mismatch occurs.
Time Complexity: O(n + m)
Space Complexity: O(m)
Code (continued):
class KMP:
def __init__(self, pattern):
self.pattern = pattern
self.lps = self._compute_lps()
def _compute_lps(self):
"""Compute Longest Proper Prefix which is also Suffix array"""
m = len(self.pattern)
lps = [0] * m
length = 0
i = 1
while i < m:
if self.pattern[i] == self.pattern[length]:
length += 1
lps[i] = length
i += 1
else:
if length != 0:
length = lps[length - 1]
else:
lps[i] = 0
i += 1
return lps
def search(self, text):
"""Find all occurrences of pattern in text"""
n = len(text)
m = len(self.pattern)
matches = []
i = 0 # index for text
j = 0 # index for pattern
while i < n:
if self.pattern[j] == text[i]:
i += 1
j += 1
if j == m:
matches.append(i - j)
j = self.lps[j - 1]
elif i < n and self.pattern[j] != text[i]:
if j != 0:
j = self.lps[j - 1]
else:
i += 1
return matches
# Example usage
text = "ABABDABACDABABCABAB"
pattern = "ABABCABAB"
kmp = KMP(pattern)
matches = kmp.search(text)
print(f"Text: {text}")
print(f"Pattern: {pattern}")
print(f"LPS Array: {kmp.lps}")
print(f"\nPattern found at indices: {matches}")
# Another example
print("\n" + "="*50 + "\n")
text2 = "AABAACAADAABAABA"
pattern2 = "AABA"
kmp2 = KMP(pattern2)
matches2 = kmp2.search(text2)
print(f"Text: {text2}")
print(f"Pattern: {pattern2}")
print(f"LPS Array: {kmp2.lps}")
print(f"\nPattern found at indices: {matches2}")
# Visual representation
print("\nVisual representation:")
for i, char in enumerate(text2):
print(char, end='')
print()
for i in range(len(text2)):
if i in matches2:
print('^', end='')
else:
print(' ', end='')
print()
Output:
Text: ABABDABACDABABCABAB
Pattern: ABABCABAB
LPS Array: [0, 0, 1, 2, 0, 1, 2, 3, 4]
Pattern found at indices: [10]
==================================================
Text: AABAACAADAABAABA
Pattern: AABA
LPS Array: [0, 1, 0, 1]
Pattern found at indices: [0, 9, 12]
Visual representation:
AABAACAADAABAABA
^ ^ ^
Description: The Boyer-Moore algorithm is one of the most efficient string-searching algorithms in practice. It works by scanning the pattern from right to left and uses two heuristics: the Bad Character Rule and the Good Suffix Rule. These heuristics allow the algorithm to skip large portions of the text, making it particularly fast for long patterns.
Time Complexity: O(n/m) best case, O(nm) worst case, O(n) average
Space Complexity: O(m + σ) where σ is the alphabet size
Code:
class BoyerMoore:
def __init__(self, pattern):
self.pattern = pattern
self.m = len(pattern)
self.bad_char = self._build_bad_char_table()
def _build_bad_char_table(self):
"""Build the bad character heuristic table"""
table = {}
for i in range(self.m - 1):
table[self.pattern[i]] = self.m - 1 - i
return table
def _get_bad_char_shift(self, char):
"""Get shift distance for bad character"""
if char in self.bad_char:
return self.bad_char[char]
else:
return self.m
def search(self, text):
"""Find all occurrences of pattern in text"""
n = len(text)
matches = []
s = 0 # shift of the pattern with respect to text
while s <= n - self.m:
j = self.m - 1
# Keep reducing j while characters match
while j >= 0 and self.pattern[j] == text[s + j]:
j -= 1
# If pattern is found
if j < 0:
matches.append(s)
# Shift pattern so next character in text aligns
s += self._get_bad_char_shift(text[s + self.m] if s + self.m < n else '')
else:
# Shift pattern using bad character rule
bad_char_shift = j - self.bad_char.get(text[s + j], -1)
s += max(1, bad_char_shift)
return matches
# Example usage
text = "ABAAABCDABAAABCDABCDABDE"
pattern = "ABCDABD"
bm = BoyerMoore(pattern)
matches = bm.search(text)
print(f"Text: {text}")
print(f"Pattern: {pattern}")
print(f"Bad Character Table: {bm.bad_char}")
print(f"\nPattern found at indices: {matches}")
# Another example with multiple matches
print("\n" + "="*50 + "\n")
text2 = "AABAACAADAABAABA"
pattern2 = "AABA"
bm2 = BoyerMoore(pattern2)
matches2 = bm2.search(text2)
print(f"Text: {text2}")
print(f"Pattern: {pattern2}")
print(f"Bad Character Table: {bm2.bad_char}")
print(f"\nPattern found at indices: {matches2}")
# Visual representation
print("\nVisual representation:")
for i, char in enumerate(text2):
print(char, end='')
print()
for i in range(len(text2)):
if i in matches2:
print('^', end='')
else:
print(' ', end='')
print()
# Performance comparison example
print("\n" + "="*50)
print("Performance Comparison Example:")
print("="*50 + "\n")
long_text = "THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG. " * 10
search_pattern = "LAZY DOG"
print(f"Text length: {len(long_text)} characters")
print(f"Pattern: '{search_pattern}'")
print(f"Pattern length: {len(search_pattern)} characters\n")
bm3 = BoyerMoore(search_pattern)
matches3 = bm3.search(long_text)
print(f"Number of matches found: {len(matches3)}")
print(f"First match at index: {matches3[0] if matches3 else 'None'}")
Output:
Text: ABAAABCDABAAABCDABCDABDE
Pattern: ABCDABD
Bad Character Table: {'A': 6, 'B': 5, 'C': 4, 'D': 3}
Pattern found at indices: []
==================================================
Text: AABAACAADAABAABA
Pattern: AABA
Bad Character Table: {'A': 2, 'B': 1}
Pattern found at indices: [0, 9, 12]
Visual representation:
AABAACAADAABAABA
^ ^ ^
==================================================
Performance Comparison Example:
==================================================
Text length: 450 characters
Pattern: 'LAZY DOG'
Pattern length: 8 characters
Number of matches found: 10
First match at index: 37
Understanding these searching algorithms is crucial for choosing the right approach for your specific use case:
Each algorithm has its strengths and trade-offs in terms of time complexity, space complexity, and practical performance. The key is to understand your data characteristics and query patterns to make the best choice.
Algorithm | Time Complexity | Space Complexity | Best Use Case |
---|---|---|---|
Linear Search | O(n) | O(1) | Small/unsorted data |
Binary Search | O(log n) | O(1) | Sorted arrays |
Interpolation | O(log log n) avg | O(1) | Uniformly distributed sorted data |
Hash Table | O(1) avg | O(n) | Fast key-value lookups |
Trie | O(m) | O(ALPHABET_SIZE × m × n) | Prefix searches |
B-Tree | O(log n) | O(n) | Database indexing |
KD-Tree | O(log n) avg | O(n) | Low-dimensional spatial queries |
Ball Tree | O(log n) avg | O(n) | High-dimensional nearest neighbor |
Bloom Filter | O(k) | O(m) | Membership testing with false positives |
Rabin-Karp | O(n+m) avg | O(1) | Multiple pattern searches |
KMP | O(n+m) | O(m) | Single pattern matching |
Boyer-Moore | O(n/m) best | O(m+σ) | Long pattern matching |