Hash Table Problems¶
This section covers common coding problems that can be efficiently solved using hash tables. These problems frequently appear in technical interviews and demonstrate the power of hash-based data structures.
Problem Categories¶
1. Frequency Counting¶
2. Two-Sum Variants¶
3. Substring Problems¶
4. Set Operations¶
5. Caching and Memoization¶
1. Frequency Counting Problems¶
Character Frequency¶
Problem: Count the frequency of each character in a string.
def char_frequency(s):
"""
Time: O(n), Space: O(k) where k is number of unique characters
"""
freq = {}
for char in s:
freq[char] = freq.get(char, 0) + 1
return freq
# Example
text = "hello world"
print(char_frequency(text))
# Output: {'h': 1, 'e': 1, 'l': 3, 'o': 2, ' ': 1, 'w': 1, 'r': 1, 'd': 1}
Most Frequent Element¶
Problem: Find the most frequently occurring element in an array.
def most_frequent(arr):
"""
Time: O(n), Space: O(n)
"""
if not arr:
return None
freq = {}
for num in arr:
freq[num] = freq.get(num, 0) + 1
return max(freq, key=freq.get)
# Example
numbers = [1, 3, 2, 3, 4, 3, 2]
print(most_frequent(numbers)) # Output: 3
2. Two-Sum Variants¶
Classic Two Sum¶
Problem: Find two numbers in array that add up to target.
def two_sum(nums, target):
"""
Time: O(n), Space: O(n)
"""
seen = {}
for i, num in enumerate(nums):
complement = target - num
if complement in seen:
return [seen[complement], i]
seen[num] = i
return []
# Example
nums = [2, 7, 11, 15]
target = 9
print(two_sum(nums, target)) # Output: [0, 1]
Three Sum¶
Problem: Find all unique triplets that sum to zero.
def three_sum(nums):
"""
Time: O(n²), Space: O(1) excluding output
"""
nums.sort()
result = []
for i in range(len(nums) - 2):
# Skip duplicates for first number
if i > 0 and nums[i] == nums[i-1]:
continue
left, right = i + 1, len(nums) - 1
while left < right:
total = nums[i] + nums[left] + nums[right]
if total == 0:
result.append([nums[i], nums[left], nums[right]])
# Skip duplicates
while left < right and nums[left] == nums[left + 1]:
left += 1
while left < right and nums[right] == nums[right - 1]:
right -= 1
left += 1
right -= 1
elif total < 0:
left += 1
else:
right -= 1
return result
3. Substring Problems¶
Longest Substring Without Repeating Characters¶
Problem: Find length of longest substring without repeating characters.
def length_of_longest_substring(s):
"""
Time: O(n), Space: O(min(m,n)) where m is character set size
"""
char_index = {}
left = 0
max_length = 0
for right, char in enumerate(s):
if char in char_index and char_index[char] >= left:
left = char_index[char] + 1
char_index[char] = right
max_length = max(max_length, right - left + 1)
return max_length
# Example
s = "abcabcbb"
print(length_of_longest_substring(s)) # Output: 3 ("abc")
Group Anagrams¶
Problem: Group strings that are anagrams of each other.
def group_anagrams(strs):
"""
Time: O(n * k log k) where n is number of strings, k is max string length
Space: O(n * k)
"""
from collections import defaultdict
anagram_groups = defaultdict(list)
for s in strs:
# Sort characters to create key
key = ''.join(sorted(s))
anagram_groups[key].append(s)
return list(anagram_groups.values())
# Example
words = ["eat", "tea", "tan", "ate", "nat", "bat"]
print(group_anagrams(words))
# Output: \[\[['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']\]
4. Set Operations¶
Intersection of Two Arrays¶
Problem: Find intersection of two arrays.
def intersection(nums1, nums2):
"""
Time: O(n + m), Space: O(min(n,m))
"""
set1 = set(nums1)
result = set()
for num in nums2:
if num in set1:
result.add(num)
return list(result)
# Example
nums1 = [1, 2, 2, 1]
nums2 = [2, 2]
print(intersection(nums1, nums2)) # Output: [2]
Find Missing Number¶
Problem: Find missing number in array containing n distinct numbers from 0 to n.
def missing_number(nums):
"""
Multiple approaches - Hash table version
Time: O(n), Space: O(n)
"""
num_set = set(nums)
n = len(nums)
for i in range(n + 1):
if i not in num_set:
return i
return -1 # Should never reach here
# More efficient approaches:
def missing_number_math(nums):
"""
Time: O(n), Space: O(1)
"""
n = len(nums)
expected_sum = n * (n + 1) // 2
actual_sum = sum(nums)
return expected_sum - actual_sum
def missing_number_xor(nums):
"""
Time: O(n), Space: O(1)
"""
missing = len(nums)
for i, num in enumerate(nums):
missing ^= i ^ num
return missing
5. Caching and Memoization¶
LRU Cache¶
Problem: Implement Least Recently Used cache.
class LRUCache:
def __init__(self, capacity):
self.capacity = capacity
self.cache = {} # key -> node
# Create dummy head and tail nodes
self.head = Node(0, 0)
self.tail = Node(0, 0)
self.head.next = self.tail
self.tail.prev = self.head
def get(self, key):
if key in self.cache:
node = self.cache[key]
# Move to head (most recently used)
self._remove(node)
self._add_to_head(node)
return node.value
return -1
def put(self, key, value):
if key in self.cache:
# Update existing
node = self.cache[key]
node.value = value
self._remove(node)
self._add_to_head(node)
else:
# Add new
if len(self.cache) >= self.capacity:
# Remove least recently used
lru = self.tail.prev
self._remove(lru)
del self.cache[lru.key]
node = Node(key, value)
self.cache[key] = node
self._add_to_head(node)
def _remove(self, node):
node.prev.next = node.next
node.next.prev = node.prev
def _add_to_head(self, node):
node.next = self.head.next
self.head.next.prev = node
self.head.next = node
node.prev = self.head
class Node:
def __init__(self, key, value):
self.key = key
self.value = value
self.prev = None
self.next = None
Fibonacci with Memoization¶
Problem: Compute Fibonacci numbers efficiently.
def fibonacci_memo(n, memo={}):
"""
Time: O(n), Space: O(n)
"""
if n in memo:
return memo[n]
if n <= 1:
return n
memo[n] = fibonacci_memo(n-1, memo) + fibonacci_memo(n-2, memo)
return memo[n]
# Using functools.lru_cache decorator
from functools import lru_cache
@lru_cache(maxsize=None)
def fibonacci_lru(n):
if n <= 1:
return n
return fibonacci_lru(n-1) + fibonacci_lru(n-2)
Problem-Solving Patterns¶
When to Use Hash Tables¶
- Fast Lookups: Need O(1) average case access
- Counting: Frequency analysis problems
- Caching: Store computed results
- Set Operations: Intersection, union, difference
- Mapping: Key-value relationships
Common Techniques¶
- Use as Counter:
collections.Counteror manual counting - Use as Set: Fast membership testing
- Use as Cache: Store expensive computations
- Two-Pointer with Hash: Combine with other techniques
- Sliding Window with Hash: Track elements in window
Time/Space Trade-offs¶
- Time: Usually improves from O(n²) to O(n)
- Space: Additional O(n) space for hash table
- Worth it: When lookup speed is critical
Practice Problems¶
Easy¶
- Contains Duplicate
- Valid Anagram
- Two Sum
- Intersection of Two Arrays
Medium¶
- Group Anagrams
- Top K Frequent Elements
- Longest Substring Without Repeating Characters
- 4Sum II
Hard¶
- Substring with Concatenation of All Words
- LRU Cache
- First Missing Positive
- Longest Consecutive Sequence
Related Topics¶
- Python Dictionaries - Implementation details
- Python Sets - Set operations
- Time Complexity Guide - Analysis techniques