Advent of Code Prep: Core Skills
· by Michael Doornbos · 2706 words
This is part 2 of a three-part series on preparing for Advent of Code. See the main overview for the complete roadmap, Part 1: Foundation for the basics, or Part 3: Advanced Techniques for harder algorithms.
With the foundation skills from Part 1, you can parse any input and manipulate basic data structures. Now it’s time for the techniques that solve the actual puzzles. These three skills—grid navigation, graph traversal, and recursion—appear in probably 70% of AoC problems.
2D Grid Navigation
Grids are everywhere in Advent of Code. Mazes, maps, game boards, seating charts. If you’re not comfortable navigating a 2D grid, you’ll struggle with half the puzzles.
The mental model
Think of a grid as a list of lists. The outer list is rows (top to bottom), the inner lists are columns (left to right).
Row 0: [A, B, C, D]
Row 1: [E, F, G, H]
Row 2: [I, J, K, L]
To access cell F: row 1, column 1, so grid[1][1].
The convention I use: grid[r][c] where r is the row (vertical position) and c is the column (horizontal position). Some people prefer grid[y][x]. Either works, but pick one and stick with it.
Parsing a grid
data = """#.##
..#.
#...
####"""
grid = [list(line) for line in data.strip().split('\n')]
rows = len(grid)
cols = len(grid[0])
Using list(line) converts each string into a list of characters. This matters because strings are immutable—you can’t do grid[r][c] = 'X' if the row is still a string.
Iterating through all cells
for r in range(rows):
for c in range(cols):
cell = grid[r][c]
# do something with cell at position (r, c)
If you need position and value together, this pattern works well.
Finding specific cells
Often you need to locate a starting point or collect all cells of a certain type:
def find_start(grid):
"""Find the 'S' cell."""
for r in range(len(grid)):
for c in range(len(grid[0])):
if grid[r][c] == 'S':
return (r, c)
return None
def find_all(grid, target):
"""Find all cells matching target."""
positions = []
for r in range(len(grid)):
for c in range(len(grid[0])):
if grid[r][c] == target:
positions.append((r, c))
return positions
The neighbor pattern
This is the most important grid pattern. Memorize it.
# Four cardinal directions (up, down, left, right)
DIRECTIONS_4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
# Eight directions (including diagonals)
DIRECTIONS_8 = [
(-1, -1), (-1, 0), (-1, 1),
(0, -1), (0, 1),
(1, -1), (1, 0), (1, 1)
]
def get_neighbors(r, c, rows, cols, directions=DIRECTIONS_4):
"""Return valid neighboring positions."""
neighbors = []
for dr, dc in directions:
nr, nc = r + dr, c + dc
if 0 <= nr < rows and 0 <= nc < cols:
neighbors.append((nr, nc))
return neighbors
The bounds check 0 <= nr < rows and 0 <= nc < cols prevents accessing invalid positions. Without it, grid[-1][0] wraps to the last row instead of raising an error—a subtle bug.
Direction mapping
Many puzzles involve movement with named directions:
DIRECTION_MAP = {
'U': (-1, 0), 'D': (1, 0), 'L': (0, -1), 'R': (0, 1),
'N': (-1, 0), 'S': (1, 0), 'W': (0, -1), 'E': (0, 1),
'^': (-1, 0), 'v': (1, 0), '<': (0, -1), '>': (0, 1),
}
def move(pos, direction):
r, c = pos
dr, dc = DIRECTION_MAP[direction]
return (r + dr, c + dc)
Turning
For puzzles with rotation:
# Directions in clockwise order
DIRS = [(-1, 0), (0, 1), (1, 0), (0, -1)] # N, E, S, W
def turn_right(dir_idx):
return (dir_idx + 1) % 4
def turn_left(dir_idx):
return (dir_idx - 1) % 4
def turn_around(dir_idx):
return (dir_idx + 2) % 4
The modulo trick
Those % 4 operations are doing something important: they handle wraparound. When you’re at direction index 3 (West) and turn right, 3 + 1 = 4, but there is no direction 4. The modulo wraps it back to 0 (North).
This pattern appears constantly in AoC—anywhere you have circular structures:
# Position on a 100-element circular dial
position = (position + steps) % 100
# Circular buffer of size n
buffer[write_pos % n] = value
write_pos += 1
# Cycle through days of the week
day = (day + offset) % 7
The math: % gives the remainder after division. 105 % 100 = 5 because 105 divided by 100 is 1 remainder 5. No matter how large the number, the result stays in range.
Python handles negatives correctly too: -3 % 100 = 97, not -3. Three steps backward from 0 lands at 97. Other languages (C, Java) return -3, which is less useful for circular movement.
Practice: Flood fill
Count connected regions of the same character:
def flood_fill(grid, start_r, start_c, visited):
"""Fill all connected cells of the same type, return count."""
rows, cols = len(grid), len(grid[0])
target = grid[start_r][start_c]
count = 0
stack = [(start_r, start_c)]
while stack:
r, c = stack.pop()
if (r, c) in visited:
continue
if grid[r][c] != target:
continue
visited.add((r, c))
count += 1
for dr, dc in DIRECTIONS_4:
nr, nc = r + dr, c + dc
if 0 <= nr < rows and 0 <= nc < cols:
stack.append((nr, nc))
return count
Batch your updates
A common grid puzzle: update cells based on their neighbors. Cellular automata, spreading simulations, game-of-life variants. The trap: modifying the grid as you iterate.
# WRONG: Sequential updates corrupt the state
for r in range(rows):
for c in range(cols):
if count_lit_neighbors(r, c) >= 2:
grid[r][c] = '#' # This affects later neighbor counts!
When cell (0,1) becomes lit, cell (0,2) now sees the new value as a neighbor. But the rule was supposed to use the state at the start of the round.
The fix: collect changes first, apply second.
# CORRECT: Two-phase update
to_update = []
for r in range(rows):
for c in range(cols):
if count_lit_neighbors(r, c) >= 2:
to_update.append((r, c))
for r, c in to_update:
grid[r][c] = '#'
Every cell’s decision is based on the original state. An alternative is keeping two grids and swapping them each iteration.
BFS and DFS: Graph Traversal
Many AoC problems are graph problems in disguise. The grid is a graph where cells are nodes and neighbors are edges. Master these two traversal algorithms.
Breadth-First Search (BFS)
BFS explores nodes level by level—all nodes at distance 1, then distance 2, then distance 3. This makes it perfect for finding shortest paths in unweighted graphs.
from collections import deque
def bfs_shortest_path(grid, start, end):
"""Find shortest path length from start to end."""
rows, cols = len(grid), len(grid[0])
queue = deque([(start, 0)]) # (position, distance)
visited = {start}
while queue:
(r, c), dist = queue.popleft()
if (r, c) == end:
return dist
for dr, dc in DIRECTIONS_4:
nr, nc = r + dr, c + dc
if 0 <= nr < rows and 0 <= nc < cols:
if (nr, nc) not in visited and grid[nr][nc] != '#':
visited.add((nr, nc))
queue.append(((nr, nc), dist + 1))
return -1 # No path exists
Key points:
-
Use
deque, not a list.queue.popleft()is O(1) for deque, O(n) for list. -
Mark visited when adding to queue, not when processing. If you wait, you’ll add the same node multiple times from different neighbors.
-
The first time you reach the goal is the shortest path. BFS guarantees this for unweighted graphs.
Reconstructing the path
Sometimes you need the actual path, not just the length:
def bfs_with_path(grid, start, end):
"""Find shortest path and return it."""
rows, cols = len(grid), len(grid[0])
queue = deque([(start, [start])]) # (position, path so far)
visited = {start}
while queue:
(r, c), path = queue.popleft()
if (r, c) == end:
return path
for dr, dc in DIRECTIONS_4:
nr, nc = r + dr, c + dc
if 0 <= nr < rows and 0 <= nc < cols:
if (nr, nc) not in visited and grid[nr][nc] != '#':
visited.add((nr, nc))
queue.append(((nr, nc), path + [(nr, nc)]))
return None
This stores the full path with each node. For large graphs, it’s more memory-efficient to store parent pointers and reconstruct the path at the end.
Depth-First Search (DFS)
DFS explores as deep as possible before backtracking. It’s useful for:
- Checking if a path exists (not necessarily shortest)
- Exploring all possibilities
- Finding connected components
Recursive version:
def dfs_recursive(grid, pos, end, visited):
"""Check if end is reachable from pos."""
if pos == end:
return True
r, c = pos
rows, cols = len(grid), len(grid[0])
visited.add(pos)
for dr, dc in DIRECTIONS_4:
nr, nc = r + dr, c + dc
if 0 <= nr < rows and 0 <= nc < cols:
if (nr, nc) not in visited and grid[nr][nc] != '#':
if dfs_recursive(grid, (nr, nc), end, visited):
return True
return False
Iterative version (avoids stack overflow for large inputs):
def dfs_iterative(grid, start, end):
"""Check if end is reachable from start."""
rows, cols = len(grid), len(grid[0])
stack = [start]
visited = set()
while stack:
r, c = stack.pop() # pop() for DFS, popleft() for BFS
if (r, c) == end:
return True
if (r, c) in visited:
continue
visited.add((r, c))
for dr, dc in DIRECTIONS_4:
nr, nc = r + dr, c + dc
if 0 <= nr < rows and 0 <= nc < cols:
if (nr, nc) not in visited and grid[nr][nc] != '#':
stack.append((nr, nc))
return False
BFS vs DFS: When to use which
| Use BFS when… | Use DFS when… |
|---|---|
| You need shortest path | You just need any path |
| You need minimum steps | You’re counting all paths |
| Order matters (level by level) | Memory is tight (stack vs queue) |
| The graph might have cycles | You’re doing backtracking |
Multi-source BFS
Sometimes you start from multiple points simultaneously. Classic example: find the maximum distance from any starting point.
def multi_source_bfs(grid, starts):
"""BFS from multiple starting points at once."""
rows, cols = len(grid), len(grid[0])
queue = deque((pos, 0) for pos in starts)
visited = set(starts)
max_dist = 0
while queue:
(r, c), dist = queue.popleft()
max_dist = max(max_dist, dist)
for dr, dc in DIRECTIONS_4:
nr, nc = r + dr, c + dc
if 0 <= nr < rows and 0 <= nc < cols:
if (nr, nc) not in visited and grid[nr][nc] != '#':
visited.add((nr, nc))
queue.append(((nr, nc), dist + 1))
return max_dist
Recursion and Memoization
Recursion is thinking about problems in terms of smaller subproblems. Memoization is remembering solutions to avoid redundant work.
The recursive mindset
A recursive function has two parts:
- Base case: The simplest version of the problem, solved directly.
- Recursive case: Break the problem into smaller pieces, solve them recursively, combine the results.
Example: Fibonacci numbers.
def fib(n):
# Base cases
if n <= 1:
return n
# Recursive case
return fib(n - 1) + fib(n - 2)
This works but is terribly slow. fib(40) takes seconds because it recalculates the same values over and over.
Memoization: Remember what you’ve done
The fix is simple: cache results.
from functools import lru_cache
@lru_cache(maxsize=None)
def fib(n):
if n <= 1:
return n
return fib(n - 1) + fib(n - 2)
Now fib(40) is instant. The decorator remembers every result, so each subproblem is solved only once.
The hashability requirement
@lru_cache needs hashable arguments. These work:
- Integers, strings, floats
- Tuples (if contents are hashable)
- Frozensets
These don’t work:
- Lists (use
tuple(my_list)) - Sets (use
frozenset(my_set)) - Dicts (convert to tuple of items)
Bitmasks: A better way to track state
When you’re tracking “which items have I collected” and the set is small (under ~20 items), bitmasks are perfect. An integer where each bit represents one item.
# Track which of 5 keys we've collected
mask = 0 # binary: 00000 - nothing collected
# Collect key 2
mask |= (1 << 2) # binary: 00100
# Collect key 0
mask |= (1 << 0) # binary: 00101
# Check if we have key 2
if mask & (1 << 2):
print("Have key 2")
# Check if we have all 5 keys
all_keys = (1 << 5) - 1 # binary: 11111
if mask == all_keys:
print("Got everything!")
Why bitmasks instead of frozensets?
- Already hashable — Integers work directly with
@lru_cache. No conversion needed. - Faster — Bitwise operations are among the fastest CPU operations.
- Compact — One integer vs. a collection object.
The state example from earlier becomes cleaner:
@lru_cache(maxsize=None)
def count_paths(r, c, keys_mask):
if (r, c) == end and keys_mask == all_keys:
return 1
# ... rest of logic using keys_mask
Classic AoC pattern: Path counting
Count the number of ways to reach the end:
@lru_cache(maxsize=None)
def count_paths(r, c):
# Base case: reached the end
if (r, c) == end:
return 1
# Can't pass through walls
if grid[r][c] == '#':
return 0
# Sum paths through all valid neighbors
total = 0
for dr, dc in DIRECTIONS_4:
nr, nc = r + dr, c + dc
if 0 <= nr < rows and 0 <= nc < cols:
total += count_paths(nr, nc)
return total
Without memoization, this would explore the same positions repeatedly. With it, each position is calculated once.
State in the cache key
Sometimes the state is more than just position. Maybe you’re collecting keys, or you can only move in one direction.
@lru_cache(maxsize=None)
def count_paths_with_keys(r, c, keys):
"""keys is a frozenset of collected keys."""
if (r, c) == end and len(keys) == total_keys:
return 1
total = 0
for dr, dc in DIRECTIONS_4:
nr, nc = r + dr, c + dc
if not valid(nr, nc):
continue
cell = grid[nr][nc]
new_keys = keys
# Collect key
if cell in 'abcdef':
new_keys = keys | {cell}
# Door requires key
if cell in 'ABCDEF' and cell.lower() not in keys:
continue
total += count_paths_with_keys(nr, nc, new_keys)
return total
The cache key is (r, c, keys). Different key combinations at the same position are different states.
Converting loops to recursion
Any loop can be written recursively. This is useful when the “loop” has complex branching.
Loop version:
def sum_list(lst):
total = 0
for item in lst:
total += item
return total
Recursive version:
def sum_list(lst):
if not lst:
return 0
return lst[0] + sum_list(lst[1:])
For AoC, recursion shines when:
- The branching factor varies (some positions have 2 choices, others have 4)
- You need to track complex state through the traversal
- Backtracking is required (see Part 3)
Debugging recursive functions
Recursion bugs are tricky. Add print statements showing the call:
def fib(n, depth=0):
print(" " * depth + f"fib({n})")
if n <= 1:
return n
return fib(n - 1, depth + 1) + fib(n - 2, depth + 1)
This shows the tree of calls, making it clear where things go wrong.
Putting It Together: Maze Solving
Here’s a complete example combining all three skills:
from collections import deque
def solve_maze(maze_str):
"""Find shortest path through maze from S to E."""
# Parse grid
grid = [list(line) for line in maze_str.strip().split('\n')]
rows, cols = len(grid), len(grid[0])
# Find start and end
start = end = None
for r in range(rows):
for c in range(cols):
if grid[r][c] == 'S':
start = (r, c)
elif grid[r][c] == 'E':
end = (r, c)
# BFS for shortest path
queue = deque([(start, 0)])
visited = {start}
DIRS = [(-1, 0), (1, 0), (0, -1), (0, 1)]
while queue:
(r, c), dist = queue.popleft()
if (r, c) == end:
return dist
for dr, dc in DIRS:
nr, nc = r + dr, c + dc
if 0 <= nr < rows and 0 <= nc < cols:
if (nr, nc) not in visited and grid[nr][nc] != '#':
visited.add((nr, nc))
queue.append(((nr, nc), dist + 1))
return -1
maze = """
#########
#S......#
#.#####.#
#.#...#.#
#.#.#.#.#
#...#.#E#
#########
"""
print(solve_maze(maze)) # 12
Exercises
-
Grid rotation: Write a function to rotate a grid 90 degrees clockwise.
-
Island counting: Given a grid of
'.'(water) and'#'(land), count the number of separate islands (connected land regions). -
Shortest path with obstacles: Modify BFS to find the shortest path that passes through at most k wall cells.
-
Path counting with memoization: Count all unique paths from top-left to bottom-right of a grid, moving only right or down.
-
All paths: Use DFS to find and print all possible paths through a small maze.
Ready to practice? The Advent of Code archives have years of puzzles. Days 1-10 from any year are good for practicing these core skills.
Next in this series: Part 3: Advanced Techniques covers Dijkstra’s algorithm for weighted graphs, backtracking for constraint satisfaction, and Union-Find for connected components.