Reflections on Advent of Code 2020

For a long time it puzzled me how something so expensive, so leading edge, could be so useless, and then it occurred to me that a computer is a stupid machine with the ability to do incredibly smart things, while computer programmers are smart people with the ability to do incredibly stupid things. They are, in short, a perfect match.

- Bill Bryson, I'm a Stranger Here Myself

I'm a few weeks late but I've finally finished all 25 days of Advent of Code 2020. See my solutions here where I've tried my best to provide explanations and comment my code well.

AoC has taken up all of my spare coding time for the past 6 weeks, I'd estimate about 50 hours in total (and more coffee than I care to admit). My favorite thing about Advent of Code is the little things that I learn and pick up on the way so, in honor of that, here's a random collection of my thoughts and favorite things that I learned.

collections.defaultdict (day 7)

Python's collections.defaultdict provides a convenient way to handle missing keys when creating a dictionary. Say we want to take a list and build a dictionary that maps each value in the list to a list of indices where that value exists.

With a regular dictionary you might do it as follows, by first checking whether the value is already in the dictionary and, if not, initializing it to an empty list.

In [1]:
nums = [1, 2, 4, 2, 5, 4]

num_to_indices = {}

for index, num in enumerate(nums):
    if num not in num_to_indices:
        num_to_indices[num] = []

    num_to_indices[num].append(index)

num_to_indices
Out[1]:
{1: [0], 2: [1, 3], 4: [2, 5], 5: [4]}

Using a defaultdict does this exact thing for you. When a key is missing, it'll call the function that we've supplied and create an entry for the missing key. In our case, we'll supply the list function which returns an empty list when a missing key is encountered.

In [2]:
from collections import defaultdict

num_to_indices = defaultdict(list)

for index, num in enumerate(nums):
    num_to_indices[num].append(index)

num_to_indices
Out[2]:
defaultdict(list, {1: [0], 2: [1, 3], 4: [2, 5], 5: [4]})

Counting a boolean condition

Python has a fun trick that's a bit confusing the first time you see it. When a boolean is used in an arithmetic expression, True is implicitly converted to 1 and False to 0.

In [3]:
print(f"1 + True  = {1 + True}")
print(f"1 + False = {1 + False}")
1 + True  = 2
1 + False = 1

To count the number of elements that satisfy some condition, you can keep a counter and increment it with the result of a boolean expression. Below is a (slightly confusing) example to demonstrate. The expression num % 2 == 0 is first evalauated to True or False and the result is added to the num_evens counter (i.e. 0 is added if the result is False and 1 is added if the result is True).

In [4]:
num_evens = 0
for num in range(1, 11):
    num_evens += num % 2 == 0

num_evens
Out[4]:
5

Finally, an even more succinct version of this is a generator expression that returns True or False for each element and the built-in sum totals the number of True results.

In [5]:
sum(num % 2 == 0 for num in range(1, 11))
Out[5]:
5

functools.partial (day 5)

functools.partial takes a function along with arguments and/or keyword arguments and returns a new function with those arguments fixed. For example, say we want a set of functions that takes a list of integers and counts how many integers fit some condition (i.e. how many are less than a given number, greater than a given number, or equal to a given number). We might write those functions like so:

In [6]:
from typing import Callable, List

def num_less_than(nums: List[int], val: int) -> int:
    return sum(num < val for num in nums)

def num_greater_than(nums: List[int], val: int) -> int:
    return sum(num > val for num in nums)

def num_equal_to(nums: List[int], val: int) -> int:
    return sum(num == val for num in nums)

print(f"# less than 5 on [0, 10)    = {num_less_than(range(10), 5)}")
print(f"# greater than 5 on [0, 10) = {num_greater_than(range(10), 5)}")
print(f"# equal to 5 on [0, 10)     = {num_equal_to(range(10), 5)}")
# less than 5 on [0, 10)    = 5
# greater than 5 on [0, 10) = 4
# equal to 5 on [0, 10)     = 1

This isn't bad, but all of those functions are identical except for the operator between num and val. And the amount of repeated code continues to grow with the more conditions we are interested in (say we become interested in counting integers less than or equal to, divisible by, or coprime with a given integer).

Indeed, we can write a single generic function which takes a comparison between num and val and use that along with functools.partial to generate all of the above functions more succinctly.

In [7]:
from functools import partial
from operator import eq, gt, lt

def num_matching_condition(nums: List[int], val: int, condition: Callable):
    return sum(condition(num, val) for num in nums)

num_less_than = partial(num_matching_condition, condition=lt)
num_greater_than = partial(num_matching_condition, condition=gt)
num_equal_to = partial(num_matching_condition, condition=eq)

print(f"# less than 5 on [0, 10)    = {num_less_than(range(10), 5)}")
print(f"# greater than 5 on [0, 10) = {num_greater_than(range(10), 5)}")
print(f"# equal to 5 on [0, 10)     = {num_equal_to(range(10), 5)}")
# less than 5 on [0, 10)    = 5
# greater than 5 on [0, 10) = 4
# equal to 5 on [0, 10)     = 1

The Walrus operator (day 14 and day 16)

This was my first time using the controversial Walrus operator to create assignment expressions in places where assignment wasn't previously allowed. I found this useful when checking for a regular expression match and then using the resulting Match object.

Say we want to take a list of names, find the names where both the first and last name start with S, and print the last name.

In [8]:
import re
pattern = re.compile("S\w+\s(S\w+)")

names = [
    "SpongeBob SquarePants",
    "Patrick Star",
    "Squidward Tentacles"
]

Without the walrus operator, we first compute re.fullmatch, assign the result to match, check if a match occurred and, if so, print the first capturing group (the last name).

In [9]:
for name in names:
    match = re.fullmatch(pattern, name)
    if match:
        print(match.group(1))
SquarePants

The assignment expression condenses assigning the result of re.fullmatch and checking its truthiness into a single line. match is first assigned and then its value is returned and evaluated in the if statement.

In [10]:
for name in names:
    if match := re.fullmatch(pattern, name):
        print(match.group(1))
SquarePants

This example is hardly earth-shattering but this Real Python article provides a range of examples where assignment expressions can save repeated computations in list comprehensions.

itertools.product (day 17)

itertools.product has something of a niche use case but it has the power to greatly simplify some unseemly nested for-loops. From the documentation:

product(A, B)

Is equivalent to the generator expression:

((x, y) for x in A for y in B)

For example, say we want all Cartesian coordinates on [0, 2], i.e. (0, 0), (0, 1), (0, 2), (1, 0)..., then we could generate them like so:

In [11]:
[(x, y) for x in range(3) for y in range(3)]
Out[11]:
[(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2)]

Equivalently, we can write this using product as follows:

In [12]:
from itertools import product
list(product(range(3), range(3)))
Out[12]:
[(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2)]

This is only a modest improvement in 2-dimensions but becomes incredibly useful in 3, 4, and higher dimensions where writing triple and quadruple nested for-loops becomes fraught. Using itertools.product allowed me to write a generic Conway's Game of Life simulator for any number of dimensions to solve day 17 which had you run simulations in 3 and 4-dimensions.

Dynamic programming (day 10)

Dubbed one of the "sledgehammers of the algorithms craft" by my favorite algorithms text, dynamic programming is a technique for combining the results of subproblems to compute the result of a larger problem. With dynamic programming, you start by solving the smallest subproblems first and using their results to build up to the solution for the goal problem.

Day 10 part 2 asks us a question equivalent to this:

You are given a list of unique sorted integers where each adjacent integer differs by at most 3. Starting at the first (and smallest) element in the list, you can move to an integer that is 1, 2, or 3 larger than the current integer. How many different paths can you take from the first element to the last?

The trick to solving this is that to get to any given integer, you must have come from an integer that is 1, 2, or 3 less than the current one. That means that the number of ways to get to the current integer is the sum of the number of ways to get to the integers that are 1, 2, or 3 less than the current one. Thus, if we know the solutions to n-1, n-2, and n-3, then we know the solution to n.

We start with the smallest subproblem, there is only 1 way to get to the very first integer (because it's our starting point). For all subsequent integers, we look back at all integers that differ by less than or equal to 3 and sum the number of ways to get to all of those.

In [13]:
elements = [1, 2, 4, 5, 6, 8, 9, 10, 12, 13, 15]

num_paths = [0] * len(elements)
num_paths[0] = 1  # initialize smallest subproblem

for cur_index, cur_element in enumerate(elements):
    # look backwards at the subproblems that have already
    # been solved
    for prev_index in range(cur_index-1, -1, -1):
        prev_element = elements[prev_index]

        # because the elements are in sorted order, when
        # we've reached an element that differs by more
        # than 3, we can quit looking back
        if cur_element - prev_element > 3:
            break

        # sum the number of ways to get to a previous element
        num_paths[cur_index] += num_paths[prev_index]

num_paths
Out[13]:
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]

Dynamic programming is widely applicable and famously used to compute edit distance, solve the knapsack problem, and compute Fibonacci numbers, but it's also a notoriously difficult concept to understand. I read several texts and solved dozens of exercises before I even felt like I had a small grasp on it. I highly recommend the textbook linked above as well as Jeff Erickson's chapter on dynamic programming for more reading.

Named capturing groups (day 14)

Python adds a small readability enhancement to the notoriously-difficult-to-comprehend regular expression patterns. Python allows you to name capturing groups and then retrieve them by name from a Match object.

Traditionally, capturing groups are referred to by number, starting at 1 and incrementing in the order that they appear. Take the example above where we search through names for first and last names starting with S and print the first and last name.

In [14]:
names = [
    "SpongeBob SquarePants",
    "Patrick Star",
    "Squidward Tentacles"
]

pattern = re.compile("(S\w+)\s(S\w+)")

for name in names:
    if match := re.fullmatch(pattern, name):
        print(f"First name: {match.group(1)}")
        print(f"Last name:  {match.group(2)}")
First name: SpongeBob
Last name:  SquarePants

Named capturing groups take the form (?P<NAME>) and allow us to refer to the capturing group by a given name, rather than simply by number.

In [15]:
pattern = re.compile("(?P<FIRST>S\w+)\s(?P<LAST>S\w+)")

for name in names:
    if match := re.fullmatch(pattern, name):
        print(f"First name: {match.group('FIRST')}")
        print(f"Last name:  {match.group('LAST')}")
First name: SpongeBob
Last name:  SquarePants

Parting thoughts

A huge thanks to the Advent of Code team for an awesome year, I'm in awe at the amount work that must go into coming up with such clever problems, generating unique inputs, and maintaining a great community. Also a big thank you to the r/adventofcode community as well, it was fun to see a range of solutions and neat visualizations for each day's problem. It's super cool to see a friendly and encouraging community spring up around a set of problem solving challenges and I look forward to participating in more of these in the future.