## Advent of Code: Day 2

This is the second notebook in my posts on the Advent of Code challenges. This notebook in its original format can be found on my github.

# Day 2: Corruption Checksum, part I¶

You are given a table of integers. Find the difference between the maximum and minimum of each row, and add these differences together.

There is not a lot to say about this challenge. The plan is to read the file linewise, compute the difference on each line, and sum them up.

```
with open("input.txt", "r") as f:
lines = f.readlines()
lines[0]
```

```
l = lines[0]
l = l.split()
l
```

```
def max_minus_min(line):
'''Compute the difference between the largest and smallest integer in a line'''
line = list(map(int, line.split()))
return max(line) - min(line)
def sum_differences(lines):
'''Sum the value of `max_minus_min` for each line in `lines`'''
return sum(max_minus_min(line) for line in lines)
```

```
testcase = ['5 1 9 5','7 5 3', '2 4 6 8']
assert sum_differences(testcase) == 18
```

```
sum_differences(lines)
```

## Mathematical Interlude¶

In line with the first day’s challenge, I’m inclined to ask what we should “expect.” But what we should expect is not well-defined in this case. Let us rephrase the problem in a randomized sense.

Suppose we are given a table, $n$ lines long, where each line consists of $m$ elements, that are each uniformly randomly chosen integers from $1$ to $10$. We might ask what is the expected value of this operation, of summing the differences between the maxima and minima of each row, on this table. What should we expect?

As each line is independent of the others, we are really asking what is the expected value across a single row. So given $m$ integers uniformly randomly chosen from $1$ to $10$, what is the expected value of the maximum, and what is the expected value of the minimum?

### Expected Minimum¶

Let’s begin with the minimum. The minimum is $1$ unless all the integers are greater than $2$. This has probability

$$ 1 – \left( \frac{9}{10} \right)^m = \frac{10^m – 9^m}{10^m}$$

of occurring. We rewrite it as the version on the right for reasons that will soon be clear.

The minimum is $2$ if all the integers are at least $2$ (which can occur in $9$ different ways for each integer), but not all the integers are at least $3$ (each integer has $8$ different ways of being at least $3$). Thus this has probability

$$ \frac{9^m – 8^m}{10^m}.$$

Continuing to do one more for posterity, the minimum is $3$ if all the integers are at least $3$ (each integer has $8$ different ways of being at least $3$), but not all integers are at least $4$ (each integer has $7$ different ways of being at least $4$). Thus this has probability

$$ \frac{8^m – 7^m}{10^m}.$$

And so on.

Recall that the expected value of a random variable is

$$ E[X] = \sum x_i P(X = x_i),$$

so the expected value of the minimum is

$$ \frac{1}{10^m} \big( 1(10^m – 9^m) + 2(9^m – 8^m) + 3(8^m – 7^m) + \cdots + 9(2^m – 1^m) + 10(1^m – 0^m)\big).$$

This simplifies nicely to

$$ \sum_ {k = 1}^{10} \frac{k^m}{10^m}. $$

### Expected Maximum¶

The same style of thinking shows that the expected value of the maximum is

$$ \frac{1}{10^m} \big( 10(10^m – 9^m) + 9(9^m – 8^m) + 8(8^m – 7^m) + \cdots + 2(2^m – 1^m) + 1(1^m – 0^m)\big).$$

This simplifies to

$$ \frac{1}{10^m} \big( 10 \cdot 10^m – 9^m – 8^m – \cdots – 2^m – 1^m \big) = 10 – \sum_ {k = 1}^{9} \frac{k^m}{10^m}.$$

### Expected Difference¶

Subtracting, we find that the expected difference is

$$ 9 – 2\sum_ {k=1}^{9} \frac{k^m}{10^m}. $$

From this we can compute this for each list-length $m$. It is good to note that as $m \to \infty$, the expected value is $9$. Does this make sense? Yes, as when there are lots of values we should expect one to be a $10$ and one to be a $1$. It’s also pretty straightforward to see how to extend this to values of integers from $1$ to $N$.

Looking at the data, it does not appear that the integers were randomly chosen. Instead, there are very many relatively small integers and some relatively large integers. So we shouldn’t expect this toy analysis to accurately model this problem — the distribution is definitely not uniform random.

But we can try it out anyway.