# Tag Archives: expository

## Notes from a talk on the Mean Value Theorem

1. Introduction

When I first learned the Mean Value Theorem and the Intermediate Value Theorem, I thought they were both intuitively obvious and utterly useless. In one of my courses in analysis, I was struck when, after proving the Mean Value Theorem, my instructor said that all of calculus was downhill from there. But it was a case of not being able to see the forest for the trees, and I missed the big picture.

I have since come to realize that almost every major (and often, minor) result of calculus is a direct and immediate consequence of the Mean Value Theorem and the Intermediate Value Theorem. In this talk, we will focus on the forest, the big picture, and see the Mean Value Theorem for what it really is: the true Fundamental Theorem of Calculus.

Posted in Expository, Mathematics | | 2 Comments

## A response to FtYoU’s question on Reddit

FtYou writes

Hello everyone ! There is a concept I have a hard time getting my head wrap around. If you have a Vector Space V and a subspace W, I understand that you can find the least square vector approximation from any vector in V to a vector in W. And this correspond to the projection of V to the subspace W. Now , for data fitting … Let’s suppose you have a bunch of points (xi, yi) where you want to fit a set a regressors so you can approximate yi by a linear combination of the regressors lets say ( 1, x, x2 … ). What Vector space are we talking about ? If we consider the Vector space of function R -> R, in what subspace are we trying to map these vectors ?

I have a hard time merging these two concepts of projecting to a vector space and fitting the data. In the latter case what vector are we using ? The functions ? If so I understand the choice of regressors ( which constitute a basis for the vector space ) But what’s the role of the (xi,yi) ?

I want to point out that I understand completely how to build the matrices to get Y = AX and solving using least square approx. What I miss is the big picture. The linear algebra picture. Thanks for any help !

We’ll go over this by closely examining and understanding an example. Suppose we have the data points ${(x_i, y_i)}$

$\displaystyle \begin{cases} (x_1, y_1) = (-1,8) \\ (x_2, y_2) = (0,8) \\ (x_3, y_3) = (1,4) \\ (x_4, y_4) = (2,16) \end{cases},$

and we have decided to try to find the best fitting quadratic function. What do we mean by best-fitting? We mean that we want the one that approximates these data points the best. What exactly does that mean? We’ll see that before the end of this note – but in linear algebra terms, we are projecting on to some sort of vector space – we claim that projection is the ”best-fit” possible.

## Notes on the first week (SummerNT)

We’ve covered a lot of ground this first week! I wanted to provide a written summary, with partial proof, of what we have done so far.

We began by learning about proofs. We talked about direct proofs, inductive proofs, proofs by contradiction, and proofs by using the contrapositive of the statement we want to prove. A proof is a justification and argument based upon certain logical premises (which we call axioms); in contrast to other disciplines, a mathematical proof is completely logical and can be correct or incorrect.

We then established a set of axioms for the integers that would serve as the foundation of our exploration into the (often fantastic yet sometimes frustrating) realm of number theory. In short, the integers are a non-empty set with addition and multiplication [which are both associative, commutative, and have an identity, and which behave as we think they should behave; further, there are additive inverses], a total order [an integer is either bigger than, less than, or equal to any other integer, and it behaves like we think it should under addition and multiplication], and satisfying the deceptively important well ordering principle [every nonempty set of positive integers has a least element].

With this logical framework in place, we really began number theory in earnest. We talked about divisibility [we say that $a$ divides $b$, written $a mid b$, if $b = ak$ for some integer $k$]. We showed that every number has a prime factorization. To do this, we used the well-ordering principle.

Suppose that not all integers have a prime factorization. Then there must be a smallest integer that does not have a prime factorization: call it $n$. Then we know that $n$ is either a prime or a composite. If it’s prime, then it has a prime factorization. If it’s composite, then it factors as $n = ab$ with $a,b < n$. But then we know that each of $a, b$ have prime factorizations since they are less than $n$. Multiplying them together, we see that $n$ also has a prime factorization after all. $diamondsuit$

Our first major result is the following:

There are infinitely many primes

There are many proofs, and we saw 2 of them in class. For posterity, I’ll present three here.

First proof that there are infinitely many primes

Take a finite collection of primes, say $p_1, p_2, ldots, p_k$. We will show that there is at least one more prime not mentioned in the collection. To see this, consider the number $p_1 p_2 ldots p_k + 1$. We know that this number will factor into primes, but upon division by every prime in our collection, it leaves a remainder of $1$. Thus it has at least one prime factor different than every factor in our collection. $diamondsuit$

This was a common proof used in class. A pattern also quickly emerges: $2 + 1 = 3$, a prime. $2cdot3 + 1 = 7$, a prime. $2 cdot 3 cdot 5 + 1 = 31$, also a prime. It is always the case that a product of primes plus one is another prime? No, in fact. If you look at $2 cdot 3 cdot 5 cdot 7 cdot 11 cdot 13 + 1=30031 = 59cdot 509$, you get a nonprime.

Second proof that there are infinitely many primes

In a similar vein to the first proof, we will show that there is always a prime larger than $n$ for any positive integer $n$. To see this, consider $n! + 1$. Upon dividing by any prime less than $n$, we get a remainder of $1$. So all of its prime factors are larger than $n$, and so there are infinitely many primes. $diamondsuit$

I would also like to present one more, which I’ve always liked.

Third proof that there are infinitely many primes

Suppose there are only finitely many primes $p_1, ldots, p_k$. Then consider the two numbers $n = p_1 cdot dots cdot p_k$ and $n -1$. We know that $n – 1$ has a prime factor, so that it must share a factor $P$ with $n$ since $n$ is the product of all the primes. But then $P$ divides $n – (n – 1) = 1$, which is nonsense; no prime divides $1$. Thus there are infinitely many primes. $diamondsuit$

We also looked at modular arithmetic, often called the arithmetic of a clock. When we say that $a equiv b mod m$, we mean to say that $m | (b – a)$, or equivalently that $a = b + km$ for some integer $m$ (can you show these are equivalent?). And we pronounce that statement as ” $a$ is congruent to $b$ mod $m$.” We played a lot with modular arithmetic: we added, subtracted, and multiplied many times, hopefully enough to build a bit of familiarity with the feel. In most ways, it feels like regular arithmetic. But in some ways, it’s different. Looking at the integers $mod m$ partitions the integers into a set of equivalence classes, i.e. into sets of integers that are congruent to $0 mod m, 1 mod m, ldots$. When we talk about adding or multiplying numbers mod $mod m$, we’re really talking about manipulating these equivalence classes. (This isn’t super important to us – just a hint at what’s going on beneath the surface).

We expect that if $a equiv b mod m$, then we would also have $ac equiv bc mod m$ for any integer $c$, and this is true (can you prove this?). But we would also expect that if we had $ac equiv bc mod m$, then we would necessarily have $a equiv b mod m$, i.e. that we can cancel out the same number on each side. And it turns out that’s not the case. For example, $4 cdot 2 equiv 4 cdot 5 mod 6$ (both are $2 mod 6$), but ‘cancelling the fours’ says that $2 equiv 5 mod 6$ – that’s simply not true. With this example in mind, we went about proving things about modular arithmetic. It’s important to know what one can and can’t do.

One very big and important observation that we noted is that it doesn’t matter what order we operate, as in it doesn’t matter if we multiply an expression out and then ‘mod it’ down, or ‘mod it down’ and then multiply, or if we intermix these operations. Knowing this allows us to simplify expressions like $11^4 mod 12$, since we know $11 equiv -1 mod 12$, and we know that $(-1)^2 equiv 1 mod 12$, and so $11^4 equiv (-1)^{2cdot 2} equiv 1 mod 12$. If we’d wanted to, we could have multiplied it out and then reduced – the choice is ours!

Amidst our exploration of modular arithmetic, we noticed some patterns. Some numbers  are invertible in the modular sense, while others are not. For example, $5 cdot 5 equiv 1 mod 6$, so in that sense, we might think of $frac{1}{5} equiv 5 mod 6$. More interestingly but in the same vein, $frac{1}{2} equiv 6 mod 11$ since $2 cdot 6 equiv 1 mod 11$. Stated more formally, a number $a$ has a modular inverse $a^{-1} mod m$ if there is a solution to the modular equation $ax equiv 1 mod m$, in which case that solution is the modular inverse. When does this happen? Are these units special?

Returning to division, we think of the greatest common divisor. I showed you the Euclidean algorithm, and you managed to prove it in class. The Euclidean algorithm produces the greatest common divisor of $a$ and $b$, and it looks like this (where I assume that $b > 1$:

$b = q_1 a + r_1$

$a = q_2 r_1 + r_2$

$r_1 = q_3 r_2 + r_3$

$cdots$

$r_k = q_{k+2}r_{k+1} + r_{k+2}$

$r_{k+1} = q_{k+3}r_{k+2} + 0$

where in each step, we just did regular old division to guarantee a remainder $r_i$ that was less than the divisor. As the divisors become the remainders, this yields a strictly decreasing remainder at each iteration, so it will terminate (in fact, it’s very fast). Further, using the notation from above, I claimed that the gcd of $a$ and $b$ was the last nonzero remainder, in this case $r_{k+2}$. How did we prove it?

Proof of Euclidean Algorithm

Suppose that $d$ is a common divisor (such as the greatest common divisor) of $a$ and $b$. Then $d$ divides the left hand side of $b – q_1 a = r_1$, and thus must also divide the right hand side. So any divisor of $a$ and $b$ is also a divisor of $r_1$. This carries down the list, so that the gcd of $a$ and $b$ will divide each remainder term. How do we know that the last remainder is exactly the gcd, and no more? The way we proved it in class relied on the observation that $r_{k+2} mid r_{k+1}$. But then $r_{k+2}$ divides the right hand side of $r_k = q_{k+2} r_{k+1} + r_{k+2}$, and so it also divides the left. This also carries up the chain, so that $r_{k+2}$ divides both $a$ and $b$. So it is itself a divisor, and thus cannot be larger than the greatest common divisor. $diamondsuit$

As an aside, I really liked the way it was proved in class. Great job!

The Euclidean algorithm can be turned backwards with back-substitution (some call this the extended Euclidean algorithm,) to give a solution in $x,y$ to the equation $ax + by = gcd(a,b)$. This has played a super important role in our class ever since. By the way, though I never said it in class, we proved Bezout’s Identity along the way (which we just called part of the Extended Euclidean Algorithm). This essentially says that the gcd of $a$ and $b$ is the smallest number expressible in the form $ax + by$. The Euclidean algorithm has shown us that the gcd is expressible in this form. How do we know it’s the smallest? Observe again that if $c$ is a common divisor of $a$ and $b$, then $c$ divides the left hand side of $ax + by = d$, and so $c mid d$. So $d$ cannot be smaller than the gcd.

This led us to explore and solve linear Diophantine equations of the form $ax + by = c$ for general $a,b,c$. There will be solutions whenever the $gcd(a,b) mid c$, and in such cases there are infinitely many solutions (Do you remember how to see infinitely many other solutions?).

Linear Diophantine equations are very closely related a linear problems in modular arithmetic of the form $ax equiv c mod m$. In particular, this last modular equation is equivalent to $ax + my = c$ for some $y$.(Can you show that these are the same?). Using what we’ve learned about linear Diophantine equations, we know that $ax equiv c mod m$ has a solution iff $gcd(a,m) mid c$. But now, there are not infinitely many incongruent (i.e. not the same $mod m$) solutions. This is called the ‘Linear Congruence Theorem,’ and is interestingly the first major result we’ve learned with no proof on wikipedia.

Theorem: the modular equation $ax equiv b mod m$ has a solution iff $gcd(a,m) mid b$, in which case there are exactly $gcd(a,m)$ incongruent solutions.

Proof

We can translate a solution of $ax equiv b mod m$ into a solution of $ax + my = b$, and vice-versa. So we know from the Extended Euclidean algorithm that there are only solutions if $gcd(a,m) mid b$. Now, let’s show that there are $gcd(a,m)$ solutions. I will do this a bit differently than how we did it in class.

First, let’s do the case when $gcd(a,m)=1$, and suppose we have a solution $(x,y)$ so that $ax + my = b$. If there is another solution, then there is some perturbation we can do by shifting $x$ by a number $x’$ and $y$ by a number $y’$ that yields another solution looking like $a(x + x’) + m(y + y’) = b$. As we already know that $ax + my = b$, we can remove that from the equation. Then we get simply $ax’ = -my’$. Since $gcd(m,a) = 1$, we know (see below the proof) that $m$ divides $x’$. But then the new solution $x + x’ equiv x mod m$, so all solutions fall in the same congruence class – the same as $x$.

Now suppose that $gcd(a,m) = d$ and that there is a solution. Since there is a solution, each of $a,m,$ and $b$ are divisible by $d$, and we can write them as $a = da’, b = db’, m = dm’$. Then the modular equation $ax equiv b mod m$ is the same as $da’ x equiv d b’ mod d m’$, which is the same as $d m’ mid (d b’ – d a’x)$. Note that in this last case, we can remove the $d$ from both sides, so that $m’ mid b’ – a’x$, or that $a’x equiv b mod m’$. From the first case, we know this has exactly one solution mod $m’$, but we are interested in solutions mod $m$. Just as knowing that $x equiv 2 mod 4$ means that $x$ might be $2, 6, 10 mod 12$ since $4$ goes into $12$ three times, $m’$ goes into $m$ $d$ times, and this gives us our $d$ incongruent solutions. $diamondsuit.$

I mentioned that we used the fact that we’ve proven 3 times in class now in different forms: if $gcd(a,b) = 1$ and $a mid bc$, then we can conclude that $a mid c$. Can you prove this? Can you prove this without using unique factorization? We actually used this fact to prove unique factorization (really we use the statement about primes: if $p$ is a prime and $p mid ab$, then we must have that $p mid a$ or $p mid b$, or perhaps both). Do you remember how we proved that? We used the well-ordered principle to say that if there were a positive integer that couldn’t be uniquely factored, then there is a smaller one. But choosing two of its factorizations, and finding a prime on one side – we concluded that this prime divided the other side. Dividing both sides by this prime yielded a smaller (and therefore unique by assumption) factorization. This was the gist of the argument.

The last major bit of the week was the Chinese Remainder Theorem, which is awesome enough (and which I have enough to say about) that it will get its own post – which I’m working on now.

I’ll see you all in class tomorrow.

Posted in Brown University, Expository, Math.NT, Mathematics | | 1 Comment

## A proof from the first sheet (SummerNT)

In class today, we were asked to explain what was wrong with the following proof:

Claim: As $x$ increases, the function

$displaystyle f(x)=frac{100x^2+x^2sin(1/x)+50000}{100x^2}$

approaches (gets arbitrarily close to) 1.

Proof: Look at values of $f(x)$ as $x$ gets larger and larger.

$f(5) approx 21.002$
$f(10)approx 6.0010$
$f(25)approx 1.8004$
$f(50)approx 1.2002$
$f(100) approx 1.0501$
$f(500) approx 1.0020$

These values are clearly getting closer to 1. QED

Of course, this is incorrect. Choosing a couple of numbers and thinking there might be a pattern does not constitute a proof.

But on a related note, these sorts of questions (where you observe a pattern and seek to prove it) can sometimes lead to strongly suspected conjectures, which may or may not be true. Here’s an interesting one (with a good picture over at SpikedMath):

Draw $2$ points on the circumference of a circle, and connect them with a line. How many regions is the circle divided into? (two). Draw another point, and connect it to the previous points with a line. How many regions are there now? Draw another point, connecting to the previous points with lines. How many regions now? Do this once more. Do you see the pattern? You might even begin to formulate a belief as to why it’s true.

But then draw one more point and its lines, and carefully count the number of regions formed in the circle. How many circles now? (It doesn’t fit the obvious pattern).

So we know that the presented proof is incorrect. But lets say we want to know if the statement is true. How can we prove it? Further, we want to prove it without calculus – we are interested in an elementary proof. How should we proceed?

Firstly, we should say something about radians. Recall that at an angle $theta$ (in radians) on the unit circle, the arc-length subtended by the angle $theta$ is exactly $theta$ (in fact, this is the defining attribute of radians). And the value $sin theta$ is exactly the height, or rather the $y$ value, of the part of the unit circle at angle $theta$. It’s annoying to phrase, so we look for clarification at the hastily drawn math below:

Note in particular that the arc length is longer than the value of $sin theta$, so that $sin theta < theta$. (This relies critically on the fact that the angle is positive). Further, we see that this is always true for small, positive $theta$. So it will be true that for large, positive $x$, we’ll have $sin frac{1}{x} < frac{1}{x}$. For those of you who know a bit more calculus, you might know that in fact, $sin(frac{1}{x}) = frac{1}{x} – frac{1}{x^33!} + O(frac{1}{t^5})$, which is a more precise statement.

What do we do with this? Well, I say that this allow us to finish the proof.

$dfrac{100x^2 + x^2 sin(1/x) + 50000}{100x^2} leq dfrac{100x^2 + x + 50000}{100x^2} = 1 + dfrac{1}{100x} + dfrac{50000}{100x^2}$

, and it is clear that the last two terms go to zero as $x$ increases. $spadesuit$

Finally, I’d like to remind you about the class webpage at the left – I’ll see you tomorrow in class.