Tag Archives: david lowry-duda

A response to FtYoU’s question on Reddit

FtYou writes

Hello everyone! There is a concept I have a hard time getting my head wrapped around. If you have a vector space V and a subspace W, I understand that you can find the least squares approximation from any vector in V to a vector in W, and this corresponds to the projection of V onto the subspace W. Now, for data fitting… Let’s suppose you have a bunch of points (xi, yi) where you want to fit a set of regressors so you can approximate yi by a linear combination of the regressors, say (1, x, x2, …). What vector space are we talking about? If we consider the vector space of functions R -> R, into what subspace are we trying to map these vectors?

I have a hard time merging these two concepts of projecting onto a vector space and fitting the data. In the latter case, what vectors are we using? The functions? If so, I understand the choice of regressors (which constitute a basis for the vector space), but what’s the role of the (xi, yi)?

I want to point out that I understand completely how to build the matrices to get Y = AX and solve using the least squares approximation. What I’m missing is the big picture – the linear algebra picture. Thanks for any help!

We’ll go over this by closely examining and understanding an example. Suppose we have the data points $latex {(x_i, y_i)}$

$latex \displaystyle \begin{cases} (x_1, y_1) = (-1,8) \\ (x_2, y_2) = (0,8) \\ (x_3, y_3) = (1,4) \\ (x_4, y_4) = (2,16) \end{cases}, $

and we have decided to try to find the best-fitting quadratic function. What do we mean by best-fitting? We mean that we want the one that approximates these data points the best. What exactly does that mean? We’ll see that before the end of this note – but in linear algebra terms, we are projecting onto some sort of vector space, and we claim that this projection is the “best fit” possible.
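As a concrete sketch of the computation (my own addition, using numpy; the conceptual discussion continues below): the vector of $latex {y_i}$ values lives in $latex {\mathbb{R}^4}$, one coordinate per data point, and the regressors $latex {1, x, x^2}$ evaluated at the $latex {x_i}$ give the columns of a matrix whose column space is the subspace we project onto.

```python
import numpy as np

# The data points from the example above.
x = np.array([-1.0, 0.0, 1.0, 2.0])
y = np.array([8.0, 8.0, 4.0, 16.0])

# Each column is one regressor (1, x, x^2) evaluated at the x_i.
# The column space of A is a 3-dimensional subspace of R^4.
A = np.column_stack([np.ones_like(x), x, x**2])

# Least squares: the coefficients c for which A @ c is the orthogonal
# projection of y onto the column space of A.
c, residual, rank, _ = np.linalg.lstsq(A, y, rcond=None)
print(c)  # roughly [ 5., -1.,  3.], i.e. the best-fit quadratic 5 - x + 3x^2
```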

(more…)

Posted in Expository, Mathematics

Friendly Introduction to Sieves with a Look Towards Progress on the Twin Primes Conjecture

This is an extension and background to a talk I gave on 9 October 2013 to the Brown Graduate Student Seminar, called ‘A friendly intro to sieves with a look towards recent progress on the twin primes conjecture.’ During the talk, I mention several sieves, some in a lot of detail and some in very little detail. I also discuss several results and build upon many sources. I’ll provide the missing details and/or sources for additional reading here.

Furthermore, I like this talk, so I think it’s worth preserving.

1. Introduction

We talk about sieves and primes. Long, long ago, Euclid famously proved the infinitude of primes ($latex {\approx 300}$ B.C.). Although he didn’t show it, the stronger statement that the sum of the reciprocals of the primes diverges is true:

$latex \displaystyle \sum_{p} \frac{1}{p} \rightarrow \infty, $

where the sum is over primes.

Proof: Suppose that the sum converged. Then there is some $latex {k}$ such that

$latex \displaystyle \sum_{i = k+1}^\infty \frac{1}{p_i} < \frac{1}{2}. $

Let $latex {Q := \prod_{i = 1}^k p_i}$ be the product of the primes up to $latex {p_k}$. Then each integer $latex {1 + Qn}$ is relatively prime to every prime dividing $latex {Q}$, and so its prime factorization involves only the primes $latex {p_{k+1}, p_{k+2}, \ldots}$. This means that

$latex \displaystyle \sum_{n = 1}^\infty \frac{1}{1+Qn} \leq \sum_{t \geq 0} \left( \sum_{i > k} \frac{1}{p_i} \right) ^t < 2, $

where the first inequality is true since all the terms on the left appear in the middle (think prime factorizations and the distributive law), and the second inequality is true because the middle is bounded by the geometric series with ratio $latex {1/2}$. But by limit comparison with the harmonic series, the sum on the left diverges (aha! something for my Math 100 students), and so we arrive at a contradiction.

Thus the sum of the reciprocals of the primes diverges. $latex \diamondsuit$
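As a quick numerical aside (my addition, not part of the talk): you can watch this divergence happen, very slowly, with a few lines of Python. By Mertens' theorem the partial sums grow like $latex {\log \log N}$, which is why the numbers below creep upward so reluctantly.

```python
from sympy import primerange

# Partial sums of the reciprocals of the primes up to N.
# They diverge, but only at the glacial rate of log(log(N)).
for N in [10**3, 10**4, 10**5, 10**6]:
    total = sum(1.0 / p for p in primerange(2, N))
    print(f"N = {N:>8}: sum of 1/p = {total:.4f}")
```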

(more…)

Posted in Expository, Math.NT, Mathematics

Math 100 Fall 2013: Concluding Remarks

This is a post written for my students in Calc II, Math 100 at Brown University, fall 2013. There will be many asides, written in italics. They are to serve as clarifications of method or true asides, to be digested or passed over.

The semester is over. All the grades are in and known, and fall 2013 draws to a close. As you know, I’m interested in the statistics behind the course. I’d mentioned my previous analysis of the extremely high correlation between the first midterm and the final grade (much higher than I would have thought!). Let’s reveal the statistics and distribution of this course, below the fold.

(more…)

Posted in Brown University, Math 100, Mathematics, Teaching

Response to bnelo12’s question on Reddit (or the Internet storm on $1 + 2 + \ldots = -1/12$)

bnelo12 writes (slightly paraphrased)

Can you explain exactly how $latex {1 + 2 + 3 + 4 + \ldots = – \frac{1}{12}}$ in the context of the Riemann $latex {\zeta}$ function?

We are going to approach this problem through a related problem that is easier to understand at first. Many are familiar with summing geometric series

$latex \displaystyle g(r) = 1 + r + r^2 + r^3 + \ldots = \frac{1}{1-r}, $

which makes sense as long as $latex {|r| < 1}$. But if you’re not familiar with it, let’s see how we derive it. Let $latex {S(n)}$ denote the sum of the terms up to $latex {r^n}$, so that

$latex \displaystyle S(n) = 1 + r + r^2 + \ldots + r^n. $

Then for a finite $latex {n}$, $latex {S(n)}$ makes complete sense. It’s just a sum of a few numbers. What if we multiply $latex {S(n)}$ by $latex {r}$? Then we get

$latex \displaystyle rS(n) = r + r^2 + \ldots + r^n + r^{n+1}. $

Notice how similar this is to $latex {S(n)}$. It’s very similar, but missing the first term and containing an extra last term. If we subtract them, we get

$latex \displaystyle S(n) - rS(n) = 1 - r^{n+1}, $

which is a very simple expression. But we can factor out the $latex {S(n)}$ on the left and solve for it. In total, we get

$latex \displaystyle S(n) = \frac{1 - r^{n+1}}{1 - r}. \ \ \ \ \ (1)$

This works for any natural number $latex {n}$. What if we let $latex {n}$ get arbitrarily large? If $latex {|r|<1}$, then $latex {|r|^{n+1} \rightarrow 0}$, and so we get that the sum of the geometric series is

$latex \displaystyle g(r) = 1 + r + r^2 + r^3 + \ldots = \frac{1}{1-r}. $
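If you want to see this convergence happen rather than take my word for it, here is a tiny numerical check (my addition, not part of the original argument) of the partial sum formula (1) and its limit for a value of $latex {r}$ with $latex {|r| < 1}$.

```python
# Compare the partial sums S(n), the closed form (1 - r**(n+1)) / (1 - r),
# and the limit 1 / (1 - r) for a ratio with |r| < 1.
r = 0.7
for n in [1, 5, 10, 50, 100]:
    partial = sum(r**k for k in range(n + 1))          # 1 + r + ... + r^n
    closed_form = (1 - r**(n + 1)) / (1 - r)
    print(n, partial, closed_form, 1 / (1 - r))
```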

But this looks like it makes sense for almost any $latex {r}$, in that we can plug in any value for $latex {r}$ that we want on the right and get a number, unless $latex {r = 1}$. In this sense, we might say that $latex {\frac{1}{1-r}}$ extends the geometric series $latex {g(r)}$, in that whenever $latex {|r|<1}$, the geometric series $latex {g(r)}$ agrees with this function. But this function makes sense on a larger domain than $latex {g(r)}$.

People find it convenient to abuse notation slightly and call the new function $latex {\frac{1}{1-r} = g(r)}$, (i.e. use the same notation for the extension) because any time you might want to plug in $latex {r}$ when $latex {|r|<1}$, you still get the same value. But really, it’s not true that $latex {\frac{1}{1-r} = g(r)}$, since the domain on the left is bigger than the domain on the right. This can be confusing. It’s things like this that cause people to say that

$latex \displaystyle 1 + 2 + 4 + 8 + 16 + \ldots = \frac{1}{1-2} = -1, $

simply because $latex {g(2) = -1}$. This conflates two different ideas. What this really means is that the function that extends the geometric series takes the value $latex {-1}$ when $latex {r = 2}$. But this has nothing to do with actually summing up the powers of $latex {2}$ at all.

So it is with the $latex {\zeta}$ function. Even though the $latex {\zeta}$ function only makes sense at first when $latex {\text{Re}(s) > 1}$, people have extended it to almost all of the complex plane (every $latex {s \neq 1}$). It just so happens that the functional equation for the Riemann $latex {\zeta}$ function, which relates the right and left half planes (reflecting across the line $latex {\text{Re}(s) = \frac{1}{2}}$), is

$latex \displaystyle \pi^{\frac{-s}{2}}\Gamma\left( \frac{s}{2} \right) \zeta(s) = \pi^{\frac{s-1}{2}}\Gamma\left( \frac{1-s}{2} \right) \zeta(1-s), \ \ \ \ \ (2)$

where $latex {\Gamma}$ is the gamma function, a sort of generalization of the factorial function. If we solve for $latex {\zeta(1-s)}$, then we get

$latex \displaystyle \zeta(1-s) = \frac{\pi^{\frac{-s}{2}}\Gamma\left( \frac{s}{2} \right) \zeta(s)}{\pi^{\frac{s-1}{2}}\Gamma\left( \frac{1-s}{2} \right)}. $

If we stick in $latex {s = 2}$, we get

$latex \displaystyle \zeta(-1) = \frac{\pi^{-1}\Gamma(1) \zeta(2)}{\pi^{\frac{1}{2}}\Gamma\left( \frac{-1}{2} \right)}. $

We happen to know that $latex {\zeta(2) = \frac{\pi^2}{6}}$ (this is the famous Basel problem) and that $latex {\Gamma(\frac{1}{2}) = \sqrt \pi}$. We also happen to know that in general, $latex {\Gamma(t+1) = t\Gamma(t)}$ (it is partially in this sense that the $latex {\Gamma}$ function generalizes the factorial function), so that $latex {\Gamma(\frac{1}{2}) = \frac{-1}{2} \Gamma(\frac{-1}{2})}$, or rather that $latex {\Gamma(\frac{-1}{2}) = -2 \sqrt \pi.}$ Finally, $latex {\Gamma(1) = 1}$ (on positive integers, $latex {\Gamma(n) = (n-1)!}$, the one-lower factorial).

Putting these together, we get that

$latex \displaystyle \zeta(-1) = \frac{\pi^2/6}{-2\pi^2} = \frac{-1}{12}, $

which is what we wanted to show. $latex {\diamondsuit}$
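If you’d like to check this numerically (this snippet is my addition and not part of the original argument), the mpmath library computes the analytically continued $latex {\zeta}$ and $latex {\Gamma}$ directly, so we can confirm both the value $latex {\zeta(-1) = -\frac{1}{12}}$ and the functional equation at $latex {s = 2}$.

```python
from mpmath import mp, zeta, gamma, pi

mp.dps = 30  # work with 30 digits of precision

print(zeta(-1))       # -0.0833... = -1/12, the analytic continuation at s = -1
print(gamma(-0.5))    # -3.5449... = -2*sqrt(pi), as used above

# Both sides of the functional equation at s = 2 should agree.
s = 2
lhs = pi**(-s / 2) * gamma(s / 2) * zeta(s)
rhs = pi**((s - 1) / 2) * gamma((1 - s) / 2) * zeta(1 - s)
print(lhs, rhs)       # both are pi/6 = 0.5235...
```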

The information I quoted about the Gamma function and the zeta function’s functional equation can be found on Wikipedia or in any introductory book on analytic number theory. Evaluating $latex {\zeta(2)}$ is a classic problem that has been solved in many ways, but it is most often taught in a first course on complex analysis or as a clever iterated integral problem (you can prove it with Fubini’s theorem). Evaluating $latex {\Gamma(\frac{1}{2})}$ is rarely done and is sort of a trick, usually done with Fourier analysis.

As usual, I have also created a paper version. You can find that here.

Posted in Expository, Math.NT, Mathematics

Math 100: Before second midterm

You have a midterm next week, and it’s not going to be a cakewalk.

As requested, I’m uploading the last five weeks’ worth of worksheets, with (my) solutions. A comment on the solutions: not everything is presented in full detail, but most things are presented in most detail (except for the occasional one that is far, far beyond what we actually expect you to be able to do). If you have any questions about anything, let me know. Even better, ask it here – maybe others have the same questions too.

Without further ado –

And since we were unable to go over the quiz in my afternoon recitation today, I’m attaching a worked solution to the quiz as well.

Again, let me know if you have any questions. I will still have my office hours on Tuesday from 2:30-4:30pm in my office (I’m aware that this happens to be immediately before the exam – that’s not by design). And I’ll be more or less responsive by email.

Study study study!

Posted in Brown University, Math 100, Mathematics

Comments vs documentation in python

In my first programming class we learned python. It went fine (I thought), I got the idea, and I moved on (although I do now see that much of what we did was not ‘pythonic’).

But now that I’m returning to programming (mostly in python), I see that I did much of it all wrong. One of my biggest surprises was how wrong I was about comments. Too many comments are terrible. Redundant comments make code harder to maintain. If the code is too complex to understand without comments, it’s probably just bad code.

That’s not so hard, right? You read some other people’s code, see their comment conventions, and move on. You sort of get into zen moments where the code becomes very clear and commentless, and there is bliss. But then we were at a puzzle competition, and someone wanted to use a piece of code I’d written. Sure, no problem! And the source was so beautifully clear that it almost didn’t even need a single comment to understand.

But they didn’t want to read the code. They wanted to use the code. Comments are very different from documentation. The realization struck me, and again I had it all wrong. In hindsight, it seems so obvious! I’ve programmed in Java; I know about javadoc. But no one had ever actually wanted to use my code before (and they wouldn’t have been able to easily if they had)!

Enter pydoc and sphinx. These are tools that generate HTML API documentation from docstrings in the code itself. There is a cost – docstrings with specific formatting have to be written below each method or class signature. But it’s reasonable, I think.
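To make that concrete, here is a small example of the kind of docstring these tools pick up (the function itself is just an illustration I’m adding here, not from any particular project). The `:param:` and `:returns:` fields are one common reStructuredText convention that sphinx renders nicely.

```python
def gcd(a, b):
    """Return the greatest common divisor of two integers.

    Tools like pydoc and sphinx read this docstring straight out of the
    source, so the same text doubles as the documentation that a *user*
    of the module sees, without them ever opening the code.

    :param a: first integer
    :param b: second integer
    :returns: the greatest common divisor of ``a`` and ``b``
    """
    while b:
        a, b = b, a % b
    return abs(a)
```

With docstrings like this in place, something as simple as `python -m pydoc yourmodule` (with your own module name) will already show readable documentation at the command line.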

Pydoc ships with python, and is fine for single modules or small projects. But for large projects, you’ll need something more. In fact, even the documentation linked above for pydoc was generated with sphinx:

The pydoc documentation is “Created using Sphinx 1.0.7.”

This isn’t to say that pydoc is bad. But I didn’t want to limit myself. Python uses sphinx, so I’ll give it a try too.

And thus I (slightly excessively, to get the hang of it) have documented my solutions to Project Euler on my github. The current documentation looks alright, and will hopefully look better as I gain experience.

Full disclosure – this was originally going to be a post on setting up sphinx-generated documentation on github’s pages automatically. I got sidetracked – that will be the next post.

Posted in Programming, Python

Math 100: Week 4

This is a post for my math 100 calculus class of fall 2013. In this post, I give the 4th week’s recitation worksheet (no solutions yet – I’m still writing them up). More pertinently, we will also go over the most recent quiz and common mistakes. Trig substitution, it turns out, is not so easy.

Before we hop into the details, I’d like to encourage you all to avail yourselves of each other, your professor, your TA, and the MRC in preparation for the first midterm (next week!).

1. The quiz

There were two versions of the quiz this week, but they were very similar. Both asked about a particular trig substitution. One version asked about

$latex \displaystyle \int_3^6 \sqrt{36 - x^2} \mathrm{d} x $

And the other was

$latex \displaystyle \int_{2\sqrt 2}^4 \sqrt{16 - x^2} \mathrm{d}x. $

Since they are so similar, I’m only going to go over the first one. We know we are to use trig substitution. I see two ways to proceed: either draw a reference triangle (which I recommend), or think through the Pythagorean trig identities until you find the one that works here (which I don’t recommend).

We see a $latex {\sqrt{36 - x^2}}$, and this is hard to deal with. Let’s draw a right triangle that has $latex {\sqrt{36 - x^2}}$ as a side. I’ve drawn one below. (Not fancy, but I need a better light).

In this picture, note that $latex {\sin \theta = \frac{x}{6}}$, or that $latex {x = 6 \sin \theta}$, and that $latex {\sqrt{36 - x^2} = 6 \cos \theta}$. If we substitute $latex {x = 6 \sin \theta}$ in our integral, this means that we can replace our $latex {\sqrt{36 - x^2}}$ with $latex {6 \cos \theta}$. But this is a substitution, so we need to think about $latex {\mathrm{d} x}$ too. Here, $latex {x = 6 \sin \theta}$ means that $latex {\mathrm{d}x = 6 \cos \theta \, \mathrm{d}\theta}$.

Some people used the wrong trig substitution, meaning they used $latex {x = \tan \theta}$ or $latex {x = \sec \theta}$, and got stuck. It’s okay to get stuck, but if you notice that something isn’t working, it’s better to try something else than to stare at the paper for 10 minutes. Other people use $latex {x = 6 \cos \theta}$, which is perfectly doable and parallel to what I write below.

Another common error was people forgetting about the $latex {\mathrm{d}x}$ term entirely. But it’s important!

Substituting these into our integral gives

$latex \displaystyle \int_{?}^{??} 36 \cos^2 (\theta) \mathrm{d}\theta, $

where I have included question marks for the limits because, as after most substitutions, they are different. You have a choice: you might go on and put everything back in terms of $latex {x}$ before you give your numerical answer; or you might find the new limits now.

It’s not correct to continue writing down the old limits. The variable has changed, and we really don’t want $latex {\theta}$ to go from $latex {3}$ to $latex {6}$.

If you were to find the new limits, then you need to consider: if $latex {x=3}$ and $latex {\frac{x}{6} = \sin \theta}$, then we want a $latex {\theta}$ such that $latex {\sin \theta = \frac{3}{6}= \frac{1}{2}}$, so we might use $latex {\theta = \pi/6}$. Similarly, when $latex {x = 6}$, we want $latex {\theta}$ such that $latex {\sin \theta = 1}$, like $latex {\theta = \pi/2}$. Note that these were two arcsine calculations, which we would have to do even if we waited until after we put everything back in terms of $latex {x}$ to evaluate.

Some people left their answers in terms of these arcsines. As far as mistakes go, this isn’t a very serious one. But this is the sort of simplification that is expected of you on exams, quizzes, and homeworks. In particular, if something can be written in a much simpler way through the unit circle, then you should do it if you have the time.

So we could rewrite our integral as

$latex \displaystyle \int_{\pi/6}^{\pi/2} 36 \cos^2 (\theta) \mathrm{d}\theta. $

How do we integrate $latex {\cos^2 \theta}$? We need to make use of the identity $latex {\cos^2 \theta = \dfrac{1 + \cos 2\theta}{2}}$. You should know this identity for this midterm. Now we have

$latex \displaystyle 36 \int_{\pi/6}^{\pi/2}\left(\frac{1}{2} + \frac{\cos 2 \theta}{2}\right) \mathrm{d}\theta = 18 \int_{\pi/6}^{\pi/2}\mathrm{d}\theta + 18 \int_{\pi/6}^{\pi/2}\cos 2\theta \mathrm{d}\theta. $

The first integral is extremely simple and yields $latex {6\pi}$. The second integral has antiderivative $latex {\dfrac{\sin 2 \theta}{2}}$ (don’t forget the $latex {2}$ on the bottom!), and we have to evaluate $latex {\big[9 \sin 2 \theta \big]_{\pi/6}^{\pi/2}}$, which gives $latex {-\dfrac{9 \sqrt 3}{2}}$. You should know the unit circle sufficiently well to evaluate this for your midterm.

And so the final answer is $latex {6 \pi - \dfrac{9 \sqrt 3}{2} \approx 11.0553}$. (You don’t need to be able to do that approximation.)
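If you want reassurance (this isn’t something you’d do on the quiz, and it’s my addition here), you can check the answer numerically:

```python
import numpy as np
from scipy.integrate import quad

# Numerically integrate sqrt(36 - x^2) from 3 to 6 and compare
# with the exact answer 6*pi - 9*sqrt(3)/2.
value, abserr = quad(lambda x: np.sqrt(36 - x**2), 3, 6)
exact = 6 * np.pi - 9 * np.sqrt(3) / 2
print(value, exact)   # both print roughly 11.0553
```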

Let’s go back a moment and suppose you didn’t re-evaluate the limits once you substituted in $latex {\theta}$. Then, following the same steps as above, you’d be left with

$latex \displaystyle 18 \int_{?}^{??}\mathrm{d}\theta + 18 \int_{?}^{??}\cos 2\theta \mathrm{d}\theta = \left[ 18 \theta \right]_?^{??} + \left[ 9 \sin 2 \theta \right]_?^{??}. $

Since $latex {\frac{x}{6} = \sin \theta}$, we know that $latex {\theta = \arcsin (x/6)}$. This is how we evaluate the left integral, and we are left with $latex {[18 \arcsin(x/6)]_3^6}$. This means we need to know the arcsine of $latex {1}$ and $latex {\frac 12}$. These are exactly the same two arcsine computations that I referenced above! Following them again, we get $latex {6\pi}$ as the answer.

We could do the same for the second part, since $latex {\sin ( 2 \arcsin (x/6))}$ when $latex {x = 3}$ is $latex {\sin (2 \arcsin \frac{1}{2} ) = \sin (2 \cdot \frac{\pi}{6} ) = \frac{\sqrt 3}{2}}$; and when $latex {x = 6}$ we get $latex {\sin (2 \arcsin 1) = \sin (2 \cdot \frac{\pi}{2}) = \sin (\pi) = 0}$.

Putting these together, we see that the answer is again $latex {6\pi - \frac{9\sqrt 3}{2}}$.

Or, throwing yet another option out there, we could do something else (a little bit wittier, maybe?). We have this $latex {\sin 2\theta}$ term to deal with. You might recall that $latex {\sin 2 \theta = 2 \sin \theta \cos \theta}$, the so-called double-angle identity.

Then $latex {9 \sin 2\theta = 18 \sin \theta \cos \theta}$. Going back to our reference triangle, we know that $latex {\cos \theta = \dfrac{\sqrt{36 - x^2}}{6}}$ and that $latex {\sin \theta = \dfrac{x}{6}}$. Putting these together,

$latex \displaystyle 9 \sin 2 \theta = \dfrac{ x\sqrt{36 - x^2} }{2}. $

When $latex {x=6}$, this is $latex {0}$. When $latex {x = 3}$, we have $latex {\dfrac{ 3\sqrt {27}}{2} = \dfrac{9\sqrt 3}{2}}$.

And fortunately, we get the same answer again at the end of the day. (phew).

2. The worksheet

Finally, here is the worksheet for the day. I’m working on the solutions, and I’ll have them up by late this evening (sorry for the delay).

Ending tidbits – when I was last a TA, I tried to see what the good predictors of final grade were. Some things weren’t very surprising – there is a large correlation between exam scores and final grade. Some things were a bit surprising – low homework scores correlated well with a low final grade, but high homework scores didn’t really have a strong correlation with final grade at all; attendance also correlated only weakly. But one thing that really stuck with me was the correlation between the first midterm grade and the final grade in the class: it was really strong. For a bit more on that, I refer you to the final post in my Math 90 series.

Posted in Brown University, Math 100, Mathematics

Math 100: Week 3 and pre-midterm

This is a post for my Math 100 class of fall 2013. In this post, I give the first three weeks’ worksheets from recitation and the set of solutions to week three’s worksheet, as well as a few administrative details.

Firstly, here is the recitation work from the first three weeks:

  1. (there was no recitation the first week)
  2. A worksheet focusing on review.
  3. A worksheet focusing on integration by parts and u-substitution, with solutions.

In addition, I’d like to remind you that I have office hours from 2-4pm (right now) in Kassar 018. I’ve had multiple people set up appointments with me outside of these hours, which I’m tempted to interpret as suggesting that I change when my office hours are. If you have a preference, let me know, and I’ll try to incorporate it.

Finally, there will be an exam next Tuesday. I’ve been getting a lot of emails about what material will be on the exam. The answer is that everything you have learned up to now, and everything we cover by the end of this week, is fair game for exam material. This also means there could be exam questions on material that we have not discussed in recitation, so be prepared. However, I will be setting aside a much larger portion of recitation this Thursday for questions than normal. So come prepared with your questions.

Best of luck, and I’ll see you in class on Thursday.

Posted in Brown University, Math 100, Mathematics

Happy Birthday to The Science Guy

On 10 July 1917, Donald Herbert Kemske (later known as Donald Jeffry Herbert) was born in Waconia, Minnesota. Back when university educations were a bit more about education and a bit less about establishing a vocation, Donald studied general science and English at La Crosse State Normal College (which is now the University of Wisconsin-La Crosse). But Donald liked drama, and he became an actor. When World War II broke out, Donald joined the US Army Air Forces, flying over 50 missions as a bomber pilot.

After the war, Donald began to act in children’s programs at a radio station in Chicago. Perhaps it was because of his love of children’s education, perhaps it was the sudden visibility of the power of science, as evidenced by the nuclear bomb, or perhaps something else – but Donald had an idea for a tv show based around general science experiments. And so Watch Mr. Wizard was born on 3 March 1951 on NBC. (When I think about it, I’m surprised at how early this was in the life of television programming). Each week, a young boy or a girl would join Mr. Wizard (played by Donald) on a live tv show, where they would be shown interesting and easily-reproducible science experiments.

Watch Mr. Wizard was the first such tv program, and one might argue that its effects are still felt today. A total of 547 episodes of Watch Mr. Wizard aired. By 1956, over 5000 local Mr. Wizard science clubs had been started around the country; by 1965, when the show was cancelled by NBC, there were more than 50000. In fact, my parents have told me of Mr. Wizard and his fascinating programs. Such was the love and reach of Mr. Wizard that on the first episode of Late Night with David Letterman, the guests were Bill Murray, Steve Fessler, and Mr. Wizard. He’s also mentioned in the song Walkin’ On the Sun by Smash Mouth. Were it possible for me to credit the many scientists that certainly owe their start to Mr. Wizard, I would.

I mention this because the legacy of Mr. Wizard was passed down. Don Herbert passed away on June 12, 2007. In an obituary published a few days later, Bill Nye writes that “Herbert’s techniques and performances helped create the United States’ first generation of homegrown rocket scientists just in time to respond to Sputnik. He sent us to the moon. He changed the world.” Reading the obituary, you cannot help but think that Bill Nye was also inspired to start his show by Mr. Wizard.

In fact, 20 years ago today, on 10 September 1993, the first episode of Bill Nye the Science Guy aired on PBS. It’s much more likely that readers of this blog have heard of Bill Nye; even though production of the show halted in 1998, PBS still airs reruns, and it’s commonly used in schools (did you know it won an incredible 19 Emmys?). I, for one, loved Bill Nye the Science Guy, and I still follow him to this day. I think it is impossible to narrow down the source of my initial interest in science, but I can certainly say that Bill Nye furthered my interest in science and experiments. He made science seem cool and powerful. To be clear, I know science is still cool and powerful, but I’m not so sure that’s the popular opinion. (As an aside: I also think math would really benefit from having our own Bill Nye).

(more…)

Posted in Humor, Story

An intuitive introduction to calculus

This is a post written for my fall 2013 Math 100 class but largely intended for anyone with knowledge of what a function is and a desire to know what calculus is all about. Calculus is made out to be the pinnacle of the high school math curriculum, and correspondingly is thought to be very hard. But the difficulty is bloated, blown out of proportion. In fact, the ideas behind calculus are approachable and even intuitive if thought about in the right way.

Many people managed to stumble across the page before I’d finished all the graphics. I’m sorry, but they’re all done now! I was having trouble figuring out how WordPress was going to handle my gif files – it turns out that it automagically resizes them if you don’t make them the correct size, which keeps them from displaying. It took me a bit to realize this. I’d like to mention that this actually started as a 90 minute talk I had with my wife over coffee, so perhaps an alternate title would be “Learning calculus in 2 hours over a cup of coffee.”

So read on if you would like to understand what calculus is, or if you’re looking for a refresher of the concepts from a first semester in calculus (like for Math 100 students at Brown), or if you’re looking for a bird’s eye view of AP Calc AB subject material.

1. An intuitive and semicomplete introduction to calculus

We will think of a function $ {f(\cdot)}$ as something that takes an input $ {x}$ and gives out another number, which we’ll denote by $ {f(x)}$. We know functions like $ {f(x) = x^2 + 1}$, which means that if I give in a number $ {x}$ then the function returns the number $ {f(x) = x^2 + 1}$. So I put in $ {1}$, I get $ {1^2 + 1 = 2}$, i.e. $ {f(1) = 2}$. Primary and secondary school overly conditions students to think of functions in terms of a formula or equation. The important thing to remember is that a function is really just something that gives an output when given an input, and if the same input is given later then the function spits the same output out. As an aside, I should mention that the most common problem I’ve seen in my teaching and tutoring is a fundamental misunderstanding of functions and their graphs.

For a function that takes in and spits out numbers, we can associate a graph. A graph is a two-dimensional representation of our function, where by convention the input is put on the horizontal axis and the output is put on the vertical axis. Each axis is numbered, and in this way we can identify any point in the graph by its coordinates, i.e. its horizontal and vertical position. A graph of a function $ {f(x)}$ includes a point $ {(x,y)}$ if $ {y = f(x)}$.

The graph of the function $ x^2 + 1$ is in blue. The emphasized point appears on the graph because it is of the form $ (x, f(x))$. In particular, this point is $ (1, 2)$.

Thus each point on the graph is really of the form $ {(x, f(x))}$. A large portion of algebra I and II is devoted to being able to draw graphs for a variety of functions. And if you think about it, graphs contain a huge amount of information. Graphing $ {f(x)= x^2 + 1}$ involves drawing an upwards-facing parabola, which really represents an infinite number of points. That’s pretty intense, but it’s not what I want to focus on here.

1.1. Generalizing slope – introducing the derivative

You might recall the idea of the ‘slope’ of a line. A line has a constant ratio of how much the $ {y}$ value changes for a specific change in $ {x}$, which we call the slope (people always seem to remember rise over run). In particular, if a line passes through the points $ {(x_1, y_1)}$ and $ {(x_2, y_2)}$, then its slope will be the vertical change $ {y_2 - y_1}$ divided by the horizontal change $ {x_2 - x_1}$, or $ {\dfrac{y_2 - y_1}{x_2 - x_1}}$.

The graph of a line appears in blue. The two points $ (0,1)$ and $ (1,3)$ are shown on the line. The horizontal red line shows the horizontal change. The vertical red line shows the vertical change. The ‘slope’ of the blue line is the length of the vertical red line divided by the length of the horizontal red line.

So if the line is given by an equation $ {f(x) = \text{something}}$, then the slope from two inputs $ {x_1}$ and $ {x_2}$ is $ {\dfrac{f(x_2) - f(x_1)}{x_2 - x_1}}$. As an aside, for those that remember things like the ‘standard equation’ $ {y = mx + b}$ or ‘point-slope’ $ {(y - y_0) = m(x - x_0)}$ but who have never thought or been taught where these come from: the claim that lines are the curves of constant slope is saying that for any choice of $ {(x_1, y_1)}$ on the line, we expect $ {\dfrac{y_2 - y_1}{x_2 - x_1} = m}$ a constant, which I denote by $ {m}$ for no particularly good reason other than the fact that some textbook author long ago did such a thing. Since we’re allowing ourselves to choose any $ {(x_1, y_1)}$, we might drop the subscripts – since they usually mean a constant – and rearrange our equation to give $ {y_2 - y = m(x_2 - x)}$, which is what has been so unkindly drilled into students’ heads as the ‘point-slope form.’ This is why lines have a point-slope form, and a reason that it comes up so much is that it comes so naturally from the defining characteristic of a line, i.e. constant slope.

But one cannot speak of the ‘slope’ of a parabola.

The parabola $ f(x) = x^2 + 1$ is shown in blue. Slope is a measure of how much the function $ f(x)$ changes when $ x$ is changed. Some tangent lines to the parabola are shown in red. The slope of each line seems like it should be the ‘slope’ of the parabola when the line touches the parabola, but these slopes are different.

Intuitively, we look at our parabola $ {x^2 + 1}$ and see that the ‘slope,’ or an estimate of how much the function $ {f(x)}$ changes with a change in $ {x}$, seems to be changing depending on what $ {x}$ values we choose. (This should make sense – if it didn’t change, and had constant slope, then it would be a line). The first major goal of calculus is to come up with an idea of a ‘slope’ for non-linear functions. I should add that we already know a sort of ‘instantaneous rate of change’ of a nonlinear function. When we’re in a car and we’re driving somewhere, we’re usually speeding up or slowing down, and our pace isn’t usually linear. Yet our speedometer still manages to say how fast we’re going, which is an immediate rate of change. So if we had a function $ {p(t)}$ that gave us our position at a time $ {t}$, then the slope would give us our velocity (change in position per change in time) at a moment. So without knowing it, we’re familiar with a generalized slope already. Now in our parabola, we don’t expect a constant slope, so we want to associate a ‘slope’ to each input $ {x}$. In other words, we want to be able to understand how rapidly the function $ {f(x)}$ is changing at each $ {x}$, analogous to how the slope $ {m}$ of a line $ {g(x) = mx + b}$ tells us that if we change our input by an amount $ {h}$ then our output value will change by $ {mh}$.

How does calculus do that? The idea is to get closer and closer approximations. Suppose we want to find the ‘slope’ of our parabola at the point $ {x = 1}$. Let’s get an approximate answer. The slope of the line coming from inputs $ {x = 1}$ and $ {x = 2}$ is a (poor) approximation. In particular, since we’re working with $ {f(x) = x^2 + 1}$, we have that $ {f(2) = 5}$ and $ {f(1) = 2}$, so that the ‘approximate slope’ from $ {x = 1}$ and $ {x = 2}$ is $ {\frac{5 - 2}{2 - 1} = 3}$. But looking at the graph,

The parabola $ x^2 + 1$ is shown in blue, and the line going through the points $ (1,2)$ and $ (2,5)$ is shown. The line immediately goes above and crosses the parabola, so it seems like this line is rising faster (changing faster) than the parabola. It’s too steep, and the slope is too high to reflect the ‘slope’ of the parabola at the indicated point.

we see that it feels like this slope is too large. So let’s get closer. Suppose we use inputs $ {x = 1}$ and $ {x = 1.5}$. We get that the approximate slope is $ {\frac{3.25 - 2}{1.5 - 1} = 2.5}$. If we were to graph it, this would also feel too large. So we can keep choosing smaller and smaller changes, like using $ {x = 1}$ and $ {x = 1.1}$, or $ {x = 1}$ and $ {x = 1.01}$, and so on. This next graphic contains these approximations, with chosen points getting closer and closer to $ {1}$.

The parabola $ x^2 + 1$ is shown in blue. Two points are chosen on the parabola and the line between them is drawn in red. As the points get closer to each other, the red line indicates the rate of growth of the parabola at the point $ (1,2)$ better and better. So the slope of the red lines seems to be getting closer to the ‘slope’ of the parabola at $ (1,2)$.

Let’s look a little closer at the values we’re getting for our slopes when we use $ {1}$ and $ {2, 1.5, 1.1, 1.01, 1.001}$ as our inputs. We get

$ \displaystyle \begin{array}{c|c} \text{second input} & \text{approx. slope} \\ \hline 2 & 3 \\ 1.5 & 2.5 \\ 1.1 & 2.1 \\ 1.01 & 2.01 \\ 1.001 & 2.001 \end{array} $
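(If you’d like to reproduce this table yourself, here is a short snippet – my addition, not part of the original post – that computes each approximate slope directly from the definition.)

```python
# Approximate slopes of f(x) = x**2 + 1 at x = 1, using second inputs
# that get closer and closer to 1 (reproducing the table above).
def f(x):
    return x**2 + 1

for second_input in [2, 1.5, 1.1, 1.01, 1.001]:
    slope = (f(second_input) - f(1)) / (second_input - 1)
    print(second_input, slope)
```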

It looks like the approximate slopes are approaching $ {2}$. What if we plot the graph with a line of slope $ {2}$ going through the point $ {(1,2)}$?

The parabola $ x^2 + 1$ is shown in blue. The line in red has slope $ 2$ and goes through the point $ (1,2)$. We got this line by continuing the successive approximations done above. It looks like it accurately indicates the ‘slope’ of the parabola at $ (1,2)$.

It looks great! Let’s zoom in a whole lot.

When we zoom in, the blue parabola looks almost like a line, and the red line looks almost like the parabola! This is why we are measuring the ‘slope’ of the parabola in this fashion – when we zoom in, it looks more and more like a line, and we are getting the slope of that line.

That looks really close! In fact, what I’ve been treating as the natural-feeling slope, or local rate of change, is really the slope of the line tangent to the graph of our function at the point $ {(1, f(1))}$. In a calculus class, you’ll spend a bit of time making sense of what it means for the approximate slopes to ‘approach’ $ {2}$. This is called a ‘limit,’ and the details are not important to us right now. The important thing is that this lets us get an idea of a ‘slope’ at a point on a parabola. It’s not really a slope, because a parabola isn’t a line. So we’ve given it a different name – we call this ‘the derivative.’ So the derivative of $ {f(x) = x^2 + 1}$ at $ {x = 1}$ is $ {2}$, i.e. right around $ {x = 1}$ we expect a rate of change of $ {2}$, so that we expect $ {f(1 + h) - f(1) \approx 2h}$. If you think about it, we’re saying that we can approximate $ {f(x) = x^2 + 1}$ near the point $ {(1, 2)}$ by the line shown in the graph above: this line passes through $ {(1,2)}$ and its slope is $ {2}$, which is what we’re calling the slope of $ {f(x) = x^2 + 1}$ at $ {x = 1}$.

Let’s generalize. We were able to speak of the derivative at one point, but how about other points? The rest of this post is below the ‘more’ tag below.

(more…)

Posted in Brown University, Expository, Math 100, Mathematics