Category Archives: Math 100

Math 100 Fall 2016: Concluding Remarks

It is that time of year. Classes are over. Campus is emptying. Soon it will be mostly emptiness, snow, and grad students (who of course never leave).

I like to take some time to reflect on the course. How did it go? What went well and what didn’t work out? And now that all the numbers are in, we can examine course trends and data.

Since numbers are direct and graphs are pretty, let’s look at the numbers first.

Math 100 grades at a glance

Let’s get an understanding of the distribution of grades in the course, all at once.

box_plots

These are classic box plots. The center line of each box denotes the median. The left and right ends of the box indicate the 1st and 3rd quartiles. As a quick reminder, the 1st quartile is the point where 25% of students received that grade or lower. The 3rd quartile is the point where 75% of students received that grade or lower. So within each box lies 50% of the course.

Each box has two arms (or “whiskers”) extending out, indicating the other grades of students. Points that are plotted separately are statistical outliers, which means that they are more than $1.5 \cdot (Q_3 - Q_1)$ higher than $Q_3$ or lower than $Q_1$ (where $Q_1$ denotes the first quartile and $Q_3$ denotes the third quartile).
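To make the outlier rule concrete, here is a minimal sketch in Python. The grades in it are made up for illustration (they are not data from the course), and the quartile convention (`method="inclusive"`) is an assumption; other conventions give slightly different cut points.

```python
import statistics

# Hypothetical grades, invented for this example (not course data).
grades = [35, 62, 70, 74, 78, 80, 83, 85, 88, 90, 92, 95]

# With n=4, statistics.quantiles returns the three quartile cut points.
q1, median, q3 = statistics.quantiles(grades, n=4, method="inclusive")
iqr = q3 - q1  # the width of the box, Q3 - Q1

# A point is plotted separately if it lies beyond these fences.
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [g for g in grades if g < low_fence or g > high_fence]

print(q1, median, q3, outliers)
```

Here only the 35 lands outside the fences; the 62 is low, but not low enough to count as an outlier.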

A bit more information about the distribution itself can be seen in the following graph.

violin_plot

Within each blob, you’ll notice an embedded box-and-whisker graph. The white dots indicate the medians, and the thicker black parts indicate the central 50% of the grades. The width of the colored blobs roughly indicates how many students scored within that region. [As an aside, each blob actually has the same area, so the area is a meaningful data point].

So what can we determine from these graphs? Firstly, students did extremely well in their recitation sections and on the homework. I am perhaps most stunned by the tightness of the homework distribution. Remarkably, 75% of students had at least a 93 homework average. Recitation scores were very similar.

I also notice some patterns between the two midterms and final. The median on the first midterm was very high and about 50% of students earned a score within about 12 points of the median. The median on the second midterm was a bit lower, but the spread of the middle 50% of students was about the same. However the lower end was significantly lower on the second midterm in comparison to the first midterm. The median on the final was significantly lower, and the 50% spread was much, much larger.

Looking at the Overall grade, it looks very similar to the distribution of the first midterm, except shifted a bit.

It is interesting to note that Recitation (10%), Homework (20%), and the First Midterm (20%) accounted for 50% of the course grade; the Second Midterm (20%) and the Final (30%) accounted for the other 50% of the course grade. The Recitation, Homework, and First Midterm grades pulled the Overall grade distribution up, while the Second Midterm and Final pulled the Overall grade distribution down.

Correlation between assignments and Overall Grade

I pose the question: was any individual assignment type a good predictor of the final grade? For example, to what extent can we predict your final grade based on your First Midterm grade?

hw_vs_overall

No, doing well on homework is a terrible predictor of final grade. The huge vertical cluster of dots indicates that the overall grades vary significantly over a very small amount of homework. However, I note that doing poorly on homework is a great predictor of doing poorly overall. No one whose homework average was below an 80 got an A in the course. Having a homework grade below a 70 is a very strong indicator of failing the course. In terms of Pearson’s R correlation, one might say that about 40% of overall performance is predicted from performance on homework (which is very little).
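For readers who want to compute this kind of number themselves, here is a minimal sketch of the Pearson correlation in Python. The (homework, overall) pairs are invented for illustration, and `pearson_r` is a helper written for this example. Since "percent predicted" is sometimes read off from $r$ and sometimes from $r^2$, the sketch prints both rather than assuming which one is meant.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical scores: homework tightly clustered, overall widely spread.
homework = [98, 97, 96, 95, 95, 94, 93, 85, 70]
overall = [92, 71, 85, 60, 88, 79, 95, 74, 55]

r = pearson_r(homework, overall)
print(round(r, 2), round(r * r, 2))
```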

Although drastic, this is in line with my expectations for calculus courses. This is perhaps a bit more extreme than normal: the level of clustering in the homework averages is truly stunning. Explaining this is a bit hard. It is possible to get homework help from the instructor or TA, or to work with other students, or to get help from the Math Resource Center or other tutoring. It is also possible to cheat, either with a solutions manual (which I know some students have), or a paid answer service (which I also witnessed), or to check answers on a computer algebra system like WolframAlpha. Each of these weakens the value of homework as an indicator of mastery.

In the calculus curriculum at Brown, I think it’s safe to say that homework plays a formative role instead of a normative role. It serves to provide opportunities for students to work through and learn material, and we don’t expect the grades to correspond strongly to understanding. To that end, half of the homework isn’t even collected.

m1_vs_overall

m2_vs_overall

The two midterms each correlate pretty strongly with Overall grade. In particular, the second midterm visually indicates really strong correlation. Statistically speaking (i.e. from Pearson’s R), it turns out that 67% of the Overall grade can be predicted from the First Midterm (higher than might be expected) and 80% can be predicted from the Second Midterm (which is really, really high).

If we are willing to combine some pieces of information, the Homework and First Midterm (together) predict 77% of the Overall grade. As each student’s initial homework effort is very indicative of later homework, this means that we can often predict a student’s final grade (to a pretty good accuracy) as early as the first midterm.

(The Homework and the Second Midterm together predict 85% of the Overall grade. The two midterms together predict 88% of the Overall grade.)

This has always surprised me a bit, since for many students the first midterm is at least partially a review of material taught before. However, this course is very cumulative, so it does make sense that doing poorly on earlier tests indicates a hurdle that must be overcome in order to succeed on later tests. This is one of the unforgiving aspects of math, the sciences, and programming — early disadvantages compound. I’ve noted roughly this pattern in the past as well.

final_vs_overall

However the correlation between the Final and the Overall grade is astounding. I mean, look at how much the relationship looks like a line. Even the distributions (shown around the edges) look similar. Approximately 90% of the Overall grade is predicted by the grade on the Final Exam.

This is a bit atypical. One should expect a somewhat high correlation, as the final exam is cumulative and covers everything from the course (or at least tries to). But 90% is extremely high.

I think one reason why this occurred this semester is that the final exam was quite hard. It was distinctly harder than the midterms (though still easier than many of the homework problems). A hard final gives more opportunities for students who really understand the material to demonstrate their mastery. Conversely, a hard final punishes students with only a cursory understanding. Although stressful, I’ve always been a fan of exams that are difficult enough to distinguish between students, and to provide a chance for students to catch up. See Chances for a Comeback below for more on this.

Related statistics of interest might concern to what extent performance on the First Midterm predicts performance on the Second Midterm (44%) or the Final Exam (48%), or to what extent the Second Midterm predicts performance on the Final Exam (63%).

Homework and Recitations

hw_vs_m1

As mentioned above, homework performance is a terrible predictor of course grade. I thought it was worth diving into a bit more. Does homework performance predict anything well? The short answer is not really.

Plotting Homework grade vs the First Midterm shows such a lack of meaning that it doesn’t even make sense to try to draw a line of best fit.

To be fair, homework is a better predictor of performance on the Second Midterm and Final Exam, but it’s still very bad.

rec_vs_hw

Here’s a related question: what about Recitation sections? Are these good predictors of any other aspect of the course?

Plotting Recitation vs Homework is sort of interesting. Evidently, most people did very well on both homework and recitation. It is perhaps no surprise that most students who did very well in Recitation also did very well on their Homework, and vice versa. However it turns out that there are more people with high recitation grades and low homework grades than the other way around. But thinking about it, this makes sense.

These distributions are so tight that it still doesn’t make sense to try to draw a line of best fit or to talk about Pearson coefficients – most variation is simply too small to be meaningful.

Together, Homework and Recitation predict a measly 50% of the Overall grade of the course (in the Pearson’s R sense). One would expect more, as Homework and Recitation are directly responsible for 30% of the Overall grade, and one would expect homework and recitation to correlate at least somewhat meaningfully with the rest of graded content of the course, right?

I guess not.

So what does this mean about recitation and homework? Should we toss them aside? Does something need to be changed?

I would say “Not necessarily,” as it is important to recognize that not all grades are equal. Both homework and recitation are the places for students to experiment and learn. Recitations are supposed to be times where students are still learning material. They are to be inoffensive and safe, where students can mess up, fall over, and get back up again with the help of their peers and TA. I defend the lack of stress on grade or challenging and rigorous examination during recitation.

Homework is sort of the same, and sort of completely different. What gives me pause concerning homework is that homework is supposed to be the barometer by which students can measure their own understanding. When students ask us about how they should prepare for exams, our usual response is “If you can do all the homework (including self-check) without referencing the book, then you will be well-prepared for the exam.” If homework grade is such a poor predictor of exam grades, then is it possible that homework gives a poor ruler for students to measure themselves by?

I’m not sure. Perhaps it would be a good idea to indicate all the relevant questions in the textbook so that students have more problems to work on. In theory, students could do this themselves (and for that matter, I’m confident that such students would do very well in the course). But the problem is that we only cover a subset of the material in most sections of the textbook, and many questions (even those right next to ones we assign) require ideas or concepts that we don’t teach.

On the other hand, learning how to actually learn is a necessary skill, and probably one that most people struggle with when they first actually have to learn it. It’s necessary to learn it sooner or later.

Chances for a Comeback

The last numerical aspect I’ll consider is about whether or not it is possible to come back after doing badly on an earlier assessment. There are two things to consider: whether it is actually feasible or not, and whether any students did make it after a poor initial/early performance.

As to whether it is possible, the answer is yes (but it may be hard). And the reason why is that the Second Midterm and Final grades were each relatively low. It may be counterintuitive, but in order to return from a failing grade, it is necessary that there be enough room to actually come back.

Suppose Aiko is a pretty good student, but it just so happens that she makes a 49 on the first midterm due to some particular misunderstanding. If the class average on every assessment is a 90, then Aiko cannot claw her way back. That is, even if Aiko makes a 100 on everything else, Aiko’s final grade would be below a 90, and thus below average. Aiko would probably make a B.

In this situation, the class is too easy, and thus there are no chances for students to overcome a setback on any single exam.

On the other hand, suppose that Bilal makes a 49 on the first midterm, but that the class average is a 75 overall. If Bilal makes a 100 on everything else, Bilal will end with just below a 90, significantly above the class average. Bilal would probably make an A.
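The arithmetic behind both scenarios is the same weighted sum, sketched below in Python. The weights are the ones stated earlier in this post; the dictionary keys and the `overall_grade` helper are names made up for this example.

```python
# Course weights as given earlier in this post.
WEIGHTS = {
    "recitation": 0.10,
    "homework": 0.20,
    "midterm1": 0.20,
    "midterm2": 0.20,
    "final": 0.30,
}

def overall_grade(scores):
    """Weighted course grade from a dict of component scores out of 100."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)

# A 49 on the first midterm and a 100 on everything else.
comeback = {
    "recitation": 100,
    "homework": 100,
    "midterm1": 49,
    "midterm2": 100,
    "final": 100,
}
print(round(overall_grade(comeback), 1))  # 89.8
```

The ceiling is the same 89.8 for Aiko and Bilal. What differs is where that number sits relative to the rest of the class.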

In this course, the mean overall was a 78, and the standard deviation was about 15. In this case, an 89 would be an A. So there was enough space and distance to overcome even a disastrous exam.

But, did anyone actually do this? The way I like to look at this is to look at changes in relative performance in terms of standard deviations away from the mean. Performing at one standard deviation below the mean on Midterm 1 and one standard deviation above the mean on Midterm 2 indicates a more meaningful amount of grade fluidity than merely looking at points above or below the mean.
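As a small sketch of what "standard deviations away from the mean" means in practice, the Python below converts raw scores into standardized scores and compares them across two exams. The scores are hypothetical, and using the population standard deviation (`pstdev`) is an assumption; the plots in this post may have used the sample version.

```python
import statistics

def z_scores(scores):
    """Express each score as a number of standard deviations from the mean."""
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)
    return [(s - mean) / sd for s in scores]

# Hypothetical scores for the same five students on two midterms.
midterm1 = [49, 70, 82, 88, 91]
midterm2 = [85, 60, 75, 90, 70]

z1 = z_scores(midterm1)
z2 = z_scores(midterm2)

# A positive change means the student moved up relative to the class.
change = [b - a for a, b in zip(z1, z2)]
print([round(c, 2) for c in change])
```

In this made-up data, the first student moves up by more than two standard deviations, which is the kind of comeback discussed here.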

m1_vs_m2_stddevs

Looking at the First Midterm vs the Second Midterm, we see that there is a rough linear relationship (Pearson R suggests 44% predictive value). That’s to be expected. What’s relevant now are the points above or below the line $y = x$. To be above the line $y = x$ means that you did better on the Second Midterm than you did on the First Midterm, all in comparison to the rest of the class. To be below means the opposite.

Even more relevant is to be in the Second Quadrant (the upper left), which indicates that you did worse than average on the first midterm and better than average on the second. Looking here, there is a very healthy number of people who are in the Second Quadrant. There are many people who changed by 2 standard deviations in comparison to the mean, which is a very meaningful change. [Many people lost a few standard deviations too. Grade mobility is a two-way street].

m1_vs_overall_stddevs

m2_vs_overall_stddevs

The First Midterm to the Overall grade shows healthy mobility as well.

The Second Midterm to Overall shows some mobility, but it is interesting that more people lost ground (by performing well on the Second Midterm, and then performing badly Overall) than gained ground (by performing badly on the Second Midterm, but performing well Overall).

Although I don’t show the plots, this trend carries through pretty well. Many people were able to salvage or boost a letter grade based solely on the final (and correspondingly many people managed to lose just enough on the final to drop a letter grade). Interestingly, very many people were able to turn a likely B into an A through the final.

So overall, I would say that it was definitely possible to salvage grades this semester.

If you’ve never thought about this before, then keep this in mind the next time you hear complaints about a course with challenging exams — it gives enough space for students to demonstrate sufficient understanding to make up for a bad past assessment.

Non-Numerical Reflection

The numbers tell some characteristics of the class, but not the whole story.

Standard Class Materials

We used Thomas’ Calculus. I think this is an easy book to teach from, and relatively easy to read. It feels like many other cookie-cutter calculus books (such as Larson and Edwards or Stewart). But it’s quite expensive for students. However, as we do not use an electronic homework component (which seems to be becoming more popular elsewhere), at least students can buy used copies (or use other methods of procural).

However, solutions manuals are available online (I noticed some students had copies). Some of the pay-for sites provide complete (and mostly but not entirely correct) solutions manuals as well. This makes some parts of Thomas challenging to use, especially as we do not write our own homework to give. I suppose that this is a big reason why one might want to use an electronic system.

The book has much more material in it than we teach. For instance, the book includes all of a first semester of calculus, and also more details in many sections. We avoid numerical integration, Fourier series, some applications, some details concerning polar and parametric plots, etc. Ideally, there would exist a book catering to exactly our needs. But there isn’t, so I suppose Thomas is about as good as any.

Additional Course Materials

I’ve now taught elementary calculus for a few years, and I’m surprised at how often I am able to reuse two notes I wrote years ago, namely the refresher on first semester calculus (An Intuitive Introduction to Calculus) and my additional note on Taylor series (An Intuitive Overview of Taylor Series). Perhaps more surprisingly, I’m astounded at how often people from other places link to and visit these two notes (and in particular, the Taylor Series note).

These were each written for a Math 100 course in 2013. So my note to myself is that there is good value in writing something well enough that I can reuse it, and others might even find it valuable.

Unfortunately, while I wrote a few notes this semester, I don’t think that they will have the same lasting appeal. The one I wrote on the series convergence tests is something that (perhaps after one more round of editing) I will use each time I teach this subject in the future. I’m tremendously happy with my note on computing $\pi$ with Math 100 tools, but as it sits outside the curriculum, many students won’t actually read it. [However many did read it, and it generated many interesting conversations about actual mathematics]. Perhaps sometime I will teach a calculus class ending with some sort of project, as computing $\pi$ leads to very many interesting and interrelated project ideas.

Course Content

I must admit that I do not know why this course is the way it is, and this bothers me a bit. In many ways, this course is a grab bag of calculus nuggets. Presumably each piece was added in because it is necessary in sufficiently many other places, or is so directly related to the “core material” of this course, that it makes sense to include it. But from what I can tell, these reasons have been lost to the sands of time.

The core topics in this course are: Integration by Parts, Taylor’s Theorem, Parametric and Polar coordinates, and First Order Linear Differential Equations. We also spend a large amount of time on other techniques of integration (partial fraction decomposition, trig substitution) and on understanding generic series (including the various series convergence/divergence tests). Along the way, there are some seemingly arbitrary decisions on what to include or exclude. For instance, we learn how to integrate

$$ \int \sin^n x \cos^m x \; dx$$

because we have decided that being able to perform trigonometric substitution in integrals is a good idea. But we omit integrals like

$$ \int \sin(nx) \sin(mx) \; dx$$

which would come up naturally in talking about Fourier series. Fourier series fit naturally into this class, and in some variants of this class they are taught. But so does trigonometric substitution! So what is the rationale here? If the answer is to become better at problem solving or to develop mathematical maturity, then I think it would be good to recognize that so that we know what we should feel comfortable wiggling to build and develop the curriculum in the future. [Also, students should know that calculus is not a pinnacle. See for instance this podcast with Steven Strogatz on Innovation Hub.]

This is not restricted to Brown. I’m familiar with the equivalent of this course at other institutions, and there are similar seemingly arbitrary differences in what to include or exclude. For years at Georgia Tech, they tossed in a several week unit on linear algebra into this course [although I’ve learned that they stopped that in the past two years]. The AP Calc BC curriculum includes trig substitution but not Fourier series. Perhaps they had a reason?

What this means to me is that the intent of this course has become muddled, and separated from the content of the course. This is an overwhelmingly hard task to try to fix, as a second semester of calculus fits right in the middle of so many other pieces. Yet I would be very grateful to the instructor who sits down and identifies reasons for or against inclusion of the various topics in this course, or perhaps cuts the calculus curriculum into pieces and rearranges them to fit modern necessities.

A Parachute is only necessary to go skydiving twice

This is the last class I teach at Brown as a graduate student (and most likely, ever). Amusingly, I taught it in the same room as the first course I taught as a graduate student. I’ve learned quite a bit about teaching in between, but in many ways it feels the same. Just like for students, the only scary class is the first one, although exams can be a real pain (to take, or to grade).

It’s been a pleasure. As usual, if you have any questions, please let me know.

Posted in Brown University, Math 100, Mathematics, Teaching

Computing pi with tools from Calculus

Computing $\pi$

This note was originally written in the context of my fall Math 100 class at Brown University. It is also available as a pdf note.

While investigating Taylor series, we proved that
\begin{equation}\label{eq:base}
\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \frac{1}{9} + \cdots
\end{equation}
Let’s remind ourselves how. Begin with the geometric series
\begin{equation}
\frac{1}{1 + x^2} = 1 - x^2 + x^4 - x^6 + x^8 + \cdots = \sum_{n = 0}^\infty (-1)^n x^{2n}. \notag
\end{equation}
(We showed that this has interval of convergence $\lvert x \rvert < 1$). Integrating this geometric series yields
\begin{equation}
\int_0^x \frac{1}{1 + t^2} dt = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \cdots = \sum_{n = 0}^\infty (-1)^n \frac{x^{2n+1}}{2n+1}. \notag
\end{equation}
Note that this has interval of convergence $-1 \leq x \leq 1$, since the integrated series also converges at both endpoints by the alternating series test.

We also recognize this integral as
\begin{equation}
\int_0^x \frac{1}{1 + t^2} dt = \text{arctan}(x), \notag
\end{equation}
one of the common integrals arising from trigonometric substitution. Putting these together, we find that
\begin{equation}
\text{arctan}(x) = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \cdots = \sum_{n = 0}^\infty (-1)^n \frac{x^{2n+1}}{2n+1}. \notag
\end{equation}
As $x = 1$ is within the interval of convergence, we can substitute $x = 1$ into the series to find the representation
\begin{equation}
\text{arctan}(1) = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots = \sum_{n = 0}^\infty (-1)^n \frac{1}{2n+1}. \notag
\end{equation}
Since $\text{arctan}(1) = \frac{\pi}{4}$, this gives the representation for $\pi/4$ given in \eqref{eq:base}.

However, since $x=1$ was at the very edge of the interval of convergence, this series converges very, very slowly. For instance, using the first $50$ terms gives the approximation
\begin{equation}
\pi \approx 3.121594652591011. \notag
\end{equation}
The expansion of $\pi$ is actually
\begin{equation}
\pi = 3.141592653589793238462\ldots \notag
\end{equation}
So the first $50$ terms of \eqref{eq:base} give two digits of accuracy. That’s not very good.
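This is easy to reproduce. The short Python sketch below sums the first 50 terms of the series and multiplies by 4; the function name `leibniz_pi` is just a label chosen for this example.

```python
import math

def leibniz_pi(num_terms):
    """Four times the sum of the first num_terms terms of 1 - 1/3 + 1/5 - ..."""
    return 4 * sum((-1) ** n / (2 * n + 1) for n in range(num_terms))

approx = leibniz_pi(50)
print(approx)            # roughly 3.12159465259
print(math.pi - approx)  # the error is a little under 1/50
```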

I think it is very natural to ask: can we do better? This series converges slowly — can we find one that converges more quickly?

Aside

As an aside: one might also ask if we can somehow speed up the convergence of the series we already have. It turns out that in many cases, you can! For example, we know in alternating series that the sum of the whole series is between any two consecutive partial sums. So what if you took the average of two consecutive partial sums? [Equivalently, what if you added only one half of the last term in a partial sum. Do you see why these are the same?]

The average of the partial sum of the first 49 terms and the partial sum of the first 50 terms is actually
\begin{equation}
3.141796672793031, \notag
\end{equation}
which is correct to within $0.001$. That’s an improvement!
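Here is a minimal Python sketch of that averaging step, which also checks the bracketed remark: averaging two consecutive partial sums is the same as adding only half of the last term. The names `partial_sum`, `average`, and `half_step` are chosen for this example.

```python
def partial_sum(num_terms):
    """Sum of the first num_terms terms of 1 - 1/3 + 1/5 - ..."""
    return sum((-1) ** n / (2 * n + 1) for n in range(num_terms))

s49 = partial_sum(49)
s50 = partial_sum(50)

# Average of the two consecutive partial sums, scaled to approximate pi.
average = 4 * (s49 + s50) / 2

# Equivalently: take the first 49 terms and add half of the 50th term.
last_term = (-1) ** 49 / (2 * 49 + 1)  # the 50th term, -1/99
half_step = 4 * (s49 + last_term / 2)

print(average, half_step)
```

The two agree because the 50-term sum is the 49-term sum plus the last term, so the average of the two is the 49-term sum plus half of that term.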

What if you do still more? More on this can be found in the last Section.

Estimating $\pi$ through a different series

We return to the question: can we find a series that gives us $\pi$, but which converges faster? Yes we can! And we don’t have to look too far — we can continue to rely on our expansion for $\text{arctan}(x)$.

We had been using that $\text{arctan}(1) = \frac{\pi}{4}$. But we also know that $\text{arctan}(1/\sqrt{3}) = \frac{\pi}{6}$. Since $1/\sqrt{3}$ is closer to the center of the power series than $1$, we should expect that the convergence is much better.

Recall that
\begin{equation}
\text{arctan}(x) = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \cdots = \sum_{n = 0}^\infty (-1)^n \frac{x^{2n + 1}}{2n + 1}. \notag
\end{equation}
Then we have that
\begin{align}
\text{arctan}\left(\frac{1}{\sqrt 3}\right) &= \frac{1}{\sqrt 3} - \frac{1}{3(\sqrt 3)^3} + \frac{1}{5(\sqrt 3)^5} + \cdots \notag \\
&= \frac{1}{\sqrt 3} \left(1 - \frac{1}{3 \cdot 3} + \frac{1}{5 \cdot 3^2} - \frac{1}{7 \cdot 3^3} + \cdots \right) \notag \\
&= \frac{1}{\sqrt 3} \sum_{n = 0}^\infty (-1)^n \frac{1}{(2n + 1) 3^n}. \notag
\end{align}
Therefore, we have the equality
\begin{equation}
\frac{\pi}{6} = \frac{1}{\sqrt 3} \sum_{n = 0}^\infty (-1)^n \frac{1}{(2n + 1) 3^n} \notag
\end{equation}
or rather that
\begin{equation}
\pi = 2 \sqrt{3} \sum_{n = 0}^\infty (-1)^n \frac{1}{(2n + 1) 3^n}. \notag
\end{equation}
From a computational perspective, this is far superior. For instance, based on our understanding of error from the alternating series test, using the first $11$ terms of this series (those with $n \leq 10$) will approximate $\pi$ to within the size of the next term,
\begin{equation}
2 \sqrt 3 \frac{1}{23 \cdot 3^{11}} \approx 8.5 \times 10^{-7}. \notag
\end{equation}

Let’s check this.
\begin{equation}
2 \sqrt 3 \left(1 - \frac{1}{3\cdot 3} + \frac{1}{5 \cdot 3^2} + \cdots + \frac{1}{21 \cdot 3^{10}}\right) = 3.1415933045030813. \notag
\end{equation}
Look at how close that approximation is, and we only used the first $11$ terms!
Roughly speaking, each additional two terms (or so) yields another digit of $\pi$. Using the first $100$ terms would give the first 48 digits of $\pi$.
Using the first million terms would give the first 477000 (or so) digits of $\pi$, and this is definitely doable, even on a personal laptop. (On my laptop, it takes approximately 4 milliseconds to compute the first 20 digits of $\pi$ using this technique).
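The check above can be reproduced with a few lines of Python. This is a sketch in ordinary double-precision floats, so it cannot go past about 15 digits (getting 20 or more would need something like the `decimal` module). The function name is made up for this example, and the sum runs through the $n = 10$ term, matching the displayed sum.

```python
import math

def pi_from_arctan_series(num_terms):
    """2*sqrt(3) times the first num_terms terms of sum (-1)^n / ((2n+1) 3^n)."""
    total = sum((-1) ** n / ((2 * n + 1) * 3 ** n) for n in range(num_terms))
    return 2 * math.sqrt(3) * total

approx = pi_from_arctan_series(11)  # terms n = 0, 1, ..., 10

# Alternating series error bound: the size of the first omitted term (n = 11).
bound = 2 * math.sqrt(3) / (23 * 3 ** 11)

print(approx)
print(abs(approx - math.pi), bound)
```

The actual error comes out a bit smaller than the bound, as the alternating series test promises.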

Even Better Series

I think it is very natural to ask again: can we find an even faster converging series? Perhaps we can choose better values to evaluate arctan at? This turns out to be a very useful line of thought, and it leads to some of the best-known methods for evaluating $\pi$. Through clever choices of values and identities involving arctangents, one can construct extremely quickly converging series for $\pi$. For more information on this line of thought, look up Machin-like formula.
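As one illustration (not something derived in this note), the classical formula of Machin states that $\frac{\pi}{4} = 4\,\text{arctan}(1/5) - \text{arctan}(1/239)$. The Python sketch below plugs both values into the same arctangent series used above; the function names are chosen for this example.

```python
import math

def arctan_series(x, num_terms):
    """Partial sum of x - x^3/3 + x^5/5 - ... with num_terms terms."""
    return sum((-1) ** n * x ** (2 * n + 1) / (2 * n + 1) for n in range(num_terms))

def machin_pi(num_terms):
    """Approximate pi via Machin's formula, pi/4 = 4 arctan(1/5) - arctan(1/239)."""
    return 4 * (4 * arctan_series(1 / 5, num_terms) - arctan_series(1 / 239, num_terms))

print(machin_pi(10))
print(abs(machin_pi(10) - math.pi))
```

Because $1/5$ and $1/239$ are so close to the center of the power series, ten terms already exhaust ordinary floating point precision.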

Patterns in the Approximation of $\pi/4$

Looking back at the approximation of $\pi$ coming from the first $50$ terms of the series
\begin{equation}\label{eq:series_pi4_base}
1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots
\end{equation}
we found an approximation of $\pi$, which I’ll represent as $\widehat{\pi}$,
\begin{equation}
\pi \approx \widehat{\pi} = 3.121594652591011. \notag
\end{equation}
Let’s look very carefully at how this compares to $\pi$, up to the first $10$ decimals. We color the incorrect digits in ${\color{orange}{orange}}$.
\begin{align}
\pi &= 3.1415926535\ldots \notag \\
\widehat{\pi} &= 3.1{\color{orange}2}159{\color{orange}4}65{\color{orange}2}5 \notag
\end{align}
Notice that most of the digits are correct — in fact, only three (of the first ten) are incorrect! Isn’t that weird?
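If you want to locate the incorrect digits mechanically, a small Python sketch like the following does the comparison. The two strings are the ten-decimal values quoted above; the variable names are chosen for this example.

```python
# The first ten decimals of pi and of the 50-term approximation, as quoted above.
pi_digits = "3.1415926535"
approx_digits = "3.1215946525"

# Index 0 is the leading 3 and index 1 is the decimal point,
# so string index i corresponds to decimal place i - 1.
wrong_places = [
    i - 1
    for i, (a, b) in enumerate(zip(pi_digits, approx_digits))
    if a != b
]
print(wrong_places)  # [2, 6, 9]
```

So the errors sit in the 2nd, 6th, and 9th decimal places, with correct digits in between.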

It happens to be that when one uses the first $10^N / 2$ terms (for any $N$) of the series \eqref{eq:series_pi4_base}, there will be a pattern of mostly correct digits with disjoint strings of incorrect digits in the middle. This is an unusual and surprising phenomenon.

The positions of the incorrect digits can be predicted. Although I won’t go into any detail here, the positions of the errors are closely related to something called Euler Numbers or, more deeply, to Boole Summation.

Playing with infinite series leads to all sorts of interesting patterns. There is a great history of mathematicians and physicists messing around with series and stumbling across really deep ideas.

Speeding up computation

Take an alternating series
\begin{equation}
\sum_{n = 0}^\infty (-1)^{n} a_n = a_0 - a_1 + a_2 - a_3 + \cdots \notag
\end{equation}
If $\{a_n\}$ is a sequence of positive, decreasing terms with limit $0$, then the alternating series converges to some value $S$. And further, consecutive partial sums bound the value of $S$, in that
\begin{equation}
\sum_{n = 0}^{2K-1} (-1)^{n} a_n \leq S \leq \sum_{n = 0}^{2K} (-1)^{n} a_n. \notag
\end{equation}
For example,
\begin{equation}
1 - \frac{1}{3} < \sum_{n = 0}^\infty \frac{(-1)^{n}}{2n+1} < 1 - \frac{1}{3} + \frac{1}{5}. \notag
\end{equation}

Instead of approximating the value of the whole sum $S$ by the $K$th partial sum $\sum_{n < K} (-1)^n a_n$ (the sum of the first $K$ terms), it might seem reasonable to approximate $S$ by the average of the $(K-1)$st partial sum and the $K$th partial sum. Since we know $S$ is between the two, taking their average might be closer to the real result.

As mentioned above, the average of the partial sum consisting of the first $49$ terms of \eqref{eq:base} and the first $50$ terms of \eqref{eq:base} gives a much better estimate of $\pi$ than using either the first $49$ or first $50$ terms on their own. (And indeed, it works much better than even the first $500$ terms on their own).

Before we go on, let’s introduce a little notation. Let $S_K$ denote the partial sum of the first $K$ terms, i.e.
\begin{equation}
S_K = \sum_{n = 0}^{K-1} (-1)^{n} a_n. \notag
\end{equation}
Then the idea is that instead of using $S_{K}$ to approximate the whole sum $S$, we’ll use the average
\begin{equation}
\frac{S_{K-1} + S_{K}}{2} \approx S. \notag
\end{equation}

Averaging once seems like a great idea. What if we average again? That is, what if instead of using the average of $S_{K-1}$ and $S_K$, we actually use the average of (the average of $S_{K-2}$ and $S_{K-1}$) and (the average of $S_{K-1}$ and $S_K$),
\begin{equation}\label{eq:avgavg}
\frac{\frac{S_{K-2} + S_{K-1}}{2} + \frac{S_{K-1} + S_{K}}{2}}{2}.
\end{equation}
As this is really annoying to write, let’s come up with some new notation. Write the average between a quantity $X$ and $Y$ as
\begin{equation}
[X, Y] = \frac{X + Y}{2}. \notag
\end{equation}
Further, define the average of $[X, Y]$ and $[Y, Z]$ to be $[X, Y, Z]$,
\begin{equation}
[X, Y, Z] = \frac{[X, Y] + [Y, Z]}{2} = \frac{\frac{X + Y}{2} + \frac{Y + Z}{2}}{2}. \notag
\end{equation}
So the long expression in \eqref{eq:avgavg} can be written as $[S_{K-2}, S_{K-1}, S_{K}]$.
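It can help to expand the definition once. This is just the algebra written out, nothing beyond the definitions above:
\begin{equation}
[X, Y, Z] = \frac{\frac{X + Y}{2} + \frac{Y + Z}{2}}{2} = \frac{X + 2Y + Z}{4}. \notag
\end{equation}
So the double average is a weighted average in which the middle value counts twice as much as the outer two. In particular, $[S_{K-2}, S_{K-1}, S_{K}] = \frac{1}{4}\left(S_{K-2} + 2 S_{K-1} + S_{K}\right)$.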

With this notation in mind, let’s compute some numerics. Below, we give the actual value of $\pi$, the values of $S_{48}, S_{49}$, and $S_{50}$, pairwise averages, and the average-of-the-average, in the case of $1 - \frac{1}{3} + \frac{1}{5} + \cdots$.
\begin{equation} \notag
\begin{array}{c|l|l}
& \text{Value} & \text{Difference from } \pi \\ \hline
\pi & 3.141592653589793238462\ldots & \phantom{-}0 \\ \hline
4 \cdot S_{48} & 3.1207615795929895 & \phantom{-}0.020831073996803617 \\ \hline
4 \cdot S_{49} & 3.161998692995051 & -0.020406039405258092 \\ \hline
4 \cdot S_{50} & 3.121594652591011 & \phantom{-}0.01999800099878213 \\ \hline
4 \cdot [S_{48}, S_{49}] & 3.1413801362940204 & \phantom{-}0.0002125172957727628 \\ \hline
4 \cdot [S_{49}, S_{50}] & 3.1417966727930313 & -0.00020401920323820377 \\ \hline
4 \cdot [S_{48}, S_{49}, S_{50}] & 3.141588404543526 & \phantom{-}0.00000424904626727951 \\ \hline
\end{array}
\end{equation}
So using the average of averages from the three sums $S_{48}, S_{49}$, and $S_{50}$ gives $\pi$ to within $4.3 \cdot 10^{-6}$, an incredible improvement compared to $S_{50}$ on its own, which is off by about $2 \cdot 10^{-2}$.
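The rows of the table can be recomputed with a short Python sketch. The names `first_terms_sum` and `bracket` are made up for this illustration, and the sketch assumes that $S_K$ means the sum of the first $K$ terms of $1 - \frac{1}{3} + \frac{1}{5} - \cdots$, which is the indexing that appears to match the tabulated values.

```python
import math


def first_terms_sum(k):
    """Sum of the first k terms of 1 - 1/3 + 1/5 - ... (indexing assumed here)."""
    return sum((-1) ** n / (2 * n + 1) for n in range(k))


def bracket(*values):
    """[X] = X, and [X_1, ..., X_m] = ([X_1, ..., X_{m-1}] + [X_2, ..., X_m]) / 2."""
    if len(values) == 1:
        return values[0]
    return (bracket(*values[:-1]) + bracket(*values[1:])) / 2


s48, s49, s50 = (first_terms_sum(k) for k in (48, 49, 50))
rows = {
    "4 * S_48": 4 * s48,
    "4 * S_49": 4 * s49,
    "4 * S_50": 4 * s50,
    "4 * [S_48, S_49]": 4 * bracket(s48, s49),
    "4 * [S_49, S_50]": 4 * bracket(s49, s50),
    "4 * [S_48, S_49, S_50]": 4 * bracket(s48, s49, s50),
}
# Print each value next to its difference from pi, as in the table.
for label, value in rows.items():
    print(f"{label:<24} {value:.16f} {math.pi - value:+.3e}")
```

The recursive `bracket` is fine for two or three values, but it recomputes shared pieces, so it would be a poor choice for long lists.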

There is something really odd going on here. We are not computing additional summands in the overall sum \eqref{eq:base}. We are merely combining some of our partial results together in a really simple way, repeatedly. Somehow, the sequence of partial sums contains more information about the limit $S$ than individual terms, and we are able to extract some of this information.

I think there is a very natural question. What if we didn’t stop now? What if we took averages-of-averages-of-averages, and averages-of-averages-of-averages-of-averages, and so on? Indeed, we might define the average
\begin{equation}
[X, Y, Z, W] = \frac{[X, Y, Z] + [Y, Z, W]}{2}, \notag
\end{equation}
and so on for larger numbers of terms. In this case, it happens to be that
\begin{equation}
4 \cdot [S_{15}, S_{16}, \ldots, S_{50}] = 3.141592653589794,
\end{equation}
which has the first 15 digits of $\pi$ correct!

By repeatedly averaging alternating sums of just the first $50$ reciprocals of odd integers, we can find $\pi$ up to 15 digits. I think that’s incredible: it seems both harder than it might have been (as this involves lots of averaging) and much easier than it might have been (as the only arithmetic inputs are the fractions $1/(2n+1)$ for $n$ up to $50$).
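Here is a minimal Python sketch of the repeated averaging, for anyone who wants to experiment. The function names are hypothetical, and the sketch again assumes that $S_K$ is the sum of the first $K$ terms. Rather than recursing, it averages neighbours in a list until one number is left, which produces the same bracket $[S_{15}, \ldots, S_{50}]$.

```python
import math


def partial_sums(count):
    """Return [S_1, ..., S_count], with S_K the sum of the first K terms (assumed indexing)."""
    sums, total = [], 0.0
    for n in range(count):
        total += (-1) ** n / (2 * n + 1)
        sums.append(total)
    return sums


def repeated_average(values):
    """Replace the list by averages of neighbours until a single number remains."""
    values = list(values)
    while len(values) > 1:
        values = [(a + b) / 2 for a, b in zip(values, values[1:])]
    return values[0]


sums = partial_sums(50)
estimate = 4 * repeated_average(sums[14:])  # sums[14] is S_15, sums[49] is S_50
print(estimate, abs(estimate - math.pi))
```

Changing the slice (for example `sums[30:]` or `sums[:20]`) is an easy way to see how the starting index and the number of averaging rounds affect the accuracy.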

Although we leave the thread of ideas here, there are plenty of questions that I think are now asking themselves. I encourage you to ask them, and we may return to this (or related) topics in the future. I’ll see you in class.

Posted in Brown University, Expository, Math 100, Mathematics, Teaching

Series Convergence Tests with Prototypical Examples

This is a note written for my Fall 2016 Math 100 class at Brown University. We are currently learning about various tests for determining whether series converge or diverge. In this note, we collect these tests together in a single document. We give a brief description of each test, some indicators of when each test would be good to use, and a prototypical example for each. Note that we do not justify any of these tests here, since we’ve discussed that extensively in class. [But if something is unclear, send me an email or head to my office hours.] This note is here to remind us of the variety of convergence tests.

A copy of just the statements of the tests, put together, can be found here. A pdf copy of this whole post can be found here.

In order, we discuss the following tests:

  1. The $n$th term test, also called the basic divergence test
  2. Recognizing an alternating series
  3. Recognizing a geometric series
  4. Recognizing a telescoping series
  5. The Integral Test
  6. P-series
  7. Direct (or basic) comparison
  8. Limit comparison
  9. The ratio test
  10. The root test

The $n$th term test

Statement

Suppose we are looking at $\sum_{n = 1}^\infty a_n$ and
\begin{equation}
\lim_{n \to \infty} a_n \neq 0. \notag
\end{equation}
Then $\sum_{n = 1}^\infty a_n$ does not converge.

When to use it

When applicable, the $n$th term test for divergence is usually the easiest and quickest way to confirm that a series diverges. When first considering a series, it’s a good idea to think about whether the terms go to zero or not. But remember that if the limit of the individual terms is zero, then it is necessary to think harder about whether the series converges or diverges.

Example

Each of the series
\begin{equation}
\sum_{n = 1}^\infty \frac{n+1}{2n + 4}, \quad \sum_{n = 1}^\infty \cos n, \quad \sum_{n = 1}^\infty \sqrt{n} \notag
\end{equation}
diverges since its terms do not tend to $0$.
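For the first of these series, the limit of the terms can be worked out explicitly by dividing the numerator and denominator by $n$:
\begin{equation}
\lim_{n \to \infty} \frac{n+1}{2n+4} = \lim_{n \to \infty} \frac{1 + \frac{1}{n}}{2 + \frac{4}{n}} = \frac{1}{2} \neq 0. \notag
\end{equation}
For the other two, $\cos n$ does not settle down to any limit and $\sqrt{n}$ grows without bound, so in neither case do the terms tend to $0$.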

Recognizing alternating series

Statement

Suppose $\sum_{n = 1}^\infty (-1)^n a_n$ is a series where

  1. $a_n \geq 0$,
  2. $a_n$ is decreasing, and
  3. $\lim_{n \to \infty} a_n = 0$.

Then $\sum_{n = 1}^\infty (-1)^n a_n$ converges.

Stated differently, if the terms are alternating sign, decreasing in absolute size, and converging to zero, then the series converges.

When to use it

The key is in the name: if the series is alternating, then this is the go-to idea of analysis. Note that if the terms of a series are alternating and decreasing, but the terms do not go to zero, then the series diverges by the $n$th term test.

Example

Suppose we are looking at the series
\begin{equation}
\sum_{n = 1}^\infty \frac{(-1)^n}{\log(n+1)} = \frac{-1}{\log 2} + \frac{1}{\log 3} + \frac{-1}{\log 4} + \cdots \notag
\end{equation}
The terms are alternating.
The sizes of the terms are $\frac{1}{\log (n+1)}$, and these are decreasing.
Finally,
\begin{equation}
\lim_{n \to \infty} \frac{1}{\log(n+1)} = 0. \notag
\end{equation}
Thus the alternating series test applies and shows that this series converges.
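To see what this convergence looks like numerically, here is a small Python sketch (the names are illustrative only). It computes partial sums of the series above and splits them into those ending on a negative term and those ending on a positive term. The first group creeps upward, the second creeps downward, and the sum is trapped between them.

```python
import math


def partial_sums(count):
    """Return [S_1, ..., S_count] for the series sum_{n >= 1} (-1)^n / log(n + 1)."""
    sums, total = [], 0.0
    for n in range(1, count + 1):
        total += (-1) ** n / math.log(n + 1)
        sums.append(total)
    return sums


sums = partial_sums(2000)
odd = sums[0::2]   # S_1, S_3, S_5, ... (each ends on a negative term)
even = sums[1::2]  # S_2, S_4, S_6, ... (each ends on a positive term)
print("last odd partial sum: ", odd[-1])
print("last even partial sum:", even[-1])
print("gap between them:     ", even[-1] - odd[-1])
```

The gap between consecutive partial sums is exactly the size of the last term, $\frac{1}{\log(n+1)}$, which shrinks very slowly. The test guarantees convergence, but says nothing about it being fast.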

(more…)

Posted in Brown University, Math 100, Mathematics, Teaching

Math 100: Completing the partial fractions example from class

An Unfinished Example

At the end of class today, someone asked if we could do another example of a partial fractions integral involving an irreducible quadratic. We decided to look at the integral

$$ \int \frac{1}{(x^2 + 4)(x+1)}dx. $$
Notice that $latex {x^2 + 4}$ is an irreducible quadratic polynomial. So when setting up the partial fraction decomposition, we treat the $latex {x^2 + 4}$ term as a whole.

So we seek to find a decomposition of the form

$$ \frac{1}{(x^2 + 4)(x+1)} = \frac{A}{x+1} + \frac{Bx + C}{x^2 + 4}. $$
Now that we have the decomposition set up, we need to solve for $latex {A,B,}$ and $latex {C}$ using whatever methods we feel most comfortable with. Multiplying through by $latex {(x^2 + 4)(x+1)}$ leads to

$$ 1 = A(x^2 + 4) + (Bx + C)(x+1) = (A + B)x^2 + (B + C)x + (4A + C). $$
Matching up coefficients leads to the system of equations

$$\begin{align}
0 &= A + B \\
0 &= B + C \\
1 &= 4A + C.
\end{align}$$
So we learn that $latex {A = -B = C}$, and $latex {A = 1/5}$. So $latex {B = -1/5}$ and $latex {C = 1/5}$.

Together, this means that

$$ \frac{1}{(x^2 + 4)(x+1)} = \frac{1}{5}\frac{1}{x+1} + \frac{1}{5} \frac{-x + 1}{x^2 + 4}. $$
Recall that if you wanted to, you could check this decomposition by finding a common denominator and checking through.
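Carrying out that check explicitly:

$$ \frac{1}{5}\frac{1}{x+1} + \frac{1}{5} \frac{-x + 1}{x^2 + 4} = \frac{1}{5} \cdot \frac{(x^2 + 4) + (1 - x)(1 + x)}{(x^2 + 4)(x+1)} = \frac{1}{5} \cdot \frac{x^2 + 4 + 1 - x^2}{(x^2 + 4)(x+1)} = \frac{1}{(x^2 + 4)(x+1)}. $$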

Now that we have performed the decomposition, we can return to the integral. We now have that

$$ \int \frac{1}{(x^2 + 4)(x+1)}dx = \underbrace{\int \frac{1}{5}\frac{1}{x+1}dx}_ {\text{first integral}} + \underbrace{\int \frac{1}{5} \frac{-x + 1}{x^2 + 4} dx.}_ {\text{second integral}} $$
We can handle both of the integrals on the right hand side.

The first integral is

$$ \frac{1}{5} \int \frac{1}{x+1} dx = \frac{1}{5} \ln (x+1) + C. $$

The second integral is a bit more complicated. It’s good to see if there is a simple $latex {u}$-substitution, since there is an $latex {x}$ in the numerator and an $latex {x^2}$ in the denominator. But unfortunately, this integral needs to be further broken into two pieces that we know how to handle separately.

$$ \frac{1}{5} \int \frac{-x + 1}{x^2 + 4} dx = \underbrace{\frac{-1}{5} \int \frac{x}{x^2 + 4}dx}_ {\text{first piece}} + \underbrace{\frac{1}{5} \int \frac{1}{x^2 + 4}dx.}_ {\text{second piece}} $$

The first piece is now a $latex {u}$-substitution problem with $latex {u = x^2 + 4}$. Then $latex {du = 2x dx}$, and so

$$ \frac{-1}{5} \int \frac{x}{x^2 + 4}dx = \frac{-1}{10} \int \frac{du}{u} = \frac{-1}{10} \ln u + C = \frac{-1}{10} \ln (x^2 + 4) + C. $$

The second piece is one of the classic trig substitutions. So we draw a triangle.

triangle

In this triangle, thinking of the bottom-left angle as $latex {\theta}$ (sorry, I forgot to label it), then we have that $latex {2\tan \theta = x}$ so that $latex {2 \sec^2 \theta d \theta = dx}$. We can express the so-called hard part of the triangle by $latex {2\sec \theta = \sqrt{x^2 + 4}}$.

Going back to our integral, we can think of $latex {x^2 + 4}$ as $latex {(\sqrt{x^2 + 4})^2}$ so that $latex {x^2 + 4 = (2 \sec \theta)^2 = 4 \sec^2 \theta}$. We can now write our integral as

$$ \frac{1}{5} \int \frac{1}{x^2 + 4}dx = \frac{1}{5} \int \frac{1}{4 \sec^2 \theta} 2 \sec^2 \theta d \theta = \frac{1}{5} \int \frac{1}{2} d\theta = \frac{1}{10} \theta. $$
As $latex {2 \tan \theta = x}$, we have that $latex {\theta = \text{arctan}(x/2)}$. Inserting this into our expression, we have

$$ \frac{1}{5} \int \frac{1}{x^2 + 4} dx = \frac{1}{10} \text{arctan}(x/2) + C. $$

Combining the first integral and the first and second parts of the second integral together (and combining all the constants $latex {C}$ into a single constant, which we also denote by $latex {C}$), we reach the final expression

$$ \int \frac{1}{(x^2 + 4)(x + 1)} dx = \frac{1}{5} \ln (x+1) – \frac{1}{10} \ln(x^2 + 4) + \frac{1}{10} \text{arctan}(x/2) + C. $$

And this is the answer.
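One quick way to sanity-check an antiderivative is to differentiate it numerically and compare with the integrand. The following Python sketch does this with a central difference. The function names and the sample points are choices made for this illustration, and it takes $latex {C = 0}$ and $latex {x > -1}$ so that the logarithm is defined.

```python
import math


def integrand(x):
    return 1 / ((x * x + 4) * (x + 1))


def antiderivative(x):
    """The final expression from the text with C = 0; valid for x > -1."""
    return (math.log(x + 1) / 5
            - math.log(x * x + 4) / 10
            + math.atan(x / 2) / 10)


def central_difference(func, x, h=1e-5):
    """Approximate func'(x) by the symmetric difference quotient."""
    return (func(x + h) - func(x - h)) / (2 * h)


for x in (0.0, 0.5, 1.0, 3.0, 10.0):
    slope = central_difference(antiderivative, x)
    print(f"x = {x:5.1f}   F'(x) ~ {slope:.8f}   integrand = {integrand(x):.8f}")
```

The two columns should agree to many decimal places. Agreement at a handful of points is not a proof, but a mistake in a coefficient or sign would show up immediately.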

Other Notes

If you have any questions or concerns, please let me know. As a reminder, I have office hours on Tuesday from 9:30–11:30 (or perhaps noon) in my office, and I highly recommend attending the Math Resource Center in the Kassar House from 8pm-10pm, offered Monday-Thursday. [Especially on Tuesday and Thursdays, when there tend to be fewer people there].

On my course page, I have linked to two additional resources. One is to Paul’s Online Math notes for partial fraction decomposition (which I think is quite a good resource). The other is to the Khan Academy for some additional worked through examples on polynomial long division, in case you wanted to see more worked examples. This note can also be found on my website, or in pdf form.

Good luck, and I’ll see you in class.

Posted in Math 100, Mathematics, Teaching

Math 100 Fall 2013: Concluding Remarks

This is a post written towards my students in Calc II, Math 100 at Brown University, fall 2013. There will be many asides, written in italics. They are to serve as clarifications of method or true asides, to be digested or passed over.

The semester is over. All the grades are in and known, and fall 2013 draws to a close. As you know, I’m interested in the statistics behind the course. I’d mentioned my previous analysis about the extremely high correlation between first midterm and final grade (much higher than I would have thought!). Let’s reveal the statistics and distribution of this course, below the fold.

(more…)

Posted in Brown University, Math 100, Mathematics, Teaching

My Teaching

In fall 2016, I am currently teaching Math 100 (second semester calculus, starting with integration by parts and going through sequences and series) at Brown University.

In spring 2016, I designed and taught Math 42 (elementary number theory) at Brown University. My students were exceptional — check out a showcase of some of their final projects.

In fall 2014, I taught Math 170 (advanced placement second semester calculus) at Brown University.

I taught number theory in the Summer@Brown program for high school students in the summers of 2013-2015.

I taught a privately requested course in precalculus in the summer of 2013.

I have served as a TA (many, many, many times) for

  • Math 90 (first semester calculus) at Brown University
  • Math 100 (second semester calculus) at Brown University
  • Math 1501 (first semester calculus) at Georgia Tech
  • Math 1502 (second semester calculus, starting with sequences and series but also with 7 weeks of linear algebra) at Georgia Tech
  • Math 2401 (multivariable calculus) at Georgia Tech (there’s essentially no content on this site about this – this was just before I began to maintain a website)

I sometimes tutor at Brown (but not limited to Brown students) and around Boston, on a wide variety of topics (not just the ordinary, boring ones). I charge $80/hour, but I am not currently looking for tutees.

Below, you can find my most recent posts tagged under “Teaching”.

Posted in Brown University, Math 100, Teaching

Math 100: Before second midterm

You have a midterm next week, and it’s not going to be a cakewalk.

As requested, I’m uploading the last five weeks’ worth of worksheets, with (my) solutions. A comment on the solutions: not everything is presented in full detail, but most things are presented with most detail (except for the occasional one that is far far beyond what we actually expect you to be able to do). If you have any questions about anything, let me know. Even better, ask it here – maybe others have the same questions too.

Without further ado –

And since we were unable to go over the quiz in my afternoon recitation today, I’m attaching a worked solution to the quiz as well.

Again, let me know if you have any questions. I will still have my office hours on Tuesday from 2:30-4:30pm in my office (I’m aware that this happens to be immediately before the exam – status not by design). And I’ll be more or less responsive by email.

Study study study!

Posted in Brown University, Math 100, Mathematics

Math 100: Week 4

This is a post for my math 100 calculus class of fall 2013. In this post, I give the 4th week’s recitation worksheet (no solutions yet – I’m still writing them up). More pertinently, we will also go over the most recent quiz and common mistakes. Trig substitution, it turns out, is not so easy.

Before we hop into the details, I’d like to encourage you all to avail yourselves of each other, your professor, your TA, and the MRC in preparation for the first midterm (next week!).

1. The quiz

There were two versions of the quiz this week, but they were very similar. Both asked about a particular trig substitution. One was

$latex \displaystyle \int_3^6 \sqrt{36 – x^2} \mathrm{d} x $

And the other was

$latex \displaystyle \int_{2\sqrt 2}^4 \sqrt{16 – x^2} \mathrm{d}x. $

They are very similar, so I’m only going to go over one of them. I’ll go over the first one. We know we are to use trig substitution. I see two ways to proceed: either draw a reference triangle (which I recommend), or think through the Pythagorean trig identities until you find the one that works here (which I don’t recommend).

We see a $latex {\sqrt{36 – x^2}}$, and this is hard to deal with. Let’s draw a right triangle that has $latex {\sqrt{36 – x^2}}$ as a side. I’ve drawn one below. (Not fancy, but I need a better light).

In this picture, note that $latex {\sin \theta = \frac{x}{6}}$, or that $latex {x = 6 \sin \theta}$, and that $latex {\sqrt{36 - x^2} = 6 \cos \theta}$. If we substitute $latex {x = 6 \sin \theta}$ in our integral, this means that we can replace our $latex {\sqrt{36 - x^2}}$ with $latex {6 \cos \theta}$. But this is a substitution, so we need to think about $latex {\mathrm{d} x}$ too. Here, $latex {x = 6 \sin \theta}$ means that $latex {\mathrm{d}x = 6 \cos \theta \, \mathrm{d}\theta}$.

Some people used the wrong trig substitution, meaning they used $latex {x = \tan \theta}$ or $latex {x = \sec \theta}$, and got stuck. It’s okay to get stuck, but if you notice that something isn’t working, it’s better to try something else than to stare at the paper for 10 minutes. Other people use $latex {x = 6 \cos \theta}$, which is perfectly doable and parallel to what I write below.

Another common error was people forgetting about the $latex {\mathrm{d}x}$ term entirely. But it’s important!

Substituting these into our integral gives

$latex \displaystyle \int_{?}^{??} 36 \cos^2 (\theta) \mathrm{d}\theta, $

where I have included question marks for the limits because, as after most substitutions, they are different. You have a choice: you might go on and put everything back in terms of $latex {x}$ before you give your numerical answer; or you might find the new limits now.

It’s not correct to continue writing down the old limits. The variable has changed, and we really don’t want $latex {\theta}$ to go from $latex {3}$ to $latex {6}$.

If you were to find the new limits, then you need to consider: if $latex {x=3}$ and $latex {\frac{x}{6} = \sin \theta}$, then we want a $latex {\theta}$ such that $latex {\sin \theta = \frac{3}{6}= \frac{1}{2}}$, so we might use $latex {\theta = \pi/6}$. Similarly, when $latex {x = 6}$, we want $latex {\theta}$ such that $latex {\sin \theta = 1}$, like $latex {\theta = \pi/2}$. Note that these were two arcsine calculations, which we would have to do even if we waited until after we put everything back in terms of $latex {x}$ to evaluate.

Some people left their answers in terms of these arcsines. As far as mistakes go, this isn’t a very serious one. But this is the sort of simplification that is expected of you on exams, quizzes, and homeworks. In particular, if something can be written in a much simpler way through the unit circle, then you should do it if you have the time.

So we could rewrite our integral as

$latex \displaystyle \int_{\pi/6}^{\pi/2} 36 \cos^2 (\theta) \mathrm{d}\theta. $

How do we integrate $latex {\cos^2 \theta}$? We need to make use of the identity $latex {\cos^2 \theta = \dfrac{1 + \cos 2\theta}{2}}$. You should know this identity for this midterm. Now we have

$latex \displaystyle 36 \int_{\pi/6}^{\pi/2}\left(\frac{1}{2} + \frac{\cos 2 \theta}{2}\right) \mathrm{d}\theta = 18 \int_{\pi/6}^{\pi/2}\mathrm{d}\theta + 18 \int_{\pi/6}^{\pi/2}\cos 2\theta \mathrm{d}\theta. $

The first integral is extremely simple and yields $latex {6\pi}$. The second integral has antiderivative $latex {\dfrac{\sin 2 \theta}{2}}$ (Don’t forget the $latex {2}$ on bottom!), and we have to evaluate $latex {\big[9 \sin 2 \theta \big]_{\pi/6}^{\pi/2}}$, which gives $latex {-\dfrac{9 \sqrt 3}{2}}$. You should know the unit circle sufficiently well to evaluate this for your midterm.

And so the final answer is $latex {6 \pi - \dfrac{9 \sqrt 3}{2} \approx 11.0553}$. (You don’t need to be able to do that approximation).
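If you ever want to double-check an answer like this one, a crude numerical integration is enough. The sketch below uses the midpoint rule, which is my choice for this illustration and not something from the quiz, and compares it with $latex {6\pi - \frac{9\sqrt 3}{2}}$. The helper name `midpoint_rule` and the number of pieces are arbitrary.

```python
import math


def midpoint_rule(func, a, b, pieces=200_000):
    """Approximate the integral of func over [a, b] with equal-width rectangles."""
    width = (b - a) / pieces
    return width * sum(func(a + (i + 0.5) * width) for i in range(pieces))


numeric = midpoint_rule(lambda x: math.sqrt(36 - x * x), 3, 6)
exact = 6 * math.pi - 9 * math.sqrt(3) / 2
print("midpoint rule:", numeric)
print("closed form:  ", exact)
```

The midpoints never reach $latex {x = 6}$, so the square root is always taken of a positive number.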

Let’s go back a moment and suppose you didn’t reëvaluate the limits once you substituted in $latex {\theta}$. Then, following the same steps as above, you’d be left with

$latex \displaystyle 18 \int_{?}^{??}\mathrm{d}\theta + 18 \int_{?}^{??}\cos 2\theta \mathrm{d}\theta = \left[ 18 \theta \right]_?^{??} + \left[ 9 \sin 2 \theta \right]_?^{??}. $

Since $latex {\frac{x}{6} = \sin \theta}$, we know that $latex {\theta = \arcsin (x/6)}$. This is how we evaluate the left integral, and we are left with $latex {[18 \arcsin(x/6)]_3^6}$. This means we need to know the arcsine of $latex {1}$ and $latex {\frac 12}$. These are exactly the same two arcsine computations that I referenced above! Following them again, we get $latex {6\pi}$ as the answer.

We could do the same for the second part, since $latex {\sin ( 2 \arcsin (x/6))}$ when $latex {x = 3}$ is $latex {\sin (2 \arcsin \frac{1}{2} ) = \sin (2 \cdot \frac{\pi}{6} ) = \frac{\sqrt 3}{2}}$; and when $latex {x = 6}$ we get $latex {\sin (2 \arcsin 1) = \sin (2 \cdot \frac{\pi}{2}) = \sin (\pi) = 0}$.

Putting these together, we see that the answer is again $latex {6\pi – \frac{9\sqrt 3}{2}}$.

Or, throwing yet another option out there, we could do something else (a little bit wittier, maybe?). We have this $latex {\sin 2\theta}$ term to deal with. You might recall that $latex {\sin 2 \theta = 2 \sin \theta \cos \theta}$, the so-called double-angle identity.

Then $latex {9 \sin 2\theta = 18 \sin \theta \cos \theta}$. Going back to our reference triangle, we know that $latex {\cos \theta = \dfrac{\sqrt{36 – x^2}}{6}}$ and that $latex {\sin \theta = \dfrac{x}{6}}$. Putting these together,

$latex \displaystyle 9 \sin 2 \theta = \dfrac{ x\sqrt{36 – x^2} }{2}. $

When $latex {x=6}$, this is $latex {0}$. When $latex {x = 3}$, we have $latex {\dfrac{ 3\sqrt {27}}{2} = \dfrac{9\sqrt 3}{2}}$.

And fortunately, we get the same answer again at the end of the day. (phew).

2. The worksheet

Finally, here is the worksheet for the day. I’m working on their solutions, and I’ll have that up by late this evening (sorry for the delay).

Ending tidbits – when I was last a TA, I tried to see what were the good predictors of final grade. Some things weren’t very surprising – there is a large correlation between exam scores and final grade. Some things were a bit surprising – low homework scores correlated well with low final grade, but high homework scores didn’t really have a strong correlation with final grade at all; attendance also correlated weakly. But one thing that really stuck with me was the first midterm grade vs final grade in class: it was really strong. For a bit more on that, I refer you to my final post from my Math 90 posts.

Posted in Brown University, Math 100, Mathematics

Math 100: Week 3 and pre-midterm

This is a post for my Math 100 class of fall 2013. In this post, I give the first three weeks’ worksheets from recitation and the set of solutions to week three’s worksheet, as well as a few administrative details.

Firstly, here is the recitation work from the first three weeks:

  1. (there was no recitation the first week)
  2. A worksheet focusing on review.
  3. A worksheet focusing on integration by parts and u-substitution, with solutions.

In addition, I’d like to remind you that I have office hours from 2-4pm (right now) in Kassar 018. I’ve had multiple people set up appointments with me outside of these hours, which I’m tempted to interpret as suggesting that I change when my office hours are. If you have a preference, let me know, and I’ll try to incorporate it.

Finally, there will be an exam next Tuesday. I’ve been getting a lot of emails about what material will be on the exam. The answer is that everything you have learned up to now and by the end of this week is fair game for exam material. This also means there could be exam questions on material that we have not discussed in recitation. So be prepared. However, I will be setting aside a much larger portion of recitation this Thursday for questions than normal. So come prepared with your questions.

Best of luck, and I’ll see you in class on Thursday.

Posted in Brown University, Math 100, Mathematics

An intuitive introduction to calculus

This is a post written for my fall 2013 Math 100 class but largely intended for anyone with knowledge of what a function is and a desire to know what calculus is all about. Calculus is made out to be the pinnacle of the high school math curriculum, and correspondingly is thought to be very hard. But the difficulty is bloated, blown out of proportion. In fact, the ideas behind calculus are approachable and even intuitive if thought about in the right way.

Many people managed to stumble across the page before I’d finished all the graphics. I’m sorry, but they’re all done now! I was having trouble interpreting how WordPress was going to handle my gif files – it turns out that they automagically resize them if you don’t make them of the correct size, which makes them not display. It took me a bit to realize this. I’d like to mention that this actually started as a 90 minute talk I had with my wife over coffee, so perhaps an alternate title would be “Learning calculus in 2 hours over a cup of coffee.”

So read on if you would like to understand what calculus is, or if you’re looking for a refresher of the concepts from a first semester in calculus (like for Math 100 students at Brown), or if you’re looking for a bird’s eye view of AP Calc AB subject material.

1. An intuitive and semicomplete introduction to calculus

We will think of a function $ {f(\cdot)}$ as something that takes an input $ {x}$ and gives out another number, which we’ll denote by $ {f(x)}$. We know functions like $ {f(x) = x^2 + 1}$, which means that if I put in a number $ {x}$ then the function returns the number $ {f(x) = x^2 + 1}$. So I put in $ {1}$, I get $ {1^2 + 1 = 2}$, i.e. $ {f(1) = 2}$. Primary and secondary school overly condition students to think of functions in terms of a formula or equation. The important thing to remember is that a function is really just something that gives an output when given an input, and if the same input is given later then the function spits the same output out. As an aside, I should mention that the most common problem I’ve seen in my teaching and tutoring is a fundamental misunderstanding of functions and their graphs.

For a function that takes in and spits out numbers, we can associate a graph. A graph is a two-dimensional representation of our function, where by convention the input is put on the horizontal axis and the output is put on the vertical axis. Each axis is numbered, and in this way we can identify any point in the graph by its coordinates, i.e. its horizontal and vertical position. A graph of a function $ {f(x)}$ includes a point $ {(x,y)}$ if $ {y = f(x)}$.

The graph of the function $ x^2 + 1$ is in blue. The emphasized point appears on the graph because it is of the form $ (x, f(x))$. In particular, this point is $ (1, 2)$.

Thus each point on the graph is really of the form $ {(x, f(x))}$. A large portion of algebra I and II is devoted to being able to draw graphs for a variety of functions. And if you think about it, graphs contain a huge amount of information. Graphing $ {f(x)= x^2 + 1}$ involves drawing an upwards-facing parabola, which really represents an infinite number of points. That’s pretty intense, but it’s not what I want to focus on here.

1.1. Generalizing slope – introducing the derivative

You might recall the idea of the ‘slope’ of a line. A line has a constant ratio of how much the $ {y}$ value changes for a specific change in $ {x}$, which we call the slope (people always seem to remember rise over run). In particular, if a line passes through the points $ {(x_1, y_1)}$ and $ {(x_2, y_2)}$, then its slope will be the vertical change $ {y_2 – y_1}$ divided by the horizontal change $ {x_2 – x_1}$, or $ {\dfrac{y_2 – y_1}{x_2 – x_1}}$.

The graph of a line appears in blue. The two points $ (0,1)$ and $ (1,3)$ are shown on the line. The horizontal red line shows the horizontal change. The vertical red line shows the vertical change. The ‘slope’ of the blue line is the length of the vertical red line divided by the length of the horizontal red line.

So if the line is given by an equation $ {f(x) = \text{something}}$, then the slope from two inputs $ {x_1}$ and $ {x_2}$ is $ {\dfrac{f(x_2) – f(x_1)}{x_2 – x_1}}$. As an aside, for those that remember things like the ‘standard equation’ $ {y = mx + b}$ or ‘point-slope’ $ {(y – y_0) = m(x – x_0)}$ but who have never thought or been taught where these come from: the claim that lines are the curves of constant slope is saying that for any choice of $ {(x_1, y_1)}$ on the line, we expect $ {\dfrac{y_2 – y_1}{x_2 – x_1} = m}$ a constant, which I denote by $ {m}$ for no particularly good reason other than the fact that some textbook author long ago did such a thing. Since we’re allowing ourselves to choose any $ {(x_1, y_1)}$, we might drop the subscripts – since they usually mean a constant – and rearrange our equation to give $ {y_2 – y = m(x_2 – x)}$, which is what has been so unkindly drilled into students’ heads as the ‘point-slope form.’ This is why lines have a point-slope form, and a reason that it comes up so much is that it comes so naturally from the defining characteristic of a line, i.e. constant slope.

But one cannot speak of the ‘slope’ of a parabola.

The parabola $ f(x) = x^2 + 1$ is shown in blue. Slope is a measure of how much the function $ f(x)$ changes when $ x$ is changed. Some tangent lines to the parabola are shown in red. The slope of each line seems like it should be the ‘slope’ of the parabola where the line touches the parabola, but these slopes are different.

Intuitively, we look at our parabola $ {x^2 + 1}$ and see that the ‘slope,’ or an estimate of how much the function $ {f(x)}$ changes with a change in $ {x}$, seems to be changing depending on what $ {x}$ values we choose. (This should make sense – if it didn’t change, and had constant slope, then it would be a line). The first major goal of calculus is to come up with an idea of a ‘slope’ for non-linear functions. I should add that we already know a sort of ‘instantaneous rate of change’ of a nonlinear function. When we’re in a car and we’re driving somewhere, we’re usually speeding up or slowing down, and our pace isn’t usually linear. Yet our speedometer still manages to say how fast we’re going, which is an immediate rate of change. So if we had a function $ {p(t)}$ that gave us our position at a time $ {t}$, then the slope would give us our velocity (change in position per change in time) at a moment. So without knowing it, we’re familiar with a generalized slope already. Now in our parabola, we don’t expect a constant slope, so we want to associate a ‘slope’ to each input $ {x}$. In other words, we want to be able to understand how rapidly the function $ {f(x)}$ is changing at each $ {x}$, analogous to how the slope $ {m}$ of a line $ {g(x) = mx + b}$ tells us that if we change our input by an amount $ {h}$ then our output value will change by $ {mh}$.

How does calculus do that? The idea is to get closer and closer approximations. Suppose we want to find the ‘slope’ of our parabola at the point $ {x = 1}$. Let’s get an approximate answer. The slope of the line coming from inputs $ {x = 1}$ and $ {x = 2}$ is a (poor) approximation. In particular, since we’re working with $ {f(x) = x^2 + 1}$, we have that $ {f(2) = 5}$ and $ {f(1) = 2}$, so that the ‘approximate slope’ from $ {x = 1}$ and $ {x = 2}$ is $ {\frac{5 – 2}{2 – 1} = 3}$. But looking at the graph,

The parabola $ x^2 + 1$ is shown in blue, and the line going through the points $ (1,2)$ and $ (2,5)$ is shown. The line immediately goes above and crosses the parabola, so it seems like this line is rising faster (changing faster) than the parabola. It’s too steep, and the slope is too high to reflect the ‘slope’ of the parabola at the indicated point.

we see that it feels like this slope is too large. So let’s get closer. Suppose we use inputs $ {x = 1}$ and $ {x = 1.5}$. We get that the approximate slope is $ {\frac{3.25 – 2}{1.5 – 1} = 2.5}$. If we were to graph it, this would also feel too large. So we can keep choosing smaller and smaller changes, like using $ {x = 1}$ and $ {x = 1.1}$, or $ {x = 1}$ and $ {x = 1.01}$, and so on. This next graphic contains these approximations, with chosen points getting closer and closer to $ {1}$.

The parabola $ x^2 + 1$ is shown in blue. Two points are chosen on the parabola and the line between them is drawn in red. As the points get closer to each other, the red line indicates the rate of growth of the parabola at the point $ (1,2)$ better and better. So the slope of the red lines seems to be getting closer to the ‘slope’ of the parabola at $ (1,2)$.

Let’s look a little closer at the values we’re getting for our slopes when we use $ {1}$ and $ {2, 1.5, 1.1, 1.01, 1.001}$ as our inputs. We get

$ \displaystyle \begin{array}{c|c} \text{second input} & \text{approx. slope} \\ \hline 2 & 3 \\ 1.5 & 2.5 \\ 1.1 & 2.1 \\ 1.01 & 2.01 \\ 1.001 & 2.001 \end{array} $
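The table is easy to reproduce, and to extend to inputs even closer to $ {1}$, with a few lines of Python. This is just an illustrative sketch and the function names are my own.

```python
def f(x):
    return x * x + 1


def approximate_slope(second_input, first_input=1.0):
    """Slope of the line through (first_input, f(first_input)) and (second_input, f(second_input))."""
    return (f(second_input) - f(first_input)) / (second_input - first_input)


for second_input in (2, 1.5, 1.1, 1.01, 1.001):
    print(f"{second_input:<6} {approximate_slope(second_input):.6f}")
```

Trying second inputs slightly below $ {1}$, like $ {0.9}$ or $ {0.99}$, gives slopes slightly below $ {2}$ instead.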

It looks like the approximate slopes are approaching $ {2}$. What if we plot the graph with a line of slope $ {2}$ going through the point $ {(1,2)}$?

The parabola $ x^2 + 1$ is shown in blue. The line in red has slope $ 2$ and goes through the point $ (1,2)$. We got this line by continuing the successive approximations done above. It looks like it accurately indicates the ‘slope’ of the parabola at $ (1,2)$.

It looks great! Let’s zoom in a whole lot.

When we zoom in, the blue parabola looks almost like a line, and the red line looks almost like the parabola! This is why we are measuring the ‘slope’ of the parabola in this fashion – when we zoom in, it looks more and more like a line, and we are getting the slope of that line.

That looks really close! In fact, what I’ve been allowing as the natural feeling slope, or local rate of change, is really the slope of the line tangent to the graph of our function at the point $ {(1, f(1))}$. In a calculus class, you’ll spend a bit of time making sense of what it means for the approximate slopes to ‘approach’ $ {2}$. This is called a ‘limit,’ and the details are not important to us right now. The important thing is that this lets us get an idea of a ‘slope’ at a point on a parabola. It’s not really a slope, because a parabola isn’t a line. So we’ve given it a different name – we call this ‘the derivative.’ So the derivative of $ {f(x) = x^2 + 1}$ at $ {x = 1}$ is $ {2}$, i.e. right around $ {x = 1}$ we expect a rate of change of $ {2}$, so that we expect $ {f(1 + h) - f(1) \approx 2h}$. If you think about it, we’re saying that we can approximate $ {f(x) = x^2 + 1}$ near the point $ {(1, 2)}$ by the line shown in the graph above: this line passes through $ {(1,2)}$ and its slope is $ {2}$, what we’re calling the slope of $ {f(x) = x^2 + 1}$ at $ {x = 1}$.
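For this particular function, the approach to $ {2}$ can be seen directly by expanding the square:

$ \displaystyle f(1 + h) - f(1) = \big( (1 + h)^2 + 1 \big) - 2 = 2h + h^2, \qquad \text{so} \qquad \frac{f(1 + h) - f(1)}{h} = 2 + h. $

Taking $ {h = 1, 0.5, 0.1, 0.01, 0.001}$ gives exactly the approximate slopes $ {3, 2.5, 2.1, 2.01, 2.001}$ from the table, and as $ {h}$ shrinks the extra $ {h}$ disappears, leaving $ {2}$.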

Let’s generalize. We were able to speak of the derivative at one point, but how about other points? The rest of this post is below the ‘more’ tag below.

(more…)

Posted in Brown University, Expository, Math 100, Mathematics