Tag Archives: taylor series

Computing pi with tools from Calculus

Computing $\pi$

This note was originally written in the context of my fall Math 100 class at Brown University. It is also available as a pdf note.

While investigating Taylor series, we proved that
\begin{equation}\label{eq:base}
\frac{\pi}{4} = 1 – \frac{1}{3} + \frac{1}{5} – \frac{1}{7} + \frac{1}{9} + \cdots
\end{equation}
Let’s remind ourselves how. Begin with the geometric series
\begin{equation}
\frac{1}{1 + x^2} = 1 – x^2 + x^4 – x^6 + x^8 + \cdots = \sum_{n = 0}^\infty (-1)^n x^{2n}. \notag
\end{equation}
(We showed that this has interval of convergence $\lvert x \rvert < 1$). Integrating this geometric series yields
\begin{equation}
\int_0^x \frac{1}{1 + t^2} dt = x – \frac{x^3}{3} + \frac{x^5}{5} – \frac{x^7}{7} + \cdots = \sum_{n = 0}^\infty (-1)^n \frac{x^{2n+1}}{2n+1}. \notag
\end{equation}
Note that this has interval of convergence $-1 < x \leq 1$.

We also recognize this integral as
\begin{equation}
\int_0^x \frac{1}{1 + t^2} dt = \text{arctan}(x), \notag
\end{equation}
one of the common integrals arising from trigonometric substitution. Putting these together, we find that
\begin{equation}
\text{arctan}(x) = x – \frac{x^3}{3} + \frac{x^5}{5} – \frac{x^7}{7} + \cdots = \sum_{n = 0}^\infty (-1)^n \frac{x^{2n+1}}{2n+1}. \notag
\end{equation}
As $x = 1$ is within the interval of convergence, we can substitute $x = 1$ into the series to find the representation
\begin{equation}
\text{arctan}(1) = 1 – \frac{1}{3} + \frac{1}{5} – \frac{1}{7} + \cdots = \sum_{n = 0}^\infty (-1)^n \frac{1}{2n+1}. \notag
\end{equation}
Since $\text{arctan}(1) = \frac{\pi}{4}$, this gives the representation for $\pi/4$ given in \eqref{eq:base}.

However, since $x=1$ was at the very edge of the interval of convergence, this series converges very, very slowly. For instance, using the first $50$ terms gives the approximation
\begin{equation}
\pi \approx 3.121594652591011. \notag
\end{equation}
The expansion of $\pi$ is actually
\begin{equation}
\pi = 3.141592653589793238462\ldots \notag
\end{equation}
So the first $50$ terms of \eqref{eq:base} gives two digits of accuracy. That’s not very good.

I think it is very natural to ask: can we do better? This series converges slowly — can we find one that converges more quickly?

Aside

As an aside: one might also ask if we can somehow speed up the convergence of the series we already have. It turns out that in many cases, you can! For example, we know in alternating series that the sum of the whole series is between any two consecutive partial sums. So what if you took the average of two consecutive partial sums? [Equivalently, what if you added only one half of the last term in a partial sum. Do you see why these are the same?]

The average of the partial sum of the first 49 terms and the partial sum of the first 50 terms is actually
\begin{equation}
3.141796672793031, \notag
\end{equation}
which is correct to within $0.001$. That’s an improvement!

What if you do still more? More on this can be found in the last Section.

Estimating $\pi$ through a different series

We return to the question: can we find a series that gives us $\pi$, but which converges faster? Yes we can! And we don’t have to look too far — we can continue to rely on our expansion for $\text{arctan}(x)$.

We had been using that $\text{arctan}(1) = \frac{\pi}{4}$. But we also know that $\text{arctan}(1/\sqrt{3}) = \frac{\pi}{6}$. Since $1/\sqrt{3}$ is closer to the center of the power series than $1$, we should expect that the convergence is much better.

Recall that
\begin{equation}
\text{arctan}(x) = x – \frac{x^3}{3} + \frac{x^5}{5} – \frac{x^7}{7} + \cdots = \sum_{n = 0}^\infty (-1)^n \frac{x^{2n + 1}}{2n + 1}. \notag
\end{equation}
Then we have that
\begin{align}
\text{arctan}\left(\frac{1}{\sqrt 3}\right) &= \frac{1}{\sqrt 3} – \frac{1}{3(\sqrt 3)^3} + \frac{1}{5(\sqrt 3)^5} + \cdots \notag \\
&= \frac{1}{\sqrt 3} \left(1 – \frac{1}{3 \cdot 3} + \frac{1}{5 \cdot 3^2} – \frac{1}{7 \cdot 3^3} + \cdots \right) \notag \\
&= \frac{1}{\sqrt 3} \sum_{n = 0}^\infty (-1)^n \frac{1}{(2n + 1) 3^n}. \notag
\end{align}
Therefore, we have the equality
\begin{equation}
\frac{\pi}{6} = \frac{1}{\sqrt 3} \sum_{n = 0}^\infty (-1)^n \frac{1}{(2n + 1) 3^n} \notag
\end{equation}
or rather that
\begin{equation}
\pi = 2 \sqrt{3} \sum_{n = 0}^\infty (-1)^n \frac{1}{(2n + 1) 3^n}. \notag
\end{equation}
From a computation perspective, this is far superior. For instance, based on our understanding of error from the alternating series test, using the first $10$ terms of this series will approximate $\pi$ to within
\begin{equation}
2 \sqrt 3 \frac{1}{23 \cdot 3^{11}} \approx \frac{1}{26680}. \notag
\end{equation}

Let’s check this.
\begin{equation}
2 \sqrt 3 \left(1 – \frac{1}{3\cdot 3} + \frac{1}{5 \cdot 3^2} + \cdots + \frac{1}{21 \cdot 3^{10}}\right) = 3.1415933045030813. \notag
\end{equation}
Look at how close that approximation is, and we only used the first $10$ terms!
Roughly speaking, each additional 2.5 terms yields another digit of $\pi$. Using the first $100$ terms would give the first 48 digits of $\pi$.
Using the first million terms would give the first 47000 (or so) digits of $\pi$ — and this is definitely doable, even on a personal laptop. (On my laptop, it takes approximately 4 milliseconds to compute the first 20 digits of $\pi$ using this technique).

Even Better Series

I think it is very natural to ask again: can we find an even faster converging series? Perhaps we can choose better values to evaluate arctan at? This turns out to be a very useful line of thought, and it leads to some of the best-known methods for evaluating $\pi$. Through clever choices of values and identities involving arctangents, one can construct extremely quickly converging series for $\pi$. For more information on this line of thought, look up Machin-like formula.

Patterns in the Approximation of $\pi/4$

 

Looking back at the approximation of $\pi$ coming from the first $50$ terms of the series
\begin{equation}\label{eq:series_pi4_base}
1 – \frac{1}{3} + \frac{1}{5} – \frac{1}{7} + \cdots
\end{equation}
we found an approximation of $\pi$, which I’ll represent as $\widehat{\pi}$,
\begin{equation}
\pi \approx \widehat{\pi} = 3.121594652591011. \notag
\end{equation}
Let’s look very carefully at how this compares to $\pi$, up to the first $10$ decimals. We color the incorrect digits in ${\color{orange}{orange}}$.
\begin{align}
\pi &= 3.1415926535\ldots \notag \\
\widehat{\pi} &= 3.1{\color{orange}2}159{\color{orange}4}65{\color{orange}2}5 \notag
\end{align}
Notice that most of the digits are correct — in fact, only three (of the first ten) are incorrect! Isn’t that weird?

It happens to be that when one uses the first $10^N / 2$ terms (for any $N$) of the series \eqref{eq:series_pi4_base}, there will be a pattern of mostly correct digits with disjoint strings of incorrect digits in the middle. This is an unusual and surprising phenomenon.

The positions of the incorrect digits can be predicted. Although I won’t go into any detail here, the positions of the errors are closely related to something called Euler Numbers¬†or, more deeply, to Boole Summation.

Playing with infinite series leads to all sorts of interesting patterns. There is a great history of mathematicians and physicists messing around with series and stumbling across really deep ideas.

Speeding up computation

Take an alternating series
\begin{equation}
\sum_{n = 0}^\infty (-1)^{n} a_n = a_0 – a_1 + a_2 – a_3 + \cdots \notag
\end{equation}
If ${a_n}$ is a sequence of positive, decreasing terms with limit $0$, then the alternating series converges to some value $S$. And further, consecutive partial sums bound the value of $S$, in that
\begin{equation}
\sum_{n = 0}^{2K-1} (-1)^{n} a_n \leq S \leq \sum_{n = 1}^{2K} (-1)^{n} a_n. \notag
\end{equation}
For example,
\begin{equation}
1 – \frac{1}{3} < \sum_{n = 0}^\infty \frac{(-1)^{n}}{2n+1} < 1 – \frac{1}{3} + \frac{1}{5}. \notag
\end{equation}

Instead of approximating the value of the whole sum $S$ by the $K$th partial sum $\sum_{n \leq K} (-1)^n a_n$, it might seem reasonable to approximate $S$ by the average of the $(K-1)$st partial sum and the $K$th partial sum. Since we know $S$ is between the two, taking their average might be closer to the real result.

As mentioned above, the average of the partial sum consisting of the first $49$ terms of \eqref{eq:base} and the first $50$ terms of \eqref{eq:base} gives a much improved estimate of $\pi$ than using either the first $49$ or first $50$ terms on their own. (And indeed, it works much better than even the first $500$ terms on their own).

Before we go on, let’s introduce a little notation. Let $S_K$ denote the partial sum of the terms up to $K$, i.e.
\begin{equation}
S_K = \sum_{n = 0}^K (-1)^{n} a_n. \notag
\end{equation}
Then the idea is that instead of using $S_{K}$ to approximate the wholse sum $S$, we’ll use the average
\begin{equation}
\frac{S_{K-1} + S_{K}}{2} \approx S. \notag
\end{equation}

Averaging once seems like a great idea. What if we average again? That is, what if instead of using the average of $S_{K-1}$ and $S_K$, we actually use the average of (the average of $S_{K-2}$ and $S_{K-1}$) and (the average of $S_{K_1}$ and $S_K$),
\begin{equation}\label{eq:avgavg}
\frac{\frac{S_{K-2} + S_{K-1}}{2} + \frac{S_{K-1} + S_{K}}{2}}{2}.
\end{equation}
As this is really annoying to write, let’s come up with some new notation. Write the average between a quantity $X$ and $Y$ as
\begin{equation}
[X, Y] = \frac{X + Y}{2}. \notag
\end{equation}
Further, define the average of $[X, Y]$ and $[Y, Z]$ to be $[X, Y, Z]$,
\begin{equation}
[X, Y, Z] = \frac{[X, Y] + [Y, Z]}{2} = \frac{\frac{X + Y}{2} + \frac{Y + Z}{2}}{2}. \notag
\end{equation}
So the long expression in \eqref{eq:avgavg} can be written as $[S_{K-2}, S_{K-1}, S_{K}]$.

With this notation in mind, let’s compute some numerics. Below, we give the actual value of $\pi$, the values of $S_{48}, S_{49}$, and $S_{50}$, pairwise averages, and the average-of-the-average, in the case of $1 – \frac{1}{3} + \frac{1}{5} + \cdots$.
\begin{equation} \notag
\begin{array}{c|l|l}
& \text{Value} & \text{Difference from } \pi \\ \hline
\pi & 3.141592653589793238462\ldots & \phantom{-}0 \\ \hline
4 \cdot S_{48} & 3.1207615795929895 & \phantom{-}0.020831073996803617 \\ \hline
4 \cdot S_{49} & 3.161998692995051 & -0.020406039405258092 \\ \hline
4 \cdot S_{50} & 3.121594652591011 & \phantom{-}0.01999800099878213 \\ \hline
4 \cdot [S_{48}, S_{49}] & 3.1413801362940204 & \phantom{-}0.0002125172957727628 \\ \hline
4 \cdot [S_{49}, S_{50}] & 3.1417966727930313 & -0.00020401920323820377 \\ \hline
4 \cdot [S_{48}, S_{49}, S_{50}] & 3.141588404543526 & \phantom{-}0.00000424904626727951 \\ \hline
\end{array}
\end{equation}
So using the average of averages from the three sums $S_{48}, S_{49}$, and $S_{50}$ gives $\pi$ to within $4.2 \cdot 10^{-6}$, an incredible improvement compared to $S_{50}$ on its own.

There is something really odd going on here. We are not computing additional summands in the overall sum \eqref{eq:base}. We are merely combining some of our partial results together in a really simple way, repeatedly. Somehow, the sequence of partial sums contains more information about the limit $S$ than individual terms, and we are able to extract some of this information.

I think there is a very natural question. What if we didn’t stop now? What if we took averages-of-averages-of-averages, and averages-of-averages-of-averages-of-averages, and so on? Indeed, we might define the average
\begin{equation}
[X, Y, Z, W] = \frac{[X, Y, Z] + [Y, Z, W]}{2}, \notag
\end{equation}
and so on for larger numbers of terms. In this case, it happens to be that
\begin{equation}
[S_{15}, S_{16}, \ldots, S_{50}] = 3.141592653589794,
\end{equation}
which has the first 15 digits of $\pi$ correct!

By repeatedly averaging alternating sums of just the first $50$ reciprocals of odd integers, we can find $\pi$ up to 15 digits. I think that’s incredible — it seems both harder than it might have been (as this involves lots of averaging) and much easier than it might have been (as the only arithmetic input are the fractions $1/(2n+1)$ for $n$ up to $50$.

Although we leave the thread of ideas here, there are plenty of questions that I think are now asking themselves. I encourage you to ask them, and we may return to this (or related) topics in the future. I’ll see you in class.

Posted in Brown University, Expository, Math 100, Mathematics, Teaching | Tagged , , , | Leave a comment

An Intuitive Overview of Taylor Series

This is a note written for my fall 2013 Math 100 class, but it was not written “for the exam,” nor does anything on here subtly hint at anything on any exam. But I hope that this will be helpful for anyone who wants to get a basic understanding of Taylor series. What I want to do is try to get some sort of intuitive grasp on Taylor series as approximations of functions. By intuitive, I mean intuitive to those with a good grasp of functions, the basics of a first semester of calculus (derivatives, integrals, the mean value theorem, and the fundamental theorem of calculus) – so it’s a mathematical intuition. In this way, this post is a sort of follow-up of my earlier note, An Intuitive Introduction to Calculus.

PLEASE NOTE that my math compiler and my markdown compiler sometimes compete, and sometimes repeated derivatives are too high or too low by one pixel.

We care about Taylor series because they allow us to approximate other functions in predictable ways. Sometimes, these approximations can be made to be very, very, very accurate without requiring too much computing power. You might have heard that computers/calculators routinely use Taylor series to calculate things like $latex {e^x}$ (which is more or less often true). But up to this point in most students’ mathematical development, most mathematics has been clean and perfect; everything has been exact algorithms yielding exact answers for years and years. This is simply not the way of the world.

Here’s a fundamental fact to both mathematics and life: almost anything worth doing is probably pretty hard and pretty messy.

For a very recognizable example, let’s think about finding zeroes of polynomials. Finding roots of linear polynomials is very easy. If we see $latex {5 + x = 0}$, we see that $latex {-5}$ is the zero. Similarly, finding roots of quadratic polynomials is very easy, and many of us have memorized the quadratic formula to this end. Thus $latex {ax^2 + bx + c = 0}$ has solutions $latex {x = \frac{-b \pm \sqrt{b^2 – 4ac}}{2a}}$. These are both nice, algorithmic, and exact. But I will guess that the vast majority of those who read this have never seen a “cubic polynomial formula” for finding roots of cubic polynomials (although it does exist, it is horrendously messy – look up Cardano’s formula). There is even an algorithmic way of finding the roots of quartic polynomials. But here’s something amazing: there is no general method for finding the exact roots of 5th degree polynomials (or higher degree).

I don’t mean We haven’t found it yet, but there may be one, or even You’ll have to use one of these myriad ways – I mean it has been shown that there is no general method of finding exact roots of degree 5 or higher polynomials. But we certainly can approximate them arbitrarily well. So even something as simple as finding roots of polynomials, which we’ve been doing since we were in middle school, gets incredibly and unbelievably complicated.

So before we hop into Taylor series directly, I want to get into the mindset of approximating functions with other functions.

1. Approximating functions with other functions

We like working with polynomials because they’re so easy to calculate and manipulate. So sometimes we try to approximate complicated functions with polynomials, a problem sometimes called
polynomial interpolation”.

Suppose we wanted to approximate $latex {\sin(x)}$. The most naive approximation that we might do is see that $latex {\sin(0) = 0}$, so we might approximate $latex {\sin(x)}$ by $latex {p_0(x) = 0}$. We know that it’s right at least once, and since $latex {\sin(x)}$ is periodic, it’s going to be right many times. I write $latex {p_0}$ to indicate that this is a degree $latex {0}$ polynomial, that is, a constant polynomial. Clearly though, this is a terrible approximation, and we can do better.

(more…)

Posted in Expository, Mathematics, Teaching | Tagged , , , , , , , , | 21 Comments