Polynomial Interpolation

Suppose you want to approximate some function from a finite set of points.

(x_0, y_0), (x_1, y_1), ..., (x_{n-1}, y_{n-1})

Specifically, we want a polynomial P(x) that:

  1. Approximates the function between x_0 and x_{n-1}, and
  2. Passes through every point (P(x_i) = y_i for all 0 \le i \le n-1)

Lagrange Interpolation

  1. Lagrange polynomial for each data point x_i:

\boxed{ l_i (x) = \prod_{j=0, j \ne i}^{n-1} \frac{x - x_j}{x_i - x_j} }

  2. Lagrange interpolation of data set:

\boxed{ L(x) = \sum_{i=0}^{n-1} y_i l_i (x) }

Example: Deriving Lagrange interpolation

We first create a Lagrange polynomial for each data point.

Each such polynomial equals 1 when passing through its own data point, but 0 when passing through every other data point.

For all other points in-between, it’ll be some interpolated value we won’t care about.

(Figure: Lagrange polynomial for some data point at x = 1.)

These polynomials only contribute to their own data point.

What’s the purpose of this?

After finding the Lagrange polynomial of each data point, we can construct the Lagrange interpolation, a polynomial that goes through all the data points.

To do this, just multiply each Lagrange polynomial by its respective point’s y-value to make it pass through its data point. Then, add every term together.

That’s all getting a bit ahead of ourselves though; let’s return to formally defining a Lagrange polynomial.

More formally, we want to construct a function:

l_i (x) = \begin{cases} 1 &\quad x = x_i, \\ 0 &\quad x = x_j, j \ne i \end{cases}

Making a polynomial equal to zero at certain points is easy.

For example, the polynomial y(x) = (x-a)(x-b)(x-c) is obviously zero when x is a, b, or c.


Suppose you have some data points: x_0 = 0, x_1 = 1, x_2 = 2, and x_3 = 3.

How would you make a polynomial that follows the “one at my point, zero at the other points” rule?

Focusing on x_2 as a concrete example, making the polynomial equal zero is easy:

l_2^* (x) = (x-0) (x-1) (x-3)

This work-in-progress polynomial follows the zero rule, but doesn’t yet equal one at its respective point.

To do that, we just divide the polynomial by itself, but replace x with x_i in the denominator (in our example, x_i = 2).

l_2 (x) = \frac{(x-0) (x-1) (x-3)}{(2-0) (2-1) (2-3)}

Graph these two polynomials in your calculator of choice if you need more visualization.
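If you'd rather check numerically than graphically, here is a small Python sketch (the function names l2_star and l2 are my own) confirming the "one at my point, zero at the others" rule:

```python
# Work-in-progress polynomial: zero at x = 0, 1, 3, but not yet 1 at x = 2.
def l2_star(x):
    return (x - 0) * (x - 1) * (x - 3)

# Normalized Lagrange polynomial: divide by the same product evaluated at x = 2.
def l2(x):
    return l2_star(x) / ((2 - 0) * (2 - 1) * (2 - 3))

assert l2(2) == 1                              # equals 1 at its own point
assert l2(0) == 0 and l2(1) == 0 and l2(3) == 0  # zero at all other points
```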


So, in general for a set of points x_i (for i = 0, 1, ..., n-1), the ith Lagrange polynomial is:

\boxed{ l_i (x) = \prod_{j=0, j \ne i}^{n-1} \frac{x - x_j}{x_i - x_j} }


To combine all our Lagrange polynomials into a single polynomial that applies to the whole data set, we use the Lagrange interpolation formula.

\boxed{ p(x) = L(x) = \sum_{i=0}^{n - 1} y_i l_i (x) }

The resulting Lagrange polynomial will have degree at most n-1, where n is the number of data points.
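The two boxed formulas translate almost directly into code. A minimal Python sketch (the function name lagrange is my own choice):

```python
def lagrange(xs, ys, x):
    """Evaluate the Lagrange interpolation L(x) for nodes xs and values ys."""
    n = len(xs)
    total = 0.0
    for i in range(n):
        # Build y_i * l_i(x) as a running product over j != i.
        term = ys[i]
        for j in range(n):
            if j != i:
                term *= (x - xs[j]) / (xs[i] - xs[j])
        total += term
    return total

# The interpolation must reproduce every data point exactly.
xs, ys = [0, 1, 2, 3], [1, 2, 4, 10]
for xi, yi in zip(xs, ys):
    assert abs(lagrange(xs, ys, xi) - yi) < 1e-9
```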

Example: Apply Lagrange interpolation

Q: Find the Lagrange interpolation of the following data

(0,1),(1,2),(2,4),(3,10)

A:

1. Get Lagrange polynomials for each data point

\begin{aligned} l_0 (x) &= \frac{x - x_1}{x_0 - x_1} \times \frac{x - x_2}{x_0 - x_2} \times \frac{x - x_3}{x_0 - x_3} \\ &= \frac{x - 1}{0 - 1} \times \frac{x - 2}{0 - 2} \times \frac{x - 3}{0 - 3} \\ &= \frac{-x^3 + 6x^2 - 11x + 6}{6} \\ &= -\frac{1}{6}x^3 + x^2 - \frac{11}{6}x + 1 \\ \end{aligned}

\begin{aligned} l_1 (x) &= \frac{x - x_0}{x_1 - x_0} \times \frac{x - x_2}{x_1 - x_2} \times \frac{x - x_3}{x_1 - x_3} \\ &= \frac{x - 0}{1 - 0} \times \frac{x - 2}{1 - 2} \times \frac{x - 3}{1 - 3} \\ &= \frac{x^3 - 5x^2 + 6x}{2} \\ &= \frac{1}{2}x^3 - \frac{5}{2}x^2 + 3x \\ \end{aligned}

\begin{aligned} l_2 (x) &= \frac{x - x_0}{x_2 - x_0} \times \frac{x - x_1}{x_2 - x_1} \times \frac{x - x_3}{x_2 - x_3} \\ &= \frac{x - 0}{2 - 0} \times \frac{x - 1}{2 - 1} \times \frac{x - 3}{2 - 3} \\ &= \frac{-x^3 + 4x^2 - 3x}{2} \\ &= -\frac{1}{2}x^3 + 2x^2 - \frac{3}{2}x \\ \end{aligned}

\begin{aligned} l_3 (x) &= \frac{x - x_0}{x_3 - x_0} \times \frac{x - x_1}{x_3 - x_1} \times \frac{x - x_2}{x_3 - x_2} \\ &= \frac{x - 0}{3 - 0} \times \frac{x - 1}{3 - 1} \times \frac{x - 2}{3 - 2} \\ &= \frac{x^3 - 3x^2 + 2x}{6} \\ &= \frac{1}{6}x^3 - \frac{1}{2}x^2 + \frac{1}{3}x \end{aligned}

Note — You can plug in the x’s from the data to check correctness for every basis polynomial.

2. Get Lagrange interpolation

Now we apply the y-values and add everything up.

\begin{aligned} L(x) &= l_0(x) y_0 + l_1(x) y_1 + l_2(x) y_2 + l_3(x) y_3 \\ &= \left( -\frac{1}{6}x^3 + x^2 - \frac{11}{6}x + 1 \right) (1) \\ &+ \left( \frac{1}{2}x^3 - \frac{5}{2}x^2 + 3x \right) (2) \\ &+ \left( -\frac{1}{2}x^3 + 2x^2 - \frac{3}{2}x \right) (4) \\ &+ \left( \frac{1}{6}x^3 - \frac{1}{2}x^2 + \frac{1}{3}x \right) (10) \\ &= ... \\ &= \frac{1}{2}x^3 -x^2 + \frac{3}{2}x + 1 \end{aligned}

The simplification is a bit tedious, but essentially it’s just a ton of duplicated operations to solve for a bajillion constants. Lagrange interpolation is more of an academic curiosity, so we’ll stop here.
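As a sanity check on the tedious simplification, the final cubic should reproduce every data point exactly; a quick Python sketch:

```python
# The simplified Lagrange interpolation from the worked example.
def P(x):
    return 0.5 * x**3 - x**2 + 1.5 * x + 1

# Verify it passes through all four data points.
for x, y in [(0, 1), (1, 2), (2, 4), (3, 10)]:
    assert P(x) == y
```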

In the next method, we’ll factor out the constants from Lagrange and focus on another way to get the interpolation polynomial.

Newton’s Interpolation

Divided Difference:

\boxed{ f[x_i, x_{i+1}, ..., x_j] = \frac{ f[x_{i+1}, ..., x_j] - f[x_i, ..., x_{j-1}] }{ x_j - x_i } }

Example: Deriving divided difference

Newton interpolation looks at the interpolation polynomial and defines it as:

P(x) = c_0 + c_1 (x-x_0) + c_2 (x-x_0)(x - x_1) + ... + c_{n-1} (x - x_0)(x-x_1)...(x-x_{n-2})

There are immediate similarities to Lagrange interpolation, such as the summation of terms or the zeroing at other data points.

Notation — These notes use c_i, though other notes may use a_i.

The difference is that we build the polynomial iteratively, one data point at a time, instead of in a single massive step like in Lagrange.

Why It’s Important — Each term added doesn’t disrupt previous work.


So, how do you solve for these coefficients? (c_k)

We use divided difference, which represents the rate of change between some points.

Note — This’ll look familiar, because we just did something similar to approximate the derivative in the previous chapter.

1. Zeroth Divided Difference (c_0)

f[x_0] = y_0

Dead simple: it just maps x_0 to y_0, a horizontal line through y_0.

Note — This becomes the base case if you want to do a recurrence relation

2. First Divided Difference (c_1)

f[x_0, x_1] = \frac{y_1 - y_0}{x_1 - x_0}

Shows the slope needed to get from (x_0, y_0) to (x_1, y_1)

3. Second Divided Difference (c_2)

f[x_0, x_1, x_2] = \frac{f[x_1,x_2] - f[x_0,x_1]}{x_2 - x_0}

Shows rate of change of slope to curve the polynomial to hit the third point (x_2).

Further — The numerator just calculates the difference between two slopes (between points 1 and 2, and points 0 and 1), and is divided by the range of outer points (x_2 - x_0).

4. In General (c_k)

\boxed{ f[x_i, x_{i+1}, ..., x_j] = \frac{ f[x_{i+1}, ..., x_j] - f[x_i, ..., x_{j-1}] }{ x_j - x_i } }

This is the general form.

To avoid repeat/messy calculations, we can organize our data neatly into a table called a divided difference table.

To reiterate, all we are doing is figuring out, at every iteration from c_0 to c_n, how to curve the graph to make it hit the next data point.

Example: Divided difference table

Recall (from 10 seconds ago) — Divided difference gets us the coefficients which tell us how to curve a polynomial into hitting the next data point.

The divided difference table lets us easily calculate divided differences recursively.

Here is a minimal example:

x_0   y_0
             f[x_0, x_1]
x_1   y_1                   f[x_0, x_1, x_2]
             f[x_1, x_2]                       f[x_0, x_1, x_2, x_3]
x_2   y_2                   f[x_1, x_2, x_3]
             f[x_2, x_3]
x_3   y_3

Once the table is built, we construct the polynomial by reading off the top diagonal of the table. For this example, that’d be:

p_3 (x) = y_0 + f[x_0,x_1](x-x_0) + f[x_0,x_1,x_2](x-x_0)(x-x_1) + f[x_0,x_1,x_2,x_3](x-x_0)(x-x_1)(x-x_2)

To add a new entry to the table, you must calculate two parts:

  1. Numerator: Use the values from the column immediately to the left. Subtract the value diagonally above from the value diagonally below (above and below are relative to the target cell)
  2. Denominator: Trace those two numerator values diagonally back to the very first x-column. Subtract the top-most x value from the bottom-most x value.
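The column-by-column construction above can be sketched in Python (the function name is my own choice); each pass computes one new column and keeps its top entry as the next coefficient:

```python
def divided_differences(xs, ys):
    """Return the Newton coefficients c_0..c_{n-1}: the top diagonal of the table."""
    n = len(xs)
    coeffs = [ys[0]]      # c_0 is just y_0
    col = list(ys)        # current column of the table
    for k in range(1, n):
        # Numerator: value below minus value above in the previous column.
        # Denominator: span of the x-values the two entries trace back to.
        col = [(col[i + 1] - col[i]) / (xs[i + k] - xs[i]) for i in range(n - k)]
        coeffs.append(col[0])  # keep the top entry of each column
    return coeffs

print(divided_differences([0, 1, 2, 3], [1, 2, 4, 10]))  # → [1, 1.0, 0.5, 0.5]
```

These match the coefficients found by hand in the worked example below.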

As a concrete example, here is a simple construction for an easy data set:

(0,1), (1,2), (2,4), (3,10)

1. Set up initial table.

x_i   c_0
0     1
1     2
2     4
3     10

2. Calculate first column

x_i   c_0   c_1
0     1
            \frac{2-1}{1-0} = 1
1     2
            \frac{4-2}{2-1} = 2
2     4
            \frac{10-4}{3-2} = 6
3     10

3. Calculate second column

x_i   c_0   c_1   c_2
0     1
            1
1     2           \frac{2-1}{2-0} = 0.5
            2
2     4           \frac{6-2}{3-1} = 2
            6
3     10

4. Calculate last column

x_i   c_0   c_1   c_2   c_3
0     1
            1
1     2           0.5
            2           \frac{2 - 0.5}{3 - 0} = 0.5
2     4           2
            6
3     10

5. Assemble

Our coefficients are: c_0 = 1, c_1 = 1, c_2 = 0.5, and c_3 = 0.5

Thus,

\begin{aligned} P(x) &= c_0 + c_1(x - x_0) + c_2(x - x_0)(x - x_1) + c_3(x - x_0)(x - x_1)(x - x_2) \\ &= 1 + 1(x - 0) + 0.5(x - 0)(x - 1) + 0.5(x - 0)(x - 1)(x - 2) \\ \therefore P(x) &= 0.5x^3 - x^2 + 1.5x + 1 \end{aligned}
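The assembled Newton form can also be evaluated without expanding it, using nested multiplication (Horner-style). A Python sketch, with names of my own choosing:

```python
def newton_eval(xs, coeffs, x):
    """Evaluate c_0 + c_1(x-x_0) + ... by nesting:
    c_0 + (x-x_0)(c_1 + (x-x_1)(c_2 + ...))."""
    result = coeffs[-1]
    # Work from the innermost coefficient outward.
    for c, xi in zip(reversed(coeffs[:-1]), reversed(xs[:len(coeffs) - 1])):
        result = result * (x - xi) + c
    return result

# Coefficients from the divided difference table above.
xs, coeffs = [0, 1, 2, 3], [1, 1, 0.5, 0.5]
for xi, yi in zip(xs, [1, 2, 4, 10]):
    assert newton_eval(xs, coeffs, xi) == yi  # hits every data point
```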

Note — If you remember the rules/reasoning for the terms (e.g., must equal zero at other data points), there are no formulae to remember. Re-read Lagrange if you don’t get this step!