Linear Systems

Linear System: A set of n equations, each a linear combination of the same n variables set equal to a constant.

Naive Gaussian Elimination

Gaussian Elimination: An algorithm for solving linear systems.

Augmented Matrix Notation:

\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1j} & | & b_1 \\ a_{21} & a_{22} & \cdots & a_{2j} & | & b_2 \\ \vdots & \vdots & \ddots & \vdots & | & \vdots \\ a_{i1} & a_{i2} & \cdots & a_{ij} & | & b_i \\ \end{bmatrix}

Background: More on Augmented Matrices

We can represent a system of linear equations as an augmented matrix. This is the starting point for Gaussian elimination.

\begin{aligned} a_{11} x_1 + a_{12} x_2 + a_{13} x_3 = b_1 \\ a_{21} x_1 + a_{22} x_2 + a_{23} x_3 = b_2 \\ a_{31} x_1 + a_{32} x_2 + a_{33} x_3 = b_3 \end{aligned} \\~\\ \downarrow \\~\\ \begin{bmatrix} a_{11} &a_{12} &a_{13} &| &b_1 \\ a_{21} &a_{22} &a_{23} &| &b_2 \\ a_{31} &a_{32} &a_{33} &| &b_3 \\ \end{bmatrix}

Or, in more general form (using i and j to represent row and column):

\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1j} & | & b_1 \\ a_{21} & a_{22} & \cdots & a_{2j} & | & b_2 \\ \vdots & \vdots & \ddots & \vdots & | & \vdots \\ a_{i1} & a_{i2} & \cdots & a_{ij} & | & b_i \\ \end{bmatrix}
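As a concrete sketch (plain Python, with a made-up 3×3 system), an augmented matrix is just each coefficient row with its constant appended:

```python
# Hypothetical 3x3 system; row i of `aug` holds [a_i1, a_i2, a_i3, b_i].
coeff = [[2.0, 1.0, -1.0],
         [-3.0, -1.0, 2.0],
         [-2.0, 1.0, 2.0]]
const = [8.0, -11.0, -3.0]

# Append each constant to its coefficient row to form the augmented matrix.
aug = [row + [b] for row, b in zip(coeff, const)]
print(aug[0])   # [2.0, 1.0, -1.0, 8.0]
```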

Step 1. Forward Elimination

Goal: Simplify the matrix into upper triangular form

Definition: Upper Triangular Form

Upper Triangular Form: When all the elements below the main diagonal are zero.

For example: \begin{bmatrix} a_{11} & a_{12} & a_{13} & | & b_1 \\ \textbf{a}_{\textbf{21}} & a_{22} & a_{23} & | & b_2 \\ \textbf{a}_{\textbf{31}} & \textbf{a}_{\textbf{32}} & a_{33} & | & b_3 \end{bmatrix} \\~\\ \downarrow \\~\\ \begin{bmatrix} a'_{11} & a'_{12} & a'_{13} & | & b'_1 \\ \textbf{0} & a'_{22} & a'_{23} & | & b'_2 \\ \textbf{0} & \textbf{0} & a'_{33} & | & b'_3 \end{bmatrix}

(The prime symbol is just to indicate the original values get changed in the process)

\boxed{ \text{Updating Coefficients: } a'_{ij} = a_{ij} - m \times a_{kj} } \\~\\ \boxed{ \text{Updating Constants: } b'_i = b_i - m \times b_k } \\~\\ \small \textit{where $m = \left( \frac{a_{ik}}{a_{kk}} \right)$}

Layman’s Explanation: The new value of each element in your target row is its old value minus the multiplier times the corresponding element from the pivot row above.

\text{Multiplier: } \\ \boxed{ m = \frac{ \text{element to eliminate} }{ \text{pivot} } }

\text{Updating Coefficients \& Constants: } \\ \boxed{ a'_{ij} \text{ or } b'_{i} = \text{old element} - m \times \text{element from pivot row above} }
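A minimal sketch of the boxed formulas in Python (the helper name `eliminate_entry` is mine; the constant is stored as the last column of the augmented row, so one loop covers both updates):

```python
def eliminate_entry(aug, i, k):
    """Zero out aug[i][k] using pivot row k of an augmented matrix.

    Applies a'_ij = a_ij - m * a_kj across the whole row; since the
    constant b_i is stored as the last column, the same loop also does
    b'_i = b_i - m * b_k.
    """
    m = aug[i][k] / aug[k][k]            # m = element to eliminate / pivot
    for j in range(k, len(aug[i])):      # includes the constant column
        aug[i][j] -= m * aug[k][j]

# The 2x2 system used later in the notes: 2x1 + x2 = 8, -3x1 - x2 = -11.
aug = [[2.0, 1.0, 8.0],
       [-3.0, -1.0, -11.0]]
eliminate_entry(aug, i=1, k=0)
print(aug[1])   # [0.0, 0.5, 1.0]
```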

Example: Baby’s First Forward Elimination

All of this is just the matrix representation/simplification of this algebra:


Suppose our current goal is to eliminate -3x_1 (make it zero)

\begin{aligned} 2x_1 + 1x_2 &= 8 \\ -3x_1 - 1x_2 &= -11 \\ \end{aligned}

Method 1: Plain Ol’ Algebra

We can see that to eliminate -3x_1, we must multiply 2x_1 (the top equation) by \frac{3}{2}, and then add the two equations.

\begin{aligned} (2x_1 + 1x_2 &= 8) \times \frac{3}{2} \\ -3x_1 - 1x_2 &= -11 \\ \end{aligned} \\~\\ \downarrow \\~\\ \begin{aligned} 3x_1 + \frac{3}{2}x_2 &= 12 \\ -3x_1 - 1x_2 &= -11 \\ \end{aligned} \\~\\ \downarrow \\~\\ \begin{aligned} &3x_1 &+& \tfrac{3}{2}x_2 &=& 12 \\ +~ -&3x_1 &-& 1x_2 &=& -11 \\ \hline =~ & &+& \tfrac{1}{2}x_2 &=& 1 \end{aligned}

Method 2: The Same Thing But With a Matrix

Doing all of the above with an augmented matrix looks like this:

\begin{bmatrix} 2 &1 &| &8 \\ -3 &-1 &| &-11 \\ \end{bmatrix} \\~\\ \downarrow \\~\\ \text{Apply } a'_{ij} = a_{ij} - m \times a_{kj} \text{ and } b'_i = b_i - m \times b_k \\~\\ \downarrow \\~\\ \begin{bmatrix} 2 &1 &| &8 \\ 0 &0.5 &| &1 \\ \end{bmatrix}

Example: Okay, but I don’t get how to apply a'_{ij} (Baby’s first forward elimination, again)

Let’s do this again, but using the layman’s explanation.

\begin{bmatrix} 2 &1 &| &8 \\ -3 &-1 &| &-11 \\ \end{bmatrix} \\~\\ \downarrow \\~\\ \text{Goal: \textcolor{red}{eliminate the -3} and make it zero} \\ \text{(This will affect everything on its right)} \\~\\ \downarrow \\~\\ \text{First, we'll calculate the multiplier} \\~\\ \begin{bmatrix} \textcolor{CornflowerBlue}{2} &1 &| &8 \\ \textcolor{red}{-3} &-1 &| &-11 \\ \end{bmatrix} \\~\\ \textcolor{fuchsia}{\text{multiplier}} = \frac{\textcolor{red}{\text{element to eliminate}}}{\textcolor{CornflowerBlue}{\text{pivot element}}} = \textcolor{fuchsia}{\frac{-3}{2}} \\~\\ \downarrow \\~\\ \text{then, for each element on its right, we find: } \\ a'_{ij} \text{ or } b'_i = \text{old element} - \textcolor{fuchsia}{\text{multiplier}} \times \text{element from pivot row above} \\~\\ \downarrow \\~\\ \begin{bmatrix} 2 &1 &| &8 \\ 0 &\textcolor{gold}{\sout{-1}} &| &\textcolor{cyan}{\sout{-11}} \\ \end{bmatrix} \\~\\ \text{For \textcolor{gold}{-1}: } \\ \text{new element} = \textcolor{gold}{-1} - \textcolor{fuchsia}{\frac{-3}{2}} \times 1 = 0.5 \\~\\ \text{For \textcolor{cyan}{-11}: } \\ \text{new element} = \textcolor{cyan}{-11} - \textcolor{fuchsia}{\frac{-3}{2}} \times 8 = 1 \\~\\ \downarrow \\~\\ \begin{bmatrix} 2 &1 &| &8 \\ 0 &0.5 &| &1 \\ \end{bmatrix}

Step 2. Backwards Substitution

Goal: Now, solve the system.

\boxed{ \text{Base Case: } x_n = \frac{b'_n}{a'_{nn}} } \\~\\ \boxed{ \text{General Case: } x_i = \frac{ b'_i - \sum_{j=i+1}^n a'_{ij} x_j }{ a'_{ii} } }

(Remember that the prime just means the value has changed)
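A sketch of the two cases in Python (my own helper, assuming `coeff` is already upper triangular):

```python
def back_subst(coeff, const):
    """Solve an upper-triangular system via back substitution.

    Base case: x_n = b_n / a_nn; then each
    x_i = (b_i - sum_{j > i} a_ij * x_j) / a_ii, working upward.
    """
    n = len(const)
    sol = [0.0] * n
    for i in range(n - 1, -1, -1):      # from the last row up to the first
        total = const[i]
        for j in range(i + 1, n):
            total -= coeff[i][j] * sol[j]
        sol[i] = total / coeff[i][i]
    return sol

# The 2x2 example after elimination: 2x1 + 1x2 = 8 and 0.5x2 = 1.
print(back_subst([[2.0, 1.0], [0.0, 0.5]], [8.0, 1.0]))  # [3.0, 2.0]
```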

Pseudocode

// Naive Gaussian
function NaiveGaussian(coeff : array(n,n), const : vector(n))
    sol := new vector(n)
    call FwdElimination(coeff, const)
    call BackSubst(coeff, const, sol)
end function
function FwdElimination(coeff : array(n,n), const : vector(n))
    for k ← 1 to (n-1)
        // Start below the pivot (k+1)
        for i ← k + 1 to n
            mult := coeff[i][k] / coeff[k][k]
            // Update all coefficients in row i
            for j ← k to n
                coeff[i][j] := coeff[i][j] - mult * coeff[k][j]
            end for
            // Update the constant for row i (no j loop needed)
            const[i] := const[i] - mult * const[k]
        end for
    end for
end function
function BackSubst(coeff : array(n,n), const : vector(n), sol : vector(n))
    sol[n] := const[n] / coeff[n][n]
    for i ← n - 1 down to 1
        sum := const[i]
        // Compute the top term of x_i
        for j ← i + 1 to n
            sum := sum - coeff[i][j] * sol[j]
        end for
        // Divide by the bottom term
        sol[i] := sum / coeff[i][i]
    end for
end function
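A direct Python translation of the pseudocode (0-indexed, and not hardened against zero pivots):

```python
def naive_gaussian(coeff, const):
    """Solve the system coeff * x = const by naive Gaussian elimination.

    Mutates coeff and const in place, mirroring the pseudocode above.
    """
    n = len(const)
    # Forward elimination: zero everything below each pivot coeff[k][k].
    for k in range(n - 1):
        for i in range(k + 1, n):
            mult = coeff[i][k] / coeff[k][k]
            for j in range(k, n):
                coeff[i][j] -= mult * coeff[k][j]
            const[i] -= mult * const[k]
    # Back substitution on the now upper-triangular system.
    sol = [0.0] * n
    for i in range(n - 1, -1, -1):
        total = const[i]
        for j in range(i + 1, n):
            total -= coeff[i][j] * sol[j]
        sol[i] = total / coeff[i][i]
    return sol

# The 2x2 example: 2x1 + x2 = 8, -3x1 - x2 = -11  ->  x1 = 3, x2 = 2
print(naive_gaussian([[2.0, 1.0], [-3.0, -1.0]], [8.0, -11.0]))  # [3.0, 2.0]
```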

Caveat Emptor: This naive version divides by coeff[k][k] without ever checking it. If any pivot is zero the algorithm fails outright, and if a pivot is merely tiny the results can be wildly imprecise (see Imprecision below).

Analysis

1. Big O

Unrolling FwdElimination's loops, we see the elimination step (the body of the i loop) runs:

(n-1) + (n-2) + \cdots + 1 = \frac{n(n-1)}{2} \in O(n^2)

times, and each of those runs does O(n) work in the inner j loop, so FwdElimination is O(n^3).

Analyzing BackSubst the same way, we get O(n^2).

So, worst-case of Naive Gaussian is O(n^3)

2. Imprecision

Recall these formulae from forward-elimination:

\boxed{ \text{Updating Coefficients: } a'_{ij} = a_{ij} - m \times a_{kj} } \\~\\ \boxed{ \text{Updating Constants: } b'_i = b_i - m \times b_k } \\~\\ \small \textit{where $m = \left( \frac{a_{ik}}{a_{kk}} \right)$}

The only part where we perform division is the calculation of m, which can lead to ridiculous amounts of imprecision when a_{ik} \gg a_{kk}

We can improve precision by rearranging the rows so that the element of larger magnitude becomes the pivot (the denominator), keeping |m| \le 1.
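The effect is easy to demonstrate with the classic tiny-pivot 2×2 system (the numbers below are illustrative; the exact losses come from double-precision rounding):

```python
# Solve:  1e-17*x1 + x2 = 1
#         1    *x1 + x2 = 2      (true answer: x1 ~ 1, x2 ~ 1)
eps = 1e-17

# Naive order: pivot on the tiny element. m = 1/eps is huge, and
# subtracting m * (pivot row) swamps row 2's original values.
m = 1.0 / eps
a22 = 1.0 - m * 1.0
b2 = 2.0 - m * 1.0
x2_naive = b2 / a22
x1_naive = (1.0 - x2_naive) / eps
print(x1_naive, x2_naive)        # x1 comes out 0.0 instead of ~1.0

# Swapped order: pivot on the 1 from row 2 instead, so |m| <= 1.
m = eps / 1.0
x2_swapped = (1.0 - m * 2.0) / (1.0 - m * 1.0)
x1_swapped = 2.0 - x2_swapped
print(x1_swapped, x2_swapped)    # both come out 1.0
```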

Scaled Partial Pivoting (SPP)

This method seeks to reduce imprecision in the calculation of m.

Suppose the following augmented matrix:

\begin{bmatrix} 2& 5& 2& |& -38& \\ 3& -2& 4& |& 17& \\ -6& 1& -7& |& -12& \end{bmatrix}

  1. Go through each row and note the coefficient with the greatest absolute value; this is that row's scale factor.
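For the matrix above, step 1 works out as follows (a sketch; `scales` and `ratios` are my own names):

```python
coeff = [[2.0, 5.0, 2.0],
         [3.0, -2.0, 4.0],
         [-6.0, 1.0, -7.0]]

# Step 1: each row's scale factor is its largest coefficient in absolute value.
scales = [max(abs(a) for a in row) for row in coeff]
print(scales)   # [5.0, 4.0, 7.0]

# The pivot row for column 1 is the one with the largest |entry| / scale.
ratios = [abs(row[0]) / s for row, s in zip(coeff, scales)]
print(ratios)   # [0.4, 0.75, 0.857...] -> row 3 wins the first pivot
```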

Reducing Cost of Swapping: In an implementation that represents matrices as arrays, swapping whole rows is expensive.

By introducing another array for indices, we can do:

a[i][j] \to a[ind[i]][j]

Which eliminates the need to swap.
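A tiny sketch of the indirection trick in Python: to "swap" two rows, swap their entries in `ind` and always read row data through it.

```python
rows = [[2.0, 5.0, 2.0],
        [3.0, -2.0, 4.0],
        [-6.0, 1.0, -7.0]]
ind = [0, 1, 2]                    # logical row order, initially the identity

# "Swap" logical rows 1 and 3 in O(1) -- only two integers move.
ind[0], ind[2] = ind[2], ind[0]

# Every access rows[ind[i]][j] now sees the swapped order, while the
# underlying row data never moved.
print(rows[ind[0]])                # [-6.0, 1.0, -7.0]
```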

Pseudocode

function SPPFwdElimination(coeff : array(n,n), const : vector(n), ind : vector(n))
    scaling := new vector(n)
    // First loop: record each row's scale factor (its largest |coefficient|)
    for i ← 1 to n
        scaling[i] := max(|coeff[i][1]|, ..., |coeff[i][n]|)
    end for

    for k ← 1 to n - 1
        rmax := 0
        maxInd := k
        // Find the row whose column-k entry is largest relative to its scale
        for i ← k to n
            r := |coeff[ind[i]][k] / scaling[ind[i]]|
            if (r > rmax) then
                rmax := r
                maxInd := i
            end if
        end for
        swap(ind[maxInd], ind[k])

        // Eliminate below the pivot, always indexing rows through ind
        for i ← k + 1 to n
            mult := coeff[ind[i]][k] / coeff[ind[k]][k]

            for j ← k to n
                coeff[ind[i]][j] := coeff[ind[i]][j] - mult * coeff[ind[k]][j]
            end for

            const[ind[i]] := const[ind[i]] - mult * const[ind[k]]
        end for
    end for
end function
function SPPBackSubst(coeff : array(n,n), const : vector(n), sol : vector(n), ind : vector(n))
    sol[n] := const[ind[n]] / coeff[ind[n]][n]
    for i ← n - 1 down to 1
        sum := const[ind[i]]
        for j ← i + 1 to n
            sum := sum - coeff[ind[i]][j] * sol[j]
        end for
        sol[i] := sum / coeff[ind[i]][i]
    end for
end function
function SPPGaussian(coeff : array(n,n), const : vector(n))
    sol := new vector(n)
    ind := new vector(n)

    // Start with the identity ordering; pivoting only ever swaps ind entries
    for i ← 1 to n
        ind[i] := i
    end for

    call SPPFwdElimination(coeff, const, ind)
    call SPPBackSubst(coeff, const, sol, ind)
end function

First Loop: Computes each row's scale factor, the largest absolute value among its coefficients.

Afterwards, we do forward elimination and back substitution as normal, except every row access goes through ind.
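The whole SPP procedure in runnable Python (0-indexed; a sketch that mirrors the pseudocode rather than a hardened solver):

```python
def spp_gaussian(coeff, const):
    """Gaussian elimination with scaled partial pivoting (in place)."""
    n = len(const)
    ind = list(range(n))
    # Scale factor for each row: its largest absolute coefficient.
    scaling = [max(abs(a) for a in row) for row in coeff]

    for k in range(n - 1):
        # Pick the row whose column-k entry is largest relative to its scale.
        max_i = max(range(k, n),
                    key=lambda i: abs(coeff[ind[i]][k]) / scaling[ind[i]])
        ind[k], ind[max_i] = ind[max_i], ind[k]
        # Eliminate below the pivot, always indexing rows through ind.
        for i in range(k + 1, n):
            mult = coeff[ind[i]][k] / coeff[ind[k]][k]
            for j in range(k, n):
                coeff[ind[i]][j] -= mult * coeff[ind[k]][j]
            const[ind[i]] -= mult * const[ind[k]]

    # Back substitution through the same index array.
    sol = [0.0] * n
    for i in range(n - 1, -1, -1):
        total = const[ind[i]]
        for j in range(i + 1, n):
            total -= coeff[ind[i]][j] * sol[j]
        sol[i] = total / coeff[ind[i]][i]
    return sol

print(spp_gaussian([[2.0, 1.0], [-3.0, -1.0]], [8.0, -11.0]))  # [3.0, 2.0]
```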

This algorithm also works out to be O(n^3).