Instance Simplification11. For example, simplifying a graph or transforming an array into a sorted array.: Transform to simpler instance of same problem.
Representation Change22. e.g., representing a polynomial by its coefficients vs. its roots, or storing elements in an AVL tree vs. a list.: Different representation of same instance.
Problem Reduction: Transform a problem into a different problem for which an algorithm is already known.
Sorting is important for lots of algorithms, which means the efficiency of the sort is super important.
Aside — Factoid: The fastest comparison sorting algorithm in the world is O(n \log n)
Q: Search for k in A[0..n-1]
A:
We can first sort A, and then do a binary search.
\text{Sort} + \text{Binary Search} = O(n \log n) + O(\log n)
Q: Determine if A[0..n-1] has unique elements.
A:
Sort, then scary array to check pairs of adjacent elements.
\text{Sort} + \text{Check Pairs} = O(n \log n) + O(n)
Q: Given a set of keys S and a search key k, find and occurrence of k in S, if any.
Aside — Real-World Uses: File size, dynamic of data.
A: Use a balanced search tree or hash table.
Suppose we implement a dictionary with a BST.
Recall — Efficiency depends on tree’s height.
Efficiency:
Recall — BST works worst on unbalanced trees. This is problematic if our operations gradually make our tree worse.
Solutions:
- AVL
- red-black
- 2-3
- 2-3-4
- B-trees
AVL Tree: BST in which, for every node, balance factor is at most 1.
Note — Height of empty tree defined as -1.
Balance Factor Examples:
When adding a new node to a BST, we can balance the addition via one of the four rotations.
Analysis — AVL Trees guarantee \Theta(\log n) time for search, insertion, and deletion.
MST is a search tree that allows more than one key in the same node of the tree.
Note — Lets us overcome some of the issues with binary search trees.
Search tree that may have 2-nodes and 3-nodes.
Insertion
Why? — Not as fast, but easier to keep balanced.
- In, you unbalance a binary tree when you add height to one path significantly more than other possible paths.
- On 2-3, you can only add height when creating a new root, which adds a unit of height to all paths simultaneously.
Analysis — Search, insertion, and deletion are \Theta (\log n)
Heap: Binary tree with keys at its nodes (one key per node) such that it is essentially complete.
Definition — Essentially Complete: All levels full except possibly last level, where only some rightmost keys may be missing.
Note — Remember: This is a binary tree, not a binary search tree! The property is parents are bigger than children, so it’s ordered top-down, but not left-right.
Note — Important:
- Root is the largest element.
- Any subtree is also a heap.
- Easily represented as array.
We can represent heaps as arrays by storing elements in level-order.
Note — Accesses:
- Left Child of Node j: 2j
- Right Child of Node j: 2j + 1
- Parent of Node: \lfloor j/2 \rfloor
- Parental nodes are in the first half of the array.
Note — Heaps are really good at representing priority queues.
A data structure for maintaining a set of elements, each with an associated key (priority). The standard operations are insert, and extract maximum/minimum.
Note — AVL and 2-3 we’ve done top-down, but heap we can do two methods.
Bottom-Up: Put everything in and then fix it.
Top-Down: Construct by successively inserting elements into an initially empty heap.
Note — Which is Better? Top-down is more straightforward, especially for inserting a new key; but bottom-up is more efficient.
ALGORITHM HeapBottomUp(H[1..n])// Constructs a heap from elements of a given array// Input: An array H[1..n] of orderable items// Output: A heap H[1..n]for i \leftarrow \lfloor n/2 \rfloor down to 1 dok \leftarrow i; v \leftarrow H[k]heap \leftarrow falsewhile not heap and 2*k \le n doj \leftarrow 2*kif j \lt n // there are two childrenif H[j] \lt H[j+1] j \leftarrow j \text{$+$} 1if v \ge H[j]heap \leftarrow trueelseH[k] \leftarrow H[j]; k \leftarrow jH[k] \leftarrow v
Analysis — Parental node at level i does at most 2(h-i) comparisons. Thus, efficiency of bottom-up construction \Theta(n).
Analysis — Although it looks more straightforward, efficiency ends up being O(n \log n).
Strategy:
Algorithm:
ALGORITHM Heapsort(A[1..n])// Sorts an array using Heapsort// Input: Array A[1..n]// Output: Array A[1..n] sorted in nondecreasing orderHeapBottomUp(A[1..n])for i \leftarrow n down to 2 doswap(A[1], A[i]) // remove root and place at the end// fix the heap of size i-1 (sifting down the new root)k \leftarrow 1; v \leftarrow A[1]; heap \leftarrow falsewhile not heap and 2*k \le i \text{$-$} 1 doj \leftarrow 2*kif j \lt i \text{$-$} 1 and A[j] \lt A[j+1] j \leftarrow j \text{$+$} 1if v \ge A[j] heap \leftarrow trueelse A[k] \leftarrow A[j]; k \leftarrow jA[k] \leftarrow v
Analysis — Creation of heap is \Theta (n), root removal portion is O(n \log n), so efficiency class is \Theta( n \log n). Space complexity is O(1).
Solve problem by transforming it into differnt problem for which an algorithm is already available.
Note — To be of practical value, the combined time of the transformation and solving of the reduced problem should be smaller than solving the problem as by another method.
Examples:
We know:
\text{lcm}(m,n) = \frac{m \times n}{\gcd(m,n)}
Thus, we could just use an efficient method like Euclid’s algorithm to quickly get LCM.
The number of paths of length k between vertex i and j in a graph is the element at position i,j in the k-th power of the adjacency matrix. We reduce path counting to matrix multiplication.