## Lecture 32 - 04/10

### Sorting I

• Sorting is the arrangement of elements according to some total order, which satisfies:
• Total: $\forall x, y~~x\leq y \vee y\leq x$
• Reflexive: $x\leq x$
• Antisymmetric: $x\leq y \wedge y\leq x \implies x=y$
• Transitive: $x\leq y \wedge y\leq z \implies x \leq z$
• Inversion: pair of elements out of order
• The goal of sorting is to bring the number of inversions to zero
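The inversion count can be computed directly from its definition. A minimal sketch (the function name is my own, not from lecture; this is the brute-force $\Theta(N^2)$ version, not an efficient one):

```python
def count_inversions(a):
    """Count pairs (i, j) with i < j but a[i] > a[j], i.e. pairs out of order."""
    n = len(a)
    return sum(1 for i in range(n)
                 for j in range(i + 1, n)
                 if a[i] > a[j])

# A sorted array has zero inversions; a reverse-sorted array of length N
# has N*(N-1)/2, the maximum possible.
```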

#### Selection Sort

• Find smallest item
• Swap this item to the front and ‘fix’ it
• Repeat for unfixed items until all items are fixed
• Time complexity: $\Theta(N^2)$
• Inefficient
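The steps above can be sketched as follows (a minimal illustration; the function name is my own):

```python
def selection_sort(a):
    """In-place selection sort: Theta(N^2) comparisons regardless of input."""
    n = len(a)
    for i in range(n):
        # Find the index of the smallest item among the unfixed suffix a[i:].
        smallest = min(range(i, n), key=lambda k: a[k])
        # Swap it to the front of the unfixed region, 'fixing' it in place.
        a[i], a[smallest] = a[smallest], a[i]
    return a
```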

#### Heapsort

• Naive heapsort:
• Insert all items into a max heap, and discard input array. Create output array
• Repeat N times:
• Delete largest item from the max heap.
• Put largest item at the end of the unused part of the output array.
• Time complexity:
• $\Theta(N\log N)$ time to put all items in heap
• $\Theta(1)$ time to get largest item, done $N$ times
• $\Theta(N\log N)$ time to do all $N$ removals from heap
• $2\cdot \Theta(N\log N) + \Theta(N) = \Theta(N\log N)$ runtime!
• Memory usage is $\Theta(N)$ to build an additional copy of all the data
• In-place heapsort:
• Convert array into heap via "bottom-up heapification" (make array itself a heap)
• Sink nodes (move them down the heap) in reverse level order (reverse breadth-first-search order: lower rows are sunk first, higher rows last)
• Still $\Theta(N\log N)$ time complexity, but now $\Theta(1)$ space complexity because no additional memory is used
• Bottom-up heapification is $\Theta(N)$ in the worst case
• $\Theta(1)$ space complexity holds only for iterative heap operations; with recursive heap operations we need $\Theta(\log N)$ space to keep track of stack frames (a very minor cost)
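The in-place version can be sketched as below (function names are my own; `sink` is written iteratively, which is what gives the $\Theta(1)$ space bound mentioned above):

```python
def sink(a, i, n):
    """Sink a[i] within the first n slots to restore max-heap order."""
    while 2 * i + 1 < n:          # while node i has at least one child
        child = 2 * i + 1
        if child + 1 < n and a[child + 1] > a[child]:
            child += 1            # pick the larger of the two children
        if a[i] >= a[child]:
            break                 # heap order already holds here
        a[i], a[child] = a[child], a[i]
        i = child

def heapsort(a):
    """In-place heapsort: Theta(N log N) time, Theta(1) extra space."""
    n = len(a)
    # Bottom-up heapification: sink nodes in reverse level order.
    # Leaves are already one-element heaps, so start at the last parent.
    for i in range(n // 2 - 1, -1, -1):
        sink(a, i, n)
    # Repeatedly swap the max (the root) to the end of the unsorted region,
    # then sink the new root within the shrunken heap.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sink(a, 0, end)
    return a
```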

#### Mergesort

• Recursively split the input array into two halves
• These halves are further split until reaching a base case of size one
• Recursively merge the smaller sorted arrays into bigger sorted arrays
• Merge until back at an array the size of the input
• Merge by keeping two pointers, one into each of the two smaller sorted arrays (a size-one array from the base case is trivially sorted)
• We "step" through the two arrays with the pointers, comparing values to construct a larger sorted array: both pointers start at the first element of their respective arrays; the smaller of the two values is placed in the larger to-return array, and the corresponding pointer is advanced
• Done when all elements have been transferred to the larger array
• Time complexity: $\Theta(N\log N)$, space complexity: $\Theta(N)$
• Faster than heapsort in practice!
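The split-then-merge procedure above can be sketched as follows (the function name is my own; this version returns a new array, matching the $\Theta(N)$ space bound):

```python
def merge_sort(a):
    """Mergesort: Theta(N log N) time, Theta(N) extra space."""
    if len(a) <= 1:               # base case: size-one (or empty) is sorted
        return a
    mid = len(a) // 2
    left = merge_sort(a[:mid])    # recursively sort each half
    right = merge_sort(a[mid:])
    # Merge step: two pointers, one stepping through each sorted half.
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    # One half is exhausted; the other's remainder is already sorted.
    out.extend(left[i:])
    out.extend(right[j:])
    return out
```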

#### Insertion Sort

• Naive approach:
• For each element in the input, walk through the current output array and find the element's correct position, and insert it there
• Repeat until all input elements have been inserted into output
• More efficient approach:
• Instead of inserting and needing to move elements over, do in-place swapping
• We have a 'sorted' and an 'unsorted' section of our array; keep track of this division with a pointer (initially at the first element, so everything is unsorted and nothing is sorted)
• Say we have $X$ items in the sorted partition of the array, and a remaining $Y$ items in the unsorted partition to go
• Find the first item currently in the unsorted partition, using the pointer that tracks the division
• Advance this pointer, moving this first unsorted item into the sorted partition. With a second pointer, walk backwards through the array (from the partition index toward the array's start), pairwise swapping the newly inserted item with the elements that were already in the sorted part of the array
• Keep doing these pairwise swaps as long as the item at the second pointer is larger than the newly added item. Each time we swap two neighboring items, we advance the second pointer by decreasing its index by one (moving it closer to the start of the array)
• When the item at the second pointer is smaller than or equal to the newly added item, we are done swapping and have properly inserted the new element into the sorted section
• Repeat until all elements are in the sorted partition (the first pointer reaches the end of the array)
• Time complexity:
• $\Omega(N)$ best case, and $O(N^2)$ worst case
• Insertion sort is better than merge sort and the others for a small $N$, or an input array that is almost sorted
• Almost sorted:
• For an array with only $K$ inversions, insertion sort has a runtime of $\Theta(N+K)$
• An array is almost sorted if the number of inversions $K$ is $\leq cN$ for some constant $c$
• For small $N$ ($N\leq 15$ or so), insertion sort is faster than all the other algorithms
• $\Theta(1)$ space complexity for in-place sort
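The more efficient in-place swapping approach can be sketched as below (the function name is my own; the outer loop index plays the role of the first pointer, `j` the backwards-walking second pointer):

```python
def insertion_sort(a):
    """In-place insertion sort: Theta(1) extra space, and Theta(N + K)
    time for an input array with K inversions."""
    for i in range(1, len(a)):
        # a[:i] is the sorted partition; a[i] is the first unsorted item.
        j = i
        # Walk backwards, swapping while the left neighbor is larger;
        # each swap removes exactly one inversion.
        while j > 0 and a[j - 1] > a[j]:
            a[j - 1], a[j] = a[j], a[j - 1]
            j -= 1
    return a
```

On an already-sorted input the inner loop never runs, giving the $\Omega(N)$ best case; on a reverse-sorted input every pair is an inversion, giving the $O(N^2)$ worst case.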