## Lecture 36 - 04/19

### Sorting V - Radix Sort

• From last time, we showed that a comparison based sorting algorithm can be at best Θ(NlogN)$\Theta(N\log N)$ time, since N$N$ elements have N!$N!$ possible combinations, and we must make logN!$\log N!$ comparisons to determine the correct ordering (which is equivalent to NlogN$N\log N$ as both functions grow at the same rate)
• New idea: sort without comparing elements with each other - rather, use intrinsic properties of the elements themselves
• Naive implementation: sleep sort for integer sorting (bad algorithm):
• For each integer x$x$ in some array A$A$, do the following simultaneously:
• Sleep for x$x$ seconds
• Print out x$x$
• The order in which elements are printed out tells us the correct sorted order
• The runtime of this is Θ(N+max(A))$\Theta(N+\max(A))$
• However this isn't practical, as on real machines scheduling so many threads of the program to simultaneously run affects runtime. Also bad for an array where max(A)$\max(A)$ is very high
• Improvement: Basic Counting Sort:
• Let us have an array of size N$N$, each element of which has a unique key, numbering from 0$0$ to N1$N-1$
• Sort by creating a new array of size N$N$
• Then copy the item with key i$i$ into the ith$i^{th}$ position of the new array
• Runtime is Θ(N)$\Theta(N)$, but this has very limited application - what if we are dealing with non-unique keys/non-consecutive keys/non-numerical keys
• Better Counting Sort:
• There is a simple solution to solving the problems of non-unique keys
• Let our array of size N$N$ have X$X$ different types of keys - each key must take on an element in X$X$, and X$X$ is an "alphabet" for the keys
• Then, before doing any sorting at all, go through the entire array, counting how many occurences of each key (each element of X$X$) occur in the array
• We will use these counts for determining the index at which to copy elements in the second phase of the sort
• For example, let us have an alphabet X={k1,k2,k3}$X=\{k_1, k_2, k_3\}$ for an array of size N=12$N=12$, the counts of which are respectively {4,6,2}$\{4, 6, 2\}$
• Then, we create an array P$P$ of size |X|$|X|$ (in this case 3$3$), to stor,e pointers corresponding to where we will insert an upcoming element of key kk$k_k$. This array is initialized as P={0,4,10}$P=\{0,4,10\}$
• When we come across an element of key Kk$K_k$, we look at it's corresponding index pointer in P$P$, copy at this location, and then increment the pointer in P$P$
• So for example if the first element we find is of key k2$k_2$, we insert at index 4$4$ in the new array, and P$P$ updates to {0,5,10}$\{0,5,10\}$. Next if we find two elements of key k3$k_3$, we insert at indices 10$10$ and 11$11$ respectively, and now have P={0,5,12}$P=\{0,5,12\}$. Continue till we go through all N=12$N=12$ elements of our input
• Note, counting sort will sort in Θ(N+R)$\Theta(N+R)$ where R$R$ is the size of our alphabet
• The sort also uses memory Θ(N+R)$\Theta(N+R)$
• This sort is bad with non-consecutive keys. For example if our max-value key is 1000000$1000000$, then we have an alphabet of size R=1000000$R=1000000$. Sort is good when NR$N\geq R$

• Not all keys belong to finite alphabets (for example, Strings), but all strings consist of characters from a finite alphabet
• Do a counting sort over this finite alphabet starting from the least significant digit (LSD) of an input, all the way to the most significant
• Each counting sort is with respect to only one digit of the input at a time
• For example, if all input elements are of size W+1$W+1$, first do a sort over just the Wth$W^{th}$ (least significant digit) of all the elements, then do the W1th,...$W-1^{th}, ...$ until we do the sort over the 0th$0^{th}$ and most significant digit
• We can deal with elements of non-equal size by having a small character ""$"\cdot"$ fill up all the missing digit positions of all the elements; this makes all elements now of equal size (the small character is smaller than all other characters)
• Example:
• This algorithm has the same memory usage as a counting sort, but has a runtime of Θ(WN+WR)$\Theta(WN+WR)$, where W$W$ is the length of the longest element
• LSD vs Merge Sort:
• Merge Sort better when sorting a very large number of very large, dissimilar strings
• This is in large part due to how LSD sort looks at every single character in all the elements, while merge sort most likely will not need to - within the first few characters it can determine the ordering of elements
• This is an example of W$W$ dominating N$N$
• LSD sort better when sorting a very large number of very large strings that are very similar, except at the lesser significance digits
• This is because in this case Merge Sort will look through almost all the digits in all the elements while trying to determine an ordering, while LSD sort finds an ordering much faster

• We can also sort from the other end, by starting with the first/most significant digits of all the elements, and then working towards the least significant digit
• After sorting by the ith$i^{th}$ digit on an alphabet Xi$X_i$, we must split up the array into |Xi|$|X_i|$ different partitions, each of which contains all instances of a character in Xi$X_i$ found at the ith$i^{th}$ digit depth
• For example:
• Best case: we finish the sort in a single pass, by only needing to look at the top digit: Θ(N+R)$\Theta(N+R)$
• Worst case: we have to look at every character, which degenerates to a cache-heavy LSD sort: Θ(WN+WR)$\Theta(WN+WR)$
• WR$WR$ term because we recurse W$W$ levels deep, and at each level we create an array of size R$R$ for the counting sort at the level