Sorting algorithms

Learning basic sorting algorithms is a bit of a Computer Science 101 class. But many examples out there are either in pesudocode, or languages more suited to large computation (e.x. Python). So I thought I would quickly go over the three basic sorting algorithms, and demonstrate them in C#.

Default Sorting In C#/.NET

So going over these, you’re probably going to be thinking “So which one does C# use?”. By that I mean, if you call Array.Sort(), does it use any of these examples? Well.. the answer is “it depends”. In general when using “Sort()” on a List, Array or Collection it will use :

• If the collection has less than 16 elements, the algorithm “Insertion Sort” will be used (We will talk about this below).
• If the number of partitions exceeds 2 log *array size*, then Heapsort is used.
• Otherwise Quicksort is used.

However this is not always the case, For example when using Linq on a list and calling OrderBy, Quicksort is always used as the underlying implementation.

However, what I’m trying to point out here is that the sorting algorithms outlined below are rarely, if ever used in the real world and are more likely to be used as an interview question. But they are important to understand because often other sorting algorithms are built on top of these more “archaic” algorithms (It also helps to understand jokes in Silicon Valley too!).

Array vs List

I just want to point out something very important in talking about sorting algorithms when it comes to C#. When I first started programming, I couldn’t understand why examples always used arrays in their sample code. Surely since we are in C#, Lists are way cooler! And even though a fixed array is obviously faster than a linked list, do we really need to use arrays even in examples?

Well the thing is, these are all “in place” sorts. That is, we do not create a second object to return the result, store state, or hold “partial” results. When we do things like Insertion Sort below, I’ll give an example of how this might easier be done with a List, but it requires an additional array to be created in memory. In almost all sorting algorithms you’ll find that they work within the data structure given and don’t “clone” or select out items into a new List to return an entirely different object. Once I realized that “sorting” was not simply about “give me the lowest item and I’ll just put it in this new list over here and keep going until I select out all items”, but instead almost about the most efficient way to “juggle” items inside an array, those pseudocode sort algorithms suddenly made sense.

Bubble Sort

So first up we are going to look at Bubblesort. This is essentially worse case scenario for sorting data as it takes many “passes” of single swaps for things to actually sort.

Let’s look at the code :

``````public static void BubbleSort(int[] input)
{
var itemMoved = false;
do
{
itemMoved = false;
for (int i = 0; i < input.Count() - 1; i++)
{
if (input[i] > input[i + 1])
{
var lowerValue = input[i + 1];
input[i + 1] = input[i];
input[i] = lowerValue;
itemMoved = true;
}
}
} while (itemMoved);
}
``````

Now how does BubbleSort work. Starting at index zero, we take an item and the item next in the array and compare them. If they are in the right order, then we do nothing, if they are in the wrong order (e.g. The item lower in the array is actually a higher value than the next element), then we swap these items. Then we continue through each item in the array doing the same thing (Swapping with the next element if it’s higher).

Now since we are only comparing each item with it’s neighbour, each item may only move a single place when it actually needs to move several places. So how does Bubblesort solve this? Well it just runs the entire process all over again. Notice how we have the variable called “itemMoved”. We simply set this to true if we did swap an item and start the scan all over again.

Because we are moving things one at a time, not directly to the right position, and having to multiple passes to get things right, BubbleSort is seen as extremely inefficient.

Insertion Sort

Next up is Insertion Sort. Now while we still check items one by one what we instead do is “insert” the item at the correct index right from the get go. Unlike BubbleSort where we are swapping the item with it’s neighbour, we are instead inserting the item into the correct position given what we have already checked.

I’m actually going to show the code twice. First is what I think is your typical insertion sort :

``````public static void InsertionSort(int[] input)
{
for (int i = 0; i < input.Count(); i++)
{
var item = input[i];
var currentIndex = i;
while (currentIndex > 0 && input[currentIndex - 1] > item)
{
input[currentIndex] = input[currentIndex - 1];
currentIndex--;
}
input[currentIndex] = item;
}
}
``````

So a quick explanation of this code.

We loop through each item in the index and get the value. Then we loop through each item in the indexes *below* the index we started at. If the item has a lower value than the index below them, then we “shift” that item below them up by 1. And check the next item. In a way it’s like a bubble sort because we are comparing the neighbour below them, but if we do swap, then we continue swapping until we get to the end.

If we get to the last index (0), or we hit a new item that has a lower value than our current item, then we “break” and simple insert our current item at that index.

But a simpler way to view the “Insertion” sort algorithm is actually by building a new list to return. For example :

``````public static List<int> InsertionSortNew(this List<int> input)
{
var clonedList = new List<int>(input.Count);
for (int i = 0; i < input.Count; i++)
{
var item = input[i];
var currentIndex = i;
while (currentIndex > 0 && clonedList[currentIndex - 1] > item)
{
currentIndex--;
}
clonedList.Insert(currentIndex, item);
}
return clonedList;
}
``````

So in this example, instead we create a brand new list where we slowly add items to it by inserting them at the correct location. Again, not quite correct as we are doing things like being able to insert items at certain indexes without having to shift the items above it up an index. But really being able to insert the item at a certain index is just sugar that C# takes care of for us.

Again however, generally when we talk about list sorting, we are talking about “inplace” sorting and not trying to cherry pick items out into a new object.

Selection Sort

Selection Sort is actually very very similar to Insertion Sort. The code looks like so :

``````public static void SelectionSort(int[] input)
{
for (var i = 0; i < input.Length; i++)
{
var min = i;
for(var j = i + 1; j < input.Length; j++) {
if(input[min] > input[j])
{
min = j;
}
}
if(min != i)
{
var lowerValue = input[min];
input[min] = input[i];
input[i] = lowerValue;
}
}
}
``````

What we are essentially doing is scanning the index from start to finish. For each index, we scan the rest of the array for an item that is lower (Infact, the lowest) compared to the current item. If we find one, we swap it with the current item. The fact that the current item goes into a position later in the array isn’t too important as eventually, all elements will be checked.

Now, again, this looks more complicated than it should be because of our in-place array criteria. But really what we are doing is scanning by one one, finding the lowest item in the list, putting it into an array, and then continuing with the next highest etc.

Divide and Conquer Sorting

Not featured here are “Divide and Conquer” sorting algorithms, these are things like MergeSort and QuickSort that divide up the work to many smaller sorting operations, and then combine the results at the end. These are generally the sorting algorithms you will find out in the wild, but it’s maybe a little bit past the “basics” of sorting.