Sorting Algorithms: The Bubble Sort

Thursday August 14thAlgorithms, Software Development, Sorting Category

BubblesThis is the first of many posts covering the fascinating topic of sorting.

I chose the Bubble Sort algorithm as the first to cover because of its simplicity. This algorithm tends to be the first sorting algorithm that is taught to students, and hence is a rather apt starting point.

Let’s break it down.

Common Terms

  • Data set - the array or list of items that is to be sorted.
  • n - the number of elements in the data set

The Algorithm

The algorithm consists of a repeated iteration over the elements of the data set. In each iteration, adjacent elements are compared. If those two adjacent items are in the wrong order, they are swapped. The result of each iteration is that the next highest value is placed in the appropriate location in the data set. After repeating the iteration n - 1 times, the entire data set will be sorted.

The Bubble Sort fits in the category of Exchange Sorts due to the manner in which elements are moved inside the data set during the sorting process.

The Example

We start with an unsorted data set of 10 elements which we want to sort in ascending order:

33 98 74 13 55 20 77 45 64 83

The start of the first iteration looks at the first two elements (marked in red):

33 98 74 13 55 20 77 45 64 83

As per the algorithm we compare the two values, swapping them if the first item is bigger than the second. In this case, 98 is bigger than 33, so no swap is required.

Next we look at the two adjacent items, starting at the second item in the list:

33 98 74 13 55 20 77 45 64 83

In this case 98 is greater than 74, so the items must be swapped (marked in blue):

33 74 98 13 55 20 77 45 64 83

We do the same again, this time starting at the third item:

33 74 98 13 55 20 77 45 64 83

Again, we need to swap the items since they’re not in order:

33 74 13 98 55 20 77 45 64 83

We repeat this process until we get to the end of the list (marked in green):

33 74 13 55 20 77 45 64 83 98

As we can see, the result is that the largest number is placed at the end of the list. This is the end of the first iteration.

We then iterate again but only cover the section of numbers that haven’t already been sorted.

33 74 13 55 20 77 45 64 83 98

No swap required, move to the next pair.

33 74 13 55 20 77 45 64 83 98

Swap required:

33 13 74 55 20 77 45 64 83 98

Move to the next pair:

33 13 74 55 20 77 45 64 83 98

Again, swap required:

33 13 55 74 20 77 45 64 83 98

We continue this trend for another iteration before hitting the following case:

33 13 55 20 74 77 45 64 83 98

Here, there is no swap required, so we leave behind the number we’ve been “carrying” (74) and pick up number 77.

33 13 55 20 74 77 45 64 83 98

The result of the iteration is as follows:

33 13 55 20 74 45 64 77 83 98

As we can see, this “bubbles” each of the numbers to the top of the list in the order of highest to lowest. Here is how the rest of the numbers are sorted at the end of each following iteration:

13 33 20 55 45 64 74 77 83 98
13 20 33 45 55 64 74 77 83 98
13 20 33 45 55 64 74 77 83 98
13 20 33 45 55 64 74 77 83 98
13 20 33 45 55 64 74 77 83 98
13 20 33 45 55 64 74 77 83 98
13 20 33 45 55 64 74 77 83 98

Note that the last iteration results in two numbers being locked in due to the fact we no longer have any numbers to sort.

The Implementation

Here’s a commented sample in C# which is easily translatable to many other languages:

/// <summary>
/// This is the basic bubble sort algorithm, hard-coded to work
/// with integer values.
/// </summary>
/// <param name="dataSet">Array of integers to sort.</param>
static void BubbleSortBasic(int[] dataSet)
{
    // loop n-1 times.
    for (int i = dataSet.Length - 1; i > 0 ; --i)
    {
        // for each loop, iterate through the first i
        // items (ie. the unsorted ones)
        for (int j = 0; j < i; ++j)
        {
            // if adjacent items need to be swapped
            if (dataSet[j] > dataSet[j + 1])
            {
                // swap them
                Swap(dataSet, j, j + 1);
            }
        }
    }
}

A general form of this algorithm which applies to any comparable type can be found in the source repository listed below.

A Minor Optimsation

The Bubble Sort can be optimised very slightly, though it’s not guaranteed to provide much benefit depending on the structure of the data set that is to be sorted.

If, for any iteration, there are no items swapped then all of the items in the data set must be in the correct order. As a result, subsequent iterations are unnecessary. The optimised version is listed below, again in C#:

/// <summary>
/// This is the basic bubble sort algorithm, hard-coded to work
/// with integer values but with a slight difference - a minor
/// optimisation.
/// </summary>
/// <param name="dataSet">Array of integers to sort.</param>
static void BubbleSortBasicOptimised(int[] dataSet)
{
    // loop n-1 times.
    for (int i = dataSet.Length - 1; i > 0 ; --i)
    {
        // keep track of whether items were swapped
        // for this iteration
        bool swapped = false;
 
        // for each loop, iterate through the first i
        // items (ie. the unsorted ones)
        for (int j = 0; j < i; ++j)
        {
            // if adjacent items need to be swapped
            if (dataSet[j] > dataSet[j + 1])
            {
                // swap them
                Swap(dataSet, j, j + 1);
 
                // indicate that we found a swap
                swapped = true;
            }
        }
 
        // if nothing was swapped, then we should
        // already have everything in order
        if (!swapped)
        {
            break;
        }
    }
}

Caveat: This “optimisation” may actually result in lower performance, particularly in the case where every iteration results in a swap.

The Complexity

Space Complexity

Bubble Sorts are extremely efficient in terms of memory usage, due to the fact that all sorting and swapping operations are done on the original data set. Regardless of the size of the original data set, the amount of memory overhead is constant as we don’t allocate any memory for each of the items. Hence, the space complexity is O(1) (in Big-O notation). This, of course, doesn’t include the O(n) space complexity taken up by the original data set.

Time Complexity

This is where the Bubble Sort fails to shine. For each iteration i starting at n, we must loop through the previous i - 1 elements and swap if required.

The first iteration loops through n - 1 elements.
The second iteration loops through n - 2 elements.
The third iteration loops through n - 3 elements.
And so on.. which means the number of compares/swaps that we do is equal to:
(n - 1) + (n - 2) + (n - 3) … + 2 + 1
This indicates that there n lots of n - i, or n2 - ni (where ni varies for each iteration).

Give that the highest power of n is 2 this indicates a time complexity of O(n2).

Or in layman’s terms: it’s f**king slow :)

Bubble Sorts should only be used on data sets that are rather small. If you’re dealing with medium-sized or larger sets of data, then the Bubble Sort is not the right algorithm to choose. So what would be a better option? You’ll have to read the rest of this series and answer that yourself ;)

Other Implementations

I’m slowly gathering a collection of implementations of all the sorting algorithms, including the ones listed above, that I’m covering in this series and posting them up on BitBucket. The Bubble Sort implementations can be found here.

Note: You’ll need Mercurial if you want to pull directly from the repository, otherwise you’ll have to copy and paste from the web.

Closing Thoughts

I haven’t included information on multithreading because the post would be insanely big. I will put up a follow-up post covering multithreading another day. The code stored in the BitBucket repository contains a sample of how you might use multithreading in conjunction with this sorting algorithm.

To wrap up, Bubble Sorts are easy to understand and are a great place to start when learning sorting algorithms. Unfortunately, this simplicity results in a fairly expensive and slow sorting implementation which really isn’t an option when dealing with anything other than small data sets.

References and Other Information

  1. Wikipedia - Bubble sort

Next up, we’ll be looking at the Cocktail Sort.

Disclaimer

I’m no expert, nor an authority. The post above is based on my experience and a bit of research. If you find something wrong with anything I’ve said please let me know. Comments and feedback are greatly appreciated.

10 Comments

  1. Keef
    August 14, 2008

    Ah the hello world of algorithms!  Nice article!

    On an unrelated note, I’ve just noticed that not all the text colouring comes through on the feed.  The pre formatted code blocks are fine (though the grey box around them is missing), but the red and blue highlights in the sort run through are missing.

  2. gav
    August 15, 2008

    Nice article OJ. Was interesting enough to read even though I (obviously!) already knew all that. 

    Looking forward to the rest in the series.

  3. OJ
    August 15, 2008

    @Keef: Hrm, bugger. I was trying to make it so that it would actually behave in the feed as well. I’ll take a look. Cheers.

    @gav: Thanks for the comments dude. I realise that most people will know this one, but it was low hanging fruit which I think people are happy to read about because it’s a “refresher”. As time goes by the algorithms will get more interesting, and hence a lot harder to digest ;)

  4. OJ
    August 15, 2008

    @Keef: This is interseting. It appears that some of the class tags for the spans that I was using for formatting have been stripped in the feed. I’m not yet sure if it’s Google Reader, or if the feed has been “adjusted”. I’ll try and fix that up soon though. Cheers!

  5. RobG
    August 15, 2008

    Hey OJ,

    How is this any different from a really good Wikipedia entry on Bubblesort? In fact the entry there is pretty similar. I guess what I’m asking is, why would anyone come here to find out about sorting rather than going to Wikipedia?

    Given Wikipedia’s popularity, how do you plan to differentiate your posts in this series from the general stuff already out there? I know it’s probably quite tough to depict that thinking in such a simple sort, but I was wondering what your plans were on this early on?

    Great read mate - flows nicely!
    Cheers,
    Rob

  6. OJ
    August 15, 2008

    @RobG: It’s a good question. One that I asked myself prior to starting the series.

    In the case of a simple algorithm like the Bubble Sort, the benefit of reading this page vs the Wikipedia entry is probably nothing. Will this remain the case for the other algorithms? No I don’t think so.

    Wikipedia articles, while helpful, tend to be written in a manner which makes it sound like the author is trying to sound smart. That might sound harsh, but that’s how it feels to me. The entries attempt to be too formal, and use a bit too much jargon for my liking. Also, as you get to some of the more complicated algorithms, the entries on Wikipedia aren’t easily grokked, or they’re too high-level to really cover what’s going on.

    My goals:

    1. To cover as many algorithms as I can in detail (Wikipedia is hit and miss in this regard).
    2. Avoid using jargon when it’s not necessary and make it as clear and as readable as possible.
    3. Give working samples of code in multiple languages (both imperative and functional).

    On top of this, I am working on WPF-based visualiser (not sure if I’ll make it an EXE or an XBAP) which will animate the process of each of these sorting algorithms. This will take a bit of time and probably won’t get released until I’ve covered a few algorithms.

    So, with all that in mind, I want it to be a place where people can come and read about all the algorithms, and get full coverage, in laymans terms, with an interactive visualisation. No site that I have found so far has that, including Wikipedia.

    Of course, if nobody reads this stuff it still doesn’t matter. Because I will have learned a lot, and brushed up on my understanding of all these algorithms along the way. So I’m still ahead :)

    Cheers bud!

  7. OJ
    August 15, 2008

    FYI, I’ve just uploaded samples in Haskell. You can find them on the BitBucket repo. Cheers!

  8. OJ
    August 16, 2008

    Another update, I’ve added F# samples as well. Grab the latest from the repo if you’re interested. They’re very similar to Haskell.

    Down the track I’ll add an async sample to the F# code.

  9. Bona
    September 17, 2008

    Please hlep me write a C++ program that sorts person’s name in desecnding order using bubble sorting algorithm.

  10. OJ
    September 18, 2008

    @Bona: The short answer is: no.

    I’ve described the algorithm in a fair bit of detail, and shown sample implementations. It’s a fairly trivial matter to take the information in this article and extend it to sort lists and arrays of strings.

    Do your own homework, you might learn something! :)

Leave a comment

Size

Colors