Big O Visualizer

Documentation

Before we proceed

You should have a basic understanding of what "Big O" means in the context of computer science. If you Google "Big O notation", you will find that there are many resources available online that describe this concept. Below are some resources to help you get started:

The basic idea

The premise of the Big O Visualizer is as follows:

  1. Generate a data set of length n
  2. Execute the algorithm on the data set and keep a count of the operations (O)
  3. Plot the result in a chart where the X-axis represents the length of the data set (n) and the Y-axis represents the operations executed (O)

That's it! If we follow the above steps, we will have created a single point in the chart. By repeating these three steps with different sizes of input, types of data and algorithms we can fill the chart with more points and create interesting visualizations.
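
To make those three steps concrete, here is a minimal, self-contained sketch of the measurement loop in TypeScript. It is illustrative only: the function names and the simple counted bubble sort are stand-ins of my own, not the tool's actual code (which is shown later on this page).

function generateRandomList(n: number): number[] {
  // Step 1: a data set of length n filled with random numbers
  return Array.from({ length: n }, () => Math.floor(Math.random() * n))
}

function countedBubbleSort(array: number[]): number {
  // Step 2: execute the algorithm while counting every loop iteration
  let operations = 0
  let swapped: boolean
  do {
    operations++
    swapped = false
    for (let i = 0; i < array.length - 1; i++) {
      operations++
      if (array[i] > array[i + 1]) {
        const tmp = array[i]
        array[i] = array[i + 1]
        array[i + 1] = tmp
        swapped = true
      }
    }
  } while (swapped)
  return operations
}

// Step 3: each (n, operations) pair becomes one point in the chart
const points = [10, 100, 1000].map(n => ({
  n,
  operations: countedBubbleSort(generateRandomList(n)),
}))
console.log(points)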

An example

Below is an example using the Quick Sort algorithm on a data set filled with random numbers. (On the live page, an interactive chart is rendered here.)

I'll explain what you're looking at:

  • The horizontal axis represents the length of the data set. In this example using Quick Sort, the axis starts with a list of 10 random numbers and goes all the way up to 10,000 random numbers.
  • The vertical axis represents the number of operations the algorithm executed. Because computers can do a lot of work, this axis goes all the way up to a hundred million!
  • The colorful areas in the background of the chart represent the typical time complexities that algorithms conform to. Desirable run times are colored green(ish), whereas undesirable run times are colored red(dish).
  • Each blue dot in the chart represents the outcome of one analysis. So the dot at n = 100 represents the number of operations Quick Sort had to execute in order to sort a list of 100 random numbers. In this case, that should be about 800 operations. As this analysis is performed live within your browser, the numbers will change each time you visit this page. However, the general outcome will always be the same.

So what can we tell from this chart? Since all the dots line up neatly within the linearithmic band, Quick Sort has a runtime of O(n log n) when executed on a data set containing random numbers.

Neat!

If you're interested, the rest of this page provides more details about how this tool works. If not, skip ahead to the demo section for more examples, or try the tool out yourself in the live editor.

Generating data sets

A data set is a list of elements on which the algorithm will perform its operation. For example, a sorting algorithm will put a list of numbers into their natural order. Some algorithms have a time complexity that is influenced by the contents of the data set. For these algorithms, we typically speak in terms of best-case and worst-case time complexity. In order to analyze these different cases, the Big O Visualizer comes with a variety of data set generators. Each generator has its own unique combination of the following characteristics:

  1. Size. The longer the list, the more work an algorithm has to do.
  2. Variance. The difference between the biggest and smallest item in the list.
  3. Uniqueness. The percentage of unique items in the list, where 100 means every item is distinct and 0 means all items have the same value.
  4. Sortedness. The percentage of items that are already at the position they would occupy if the list were sorted by its natural order, where 0 means no items are at their sorted position and 100 means all items are at their sorted position (and thus the list is sorted). A sketch of how these two metrics can be computed follows this list.
  5. Sequenceness. The percentage of items in the list that form a sequence. A sequence is a series of consecutive items that are all ascending, equal or descending.
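
To make these definitions concrete, here is a minimal sketch (my own illustration, not the tool's implementation) of how the uniqueness and sortedness percentages could be computed:

function uniqueness(list: number[]): number {
  // Percentage of distinct values: 100 means no duplicates at all
  return (new Set(list).size / list.length) * 100
}

function sortedness(list: number[]): number {
  // Percentage of items already at the position they would occupy
  // in the naturally ordered list
  const sorted = [...list].sort((a, b) => a - b)
  const inPlace = list.filter((value, index) => value === sorted[index]).length
  return (inPlace / list.length) * 100
}

console.log(uniqueness([0, 1, 2, 3])) // 100: every value is distinct
console.log(sortedness([1, 0, 2, 3])) // 50: only 2 and 3 are in place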

The Big O Visualizer comes with the following built-in data generators. The uniqueness, sortedness, and sequenceness are calculated for a list of 1,000 items.

Name: Random
Description: Pseudorandom list of numbers between 0 and n
Uniqueness: 62% | Sortedness: 0% | Sequenceness: 32%
Example (n=30): [6, 6, 8, 29, 9, 21, 26, 4, 15, 3, 10, 18, 6, 5, 3, 12, 17, 21, 22, 17, 19, 21, 25, 4, 16, 21, 6, 5, 14, 23]

Name: Sorted
Description: Sequence of numbers from 0 to n
Uniqueness: 100% | Sortedness: 100% | Sequenceness: 100%
Example (n=30): [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29]

Name: Half Sorted
Description: First half is sorted and second half is pseudorandom
Uniqueness: 50% | Sortedness: 0% | Sequenceness: 66%
Example (n=30): [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 6, 8, 5, 5, 10, 10, 3, 9, 1, 8, 7, 1, 10, 5, 1]

Name: Sorted Half
Description: First half is pseudorandom and second half is sorted
Uniqueness: 50% | Sortedness: 1% | Sequenceness: 66%
Example (n=30): [13, 3, 4, 11, 6, 7, 1, 6, 1, 13, 6, 10, 11, 2, 2, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]

Name: Semi Sorted
Description: Alternates every 10 numbers between sorted and pseudorandom
Uniqueness: 100% | Sortedness: 2% | Sequenceness: 65%
Example (n=30): [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 21, 20, 22, 14, 15, 20, 18, 5, 17, 21, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29]

Instrumenting algorithms

Once the data set has been prepared, it can be fed to the algorithm. Each algorithm is an implementation of the Algorithm class, which provides a basic framework for analyzable algorithms. Below is the implementation of the Bubble Sort algorithm as an analyzable algorithm. The code is a direct copy from Wikipedia, with one difference: the algorithm has been instrumented so that it keeps track of its operations. This is achieved by adding two calls to the method this.incrementOpCounter(): one in the outer loop and one in the inner loop. These calls increment the operation counter in the base class by one each time the algorithm iterates either of the two loops.

src/algorithms/bubble-sort.ts
class BubbleSort extends Algorithm {

  name = "Bubble Sort"
  timeComplexityBest = Complexities.linear
  timeComplexityAverage = Complexities.quadratic
  timeComplexityWorst = Complexities.quadratic

  execute(array: number[]): void {
    let len = array.length
    let swapped
    do {
      // Count one operation for every iteration of the outer loop
      this.incrementOpCounter()
      swapped = false
      for (let i = 0; i < len; i++) {
        // Count one operation for every iteration of the inner loop
        this.incrementOpCounter()
        if (array[i] > array[i + 1]) {
          const tmp = array[i]
          array[i] = array[i + 1]
          array[i + 1] = tmp
          swapped = true
        }
      }
    } while (swapped)
  }
}
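
Once instrumented, an analysis boils down to executing the algorithm and reading the counter. The snippet below is a hypothetical usage example: the property name operationCount is an assumption of mine, as the actual accessor exposed by the Algorithm base class may be named differently.

const bubbleSort = new BubbleSort()
bubbleSort.execute([5, 1, 4, 2, 8])
console.log(bubbleSort.operationCount) // assumed accessor; actual name may differ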

What to count?

Placing the incrementors at different positions can lead to different results. So where do you put them? That depends on what you want to measure, which in turn depends on your specific use case. In this project, I've focused on code iterations, which means I've (naively) placed the incrementors at the beginning of each loop and recursive function. Unfortunately, this means I am potentially inflating the numbers, as I may be counting more than I should. Fortunately, since we're computer scientists and not mathematicians, we don't have to treat Big O as an exact science. When we're investigating algorithms, we're interested in the approximate complexity, which is why we use common classifications such as O(n²) instead of O(n² + 5 × n log n + 10 × n + 1000). While the latter is more precise, the former gives us an easier understanding of the time complexity of an algorithm.
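
To illustrate how placement changes the totals (a sketch of my own, not the project's code), compare counting every inner-loop iteration with counting only the comparisons that lead to a swap:

function countIterations(array: number[]): number {
  let ops = 0
  for (let i = 0; i < array.length - 1; i++) {
    ops++ // counts every iteration, whether or not any work is done
    if (array[i] > array[i + 1]) { /* swap would happen here */ }
  }
  return ops
}

function countSwaps(array: number[]): number {
  let ops = 0
  for (let i = 0; i < array.length - 1; i++) {
    if (array[i] > array[i + 1]) ops++ // counts only actual swaps
  }
  return ops
}

// Same input, different totals:
console.log(countIterations([1, 3, 2])) // 2
console.log(countSwaps([1, 3, 2]))      // 1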

Moreover, when we calculate the output of these formulas for a large value of n (which is what we're interested in anyway), both formulas produce nearly the same output. For example, n = 10,000 gives us a value of 100,000,000 for the first formula and 100,765,385 for the second, a difference of less than one percent. When using Big O we're generally interested in orders of magnitude. The question we're asking ourselves is not "Will this algorithm take one second or two seconds to complete?", but rather "Will this algorithm take one second or one decade to complete?".
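
Here is a quick sanity check of those numbers, assuming the logarithm is base 2 (the base isn't stated above, so that's an assumption; the exact second figure shifts slightly with the base and rounding):

const n = 10_000
const simple = n ** 2                                          // O(n²): 100,000,000
const detailed = n ** 2 + 5 * n * Math.log2(n) + 10 * n + 1000 // ≈ 100,765,386
console.log((detailed - simple) / simple)                      // ≈ 0.0077, i.e. less than 1%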

A word about performance

Please note that while time complexity is an important performance aspect of an algorithm, it's not the only one. There are many other factors that determine the actual performance of an algorithm, such as:

  • Cost of functions such as comparing, swapping, allocating, shifting and whatever else an algorithm does to get its job done;
  • Cost of recursion for recursive algorithms;
  • Characteristics of the data set, such as its size, its datatype and whether it's already partially sorted;
  • The language and framework in which the algorithm was implemented;
  • The hardware the algorithm runs on;
  • Many other (often invisible) aspects related to the environment such as the operating system, the hypervisor if present, CPU contention in shared environments, automatic optimizations such as speculative execution, multi-core versus hyper-threading, anti-virus software and so on.

You've finished this section.