Skip to content### Before we proceed

### The basic idea

### An example

### Generating data sets

### Instrumenting algorithms

### What to count?

### A word about performance

You should have a basic understanding of what "Big O" in the context of computer science means. If you Google "Big O notation", you will find that there *many* resources available online that describe this concept. Below are some resources to help you get started:

- Video by Gayle Laakmann McDowell - She's the author of the excellent book "Cracking The Code Interview" and does a good job explaining the concept in just under ten minutes.
- Big-O Cheat Sheet - Solid overview of common data structures and sorting algorithms.
- Table of common time complexities - Extensive overview of common time complexities.
- What is a plain English explanation of “Big O” notation? - Good Stack Overflow Q&A on the topic.
- Analysis of algorithms - Wikipedia article on the analysis of algorithms which also covers Big O. Quite theoretical. Moreover, do
*not*read the official Wikipedia article on Big O as it describes the concept from its mathematical origin, which goes far deeper than is required in our computing science context.

The premise of the Big O Visualizer is as follows:

- Generate a data set of length n
- Execute the algorithm on the data set and keep a count of the operations (O)
- Plot the result in a chart where the X-axis represents the length of the data set (n) and the Y-axis represents the operations executed (O)

That's it! If we follow the above steps, we will have created a single point in the chart. By repeating these three steps with different sizes of input, types of data and algorithms we can fill the chart with more points and create interesting visualizations.

Below is an example using the Quick Sort algorithm on a data set filled with random numbers.

I'll explain what you're looking at:

- The horizontal axis represents the length of the data set. In this example using Quick Sort the axis starts with a list of 10 random numbers and goes up all the way to 10.000 random numbers.
- The vertical axis represents the amount of operations the algorithm executed. Because computers can do
*a lot*of work this axis goes all the way up to a hundred million! - The colorful areas on the background of the chart represent the typical time complexities that algorithms conform to. Desirable run times are colored green(ish), whereas undesirable run times are colored red(dish).
- Each blue dot in the chart represents the outcome of one analysis. So the dot at n = 100 represents the amount of operations Quick Sort had to execute, in order to sort a list of a 100 random numbers. In this case this should be about 800 operations. As this analysis is performed live within your browser, the numbers will change each time you visit this page. However, the general outcome will always be the same.

So what can we tell from this chart? As all the dots line up neatly with the linearithmic complexity, it means that Quick Sort has a runtime of `O(n log n)`

when executed on a data set containing random numbers.

Neat!

The rest of this page will provide more details about how this tool works, if you're interested. If not, skip ahead to the demo section for more examples or you can try the tool out yourself in the live editor.

A data set is a list of elements on which the algorithm will perform its operation. For example, a sorting algorithm will put a list of numbers into its natural order. Some algorithms have a time complexity that is influenced by the data set. For these algorithms, we typically speak in terms of *best-case* and *worst-case* time complexity. In order to analyze these different cases, the Big O Visualizer comes with a variety of data set generators. Each generator has its own unique combination of the following characteristics:

*Size*. The longer the list, the more work an algorithm has to do.*Variance*. The difference between the biggest and smallest item in the list.*Uniqueness*. The percentage of duplicate items in the list, where 0 means no duplicates and 100 means all items are the same value.*Sortedness*. The percentage of items that are at the same position in the list as when the list would be sorted by its natural order, where 0 means no items are at their sorted position and 100 means all items are at their sorted position (and thus the list is sorted).*Sequenceness*. The percentage of items in the list that form a sequence. A sequence is a series of consecutive items that are all ascending, equal or descending.

The Big O Visualizer comes with the following built-in data generators. The uniqueness, sortedness and sequenceness is calculated for a list of a 1000 items.

Name

Random

Description

Pseudorandom list of numbers between 0 and n

Uniqueness

62%

Sortedness

0%

Sequenceness

32%

Example

`(n=30)`

`[6, 6, 8, 29, 9, 21, 26, 4, 15, 3, 10, 18, 6, 5, 3, 12, 17, 21, 22, 17, 19, 21, 25, 4, 16, 21, 6, 5, 14, 23]`

Name

Sorted

Description

Sequence of numbers from 0 to n

Uniqueness

100%

Sortedness

100%

Sequenceness

100%

Example

`(n=30)`

`[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29]`

Name

Half Sorted

Description

First half is sorted and second half is pseudorandom

Uniqueness

50%

Sortedness

0%

Sequenceness

66%

Example

`(n=30)`

`[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 6, 8, 5, 5, 10, 10, 3, 9, 1, 8, 7, 1, 10, 5, 1]`

Name

Sorted Half

Description

First half is pseudorandom and second half is sorted

Uniqueness

50%

Sortedness

1%

Sequenceness

66%

Example

`(n=30)`

`[13, 3, 4, 11, 6, 7, 1, 6, 1, 13, 6, 10, 11, 2, 2, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]`

Name

Semi Sorted

Description

Alternates every 10 numbers between sorted and pseudorandom

Uniqueness

100%

Sortedness

2%

Sequenceness

65%

Example

`(n=30)`

`[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 21, 20, 22, 14, 15, 20, 18, 5, 17, 21, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29]`

Once the data set has been prepared it can be fed to the algorithm. Each algorithm is an implementation of the `Algorithm`

class, which provides a basic framework for analyzable algorithms. Below is the implementation of the Bubble Sort algorithm as an analyzable algorithm. The code is a direct copy from Wikipedia, but with one difference: the algorithm has been instrumented so that it keeps track of its operations. This is achieved by adding two calls to the method `this.incrementOpCounter()`

. One to the outer loop and one to the inner loop. These calls increment the number `operations`

in the base class by one each time the algorithm iterates any of the two loops.

src/algorithms/bubble-sort.ts

`1class BubbleSort extends Algorithm {23 name = "Bubble Sort"4 timeComplexityBest = Complexities.linear5 timeComplexityAverage = Complexities.quadratic6 timeComplexityWorst = Complexities.quadratic78 execute(array:number[]): void {9 let len = array.length,10 swapped11 do {12 this.incrementOpCounter()13 swapped = false14 for (let i = 0; i < len; i++) {15 this.incrementOpCounter()16 if (array[i] > array[i + 1]) {17 let tmp = array[i]18 array[i] = array[i + 1]19 array[i + 1] = tmp20 swapped = true21 }22 }23 } while (swapped)24 }25}`

Placing the incrementors at different positions can lead to different results. So where do you put them? That actually depends on *what* you want to measure, which in turn depends on your specific use case. In this project, I've focused on code iterations, which means I've (naively) placed the incrementors at the beginning of each loop and recursive function. Unfortunately, this means I am potentially inflating the numbers as I may be counting more than I should. Fortunately, since we're computer scientists and not mathematicians, we don't have to treat Big O as exact science. This means that when we're investigating algorithms, we're interested in the *approximate* complexity, which is why we use common classifications such as `O(n²)`

instead of `O(n² + 5 x n log n + 10 x n + 1000)`

. While the latter is more accurate, the prior gives us an easier understanding of the time complexity of an algorithm.

Moreover, when we calculate the output of these formulae with a large value for n (which is what we're interested in anyway), you'll see that both formulas produce nearly the same output. For example, `n = 10000`

gives us a value of `100000000`

for the first formula and `100765385`

for the second formula, which is less than one percent difference. When using Big O we're generally interested in order of magnitudes. The question we're asking ourselves is not "Will this algorithm take one second or two seconds to complete?", but instead "Will this algorithm take one second or one *decade* to complete?".

Please note that while time complexity is an important performance aspect of an algorithm, it's not the only one. There are many other factors that determine the actual performance of an algorithm, such as:

- Cost of functions such as comparing, swapping, allocating, shifting and whatever else an algorithm does to get its job done;
- Cost of recursion for recursive algorithms;
- Characteristics of the data set, such as its size, its datatype and whether it's already partially sorted;
- The language and framework in which the algorithm was implemented;
- The hardware the algorithm runs on;
- Many other (often invisible) aspects related to the environment such as the operating system, the hypervisor if present, CPU contention in shared environments, automatic optimizations such as speculative execution, multi-core versus hyper-threading, anti-virus software and so on.

You've finished this section.

Cool, now show me some examples | I want to try this for myself