The Big O

Algorithm analysis provides a framework for measuring the efficiency of algorithms. Because performance depends on machine, memory, language and implementation-specific issues, a deep scientific analysis that takes all of them into account is impractical. Yet a rough estimate, based on the most significant elements affecting performance, is important when deciding which algorithm to use. The O notation, a.k.a. Big-Oh or asymptotic notation, effectively eliminates the less significant elements of the equation. The objective is to produce a simplified formula that acts as an upper bound: beyond some input size, the actual algorithm performance curve stays less than or equal to a constant multiple of it.
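
To see why the less significant elements can be dropped, here is a minimal sketch in Python. The cost formula and its coefficients (3, 5 and 7) are made up purely for illustration; the point is only that the squared term accounts for nearly all of the total as N grows.

```python
def cost(n, x=3, y=5, z=7):
    """Hypothetical cost formula x*n^2 + y*n + z (coefficients are invented)."""
    return x * n * n + y * n + z


def dominant_share(n, x=3, y=5, z=7):
    """Fraction of the total cost contributed by the n^2 term alone."""
    return (x * n * n) / cost(n, x, y, z)


# Print how much of the total the squared term explains at each size.
for n in (1, 10, 100, 1000, 10000):
    print(f"N={n:>6}  total={cost(n):>12}  n^2 share={dominant_share(n):.4f}")
```

The share climbs toward 1 as N increases, which is the sense in which the remaining terms are "less significant".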

A formula that looks like f(n) = XN^2 + YN + Z reduces to O(N^2). It’s intuitively easy enough to understand that the N^2 (N squared) part of the equation is the most significant factor: as N grows, removing any other element has less impact on the curve than removing that one. There are other variants of the Big O, by the way - Big Omega, Big Theta, and so on - each of which bounds the growth rate in a slightly different way (Big Omega from below, Big Theta from both sides). Based on our quick explanation, you can probably guess why Big O has its name. The O stands for Order. So the Bubble sort, which includes a pair of nested loops, is said to perform on the order of N squared, where N stands for the number of elements being sorted.
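
The nested loops of the Bubble sort can be sketched as follows. This is one plain textbook variant written for illustration, not a tuned implementation; the function name and the comparison counter are additions made here so the growth can be observed.

```python
def bubble_sort(items):
    """Sort a copy of items with a plain bubble sort.

    Returns (sorted_list, comparisons).
    """
    data = list(items)
    comparisons = 0
    n = len(data)
    for i in range(n - 1):            # outer loop: one pass per element
        for j in range(n - 1 - i):    # inner loop: compare adjacent pairs
            comparisons += 1
            if data[j] > data[j + 1]:
                data[j], data[j + 1] = data[j + 1], data[j]
    return data, comparisons


# Count comparisons on reversed input of increasing size.
for n in (10, 100, 1000):
    _, count = bubble_sort(range(n, 0, -1))
    print(f"N={n:>5}  comparisons={count:>7}  N^2={n * n:>8}")
```

This variant always performs N(N-1)/2 comparisons, roughly half of N^2. The constant factor of one half and the lower-order term are exactly what the O notation discards, leaving O(N^2).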

To make things easier to compare, algorithms are usually grouped into complexity classes. The following table shows the most common classes. Notice that the general purpose sorting algorithms fall between Linear and Quadratic, typically O(N log N) or O(N^2). Because running time grows so much faster in the slower classes, it is not wise to use the slower algorithms on large sets of data. It’s also a good idea to understand enough about these factors to predict performance in your applications, whether these factors apply to sorting or other algorithms you may develop or use.

Name	Notation
Constant	O(1)
Logarithmic	O(log N)
Linear	O(N)
N log N	O(N log N)
Quadratic	O(N^2)
Cubic	O(N^3)
Exponential	O(2^N)
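
To get a feel for how far apart the classes in the table are, the following sketch evaluates each one at a few input sizes. It assumes base-2 logarithms (the O notation itself does not fix a base), and the values are abstract operation counts, not timings.

```python
import math

# Complexity classes from the table, each as a function of N.
CLASSES = [
    ("Constant", lambda n: 1),
    ("Logarithmic", lambda n: math.log2(n)),
    ("Linear", lambda n: n),
    ("N log N", lambda n: n * math.log2(n)),
    ("Quadratic", lambda n: n ** 2),
    ("Cubic", lambda n: n ** 3),
    ("Exponential", lambda n: 2 ** n),
]


def growth_row(n):
    """Return (class name, approximate operation count) pairs for input size n."""
    return [(name, f(n)) for name, f in CLASSES]


for n in (8, 16, 32, 64):
    print(f"N = {n}")
    for name, value in growth_row(n):
        print(f"  {name:<12} {value:,.0f}")
```

Doubling N barely moves the logarithmic row, quadruples the quadratic row, and squares the exponential row.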

Keep in mind that O notation is an indicator that cannot and does not try to account for memory usage, resource constraints, machine speed or other elements, so your mileage will tend to vary widely. Obviously, a faster machine will typically run the algorithm more quickly, but certain algorithms perform better in constrained memory environments (like Heap sort, which sorts in place and so consumes less memory), while others might be better in slower memory conditions (Merge sort, for example, can be applied to disk-based sorting). Finally, the order of the data can affect performance, especially in sorting. Some algorithms move less data than others, and the O notation cannot account for those conditions.
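
The effect of data order can be illustrated with a small sketch. It assumes a common Bubble sort variant that stops as soon as a pass makes no swaps; that early exit, the function name and the counters are choices made here for illustration.

```python
def bubble_sort_early_exit(items):
    """Bubble sort that stops when a full pass makes no swaps.

    Returns (sorted_list, comparisons, swaps).
    """
    data = list(items)
    comparisons = swaps = 0
    n = len(data)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):
            comparisons += 1
            if data[j] > data[j + 1]:
                data[j], data[j + 1] = data[j + 1], data[j]
                swaps += 1
                swapped = True
        if not swapped:   # a pass with no swaps means the data is in order
            break
    return data, comparisons, swaps


# Same algorithm, same N, very different amounts of work.
n = 1000
for label, values in (("sorted", range(n)), ("reversed", range(n, 0, -1))):
    _, comparisons, swaps = bubble_sort_early_exit(values)
    print(f"{label:<9} comparisons={comparisons:>7}  swaps={swaps:>7}")
```

Both runs belong to an algorithm classed as O(N^2), yet the already-sorted input finishes after a single pass with no data moved, while the reversed input pays the full quadratic cost.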

Claude Duguay

Enterprises, Inc.
