Introduction to Complexity Theory/Big O Algorithm Analysis

Big O notation, whilst not itself part of complexity theory, is used to describe an upper bound on the time and space usage of an algorithm. In this notation, $$n$$ refers to the size of the input to the algorithm. When we write that a given algorithm runs "in big O of" a mathematical expression, we mean that the time (or space) the algorithm needs to finish executing grows no faster than that expression as a function of the size of the input.

Understanding big O, and how to describe an algorithm in terms of it, is necessary for understanding the analysis of the relative complexity of any problem.

Big O Notation
The analysis of algorithms is a major task in computer science. In order to compare algorithms, we must have some criterion by which to measure the efficiency of an algorithm. If an algorithm's running time does not depend on the size of its input at all, it is said to be $$O\left(1\right)$$, or constant time. If it instead is $$O\left(n\right)$$, then the time the algorithm runs grows linearly with the size of the input. On a deterministic Turing machine, one such algorithm is the linear search algorithm: it accepts an input consisting of a value and a list of data, and looks through that list sequentially until it finds the specified value, at which point it returns the key and halts; if it does not find the value, it returns a value which could not be a key in order to signify that the value is not in the list.
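A minimal sketch of such a linear search in Python (the function name and the $$-1$$ sentinel for "not found" are illustrative choices, not part of the text):

```python
def linear_search(data, value):
    """Scan the list sequentially; return the index (key) of the first
    match, or -1 (a value which could not be a key) if not found."""
    for index, item in enumerate(data):
        if item == value:
            return index
    return -1

# In the worst case every element is examined once, so the running
# time grows linearly with len(data): O(n).
```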

An algorithm that runs at $$O\left(\log\left(n\right)\right)$$ has the time grow logarithmically to the size of the input. An example of an algorithm that runs in $$O\left(\log\left(n\right)\right)$$ is the binary search algorithm.

$$O\left({n \log\left(n\right)}\right)$$ time on a deterministic Turing machine is the lower bound for all comparison-based sorting algorithms, such as merge sort.
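Merge sort can be sketched as follows (a minimal recursive Python version; the function name is illustrative):

```python
def merge_sort(items):
    """Recursively split the list in half, sort each half, and merge.
    The list is halved O(log n) times and each level of recursion does
    O(n) comparison work in total, giving the O(n log n) bound."""
    if len(items) <= 1:
        return items
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])
    # Merge the two sorted halves, always taking the smaller front element.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged
```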

How to Determine Big O
In order to determine the big O of an algorithm, we look at how the size of the input affects the time that it takes the algorithm to execute. A simple brute-force way to do this is to plot a graph of input size vs. execution time for a given algorithm and fit an equation over the data. Another way to accomplish this is to think through the algorithm execution logically.

When the expression for execution time in terms of input size is a polynomial, the largest term in the polynomial is all that is used and constant factors in that term are dropped. For example, $$O\left(5n^{3} + 2n^{2} - 12n\right)$$ would simply be written as $$O\left(n^{3}\right)$$ because the two functions grow at the same asymptotic rate.
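A quick numerical check illustrates why the lower-order terms are dropped (a Python sketch; the function name `f` is illustrative): the ratio of the full polynomial to $$n^{3}$$ approaches the constant factor $$5$$, which big O discards as well.

```python
def f(n):
    # The full polynomial from the example above.
    return 5 * n**3 + 2 * n**2 - 12 * n

# As n grows, the lower-order terms become negligible and f(n) / n^3
# approaches the constant 5, so f grows at the same asymptotic rate as n^3.
for n in (10, 1_000, 1_000_000):
    print(n, f(n) / n**3)
```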

Similarly, the base of a logarithm does not matter when discussing big O. The expression $$O\left(\log_{10}\left(n\right)\right)$$ could, through the change of base formula, be rewritten as $$O\left(\log_2\left(n\right) / \log_2\left(10\right)\right)$$. The constant factor $$1 / \log_2\left(10\right)$$ could then be dropped. Because of this, all logarithmic bases are equivalent in big O notation and are omitted in practice.
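The change-of-base identity can be verified numerically (a Python sketch using the standard math module):

```python
import math

n = 1000.0
# Change of base: log10(n) equals log2(n) divided by log2(10), so the
# two logarithms differ only by the constant factor 1 / log2(10),
# which big O notation drops.
assert math.isclose(math.log10(n), math.log2(n) / math.log2(10))
```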

Analysis of Linear Search
Determining the big O of a particular algorithm is called algorithm analysis. Take, for example, the simple linear search algorithm that we mentioned previously, and apply it to the value $$3$$ and the ordered set $$S=\left\{0,1,2,3,4,5\right\}$$.

The algorithm starts from the first index of the set, comparing $$3$$ against $$0$$, and then looks at each subsequent value until it finds the value $$3$$, thus making four comparisons.

But what about all of the other infinitely many integers that we could have looked for? Let's try the value $$6$$. The algorithm checks $$6$$ against $$0$$, $$1$$, $$2$$, $$3$$, $$4$$, and $$5$$ in turn, then halts without finding the value, thus making six comparisons.

From this we can conclude that, in the worst case, the algorithm makes a number of comparisons equal to the input size, $$n$$, and thus is $$O\left(n\right)$$.
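The comparison counts from the worked example can be reproduced with an instrumented linear search (a Python sketch; the function name and the `(index, comparisons)` return convention are illustrative):

```python
def linear_search_comparisons(data, value):
    """Linear search that also counts the comparisons it makes."""
    comparisons = 0
    for index, item in enumerate(data):
        comparisons += 1
        if item == value:
            return index, comparisons
    return -1, comparisons

S = [0, 1, 2, 3, 4, 5]
assert linear_search_comparisons(S, 3) == (3, 4)   # four comparisons, found
assert linear_search_comparisons(S, 6) == (-1, 6)  # six comparisons, not found
```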

Analysis of Binary Search
The binary search algorithm is more efficient than the previously mentioned linear search algorithm, but harder to analyse.

Let's take the same set as last time, $$S=\left\{0,1,2,3,4,5\right\}$$. This time we will search for the value $$1$$. The binary search algorithm divides the set into two equally-sized subsets (or almost equally-sized, if the set has an odd number of elements) and determines which subset to search further by comparing the target to the value just left of the centre (even-sized set) or to the centre value (odd-sized set). In our case, the algorithm divides the set into the subsets $$S_{1}=\left\{0,1,2\right\}$$ and $$S_{2}=\left\{3,4,5\right\}$$. The centre of $$S$$ lies between the end of $$S_{1}$$ and the start of $$S_{2}$$, so we compare our value to the last element of $$S_{1}$$, which is $$2$$. Since $$1 < 2$$, if $$1$$ exists in our ordered set $$S$$, it must be contained in $$S_{1}$$. We then compare $$1$$ to the centre of $$S_{1}$$, find that it is equal, and conclude that $$1$$ exists in $$S$$.

That was only two comparisons, but what if we instead search for a value which does not exist in $$S$$, such as $$6$$? Following the scheme above, we would compare $$6$$ to $$2$$, then to $$4$$, and finally to $$5$$, without finding it. This would give us a total of 3 comparisons. But what does it all mean? Let's look at the maximum number of comparisons needed for different input sizes:

Input size ($$n$$)    Maximum comparisons
1                     1
2                     2
4                     3
6                     3
8                     4
16                    5

The pattern shown here is that the greatest number of comparisons for an input of size $$n$$ is $$\left\lfloor\log_{2}\left(n\right)\right\rfloor + 1$$, that is, one more than the floor of the binary logarithm of the input size. Since the constant term and the base of the logarithm are irrelevant for very large input sizes, we get that the binary search algorithm is $$O\left(\log\left(n\right)\right)$$.
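These worst-case comparison counts can be checked empirically with a comparison-counting binary search (a Python sketch of the standard variant that probes the middle element directly rather than the element left of centre; it yields the same worst-case counts, with each probe counted as one comparison):

```python
import math

def binary_search_comparisons(data, value):
    """Iterative binary search over a sorted list; returns a pair
    (found, comparisons), counting each probe of a middle element
    as a single comparison."""
    lo, hi = 0, len(data) - 1
    comparisons = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        comparisons += 1
        if data[mid] == value:
            return True, comparisons
        elif data[mid] < value:
            lo = mid + 1
        else:
            hi = mid - 1
    return False, comparisons

# The worst case over a sorted list of n elements is
# floor(log2(n)) + 1 probes, i.e. ceil(log2(n + 1)).
for n in (1, 2, 4, 6, 8, 16):
    data = list(range(n))
    worst = max(binary_search_comparisons(data, v)[1] for v in range(-1, n + 1))
    assert worst == math.floor(math.log2(n)) + 1
```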

Activities

 * Take the quiz.
 * Write a paper detailing your analysis of the big O complexity of the bubble sort algorithm.