This is the second part of a series. Go back to part 1 if you haven't read it already.
In a fast-paced interview, running time analysis is mostly about recognizing the common running times of basic operations and combining them, since you won't have time to derive the running time of each subcomponent of your algorithm. For example, say you had an algorithm that first sorted an array and then made a constant number of passes through the sorted array. You should point out to your interviewer that the sorting took O(n log n), assuming mergesort or heapsort, and that the constant number of passes took O(n), which is dwarfed by O(n log n) for large n. Therefore the whole algorithm is O(n log n). This is the kind of thing you have to learn to recognize. As we go over common algorithms and algorithms for specific real interview questions, I will point out the running times of the various parts.

Remember that O(1) constant < O(log n) logarithmic < O(n) linear < O(n log n) quasilinear < O(n^2) quadratic < O(n^3) cubic < ... < O(2^n) exponential < ... < O(n!) factorial. So if your algorithm consists of operations in multiple Big-Theta complexity classes that run one after another, the largest one dwarfs all the smaller ones for large n, and the running time of the whole algorithm is whichever is the largest. So if an algorithm has an O(n^2) part, an O(n) part, and an O(n log n) part, then the whole thing is O(n^2). If you find your algorithm doing an O(1) operation n times, then it will be O(n) altogether.

For 80% of interviews, I think it is sufficient to recognize the running times of the common routines you use in your algorithm (e.g., comparison-based sorting, iterating through the array, looking up a hash, traversing a tree) and to be able to put them together via the above rule. What follows is a treatment of some of the deeper math you can learn to deal with all kinds of running time analysis. It is very much optional. It may even go over the head of your interviewer if they have forgotten their college Algorithms class, but I keep it here for personal edification.
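To make the "sort, then a constant number of passes" pattern concrete, here is a minimal sketch in Python. The function name smallest_gap and the problem it solves are my own illustration, not something from a particular interview question.

```python
def smallest_gap(values):
    """Return the smallest difference between any two elements of values."""
    if len(values) < 2:
        raise ValueError("need at least two values")

    # O(n log n): comparison-based sort.
    ordered = sorted(values)

    # O(n): a single pass over neighbouring pairs of the sorted list.
    best = ordered[1] - ordered[0]
    for left, right in zip(ordered, ordered[1:]):
        best = min(best, right - left)

    # O(n log n) + O(n) = O(n log n): the sort dominates.
    return best
```

For example, smallest_gap([10, 3, 7, 1]) sorts the input to [1, 3, 7, 10] and then finds the gap of 2 between 1 and 3 in one pass.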
What interviewers call Big-O is actually Big-Theta notation, the notation for the asymptotically tight bound on the actual growth function of an algorithm. The precise distinction will become important later. Let f(n) be the actual growth function of an algorithm. Formally, f(n) = Theta(g(n)) if there exist positive constants c_1, c_2, and n_0 such that 0 <= c_1*g(n) <= f(n) <= c_2*g(n) for all n >= n_0. In practice, Big-Theta means the complexity of an algorithm ignoring any constant coefficients and lower-order terms. The real Big-O notation is only the asymptotic upper bound part of that. There is also the complementary Big-Omega notation that refers to the asymptotic lower bound. Big-Theta notation combines the two. Note that since Big-O is an upper bound, really any upper bound, the function f(n) = n is contained in O(n), and both f(n) = n and f_1(n) = n^2 are contained in O(n^2). Again, these are the precise definitions of asymptotic notation, but unfortunately interviews abuse Big-O to mean an asymptotically tight bound, i.e. Big-Theta. From this point on, we will also abuse the O(g(n)) notation to denote Theta(g(n)), except when we need to make a meaningful distinction.
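The sandwich in the Big-Theta definition can be spot-checked numerically. The sketch below is only an illustration: the helper is_sandwiched and the example function 3n^2 + 5n are my own choices, and checking a finite range of n is a sanity check, not a proof.

```python
def is_sandwiched(f, g, c1, c2, n0, n_max=10_000):
    """Check c1*g(n) <= f(n) <= c2*g(n) for every integer n in [n0, n_max]."""
    return all(c1 * g(n) <= f(n) <= c2 * g(n) for n in range(n0, n_max + 1))


def f(n):
    # Hypothetical growth function of some algorithm.
    return 3 * n * n + 5 * n


def quadratic(n):
    return n * n


def cubic(n):
    return n * n * n


# 3n^2 <= 3n^2 + 5n <= 4n^2 holds once n >= 5, so f(n) is Theta(n^2).
tight = is_sandwiched(f, quadratic, c1=3, c2=4, n0=5)

# n^3 is a valid upper bound (Big-O), but no positive lower constant
# survives for large n, so f(n) is O(n^3) without being Theta(n^3).
loose = is_sandwiched(f, cubic, c1=0.01, c2=4, n0=5)
```

Here tight comes out True and loose comes out False, which mirrors the difference between a tight bound and a mere upper bound.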
Nontrivial complexity comes from using loops and recursion. In both cases, you need to derive a recurrence equation to characterize the complexity.
For mergesort, the recurrence is the following: T(n) = 2*T(n/2) + O(n), with T(1) = O(1). As with all recursive algorithms, merge sort consists of a base case and an inductive case. The base case handles the leaves of the recursion tree, the simplest steps that don't need any further recursive call to the function. In merge sort, that would be the case when the input is one element. A subarray of one element is of course already sorted so we have a fairly trivial base case. The running time of the base case is O(1). The inductive case is where we make the two recursive calls to mergesort each at a cost of T(n/2) and do an O(n) amount of work merging the two sorted subarrays together, so altogether T(n) = 2*T(n/2) + O(n).
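Here is a minimal merge sort sketch in Python with each term of the recurrence marked in the comments. It returns a new list rather than sorting in place, which is a simplification I chose for readability.

```python
def merge_sort(items):
    """Return a new sorted list containing the elements of items."""
    # Base case: zero or one element is already sorted. Cost: O(1).
    if len(items) <= 1:
        return list(items)

    # Inductive case: two recursive calls, each on half the input.
    # Cost: 2 * T(n/2). The slicing is O(n) and folds into the merge cost.
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])

    # Merge the two sorted halves. Cost: O(n).
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged
```

Adding the pieces gives T(n) = 2*T(n/2) + O(n), the same recurrence as above.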
Guess and check (Substitution)
Recursion Tree Method
We went over this in the past session. The idea is to construct a two-column table. One column is the iteration number or recursive call depth, and the other is the amount of work to be done at each iteration. The work column can be a straight column or a tree, depending on the number of recursive calls you make.
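As a sketch of that two-column table, the snippet below builds the (depth, work) rows for the mergesort recurrence. It assumes n is a power of two and charges exactly n units for the O(n) term and 1 unit per leaf; both are simplifications of mine.

```python
def recursion_tree_rows(n):
    """Return (depth, work) rows for T(n) = 2*T(n/2) + n with T(1) = 1.

    Assumes n is a power of two.
    """
    rows = []
    depth, subproblems, size = 0, 1, n
    while size >= 1:
        # Each level has `subproblems` calls, each doing `size` units of work.
        rows.append((depth, subproblems * size))
        depth += 1
        subproblems *= 2
        size //= 2
    return rows


rows = recursion_tree_rows(8)
total_work = sum(work for _, work in rows)
```

For n = 8 there are log2(8) + 1 = 4 levels with 8 units of work on each, so the total is 32 = n * (log2(n) + 1), which is where the n log n comes from.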
This is where we need to distinguish between Big-O and Big-Theta. The Master Theorem says that for recurrences of the form T(n) = a*T(n/b) + f(n), where a >= 1 and b > 1 are constants, the following holds:
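- If f(n) = O(n^(log_b(a) - epsilon)) for some constant epsilon > 0, then T(n) = Theta(n^(log_b a)).
- If f(n) = Theta(n^(log_b a)), then T(n) = Theta(n^(log_b a) * log n).
- If f(n) = Omega(n^(log_b(a) + epsilon)) for some constant epsilon > 0, and if a*f(n/b) <= c*f(n) for some constant c < 1 and all sufficiently large n, then T(n) = Theta(f(n)).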
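The three cases of the Master Theorem can be turned into a small classifier for the common situation where f(n) = Theta(n^k). This is only a sketch under that assumption: the function name master_theorem and its string output are my own, and it does not handle f(n) with logarithmic factors.

```python
import math


def _power(exponent):
    """Format n^exponent as text, simplifying n^0 and n^1."""
    if math.isclose(exponent, 0.0, abs_tol=1e-9):
        return "1"
    if math.isclose(exponent, 1.0):
        return "n"
    return f"n^{exponent:g}"


def master_theorem(a, b, k):
    """Classify T(n) = a*T(n/b) + Theta(n^k); return (case, solution)."""
    if a < 1 or b <= 1 or k < 0:
        raise ValueError("requires a >= 1, b > 1, and k >= 0")

    critical = math.log(a, b)  # the exponent log_b(a)

    # Case 2: f(n) grows at the same rate as n^(log_b a).
    if math.isclose(k, critical, abs_tol=1e-9):
        growth = _power(critical)
        if growth == "1":
            return 2, "Theta(log n)"
        return 2, f"Theta({growth} log n)"

    # Case 1: f(n) is polynomially smaller, so the leaves dominate.
    if k < critical:
        return 1, f"Theta({_power(critical)})"

    # Case 3: f(n) is polynomially larger, so the root dominates.
    # For f(n) = n^k the regularity condition a*f(n/b) <= c*f(n) holds
    # with c = a / b^k, which is below 1 whenever k > log_b(a).
    return 3, f"Theta({_power(k)})"
```

For example, master_theorem(2, 2, 1) lands in case 2 and reports Theta(n log n), and master_theorem(8, 2, 2) lands in case 1 and reports Theta(n^3).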
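Note that Omega notation, the asymptotic lower bound, is defined as: f(n) = Omega(g(n)) if there exist positive constants c and n_0 such that 0 <= c*g(n) <= f(n) for all n >= n_0.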
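Consider our mergesort recurrence: T(n) = 2*T(n/2) + O(n). So a = b = 2 and f(n) = Theta(n) in this case. So, n^(log_b a) = n^(log_2 2) = n, which means f(n) = Theta(n^(log_b a)). That is the second case of the theorem, and therefore T(n) = Theta(n^(log_b a) * log n) = Theta(n log n).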
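For binary search, f(n) = 1, a = 1, and b = 2, so n^(log_b a) = n^(log_2 1) = n^0 = 1, which is Theta(f(n)), and therefore by the Master Theorem and some algebra: T(n) = Theta(n^(log_2 1) * log n) = Theta(log n).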
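To see the binary search recurrence T(n) = T(n/2) + O(1) in action, here is a minimal iterative sketch that also counts how many times the search range is halved. The step counter is my own addition for illustration; a real implementation would just return the index.

```python
def binary_search(sorted_items, target):
    """Return (index, steps); index is -1 if target is not present."""
    low, high = 0, len(sorted_items) - 1
    steps = 0
    while low <= high:
        steps += 1                      # O(1) work per level: f(n) = 1
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            return mid, steps
        if sorted_items[mid] < target:  # continue in one half only: a = 1, b = 2
            low = mid + 1
        else:
            high = mid - 1
    return -1, steps


data = list(range(1024))
index, steps = binary_search(data, 1023)
```

With 1024 elements the loop never runs more than log2(1024) + 1 = 11 times, whether or not the target is present, which matches the Theta(log n) result.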
The derivation of the theorem is beyond the scope of this post, but the classic Introduction to Algorithms by Cormen et al. has a good treatment. The takeaway is that with this theorem, you can read off the solution to any recurrence of the above form that falls into one of the three cases.
Return to part 1.