
Complexity Analysis for Interviews, Part 2

This is the second part of a series. Go back to part 1 if you haven't read it already.

In a fast-paced interview, running-time analysis is mostly about recognizing the running times of common operations and combining them, since you won't have time to derive the running time of each subcomponent of your algorithm. For example, say you had an algorithm that first sorted an array and then made a constant number of passes through the sorted array. You should point out to your interviewer that the sorting takes O(n log n) (assuming mergesort or heapsort) and the constant number of passes takes O(n), which is dwarfed by O(n log n) for large n. Therefore the whole algorithm is O(n log n). This is the kind of thing you have to learn to recognize. As we go over common algorithms and algorithms for specific real interview questions, I will point out the running times of the various parts.

Remember that O(1) constant < O(log n) logarithmic < O(n) linear < O(n log n) quasilinear < O(n^2) quadratic < O(n^3) cubic < ... < O(2^n) exponential < ... < O(n!) factorial. If your algorithm consists of operations in multiple complexity classes, the largest one is going to dwarf all the smaller ones for large n, so the running time of the whole algorithm is whichever class is largest. So if an algorithm has an O(2^n) part, an O(n^2) part, and an O(n log n) part, then the whole thing is O(2^n). If you find your algorithm doing an O(1) operation n times, then it is O(n) altogether.

For 80% of interviews, I think it is sufficient to recognize the running times of the common routines you use in your algorithm (e.g., comparison-based sorting, iterating through an array, looking up a hash, traversing a tree) and to be able to put them together via the above rule. The below is a treatment of some of the deeper math you can learn to deal with all kinds of running-time analysis. It is very much optional. It may even go over the head of your interviewer if they have forgotten their college algorithms class, but I keep it here for personal edification.
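As a minimal sketch of the sort-then-scan pattern described above (the function name and task are my own illustration, not from the post):

```python
def has_adjacent_duplicates(nums):
    """Return True if any value appears more than once in nums.

    Sorting is O(n log n); the single pass afterwards is O(n),
    which is dwarfed by the sort, so the whole function is O(n log n).
    """
    nums = sorted(nums)                 # comparison sort: O(n log n)
    for i in range(len(nums) - 1):      # one pass: O(n)
        if nums[i] == nums[i + 1]:
            return True
    return False
```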

What in interviews they call Big-O is actually Big-Theta notation, the notation for an asymptotically tight bound on the actual growth function of an algorithm. The precise distinction will become important later. Let f(n) be the actual growth function of an algorithm. Formally, Θ(g(n)) = { f(n) : there exist constants c0 > 0, c1 > 0, n0 > 0 such that 0 ≤ c0·g(n) ≤ f(n) ≤ c1·g(n) for all n ≥ n0 }. In practice, Big-Θ (theta) means the complexity of an algorithm ignoring any constant coefficient. The real Big-O notation is only the asymptotic-upper-bound half of that. There is also the complementary Big-Omega notation, which refers to the asymptotic lower bound; Big-Θ combines the two. Note that since Big-O is an upper bound, really any upper bound, the function f0(n) = n is contained in O(n^2), and both f0 and f1(n) = n^2 are contained in O(2^n). Again, these are the precise definitions of asymptotic notation, but unfortunately in interviews they abuse Big-O to mean an asymptotically tight bound, i.e., Big-Θ. From this point on, we will also abuse the O(g(n)) notation to denote Θ(g(n)) except when we need to make a meaningful distinction.
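As a quick worked check of the Θ definition (an example of my own, not from the post): take f(n) = 3n^2 + 5n. For all n ≥ 1 we have

3n^2 ≤ 3n^2 + 5n ≤ 3n^2 + 5n^2 = 8n^2,

so the definition is satisfied with c0 = 3, c1 = 8, and n0 = 1, and therefore f(n) ∈ Θ(n^2). The constants 3 and 5 disappear into c0 and c1, which is exactly what "ignoring any constant coefficient" means.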

Deriving Recurrences

Nontrivial complexity comes from using loops and recursion. In both cases, you need to derive a recurrence equation to characterize the complexity.
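As a minimal sketch of reading a recurrence off a recursive function (a hypothetical example of my own), consider summing a list recursively:

```python
def sum_from(nums, i=0):
    """Sum nums[i:] recursively, tracking an index to avoid O(n) slicing."""
    # Base case: an empty suffix costs constant work, so T(0) = O(1).
    if i == len(nums):
        return 0
    # Inductive case: one recursive call on a problem of size n - 1
    # plus O(1) work, giving T(n) = T(n - 1) + O(1), which solves to O(n).
    return nums[i] + sum_from(nums, i + 1)
```

Each line of the function maps directly onto a term of the recurrence, which is the habit you want when deriving them under time pressure.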

Solving Recurrences

For mergesort, the recurrence is the following: T(n) = O(1) if n = 1, and T(n) = 2T(n/2) + O(n) if n > 1. As with all recursive algorithms, mergesort consists of a base case and an inductive case. The base case handles the leaves of the recursion tree, the simplest steps that don't need any further recursive call to the function. In mergesort, that is the case when the input is one element. A subarray of one element is of course already sorted, so we have a fairly trivial base case with running time O(1). The inductive case is where we make the two recursive calls to mergesort, each at a cost of T(n/2), and do an O(n) amount of work merging the two sorted subarrays together, so altogether T(n) = 2T(n/2) + O(n).
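A sketch of mergesort in Python, with comments mapping each piece onto the recurrence (a standard textbook formulation, not code from the post):

```python
def merge_sort(a):
    """Sort a list by recursively splitting and merging: O(n log n)."""
    # Base case: 0 or 1 elements is already sorted -- the O(1) term.
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    # Two recursive calls on halves of the input: the 2T(n/2) term.
    left = merge_sort(a[:mid])
    right = merge_sort(a[mid:])
    # Merging two sorted halves is one linear scan: the O(n) term.
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged
```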

Guess and check (Substitution)
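As a sketch of how substitution works (a worked example of my own, using the mergesort recurrence with an explicit constant): take T(n) = 2T(n/2) + cn and guess that T(n) ≤ d·n·log n for some constant d, where log is base 2. Checking the guess inductively:

T(n) = 2T(n/2) + cn ≤ 2·d·(n/2)·log(n/2) + cn = d·n·(log n - 1) + cn = d·n·log n - d·n + cn ≤ d·n·log n whenever d ≥ c.

So the guess holds for any d ≥ c, and T(n) = O(n log n). The method is exactly what the name says: guess the answer, then verify it by induction.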

Recursion Tree Method

We went over this in a past session. The idea is to construct a two-column table: one column is the iteration number (or recursive-call depth) and the other is the amount of work done at that step. The work column can be a straight column or a tree, depending on how many recursive calls you make at each step.
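As a sketch, here is the table for the mergesort recurrence T(n) = 2T(n/2) + cn, where the work column fans out as a binary tree:

```
depth    calls at depth    work per call    work per level
0        1                 cn               cn
1        2                 cn/2             cn
2        4                 cn/4             cn
...
log n    n                 c                cn
```

Every level does cn total work and there are about log n + 1 levels, so the whole tree sums to Θ(n log n), matching the answer we will get from the master theorem below.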

Master's Theorem

This is where we need to distinguish between Big-O and Big-Theta. For recurrences of the form T(n) = aT(n/b) + f(n), where a ≥ 1 and b > 1 are constants, the following holds:

  1. If f(n) = O(n^(log_b a - ε)) for some constant ε > 0, then T(n) = Θ(n^(log_b a)).
  2. If f(n) = Θ(n^(log_b a)), then T(n) = Θ(n^(log_b a) · log n).
  3. If f(n) = Ω(n^(log_b a + ε)) for some constant ε > 0, and if a·f(n/b) ≤ c·f(n) for some constant c < 1 and all sufficiently large n, then T(n) = Θ(f(n)).

    Note that Omega notation, the asymptotic lower bound, is defined as:

    Ω(g(n)) = { f(n) : there exist constants c > 0, n0 > 0 such that 0 ≤ c·g(n) ≤ f(n) for all n ≥ n0 }
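The three cases can be mechanized when the driving function is a plain polynomial. Here is a sketch (my own helper, restricted to f(n) = n^d, where comparing d against log_b a is the same as comparing b^d against a, and case 3's regularity condition holds automatically):

```python
import math

def master_theorem(a, b, d):
    """Classify T(n) = a*T(n/b) + Theta(n^d) for polynomial f(n) = n^d.

    Returns a string describing Theta(T(n)). Integer comparison of
    b**d against a avoids floating-point issues with log_b(a).
    """
    crit = math.log(a, b)           # critical exponent log_b(a)
    if b ** d < a:                  # case 1: f polynomially smaller
        return f"Theta(n^{crit:g})"
    if b ** d == a:                 # case 2: f matches the critical exponent
        return f"Theta(n^{crit:g} log n)"
    return f"Theta(n^{d:g})"        # case 3: f dominates

# mergesort:     a=2, b=2, d=1 -> case 2 -> Theta(n^1 log n) = Theta(n log n)
# binary search: a=1, b=2, d=0 -> case 2 -> Theta(n^0 log n) = Theta(log n)
```

Note that a literal "n^0 log n" in the output simplifies to log n, as in the binary search example below.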

Consider our mergesort recurrence: T(n) = 2T(n/2) + O(n). Here a = b = 2 and f(n) = Θ(n). Since n^(log_b a) = n^(log_2 2) = n, we have f(n) = Θ(n^(log_b a)) and case 2 applies: T(n) = Θ(n^(log_2 2) · log n) = Θ(n log n).

For binary search, the recurrence is T(n) = T(n/2) + O(1), so a = 1, b = 2, and f(n) = Θ(1). Since Θ(n^(log_b a)) = Θ(n^(log_2 1)) = Θ(n^0) = Θ(1), case 2 applies again, and by the master theorem and some algebra: T(n) = Θ(n^(log_2 1) · log n) = Θ(n^0 · log n) = Θ(log n).
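For reference, a sketch of the binary search whose recurrence we just solved (the standard iterative formulation):

```python
def binary_search(a, target):
    """Return an index of target in sorted list a, or -1 if absent.

    Each loop iteration does O(1) work and halves the search range,
    matching T(n) = T(n/2) + O(1) = O(log n).
    """
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if a[mid] == target:
            return mid
        if a[mid] < target:
            lo = mid + 1       # discard the lower half
        else:
            hi = mid - 1       # discard the upper half
    return -1
```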

The derivation of the theorem is beyond the scope of this post, but the classic Cormen et al. Introduction to Algorithms has a good treatment. The takeaway is that with this theorem, you can solve any recurrence of the above form that falls into one of the three cases.

Return to part 1.
