Skip to main content

Tricky, tricky integer overflow

In today's session, I show the group a tricky, tricky integer overflow detail that is pervasive in binary search and mergesort implementations. How tricky is this whole thing? According to the Google Research blog, most of the first year CMU Ph.D. students' implementations of the rudimentary algorithms were flawed due to overflow problems. Moreover, various cybersecurity lists name integer overflow as one of the top software flaws. The Google blog also recommends Jon Bentley's classic Programming Pearls (Amazon affiliate link) book which motivates this and other Google-style large scale questions such as sorting billions of integers efficiently limited memory. You can probably snag that book in the local library, but we'll cover these and other popular interview questions. But back to the main point, integer overflow is a popular conceptual interview question in some quarters. The interviewer wants to see if you understand integer overflow and by extension an understanding of how machines work. I didn't have time to go over this in depth today, but here are some more details.

Integer overflow happens because machines (mostly) keep integers in registers during arithmetic operations and these registers are fixed precision and of a fixed width, usually either 32-bits or 64-bits. To see what happens you need to experiment a bit to see what happens when an int gets near the limits of the fixed width. Consider the following Java program:


class Test {
    public static void main(String[] args) {
        int maxInt = 1 << 31 - 1; // Integer.MAX_VALUE
        int overMax = 1 << 31;

        System.out.println(new Integer(maxInt));
        System.out.println(new Integer(overMax));
        System.out.println(new Integer(maxInt+2));
    }
}

Recall that 1 << 31 = 2^31. What would you guess the above program prints out? For a signed integer, the maximum representable value is 2^31-1. The left most bit is the sign bit. When this value is exceeded, the integer overflow happens. There are a few ways programming languages resolve integer overflows. The most popular and cheapest way is to essentially do nothing and let the integer wrap to negative values (i.e., the sign bit is set and the rest of the bits represent the magnitude of the negative int). What are these negative values? They are typically represented as two's complement but that is beyond the scope of this discussion. The main point is that in many programming languages, integer overflow results in wrapping of ints to negative values. This can be a problem if you aren't careful. Consider the following code. The while loop looks innocent enough, but this loop won't be terminating when one intuitively thinks, when we add maxInt to i. Upon that addition, i wraps and we stay in negative territory such that i will mostly definitely be < 1000 for a long, long while. In some cases, integer wrapping may lead to unexpected infinite loops.


int i=0, j=0;
int A[] = { 1, 2, 3, maxInt };
while (i < 1000) {
   i += A[j % A.length];
   A[j % A.length] = 2;
   j++;
   System.out.printf("i %d j %d\n", i, j);
}

So the moral of the story is to be careful if your code can conceivably hit integer overflow and you are in an environment where integer overflow is handled by wrapping. It is a tricky interview question, but now you are aware of it. For extra credit, how can you detect that integer overflow happened when ints wrap?

One of you pointed out that in Python, integer overflows don't apply, which is mostly true since Python 3 (and Ruby) defaults to arbitrary-precision integers ("big ints") and Python 2 automatically promotes integers to arbitrary-precision as needed, but if one were to use numpy or pandas, those packages use native fixed-precision integers where wrapping will happen. In the Python 2 case, one would still have integer overflow, but Python deals with the overflow by coercing the integer to a big int. Java, C/C++, and C# deal with integer overflow by silently wrap to negative or 0. C# does have a checked decoration which can be used to have the program generate an exception upon overflow.

Comments

Popular posts from this blog

Complexity Analysis for Interviews, Part 1

This is part 1 of a two part series. Skip over to part 2 you'd like . For coding interviews, we are interested in gauging the asymptotic efficiency of algorithms both in terms of running time and space. The formal study of analysis of algorithms is called complexity theory, a rich field with fascinating and complicated math. For interviews, we only need a few basic concepts. Asymptotic efficiency is concerned with how the running time or memory requirements of an algorithm grows with the input size, so it is intimately concerned with how well algorithms scale to larger inputs. This is important in Computer Science and in practice because whereas some algorithms work well enough for small inputs of say < 10 inputs, the running time and space grows far faster than the input size and thus large inputs of say 10s to millions of inputs becomes impractical (usually meaning taking hours or even years of execution time). Consider sorting. Say for the sake of argument that sorting

Top 5 Books for Language-Specific Interview Questions

Shrunk and White of Programming When you put down that you know a certain programming language or languages on your resume, you are setting certain expectations for the interviewer. I would strongly caution against putting down "expert" in a language unless you invented or are one of the language's maintainers. You are giving your interviewer the license to quiz you on programming language lore. There are a handful of concepts that are considered "standard" knowledge for each language which go broadly beyond syntax and general semantics. These concepts commonly involve major pitfalls in a given language and the idiomatic technique for negotiating these pitfalls and writing efficient and maintainable code. Note, although the concepts are considered idiomatic, you can seldom infer them from knowledge of syntax and semantics alone. The tricky part here is that most courses that teach a particular programming language do not cover these idiomatic techniques and e

What Are The Types of Questions Asked In Tech Interviews?

Many interview prep books focus on the bread and butter coding interviewing questions, but in real interviews your interviewer may pose other types of questions. It is important to be comfortable and come prepared for these other types of questions which depart dramatically for coding questions yet still may bear considerable weight in your interview evaluation. Behavioral & experience These are the standard non-technical interview questions where the interviewer asks about what you did in particular scenarios and how you performed in the roles listed on your resume. Language specific knowledge questions There is a manageable set of questions for each common programming language. There are questions for uncommon languages too should to encounter a company that emphasizes their use of some of the lesser known languages. Usually these questions attempt to probe for your knowledge of some differentiator and key feature of the programming language. For example, virtual functio