Okay, this is gonna be kind of an off-the-cuff post here. I’m writing this because I just made a fascinating discovery that sheds some light on why the factorial function overflows so quickly when you use integers, as well as a simple way that you can determine just when the factorial function will overflow without having to do complex calculations with logarithms. This thought process was spurred by my use of an integer factorial function in one of my QBASIC programs that I was writing, which got me thinking about the exponential behavior of that function. This post will be somewhat less technology-related and more theoretical/mathematical than my previous ones, but the mathematical concepts I’m employing here are all fairly elementary and will be familiar to anyone with a basic education in high school algebra.
If you’ve never worked with factorials in your programs, you might be surprised by how quickly the factorial function overflows. In fact if you use a 64-bit integer as your result, which is the largest integer width possible in most programming languages, you’ll quickly run out of space once the base gets past about 23. This is because not only does the factorial increase in an exponential fashion, but the logarithm of the factorial increases semi-exponentially as well (meaning the second derivative is consistently positive). In fact I’ve calculated the formula for the number of bits needed to hold n!:
Big thanks to the creators of the free Online LaTeX Editor web app who made the above graphic possible. I didn’t feel like installing or purchasing a LaTeX editor just for this one thing.
Of course now I’m obligated to explain my rationale behind this formula. I know it looks really convoluted, but 90% of it is just there for the sake of precision; the concept behind it is very simple. Think about it this way: Every time you multiply by a number n, you’re adding a certain number of bits to the representation of the product. If the number is greater than 2, then you will be adding at least one bit, because you’re multiplying by at least a factor of 2. If the number is greater than 4, then you will be adding at least two bits, because you’re multiplying by a factor of at least 4. So the factorial increases by 1 bit for 2 and 3, by 2 bits for 4, 5, 6, and 7, by 3 bits for 8, 9, 10, 11, 12, 13, 14, and 15, and so on. So you’re adding the floor of the logarithm of that number a certain number of times, precisely the number of times you add it is equal to the difference between that number and the last power of 2, which is where the (n – 2^floor(log2(n))) bit at the end comes from. For everything up to the last power of 2 you’re summing a series where each term adds the logarithm times the difference between the next power of 2 and the previous one, i.e. 2^(k-1). This is where the sigma portion of the equation comes from. The left side of the equation is of course the number of bits needed to store n!, which would be the logarithm of n! rounded up to the nearest integer.
But we don’t actually need the formula to explain why the factorial function overflows so quickly. The formula is only there to formalize my theory, so if you’re intimidated by complex math, feel free to ignore it because you won’t need it. The counting method I gave in the previous paragraph allows you to quickly determine when an n-bit integer will overflow without using a calculator and without using logarithms. All you do is add up the numbers as I showed previously. If we do this, then we find that the factorial function will overflow at about 15! for a 32-bit integer and at about 23! for for a 64-bit integer.
Okay, so that was just a random math post there. If I made any mistakes in my calculations feel free to correct them, because this was all thought out rather quickly and there may be things that I missed. For now, farewell and happy hacking!