Complexity vs. Constant Factors

2014-05-30

If you're formally trained in Computer Science, you've seen lots and lots of asymptotic analysis and the like.

In an earlier post about trees I noted one way you can be led astray by asymptotic analysis when I compared AVL trees and Red-Black trees. The question there was whether the part of the work people usually analyze is actually the part that dominates the total work.

There's another, simpler way it often goes wrong though. Constant factors *matter*. A computer only has so much memory, and so much hardware. Every problem that can be solved on a given computer can be solved in at most 2**number_of_bits steps, where number_of_bits counts all the storage on the machine. After that many steps the machine must cycle back to a state it has already been in. That number is *extremely* large (in fact, cycling through every state would likely take all the energy in the universe), but it is finite.
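To get a feel for just how large that bound is, here's a quick back-of-the-envelope sketch in C++; the 8 GiB figure is just an assumed example, not the total storage of any particular machine:

    #include <cmath>
    #include <cstdio>

    int main() {
        // Storage of a hypothetical machine with only 8 GiB total, in bits.
        const double bits = 8.0 * 1024 * 1024 * 1024 * 8;   // ~6.9e10 bits
        // 2^bits written out in decimal has about bits * log10(2) digits.
        std::printf("2^%.0f has roughly %.2e decimal digits\n",
                    bits, bits * std::log10(2.0));
        return 0;
    }

That's a number with tens of billions of decimal digits - just to write it down, never mind counting up to it.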

It turns out that small local operations on arrays can still be VERY fast, particularly because their locality plays well with caching. A BTree is a way of taking advantage of these speedups, but within the small finite space of each node - that is, within its arity. This makes BTrees MUCH faster than either AVL or Red-Black trees in practice, despite having asymptotic bounds identical to Red-Black trees.
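To make that concrete, here's a minimal sketch of what a lookup inside a single node might look like, assuming a node that keeps its keys sorted in a flat array. The names and the arity value here are illustrative, not taken from my benchmark code:

    #include <algorithm>
    #include <cstdint>

    // Illustrative node layout: keys live in one small contiguous array, so
    // searching within a node touches a single cache-friendly block of memory.
    constexpr int ARITY = 16;                  // arbitrary example arity

    struct BTreeNode {
        int32_t    keys[ARITY - 1];            // kept sorted
        int        num_keys = 0;
        BTreeNode* children[ARITY] = {nullptr};
    };

    // Returns the index of the first key >= target. If the key isn't in this
    // node, that same index is the child slot to descend into next. A plain
    // linear scan works just as well for small nodes.
    inline int find_slot(const BTreeNode& node, int32_t target) {
        const int32_t* pos =
            std::lower_bound(node.keys, node.keys + node.num_keys, target);
        return static_cast<int>(pos - node.keys);
    }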

I was curious about exactly where this tradeoff lies for BTrees. A BTree stores many elements in each node, and usually keeps those elements sorted in an array within the node. This means inserts and removals in a node require shifting everything after the affected position over by one - a linear time operation in the size of the array, but since BTree nodes have a fixed maximum size, it's still O(1). The number of elements a node holds corresponds to the number of children it has (minus 1, actually), so we can call that the node's "arity".

So, I tried running my benchmark (the same one used in the earlier tree experiments) at varying arities to see where efficiency peaks. Note that these results are for 32-bit integers. The hardware and other dependencies are: a Dell XPS 12 convertible laptop/tablet with an Intel(R) Core(TM) i7-4500U CPU @ 1.80GHz, Ubuntu desktop, gcc 4.8.1-10ubuntu9, and Linux kernel 3.11.0-19-generic.
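To make that "shift everything over" step concrete, here's a minimal sketch of the node-level insert under the same sorted-array assumption as the sketch above (the function name and signature are my own illustration, not the benchmark's code):

    #include <algorithm>
    #include <cassert>
    #include <cstdint>

    // Insert `key` into a sorted array of `n` keys whose capacity is `cap`
    // (cap = arity - 1 for a BTree node). Returns the new key count.
    // The shift is linear in the node size, but the node size is a fixed
    // constant, so the per-node cost is still O(1).
    int insert_sorted(int32_t* keys, int n, int cap, int32_t key) {
        assert(n < cap);                                  // caller splits full nodes first
        int32_t* pos = std::lower_bound(keys, keys + n, key);
        std::copy_backward(pos, keys + n, keys + n + 1);  // shift the tail right by one
        *pos = key;
        return n + 1;
    }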

Running that benchmark across arities, efficiency appears to peak around 70 elements per node. The peak is pretty wide though, so it's hard to be sure given the noise in the data. This was done with a single trial on a laptop that wasn't necessarily perfectly isolated. For details on the exact experiment, and the ability to run it yourself, see



This isn't a surprising result at all; it's about what I expected. My guess had been between 100 and 200, which isn't far off.
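In case you're wondering what "running at varying arities" even looks like: in a typical implementation the arity is a compile-time constant, so each arity in the sweep is a separate instantiation of the tree. Here's a rough sketch of the shape of such a sweep - the BTree template below is a stub stand-in, not my actual benchmark code:

    #include <chrono>
    #include <cstdint>
    #include <cstdio>
    #include <random>

    // Stub stand-in: the real benchmark uses a full BTree of 32-bit integers
    // with `Arity` children per node.
    template <int Arity>
    struct BTree {
        void insert(int32_t /*key*/) { /* ... */ }
    };

    template <int Arity>
    double time_inserts(int n) {
        std::mt19937 rng(42);                  // fixed seed so runs are comparable
        std::uniform_int_distribution<int32_t> dist;
        BTree<Arity> tree;
        auto start = std::chrono::steady_clock::now();
        for (int i = 0; i < n; ++i)
            tree.insert(dist(rng));
        std::chrono::duration<double> elapsed =
            std::chrono::steady_clock::now() - start;
        return elapsed.count();
    }

    int main() {
        const int n = 1000000;
        // The arity is a template parameter, so each point in the sweep is a
        // separate instantiation rather than a runtime loop variable.
        std::printf("arity  5: %.3f s\n", time_inserts<5>(n));
        std::printf("arity 55: %.3f s\n", time_inserts<55>(n));
        std::printf("arity 70: %.3f s\n", time_inserts<70>(n));
        return 0;
    }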

Now, if you read my post about real-time (http://www.blog.computersarehard.net/2014/05/worst-case-vs-average-case.html), you probably just stopped to wonder about this algorithm... Yup, you can bet that the worst case goes up a bit as we add more elements per node, even while the average might be dropping. So it's not clear what the best arity to use would be. Given no information about the use-case, there's a pretty clear cliff where we get a huge win by increasing the arity. So, not knowing more, I'd be tempted to set it at maybe 55. That gets us basically all of the gains of increasing the arity, but keeps the worst case and variance comparatively small.

Consider the minimum there, ~25 seconds or so. The best run I've seen out of the AVL tree is ~49 seconds. That's nearly a 2x difference. Even at arity 5 the BTree destroys the AVL tree, coming in at only ~39 seconds.

Conclusion: Constant factors matter, and can matter a lot. What's nice is that we still keep our asymptotic bounds TOO, so on both big and small problems we can be fairly sure the BTree will keep beating the AVL tree by about the same factor. I think that's pretty neat.