When will BitFunnel be usable?

How long should we expect this project to take? In theory, this should be relatively easy to guess: the project is a half-port, half-rewrite whose aim is to produce an open source version that’s simpler than the internal version, and we know how big the original project is.

If we run find . -name "*.h" -o -name "*.cpp" | grep -v NativeJIT | xargs wc on the original project to count all lines of code except NativeJIT, we get roughly 144k lines of code. I’m excluding NativeJIT because it was ported separately from the BitFunnel repo, so our extrapolation should exclude it.
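The same count can be sketched in Python for anyone who would rather not depend on a shell pipeline. This is a rough equivalent under stated assumptions, not the command used above: count_lines and its parameters are hypothetical names, and it counts newline characters (as wc does for its line column) in every .h and .cpp file whose path does not contain NativeJIT.

```python
import os


def count_lines(root, extensions=(".h", ".cpp"), exclude="NativeJIT"):
    """Count newline characters in matching files under `root`.

    `extensions` and `exclude` are illustrative parameters (assumptions),
    chosen to mirror the find/grep filters in the text.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Mirror `grep -v NativeJIT`: drop any path mentioning it.
            if exclude in path:
                continue
            # Mirror `-name "*.h" -o -name "*.cpp"`.
            if not name.endswith(extensions):
                continue
            with open(path, "rb") as f:
                # Counting b"\n" matches how wc counts lines.
                total += f.read().count(b"\n")
    return total


if __name__ == "__main__":
    # Run from the root of the checkout being measured.
    print(count_lines("."))
```

Like the grep filter, the exclusion is a plain substring match on the path, so it would also skip any unrelated file that happens to have NativeJIT in its name.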

We’re currently at about 53kLOC in the new BitFunnel repo. If we graph the progress, we can see that it’s been roughly linear since May.

The date is on the x-axis and lines of code are on the y-axis. It’s a bit surprising to me that the progress looks so linear. We’ve had periods where I’ve been busy with non-coding duties and Mike has done the vast majority of the coding, and we’ve had periods where Mike’s been busy with non-coding duties and I’ve been doing the vast majority of the coding. Despite the wildly varying coding workload we’ve taken on at times, when you average everything out, progress has been approximately linear.

I don’t expect this to continue indefinitely. Once we have enough of a system stood up that we can run experiments, progress as measured in lines of code should slow down. We should also see some slowdown when we do integration and integration testing with whatever we’re going to integrate with, which will probably be a lot of work but not much code. On top of that, we’ll probably enter a slow period as the holiday season rolls around. Additionally, lines of code in the new project aren’t directly comparable to lines of code in the old project because we’ve been adding a license at the top of most files. With all those disclaimers aside, if we guess that we’ll end up with somewhere between 1/2x and 1x as much code as the original project, we can make a crude estimate of how long it will take to “finish” BitFunnel:

This is the same graph as before, but with a red horizontal line at the size of the old BitFunnel project and a green horizontal line at half the size of the old BitFunnel project. If we believe the linear estimate, we might be “done” anywhere between late this year and next July. If we take all of the caveats listed above into account, it’s likely that we won’t have something “complete” this calendar year. Beyond that, the error bars are so large that it’s hard to say much except that it’s plausible that we’ll have something “complete” by the end of the next calendar year.
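The extrapolation itself is just a straight-line fit extended until it crosses each target. The sketch below shows the arithmetic under loud assumptions: the SAMPLES values are hypothetical placeholders, not our real commit history (only the final 53k and the 144k target come from the numbers above), and fit_line and days_until are made-up helper names.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def days_until(target, slope, intercept, today):
    """Days from `today` until the fitted line reaches `target` lines."""
    return (target - intercept) / slope - today


# HYPOTHETICAL samples: (days since first measurement, lines of code).
# Only the last value (53k) matches a figure from the text.
SAMPLES = [(0, 5_000), (50, 21_000), (100, 37_000), (150, 53_000)]

OLD_PROJECT_LOC = 144_000            # size of the original project
HALF_PROJECT_LOC = OLD_PROJECT_LOC // 2

if __name__ == "__main__":
    days = [d for d, _ in SAMPLES]
    locs = [loc for _, loc in SAMPLES]
    slope, intercept = fit_line(days, locs)
    today = days[-1]
    for label, target in [("1/2x", HALF_PROJECT_LOC), ("1x", OLD_PROJECT_LOC)]:
        remaining = days_until(target, slope, intercept, today)
        print(f"{label}: about {remaining:.0f} more days at {slope:.0f} LOC/day")
```

Swapping in real per-day line counts would give the two crossing points drawn on the graph. The fit says nothing about the slowdowns listed earlier, so its output is best read as an optimistic lower bound.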

Dan Luu
Prior to working on BitFunnel, Dan worked on network virtualization hardware at Microsoft (SmartNIC), deep learning hardware at Google (TPU), and x86/ARM processors at Centaur.