BitFunnel Engineering Diary

We're open sourcing BitFunnel, a library for high-performance indexing, retrieval, and ranking of documents. Today the code runs at massive scale inside Bing's data centers, but our dream is to make the code available and relevant to anyone, anywhere, who values search. As we release each module, we will document our key design decisions here on this blog.

On the Road to Open Source

Today we are kicking off an effort to open source BitFunnel, a library for high-performance full-text search over a chunk of the internet, spread across thousands of machines. It is based on a probabilistic algorithm that identifies and ranks documents according to queries involving keywords, phrases, and mathematical expressions. BitFunnel is the most significant accomplishment in my 35 years building software. I am proud of the system we created and want to share it with the world. (read more...)

Debugging unfamiliar code

This is a story about debugging a register allocation issue in a compiler. I recently joined the BitFunnel project; there wasn’t much documentation at the time, so I started playing with the code by writing some simple examples that could later serve as documentation for anyone coming in after me. My first target was NativeJIT, both because it’s self-contained and because it’s the first thing we open sourced. (read more...)