BitFunnel Blog

We're open sourcing BitFunnel, a key component of the Bing search engine. The BitFunnel library provides high-performance indexing, retrieval, and ranking of documents. Today the code runs at massive scale inside Bing's data centers, but our dream is to make the code available and relevant to anyone, anywhere, who values search. As we release each module, we will document our key design decisions here on this blog.

Corpus File Format

One of the challenges in making BitFunnel relevant to the open source community is removing Bing-specific functionality that has deep dependencies on the internals of the rest of the Bing web crawling and index serving infrastructure. As I mentioned in my first post, we plan to start with an empty repository and bring over BitFunnel modules one by one. We are essentially bootstrapping the BitFunnel project, and this process will require a new test corpus, a set of performance benchmarks, and some system to help verify correctness. (read more...)

On the Road to Open Source

Today we are kicking off an effort to open source BitFunnel, a key part of Bing. BitFunnel is a library for high-performance full-text search over a chunk of the internet, spread across thousands of machines. It is based on a probabilistic algorithm that identifies and ranks documents according to queries involving keywords, phrases, and mathematical expressions. BitFunnel is the most significant accomplishment in my 35 years building software. I am proud of the system we created and want to share it with the world. (read more...)

Debugging unfamiliar code

This is a story about debugging a register allocation issue in a compiler. I recently joined the BitFunnel project; there wasn’t much documentation at the time, so I started playing with the code by writing some simple examples that could later serve as documentation for anyone coming in after me. The first thing I tried this on was NativeJIT, both because it’s self-contained and because it’s the first thing we open sourced. (read more...)