Bloomd: Serving those Flowers Fast
Bloomd
Bloomd is a high-performance C server which is used to expose bloom filters and operations over them to networked clients. It uses a simple ASCI protocol which is human readable, and similar to memcached.
Features
Scalable non-blocking core allows for many connected clients and concurrent operations
Implements scalable bloom filters, allowing dynamic filter sizes
Supports asynchronous flushes to disk for persistence
Supports non-disk backed bloom filters for high I/O
Automatically faults cold filters out of memory to save resources
Dead simple to start and administer
FAST, FAST, FAST
Whats is a Bloom Fliter?
A Bloom filter is a data structure designed to tell you, rapidly and memory-efficiently, whether an element is present in a set.
A Bloom filter consists of two components: a set of k hash functions and a bit vector of a given length. We choose the length of the bit vector and the number of hash functions depending on how many keys we want to add to the set and how high an error rate we are willing to put up with -- more on that a little bit further on.
All of the hash functions in a Bloom filter are configured so that their range matches the length of the bit vector. For example, if a vector is 200 bits long, the hash functions return a value between 1 and 200. It's important to use high-quality hash functions in the filter to guarantee that output is equally distributed over all possible values -- "hot spots" in a hash function would increase our false-positive rate.
To enter a key into a Bloom filter, we run it through each one of the k hash functions and treat the result as an offset into the bit vector, turning on whatever bit we find at that position. If the bit is already set, we leave it on. There's no mechanism for turning bits off in a Bloom filter.














