Efficient distribution of cacheable http requests at Tumblr
Tumblr uses Haproxy for load balancing http requests and other types of tcp traffic to pools of application servers in order to evenly distribute our workloads and ensure that we're using our machines efficiently. Haproxy supports consistent hashing of requests to specific servers which is a requirement for efficiently caching responses. Consistent hashing ensures that only a few mappings (of requests to servers) are redistributed when a server is added or removed from the pool which results in a higher maintained hit ratio when dealing with caching systems than a traditional modulo distribution method.
An important aspect of the hashing algorithm is to ensure that load is distributed evenly across the pool. Haproxy uses the SDBM hashing function by default, but about a year ago we began doing testing to determine the efficiency of alternative algorithms using the same hashing criteria as we see in production. We found that the hashing algorithm most suited for the input given to our Haproxy instances at Tumblr was DJB2.
The improvement we see when using DJB2 is best illustrated via a graph of connections from Tumblr's varnish pools to the application nodes. The Varnish pools are consistently hashed backends to Haproxy. The effect of switching the algorithm is an improvement in the load distribution on the varnish pool which is reflected in the convergence of connections to application nodes ultimately leading to better cache usage.
Please continue reading for more on the details of hashing as implemented in Haproxy and the patch we have pushed upstream which lets you try out these options.
Hash functions strive to have little correlation between input and output. The heart of a hash function is its mixing step. The behavior of the mixing step largely determines the degree in which a hash function is collision resistant. Hash functions that are collision resistant are more likely to provide an even distribution of load.
The purpose of the mixing function is to spread the effect of each message bit throughout all the bits of the internal state. Ideally every bit in the hash state is affected by every bit in the message and perform that operation as quickly as possible for the sake of program performance. A function is said to satisfy the strict avalanche criterion if, whenever a single input bit is complemented (toggled between 0 and 1), each of the output bits should change with a probability of one half for an arbitrary selection of the remaining input bits.
To guard against a combination of hash function and input that results in high rate of collisions, Haproxy implements an avalanche algorithm on the result of the hashing function. The avalanche is always applied when using the consistent hashing directive. It is intended to provide a good distribution for little input variations. The result is quite suited to fit over a 32-bit space with enough variations so that a randomly picked number falls equally before any server position, which is ideal for consistently hashed backends, a common use case for caches.
In additional tests involving alternative algorithms for hash input and an option to trigger avalanche, we found different algorithms perform better on different criteria. DJB2 performs well when hashing ascii text which makes it a good choice for hashing http host headers. Other alternatives perform better on numbers and are a good choice when using source ip. The results also vary by use of the avalanche flag.
What are the effects of the DJB2 and avalanche algorithms on your production deployments? Would it not be great to have an option that lets you play with the hash function and determine via configuration if avalanche is beneficial to you?
Now you can. Tumblr's patches for enabling the hashing alternative and an option to trigger avalanche are now available in Haproxy 1.5. Let us know the results of your testing.