For our most heavily accessed data set, the read/write ratio was extremely favorable, so we were able to fan out from a single master to about 20 slaves. This particular database held several hundred million rows, which pushed the limits of our hardware (we periodically had to purge stale data when it grew too large), so one trick we used was index segmentation: different sets of slaves carried different indexes, and our database access layer picked a cluster based on the index a query needed. Specifically, the tables in this database generally had an ID and a string, but the index on the string was only needed by some queries, so on some slaves we simply omitted the string index. That let those machines keep the entire ID index in memory, which was a huge performance boost.
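The routing idea is simple enough to sketch. Here is a minimal illustration of index-aware cluster selection; the cluster names, hostnames, and function names are all invented for this example and are not from the original system:

```python
import random

# Hypothetical cluster map: one pool of slaves carries only the ID
# index (so it fits entirely in memory), another also carries the
# string index. Hostnames are made up for illustration.
CLUSTERS = {
    "id_only": ["slave-id-01", "slave-id-02", "slave-id-03"],
    "id_and_string": ["slave-str-01", "slave-str-02"],
}

def pick_slave(needs_string_index: bool) -> str:
    """Route a read to a slave based on the index the query needs.

    Queries that filter only by ID go to the id-only pool; queries
    that filter on the string column must go to the pool that has
    the string index.
    """
    cluster = "id_and_string" if needs_string_index else "id_only"
    return random.choice(CLUSTERS[cluster])
```

In practice the access layer would make this decision per query, so the common ID-only lookups never touch the slaves burdened with the larger index set.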
We used sharding to scale our databases in other areas.
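Sharding can take many forms; a minimal sketch of one common scheme, hash-on-ID routing, is below. The shard count and naming convention here are assumptions for illustration, not details from the original deployment:

```python
import hashlib

NUM_SHARDS = 8  # assumed shard count, for illustration only

def shard_for(record_id: int) -> str:
    """Map a record ID to one of NUM_SHARDS database shards.

    Uses a stable hash (MD5 of the decimal ID) so the same ID
    always routes to the same shard.
    """
    digest = hashlib.md5(str(record_id).encode()).hexdigest()
    return f"shard-{int(digest, 16) % NUM_SHARDS:02d}"
```

A hash scheme like this spreads writes evenly across shards, at the cost of making resharding harder than with range-based partitioning.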
A new malloc(3) implementation has been introduced. This implementation, sometimes referred to as “jemalloc”, was designed to improve the performance of multi-threaded programs, particularly on SMP systems, while preserving the performance of single-threaded programs. Due to the use of different algorithms and data structures, jemalloc may expose previously unknown bugs in userland code, although most of the FreeBSD base system and common ports have been tested and/or fixed. Note that jemalloc uses mmap(2) to obtain memory and only uses sbrk(2) under limited circumstances (and then only for 32-bit architectures). As a result, the datasize resource limit has little practical effect for typical applications. The vmemoryuse resource limit, however, can be used to bound the total virtual memory used by a process, as described in limits(1).
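The effect of a virtual-memory cap can be demonstrated programmatically. The following Unix-only sketch uses Python's `resource` module to set the process's address-space limit (the programmatic analogue of the vmemoryuse limit that limits(1) applies from the shell); the 1 GiB cap and 2 GiB allocation size are arbitrary values chosen for illustration:

```python
import resource

# Save the current limits so we can restore the soft limit afterward.
soft, hard = resource.getrlimit(resource.RLIMIT_AS)

cap = 1 * 1024 * 1024 * 1024  # 1 GiB virtual-memory cap (illustrative)
if hard != resource.RLIM_INFINITY and hard < cap:
    cap = hard  # cannot exceed an existing hard limit
resource.setrlimit(resource.RLIMIT_AS, (cap, hard))

try:
    # Because the allocator obtains memory with mmap(2), this 2 GiB
    # request counts against the virtual-memory cap and is refused.
    buf = bytearray(2 * 1024 * 1024 * 1024)
    result = "allocation succeeded"
except MemoryError:
    result = "allocation denied"
finally:
    # Restore the original soft limit.
    resource.setrlimit(resource.RLIMIT_AS, (soft, hard))

print(result)
```

Since mmap(2)-backed allocations count against the address-space limit regardless of how the allocator obtains memory, capping vmemoryuse bounds the whole process where a datasize limit would not.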
CMG is pleased to announce the release of our past conference proceedings to the general public. We are starting with papers from 1997 through 2005, with plans to add more.