is it too n00b/out-of-place of me to ask what went into tuning memcached to do that? i'd love to pick your brain (and possibly eat it... creepy, right?)
The general path involves adding some locks around critical sections, then benchmarking until those locks bubble up as bottlenecks, then reducing the number of locks, then repeating until your server has very little lock contention.
The process is pretty straightforward, but as you walk down the path, you end up with something that looks a bit different from where you started.
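To make that loop a bit more concrete, here is a minimal sketch of one common way to take pressure off a single hot lock: split it into several "stripe" locks chosen by hash. This is an illustration only, not a description of what was actually changed in memcached; `NUM_STRIPES`, `NUM_BUCKETS`, `table_init`, and `bucket_increment` are hypothetical names, and the per-bucket counter stands in for whatever the critical section really protects.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_BUCKETS 1024 /* hypothetical table size */
#define NUM_STRIPES 16   /* hypothetical; fewer locks than buckets */

static pthread_mutex_t stripe_locks[NUM_STRIPES];
static long counters[NUM_BUCKETS]; /* stand-in for per-bucket state */

/* Initialize every stripe lock once, before any worker thread starts. */
void table_init(void) {
    for (size_t i = 0; i < NUM_STRIPES; i++) {
        int rc = pthread_mutex_init(&stripe_locks[i], NULL);
        assert(rc == 0);
        (void)rc; /* keep the build quiet when NDEBUG removes the assert */
    }
}

/* Update one bucket while holding only the stripe lock that covers it,
 * so threads working on buckets in other stripes do not wait on us. */
long bucket_increment(uint32_t hash) {
    size_t bucket = hash % NUM_BUCKETS;
    pthread_mutex_t *lock = &stripe_locks[bucket % NUM_STRIPES];

    pthread_mutex_lock(lock);
    long value = ++counters[bucket];
    pthread_mutex_unlock(lock);
    return value;
}
```

With one global lock every thread queues behind the same mutex; with stripes, only threads whose buckets map to the same stripe contend. The stripe count is something you would tune by benchmarking, as described above.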
hmm, i'm clearly wrong, but it doesn't seem like there's any reason memcached couldn't be implemented completely locklessly... can't most of everything be done with CAS operations instead of taking out locks?
That's likely true. We're separating out storage engines and allowing people to write their own. Your engine doesn't need locks. The more stuff we can do in the core without them, the closer we get to your dream.
Lock-free hash tables without GC are kind of hard, but not impossible. I'd certainly welcome a lock-free engine if you're working on one. :)
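For a feel of what the CAS approach looks like, here is a rough sketch of the easy half of a lock-free hash table: inserting at the head of a bucket chain with a C11 compare-and-swap. It assumes nodes are never removed or freed, and `struct node`, `lockfree_insert`, `lockfree_find`, and `NUM_BUCKETS` are made-up names for illustration, not anything from memcached.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NUM_BUCKETS 64 /* hypothetical table size */

struct node {
    int key;
    int value;
    struct node *next;
};

/* Each bucket head is an atomic pointer; static storage starts as NULL. */
static _Atomic(struct node *) buckets[NUM_BUCKETS];

/* Push a caller-allocated node onto its bucket chain without a lock. */
void lockfree_insert(struct node *n) {
    assert(n != NULL);
    _Atomic(struct node *) *head = &buckets[(unsigned)n->key % NUM_BUCKETS];
    struct node *old = atomic_load(head);
    do {
        /* Link to the head we last observed. If the CAS below fails,
         * `old` is refreshed with the current head and we retry. */
        n->next = old;
    } while (!atomic_compare_exchange_weak(head, &old, n));
}

/* Walk the chain. This is safe only because this sketch never frees nodes. */
struct node *lockfree_find(int key) {
    struct node *n = atomic_load(&buckets[(unsigned)key % NUM_BUCKETS]);
    while (n != NULL && n->key != key)
        n = n->next;
    return n;
}
```

The hard part is what this sketch leaves out: deletion. Once a node can be unlinked, another thread may still be traversing it, so you cannot free it right away, and a recycled address can fool a later CAS (the ABA problem). A garbage collector hides that; without one you need a reclamation scheme such as hazard pointers or epochs.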