There are many different approaches. Chez Scheme runs a gazillion small passes using the nanopass framework, but is still fast enough to compile a metric shit-tonne of code without any noticeable delay.
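For anyone who hasn't seen the style, here is a rough Python sketch of what "lots of tiny passes" looks like. To be clear, this is not the nanopass framework itself (that is a Scheme DSL) and these are not Chez Scheme's real passes. The tuple-based expression format and the names `map_children`, `desugar_sub`, `fold_neg`, `fold_add`, and `compile_expr` are all made up for illustration.

```python
from functools import reduce

# Hypothetical toy IR: nested tuples such as
#   ("num", 3), ("var", "x"), ("add", a, b), ("sub", a, b), ("neg", a)

def map_children(fn, expr):
    """Rebuild expr with fn applied to every sub-expression (tuple child)."""
    tag, *args = expr
    return (tag, *(fn(a) if isinstance(a, tuple) else a for a in args))

def desugar_sub(expr):
    """Pass 1: rewrite (sub a b) as (add a (neg b)), so later passes never see sub."""
    expr = map_children(desugar_sub, expr)
    if expr[0] == "sub":
        return ("add", expr[1], ("neg", expr[2]))
    return expr

def fold_neg(expr):
    """Pass 2: fold negation of a numeric literal into the literal."""
    expr = map_children(fold_neg, expr)
    if expr[0] == "neg" and expr[1][0] == "num":
        return ("num", -expr[1][1])
    return expr

def fold_add(expr):
    """Pass 3: fold addition of two numeric literals."""
    expr = map_children(fold_add, expr)
    if expr[0] == "add" and expr[1][0] == "num" and expr[2][0] == "num":
        return ("num", expr[1][1] + expr[2][1])
    return expr

# Each pass does exactly one job; the compiler is just their composition.
PASSES = [desugar_sub, fold_neg, fold_add]

def compile_expr(expr):
    """Run every pass in order over the expression."""
    return reduce(lambda e, run_pass: run_pass(e), PASSES, expr)

if __name__ == "__main__":
    print(compile_expr(("sub", ("num", 5), ("num", 2))))
    print(compile_expr(("sub", ("var", "x"), ("num", 2))))
```

Each pass walks the whole tree, which sounds wasteful, but every individual walk is trivial and easy to reason about or test on its own. Adding a new rewrite means appending one small function to `PASSES` rather than touching a big monolithic transformer.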
Interesting -- I suppose it's analogous to CPU pipelining, where having a larger number of simple stages can outperform having a small number of complicated ones.