I've worked on sensor fusion systems and I fully agree. Replaying recorded sensor inputs and evaluating the newly estimated state is a really good way of debugging failures, because you can't just stop the system mid-air and inspect its internal state. It also helps with building a regression test suite and with trying out new algorithms quickly.
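
Roughly what I mean, as a minimal sketch rather than anything from a real system: `Reading`, `ScalarKalman`, `replay` and the logged values are all made up for illustration, and the 1-D filter is just a stand-in for whatever estimator you actually run. The point is that the estimator only ever sees the recorded inputs, so the same log always produces the same estimates, and you can swap in a differently tuned filter and compare.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Reading:
    t: float  # timestamp in seconds
    z: float  # measured value (e.g. altitude in metres)


class ScalarKalman:
    """Hypothetical 1-D Kalman filter, a stand-in for a real estimator."""

    def __init__(self, x0: float, p0: float, q: float, r: float) -> None:
        self.x = x0  # state estimate
        self.p = p0  # estimate variance
        self.q = q   # process noise variance added per step
        self.r = r   # measurement noise variance

    def update(self, z: float) -> float:
        self.p += self.q                # predict: uncertainty grows
        k = self.p / (self.p + self.r)  # Kalman gain
        self.x += k * (z - self.x)      # correct towards the measurement
        self.p *= 1.0 - k               # uncertainty shrinks after the update
        return self.x


def replay(log: Iterable[Reading], estimator: ScalarKalman) -> List[float]:
    """Feed recorded readings in timestamp order and keep every estimate."""
    return [estimator.update(r.z) for r in sorted(log, key=lambda r: r.t)]


# Made-up log with one outlier at t = 0.3 (illustrative data, not real measurements).
recorded_log = [
    Reading(0.0, 10.2),
    Reading(0.1, 9.8),
    Reading(0.2, 10.1),
    Reading(0.3, 30.0),
    Reading(0.4, 10.0),
]

# Same inputs, two tunings: the current filter and a candidate that trusts the sensor less.
baseline = replay(recorded_log, ScalarKalman(x0=10.0, p0=1.0, q=0.01, r=0.5))
candidate = replay(recorded_log, ScalarKalman(x0=10.0, p0=1.0, q=0.01, r=5.0))
```

For a regression test you would store `baseline` next to the log and assert that a later build still reproduces it within some tolerance. For debugging, you can break inside `update` at the reading where things went wrong, which is exactly what you can't do in flight.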