Yes, backpropagation isn't the chain rule itself, but just an efficient way to c...

blt · on Aug 21, 2022

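I think of it as computing the chain rule in an order such that we never need to materialize Jacobians explicitly; we only need vector-Jacobian products, i.e. the upstream gradient vector multiplied by each layer's Jacobian (equivalently, the transposed Jacobian applied to that vector).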
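To make that concrete, here is a minimal NumPy sketch. The tiny network (a linear layer, `tanh`, another linear layer, squared-error loss), its shapes, and all variable names are assumptions chosen for illustration, not anything from a particular course or library. It computes the gradient with respect to the input twice: once by passing a single vector backward through each layer, and once by building every Jacobian and multiplying them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy network: x -> W1 @ x -> tanh -> W2 @ h -> squared-error loss
x = rng.normal(size=4)
W1 = rng.normal(size=(3, 4))
W2 = rng.normal(size=(2, 3))
y = rng.normal(size=2)

# Forward pass, keeping the intermediates the backward pass needs
a = W1 @ x
h = np.tanh(a)
out = W2 @ h
loss = 0.5 * np.sum((out - y) ** 2)

# Backward pass: one vector is pushed back through each layer in turn
g_out = out - y              # dL/d(out), shape (2,)
g_h = W2.T @ g_out           # through out = W2 @ h
g_a = g_h * (1.0 - h ** 2)   # through tanh; its diagonal Jacobian is never built
g_x = W1.T @ g_a             # through a = W1 @ x

# Weight gradients fall out of the same backward vectors
g_W2 = np.outer(g_out, h)
g_W1 = np.outer(g_a, x)

# Reference: build every Jacobian explicitly, multiply them, then apply g_out
J_out_h = W2                      # shape (2, 3)
J_h_a = np.diag(1.0 - h ** 2)     # shape (3, 3)
J_a_x = W1                        # shape (3, 4)
g_x_explicit = g_out @ (J_out_h @ J_h_a @ J_a_x)

assert np.allclose(g_x, g_x_explicit)
```

Both routes give the same gradient. The difference is cost: the explicit route does matrix-matrix products and stores a full Jacobian per layer, while the backward pass only ever does matrix-vector work, which is what keeps it cheap when layers are wide.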

I also didn't totally grasp its significance until implementing neural networks from matrix/array operations in NumPy. I hope all deep learning courses include this exercise.