Yes, backpropagation isn't the chain rule itself, just an efficient way to apply it. (In this respect it has some connections to dynamic programming, where you find the most efficient order of recursive computations, reusing intermediate results, to arrive at the solution.)
I think of it as: evaluating the chain rule in an order such that we never need to form a Jacobian explicitly, only vector-Jacobian products, because the gradient of the scalar loss gets pulled back through each layer as a single vector.
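For example, for a stack of linear layers you could in principle multiply the layers' full Jacobians together and then apply the loss gradient, but pulling the loss gradient back first keeps everything as vector-matrix products. A rough NumPy sketch (the layer widths and the sum-loss are made up for illustration):

    import numpy as np

    rng = np.random.default_rng(0)
    sizes = [64, 128, 256, 32]          # made-up layer widths
    Ws = [rng.normal(size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]

    x = rng.normal(size=sizes[0])
    y = x
    for W in Ws:                        # forward pass: y = W3 @ W2 @ W1 @ x
        y = W @ y
    # scalar loss = y.sum(), so dloss/dy is a vector of ones

    # Naive chain rule: build the full Jacobian dy/dx with matrix-matrix products
    J = Ws[-1]
    for W in reversed(Ws[:-1]):
        J = J @ W                       # e.g. (32x256) @ (256x128), then @ (128x64)
    grad_naive = np.ones(sizes[-1]) @ J

    # Backprop order: pull the loss gradient back as a vector (vector-Jacobian products)
    g = np.ones(sizes[-1])
    for W in reversed(Ws):
        g = g @ W                       # always a vector-matrix product
    assert np.allclose(g, grad_naive)   # same answer, never formed a full Jacobian per step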
I also didn't totally grasp its significance until I implemented a neural network from scratch with matrix/array operations in NumPy. I hope all deep learning courses include this exercise.
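Something like the following is the kind of exercise I mean: a tiny two-layer network where the forward and backward passes are written out by hand. This is only a rough sketch with made-up sizes, a tanh hidden layer, and a squared-error loss:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(16, 4))                 # toy batch: 16 examples, 4 features
    y = rng.normal(size=(16, 1))                 # toy regression targets
    W1 = rng.normal(size=(4, 8)) * 0.1
    W2 = rng.normal(size=(8, 1)) * 0.1

    # Forward pass
    z1 = X @ W1
    h1 = np.tanh(z1)
    y_hat = h1 @ W2
    loss = np.mean((y_hat - y) ** 2)

    # Backward pass: apply the chain rule layer by layer, back to front
    d_yhat = 2 * (y_hat - y) / len(X)            # dloss/dy_hat
    dW2 = h1.T @ d_yhat                          # dloss/dW2
    d_h1 = d_yhat @ W2.T                         # pull the gradient back through W2
    d_z1 = d_h1 * (1 - np.tanh(z1) ** 2)         # through the tanh nonlinearity
    dW1 = X.T @ d_z1                             # dloss/dW1

Writing out dW1 and dW2 by hand like this made it click for me why the intermediate activations have to be kept around from the forward pass.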
Yes, they are not the same. The chain rule is what solves the one non-trivial problem in backpropagation: figuring out how weights deep in the network affect the error. Besides that, it's just the fairly obvious idea of adjusting each weight in proportion to how much it contributes to the error.
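That is, once backprop has produced dE/dW for a weight matrix W, the update itself is just a step proportional to that gradient (a sketch with made-up shapes and learning rate):

    import numpy as np

    rng = np.random.default_rng(0)
    W = rng.normal(size=(8, 4))      # some weight matrix
    dE_dW = rng.normal(size=(8, 4))  # its gradient dE/dW from backprop
    lr = 0.01                        # made-up learning rate

    W -= lr * dE_dW                  # each weight moves in proportion to its effect on the error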