
Actually, backpropagation is more of a fancy word for the chain rule.



ALMOST like using the chain rule

Backpropagation ≠ Chain Rule: https://theorydish.blog/2021/12/16/backpropagation-≠-chain-r...


That's just nitpicking, but ok: backpropagation is the application of the chain rule for total derivatives.

Look into forward- vs reverse-mode automatic differentiation, and you'll understand what I'm referring to.
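
Concretely: for a function R^n -> R, forward mode needs one pass per input dimension, while reverse mode (backprop) gets the whole gradient in a single backward pass. A minimal sketch in NumPy (toy function, derivatives written by hand, names are illustrative):

    import numpy as np

    # Toy computation: f(x) = sum(sin(x) * x), x in R^n.

    # Forward mode: propagate one directional derivative per pass.
    def forward_mode_grad(x):
        n = len(x)
        grad = np.zeros(n)
        for i in range(n):                        # one pass per input dimension
            dx = np.zeros(n); dx[i] = 1.0
            s, ds = np.sin(x), np.cos(x) * dx     # s = sin(x)
            y, dy = s * x, ds * x + s * dx        # y = s * x (elementwise)
            grad[i] = dy.sum()
        return grad

    # Reverse mode: one forward pass caching intermediates,
    # one backward pass pushing the output sensitivity back.
    def reverse_mode_grad(x):
        s = np.sin(x)                             # forward pass, cache s
        dy = 1.0                                  # d(output)/d(output)
        ds = dy * x                               # through y = sum(s * x)
        dx = dy * s + ds * np.cos(x)              # direct term + chain through sin
        return dx

    x = np.random.randn(5)
    assert np.allclose(forward_mode_grad(x), reverse_mode_grad(x))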


Yes, backpropagation isn't the chain rule itself, just an efficient way to evaluate the derivatives that the chain rule describes. (In this respect it has connections to dynamic programming, where you find the most efficient order of recursive computations to arrive at the solution.)
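
The dynamic-programming connection is easiest to see in the cost of multiplying the chain of Jacobians. For a scalar loss, contracting from the loss side is vastly cheaper than from the input side; a rough FLOP count with illustrative sizes:

    # dL/dx = J3 @ J2 @ J1 for a net R^n -> R^n -> R^n -> R (scalar loss),
    # so J1 and J2 are n x n, and J3 is 1 x n.
    def matmul_flops(m, k, p):
        return m * k * p        # cost of an (m x k) @ (k x p) multiply

    n = 1000
    forward_order = matmul_flops(n, n, n) + matmul_flops(1, n, n)  # (J2 @ J1) first
    reverse_order = matmul_flops(1, n, n) + matmul_flops(1, n, n)  # (J3 @ J2) first
    print(forward_order)        # 1_001_000_000
    print(reverse_order)        # 2_000_000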


I think of it as: computing the chain rule in the order where we never need to materialize a Jacobian explicitly, only vector-Jacobian products.
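
For example, for an elementwise nonlinearity the Jacobian is diagonal, and the vector-Jacobian product is just an elementwise multiply, so there's no reason to ever build the matrix. A sketch with a sigmoid (sizes are arbitrary):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    n = 1000
    a = sigmoid(np.random.randn(n))
    v = np.random.randn(n)             # upstream gradient (output sensitivity)

    # Explicit Jacobian of the elementwise sigmoid: an n x n diagonal matrix.
    J = np.diag(a * (1.0 - a))
    vjp_explicit = v @ J               # O(n^2) time and memory

    # The same vector-Jacobian product without ever forming J: O(n).
    vjp_cheap = v * a * (1.0 - a)

    assert np.allclose(vjp_explicit, vjp_cheap)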

I also didn't fully grasp its significance until I implemented a neural network from scratch using matrix/array operations in NumPy. I hope all deep learning courses include this exercise.
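
For anyone who wants to try the exercise, a minimal sketch of that from-scratch version (one hidden layer, squared error; sizes, learning rate, and names are arbitrary):

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.standard_normal((64, 3))        # batch of inputs
    y = rng.standard_normal((64, 1))        # regression targets

    W1 = rng.standard_normal((3, 16)) * 0.1; b1 = np.zeros(16)
    W2 = rng.standard_normal((16, 1)) * 0.1; b2 = np.zeros(1)

    lr = 0.1
    for step in range(200):
        # Forward pass, caching intermediates for the backward pass.
        z1 = X @ W1 + b1
        h = np.tanh(z1)
        pred = h @ W2 + b2
        loss = ((pred - y) ** 2).mean()

        # Backward pass: the chain rule, applied layer by layer from the output.
        dpred = 2 * (pred - y) / len(X)     # dL/dpred
        dW2 = h.T @ dpred
        db2 = dpred.sum(axis=0)
        dh = dpred @ W2.T                   # vector-Jacobian product through the linear layer
        dz1 = dh * (1 - h ** 2)             # through tanh'(z1)
        dW1 = X.T @ dz1
        db1 = dz1.sum(axis=0)

        # Update every weight in proportion to its effect on the loss.
        for p, g in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
            p -= lr * g

    print(loss)                             # decreases over the 200 steps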


Yes, they are not the same. The chain rule solves the one non-trivial problem in backpropagation: computing how much each weight contributes to the error. Beyond that, it's just the fairly obvious idea of changing each weight in proportion to its impact on the error.
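
And that obvious idea is just gradient descent: for a small enough step, nudging each weight against its own partial derivative reduces the error. A quick check on a toy error surface:

    import numpy as np

    def E(w):
        return np.sum((w - 1.0) ** 2)      # toy error surface

    w = np.array([3.0, -2.0])
    g = 2 * (w - 1.0)                      # dE/dw: each weight's impact on the error
    assert E(w - 0.1 * g) < E(w)           # the proportional step reduces the error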



