In: Lectures from the Caianiello Summer School on Adaptive Processing of Sequences
Abstract
Deriving gradient algorithms for time-dependent neural network structures typically requires numerous chain rule expansions, diligent bookkeeping, and careful manipulation of terms. While principled methods based on Euler-Lagrange or ordered-derivative approaches exist, we present an alternative built on a small set of simple block diagram manipulation rules. The approach provides a common framework for deriving popular algorithms, including backpropagation and backpropagation-through-time, without a single chain rule expansion.
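To give a flavor of the claim, the following is a minimal sketch, not the paper's diagrammatic rules: for a linear recurrent network with a quadratic cost (both assumptions for illustration), reversing every branch of the forward signal-flow graph (so the weight matrix W becomes its transpose) and running the error backward through the reversed graph yields the exact gradient, i.e., backpropagation-through-time, with no hand-written chain rule algebra. The network, cost, and sizes below are hypothetical.

    import numpy as np

    rng = np.random.default_rng(0)
    T, n = 5, 3
    W = rng.standard_normal((n, n)) * 0.5   # recurrent weights (illustrative)
    x = rng.standard_normal((T, n))         # input sequence (illustrative)

    # Forward pass of a linear recurrent network: h[t+1] = W h[t] + x[t]
    h = np.zeros((T + 1, n))
    for t in range(T):
        h[t + 1] = W @ h[t] + x[t]
    loss = 0.5 * np.sum(h[T] ** 2)          # quadratic cost on final state

    # Reversed-graph pass: propagate the error backward through the
    # transposed graph (W -> W^T), accumulating dL/dW along the way.
    # This recovers backpropagation-through-time by graph reversal alone.
    delta = h[T].copy()                      # dL/dh[T]
    dW = np.zeros_like(W)
    for t in reversed(range(T)):
        dW += np.outer(delta, h[t])          # contribution of step t
        delta = W.T @ delta                  # error through reversed branch

    # Finite-difference check that the reversed-graph gradient is exact.
    eps = 1e-6
    num = np.zeros_like(W)
    for i in range(n):
        for j in range(n):
            Wp = W.copy(); Wp[i, j] += eps
            hp = np.zeros(n)
            for t in range(T):
                hp = Wp @ hp + x[t]
            num[i, j] = (0.5 * np.sum(hp ** 2) - loss) / eps
    print(np.allclose(dW, num, atol=1e-4))   # True

The numerical check confirms that the backward sweep through the transposed graph matches the true gradient; the paper's block diagram manipulation rules generalize this kind of reversal to arbitrary network topologies.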