While Long Short-Term Memory (LSTM) networks provide an effective mechanism for capturing long-term dependencies using multiple gates, Gated Recurrent Units (GRUs) offer a related, often simpler, alternative.
This chapter introduces the GRU architecture. We will examine its components, specifically the update gate ($z_t$) and the reset gate ($r_t$), and understand how they work together to control information flow and update the hidden state. We'll look at how the candidate hidden state is calculated and combined with the previous hidden state.
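To make that flow of information concrete before the detailed sections, here is a minimal sketch of a single GRU step in NumPy. The weight names (W_z, U_z, and so on) and the toy dimensions are illustrative choices for this sketch, not part of any particular library; each equation is developed step by step later in the chapter.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, params):
    """One GRU step: returns the new hidden state h_t."""
    W_z, U_z, b_z = params["z"]  # update gate weights
    W_r, U_r, b_r = params["r"]  # reset gate weights
    W_h, U_h, b_h = params["h"]  # candidate state weights

    # Update gate z_t: how much of the state gets replaced by new content
    z_t = sigmoid(W_z @ x_t + U_z @ h_prev + b_z)
    # Reset gate r_t: how much of the previous state the candidate can see
    r_t = sigmoid(W_r @ x_t + U_r @ h_prev + b_r)
    # Candidate hidden state, built from the input and the reset-scaled state
    h_tilde = np.tanh(W_h @ x_t + U_h @ (r_t * h_prev) + b_h)
    # Final state: elementwise blend of the old state and the candidate
    # (some references swap the roles of z_t and 1 - z_t; the idea is the same)
    return (1.0 - z_t) * h_prev + z_t * h_tilde

# Toy run with random weights: input size 4, hidden size 3, sequence length 5
rng = np.random.default_rng(0)
input_size, hidden_size = 4, 3
params = {
    gate: (rng.normal(size=(hidden_size, input_size)),   # W: input -> hidden
           rng.normal(size=(hidden_size, hidden_size)),  # U: hidden -> hidden
           np.zeros(hidden_size))                        # b: bias
    for gate in ("z", "r", "h")
}
h = np.zeros(hidden_size)
for x_t in rng.normal(size=(5, input_size)):
    h = gru_step(x_t, h, params)
print(h)  # final hidden state after 5 steps, shape (3,)
```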
We will also compare GRUs directly to LSTMs, discussing their structural differences and relative computational efficiency, and offering practical guidance on selecting the appropriate gated unit for specific sequence modeling tasks.
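One concrete consequence of the structural difference: a GRU computes three weighted transformations per step (two gates plus the candidate state) while an LSTM computes four, so a GRU of the same hidden size carries roughly three quarters of the parameters. A quick way to see this, assuming PyTorch is available, is to count the parameters of the built-in layers (the example sizes below are arbitrary):

```python
import torch.nn as nn

input_size, hidden_size = 64, 128
gru = nn.GRU(input_size=input_size, hidden_size=hidden_size)
lstm = nn.LSTM(input_size=input_size, hidden_size=hidden_size)

def count_params(module):
    return sum(p.numel() for p in module.parameters())

# The GRU total is about 3/4 of the LSTM total: 3 versus 4 blocks of
# input-to-hidden and hidden-to-hidden weights (plus biases).
print("GRU parameters: ", count_params(gru))
print("LSTM parameters:", count_params(lstm))
```

The exact counts depend on a framework's bias conventions, but the roughly 3:4 ratio holds in general and is the basis of the efficiency discussion later in the chapter.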
6.1 Introducing GRUs: A Simpler Gated Architecture
6.2 The GRU Cell Architecture
6.3 The Update Gate
6.4 The Reset Gate
6.5 Calculating the Candidate Hidden State
6.6 Calculating the Final Hidden State
6.7 Comparing GRU and LSTM
6.8 Computational Efficiency Considerations
6.9 When to Choose GRU or LSTM