Having established the nature of sequential data and the limitations of standard feedforward networks in the previous chapter, we now turn our attention to models specifically designed for sequences. This chapter introduces the fundamentals of Recurrent Neural Networks (RNNs).
You will learn the core idea behind RNNs: processing sequences element by element while maintaining an internal 'memory' or hidden state. We will examine the architecture of a simple RNN cell, understanding how the input $x_t$ at time step $t$ combines with the previous hidden state $h_{t-1}$ to produce the current hidden state $h_t$ and an optional output $y_t$. The mathematical operations governing this process, often represented as:
$$h_t = f(W_{hh} h_{t-1} + W_{xh} x_t + b_h)$$
$$y_t = g(W_{hy} h_t + b_y)$$
(where $f$ and $g$ are activation functions like $\tanh$ or sigmoid) will be detailed. We will visualize how information flows through time and introduce the essential training algorithm for RNNs: Backpropagation Through Time (BPTT), including the concept of unrolling the network.
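To make these equations concrete, the sketch below steps a single RNN cell through a short sequence in NumPy. The dimensions, the random initialization, and the choice of $\tanh$ for $f$ with a linear readout for $g$ are illustrative assumptions, not details fixed by the chapter.

```python
import numpy as np

# Minimal sketch of a simple RNN cell, assuming f = tanh and a linear readout g.
# Sizes and initialization below are illustrative choices.
rng = np.random.default_rng(0)
input_size, hidden_size, output_size = 3, 4, 2

# Parameters from the formulas:
#   h_t = tanh(W_hh h_{t-1} + W_xh x_t + b_h)
#   y_t = W_hy h_t + b_y
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
W_xh = rng.normal(scale=0.1, size=(hidden_size, input_size))
W_hy = rng.normal(scale=0.1, size=(output_size, hidden_size))
b_h = np.zeros(hidden_size)
b_y = np.zeros(output_size)

def rnn_step(x_t, h_prev):
    """Compute one time step: the new hidden state and the output."""
    h_t = np.tanh(W_hh @ h_prev + W_xh @ x_t + b_h)
    y_t = W_hy @ h_t + b_y
    return h_t, y_t

# Process a short sequence element by element, carrying the hidden state forward.
sequence = rng.normal(size=(5, input_size))  # 5 time steps of 3-dimensional inputs
h = np.zeros(hidden_size)                    # initial hidden state h_0
for t, x_t in enumerate(sequence):
    h, y = rnn_step(x_t, h)
    print(f"t={t}: h_t={np.round(h, 3)}, y_t={np.round(y, 3)}")
```

Carrying `h` forward from one call to the next is exactly the 'memory' described above; it is this reuse of the same weights at every step, with a state threaded through time, that distinguishes an RNN from a feedforward network.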
By the end of this chapter, you will grasp the operational principles of basic RNNs and the mechanics of their training process.
2.1 The Core Idea: Processing Sequences Iteratively
2.2 Simple RNN Architecture
2.3 The Role of the Hidden State
2.4 Mathematical Formulation of an RNN Cell
2.5 Information Flow in RNNs
2.6 Backpropagation Through Time (BPTT)
2.7 Unrolling the Network for Training