In this exercise you will build a simple sequence model with TensorFlow and its Keras API, putting concepts such as Recurrent Neural Networks (RNNs), LSTMs, and GRUs into practice. The goal is to solidify your understanding of how to prepare sequential text data and construct a basic recurrent model for a representative task.
We'll tackle a simplified sentiment analysis problem: classifying short text snippets as either positive or negative. While sentiment analysis often involves more complex datasets and models, this example focuses purely on the mechanics of setting up and training a sequence model.
First, ensure you have TensorFlow installed. If not, you can typically install it using pip:
pip install tensorflow
We'll use Keras, which is bundled with TensorFlow, for building our model. Let's define a small, synthetic dataset for demonstration purposes.
import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, SimpleRNN, Dense, LSTM, GRU
from tensorflow.keras.optimizers import Adam
# Fix random seeds for reproducibility
tf.keras.utils.set_random_seed(42)
# Sample data: (text, label) -> 0 for negative, 1 for positive
# Each sentence contains at least one unambiguous sentiment word so the
# model can learn to generalise across the train/validation split.
positives = [
"this is a great movie",
"i really enjoyed the film",
"what a fantastic performance",
"loved every minute of it",
"truly amazing storytelling",
"absolutely wonderful experience",
"a brilliant and captivating film",
"the acting was superb",
"an outstanding piece of cinema",
"excellent direction and great writing",
"i enjoyed this film immensely",
"wonderful and deeply moving",
"a great story told brilliantly",
"fantastic visuals and amazing score",
"loved the characters and the plot",
"superb cinematography and great acting",
"an excellent and enjoyable watch",
"brilliant performances throughout",
"outstanding film highly recommended",
"i loved the wonderful atmosphere",
"a fantastic journey from start to end",
"amazing how great this film is",
"enjoyed the brilliant script",
"truly wonderful and moving experience",
"great film with excellent characters",
"superb acting and a fantastic story",
"an absolutely wonderful movie",
"loved the outstanding direction",
"brilliant film i really enjoyed it",
"a great and amazing experience",
"i found this film truly excellent",
"wonderful performances and great pacing",
"a fantastic and superb achievement",
"loved it brilliant from beginning to end",
"amazing story and excellent execution",
"great script and wonderful acting",
"enjoyed every scene it was brilliant",
"outstanding and amazing in every way",
"a superb film loved every moment",
"excellent and wonderful in equal measure",
]
negatives = [
"this is a terrible movie",
"i really hated the film",
"what a dreadful performance",
"boring from the very first scene",
"truly awful storytelling",
"absolutely disappointing experience",
"a bad and tedious film",
"the acting was terrible",
"a poor piece of cinema",
"awful direction and bad writing",
"i hated this film entirely",
"dull and deeply boring",
"a terrible story told badly",
"dreadful visuals and awful score",
"hated the characters and the plot",
"poor cinematography and terrible acting",
"a bad and disappointing watch",
"awful performances throughout",
"worst film do not recommend",
"i hated the dull atmosphere",
"a terrible journey from start to end",
"disappointing how bad this film is",
"hated the awful script",
"truly dreadful and boring experience",
"bad film with terrible characters",
"poor acting and a dreadful story",
"an absolutely awful movie",
"hated the disappointing direction",
"terrible film i really hated it",
"a bad and dull experience",
"i found this film truly awful",
"disappointing performances and poor pacing",
"a dreadful and terrible achievement",
"hated it boring from beginning to end",
"awful story and bad execution",
"terrible script and dull acting",
"hated every scene it was awful",
"worst and disappointing in every way",
"a terrible film hated every moment",
"bad and dreadful in equal measure",
]
texts = positives + negatives
labels = np.array([1] * len(positives) + [0] * len(negatives))
# Shuffle so the validation split contains a balanced mix of both classes
idx = np.random.permutation(len(texts))
texts = np.array(texts)[idx]
labels = labels[idx]
print(f"Number of samples: {len(texts)} ({labels.sum()} positive, {(1-labels).sum()} negative)")
print(f"Sample text: '{texts[0]}', Label: {labels[0]}")
Sequence models don't work directly with raw text. We need to convert our sentences into numerical representations that the model can process. This involves two main steps: tokenization and padding.
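To make the tokenization step concrete before using the Keras utility, here is a minimal plain-Python sketch of a frequency-based word index with an out-of-vocabulary fallback. The helper names `build_word_index` and `to_sequence` are hypothetical and exist only for this illustration; the real `Tokenizer` also strips punctuation and applies `num_words`, which this sketch leaves out.

```python
from collections import Counter

def build_word_index(texts, oov_token="<OOV>"):
    """Assign integer ids by descending word frequency; id 1 is reserved for OOV."""
    counts = Counter(word for text in texts for word in text.lower().split())
    index = {oov_token: 1}  # id 0 is left free for padding
    # most_common() sorts by count; ties keep first-seen order
    for rank, (word, _) in enumerate(counts.most_common(), start=2):
        index[word] = rank
    return index

def to_sequence(text, word_index, oov_token="<OOV>"):
    """Map each word to its id, falling back to the OOV id for unseen words."""
    oov_id = word_index[oov_token]
    return [word_index.get(word, oov_id) for word in text.lower().split()]

toy_texts = ["a great film", "a terrible film", "great acting"]
toy_index = build_word_index(toy_texts)
print(toy_index)
print(to_sequence("a great plot", toy_index))  # [2, 3, 1] ("plot" is unseen)
```

The key point is that the index is fitted once on the training texts and then reused unchanged, so any word that was not seen during fitting maps to the OOV id.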
# --- Tokenization ---
vocab_size = 500 # Maximum number of words to keep based on frequency
tokenizer = Tokenizer(num_words=vocab_size, oov_token="<OOV>") # <OOV> for out-of-vocabulary words
tokenizer.fit_on_texts(texts)
word_index = tokenizer.word_index
sequences = tokenizer.texts_to_sequences(texts)
print("\nWord Index Sample:", list(word_index.items())[:10])
print("Original Text:", texts[0])
print("Sequence Representation:", sequences[0])
# --- Padding ---
max_length = 10 # Define a maximum sequence length (can be inferred or set)
padded_sequences = pad_sequences(sequences, maxlen=max_length, padding='post', truncating='post')
print("\nPadded Sequence Example (Post-padding):")
print(padded_sequences[0])
print("Shape of padded sequences:", padded_sequences.shape)
Notice how pad_sequences adds zeros at the end (padding='post') to make all sequences have length 10. If a sequence was longer than max_length, it would be shortened (truncating='post').
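The effect of `padding='post'` and `truncating='post'` can be reproduced in a few lines of plain Python. This is only a sketch of the behaviour described above, not the Keras implementation, and `pad_post` is a hypothetical helper name.

```python
def pad_post(sequences, maxlen, value=0):
    """Truncate from the end, then pad at the end, so every row has length maxlen."""
    padded = []
    for seq in sequences:
        kept = list(seq[:maxlen])                # truncating='post' drops trailing items
        kept += [value] * (maxlen - len(kept))   # padding='post' appends the fill value
        padded.append(kept)
    return padded

rows = pad_post([[5, 2, 9], [7, 1, 4, 8, 3, 6]], maxlen=4)
print(rows)  # [[5, 2, 9, 0], [7, 1, 4, 8]]
```

The short sequence gains one trailing zero and the long one loses its last two tokens, which is why every row ends up with the same length.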
Now, let's construct our model. We'll use Keras's Sequential API, which allows us to stack layers linearly.
1. Embedding layer: converts each integer word index into a dense vector. Its main arguments are input_dim (size of the vocabulary) and output_dim (dimensionality of the embedding vectors). The sequence length (our max_length from padding) is inferred from the input, so it is not passed explicitly.
2. Recurrent layer: we use SimpleRNN. The primary argument is units, which defines the dimensionality of the hidden state (and output space). Other recurrent layers like LSTM or GRU can be swapped in here.
3. Output layer: a Dense layer with one unit and a sigmoid activation function. The sigmoid function outputs a value between 0 and 1, representing the probability of the positive class.
embedding_dim = 16 # Dimensionality of the word embeddings
rnn_units = 32 # Number of units in the RNN layer
model = Sequential([
    # 1. Embedding Layer
    Embedding(input_dim=vocab_size,
              output_dim=embedding_dim),
    # 2. Recurrent Layer (SimpleRNN)
    # Try replacing SimpleRNN with LSTM or GRU later!
    SimpleRNN(units=rnn_units),
    # If stacking RNN layers, use return_sequences=True on intermediate layers:
    # SimpleRNN(units=rnn_units, return_sequences=True),
    # SimpleRNN(units=rnn_units), # Last RNN layer doesn't need return_sequences=True
    # 3. Output Layer
    Dense(units=1, activation='sigmoid')
])
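# Build the model with a known input shape so that summary() can report
# output shapes and parameter counts before any data has been passed through
model.build(input_shape=(None, max_length))
# Display the model's architecture
model.summary()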
The summary shows the layers, their output shapes, and the number of trainable parameters. Notice how the SimpleRNN layer outputs a single vector of shape (None, 32), where 32 is rnn_units. If return_sequences=True were set, the output shape would be (None, max_length, rnn_units).
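To see where those shapes and parameter counts come from, the following NumPy sketch runs a forward pass by hand with randomly initialised weights of the same shapes as the three layers. It assumes the default SimpleRNN settings (tanh activation, one bias vector); the variable names and the batch size of 4 are illustrative, and the weights are random rather than taken from the trained model.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, embedding_dim, rnn_units, max_length = 500, 16, 32, 10
batch = 4  # arbitrary batch size for the illustration

# Randomly initialised weights with the same shapes as the Keras layers
E = rng.normal(size=(vocab_size, embedding_dim))    # Embedding table
W_x = rng.normal(size=(embedding_dim, rnn_units))   # input-to-hidden kernel
W_h = rng.normal(size=(rnn_units, rnn_units))       # hidden-to-hidden kernel
b = np.zeros(rnn_units)                             # recurrent layer bias
W_out = rng.normal(size=(rnn_units, 1))             # Dense kernel
b_out = np.zeros(1)                                 # Dense bias

token_ids = rng.integers(0, vocab_size, size=(batch, max_length))
x = E[token_ids]                                    # (batch, max_length, embedding_dim)

h = np.zeros((batch, rnn_units))
all_states = []
for t in range(max_length):
    # One recurrent step: combine the current input with the previous hidden state
    h = np.tanh(x[:, t, :] @ W_x + h @ W_h + b)
    all_states.append(h)

last_state = h                                      # what return_sequences=False yields
sequence_out = np.stack(all_states, axis=1)         # what return_sequences=True yields

param_counts = {
    "embedding": E.size,
    "simple_rnn": W_x.size + W_h.size + b.size,
    "dense": W_out.size + b_out.size,
}
print(last_state.shape, sequence_out.shape)  # (4, 32) (4, 10, 32)
print(param_counts, sum(param_counts.values()))
```

With these sizes the counts are 500 × 16 = 8000 for the embedding, 16 × 32 + 32 × 32 + 32 = 1568 for the recurrent layer, and 32 + 1 = 33 for the output layer, 9601 in total.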
Before training, we need to configure the learning process using model.compile(). This involves specifying:
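- Optimizer: the algorithm that updates the weights; here we use Adam with a learning rate of 0.001.
- Loss function: for binary classification with a sigmoid output, binary_crossentropy is appropriate.
- Metrics: the quantities reported during training and evaluation; accuracy is a common metric.
model.compile(optimizer=Adam(learning_rate=0.001),
              loss='binary_crossentropy',
              metrics=['accuracy'])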
print("\nModel compiled successfully.")
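As a rough illustration of what the loss and metric chosen above measure, here is a plain-Python sketch of binary cross-entropy and thresholded accuracy. The function names, the clipping constant `eps`, and the example probabilities are assumptions made for this sketch, not values from the model.

```python
import math

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over the batch."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of samples whose thresholded probability matches the label."""
    hits = sum(int(p > threshold) == y for y, p in zip(y_true, y_prob))
    return hits / len(y_true)

y_true = [1, 0, 1, 0]
confident = [0.9, 0.1, 0.8, 0.2]  # illustrative probabilities
unsure = [0.6, 0.4, 0.6, 0.4]     # illustrative probabilities

for name, probs in (("confident", confident), ("unsure", unsure)):
    print(name, round(binary_crossentropy(y_true, probs), 4), accuracy(y_true, probs))
```

Both sets of predictions score 100% accuracy, but the loss is lower for the confident one. That is why the loss can keep changing after accuracy has stopped moving.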
Now we train the model using our prepared data. We provide the padded sequences as input (X) and the corresponding labels (y).
num_epochs = 30
batch_size = 8
validation_fraction = 0.2 # Use 20% of the data for validation
print(f"\nStarting training for {num_epochs} epochs...")
history = model.fit(padded_sequences,
                    labels,
                    epochs=num_epochs,
                    batch_size=batch_size,
                    validation_split=validation_fraction,
                    verbose=1) # Set verbose=0 to hide epoch progress
print("\nTraining finished.")
During training, Keras prints the loss and accuracy for both the training set and the validation set (if provided) after each epoch.
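It helps to know how `validation_split` carves up the data: Keras takes the validation samples from the end of the arrays, before any per-epoch shuffling. The sketch below mirrors that behaviour in plain Python for the sizes used here; `split_sizes` is a hypothetical helper, not a Keras function.

```python
import math

def split_sizes(n_samples, validation_split, batch_size):
    """Approximate the sizes used by fit(..., validation_split=...)."""
    n_train = int(n_samples * (1.0 - validation_split))  # leading samples are trained on
    n_val = n_samples - n_train                          # trailing samples are held out
    steps_per_epoch = math.ceil(n_train / batch_size)
    return n_train, n_val, steps_per_epoch

# Labels in their original, unshuffled order: 40 positives then 40 negatives
labels_unshuffled = [1] * 40 + [0] * 40
n_train, n_val, steps = split_sizes(len(labels_unshuffled), 0.2, 8)
val_labels = labels_unshuffled[n_train:]

print(n_train, n_val, steps)                            # 64 16 8
print(sum(val_labels), "positive of", len(val_labels))  # 0 positive of 16
```

Without the earlier shuffle, all 16 validation samples would be negative, which is why the data is permuted before calling `fit`.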
Plotting the training and validation loss and accuracy over epochs is a standard way to assess the model's learning progress and check for overfitting. Overfitting occurs when the model performs well on the training data but poorly on unseen validation data (training loss decreases while validation loss increases).
import plotly.graph_objects as go
from plotly.subplots import make_subplots
# Extract history data
acc = history.history['accuracy']
val_acc = history.history.get('val_accuracy') # Use .get() in case validation_split was 0
loss = history.history['loss']
val_loss = history.history.get('val_loss')
epochs_range = range(1, num_epochs + 1)
# Create figure with subplots
fig = make_subplots(rows=1, cols=2, subplot_titles=('Training and Validation Accuracy', 'Training and Validation Loss'))
# Add Accuracy trace
fig.add_trace(go.Scatter(x=list(epochs_range), y=acc, name='Training Accuracy', mode='lines+markers', marker_color='#1f77b4'), row=1, col=1)
if val_acc:
    fig.add_trace(go.Scatter(x=list(epochs_range), y=val_acc, name='Validation Accuracy', mode='lines+markers', marker_color='#ff7f0e'), row=1, col=1)
# Add Loss trace
fig.add_trace(go.Scatter(x=list(epochs_range), y=loss, name='Training Loss', mode='lines+markers', marker_color='#1f77b4'), row=1, col=2)
if val_loss:
    fig.add_trace(go.Scatter(x=list(epochs_range), y=val_loss, name='Validation Loss', mode='lines+markers', marker_color='#ff7f0e'), row=1, col=2)
# Update layout
fig.update_layout(
    height=400,
    width=800,
    xaxis_title='Epoch',
    yaxis_title='Accuracy',
    xaxis2_title='Epoch',
    yaxis2_title='Loss',
    legend_title_text='Metric',
    margin=dict(l=20, r=20, t=50, b=20) # Adjust margins
)
# Display the plot (in environments that support Plotly rendering)
# fig.show() # Uncomment to display locally if Plotly is configured
# Or provide the JSON representation for web embedding
plotly_json = fig.to_json()
Training and validation accuracy and loss curves over 30 epochs.
In this simple example with easily separable data, the training accuracy quickly reaches 1.0 (100%) while the validation accuracy plateaus around 93–94%, and the diverging loss curves show clear signs of overfitting. On more realistic datasets, you'd expect a more gradual increase in accuracy and less pronounced overfitting.
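One simple way to read diverging curves numerically is to find the epoch where validation loss bottoms out and to watch the gap between the two losses. The sketch below uses made-up loss values purely for illustration (they are not results from the run above), and the helper names are hypothetical.

```python
def best_epoch(val_loss):
    """Return the 1-based epoch with the lowest validation loss."""
    lowest = min(range(len(val_loss)), key=val_loss.__getitem__)
    return lowest + 1

def generalization_gap(train_loss, val_loss):
    """Per-epoch val_loss - train_loss; a steadily growing gap suggests overfitting."""
    return [round(v - t, 4) for t, v in zip(train_loss, val_loss)]

# Made-up numbers for illustration only
toy_history = {
    "loss":     [0.69, 0.52, 0.31, 0.15, 0.07, 0.03],
    "val_loss": [0.68, 0.55, 0.40, 0.33, 0.36, 0.42],
}

gaps = generalization_gap(toy_history["loss"], toy_history["val_loss"])
print(best_epoch(toy_history["val_loss"]))  # 4
print(gaps)
```

In this toy history the training loss keeps falling while validation loss turns upward after epoch 4, the same pattern as the diverging curves described above. With a real run, the same functions could be applied to `history.history`.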
Finally, let's see how to use the trained model to predict the sentiment of new, unseen text. Remember to apply the same preprocessing steps (tokenization and padding) to the new data.
new_texts = [
"it was truly great",
"a complete waste of time",
"amazing film loved it"
]
# Preprocess the new texts
new_sequences = tokenizer.texts_to_sequences(new_texts)
new_padded = pad_sequences(new_sequences, maxlen=max_length, padding='post', truncating='post')
print("\nNew padded sequences:")
print(new_padded)
# Get predictions (probabilities)
predictions = model.predict(new_padded)
print("\nRaw Predictions (Probabilities):")
print(predictions)
# Interpret predictions (threshold at 0.5)
predicted_labels = (predictions > 0.5).astype(int).flatten() # flatten converts [[0],[1]] to [0,1]
print("\nPredicted Labels (0=Negative, 1=Positive):")
for text, label in zip(new_texts, predicted_labels):
    sentiment = "Positive" if label == 1 else "Negative"
    print(f"'{text}' -> {sentiment}")
The output shows the probability assigned by the model to the positive class (values closer to 1 indicate positive sentiment, closer to 0 indicate negative) and the final predicted label based on a 0.5 threshold.
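The mapping from the output unit's raw pre-activation value to a probability and then to a label can be sketched in plain Python. The input values below are arbitrary and `to_label` is a hypothetical helper that mirrors the strict `> 0.5` comparison used above.

```python
import math

def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def to_label(probability, threshold=0.5):
    """Return 1 (positive) only when the probability strictly exceeds the threshold."""
    return 1 if probability > threshold else 0

for z in (-3.0, 0.0, 3.0):
    p = sigmoid(z)
    print(f"z={z:+.1f} -> p={p:.3f} -> label {to_label(p)}")
```

Note that a probability of exactly 0.5 is labelled negative with a strict comparison. For applications where one type of error is more costly, the threshold can be moved away from 0.5.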
This example provides a basic framework. You are encouraged to experiment:
- Swap the recurrent layer: replace SimpleRNN with LSTM or GRU in the model definition. Observe whether training speed or final performance changes (this dataset is likely too small and simple to show significant differences related to vanishing gradients).
# Example using LSTM
# model = Sequential([
#     Embedding(input_dim=vocab_size, output_dim=embedding_dim),
#     LSTM(units=rnn_units), # Replace SimpleRNN with LSTM
#     Dense(units=1, activation='sigmoid')
# ])
- Tune hyperparameters: change embedding_dim, rnn_units, learning_rate, batch_size, or num_epochs and retrain the model.
- Stack recurrent layers: add a second recurrent layer (remember to set return_sequences=True on all but the last recurrent layer).
- Use a larger dataset: try a real sentiment dataset, for example one available through tensorflow_datasets.
This practical exercise demonstrated the end-to-end process of building and training a simple sequence model for text classification. You now have the foundational code structure to tackle more complex sequence processing tasks using RNNs, LSTMs, or GRUs.