Let's put the concepts from this chapter into practice. These exercises reinforce your understanding of creating and manipulating Tensors, using Variables, and calculating gradients, which are fundamental skills for working with TensorFlow. Ensure you have TensorFlow installed and imported (import tensorflow as tf) before running these examples.
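If you want a quick sanity check first, a minimal snippet like the one below confirms the import works and prints the installed version (the exact version string depends on your environment):

import tensorflow as tf

# Confirm the import works and show which TensorFlow version is installed
print("TensorFlow version:", tf.__version__)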
In this exercise, you'll create tensors and perform basic operations, similar to how you might handle data inputs or intermediate calculations in a model.
Create Tensors:
1. Create a scalar tensor containing the value 10.0. Check its shape and dtype.
2. Create a vector tensor from the list [1.0, 2.0, 3.0, 4.0]. Check its shape and dtype.
3. Create a matrix tensor from the nested list [[1, 2], [3, 4]]. Specify the dtype as tf.float32. Check its shape.
4. Create a tensor of zeros with shape (3, 2).
5. Create a random tensor with shape (2, 2), drawn from a normal distribution.
.import tensorflow as tf
import numpy as np
# Scalar
scalar = tf.constant(10.0)
print("Scalar:", scalar)
print("Scalar shape:", scalar.shape)
print("Scalar dtype:", scalar.dtype)
print("-" * 20)
# Vector
vector = tf.constant([1.0, 2.0, 3.0, 4.0])
print("Vector:", vector)
print("Vector shape:", vector.shape)
print("Vector dtype:", vector.dtype)
print("-" * 20)
# Matrix
matrix = tf.constant([[1, 2], [3, 4]], dtype=tf.float32)
print("Matrix:", matrix)
print("Matrix shape:", matrix.shape)
print("Matrix dtype:", matrix.dtype)
print("-" * 20)
# Zeros tensor
zeros_tensor = tf.zeros((3, 2))
print("Zeros Tensor:", zeros_tensor)
print("-" * 20)
# Random tensor
random_tensor = tf.random.normal((2, 2), mean=0.0, stddev=1.0)
print("Random Tensor:", random_tensor)
print("-" * 20)
Perform Tensor Operations:
1. Using the matrix and random_tensor created above (ensure they are compatible shapes or create new ones if needed), perform element-wise addition and multiplication.
2. Perform matrix multiplication between matrix and random_tensor (adjust shapes if necessary, e.g., make random_tensor shape (2, 1)). Use tf.matmul().
3. Access a single element of matrix by indexing.
4. Slice out the first row of matrix.
# Create a new (2, 1) random tensor for the matrix multiplication example
random_tensor_reshaped = tf.random.normal((2, 1), mean=0.0, stddev=1.0)
print("Original Matrix:\n", matrix.numpy())
print("Reshaped Random Tensor:\n", random_tensor_reshaped.numpy())
print("-" * 20)
# Element-wise operations (requires compatible shapes)
# Let's create another matrix compatible with 'matrix'
matrix2 = tf.constant([[5, 6], [7, 8]], dtype=tf.float32)
addition_result = tf.add(matrix, matrix2)
multiplication_result = tf.multiply(matrix, matrix2)
print("Element-wise Addition:\n", addition_result.numpy())
print("Element-wise Multiplication:\n", multiplication_result.numpy())
print("-" * 20)
# Matrix multiplication
matmul_result = tf.matmul(matrix, random_tensor_reshaped)
print("Matrix Multiplication Result:\n", matmul_result.numpy())
print("-" * 20)
# Indexing and Slicing
element = matrix[0, 1] # First row (index 0), second column (index 1)
print("Element at [0, 1]:", element.numpy())
first_row = matrix[0, :] # First row, all columns
print("First row:", first_row.numpy())
print("-" * 20)
Variables hold the parameters (like weights and biases) that your model learns during training. This exercise demonstrates their creation and mutability.
Create and Modify a Variable:
1. Create a tf.Variable initialized with the value 5.0. Name it my_variable.
2. Use the .assign() method to change its value to 10.0. Print the updated value.
3. Try adding 2.0 to the variable using tf.add() and then using .assign_add(). Observe the difference.
# Create a Variable
my_variable = tf.Variable(5.0, name="my_variable", dtype=tf.float32)
print("Initial Variable Value:", my_variable.numpy())
print("-" * 20)
# Modify using assign()
my_variable.assign(10.0)
print("Value after assign(10.0):", my_variable.numpy())
print("-" * 20)
# Using tf.add (does not modify in-place)
result_add = tf.add(my_variable, 2.0)
print("Result of tf.add(my_variable, 2.0):", result_add.numpy())
print("Variable value after tf.add (unchanged):", my_variable.numpy())
print("-" * 20)
# Using assign_add (modifies in-place)
my_variable.assign_add(2.0)
print("Value after assign_add(2.0):", my_variable.numpy())
print("-" * 20)
Note how tf.add creates a new tensor, while assign_add modifies the variable directly. This in-place modification is essential for updating model weights during training.
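For instance, a manual weight update during training usually looks like the hedged sketch below; the variable name, learning rate, and gradient value here are placeholders chosen purely for illustration.

# Illustrative update step (placeholder values, not from a real model)
weight = tf.Variable(0.8, dtype=tf.float32)
learning_rate = 0.01
gradient = tf.constant(0.5, dtype=tf.float32)  # pretend this came from a GradientTape
weight.assign_sub(learning_rate * gradient)    # in-place: weight = weight - learning_rate * gradient
print(weight.numpy())  # approximately 0.795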
Automatic differentiation is the engine behind training neural networks. tf.GradientTape allows you to track operations and compute gradients automatically.
Gradient of a Simple Function:
1. Define a simple function, e.g., y = x^3. Define x as a tf.constant or tf.Variable with a float dtype.
2. Use tf.GradientTape to compute the gradient dy/dx at x = 2.0. The expected analytical result is 3x^2 = 3(2^2) = 12.
x = tf.constant(2.0, dtype=tf.float32)
with tf.GradientTape() as tape:
    # Important: Watch the input tensor 'x'
    tape.watch(x)
    # Define the function y = x^3
    y = x * x * x  # or tf.pow(x, 3)

# Compute the gradient of y with respect to x
dy_dx = tape.gradient(y, x)
print(f"Function: y = x^3")
print(f"At x = {x.numpy()}")
print(f"Computed gradient dy/dx: {dy_dx.numpy()}") # Should be close to 12.0
print("-" * 20)
Gradients of a Function with Multiple Variables:
1. Define a function of several inputs, e.g., z = w * x + b. Define w, x, and b as tf.Variables (or tf.constant for x if it represents input data). For example: w = 3.0, x = 4.0, b = 1.0.
2. Use tf.GradientTape to compute the gradients ∂z/∂w, ∂z/∂x, and ∂z/∂b.
w = tf.Variable(3.0, dtype=tf.float32)
x = tf.constant(4.0, dtype=tf.float32) # Input data, usually constant here
b = tf.Variable(1.0, dtype=tf.float32)
with tf.GradientTape(persistent=True) as tape:
    # Note: The tape automatically watches trainable Variables (w and b)
    # x is a tf.constant, so it is not watched by default
    # tape.watch(x)  # Uncomment if you also need the gradient with respect to x
    z = w * x + b

# Compute gradients (persistent=True allows multiple .gradient() calls)
dz_dw = tape.gradient(z, w)
dz_dx = tape.gradient(z, x)  # Will be None because x is not watched and not a Variable
dz_db = tape.gradient(z, b)

# It's good practice to delete the tape once done if persistent=True
del tape
print(f"Function: z = w * x + b")
print(f"w = {w.numpy()}, x = {x.numpy()}, b = {b.numpy()}")
print(f"Computed gradient dz/dw: {dz_dw.numpy()}") # Should be 4.0
print(f"Computed gradient dz/dx: {dz_dx}") # Gradient w.r.t. non-watched Constant is None
# If you need gradient w.r.t. x, ensure x is watched or make x a tf.Variable
# Let's re-run watching x explicitly to get dz_dx
with tf.GradientTape(persistent=True) as tape:
    tape.watch(x)  # Explicitly watch the constant tensor x
    z = w * x + b

dz_dx = tape.gradient(z, x)
del tape
print(f"Computed gradient dz/dx (after watching x): {dz_dx.numpy()}") # Should be 3.0
print(f"Computed gradient dz/db: {dz_db.numpy()}") # Should be 1.0
print("-" * 20)
Notice that by default, GradientTape only tracks operations involving tf.Variable objects. To compute gradients with respect to a tf.constant, you need to use tape.watch(). Also, a tape is automatically consumed after one call to .gradient(). Use persistent=True if you need to call .gradient() multiple times on the same tape.
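To see that last point in action, the brief sketch below (reusing the constant x from above) calls .gradient() twice on a non-persistent tape; the second call raises a RuntimeError.

# A non-persistent tape can only compute one set of gradients
with tf.GradientTape() as tape:
    tape.watch(x)
    y = x * x

first = tape.gradient(y, x)       # works: dy/dx = 2 * x = 8.0
try:
    second = tape.gradient(y, x)  # fails: the tape has already been consumed
except RuntimeError as e:
    print("Second .gradient() call failed:", e)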
This exercise simulates a single step in optimizing a very basic model. We'll calculate the gradient of a simple loss function with respect to model parameters.
Setup:
1. Define sample input data x_true = tf.constant([1.0, 2.0, 3.0], dtype=tf.float32) and target data y_true = tf.constant([2.0, 4.0, 6.0], dtype=tf.float32) (Notice y = 2x).
2. Define model parameters w = tf.Variable(1.0, dtype=tf.float32) and b = tf.Variable(0.0, dtype=tf.float32). Our goal is to learn w = 2 and b = 0.
Calculate Loss and Gradients:
1. Inside a tf.GradientTape context, calculate the predictions y_pred = w * x_true + b and the Mean Squared Error loss tf.reduce_mean(tf.square(y_pred - y_true)).
2. Outside the tape context, calculate the gradients of the loss with respect to w and b.
x_true = tf.constant([1.0, 2.0, 3.0], dtype=tf.float32)
y_true = tf.constant([2.0, 4.0, 6.0], dtype=tf.float32) # Target: y = 2x
# Initial parameter guesses
w = tf.Variable(1.0, dtype=tf.float32)
b = tf.Variable(0.0, dtype=tf.float32)
with tf.GradientTape() as tape:
    # Predict y based on current w and b
    y_pred = w * x_true + b
    # Calculate Mean Squared Error loss
    loss = tf.reduce_mean(tf.square(y_pred - y_true))

# Calculate gradients of the loss with respect to w and b
grad_w, grad_b = tape.gradient(loss, [w, b])
print(f"Input x: {x_true.numpy()}")
print(f"Target y: {y_true.numpy()}")
print(f"Initial w: {w.numpy()}, Initial b: {b.numpy()}")
print(f"Predicted y_pred: {y_pred.numpy()}")
print(f"Initial Loss (MSE): {loss.numpy()}")
print(f"Gradient dLoss/dw: {grad_w.numpy()}")
print(f"Gradient dLoss/db: {grad_b.numpy()}")
print("-" * 20)
# Optional: Perform one step of gradient descent
learning_rate = 0.1
w.assign_sub(learning_rate * grad_w) # w = w - learning_rate * grad_w
b.assign_sub(learning_rate * grad_b) # b = b - learning_rate * grad_b
print(f"Updated w after one step: {w.numpy()}")
print(f"Updated b after one step: {b.numpy()}")
These gradients (grad_w and grad_b) tell you how to adjust w and b to decrease the loss. Optimization algorithms like Gradient Descent use these gradients iteratively to find the best parameter values. After this single update step, w has already moved from 1.0 to about 1.93, much closer to the target of 2; b overshoots to 0.4 at first, but repeated updates drive both parameters toward w = 2 and b = 0.
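To watch that iterative behavior, here is a hedged sketch of a small training loop that simply repeats the update step from above; the step count and learning rate are arbitrary choices for illustration.

# Repeat the gradient descent step (step count and learning rate are illustrative)
w.assign(1.0)  # reset to the initial guesses so the loop starts from scratch
b.assign(0.0)
learning_rate = 0.1

for step in range(200):
    with tf.GradientTape() as tape:
        y_pred = w * x_true + b
        loss = tf.reduce_mean(tf.square(y_pred - y_true))
    grad_w, grad_b = tape.gradient(loss, [w, b])
    w.assign_sub(learning_rate * grad_w)
    b.assign_sub(learning_rate * grad_b)

print(f"After 200 steps: w = {w.numpy():.3f}, b = {b.numpy():.3f}, loss = {loss.numpy():.6f}")
# w should now be close to 2.0 and b close to 0.0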
These exercises cover the essential mechanics of tensor manipulation and gradient computation in TensorFlow. Having practiced these operations, you are better prepared to understand how Keras utilizes these underlying concepts to build and train sophisticated machine learning models, which we will explore in the next chapters.