Let's put the concepts from this chapter into practice. These exercises reinforce your understanding of creating and manipulating Tensors, using Variables, and calculating gradients, which are fundamental skills for working with TensorFlow. Ensure you have TensorFlow installed and imported (import tensorflow as tf) before running these examples.
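If you want a quick sanity check first, a minimal snippet like the one below confirms the import works and prints the installed version (the exact version string depends on your environment):

import tensorflow as tf

# Confirm the import works and show which TensorFlow version is installed
print("TensorFlow version:", tf.__version__)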
In this exercise, you'll create tensors and perform basic operations, similar to how you might handle data inputs or intermediate calculations in a model.
Create Tensors:
1. Create a scalar tensor containing the value 10.0. Check its shape and dtype.
2. Create a vector tensor from the list [1.0, 2.0, 3.0, 4.0]. Check its shape and dtype.
3. Create a matrix tensor from the nested list [[1, 2], [3, 4]]. Specify the dtype as tf.float32. Check its shape.
4. Create a tensor of zeros with shape (3, 2).
5. Create a random tensor with shape (2, 2), drawn from a normal distribution.
.import tensorflow as tf
import numpy as np
# Scalar
scalar = tf.constant(10.0)
print("Scalar:", scalar)
print("Scalar shape:", scalar.shape)
print("Scalar dtype:", scalar.dtype)
print("-" * 20)
# Vector
vector = tf.constant([1.0, 2.0, 3.0, 4.0])
print("Vector:", vector)
print("Vector shape:", vector.shape)
print("Vector dtype:", vector.dtype)
print("-" * 20)
# Matrix
matrix = tf.constant([[1, 2], [3, 4]], dtype=tf.float32)
print("Matrix:", matrix)
print("Matrix shape:", matrix.shape)
print("Matrix dtype:", matrix.dtype)
print("-" * 20)
# Zeros tensor
zeros_tensor = tf.zeros((3, 2))
print("Zeros Tensor:", zeros_tensor)
print("-" * 20)
# Random tensor
random_tensor = tf.random.normal((2, 2), mean=0.0, stddev=1.0)
print("Random Tensor:", random_tensor)
print("-" * 20)
Perform Tensor Operations:
1. Using the matrix and random_tensor created above (ensure they are compatible shapes or create new ones if needed), perform element-wise addition and multiplication.
2. Perform matrix multiplication between matrix and random_tensor (adjust shapes if necessary, e.g., make random_tensor shape (2, 1)). Use tf.matmul().
3. Access a single element of matrix by indexing.
4. Slice out the first row of matrix.
# Create a new (2, 1) random tensor for the matrix multiplication example
random_tensor_reshaped = tf.random.normal((2, 1), mean=0.0, stddev=1.0)
print("Original Matrix:\n", matrix.numpy())
print("Reshaped Random Tensor:\n", random_tensor_reshaped.numpy())
print("-" * 20)
# Element-wise operations (requires compatible shapes)
# Let's create another matrix compatible with 'matrix'
matrix2 = tf.constant([[5, 6], [7, 8]], dtype=tf.float32)
addition_result = tf.add(matrix, matrix2)
multiplication_result = tf.multiply(matrix, matrix2)
print("Element-wise Addition:\n", addition_result.numpy())
print("Element-wise Multiplication:\n", multiplication_result.numpy())
print("-" * 20)
# Matrix multiplication
matmul_result = tf.matmul(matrix, random_tensor_reshaped)
print("Matrix Multiplication Result:\n", matmul_result.numpy())
print("-" * 20)
# Indexing and Slicing
element = matrix[0, 1] # First row (index 0), second column (index 1)
print("Element at [0, 1]:", element.numpy())
first_row = matrix[0, :] # First row, all columns
print("First row:", first_row.numpy())
print("-" * 20)
Variables hold the parameters (like weights and biases) that your model learns during training. This exercise demonstrates their creation and mutability.
Create and Modify a Variable:
1. Create a tf.Variable initialized with the value 5.0. Name it my_variable.
2. Use the .assign() method to change its value to 10.0. Print the updated value.
3. Try adding 2.0 to the variable using tf.add() and then using .assign_add(). Observe the difference.
# Create a Variable
my_variable = tf.Variable(5.0, name="my_variable", dtype=tf.float32)
print("Initial Variable Value:", my_variable.numpy())
print("-" * 20)
# Modify using assign()
my_variable.assign(10.0)
print("Value after assign(10.0):", my_variable.numpy())
print("-" * 20)
# Using tf.add (does not modify in-place)
result_add = tf.add(my_variable, 2.0)
print("Result of tf.add(my_variable, 2.0):", result_add.numpy())
print("Variable value after tf.add (unchanged):", my_variable.numpy())
print("-" * 20)
# Using assign_add (modifies in-place)
my_variable.assign_add(2.0)
print("Value after assign_add(2.0):", my_variable.numpy())
print("-" * 20)
Note how tf.add creates a new tensor, while assign_add modifies the variable directly. This in-place modification is essential for updating model weights during training.
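For instance, a manual weight update during training usually looks like the hedged sketch below; the variable name, learning rate, and gradient value here are placeholders chosen purely for illustration.

# Illustrative update step (placeholder values, not from a real model)
weight = tf.Variable(0.8, dtype=tf.float32)
learning_rate = 0.01
gradient = tf.constant(0.5, dtype=tf.float32)  # pretend this came from a GradientTape
weight.assign_sub(learning_rate * gradient)    # in-place: weight = weight - learning_rate * gradient
print(weight.numpy())  # approximately 0.795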
Automatic differentiation is the engine behind training neural networks. tf.GradientTape allows you to track operations and compute gradients automatically.
Gradient of a Simple Function:
1. Define a simple function, e.g., y = x^3. Define x as a tf.constant or tf.Variable with a float dtype.
2. Use tf.GradientTape to compute the gradient dy/dx at x = 2.0. The expected analytical result is 3x^2 = 3(2^2) = 12.
x = tf.constant(2.0, dtype=tf.float32)
with tf.GradientTape() as tape:
    # Important: Watch the input tensor 'x'
    tape.watch(x)
    # Define the function y = x^3
    y = x * x * x  # or tf.pow(x, 3)

# Compute the gradient of y with respect to x
dy_dx = tape.gradient(y, x)
print(f"Function: y = x^3")
print(f"At x = {x.numpy()}")
print(f"Computed gradient dy/dx: {dy_dx.numpy()}") # Should be close to 12.0
print("-" * 20)
Gradients of a Function with Multiple Variables:
1. Define a function of several inputs, e.g., z = w * x + b. Define w, x, and b as tf.Variables (or tf.constant for x if it represents input data). For example: w = 3.0, x = 4.0, b = 1.0.
2. Use tf.GradientTape to compute the gradients ∂z/∂w, ∂z/∂x, and ∂z/∂b.
w = tf.Variable(3.0, dtype=tf.float32)
x = tf.constant(4.0, dtype=tf.float32) # Input data, usually constant here
b = tf.Variable(1.0, dtype=tf.float32)
with tf.GradientTape(persistent=True) as tape:
    # Note: The tape automatically watches trainable Variables (w and b)
    # x is a tf.constant, so it is not watched by default
    # tape.watch(x)  # Uncomment if you also need the gradient with respect to x
    z = w * x + b

# Compute gradients (persistent=True allows multiple .gradient() calls)
dz_dw = tape.gradient(z, w)
dz_dx = tape.gradient(z, x)  # Will be None because x is not watched and not a Variable
dz_db = tape.gradient(z, b)

# It's good practice to delete the tape once done if persistent=True
del tape
print(f"Function: z = w * x + b")
print(f"w = {w.numpy()}, x = {x.numpy()}, b = {b.numpy()}")
print(f"Computed gradient dz/dw: {dz_dw.numpy()}") # Should be 4.0
print(f"Computed gradient dz/dx: {dz_dx}") # Gradient w.r.t. non-watched Constant is None
# If you need gradient w.r.t. x, ensure x is watched or make x a tf.Variable
# Let's re-run watching x explicitly to get dz_dx
with tf.GradientTape(persistent=True) as tape:
    tape.watch(x)  # Explicitly watch the constant tensor x
    z = w * x + b

dz_dx = tape.gradient(z, x)
del tape
print(f"Computed gradient dz/dx (after watching x): {dz_dx.numpy()}") # Should be 3.0
print(f"Computed gradient dz/db: {dz_db.numpy()}") # Should be 1.0
print("-" * 20)
Notice that by default, GradientTape only tracks operations involving tf.Variable objects. To compute gradients with respect to a tf.constant, you need to use tape.watch(). Also, a tape is automatically consumed after one call to .gradient(). Use persistent=True if you need to call .gradient() multiple times on the same tape.
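To see that last point in action, the brief sketch below (reusing the constant x from above) calls .gradient() twice on a non-persistent tape; the second call raises a RuntimeError.

# A non-persistent tape can only compute one set of gradients
with tf.GradientTape() as tape:
    tape.watch(x)
    y = x * x

first = tape.gradient(y, x)       # works: dy/dx = 2 * x = 8.0
try:
    second = tape.gradient(y, x)  # fails: the tape has already been consumed
except RuntimeError as e:
    print("Second .gradient() call failed:", e)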
This exercise simulates a single step in optimizing a very basic model. We'll calculate the gradient of a simple loss function with respect to model parameters.
Setup:
1. Define sample input data x_true = tf.constant([1.0, 2.0, 3.0], dtype=tf.float32) and target data y_true = tf.constant([2.0, 4.0, 6.0], dtype=tf.float32) (Notice y = 2x).
2. Define model parameters w = tf.Variable(1.0, dtype=tf.float32) and b = tf.Variable(0.0, dtype=tf.float32). Our goal is to learn w = 2 and b = 0.
Calculate Loss and Gradients:
1. Inside a tf.GradientTape context, calculate the predictions y_pred = w * x_true + b and the Mean Squared Error loss tf.reduce_mean(tf.square(y_pred - y_true)).
2. Outside the tape context, calculate the gradients of the loss with respect to w and b.
x_true = tf.constant([1.0, 2.0, 3.0], dtype=tf.float32)
y_true = tf.constant([2.0, 4.0, 6.0], dtype=tf.float32) # Target: y = 2x
# Initial parameter guesses
w = tf.Variable(1.0, dtype=tf.float32)
b = tf.Variable(0.0, dtype=tf.float32)
with tf.GradientTape() as tape:
    # Predict y based on current w and b
    y_pred = w * x_true + b
    # Calculate Mean Squared Error loss
    loss = tf.reduce_mean(tf.square(y_pred - y_true))

# Calculate gradients of the loss with respect to w and b
grad_w, grad_b = tape.gradient(loss, [w, b])
print(f"Input x: {x_true.numpy()}")
print(f"Target y: {y_true.numpy()}")
print(f"Initial w: {w.numpy()}, Initial b: {b.numpy()}")
print(f"Predicted y_pred: {y_pred.numpy()}")
print(f"Initial Loss (MSE): {loss.numpy()}")
print(f"Gradient dLoss/dw: {grad_w.numpy()}")
print(f"Gradient dLoss/db: {grad_b.numpy()}")
print("-" * 20)
# Optional: Perform one step of gradient descent
learning_rate = 0.1
w.assign_sub(learning_rate * grad_w) # w = w - learning_rate * grad_w
b.assign_sub(learning_rate * grad_b) # b = b - learning_rate * grad_b
print(f"Updated w after one step: {w.numpy()}")
print(f"Updated b after one step: {b.numpy()}")
These gradients (grad_w and grad_b) tell you how to adjust w and b to decrease the loss. Optimization algorithms like Gradient Descent use these gradients iteratively to find the best parameter values. After this single update step, w has already moved from 1.0 to about 1.93, much closer to the target of 2; b overshoots to 0.4 at first, but repeated updates drive both parameters toward w = 2 and b = 0.
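To watch that iterative behavior, here is a hedged sketch of a small training loop that simply repeats the update step from above; the step count and learning rate are arbitrary choices for illustration.

# Repeat the gradient descent step (step count and learning rate are illustrative)
w.assign(1.0)  # reset to the initial guesses so the loop starts from scratch
b.assign(0.0)
learning_rate = 0.1

for step in range(200):
    with tf.GradientTape() as tape:
        y_pred = w * x_true + b
        loss = tf.reduce_mean(tf.square(y_pred - y_true))
    grad_w, grad_b = tape.gradient(loss, [w, b])
    w.assign_sub(learning_rate * grad_w)
    b.assign_sub(learning_rate * grad_b)

print(f"After 200 steps: w = {w.numpy():.3f}, b = {b.numpy():.3f}, loss = {loss.numpy():.6f}")
# w should now be close to 2.0 and b close to 0.0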
These exercises cover the essential mechanics of tensor manipulation and gradient computation in TensorFlow. Having practiced these operations, you are better prepared to understand how Keras utilizes these underlying concepts to build and train sophisticated machine learning models, which we will explore in the next chapters.