Having established what Tensors are, the next logical step is to understand how to manipulate them. TensorFlow provides a comprehensive suite of operations, many of which mirror NumPy functions but are optimized for GPU acceleration and automatic differentiation. These operations form the computational backbone of any machine learning model built with TensorFlow.
TensorFlow supports standard element-wise mathematical operations. If you have two tensors of the same shape, you can perform arithmetic directly.
import tensorflow as tf
# Create some tensors
a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
b = tf.constant([[5.0, 6.0], [7.0, 8.0]])
c = tf.constant(2.0)
# Addition
print("Addition (a + b):\n", a + b)
# Alternative using TensorFlow function
print("Addition (tf.add(a, b)):\n", tf.add(a, b))
# Subtraction
print("Subtraction (b - a):\n", b - a)
print("Subtraction (tf.subtract(b, a)):\n", tf.subtract(b, a))
# Element-wise Multiplication
print("Element-wise Multiplication (a * b):\n", a * b)
print("Element-wise Multiplication (tf.multiply(a, b)):\n", tf.multiply(a, b))
# Element-wise Division
print("Element-wise Division (b / a):\n", b / a)
print("Element-wise Division (tf.divide(b, a)):\n", tf.divide(b, a))
# Operations with scalars
print("Scalar Multiplication (a * c):\n", a * c)
print("Scalar Addition (a + c):\n", a + c)
These operations are performed element by element. For example, in a + b, the element at position (0, 0) in the result is the sum of the elements at (0, 0) in a and b (1.0 + 5.0 = 6.0).
What happens if you try to perform element-wise operations on tensors with different shapes? TensorFlow uses a concept called broadcasting, similar to NumPy. If certain rules are met, TensorFlow automatically "broadcasts" the smaller tensor to match the shape of the larger tensor during the operation.
The rules generally allow operations if, for each dimension (starting from the trailing dimensions), the dimension sizes are equal, or one of them is 1 (that dimension is stretched to match), or one tensor is missing the dimension entirely (missing leading dimensions are treated as size 1).
import tensorflow as tf
# Example 1: Adding a scalar (shape ()) to a matrix (shape (2, 2))
matrix = tf.constant([[1.0, 2.0], [3.0, 4.0]])
scalar = tf.constant(10.0)
result = matrix + scalar # Scalar is broadcast to [[10.0, 10.0], [10.0, 10.0]]
print("Matrix + Scalar:\n", result)
# Example 2: Adding a vector (shape (3,)) to a matrix (shape (2, 3))
matrix = tf.constant([[1, 2, 3], [4, 5, 6]]) # Shape (2, 3)
vector = tf.constant([10, 20, 30]) # Shape (3,)
result = matrix + vector # Vector is broadcast across rows
# vector becomes [[10, 20, 30], [10, 20, 30]]
print("Matrix + Vector:\n", result)
# Example 3: Adding a column vector (shape (2, 1)) to a matrix (shape (2, 3))
matrix = tf.constant([[1, 2, 3], [4, 5, 6]]) # Shape (2, 3)
col_vector = tf.constant([[10], [20]]) # Shape (2, 1)
result = matrix + col_vector # Column vector is broadcast across columns
# col_vector becomes [[10, 10, 10], [20, 20, 20]]
print("Matrix + Column Vector:\n", result)
# Example 4: Incompatible shapes (error)
# matrix = tf.constant([[1, 2], [3, 4]]) # Shape (2, 2)
# vector = tf.constant([10, 20, 30]) # Shape (3,)
# try:
#     result = matrix + vector
# except tf.errors.InvalidArgumentError as e:
#     print("\nError (Incompatible shapes):\n", e)
Broadcasting is extremely useful as it often avoids the need to manually tile or repeat tensors to make shapes compatible, leading to more concise code and potentially better memory efficiency.
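To make that saving concrete, here is a minimal sketch comparing broadcasting with the manual alternative of tiling the smaller tensor yourself; both produce the same result, but the broadcast version is shorter and never spells out the repeated copy.
import tensorflow as tf
matrix = tf.constant([[1, 2, 3], [4, 5, 6]]) # Shape (2, 3)
vector = tf.constant([10, 20, 30]) # Shape (3,)
# Manual approach: expand the vector to (1, 3), then tile it to (2, 3)
tiled = tf.tile(tf.expand_dims(vector, axis=0), [2, 1])
manual_result = matrix + tiled
# Broadcasting approach: identical result, no explicit tiling
broadcast_result = matrix + vector
print("Results match:", bool(tf.reduce_all(manual_result == broadcast_result)))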
One of the most fundamental operations in machine learning, particularly in neural networks, is matrix multiplication. This is not element-wise multiplication; use tf.matmul() or the @ operator.
For tf.matmul(a, b), if a has shape (m, n) and b has shape (n, p), the result will have shape (m, p). The inner dimensions (n) must match.
import tensorflow as tf
a = tf.constant([[1, 2], [3, 4]]) # Shape (2, 2)
b = tf.constant([[5, 6], [7, 8]]) # Shape (2, 2)
c = tf.constant([[1, 2, 3], [4, 5, 6]]) # Shape (2, 3)
# Matrix multiplication using tf.matmul
matmul_ab = tf.matmul(a, b)
print("Matrix Multiplication (tf.matmul(a, b)):\n", matmul_ab)
# Matrix multiplication using @ operator
matmul_ab_operator = a @ b
print("Matrix Multiplication (a @ b):\n", matmul_ab_operator)
# Matrix multiplication with different compatible shapes
matmul_ac = tf.matmul(a, c) # (2, 2) @ (2, 3) -> (2, 3)
print("Matrix Multiplication (a @ c):\n", matmul_ac)
# Attempting incompatible shapes will raise an error
# try:
#     tf.matmul(c, a) # (2, 3) @ (2, 2) - Incompatible
# except tf.errors.InvalidArgumentError as e:
#     print("\nError (Incompatible shapes for matmul):\n", e)
# Contrast with element-wise multiplication
elementwise_ab = a * b
print("\nElement-wise Multiplication (a * b):\n", elementwise_ab)
Remember the distinction: * performs element-wise multiplication, while tf.matmul or @ performs standard matrix multiplication.
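One more detail worth knowing, shown in a brief sketch below: tf.matmul also handles stacks of matrices. Leading dimensions are treated as batch dimensions, and the multiplication is applied to the last two axes of each input.
import tensorflow as tf
# A batch of 4 matrix pairs (random values just for illustration)
batch_a = tf.random.normal((4, 2, 3)) # Shape (4, 2, 3)
batch_b = tf.random.normal((4, 3, 5)) # Shape (4, 3, 5)
# tf.matmul multiplies the trailing (2, 3) @ (3, 5) pair for each of the 4 batch entries
batch_result = tf.matmul(batch_a, batch_b)
print("Batched matmul shape:", batch_result.shape) # (4, 2, 5)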
Reduction operations reduce the number of elements in a tensor by performing an operation across specified dimensions (axes). Common examples include tf.reduce_sum, tf.reduce_mean, tf.reduce_max, and tf.reduce_min.
The axis argument specifies the dimension(s) along which to perform the reduction. If axis is not provided, the reduction happens across all dimensions, resulting in a scalar tensor.
import tensorflow as tf
tensor = tf.constant([[1.0, 2.0, 3.0],
                      [4.0, 5.0, 6.0]]) # Shape (2, 3)
# Reduce across all dimensions (sum)
total_sum = tf.reduce_sum(tensor)
print("Total sum (reduce_sum without axis):\n", total_sum) # 1+2+3+4+5+6 = 21
# Reduce along axis 0 (summing down columns)
sum_axis_0 = tf.reduce_sum(tensor, axis=0)
print("Sum along axis 0:\n", sum_axis_0) # [1+4, 2+5, 3+6] = [5, 7, 9] - Shape (3,)
# Reduce along axis 1 (summing across rows)
sum_axis_1 = tf.reduce_sum(tensor, axis=1)
print("Sum along axis 1:\n", sum_axis_1) # [1+2+3, 4+5+6] = [6, 15] - Shape (2,)
# Calculate the mean across all dimensions
mean_val = tf.reduce_mean(tensor)
print("Mean of all elements:\n", mean_val) # 21 / 6 = 3.5
# Find the maximum value along axis 1
max_axis_1 = tf.reduce_max(tensor, axis=1)
print("Max along axis 1:\n", max_axis_1) # [3, 6] - Shape (2,)
Understanding axes is important: reducing along axis 0 collapses the rows (the operation runs down each column), while reducing along axis 1 collapses the columns (the operation runs across each row). In general, the axis you specify is the one that disappears from the result's shape.
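A related argument that often matters in practice is keepdims: it retains the reduced axis with size 1, so the result can be broadcast straight back against the original tensor. A small sketch:
import tensorflow as tf
tensor = tf.constant([[1.0, 2.0, 3.0],
                      [4.0, 5.0, 6.0]]) # Shape (2, 3)
row_sums = tf.reduce_sum(tensor, axis=1) # Shape (2,) - the axis disappears
row_sums_kept = tf.reduce_sum(tensor, axis=1, keepdims=True) # Shape (2, 1) - kept as size 1
# keepdims makes broadcasting easy, e.g. normalizing each row to sum to 1
normalized = tensor / row_sums_kept
print("Row-normalized tensor:\n", normalized)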
TensorFlow tensors support Python-style indexing and slicing, much like NumPy arrays. This allows you to access specific parts of a tensor (note that tf.Tensor objects are immutable, so slicing returns new tensors rather than views you can assign to).
import tensorflow as tf
tensor = tf.constant([[1, 2, 3, 4],
                      [5, 6, 7, 8],
                      [9, 10, 11, 12]]) # Shape (3, 4)
# Get a single element (row 0, column 1)
print("Element at (0, 1):", tensor[0, 1]) # Output: tf.Tensor(2, shape=(), dtype=int32)
# Get the first row
print("First row:", tensor[0]) # Output: tf.Tensor([1 2 3 4], shape=(4,), dtype=int32)
# Get the first two rows
print("First two rows:\n", tensor[0:2]) # Standard Python slicing
# Get the second column (all rows, column index 1)
print("Second column:\n", tensor[:, 1]) # Output: tf.Tensor([ 2 6 10], shape=(3,), dtype=int32)
# Get a sub-matrix (rows 1 to 2, columns 1 to 3)
print("Sub-matrix (rows 1:3, cols 1:3):\n", tensor[1:3, 1:3])
# Get the last element using negative indexing
print("Last element:", tensor[-1, -1]) # Output: tf.Tensor(12, shape=(), dtype=int32)
# Use tf.gather for more complex indexing (gathering specific indices)
indices = [0, 2] # Gather rows 0 and 2
gathered_rows = tf.gather(tensor, indices=indices, axis=0)
print("Gathered rows (0 and 2):\n", gathered_rows)
# Gather specific elements using multi-axis indices
indices = [[0, 0], [1, 1], [2, 3]] # Coordinates: (0,0), (1,1), (2,3)
gathered_elements = tf.gather_nd(tensor, indices=indices)
print("Gathered elements at (0,0), (1,1), (2,3):", gathered_elements)
Often, you need to change the shape of a tensor without changing its data.
tf.reshape(tensor, new_shape): Reshapes a tensor, provided the total number of elements remains the same.
tf.transpose(tensor, perm=...): Permutes the dimensions of a tensor. Essential for operations like aligning matrices before multiplication.
tf.expand_dims(tensor, axis): Adds a new dimension of size 1 at the specified axis.
tf.squeeze(tensor, axis=None): Removes dimensions of size 1. If axis is specified, only that axis is squeezed (if its size is 1).
import tensorflow as tf
tensor = tf.range(6) # [0, 1, 2, 3, 4, 5], Shape (6,)
print("Original tensor:", tensor)
# Reshape to (2, 3)
reshaped = tf.reshape(tensor, (2, 3))
print("Reshaped to (2, 3):\n", reshaped)
# Reshape to (3, 2)
reshaped_alt = tf.reshape(reshaped, (3, 2))
print("Reshaped to (3, 2):\n", reshaped_alt)
# Transpose a matrix (swap rows and columns for a 2D tensor)
matrix = tf.constant([[1, 2, 3], [4, 5, 6]]) # Shape (2, 3)
transposed = tf.transpose(matrix) # Default perm=[1, 0] for 2D
print("Transposed matrix (shape {}):\n{}".format(transposed.shape, transposed)) # Shape (3, 2)
# Expand dimensions
# Original shape (6,)
expanded = tf.expand_dims(tensor, axis=0) # Add dimension at the beginning
print("Expanded dims (axis=0): {} Shape: {}".format(expanded, expanded.shape)) # Shape (1, 6)
expanded_end = tf.expand_dims(tensor, axis=-1) # Add dimension at the end
print("Expanded dims (axis=-1): {} Shape: {}".format(expanded_end, expanded_end.shape)) # Shape (6, 1)
# Squeeze dimensions
squeezed = tf.squeeze(expanded) # Remove the dimension of size 1
print("Squeezed tensor: {} Shape: {}".format(squeezed, squeezed.shape)) # Shape (6,)
If you are familiar with NumPy, you'll find TensorFlow's tensor operations very familiar. Many functions have the same names and similar behavior (tf.add, tf.multiply, tf.matmul, tf.reduce_sum, tf.reshape, indexing/slicing).
However, there are significant distinctions. TensorFlow tensors work with tf.GradientTape for automatic computation of gradients, which is fundamental for training machine learning models; NumPy arrays do not inherently support this. TensorFlow computations can also be compiled into graphs with tf.function for optimization and deployment.
While you can easily convert between TensorFlow tensors and NumPy arrays (tensor.numpy() and tf.convert_to_tensor(numpy_array)), performing computations directly within TensorFlow is generally preferred when building and training models to benefit from acceleration and differentiation.
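For completeness, a minimal sketch of that round trip:
import numpy as np
import tensorflow as tf
tensor = tf.constant([[1.0, 2.0], [3.0, 4.0]])
# TensorFlow tensor -> NumPy array
array = tensor.numpy()
print("Converted to:", type(array)) # <class 'numpy.ndarray'>
# NumPy array -> TensorFlow tensor (dtype is preserved, so float64 stays float64)
back = tf.convert_to_tensor(np.array([[5.0, 6.0]]))
print("Converted back, dtype:", back.dtype)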
Mastering these core operations provides the vocabulary needed to express complex computations, forming the basis for defining layers and models in Keras, as we will see in the next chapter.