At the heart of TensorFlow lies the Tensor, a multi-dimensional array that serves as the primary data structure. If you're familiar with NumPy arrays, you'll find tensors quite intuitive. However, TensorFlow tensors offer additional capabilities specifically designed for large-scale machine learning and deep learning, such as seamless GPU acceleration and integration with automatic differentiation.
Think of a tensor as a generalization of vectors and matrices to potentially higher dimensions.
A scalar (a single number, such as 5) is a rank-0 tensor.
A vector (such as [1, 2, 3]) is a rank-1 tensor.
A matrix (such as [[1, 2], [3, 4]]) is a rank-2 tensor.
Every tensor has two fundamental properties:
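Shape: Describes the dimensionality of the tensor and the size along each dimension. It's typically represented as a tuple or list of integers. For example:
A scalar has shape ().
A vector with 5 elements has shape (5,).
A matrix with 3 rows and 4 columns has shape (3, 4).
A batch of 32 grayscale images of 28x28 pixels might have shape (32, 28, 28), or perhaps (32, 28, 28, 1) if explicitly including a channel dimension.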
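Shapes matter in practice because operations check them before running. The following is a minimal sketch, assuming eager execution (the TensorFlow 2 default) and using illustrative tensors a and b that are not part of the original text. It shows a matrix multiplication that succeeds when the inner dimensions match and fails when they do not.

```python
import tensorflow as tf

# Illustrative tensors; the shapes are chosen only for this example.
a = tf.ones(shape=(3, 4))
b = tf.ones(shape=(4, 2))

# Inner dimensions match (4 and 4), so the result has shape (3, 2).
product = tf.matmul(a, b)
print("Product shape:", product.shape)

# (3, 4) times (3, 4) is rejected because the inner dimensions differ.
try:
    tf.matmul(a, a)
    shape_error = None
except (tf.errors.InvalidArgumentError, ValueError) as err:
    shape_error = type(err).__name__
print("Incompatible shapes raised:", shape_error)
```

Checking .shape on the inputs before calling an operation is usually the quickest way to diagnose this kind of error.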
The shape is crucial for ensuring compatibility between operations in a model. You can access a tensor's shape using its .shape attribute.
Data Type (dtype): Specifies the type of data held within the tensor, such as floating-point numbers, integers, booleans, or strings. Common data types include:
tf.float32: Standard 32-bit floating-point. The default for most operations and widely used in neural networks for balancing precision and computational cost.
tf.int32: Standard 32-bit integer.
tf.bool: Boolean values (True or False).
tf.string: Variable-length byte strings.
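Data types matter because TensorFlow, in its default configuration, does not silently convert between them the way NumPy often does. As a rough sketch (the variable names ints and floats are illustrative, and the exact exception type may vary between versions), mixing int32 and float32 in one operation is rejected, and tf.cast() performs the explicit conversion:

```python
import tensorflow as tf

ints = tf.constant([1, 2, 3])          # inferred as int32
floats = tf.constant([1.0, 2.0, 3.0])  # inferred as float32

# Adding int32 and float32 directly is rejected under default settings.
try:
    ints + floats
    dtype_error = None
except (tf.errors.InvalidArgumentError, TypeError) as err:
    dtype_error = type(err).__name__
print("Mixed dtypes raised:", dtype_error)

# Cast explicitly so both operands are float32, then add.
total = tf.cast(ints, tf.float32) + floats
print("Sum after cast:", total)
```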
TensorFlow operations generally require tensors to have compatible data types. You can access a tensor's data type using its .dtype attribute.
You can create tensors in several ways. The most direct method is using tf.constant(), which creates an immutable tensor from Python objects (like lists or tuples) or NumPy arrays.
import tensorflow as tf
import numpy as np
# Create a scalar (rank-0 tensor)
scalar = tf.constant(10)
print("Scalar:", scalar)
print("Shape:", scalar.shape)
print("Dtype:", scalar.dtype)
# Create a vector (rank-1 tensor)
vector = tf.constant([1.0, 2.0, 3.0])
print("\nVector:", vector)
print("Shape:", vector.shape)
print("Dtype:", vector.dtype) # Infers float32 from the input
# Create a matrix (rank-2 tensor) specifying the dtype
matrix = tf.constant([[1, 2], [3, 4]], dtype=tf.int16)
print("\nMatrix:", matrix)
print("Shape:", matrix.shape)
print("Dtype:", matrix.dtype)
# Create a rank-3 tensor
tensor3d = tf.constant([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
print("\nRank-3 Tensor:", tensor3d)
print("Shape:", tensor3d.shape)
print("Dtype:", tensor3d.dtype) # Infers int32
TensorFlow also provides functions to create tensors with specific values, similar to NumPy:
# Tensor of zeros
zeros_tensor = tf.zeros(shape=(2, 3))
print("\nZeros Tensor:\n", zeros_tensor)
# Tensor of ones
ones_tensor = tf.ones(shape=(3, 2), dtype=tf.float32)
print("\nOnes Tensor:\n", ones_tensor)
# Tensor filled with a specific value
fill_tensor = tf.fill(dims=(2, 2), value=99)
print("\nFill Tensor:\n", fill_tensor)
# Tensor with values drawn from a normal distribution
random_normal = tf.random.normal(shape=(2, 2), mean=0.0, stddev=1.0)
print("\nRandom Normal Tensor:\n", random_normal)
# Tensor with values drawn from a uniform distribution
random_uniform = tf.random.uniform(shape=(2, 2), minval=0, maxval=10, dtype=tf.int32)
print("\nRandom Uniform Tensor:\n", random_uniform)
You can also easily convert NumPy arrays into TensorFlow tensors and vice versa. TensorFlow often works seamlessly with NumPy arrays passed directly to its operations.
# Convert NumPy array to Tensor
numpy_array = np.array([[1.0, 2.0], [3.0, 4.0]])
tensor_from_numpy = tf.convert_to_tensor(numpy_array)
print("\nTensor from NumPy:\n", tensor_from_numpy)
print("Dtype:", tensor_from_numpy.dtype) # Preserves NumPy dtype (float64 here)
# Convert Tensor back to NumPy array
numpy_from_tensor = tensor_from_numpy.numpy()
print("\nNumPy from Tensor:\n", numpy_from_tensor)
print("Type:", type(numpy_from_tensor))
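To illustrate the interoperability in both directions, here is a small sketch, assuming eager execution; the variable names are illustrative. A NumPy array is passed straight to a TensorFlow operation, and a tensor is passed straight to a NumPy function.

```python
import numpy as np
import tensorflow as tf

numpy_array = np.array([[1.0, 2.0], [3.0, 4.0]])

# A TensorFlow op accepts the NumPy array and returns a tf.Tensor.
# The float64 dtype of the array carries over to the result.
doubled = tf.multiply(numpy_array, 2.0)
print("Type:", type(doubled).__name__, "Dtype:", doubled.dtype)

# A NumPy function accepts the tensor and returns a NumPy array.
tensor = tf.constant([1.0, 2.0, 3.0])
incremented = np.add(tensor, 1.0)
print("Type:", type(incremented).__name__, "Values:", incremented)
```

Note that the result type follows the library whose function you call, so it is worth checking which kind of object you are holding after mixing the two.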
While tensors share similarities with NumPy arrays, some distinctions are important:
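Immutability: tf.Tensor objects are immutable. You cannot update the contents of a tensor once created; operations typically create new tensors. This contrasts with NumPy arrays, which are mutable. For mutable state in TensorFlow, such as model weights that need updating during training, you use tf.Variable (covered in the next section).
Hardware acceleration: Tensor operations can run on accelerators such as GPUs, whereas NumPy operations run on the CPU.
Automatic differentiation: TensorFlow can record the operations applied to tensors and compute gradients from them. This capability, exposed through tf.GradientTape (discussed later in this chapter), is fundamental for training machine learning models via gradient descent.
Let's clarify these related terms: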
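Rank: The number of dimensions of a tensor. You can get it with tf.rank(tensor).
Shape: The size of the tensor along each dimension. You can get it with tensor.shape.
Size: The total number of elements in the tensor. You can get it with tf.size(tensor). This is the product of the elements in the shape tuple.
rank_2_tensor = tf.constant([[1, 2, 3], [4, 5, 6]])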
print("\nTensor:\n", rank_2_tensor)
print("Shape:", rank_2_tensor.shape) # Output: (2, 3)
print("Rank:", tf.rank(rank_2_tensor)) # Output: tf.Tensor(2, shape=(), dtype=int32)
print("Size:", tf.size(rank_2_tensor)) # Output: tf.Tensor(6, shape=(), dtype=int32)
print("Number of elements:", tf.size(rank_2_tensor).numpy()) # Get the Python value
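The same relationships hold for higher-rank tensors. As a quick check, the sketch below uses an illustrative image-batch tensor of shape (32, 28, 28, 1) and confirms that the rank equals the length of the shape and the size equals the product of its entries.

```python
import numpy as np
import tensorflow as tf

# Illustrative tensor shaped like a batch of 32 single-channel 28x28 images.
images = tf.zeros(shape=(32, 28, 28, 1))

rank = int(tf.rank(images).numpy())
size = int(tf.size(images).numpy())
product_of_shape = int(np.prod(images.shape.as_list()))

print("Rank:", rank)                         # length of the shape
print("Size:", size)                         # total number of elements
print("Product of shape:", product_of_shape)
```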
Accessing elements within tensors works much like indexing NumPy arrays or Python lists, using zero-based indexing and slicing.
# Using the rank_2_tensor defined above: shape=(2, 3)
print("\nTensor:\n", rank_2_tensor)
# Get element at row 0, column 1
print("Element at [0, 1]:", rank_2_tensor[0, 1]) # Output: tf.Tensor(2, ...)
# Get the first row
print("First row:", rank_2_tensor[0, :]) # Output: tf.Tensor([1 2 3], ...)
# Get the second column
print("Second column:", rank_2_tensor[:, 1]) # Output: tf.Tensor([2 5], ...)
# Get part of the first row (columns 1 and 2); the result is a rank-1 tensor
print("Row slice:", rank_2_tensor[0, 1:3]) # Output: tf.Tensor([2 3], ...)
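Indexing is read-only because tensors are immutable. The sketch below, assuming eager execution, shows that item assignment fails and that one possible alternative, tf.tensor_scatter_nd_update(), returns a new tensor with the change applied while leaving the original untouched. This is only an illustration; for state that changes often, tf.Variable is the intended tool.

```python
import tensorflow as tf

rank_2_tensor = tf.constant([[1, 2, 3], [4, 5, 6]])

# Item assignment is not supported on tf.Tensor objects.
try:
    rank_2_tensor[0, 1] = 10
    assign_error = None
except TypeError as err:
    assign_error = type(err).__name__
print("Assignment raised:", assign_error)

# Build a new tensor with the element at row 0, column 1 replaced.
updated = tf.tensor_scatter_nd_update(
    rank_2_tensor, indices=[[0, 1]], updates=[10]
)
print("Updated copy:\n", updated)
print("Original:\n", rank_2_tensor)
```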
Understanding tensors, their shapes, data types, and how to create and manipulate them is foundational for working with TensorFlow. They are the data containers that flow through the computational graphs you will build and execute. In the next sections, we will explore operations you can perform on these tensors and introduce tf.Variable for handling mutable model parameters.
© 2025 ApX Machine Learning