While `tf.Tensor` objects are fundamental for representing data in TensorFlow, they possess a characteristic that makes them unsuitable for certain tasks: they are immutable. Once created, you cannot change the value of a `tf.Tensor`. Think of them like Python numbers or strings: performing an operation like `a = a + 1` doesn't modify the original `a`; it creates a new object representing the result and rebinds the name `a` to point to it.
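
The Python analogy can be checked directly. The snippet below is a minimal sketch in plain Python (no TensorFlow involved); the name `original` is introduced only for illustration and holds a second reference to the first object.

```python
# Plain Python: integers are immutable, like tf.Tensor values.
a = 1000
original = a      # second reference to the same int object

a = a + 1         # builds a new int and rebinds the name `a`

print(a)              # the new value
print(original)       # the first object was never modified
print(a is original)  # the two names now refer to different objects
```
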
In machine learning, however, we constantly need to update model parameters, such as weights and biases, during the training process. We need a way to hold state that can be modified. This is precisely the role of `tf.Variable`.
A `tf.Variable` represents a mutable tensor whose value can be changed by running operations on it. Variables are the standard way to represent the shared, persistent state your model manipulates, primarily the trainable parameters. Under the hood, a `tf.Variable` stores a persistent tensor and makes its value available for operations within your TensorFlow graph or eager execution context.
You create a `tf.Variable` by providing an initial value, which can be a Python scalar, list, NumPy array, or an existing `tf.Tensor`. TensorFlow infers the data type (`dtype`) from the initial value, but you can also specify it explicitly.

```python
import tensorflow as tf
import numpy as np

# Create a scalar Variable
scalar_var = tf.Variable(5.0, dtype=tf.float32)
print(f"Scalar Variable: {scalar_var}")

# Create a vector Variable from a list
vector_var = tf.Variable([1.0, 2.0, 3.0], name="my_vector")  # Optional name
print(f"Vector Variable: {vector_var}")

# Create a matrix Variable from a NumPy array
matrix_var = tf.Variable(np.array([[1, 2], [3, 4]], dtype=np.int32))
print(f"Matrix Variable:\n{matrix_var}")

# Create a Variable from another Tensor
initial_tensor = tf.zeros((2, 2))
tensor_var = tf.Variable(initial_tensor)
print(f"Variable from Tensor:\n{tensor_var}")
```

You'll notice that printing a `tf.Variable` shows its name (a default one is generated if you don't supply it), shape, data type, and current value, much like printing a `tf.Tensor`. The `trainable` constructor argument (which defaults to `True`) is particularly important: it tells TensorFlow whether the variable should be watched automatically during automatic differentiation, which we'll cover shortly. Non-trainable variables are useful for state that should not receive gradient updates, such as step counts and other tracked statistics.
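
As a rough illustration of dtype inference and the `trainable` flag, the sketch below creates a few Variables and inspects their attributes. The variable names, including `global_step`, are hypothetical and chosen only for this example.

```python
import numpy as np
import tensorflow as tf

# dtype is inferred from the initial value unless given explicitly.
from_int = tf.Variable(5)                       # Python int   -> int32
from_float = tf.Variable(5.0)                   # Python float -> float32
from_numpy = tf.Variable(np.array([1.0, 2.0]))  # keeps NumPy's float64
explicit = tf.Variable(5, dtype=tf.float64)     # explicit override

# A non-trainable counter; "global_step" is just an illustrative name.
step = tf.Variable(0, trainable=False, name="global_step")

for v in (from_int, from_float, from_numpy, explicit, step):
    print(v.name, v.dtype.name, v.shape, v.trainable)
```

Note that a NumPy array of Python floats defaults to `float64`, which differs from the `float32` TensorFlow infers for a bare Python float; passing `dtype` explicitly avoids surprises when mixing the two.
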
Variables can be used in TensorFlow operations much like `tf.Tensor` objects. Performing an operation with a Variable implicitly reads its current value.

```python
# Using Variables in operations
result_tensor = scalar_var * 2.0 + vector_var[0]
print(f"\nOperation result (Tensor): {result_tensor}")

# Variables can be passed to ops such as reshape (the result is a new Tensor)
reshaped_tensor = tf.reshape(matrix_var, (4,))
print(f"Reshaped Variable (Tensor): {reshaped_tensor}")

# Reading the Variable's current value as a NumPy array
print(f"\nVariable value as NumPy: {vector_var.numpy()}")
```

It's important to note that operations involving Variables typically return new `tf.Tensor` objects, not Variables. The original Variable remains, holding its current state.
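
A quick way to see this is to check the type of an operation's result. The following is a minimal sketch; `v` and `doubled` are names introduced only for this illustration.

```python
import tensorflow as tf

v = tf.Variable([1.0, 2.0, 3.0])
doubled = v * 2.0  # reads v's current value and returns a new tensor

print(isinstance(v, tf.Variable))        # v is a Variable
print(isinstance(doubled, tf.Variable))  # the result is not a Variable
print(isinstance(doubled, tf.Tensor))    # the result is a Tensor
print(v.numpy())                         # v still holds its original value
```
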
The defining characteristic of `tf.Variable` is its mutability. You can change the value held by a Variable using several methods, most commonly `.assign()`, `.assign_add()`, and `.assign_sub()`.

- `variable.assign(new_value)`: Replaces the variable's value entirely.
- `variable.assign_add(increment)`: Adds a value to the variable in place (the Variable counterpart of `+=`).
- `variable.assign_sub(decrement)`: Subtracts a value from the variable in place (the Variable counterpart of `-=`).

These assignment operations are themselves TensorFlow operations and become part of the computation graph when using `tf.function`.

```python
print(f"Original scalar_var: {scalar_var.numpy()}")

# Assign a new value
scalar_var.assign(10.0)
print(f"After assign(10.0): {scalar_var.numpy()}")

# Increment the value
scalar_var.assign_add(2.5)
print(f"After assign_add(2.5): {scalar_var.numpy()}")

# Decrement the value
scalar_var.assign_sub(1.0)
print(f"After assign_sub(1.0): {scalar_var.numpy()}")

# Assign can work with Tensors of compatible shape
new_vector_val = tf.constant([4.0, 5.0, 6.0])
vector_var.assign(new_vector_val)
print(f"\nVector variable after assign: {vector_var.numpy()}")
```

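
The same assignment methods also work inside a `tf.function`, where they are traced into the graph as stateful operations. The sketch below assumes a hypothetical `counter` Variable and `increment` function created only for this illustration; the Variable is created outside the function so that it persists across calls.

```python
import tensorflow as tf

# Hypothetical counter used only for this illustration.
counter = tf.Variable(0, dtype=tf.int32, trainable=False)

@tf.function
def increment(by):
    # assign_add is recorded in the traced graph as a stateful operation.
    counter.assign_add(by)
    return counter.read_value()

for _ in range(3):
    latest = increment(tf.constant(2))

print(latest.numpy())   # value returned by the last call
print(counter.numpy())  # the Variable itself, read eagerly
```
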
Attempting to modify a `tf.Tensor` in a similar way results in an error: tensors have no `.assign()` method and do not support item assignment. This highlights the fundamental difference in their intended use.
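
The sketch below shows what happens when you try, along with one common workaround: building a new tensor with `tf.tensor_scatter_nd_update`. The names `t` and `updated` are introduced only for this example.

```python
import tensorflow as tf

t = tf.constant([1.0, 2.0, 3.0])

# Tensors do not expose the assignment methods that Variables have.
print(hasattr(t, "assign"))

# Item assignment is rejected as well.
try:
    t[0] = 10.0
except TypeError as err:
    print(f"TypeError: {err}")

# Workaround: build a new tensor rather than modifying the old one.
updated = tf.tensor_scatter_nd_update(t, indices=[[0]], updates=[10.0])
print(updated.numpy())  # new tensor with the first element replaced
print(t.numpy())        # the original tensor is untouched
```
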
The ability to modify Variables is intrinsically linked to model training. During training, algorithms like gradient descent need to compute how a small change in each model parameter (Variable) affects the model's error (loss function). TensorFlow's automatic differentiation engine, `tf.GradientTape`, is designed to track computations involving trainable `tf.Variable` objects automatically.
When you perform operations within a `tf.GradientTape` context, TensorFlow records the operations involving trainable Variables. You can then use the tape to compute gradients of some target (like a loss) with respect to those Variables. This gradient information is then used by optimizers to update the Variable values, moving the model towards better performance.

```python
# Example demonstrating gradient tracking
var_a = tf.Variable(2.0)
var_b = tf.Variable(3.0)

with tf.GradientTape() as tape:
    # Operations involving Variables are tracked
    y = var_a * var_a * var_b  # y = a^2 * b

# Calculate gradients of y with respect to var_a and var_b
# dy/da = 2*a*b = 2*2*3 = 12
# dy/db = a^2 = 2^2 = 4
gradients = tape.gradient(y, [var_a, var_b])
print(f"\ny = {y.numpy()}")
print(f"Gradient dy/da: {gradients[0].numpy()}")
print(f"Gradient dy/db: {gradients[1].numpy()}")

# If a Variable is marked non-trainable, the tape does not watch it
non_trainable_var = tf.Variable(5.0, trainable=False)

with tf.GradientTape() as tape:
    z = non_trainable_var * 2.0

gradients_z = tape.gradient(z, [non_trainable_var])
print(f"\nGradient for non-trainable variable: {gradients_z[0]}")  # Output will be None
```

This automatic tracking and gradient calculation is the engine behind training most deep learning models in TensorFlow.
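
To connect gradients back to mutability, here is a minimal sketch of a hand-written gradient descent loop that updates a Variable with `.assign_sub()`. The toy problem, the values of `x` and `y_true`, and the learning rate are assumptions chosen for illustration; in practice an optimizer usually performs this update for you.

```python
import tensorflow as tf

# Toy problem (illustrative values): find w such that w * x == y_true.
x = tf.constant(2.0)
y_true = tf.constant(6.0)
w = tf.Variable(0.0)
learning_rate = 0.1

for step in range(20):
    with tf.GradientTape() as tape:
        loss = (w * x - y_true) ** 2
    grad = tape.gradient(loss, w)
    # In-place update of the Variable: w <- w - learning_rate * grad
    w.assign_sub(learning_rate * grad)

print(f"w after training: {w.numpy():.4f}")  # approaches 3.0
```

Each iteration reads the current value of `w`, computes the gradient of the loss, and writes the updated value back into the same Variable, which is exactly the kind of persistent, mutable state a `tf.Tensor` cannot provide.
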
`tf.Tensor` vs `tf.Variable`

Feature | `tf.Tensor` | `tf.Variable` |
---|---|---|
Mutability | Immutable | Mutable |
Purpose | Representing fixed data, intermediate computations | Representing trainable parameters, shared state |
Modification | Cannot be changed in-place | Modified using `.assign()`, `.assign_add()`, etc. |
Gradient | Not tracked by `GradientTape` by default | Tracked by `GradientTape` if `trainable=True` |
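
The "by default" in the gradient row matters: a plain tensor can still be differentiated against if you ask the tape to watch it explicitly with `tape.watch()`. The following is a small sketch of that difference; the names `unwatched_grad` and `watched_grad` are introduced only for this example.

```python
import tensorflow as tf

x = tf.constant(3.0)

# Without watch(): the tape ignores the plain tensor.
with tf.GradientTape() as tape:
    y = x * x
unwatched_grad = tape.gradient(y, x)
print(unwatched_grad)  # None

# With watch(): the tape records operations on x.
with tf.GradientTape() as tape:
    tape.watch(x)
    y = x * x
watched_grad = tape.gradient(y, x)
print(watched_grad.numpy())  # dy/dx = 2 * x = 6.0
```
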

In essence, use `tf.Tensor` for your input data and intermediate results of computations. Use `tf.Variable` for any parameter or state within your model that needs to persist and be updated across multiple computation steps, particularly model weights and biases that will be adjusted during training. Understanding this distinction is foundational for building and training models effectively in TensorFlow. We will now explore how TensorFlow uses `tf.GradientTape` to compute the gradients essential for optimizing these Variables.
© 2025 ApX Machine Learning