Once you have torch.Tensor objects, similar to tf.Tensor in TensorFlow, the next step is to perform operations on them. Many mathematical operations you're familiar with from TensorFlow and NumPy have direct counterparts in PyTorch. This section offers a comparative look at how these fundamental tensor operations are executed, helping you map your TensorFlow knowledge to PyTorch's syntax.
Basic arithmetic operations like addition, subtraction, multiplication, and division are performed element-wise, just as in TensorFlow. PyTorch supports both overloaded operators and explicit functions.
import torch
import tensorflow as tf # For illustrative comparison
# PyTorch
a_pt = torch.tensor([[1., 2.], [3., 4.]])
b_pt = torch.tensor([[5., 6.], [7., 8.]])
# Element-wise addition
sum_pt = a_pt + b_pt
# sum_pt = torch.add(a_pt, b_pt)
print("PyTorch Sum:\n", sum_pt)
# Element-wise multiplication
prod_pt = a_pt * b_pt
# prod_pt = torch.mul(a_pt, b_pt)
print("PyTorch Product:\n", prod_pt)
# TensorFlow equivalent (for your reference)
# a_tf = tf.constant([[1., 2.], [3., 4.]])
# b_tf = tf.constant([[5., 6.], [7., 8.]])
# sum_tf = tf.add(a_tf, b_tf) # or a_tf + b_tf
# prod_tf = tf.multiply(a_tf, b_tf) # or a_tf * b_tf
PyTorch provides a comprehensive suite of mathematical functions in the torch module, such as torch.sin(), torch.cos(), torch.exp(), and torch.log(), which operate element-wise on tensors.
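Here is a quick illustrative sketch of a few of these element-wise functions; the input values are arbitrary examples.
# PyTorch
angles_pt = torch.tensor([0.0, 1.5708, 3.1416])  # roughly 0, pi/2, pi
print("sin:", torch.sin(angles_pt))               # approximately tensor([0., 1., 0.])
print("exp:", torch.exp(torch.tensor([0., 1.])))  # tensor([1.0000, 2.7183])
print("log:", torch.log(torch.tensor([1., 2.7183])))  # approximately tensor([0., 1.])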
Matrix multiplication is the foundation of neural networks. In PyTorch, you can use torch.matmul() or the @ operator.
# PyTorch
mat1_pt = torch.randn(2, 3)
mat2_pt = torch.randn(3, 4)
# Matrix multiplication
result_pt = torch.matmul(mat1_pt, mat2_pt)
# result_pt = mat1_pt @ mat2_pt
print("PyTorch Matrix Multiplication (2x3 @ 3x4):\n", result_pt)
print("Result shape:", result_pt.shape) # torch.Size([2, 4])
# TensorFlow equivalent
# mat1_tf = tf.random.normal((2, 3))
# mat2_tf = tf.random.normal((3, 4))
# result_tf = tf.matmul(mat1_tf, mat2_tf) # or mat1_tf @ mat2_tf
PyTorch tensors support NumPy-style indexing and slicing, which should feel very familiar if you've worked with NumPy or TensorFlow tensors.
# PyTorch
data_pt = torch.arange(0, 10) # tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
print("Element at index 3:", data_pt[3])
print("Slice from index 2 to 5 (exclusive):", data_pt[2:5])
print("All elements from index 5 onwards:", data_pt[5:])
print("Last element:", data_pt[-1])
# Multi-dimensional indexing
matrix_pt = torch.randn(3, 4)
print("First row:", matrix_pt[0]) # Gets the first row
print("First column:", matrix_pt[:, 0]) # Gets the first column
print("Element at (1,1):", matrix_pt[1, 1])
# Modifying elements
data_pt[0] = 100
print("Modified data_pt:", data_pt)
For joining tensors, PyTorch offers torch.cat() to concatenate along an existing dimension and torch.stack() to stack along a new dimension. These are similar to tf.concat() and tf.stack() in TensorFlow.
# PyTorch
t1_pt = torch.randn(2, 3)
t2_pt = torch.randn(2, 3)
# Concatenate along dimension 0 (rows)
cat_dim0_pt = torch.cat((t1_pt, t2_pt), dim=0) # Shape: [4, 3]
print("Concatenated along dim 0 shape:", cat_dim0_pt.shape)
# Concatenate along dimension 1 (columns)
cat_dim1_pt = torch.cat((t1_pt, t2_pt), dim=1) # Shape: [2, 6]
print("Concatenated along dim 1 shape:", cat_dim1_pt.shape)
# Stack along a new dimension (dim 0 by default)
stacked_pt = torch.stack((t1_pt, t2_pt), dim=0) # Shape: [2, 2, 3]
print("Stacked along new dim 0 shape:", stacked_pt.shape)
# TensorFlow equivalent
# t1_tf = tf.random.normal((2,3))
# t2_tf = tf.random.normal((2,3))
# tf.concat([t1_tf, t2_tf], axis=0)
# tf.stack([t1_tf, t2_tf], axis=0)
Altering the shape of a tensor without changing its data is a common requirement. PyTorch provides several functions for this:
reshape(): Returns a tensor with the specified shape. If the new shape is compatible with the original number of elements and the tensor is contiguous in memory, it often returns a view (shares underlying data). Otherwise, it may return a copy.
view(): Similar to reshape(), but strictly returns a view. The tensor must be contiguous, and the new shape must be compatible. This is more memory-efficient as it avoids data copying, but changes to the view will affect the original tensor.
squeeze(): Removes dimensions of size 1.
unsqueeze(): Adds a dimension of size 1.
# PyTorch
original_pt = torch.arange(12) # tensor([ 0, 1, ..., 11])
# Reshape to 3x4
reshaped_pt = original_pt.reshape(3, 4)
print("Reshaped (3x4):\n", reshaped_pt)
# Using view
view_pt = original_pt.view(3, 4)
# view_pt[0,0] = 99 # This would change original_pt[0] as well
# Squeeze and unsqueeze
x_pt = torch.randn(1, 3, 1, 4) # Shape: [1, 3, 1, 4]
squeezed_pt = x_pt.squeeze() # Shape: [3, 4] (removes dims of size 1)
print("Squeezed shape:", squeezed_pt.shape)
unsqueezed_pt = squeezed_pt.unsqueeze(dim=0) # Shape: [1, 3, 4] (adds dim at pos 0)
print("Unsqueezed shape:", unsqueezed_pt.shape)
# TensorFlow equivalent
# original_tf = tf.range(12)
# reshaped_tf = tf.reshape(original_tf, (3, 4))
# x_tf = tf.random.normal((1, 3, 1, 4))
# squeezed_tf = tf.squeeze(x_tf)
# unsqueezed_tf = tf.expand_dims(squeezed_tf, axis=0)
In TensorFlow, tf.reshape is the primary way to change shape, while tf.squeeze and tf.expand_dims correspond to PyTorch's squeeze and unsqueeze.
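The contiguity requirement of view() is worth seeing once. Below is a minimal sketch; the variable names are illustrative and not part of the examples above.
# PyTorch
t_pt = torch.arange(6).reshape(2, 3)
t_transposed_pt = t_pt.t()                      # transpose: shape [3, 2], non-contiguous view
# t_transposed_pt.view(6)                       # would raise a RuntimeError (non-contiguous)
flat_pt = t_transposed_pt.contiguous().view(6)  # make memory contiguous first, then view
flat2_pt = t_transposed_pt.reshape(6)           # reshape copies when necessary, so this works
print("Flattened transposed tensor:", flat_pt)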
Reduction operations aggregate tensor values, such as sum(), mean(), max(), min(), and std(). You can perform these over the entire tensor or along a specific dimension.
# PyTorch
tensor_pt = torch.tensor([[1., 2., 3.], [4., 5., 6.]])
# Sum of all elements
sum_all_pt = tensor_pt.sum()
print("Sum of all elements:", sum_all_pt) # tensor(21.)
# Sum along dimension 0 (collapsing rows, sum of each column)
sum_cols_pt = tensor_pt.sum(dim=0)
print("Sum along dim 0 (columns):", sum_cols_pt) # tensor([5., 7., 9.])
# Mean along dimension 1 (collapsing columns, mean of each row)
mean_rows_pt = tensor_pt.mean(dim=1)
print("Mean along dim 1 (rows):", mean_rows_pt) # tensor([2., 5.])
# Max element and its index
max_val_pt, max_idx_pt = torch.max(tensor_pt, dim=1)
print("Max values per row:", max_val_pt)
print("Max indices per row:", max_idx_pt)
# TensorFlow equivalent
# tensor_tf = tf.constant([[1., 2., 3.], [4., 5., 6.]])
# tf.reduce_sum(tensor_tf)
# tf.reduce_sum(tensor_tf, axis=0)
# tf.reduce_mean(tensor_tf, axis=1)
# tf.argmax(tensor_tf, axis=1) for indices, tf.reduce_max(tensor_tf, axis=1) for values
TensorFlow uses tf.reduce_sum, tf.reduce_mean, etc., for these operations. tf.argmax and tf.argmin find the indices of max/min values.
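PyTorch has direct counterparts, torch.argmax() and torch.argmin(), when you only need the indices. A short sketch reusing tensor_pt from above:
# PyTorch
print("Argmax per row:", torch.argmax(tensor_pt, dim=1))  # tensor([2, 2])
print("Argmin per row:", torch.argmin(tensor_pt, dim=1))  # tensor([0, 0])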
Element-wise comparisons (>, <, ==, !=, etc.) result in boolean tensors. PyTorch provides functions like torch.eq(), torch.gt(), etc.
# PyTorch
a_pt = torch.tensor([1, 2, 3, 4])
b_pt = torch.tensor([4, 3, 2, 1])
# Element-wise greater than
gt_pt = a_pt > b_pt
print("a_pt > b_pt:", gt_pt) # tensor([False, False, True, True])
# Element-wise equality
eq_pt = torch.eq(a_pt, torch.tensor([1, 3, 3, 5]))
print("torch.eq(a_pt, [1,3,3,5]):", eq_pt) # tensor([ True, False, True, False])
# TensorFlow equivalent
# a_tf = tf.constant([1, 2, 3, 4])
# b_tf = tf.constant([4, 3, 2, 1])
# tf.greater(a_tf, b_tf)
# tf.equal(a_tf, tf.constant([1, 3, 3, 5]))
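One common use of these boolean tensors, in both frameworks, is masking. A small sketch reusing a_pt from above:
# PyTorch
mask_pt = a_pt > 2
print("Masked selection:", a_pt[mask_pt])  # tensor([3, 4])
# TensorFlow equivalent
# tf.boolean_mask(a_tf, a_tf > 2)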
PyTorch supports in-place operations, which modify the tensor directly without creating a new one. These are denoted by a trailing underscore (e.g., add_(), mul_()). While they can save memory, use them with caution, especially with operations that require gradients, since modifying a tensor needed for the backward pass can cause errors.
# PyTorch
x_pt = torch.ones(3)
y_pt = torch.tensor([1., 2., 3.])
print("Original y_pt:", y_pt)
y_pt.add_(x_pt) # In-place addition: y_pt = y_pt + x_pt
print("y_pt after add_():", y_pt) # y_pt is modified
# This is different from:
# z_pt = y_pt.add(x_pt) # Out-of-place: z_pt is a new tensor, y_pt remains unchanged
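To make the caution about gradients concrete, here is a minimal sketch of the kind of failure autograd reports when an in-place op modifies a tensor it saved for the backward pass; the exact error message varies across PyTorch versions.
# PyTorch
a_grad_pt = torch.ones(3, requires_grad=True)
b_grad_pt = a_grad_pt.exp()   # exp() saves its output for the backward pass
# b_grad_pt.add_(1)           # uncommenting this makes backward() raise a RuntimeError
b_grad_pt.sum().backward()    # works as long as the saved tensor is untouched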
TensorFlow's tf.Tensor objects are immutable; operations typically create new tensors. Mutability in TensorFlow is primarily handled through tf.Variable objects, which have methods like assign(), assign_add(), etc.
PyTorch supports broadcasting, similar to NumPy and TensorFlow. If two tensors have different shapes but are compatible according to broadcasting rules (dimensions are equal, or one is 1, or one is missing), operations can still be performed element-wise.
# PyTorch
# Tensor m_pt: shape (3, 1)
# [[1],
# [2],
# [3]]
m_pt = torch.arange(1, 4).reshape(3, 1).float()
# Tensor n_pt: shape (1, 2)
# [[10, 20]]
n_pt = torch.tensor([[10., 20.]])
# m_pt is broadcast to (3,2), n_pt is broadcast to (3,2)
# m_pt + n_pt:
# [[1+10, 1+20], [[11, 21],
# [2+10, 2+20], = [12, 22],
# [3+10, 3+20]] [13, 23]]
result_pt = m_pt + n_pt
print("Broadcasted sum (3,1) + (1,2):\n", result_pt)
print("Result shape:", result_pt.shape) # torch.Size([3, 2])
# TensorFlow equivalent
# m_tf = tf.constant([[1.],[2.],[3.]]) # Shape (3,1)
# n_tf = tf.constant([[10., 20.]]) # Shape (1,2)
# result_tf = m_tf + n_tf # Shape (3,2) via broadcasting
PyTorch defaults to torch.float32 for tensors created from Python lists of floating-point numbers and torch.int64 (long tensor) for Python integers; tensors created from NumPy arrays inherit the array's dtype. TensorFlow commonly defaults to float32 and int32. Mismatched data types are a frequent source of errors. You can convert tensor types using the .to(dtype) method or specific casting methods like .float(), .long(), .double(), etc.
# PyTorch
float_list_pt = torch.tensor([1.0, 2.5, 3.0])
print("Default float dtype:", float_list_pt.dtype) # torch.float32
int_list_pt = torch.tensor([1, 2, 3])
print("Default int dtype:", int_list_pt.dtype) # torch.int64
# Casting
float_to_double_pt = float_list_pt.to(torch.float64)
print("Casted to double:", float_to_double_pt.dtype) # torch.float64
int_to_float_pt = int_list_pt.float() # Equivalent to .to(torch.float32)
print("Casted int to float:", int_to_float_pt.dtype) # torch.float32
As you can see, many tensor operations in PyTorch have direct parallels in TensorFlow, often with very similar naming conventions or operator usage. The dynamic nature of PyTorch means these operations are executed immediately, which can be helpful for debugging and interactive development. Getting comfortable with these operations is a significant step in your transition to PyTorch.