At the heart of both TensorFlow and PyTorch lies the tensor, a fundamental data structure for all computations. If you've worked with TensorFlow, you're already familiar with tf.Tensor as a multi-dimensional array that can hold numerical data. PyTorch's torch.Tensor serves the exact same purpose. While the core idea is identical, the way you create, interact with, and manage these tensors differs in ways that reflect PyTorch's dynamic nature. This section will guide you through these distinctions, focusing on tensor creation and attributes.
Both frameworks provide versatile ways to create tensors, whether from existing Python data structures like lists or NumPy arrays, or by initializing them with specific values.
In TensorFlow, you typically use tf.constant() or tf.convert_to_tensor() to create a tensor from a Python list or a NumPy array.
# TensorFlow
import tensorflow as tf
import numpy as np
# From a Python list
tf_tensor_list = tf.constant([1, 2, 3])
print(f"TF tensor from list: {tf_tensor_list}, dtype: {tf_tensor_list.dtype}")
# From a NumPy array
np_array = np.array([4.0, 5.0, 6.0])
tf_tensor_numpy = tf.convert_to_tensor(np_array)
print(f"TF tensor from NumPy: {tf_tensor_numpy}, dtype: {tf_tensor_numpy.dtype}")
PyTorch offers similar functionality with torch.tensor(). It's important to note that torch.tensor() always copies the data. If you want to create a tensor that shares memory with a NumPy array (more on this later), you'd use torch.from_numpy() or torch.as_tensor().
# PyTorch
import torch
import numpy as np
# From a Python list
pt_tensor_list = torch.tensor([1, 2, 3])
print(f"PyTorch tensor from list: {pt_tensor_list}, dtype: {pt_tensor_list.dtype}")
# From a NumPy array (copies data)
np_array = np.array([4.0, 5.0, 6.0])
pt_tensor_numpy_copy = torch.tensor(np_array)
print(f"PyTorch tensor from NumPy (copy): {pt_tensor_numpy_copy}, dtype: {pt_tensor_numpy_copy.dtype}")
# From a NumPy array (shares memory, if on CPU)
pt_tensor_numpy_share = torch.from_numpy(np_array)
print(f"PyTorch tensor from NumPy (share): {pt_tensor_numpy_share}, dtype: {pt_tensor_numpy_share.dtype}")
Notice a subtle difference in default integer types: TensorFlow often defaults to tf.int32 for integers from Python lists, while PyTorch defaults to torch.int64 (also known as torch.long). For floating-point numbers, both typically default to 32-bit precision (tf.float32 and torch.float32).
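A quick sketch makes these defaults concrete; it reuses the imports from the snippets above, and the values are only illustrative.
# Default integer dtypes from Python lists
print(tf.constant([1, 2, 3]).dtype)   # <dtype: 'int32'>
print(torch.tensor([1, 2, 3]).dtype)  # torch.int64
# Overriding the default explicitly at creation time
print(tf.constant([1, 2, 3], dtype=tf.int64).dtype)      # <dtype: 'int64'>
print(torch.tensor([1, 2, 3], dtype=torch.int32).dtype)  # torch.int32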
Creating tensors filled with zeros, ones, or random numbers is a common task.
TensorFlow:
# TensorFlow
tf_zeros = tf.zeros(shape=(2, 3), dtype=tf.float32)
tf_ones = tf.ones(shape=(2, 3), dtype=tf.int32)
tf_random = tf.random.normal(shape=(2, 3), mean=0.0, stddev=1.0)
tf_range = tf.range(start=0, limit=5, delta=1)
print(f"TF Zeros:\n{tf_zeros}")
print(f"TF Ones:\n{tf_ones}")
print(f"TF Random Normal:\n{tf_random}")
print(f"TF Range:\n{tf_range}")
PyTorch:
# PyTorch
pt_zeros = torch.zeros(size=(2, 3), dtype=torch.float32)
pt_ones = torch.ones(size=(2, 3), dtype=torch.int32)
pt_random = torch.randn(size=(2, 3)) # mean 0, std 1 by default
pt_arange = torch.arange(start=0, end=5, step=1) # note 'end' instead of 'limit'
print(f"PyTorch Zeros:\n{pt_zeros}")
print(f"PyTorch Ones:\n{pt_ones}")
print(f"PyTorch Random Normal:\n{pt_random}")
print(f"PyTorch arange:\n{pt_arange}")
The function names and parameter names are often very similar (e.g., tf.zeros vs. torch.zeros, tf.random.normal vs. torch.randn). One notable difference: PyTorch's torch.arange uses end for the exclusive upper bound, whereas TensorFlow's tf.range uses limit.
Both frameworks also provide *_like functions (e.g., tf.zeros_like(), torch.zeros_like()) to create new tensors with the same shape and data type as an existing tensor.
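For example, a minimal sketch reusing the tf_random and pt_random tensors created above:
# New tensors matching the shape and dtype of an existing tensor
tf_like = tf.zeros_like(tf_random)    # float32 zeros with shape (2, 3)
pt_like = torch.ones_like(pt_random)  # float32 ones with shape (2, 3)
print(tf_like.shape, pt_like.shape)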
Understanding a tensor's properties is essential. Both tf.Tensor and torch.Tensor expose attributes like data type, shape, and the device they reside on.
Data type (dtype): In TensorFlow, the data type is an instance of tf.DType (e.g., tf.float32, tf.int64). In PyTorch, it's a torch.dtype object (e.g., torch.float32, torch.int64).
# TensorFlow
tf_t = tf.constant([1.0, 2.0])
print(f"TF dtype: {tf_t.dtype}") # tf.float32
# PyTorch
pt_t = torch.tensor([1.0, 2.0])
print(f"PyTorch dtype: {pt_t.dtype}") # torch.float32
# Explicitly setting dtype
pt_t_int = torch.tensor([1, 2], dtype=torch.int16)
print(f"PyTorch explicit dtype: {pt_t_int.dtype}") # torch.int16
As mentioned, be mindful of default integer types: torch.int64 (or torch.long) is the common default for integers from Python lists in PyTorch, while TensorFlow leans towards tf.int32. You can always specify the dtype explicitly during creation.
To change a tensor's data type:
# TensorFlow
tf_new_type = tf.cast(tf_t, dtype=tf.int32)
# PyTorch
pt_new_type = pt_t.to(dtype=torch.int32)
pt_new_type = pt_t.int()  # shorthand methods exist for common dtypes

The shape of a tensor describes its dimensions.
- TensorFlow: the shape attribute is a tf.TensorShape object. You can convert it to a Python list using tf_t.shape.as_list().
- PyTorch: the shape attribute is a torch.Size object, which behaves like a tuple. You can also use the size() method, which returns the same torch.Size object.
# TensorFlow
tf_t_shape = tf.zeros((2, 3, 4))
print(f"TF shape object: {tf_t_shape.shape}") # TensorShape([2, 3, 4])
print(f"TF shape as list: {tf_t_shape.shape.as_list()}") # [2, 3, 4]
print(f"TF rank: {tf_t_shape.ndim}") # 3
# PyTorch
pt_t_shape = torch.zeros(2, 3, 4)
print(f"PyTorch shape object: {pt_t_shape.shape}") # torch.Size([2, 3, 4])
print(f"PyTorch size() method: {pt_t_shape.size()}") # torch.Size([2, 3, 4])
print(f"PyTorch individual dim: {pt_t_shape.shape[0]}")# 2
print(f"PyTorch rank: {pt_t_shape.ndim}") # 3
Both also offer an ndim attribute for the number of dimensions (rank).
Tensors can reside on different computational devices, primarily CPU or GPU.
- TensorFlow: the device attribute is a string, like '/job:localhost/replica:0/task:0/device:CPU:0' or '/job:localhost/replica:0/task:0/device:GPU:0'.
- PyTorch: the device attribute is a torch.device object, e.g., device(type='cpu') or device(type='cuda', index=0).
# TensorFlow (device placement is more implicit or via tf.device context)
with tf.device('/CPU:0'):
    tf_t_cpu = tf.constant([1, 2])
print(f"TF tensor device: {tf_t_cpu.device}")
# If GPUs are available and TensorFlow is configured for GPU:
# with tf.device('/GPU:0'):
#     tf_t_gpu = tf.constant([1, 2])
#     print(f"TF tensor device (GPU): {tf_t_gpu.device}")
# PyTorch
pt_t_cpu = torch.tensor([1, 2]) # Defaults to CPU
print(f"PyTorch tensor device (default): {pt_t_cpu.device}")
if torch.cuda.is_available():
    pt_t_gpu = torch.tensor([1, 2], device='cuda')
    # Or: pt_t_gpu = pt_t_cpu.to('cuda')
    # Or: pt_t_gpu = pt_t_cpu.cuda()
    print(f"PyTorch tensor device (GPU): {pt_t_gpu.device}")
else:
    print("PyTorch: CUDA not available, GPU example skipped.")
Moving tensors between devices in PyTorch is explicit, using the .to(device) method (e.g., my_tensor.to('cuda') or my_tensor.to('cpu')). We'll cover device management in more detail in a later section.
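A common PyTorch idiom builds on this explicitness: pick a device once, then move everything to it. A minimal sketch (the device variable name is illustrative):
# Device-agnostic setup: use the GPU when present, else fall back to CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
x = torch.ones(2, 3).to(device)
print(x.device)  # cuda:0 if a GPU is available, otherwise cpu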
While tf.Tensor and torch.Tensor share the common goal of representing multi-dimensional data, some differences are particularly important for developers transitioning from TensorFlow:
Mutability:
- Operations on tf.Tensor usually create and return new tensors. To have mutable state, TensorFlow uses tf.Variable.
- PyTorch tensors are mutable, and many operations have in-place variants whose names end with a trailing underscore (e.g., tensor.add_()). This can lead to more memory-efficient code if used carefully but requires attention to avoid unintended side effects.
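A small sketch contrasting the two models of mutation (variable names illustrative):
# PyTorch: in-place ops mutate the tensor directly
t = torch.ones(3)
t.add_(1)  # t is now tensor([2., 2., 2.]); no new tensor is allocated
# TensorFlow: mutable state lives in tf.Variable
v = tf.Variable([1.0, 1.0, 1.0])
v.assign_add(tf.ones(3))  # v now holds [2., 2., 2.]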
tf.Variable vs. torch.Tensor(requires_grad=True):
- tf.Variable is a special object designed to hold mutable tensor-like values, typically model parameters, whose modifications need to be tracked for automatic differentiation.
- A torch.Tensor can be marked to have its operations tracked by setting its requires_grad attribute to True (e.g., x = torch.randn(3, requires_grad=True)). These are still regular, mutable tensors; the requires_grad flag simply tells PyTorch's autograd system to record operations involving them for gradient computation.
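A minimal sketch of the requires_grad flag in action (names illustrative):
x = torch.randn(3, requires_grad=True)
y = (x * 2).sum()  # autograd records the multiply and the sum
y.backward()       # computes dy/dx
print(x.grad)      # tensor([2., 2., 2.])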
NumPy Bridge and Memory Sharing:
- tf_tensor.numpy(): converts a TensorFlow tensor to a NumPy array. If the tensor is on a GPU, data is copied to the CPU.
- torch_tensor.numpy(): converts a PyTorch tensor to a NumPy array. Crucially, if the torch.Tensor is on the CPU, the returned NumPy array shares the same underlying memory; modifying one will affect the other. If the tensor is on a GPU, calling .numpy() raises an error, so you must move it to the CPU first with .cpu().numpy().
- tf.convert_to_tensor(numpy_array): creates a TensorFlow tensor from a NumPy array, typically copying the data.
- torch.from_numpy(numpy_array): creates a PyTorch tensor from a NumPy array, sharing memory with the array. To ensure a copy, use torch.tensor(numpy_array).

This memory-sharing behavior in PyTorch for CPU tensors and NumPy arrays is a performance feature but requires careful handling. If you need a distinct copy, use tensor.clone().numpy() for PyTorch to NumPy, or torch.tensor(numpy_array) for NumPy to PyTorch.
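The following sketch demonstrates the sharing versus copying behavior (array and tensor names illustrative):
np_arr = np.array([1.0, 2.0, 3.0])
shared = torch.from_numpy(np_arr)  # shares memory with np_arr
copied = torch.tensor(np_arr)      # independent copy of the data
np_arr[0] = 99.0
print(shared[0].item())  # 99.0 -- the change is visible through the shared tensor
print(copied[0].item())  # 1.0  -- the copy is unaffected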
The following diagram illustrates the memory sharing characteristic between PyTorch CPU tensors and NumPy arrays:
Interaction between NumPy arrays and PyTorch tensors on the CPU: torch.from_numpy() and the .numpy() method (on CPU tensors) facilitate memory sharing, while torch.tensor() ensures a data copy.
Understanding these characteristics of torch.Tensor, including its creation methods, attributes, mutability, and NumPy interaction, is a fundamental step. As you progress, you'll see how these properties influence the way you build models and write training loops in PyTorch's more imperative style. The next section covers common tensor operations, further highlighting the similarities and differences compared to TensorFlow.