Monday, February 7, 2022

Stack Abuse: Numpy Array to Tensor and Tensor to Numpy Array with PyTorch

Tensors are multi-dimensional objects, and the essential data representaion block of Deep Learning frameworks such as Tensorflow and PyTorch.

A scalar has one dimension, a vector has two, and tensors have three or more. In practice, we oftentimes refer to scalars and vectors as tensors as well for convinience.

Note: A tensor can also be any n-dimensional array, just like a Numpy array can. Many frameworks have support for working with Numpy arrays, and many of them are built on top of Numpy so the integration is both natural and efficient.

However, a torch.Tensor has more built-in capabilities than Numpy arrays do, and these capabilities are geared towards Deep Learning applications (such as GPU acceleration), so it makes sense to prefer torch.Tensor instances over regular Numpy arrays when working with PyTorch. Additionally, torch.Tensors have a very Numpy-like API, making it intuitive for most with prior experience!

In this guide, learn how to convert between a Numpy Array and PyTorch Tensors.

Convert Numpy Array to PyTorch Tensor

To convert a Numpy array to a PyTorch tensor - we have two distinct approaches we could take: using the from_numpy() function, or by simply supplying the Numpy array to the torch.Tensor() constructor or by using the tensor() function:

import torch
import numpy as np

np_array = np.array([5, 7, 1, 2, 4, 4])

# Convert Numpy array to torch.Tensor
tensor_a = torch.from_numpy(np_array)
tensor_b = torch.Tensor(np_array)
tensor_c = torch.tensor(np_array)

So, what's the difference? The from_numpy() and tensor() functions are dtype-aware! Since we've created a Numpy array of integers, the dtype of the underlying elements will naturally be int32:

print(np_array.dtype)
# dtype('int32')

If we were to print out our two tensors:

print(f'tensor_a: {tensor_a}\ntensor_b: {tensor_b}\ntensor_c: {tensor_c}')

tensor_a and tensor_c retain the data type used within the np_array, cast into PyTorch's variant (torch.int32), while tensor_b automatically assigns the values to floats:

tensor_a: tensor([5, 7, 1, 2, 4, 4], dtype=torch.int32)
tensor_b: tensor([5., 7., 1., 2., 4., 4.])
tensor_c: tensor([5, 7, 1, 2, 4, 4], dtype=torch.int32)

This can also be observed through checking their dtype fields:

print(tensor_a.dtype) # torch.int32
print(tensor_b.dtype) # torch.float32
print(tensor_c.dtype) # torch.int32

Numpy Array to PyTorch Tensor with dtype

These approaches also differ in whether you can explicitly set the desired dtype when creating the tensor. from_numpy() and Tensor() don't accept a dtype argument, while tensor() does:

# Retains Numpy dtype
tensor_a = torch.from_numpy(np_array)
# Creates tensor with float32 dtype
tensor_b = torch.Tensor(np_array)
# Retains Numpy dtype OR creates tensor with specified dtype
tensor_c = torch.tensor(np_array, dtype=torch.int32)

print(tensor_a.dtype) # torch.int32
print(tensor_b.dtype) # torch.float32
print(tensor_c.dtype) # torch.int32

Naturally, you can cast any of them very easily, using the exact same syntax, allowing you to set the dtype after the creation as well, so the acceptance of a dtype argument isn't a limitation, but more of a convenience:

tensor_a = tensor_a.float()
tensor_b = tensor_b.float()
tensor_c = tensor_c.float()

print(tensor_a.dtype) # torch.float32
print(tensor_b.dtype) # torch.float32
print(tensor_c.dtype) # torch.float32

Convert PyTorch Tensor to Numpy Array

Converting a PyTorch Tensor to a Numpy array is straightforward, since tensors are ultimately built on top of Numpy arrays, and all we have to do is "expose" the underlying data structure.

Since PyTorch can optimize the calculations performed on data based on your hardware, there are a couple of caveats though:

tensor = torch.tensor([1, 2, 3, 4, 5])

np_a = tensor.numpy()
np_b = tensor.detach().numpy()
np_c = tensor.detach().cpu().numpy()

So, why use detach() and cpu() before exposing the underlying data structure with numpy(), and when should you detach and transfer to a CPU?

CPU PyTorch Tensor -> CPU Numpy Array

If your tensor is on the CPU, where the new Numpy array will also be - it's fine to just expose the data structure:

np_a = tensor.numpy()
# array([1, 2, 3, 4, 5], dtype=int64)

This works very well, and you've got yourself a clean Numpy array.

CPU PyTorch Tensor with Gradients -> CPU Numpy Array

However, if your tensor requires you to calculate gradients for it as well (i.e. the requires_grad argument is set to True), this approach won't work anymore. You'll have to detach the underlying array from the tensor, and through detaching, you'll be pruning away the gradients:

tensor = torch.tensor([1, 2, 3, 4, 5], dtype=torch.float32, requires_grad=True)

np_a = tensor.numpy()
# RuntimeError: Can't call numpy() on Tensor that requires grad. Use tensor.detach().numpy() instead.
np_b = tensor.detach().numpy()
# array([1., 2., 3., 4., 5.], dtype=float32)

GPU PyTorch Tensor -> CPU Numpy Array

Finally - if you've created your tensor on the GPU, it's worth remembering that regular Numpy arrrays don't support GPU acceleration. They reside on the CPU! You'll have to transfer the tensor to a CPU, and then detach/expose the data structure.

Note: This can either be done via the to('cpu') or cpu() functions - they're functionally equivalent.

This has to be done explicitly, because if it were done automatically - the conversion between CPU and CUDA tensors to arrays would be different under the hood, which could lead to unexpected bugs down the line.

PyTorch is fairly explicit, so this sort of automatic conversion was purposefully avoided:

# Create tensor on the GPU
tensor = torch.tensor([1, 2, 3, 4, 5], dtype=torch.float32, requires_grad=True).cuda()

np_b = tensor.detach().numpy()
# TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
np_c = tensor.detach().cpu().numpy()
# array([1., 2., 3., 4., 5.], dtype=float32)

Note: It's highly advised to call detach() before cpu(), to prune away the gradients before transfering to the CPU. The gradients won't matter anyway after the detach() call - so copying them at any point is totally redundant and inefficient. It's better to "cut the dead weight" as soon as possible.

Generally speaking - this approach is the safest, as no matter which sort of tensor you're working - it won't fail. If you've got a CPU tensor, and you try sending it to the CPU - nothing happens. If you've got a tensor without gradients, and try detaching it - nothing happens. On the other end of the stick - exceptions are thrown.

Conclusion

In this guide - we've taken a look at what PyTorch tensors are, before diving into how to convert a Numpy array into a PyTorch tensor. Finally, we've explored how PyTorch tensors can expose the underlying Numpy array, and in which cases you'd have to perform additional transfers and pruning.



from Planet Python
via read more

No comments:

Post a Comment

TestDriven.io: Working with Static and Media Files in Django

This article looks at how to work with static and media files in a Django project, locally and in production. from Planet Python via read...