# Chapter 3: Introduction to PyTorch & Neural Networks

**By Tomas Beuzen 🚀**


## Chapter Learning Objectives

- Describe the difference between `NumPy` and `torch` arrays (`np.array` vs. `torch.Tensor`).
- Explain fundamental concepts of neural networks such as layers, nodes, activation functions, etc.
- Create a simple neural network in PyTorch for regression or classification.

## Imports

```
import sys
import numpy as np
import pandas as pd
import torch
from torchsummary import summary
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset
from sklearn.datasets import make_regression, make_circles, make_blobs
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression
from utils.plotting import *
```

## 1. Introduction

PyTorch is a Python-based tool for scientific computing that provides two main features:

- `torch.Tensor`, an n-dimensional array similar to NumPy's, but which can run on GPUs
- Computational graphs and an automatic differentiation engine for building and training neural networks

You can install PyTorch from: https://pytorch.org/.

## 2. PyTorch's Tensor

In PyTorch a tensor is just like NumPy's `ndarray`, which most readers will be familiar with already (if not, check out Chapter 5 and Chapter 6 of my Python Programming for Data Science course).

A key difference between PyTorch's `torch.Tensor` and NumPy's `np.array` is that `torch.Tensor` was constructed to integrate with GPUs and PyTorch's computational graphs (more on that next chapter though).

### 2.1. `ndarray` vs `tensor`

Creating and working with tensors is much the same as with NumPy `ndarrays`. You can create a tensor with `torch.tensor()`:

```
tensor_1 = torch.tensor([1, 2, 3])
tensor_2 = torch.tensor([1, 2, 3], dtype=torch.float32)
tensor_3 = torch.tensor(np.array([1, 2, 3]))
for t in [tensor_1, tensor_2, tensor_3]:
    print(f"{t}, dtype: {t.dtype}")
```

```
tensor([1, 2, 3]), dtype: torch.int64
tensor([1., 2., 3.]), dtype: torch.float32
tensor([1, 2, 3]), dtype: torch.int64
```

PyTorch also comes with most of the `NumPy` functions you're probably already familiar with:

```
torch.zeros(2, 2) # zeroes
```

```
tensor([[0., 0.],
        [0., 0.]])
```

```
torch.ones(2, 2) # ones
```

```
tensor([[1., 1.],
        [1., 1.]])
```

```
torch.randn(3, 2) # random normal
```

```
tensor([[-1.1988, -0.7157],
        [-0.1942, -1.7273],
        [-1.0674,  0.4149]])
```

```
torch.rand(2, 3, 2) # rand uniform
```

```
tensor([[[0.0583, 0.3669],
         [0.0315, 0.9852],
         [0.1880, 0.5039]],

        [[0.0234, 0.7198],
         [0.5472, 0.1252],
         [0.1728, 0.3510]]])
```

Just like in NumPy, we can look at the shape of a tensor with the `.shape` attribute:

```
x = torch.rand(2, 3, 2, 2)
x.shape
```

```
torch.Size([2, 3, 2, 2])
```

```
x.ndim
```

```
4
```
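Tensors can also be reshaped much like NumPy arrays. This short sketch (not part of the original notebook) illustrates `.reshape()` and `.flatten()`:

```python
import torch

x = torch.rand(2, 3, 2, 2)     # 24 elements in total
print(x.reshape(6, 4).shape)   # reshape to a 6 x 4 matrix
print(x.reshape(2, -1).shape)  # -1 tells PyTorch to infer the remaining dimension
print(x.flatten().shape)       # collapse to 1-D
```

As with `ndarrays`, the total number of elements must stay the same after reshaping.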

### 2.2. Tensors and Data Types

Different data types have different memory and computational implications (see Chapter 6 of Python Programming for Data Science for more). In PyTorch we'll be building networks that require thousands or even millions of floating point calculations! In such cases, using a smaller dtype like `float32` can significantly speed up computations and reduce memory requirements. The default float dtype in PyTorch is `float32`, as opposed to NumPy's `float64`. In fact, some operations in PyTorch will even throw an error if you pass a high-memory `dtype`!

```
print(np.array([3.14159]).dtype)
print(torch.tensor([3.14159]).dtype)
```

```
float64
torch.float32
```

But just like in NumPy, you can always specify the particular dtype you want using the `dtype` argument:

```
print(torch.tensor([3.14159], dtype=torch.float64).dtype)
```

```
torch.float64
```
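You can also cast an existing tensor to a different dtype with `.to()` or convenience methods like `.int()` (a small sketch to illustrate; not part of the original notebook):

```python
import torch

t = torch.tensor([3.14159])  # float32 by default
print(t.dtype)               # torch.float32
t64 = t.to(torch.float64)    # cast to double precision
print(t64.dtype)             # torch.float64
print(t.int().dtype)         # .int() casts (and truncates) to torch.int32
```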

### 2.3. Operations on Tensors

Tensors operate just like `ndarrays` and have a variety of familiar methods that can be called on them:

```
a = torch.rand(1, 3)
b = torch.rand(3, 1)
a + b  # broadcasting between a 1 x 3 and 3 x 1 tensor
```

```
tensor([[1.3773, 1.5033, 1.1765],
        [0.9496, 1.0756, 0.7488],
        [1.3639, 1.4899, 1.1631]])
```

```
a * b
```

```
tensor([[0.4183, 0.5349, 0.2325],
        [0.2249, 0.2876, 0.1250],
        [0.4122, 0.5271, 0.2292]])
```

```
a.mean()
```

```
tensor(0.4272)
```

```
a.sum()
```

```
tensor(1.2816)
```
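Matrix multiplication also works as in NumPy, via the `@` operator or `torch.matmul()` (an illustrative sketch, not from the original notebook):

```python
import torch

a = torch.rand(1, 3)
b = torch.rand(3, 1)
print(a @ b)               # (1, 3) @ (3, 1) -> (1, 1) matrix product
print(torch.matmul(a, b))  # equivalent to the @ operator
```

Note the contrast with `a * b` above, which broadcasts element-wise to a 3 x 3 result.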

### 2.4. Indexing

Once again, same as NumPy!

```
X = torch.rand(5, 2)
print(X)
```

```
tensor([[0.2803, 0.1461],
        [0.1740, 0.9460],
        [0.1257, 0.3427],
        [0.7001, 0.3810],
        [0.6504, 0.6580]])
```

```
print(X[0, :])
print(X[0])
print(X[:, 0])
```

```
tensor([0.2803, 0.1461])
tensor([0.2803, 0.1461])
tensor([0.2803, 0.1740, 0.1257, 0.7001, 0.6504])
```
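Boolean masks work just as they do in NumPy, too (a small sketch for illustration; not part of the original notebook):

```python
import torch

X = torch.rand(5, 2)
mask = X[:, 0] > 0.5  # boolean mask over the first column
print(mask)           # a tensor of True/False values
print(X[mask])        # only the rows where the first column exceeds 0.5
```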

### 2.5. NumPy Bridge

Sometimes we might want to convert a tensor back to a NumPy array. We can do that using the `.numpy()` method:

```
X = torch.rand(3, 3)
print(type(X))
X_numpy = X.numpy()
print(type(X_numpy))
```

```
<class 'torch.Tensor'>
<class 'numpy.ndarray'>
```
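Going the other way, `torch.from_numpy()` creates a tensor from a NumPy array. One thing to keep in mind (shown in this sketch, which is not from the original notebook): on the CPU, the array and the tensor share the same underlying memory, so modifying one modifies the other:

```python
import numpy as np
import torch

a = np.ones(3)
t = torch.from_numpy(a)  # shares memory with `a` (no copy is made)
a[0] = 99                # changing the array...
print(t)                 # ...is reflected in the tensor: tensor([99., 1., 1.], dtype=torch.float64)
```

Use `torch.tensor(a)` instead if you want an independent copy.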

### 2.6. GPU and CUDA Tensors

GPU stands for "graphical processing unit" (as opposed to a CPU: central processing unit). GPUs were originally developed for gaming; they are very fast at performing operations on large amounts of data by performing them in parallel (think about updating the value of all pixels on a screen very quickly as a player moves around in a game). More recently, GPUs have been adapted for more general-purpose programming. Neural networks can typically be broken into smaller computations that can be performed in parallel on a GPU. PyTorch is tightly integrated with CUDA, a software layer that facilitates interactions with a GPU (if you have one). You can check if you have GPU capability using:

```
torch.cuda.is_available() # my MacBook Pro does not have a GPU
```

```
False
```

When training on a machine that has a GPU, you need to tell PyTorch you want to use it. You'll see the following at the top of most PyTorch code:

```
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(device)
```

```
cpu
```

You can then use the `device` argument when creating tensors to specify whether you wish to use a CPU or GPU. Or, if you want to move a tensor between the CPU and GPU, you can use the `.to()` method:

```
X = torch.rand(2, 2, 2, device=device)
print(X.device)
```

```
cpu
```

```
# X.to('cuda') # this would give me an error as I don't have a GPU so I'm commenting out
```

Weâ€™ll revisit GPUs later in the course when we are working with bigger datasets and more complex networks. For now, we can work on the CPU just fine.

## 3. Neural Network Basics

It's likely that you've already learned about several machine learning algorithms (kNN, Random Forest, SVM, etc.). Neural networks are simply another algorithm, and actually one of the simplest in my opinion! As we'll see, a neural network is just a sequence of linear and non-linear transformations. Often you see something like this when learning about/using neural networks:

So what on Earth does that all mean? Well we are going to build up some intuition one step at a time.

### 3.1. Simple Linear Regression with a Neural Network

Let's create a simple regression dataset with 500 observations:

```
X, y = make_regression(n_samples=500, n_features=1, random_state=0, noise=10.0)
plot_regression(X, y)
```
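Before building a neural network for this data, a useful reference point is an ordinary least-squares fit with scikit-learn's `LinearRegression` (imported above). A minimal sketch, using the same synthetic dataset:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression

# Same synthetic dataset as above: one feature, Gaussian noise
X, y = make_regression(n_samples=500, n_features=1, random_state=0, noise=10.0)

lr = LinearRegression().fit(X, y)
print(f"slope: {lr.coef_[0]:.2f}, intercept: {lr.intercept_:.2f}")
```

This gives us baseline coefficients to compare against what a neural network learns.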