This blog post compares PyTorch and TensorFlow for deep learning.

For beginners in deep learning, one of the biggest decisions is which framework to learn first: PyTorch or TensorFlow, the two most popular frameworks in the deep learning community. Here, I would like to provide some insights by building an example ML pipeline with both frameworks.
Step 1 & 2. Data Exploration & Preprocessing
The dataset we use here is the MNIST dataset, which we have seen before. It contains 60,000 training images and 10,000 test images of handwritten digits. Let's look at some example images from the training set:
import keras
import matplotlib.pyplot as plt
# Download MNIST
(X_train, y_train), (X_test, y_test) = keras.datasets.mnist.load_data()
# Display 10 samples
plt.figure(figsize=(10, 4))
for i in range(10):
    plt.subplot(2, 5, i + 1)
    plt.imshow(X_train[i], cmap='gray')
    plt.axis('off')
plt.tight_layout()
plt.show()

The images above may look blurry because each one is only 28 by 28 pixels, or 784 pixels in total. Before preprocessing, the pixel values range from 0 to 255, and the class labels are integers from 0 to 9 rather than one-hot vectors. Hence, we flatten each image into a 784-dimensional vector, normalize the pixel values with a z-score, and one-hot encode the labels:
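import numpy as np

# Flatten each 28x28 image into a 784-dimensional vector
X_train = X_train.reshape(X_train.shape[0], X_train.shape[1]*X_train.shape[2])
X_test = X_test.reshape(X_test.shape[0], X_test.shape[1]*X_test.shape[2])

def zscore(X, axis=None):
    # Standardize to zero mean and unit variance (over all values when axis=None)
    X_mean = X.mean(axis=axis, keepdims=True)
    X_std = np.std(X, axis=axis, keepdims=True)
    return (X - X_mean) / X_std

# Each split is standardized with its own mean and standard deviation
X_train = zscore(X_train)
X_test = zscore(X_test)
# One-hot encode the integer labels
y_train = keras.utils.to_categorical(y_train)
y_test = keras.utils.to_categorical(y_test)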
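To see what these two steps do on a small scale, here is a minimal sketch on a toy array. The `pixels` and `labels` arrays are made up for illustration, and indexing into `np.eye` is used as a stand-in for `keras.utils.to_categorical` so the example only needs NumPy.

```python
import numpy as np


def zscore(X, axis=None):
    # Same idea as in the post: subtract the mean, divide by the standard deviation.
    X_mean = X.mean(axis=axis, keepdims=True)
    X_std = X.std(axis=axis, keepdims=True)
    return (X - X_mean) / X_std


# Hypothetical 2x3 "image" with raw pixel values in [0, 255]
pixels = np.array([[0, 128, 255], [64, 192, 32]], dtype=np.uint8)
normalized = zscore(pixels)

# After standardization the values have (approximately) zero mean and unit std
print(normalized.mean(), normalized.std())

# Hypothetical integer labels turned into one-hot rows of length 10
labels = np.array([3, 0, 9])
one_hot = np.eye(10, dtype=np.float32)[labels]
print(one_hot.shape)  # (3, 10)
```

Each row of `one_hot` has a single 1 at the index of its class, which is the label format that a categorical cross-entropy loss in Keras expects.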
To check for potential overfitting and underfitting, we typically set up a validation dataset and track loss and metrics on it. We can use a portion of the training dataset to create a validation dataset as follows:
from sklearn.model_selection import train_test_split
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=10000, random_state=101)
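Note that an integer `test_size` is read as an absolute number of samples, not a fraction, so the split above holds out 10,000 samples for validation. A small sketch with made-up toy arrays (`X_toy` and `y_toy` are hypothetical names) shows the behavior:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data: 60 samples with 4 features each; the label i belongs to row i
X_toy = np.arange(60 * 4, dtype=np.float32).reshape(60, 4)
y_toy = np.arange(60)

# test_size=10 holds out exactly 10 samples
X_tr, X_va, y_tr, y_va = train_test_split(X_toy, y_toy, test_size=10, random_state=101)
print(X_tr.shape, X_va.shape)  # (50, 4) (10, 4)

# Rows and labels are shuffled together, so each row keeps its own label
print(np.array_equal(X_tr[:, 0] / 4, y_tr))  # True
```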
For TensorFlow, the data preprocessing is already done at this point, since a model built with TensorFlow accepts NumPy arrays directly. For PyTorch, however, we need to convert the NumPy arrays into tensors:
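import torch
import torch.nn as nn
X_train, X_val, X_test = map(lambda X: torch.tensor(X, dtype=torch.float32), (X_train, X_val, X_test))
# Labels are still one-hot here; take y.argmax(dim=1) if your loss expects class indices (e.g. nn.CrossEntropyLoss)
y_train, y_val, y_test = map(lambda y: torch.tensor(y, dtype=torch.int64), (y_train, y_val, y_test))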
Technically, the above is already sufficient for PyTorch. However, it is highly recommended to build a Dataset and DataLoader from the tensors to configure the batch size and other data-related hyperparameters:
train_dataset = torch.utils.data.TensorDataset(X_train, y_train)
val_dataset = torch.utils.data.TensorDataset(X_val, y_val)
test_dataset = torch.utils.data.TensorDataset(X_test, y_test)
train_loader = torch.utils.data.DataLoader(dataset=train_dataset, batch_size=32, shuffle=True)
val_loader = torch.utils.data.DataLoader(dataset=val_dataset, batch_size=32, shuffle=True)
test_loader = torch.utils.data.DataLoader(dataset=test_dataset, batch_size=1, shuffle=True)
Here, we use a batch size of 32 for both the PyTorch and TensorFlow models. The other hyperparameters are also set the same for both frameworks.
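To get a feel for what a DataLoader yields, here is a minimal sketch that iterates over one. It uses random stand-in tensors with the same feature and label shapes as the preprocessed MNIST data, so `X_demo`, `y_demo`, and the sample count of 100 are assumptions for illustration only.

```python
import torch

# Random stand-ins: 100 flattened "images" and one-hot labels for 10 classes
X_demo = torch.randn(100, 784)
y_demo = torch.nn.functional.one_hot(torch.randint(0, 10, (100,)), num_classes=10)

demo_dataset = torch.utils.data.TensorDataset(X_demo, y_demo)
demo_loader = torch.utils.data.DataLoader(demo_dataset, batch_size=32, shuffle=True)

# Each iteration yields one (features, labels) batch
batch_shapes = [(tuple(X_b.shape), tuple(y_b.shape)) for X_b, y_b in demo_loader]

print(len(demo_loader))   # 4 batches: 32 + 32 + 32 + 4 samples
print(batch_shapes[0])    # ((32, 784), (32, 10))
print(batch_shapes[-1])   # ((4, 784), (4, 10)), the last batch is smaller
```

The last batch is smaller because 100 is not a multiple of 32. Passing `drop_last=True` to the DataLoader would discard it instead.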
Step 3. Models
Let's create classification models using TensorFlow and PyTorch to see how they differ.
Here, I will omit the training results and Step 4 (model evaluation) since there isn't much to discuss. I highly recommend you try it yourself as practice.
Conclusion
In general, TensorFlow is said to suit quick production work: its predefined functions and objects make code fast to write, TensorBoard helps monitor training, and tools like TensorFlow Lite and TensorFlow.js make it easy to deploy on multiple platforms.
It is also said that PyTorch is for research, where we build new custom models, layers, and other custom components, due to its high customizability. However, both frameworks are constantly improving, and both are fast to write and easy to customize (at least, that is my opinion after using them both). Therefore, it is completely up to you to decide which one to use. Personally, I would recommend learning both at the same time by going back and forth between them since they are similar enough, and you are likely to use both anyway.