
Building a Deep Neural Network in PyTorch

Deep neural networks, a class of artificial neural networks (ANNs) with multiple hidden layers, have become one of the most popular and successful approaches to machine learning. They can represent complex relationships in data, and they have been used to achieve state-of-the-art results in a wide variety of applications, including image classification, natural language processing, speech recognition, and robotics.
In this blog post, we will show you how to build a deep neural network in PyTorch, a popular and powerful framework for machine learning. We will start with a brief overview of deep neural networks and their applications. Then, we will dive into the process of building a deep neural network in PyTorch, covering the basic components, steps, and code implementation. Finally, we will discuss how to test and evaluate the performance of the deep neural network on a validation or test dataset.
By the end of this blog post, you will have a solid understanding of how to build a deep neural network in PyTorch and will be able to apply your knowledge to your own machine learning projects.

Building a Deep Neural Network in PyTorch

1. Import the necessary modules
The first step in building a deep neural network in PyTorch is to import the necessary modules. These modules provide the tools for defining our network architecture, initializing parameters, creating data loaders, defining loss functions and optimizers, and training our model.
import torch
from torch import nn
from torch import optim
from torchvision import datasets, transforms
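Since the training and evaluation code below moves tensors to the GPU when one is available, we also define the target device up front:

# Use a GPU if one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")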
2. Define the network architecture
The next step is to define the network architecture. This involves specifying the layers of our network, such as convolutional layers, linear layers, and nonlinear activation functions. The specific architecture will depend on the task we are trying to solve; here we use two convolution/pooling blocks followed by a single linear layer. Note that the forward pass returns raw logits rather than probabilities: PyTorch's CrossEntropyLoss applies log-softmax internally, so adding an explicit Softmax layer before it would be a bug.
class MyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1)
        self.relu1 = nn.ReLU()
        self.pool1 = nn.MaxPool2d(2, stride=2)
        self.conv2 = nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1)
        self.relu2 = nn.ReLU()
        self.pool2 = nn.MaxPool2d(2, stride=2)
        # A 32x32 CIFAR-10 image becomes a 128 x 8 x 8 feature map after two 2x2 poolings
        self.fc1 = nn.Linear(128 * 8 * 8, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = self.relu1(x)
        x = self.pool1(x)
        x = self.conv2(x)
        x = self.relu2(x)
        x = self.pool2(x)
        x = x.view(x.size(0), -1)  # flatten to (batch_size, 128 * 8 * 8)
        x = self.fc1(x)
        return x  # raw logits; the loss function applies softmax internally
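A quick sanity check, useful whenever you change the architecture, is to pass a dummy batch through the network and confirm that the flattened feature size matches the first linear layer:

# Sanity check: a dummy CIFAR-10-sized batch should yield logits of shape (1, 10)
dummy = torch.randn(1, 3, 32, 32)
print(MyNet()(dummy).shape)  # expected: torch.Size([1, 10])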
3. Initialize the model, loss function, and optimizer
Before we can train our network, we need to instantiate the model, which triggers PyTorch's default parameter initialization for each layer, and define the optimizer and loss function. We also move the model to the device we selected earlier.
model = MyNet().to(device)
optimizer = optim.Adam(model.parameters())
criterion = nn.CrossEntropyLoss()
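As an optional sanity check (not required for training), you can count the trainable parameters to get a feel for the model's size:

# Count trainable parameters (the exact total depends on the architecture above)
num_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f'Trainable parameters: {num_params:,}')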
4. Create the data loader
The next step is to create the data loaders. A DataLoader wraps a dataset and yields mini-batches of samples, which are used to feed data to our network during training and evaluation.
# Load the training and test datasets
train_dataset = datasets.CIFAR10(root='./data', train=True, download=True, transform=transforms.ToTensor())
test_dataset = datasets.CIFAR10(root='./data', train=False, download=True, transform=transforms.ToTensor())

# Create data loaders
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=64, shuffle=True, num_workers=2)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=64, shuffle=False, num_workers=2)
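Note that ToTensor() only scales pixel values to the [0, 1] range. As an optional refinement, you can also normalize each channel; the mean and standard deviation below are the commonly cited per-channel statistics for CIFAR-10, not values computed in this post:

# Optional: normalize inputs with commonly cited CIFAR-10 channel statistics
normalize = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616)),
])
train_dataset = datasets.CIFAR10(root='./data', train=True, download=True, transform=normalize)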

5. Train the network
The final step is to train the network. This involves iterating over the training data in mini-batches, computing the loss, backpropagating the gradients, and updating the network's parameters with the optimizer.
for epoch in range(10):
    running_loss = 0.0
    for i, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)

        # Forward pass
        output = model(data)
        loss = criterion(output, target)

        # Backward pass and parameter update
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # Print statistics
        running_loss += loss.item()
        if i % 100 == 99:  # print every 100 mini-batches
            print('[%d, %5d] loss: %.3f' % (epoch + 1, i + 1, running_loss / 100))
            running_loss = 0.0
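Once training finishes, it is often worth saving the learned weights so the model can be reused without retraining; the filename here is just an example:

# Save the trained weights (the filename is arbitrary)
torch.save(model.state_dict(), 'mynet_cifar10.pth')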

Testing and Evaluating the Deep Neural Network

Once we have trained our deep neural network, we need to test and evaluate its performance on a held-out test dataset. This tells us how well our network generalizes to unseen data. We switch the model to evaluation mode and disable gradient tracking, since no parameters are updated during evaluation.

# Make predictions on the test dataset
model.eval()  # switch to evaluation mode
with torch.no_grad():
    correct = 0
    total = 0
    for data, target in test_loader:
        data, target = data.to(device), target.to(device)
        output = model(data)
        _, predicted = torch.max(output, 1)
        total += target.size(0)
        correct += (predicted == target).sum().item()

    acc = 100.0 * correct / total
    print('Accuracy on test set: %.2f%%' % acc)
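Overall accuracy can hide weaknesses on individual classes. As an optional extension of the same loop (not part of the original walkthrough), here is a sketch of a per-class breakdown, using the fact that CIFAR-10 has ten classes:

# Optional: per-class accuracy on the test set
class_correct = [0] * 10
class_total = [0] * 10
with torch.no_grad():
    for data, target in test_loader:
        data, target = data.to(device), target.to(device)
        predicted = model(data).argmax(dim=1)
        for t, p in zip(target, predicted):
            class_total[t.item()] += 1
            class_correct[t.item()] += int(t == p)

for c in range(10):
    print('Class %d accuracy: %.1f%%' % (c, 100.0 * class_correct[c] / max(class_total[c], 1)))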

Full Code Implementation

import torch
from torch import nn
from torch import optim
from torchvision import datasets, transforms

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Define the network architecture
class MyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1)
        self.relu1 = nn.ReLU()
        self.pool1 = nn.MaxPool2d(2, stride=2)
        self.conv2 = nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1)
        self.relu2 = nn.ReLU()
        self.pool2 = nn.MaxPool2d(2, stride=2)
        # A 32x32 CIFAR-10 image becomes a 128 x 8 x 8 feature map after two 2x2 poolings
        self.fc1 = nn.Linear(128 * 8 * 8, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = self.relu1(x)
        x = self.pool1(x)
        x = self.conv2(x)
        x = self.relu2(x)
        x = self.pool2(x)
        x = x.view(x.size(0), -1)  # flatten to (batch_size, 128 * 8 * 8)
        x = self.fc1(x)
        return x  # raw logits; the loss function applies softmax internally

# Instantiate the model and define the loss function and optimizer
model = MyNet().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters())

# Load the training and test datasets
train_dataset = datasets.CIFAR10(root='./data', train=True, download=True, transform=transforms.ToTensor())
test_dataset = datasets.CIFAR10(root='./data', train=False, download=True, transform=transforms.ToTensor())

# Create data loaders
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=64, shuffle=True, num_workers=2)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=64, shuffle=False, num_workers=2)

# Train the network
model.train()
for epoch in range(10):
    running_loss = 0.0
    for i, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)

        # Forward pass
        output = model(data)
        loss = criterion(output, target)

        # Backward and optimize
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # Print statistics
        running_loss += loss.item()
        if i % 100 == 99:    # print every 100 mini-batches
            print('[%d, %5d] loss: %.3f' % (epoch + 1, i + 1, running_loss / 100))
            running_loss = 0.0

# Test the network
model.eval()
with torch.no_grad():
    correct = 0
    total = 0
    for data, target in test_loader:
        data, target = data.to(device), target.to(device)
        output = model(data)
        _, predicted = torch.max(output, 1)
        total += target.size(0)
        correct += (predicted == target).sum().item()

    acc = 100.0 * correct / total
    print('Accuracy on test set: %.2f%%' % acc)

This code trains a convolutional neural network to classify images from the CIFAR-10 dataset. With this simple architecture and ten epochs of training, the network should reach an accuracy of approximately 70% on the test set, though the exact figure will vary from run to run.

Wrapping Up

In this blog post, we have shown you how to build a deep neural network in PyTorch, from importing the necessary modules and defining the network architecture to training the model and evaluating its performance on a test set. By following these steps, you will be able to build your own deep neural networks and apply them to a variety of machine learning tasks.
