# PyTorch Image Classification
Image classification is one of the most fundamental tasks in computer vision. The goal is to have a computer recognize the main content of an image and assign it to one of a set of predefined categories, for example deciding whether an image shows a cat or a dog.
### Application of Deep Learning in Image Classification
Deep learning models, particularly convolutional neural networks (CNNs), have become the mainstream solution for image classification. PyTorch, together with its companion library torchvision, provides the tools needed to build and train CNNs: dataset loaders, image transforms, neural-network layers, loss functions, and optimizers.
### Project Workflow Overview
A complete image classification project typically includes the following steps:
1. Data preparation and preprocessing
2. Model construction
3. Model training
4. Model evaluation
5. Model application
* * *
## Environment Preparation and Data Loading
### Installing Necessary Libraries
    # Install PyTorch and torchvision (the leading "!" runs pip from a notebook cell)
    !pip install torch torchvision
### Loading Common Datasets
PyTorch's torchvision package provides several commonly used datasets, such as CIFAR-10 and MNIST. The example below uses CIFAR-10, which contains 32×32 color images in 10 classes.
## Example
    import torch
    import torchvision
    import torchvision.transforms as transforms

    # Define data transformations
    transform = transforms.Compose([
        transforms.ToTensor(),  # Convert a PIL image to a float tensor in [0, 1]
        transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),  # Normalize each RGB channel to [-1, 1]
    ])

    # Load the CIFAR-10 training set (downloaded to ./data on first use)
    trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                            download=True, transform=transform)
    trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                              shuffle=True, num_workers=2)

    # Load the CIFAR-10 test set (no shuffling needed for evaluation)
    testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                           download=True, transform=transform)
    testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                             shuffle=False, num_workers=2)

    # Define class names, indexed by integer label
    classes = ('plane', 'car', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck')
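
To make the preprocessing concrete, here is a small sketch in plain Python of the arithmetic that the two transforms perform on a single pixel value. The helper name `normalize` is hypothetical and only mirrors the formula `(x - mean) / std`; it is not part of torchvision. The last lines also work out how many mini-batches one epoch contains for the settings above.

```python
def normalize(pixel, mean=0.5, std=0.5):
    """Apply the per-channel formula (pixel - mean) / std to one value."""
    return (pixel - mean) / std

# ToTensor() first rescales 8-bit pixel values from [0, 255] to [0.0, 1.0];
# Normalize with mean 0.5 and std 0.5 then maps that range onto [-1.0, 1.0].
for raw in (0, 128, 255):
    scaled = raw / 255
    print(f"raw={raw:3d}  scaled={scaled:.3f}  normalized={normalize(scaled):+.3f}")

# CIFAR-10 has 50,000 training images, so batch_size=4 yields
# this many mini-batches per epoch.
batches_per_epoch = 50_000 // 4
print("batches per epoch:", batches_per_epoch)
```

Centering the inputs around zero in this way is a common choice that tends to make gradient-based training better behaved.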
* * *
## Building the Convolutional Neural Network Model
### Basic CNN Structure
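A typical CNN stacks convolutional layers (which extract local features), pooling layers (which downsample the feature maps), and fully connected layers (which map the extracted features to class scores).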
## Example
    import torch.nn as nn
    import torch.nn.functional as F

    class Net(nn.Module):
        def __init__(self):
            super().__init__()
            # Convolutional layer 1: input 3 channels (RGB), output 6 channels, 5x5 kernel
            self.conv1 = nn.Conv2d(3, 6, 5)
            # Pooling layer: 2x2 window, stride 2
            self.pool = nn.MaxPool2d(2, 2)
            # Convolutional layer 2: input 6 channels, output 16 channels, 5x5 kernel
            self.conv2 = nn.Conv2d(6, 16, 5)
            # Fully connected layer 1: input 16*5*5, output 120
            self.fc1 = nn.Linear(16 * 5 * 5, 120)
            # Fully connected layer 2: input 120, output 84
            self.fc2 = nn.Linear(120, 84)
            # Fully connected layer 3: input 84, output 10 (corresponding to 10 classes)
            self.fc3 = nn.Linear(84, 10)

        def forward(self, x):
            # First convolution + ReLU + pooling
            x = self.pool(F.relu(self.conv1(x)))
            # Second convolution + ReLU + pooling
            x = self.pool(F.relu(self.conv2(x)))
            # Flatten feature maps to shape (batch, 400)
            x = x.view(-1, 16 * 5 * 5)
            # Fully connected layers + ReLU
            x = F.relu(self.fc1(x))
            x = F.relu(self.fc2(x))
            # Output layer: raw class scores (logits), no softmax here
            x = self.fc3(x)
            return x

    # Instantiate the network
    net = Net()
### Model Structure Visualization
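For a 32×32 CIFAR-10 image, the tensor shape changes through the network as follows (batch dimension omitted): input 3×32×32 → conv1 → 6×28×28 → pool → 6×14×14 → conv2 → 16×10×10 → pool → 16×5×5 → flatten → 400 → fc1 → 120 → fc2 → 84 → fc3 → 10.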
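
The `16 * 5 * 5` input size of `fc1` can be checked by hand. The sketch below, in plain Python, applies the standard output-size formula for convolution and pooling, `(size + 2 * padding - kernel) // stride + 1`, to each layer of `Net`, and also counts the learnable parameters. The helper names `conv_out`, `conv_params`, and `linear_params` are hypothetical, introduced only for this illustration.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Side length of the output of a square convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1

def conv_params(in_ch, out_ch, kernel):
    """Weights plus one bias per output channel of a Conv2d layer."""
    return in_ch * out_ch * kernel * kernel + out_ch

def linear_params(n_in, n_out):
    """Weights plus one bias per output unit of a Linear layer."""
    return n_in * n_out + n_out

size = 32                                   # CIFAR-10 images are 32x32
size = conv_out(size, kernel=5)             # after conv1
size = conv_out(size, kernel=2, stride=2)   # after first max pool
size = conv_out(size, kernel=5)             # after conv2
size = conv_out(size, kernel=2, stride=2)   # after second max pool
flat_features = 16 * size * size            # 16 channels, flattened

total_params = (conv_params(3, 6, 5) + conv_params(6, 16, 5)
                + linear_params(flat_features, 120)
                + linear_params(120, 84) + linear_params(84, 10))
print(size, flat_features, total_params)
```

If the input resolution or kernel sizes change, rerunning this arithmetic shows what the first `nn.Linear` input size needs to be.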
* * *
## Training the Model
### Defining Loss Function and Optimizer
## Example
    import torch.optim as optim

    # Cross-entropy loss function (expects raw logits and integer class labels)
    criterion = nn.CrossEntropyLoss()
    # Stochastic gradient descent optimizer with momentum
    optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
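
As a rough illustration of what the cross-entropy criterion computes, the following plain-Python sketch evaluates the loss for one sample as the negative log-softmax probability of the true class, then averages over a batch. The function names and the example logits are hypothetical; the real `nn.CrossEntropyLoss` operates on tensors and offers further options not modeled here.

```python
import math

def cross_entropy(logits, label):
    """Negative log-softmax probability of the true class for one sample."""
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum_exp = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    return log_sum_exp - logits[label]

def batch_cross_entropy(batch_logits, labels):
    """Mean of the per-sample losses over a mini-batch."""
    losses = [cross_entropy(row, y) for row, y in zip(batch_logits, labels)]
    return sum(losses) / len(losses)

# A model that scores all 10 classes equally has loss ln(10), about 2.303.
uniform_loss = cross_entropy([0.0] * 10, label=3)
print(f"uniform guess loss: {uniform_loss:.3f}")

# Made-up logits for two samples with three classes.
example_loss = batch_cross_entropy([[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]], [0, 2])
print(f"example batch loss: {example_loss:.3f}")
```

This is why a freshly initialized 10-class classifier typically reports a loss near 2.3, and why values well below that indicate the model is learning.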
### Training Loop
## Example
    for epoch in range(10):  # Train for 10 epochs
        running_loss = 0.0
        for i, data in enumerate(trainloader, 0):
            # Get input data
            inputs, labels = data
            # Zero the gradients accumulated in the previous step
            optimizer.zero_grad()
            # Forward propagation
            outputs = net(inputs)
            # Calculate loss
            loss = criterion(outputs, labels)
            # Backward propagation
            loss.backward()
            # Update weights
            optimizer.step()
            # Print statistics
            running_loss += loss.item()
            if i % 2000 == 1999:  # Print every 2000 mini-batches
                print(f'[{epoch + 1}, {i + 1:5d}] loss: {running_loss / 2000:.3f}')
                running_loss = 0.0

    print('Finished Training')
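
To show what a single `optimizer.step()` does to one weight, here is a minimal plain-Python sketch of SGD with momentum. It assumes the settings used above (`lr=0.001`, `momentum=0.9`) with no dampening, weight decay, or Nesterov momentum, and it minimizes the toy function f(w) = w², whose gradient is 2w. The function name `sgd_momentum_step` is hypothetical.

```python
def sgd_momentum_step(param, grad, velocity, lr=0.001, momentum=0.9):
    """One update: accumulate the gradient into a velocity, then step along it."""
    velocity = momentum * velocity + grad
    param = param - lr * velocity
    return param, velocity

w, v = 1.0, 0.0          # initial weight and zero velocity
history = []
for step in range(3):
    grad = 2 * w         # gradient of f(w) = w**2
    w, v = sgd_momentum_step(w, grad, v)
    history.append(w)
    print(f"step {step + 1}: w={w:.6f}  velocity={v:.6f}")
```

Because the velocity carries over between steps, consecutive gradients that point the same way produce progressively larger updates than plain SGD would.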
* * *
## Model Evaluation
### Test Set Accuracy Calculation
## Example
    correct = 0
    total = 0
    with torch.no_grad():  # No gradient calculation needed for evaluation
        for data in testloader:
            images, labels = data
            outputs = net(images)
            # The predicted class is the index of the largest score in each row
            _, predicted = torch.max(outputs, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()

    print(f'Accuracy on test images: {100 * correct / total:.2f}%')
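
The accuracy computation boils down to an argmax over each row of scores followed by a comparison with the labels. The sketch below reproduces that logic in plain Python on a made-up batch of logits; the helper names and the numbers are hypothetical and stand in for `torch.max(outputs, 1)` and the tensor comparison.

```python
def argmax(values):
    """Index of the largest element in a list."""
    return max(range(len(values)), key=values.__getitem__)

def accuracy(batch_logits, labels):
    """Percentage of samples whose highest-scoring class matches the label."""
    predictions = [argmax(row) for row in batch_logits]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return 100 * correct / len(labels)

# Four samples, three classes; the last sample is misclassified.
logits = [
    [0.1, 2.0, -1.0],
    [1.5, 0.2, 0.3],
    [0.0, 0.1, 0.9],
    [0.4, 0.3, 0.2],
]
labels = [1, 0, 2, 1]
print(f"accuracy: {accuracy(logits, labels):.2f}%")
```

For reference, CIFAR-10 has 10 equally represented classes in its test set, so random guessing would score about 10%.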
### Per-Class Accuracy Analysis
## Example
    class_correct = [0] * 10
    class_total = [0] * 10
    with torch.no_grad():
        for data in testloader:
            images, labels = data
            outputs = net(images)
            _, predicted = torch.max(outputs, 1)
            c = (predicted == labels)
            # Tally each sample in the batch under its true class
            for i in range(len(labels)):
                label = labels[i].item()
                class_correct[label] += c[i].item()
                class_total[label] += 1

    for i in range(10):
        print(f'Accuracy of {classes[i]:5s}: {100 * class_correct[i] / class_total[i]:.2f}%')
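
Per-class accuracy only says how often each class is right, not what it gets confused with. One possible extension, sketched here in plain Python with made-up predictions, is a confusion matrix whose rows are true classes and whose columns are predicted classes; per-class accuracy is then the diagonal entry divided by the row sum. The function names are hypothetical.

```python
def confusion_matrix(predictions, labels, num_classes):
    """matrix[true][predicted] counts how often each pairing occurs."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for predicted, true in zip(predictions, labels):
        matrix[true][predicted] += 1
    return matrix

def per_class_accuracy(matrix):
    """Diagonal count over row total, as a percentage (None for empty rows)."""
    return [100 * row[i] / sum(row) if sum(row) else None
            for i, row in enumerate(matrix)]

# Six samples, three classes (illustrative values only).
predictions = [0, 1, 1, 2, 0, 2]
labels = [0, 1, 2, 2, 0, 1]
matrix = confusion_matrix(predictions, labels, num_classes=3)
for row in matrix:
    print(row)
print(per_class_accuracy(matrix))
```

Off-diagonal entries point at specific mix-ups, which for CIFAR-10 could for instance reveal whether cats are mostly mistaken for dogs.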
* * *
## Model Saving and Loading
### Saving the Trained Model
## Example
    # Save only the model parameters (the state dict), not the whole model object
    PATH = './cifar_net.pth'
    torch.save(net.state_dict(), PATH)
### Loading Model for Prediction
## Example
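    # Rebuild the architecture, then load the saved parameters into it
    net = Net()
    net.load_state_dict(torch.load(PATH))
    net.eval()  # Switch to inference mode

    # Predict on a batch; `images` here is the last batch from the evaluation loop
    with torch.no_grad():
        outputs = net(images)
    _, predicted = torch.max(outputs, 1)
    print('Predicted: ', ' '.join(f'{classes[predicted[j]]:5s}' for j in range(len(predicted))))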
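
When applying the model, it can help to report a confidence alongside the predicted label. The following plain-Python sketch converts one row of raw scores into probabilities with softmax and lists the top candidates. The logits are invented for illustration, and `softmax` and `top_k` are hypothetical helpers rather than PyTorch functions.

```python
import math

classes = ('plane', 'car', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck')

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    max_logit = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - max_logit) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k(logits, class_names, k=3):
    """Return the k most probable (class name, probability) pairs."""
    ranked = sorted(zip(class_names, softmax(logits)),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Hypothetical output scores of the network for a single image.
logits = [0.2, -1.0, 0.5, 3.0, 0.1, 2.0, -0.5, 0.0, -2.0, 0.3]
for name, prob in top_k(logits, classes):
    print(f"{name:5s} {prob:.3f}")
```

Softmax does not change which class ranks first, so the top entry always agrees with the argmax used in the prediction code.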