PyTorch Linear Regression
Linear regression is one of the most basic machine learning algorithms. It predicts a continuous value by fitting a linear function to the input features.
For a linear regression problem with $n$ input features, the model can be expressed as:

$$y = w_{1} x_{1} + w_{2} x_{2} + \dots + w_{n} x_{n} + b$$

* $y$ is the predicted value (target value).
* $x_{1}, x_{2}, \dots, x_{n}$ are the input features.
* $w_{1}, w_{2}, \dots, w_{n}$ are the weights to be learned (model parameters).
* $b$ is the bias term.
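
As a small illustration of the model definition above, the following sketch evaluates the linear function for a single sample with two features. The numeric values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import torch

# Hypothetical values, chosen only for illustration
w = torch.tensor([2.0, 3.0])   # weights w_1, w_2
b = 4.0                        # bias term
x = torch.tensor([1.0, -1.0])  # one sample with features x_1, x_2

# Term-by-term sum: w_1 * x_1 + w_2 * x_2 + b
y_manual = w[0] * x[0] + w[1] * x[1] + b

# The same computation written as a dot product
y_dot = torch.dot(w, x) + b

print(y_manual.item(), y_dot.item())  # prints 3.0 3.0
```

Both forms give 2 - 3 + 4 = 3. The dot-product form is the one that generalizes to any number of features.
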
In PyTorch, a linear regression model can be implemented by inheriting from the `nn.Module` class. We will explain in detail how to implement a linear regression model in PyTorch through a simple example.
* * *
## Data Preparation
First, we prepare some synthetic data for training the linear regression model. We generate a dataset with a simple linear relationship, where each sample has two features, $x_{1}$ and $x_{2}$.
## Example
    import torch
    import numpy as np
    import matplotlib.pyplot as plt

    # Random seed to ensure consistent results each run
    torch.manual_seed(42)

    # Generate training data
    X = torch.randn(100, 2)  # 100 samples, each with 2 features
    true_w = torch.tensor([2.0, 3.0])  # Assumed true weights
    true_b = 4.0  # True bias term
    Y = X @ true_w + true_b + torch.randn(100) * 0.1  # Add some noise

    # Print partial data
    print(X[:5])
    print(Y[:5])
The output is as follows:

    tensor([[ 1.9269,  1.4873],
            [ 0.9007, -2.1055],
            [ 0.6784, -1.2345],
            [-0.0431, -1.6047],
            [-0.7521,  1.6487]])
    tensor([12.4460, -0.4663,  1.7666, -0.9357,  7.4781])

This code creates a linear dataset with noise.
* Input `X` is a 100×2 matrix: 100 samples with two features each.
* Output Y is generated from the true weights and bias, plus some random noise.
* Using `torch.manual_seed(42)` ensures consistent results each run, facilitating debugging and reproducibility.
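
One way to sanity-check a synthetic dataset like this is to confirm its shapes and then fit it with ordinary least squares, which should recover values close to the true weights and bias. The sketch below is not part of the original tutorial. It assumes `torch.linalg.lstsq` is available (recent PyTorch versions), and the name `A` for the design matrix is introduced here only for illustration.

```python
import torch

torch.manual_seed(42)
X = torch.randn(100, 2)
true_w = torch.tensor([2.0, 3.0])
true_b = 4.0
Y = X @ true_w + true_b + torch.randn(100) * 0.1

print(X.shape, Y.shape)  # torch.Size([100, 2]) torch.Size([100])

# Append a column of ones so the bias is estimated as a third coefficient
A = torch.cat([X, torch.ones(100, 1)], dim=1)  # shape (100, 3)

# Closed-form least-squares fit, used here only as a sanity check
solution = torch.linalg.lstsq(A, Y.unsqueeze(1)).solution.squeeze()

print(solution)  # expected to be close to [2, 3, 4], up to the added noise
```

If the recovered coefficients were far from `true_w` and `true_b`, that would point to a mistake in how the data was generated rather than in the model.
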
* * *
## Define Linear Regression Model
We can define a simple linear regression model by inheriting from `nn.Module`. In PyTorch, the core of linear regression is the `nn.Linear()` layer, which automatically handles weight and bias initialization.
## Example
    import torch.nn as nn

    # Define linear regression model
    class LinearRegressionModel(nn.Module):
        def __init__(self):
            super().__init__()
            # Define a linear layer: 2 input features, 1 predicted value
            self.linear = nn.Linear(2, 1)

        def forward(self, x):
            return self.linear(x)  # Forward propagation, return predicted result

    # Create model instance
    model = LinearRegressionModel()
Here, `nn.Linear(2, 1)` represents a linear layer with 2 input features and 1 output. The `forward` method defines how to perform forward propagation through this layer.
> **Note:** `nn.Linear` automatically creates the weight matrix and bias vector, so there is no need to define them manually.
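
To make the note above concrete, the following sketch inspects the parameters that `nn.Linear(2, 1)` creates and checks that the layer computes `X @ W.T + b`. The variable names `layer`, `out`, and `manual` are chosen here for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(42)
layer = nn.Linear(2, 1)   # 2 input features, 1 output
X = torch.randn(100, 2)

print(layer.weight.shape)  # torch.Size([1, 2])
print(layer.bias.shape)    # torch.Size([1])

out = layer(X)
print(out.shape)           # torch.Size([100, 1])

# The layer applies the affine map X @ W^T + b
manual = X @ layer.weight.T + layer.bias
matches = torch.allclose(out, manual, atol=1e-6)
print(matches)
```

Note that the output has shape `(100, 1)`, not `(100,)`. This matters later when the predictions are compared against a one-dimensional target.
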
* * *
## Define Loss Function and Optimizer
The common loss function for linear regression is **Mean Squared Error Loss (MSELoss)**, which measures the difference between predicted and true values. PyTorch provides it ready-made as the `nn.MSELoss` class.
We will use the **SGD (Stochastic Gradient Descent)** optimizer to minimize the loss function; **Adam** is shown as a commented-out alternative.
## Example
    # Loss function (Mean Squared Error)
    criterion = nn.MSELoss()

    # Optimizer (using SGD or Adam)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # Learning rate set to 0.01

    # Can also use Adam optimizer
    # optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
| Component | Description |
| --- | --- |
| `MSELoss` | Calculates the mean squared error between predicted and true values: $\frac{1}{n} \sum \left( y_{\text{pred}} - y_{\text{true}} \right)^{2}$ |
| `SGD` | Uses stochastic gradient descent to update parameters, learning rate controls the step size |
| `Adam` | Adaptive learning rate optimizer, usually converges faster |
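
The two core pieces of the table can be checked by hand on tiny tensors. The sketch below uses hypothetical values: it first compares a manual mean squared error with `nn.MSELoss`, then verifies that one plain SGD step (no momentum or weight decay) equals `w - lr * grad`.

```python
import torch
import torch.nn as nn

# Mean squared error by hand and with nn.MSELoss
y_pred = torch.tensor([2.5, 0.0, 2.0])
y_true = torch.tensor([3.0, -0.5, 2.0])
mse_manual = ((y_pred - y_true) ** 2).mean()   # (0.25 + 0.25 + 0) / 3
mse_torch = nn.MSELoss()(y_pred, y_true)
print(mse_manual.item(), mse_torch.item())

# One SGD step on a toy loss whose gradient is 2 * w
w = torch.tensor([1.0, -1.0], requires_grad=True)
optimizer = torch.optim.SGD([w], lr=0.01)
loss = (w ** 2).sum()
loss.backward()
expected = w.detach() - 0.01 * w.grad          # w - lr * grad
optimizer.step()
step_matches = torch.allclose(w.detach(), expected)
print(w.detach(), step_matches)
```

Adam applies the same gradient but rescales each step using running averages of past gradients, so its update cannot be reproduced with this single line.
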
* * *
## Train the Model
During training, we will perform the following steps:
1. Perform forward propagation with input data X to get predicted values.
2. Calculate loss (difference between predicted and actual values).
3. Use backpropagation to compute gradients.
4. Update model parameters (weights and bias).
We will train the model for 1000 epochs and print the loss every 100 epochs.
## Example
    # Train the model
    num_epochs = 1000  # Train for 1000 epochs
    for epoch in range(num_epochs):
        model.train()  # Set model to training mode

        # Forward propagation
        predictions = model(X)  # Model outputs predicted values
        loss = criterion(predictions.squeeze(), Y)  # Squeeze predictions to 1D to match Y

        # Backward propagation
        optimizer.zero_grad()  # Clear previous gradients
        loss.backward()  # Compute gradients
        optimizer.step()  # Update model parameters

        # Print loss
        if (epoch + 1) % 100 == 0:
            print(f'Epoch [{epoch + 1}/{num_epochs}], Loss: {loss.item():.4f}')
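* `predictions.squeeze()`: Removes the size-1 dimension so the model output goes from shape `(100, 1)` to `(100,)`, matching the one-dimensional target `Y`. Without it, `MSELoss` would broadcast the two shapes against each other and compute the wrong loss.
* `optimizer.zero_grad()`: Clears the gradients from the previous iteration. It must run before each `loss.backward()` call, because PyTorch otherwise adds the new gradients to the existing ones.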
* `loss.backward()`: Automatically computes gradients for all trainable parameters.
* `optimizer.step()`: Updates weights and bias based on the computed gradients.
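
The shape and gradient-accumulation points in the list above can be observed directly. This is a minimal sketch on a small hypothetical dataset of 4 samples, using a bare `nn.Linear` layer in place of the tutorial's model class.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(2, 1)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
X = torch.randn(4, 2)
Y = torch.randn(4)

# Shapes: the raw output is (4, 1); squeeze() makes it (4,) like Y
pred = model(X)
print(pred.shape, pred.squeeze().shape, Y.shape)

# First backward pass
criterion(model(X).squeeze(), Y).backward()
grad_once = model.weight.grad.clone()

# Second backward pass WITHOUT zero_grad: gradients are summed
criterion(model(X).squeeze(), Y).backward()
grad_twice = model.weight.grad.clone()
accumulated = torch.allclose(grad_twice, 2 * grad_once, atol=1e-6)

# After zero_grad, a backward pass yields the single-pass gradient again
optimizer.zero_grad()
criterion(model(X).squeeze(), Y).backward()
grad_fresh = model.weight.grad.clone()
reset_ok = torch.allclose(grad_fresh, grad_once, atol=1e-6)

print(accumulated, reset_ok)
```

Because no `optimizer.step()` is called, the parameters stay fixed and the only thing that changes between passes is whether the old gradient was cleared.
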
> **Training Mode vs. Evaluation Mode:** Calling `model.train()` in the training loop has no effect in this example, because the model contains no Dropout or BatchNorm layers. It is still a good habit, since those layers behave differently in training and evaluation mode in more complex models.
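
To show what the mode switch changes, the sketch below adds a Dropout layer to a small model. The Dropout layer is an assumption made only for this illustration; the tutorial's model does not contain one.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# Dropout is added here only for illustration
model = nn.Sequential(nn.Linear(2, 1), nn.Dropout(p=0.5))
x = torch.ones(4, 2)

model.train()
print(model.training)  # True: Dropout randomly zeroes some outputs

model.eval()
print(model.training)  # False: Dropout passes values through unchanged

# In evaluation mode the full model matches the linear layer alone
with torch.no_grad():
    same = torch.equal(model(x), model[0](x))
print(same)  # True
```
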
* * *
## Evaluate the Model
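After training, we can evaluate the model by comparing its learned weights and bias with the true values used to generate the data (`true_w = [2.0, 3.0]`, `true_b = 4.0`). We can also make predictions on new data and compare them with the actual values.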
## Example
    # View trained weights and bias
    print(f'Predicted weight: {model.linear.weight.detach().numpy()}')
    print(f'Predicted bias: {model.linear.bias.detach().numpy()}')

    # Make predictions (here on the training inputs X)
    with torch.no_grad():  # No need to compute gradients during evaluation
        predictions = model(X)

    # Visualize predictions
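
One possible way to compare predictions with actual values on new data is sketched below. It is self-contained, so it retrains a small model first. The helper `make_data`, the held-out set of 50 samples, the predicted-versus-actual scatter plot, and the output file name are all assumptions introduced here, not taken from the original tutorial.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt
import torch
import torch.nn as nn

torch.manual_seed(42)
true_w, true_b = torch.tensor([2.0, 3.0]), 4.0

def make_data(n):
    """Draw n samples from the same noisy linear relationship (hypothetical helper)."""
    X = torch.randn(n, 2)
    Y = X @ true_w + true_b + torch.randn(n) * 0.1
    return X, Y

X_train, Y_train = make_data(100)
X_new, Y_new = make_data(50)  # new samples, never used for training

model = nn.Linear(2, 1)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
for _ in range(1000):
    optimizer.zero_grad()
    loss = criterion(model(X_train).squeeze(), Y_train)
    loss.backward()
    optimizer.step()

# Evaluate on the new samples without tracking gradients
model.eval()
with torch.no_grad():
    preds_new = model(X_new).squeeze()
    test_mse = criterion(preds_new, Y_new).item()
print(f"MSE on new data: {test_mse:.4f}")

# Predicted vs. actual: points near the diagonal indicate a good fit
fig, ax = plt.subplots()
ax.scatter(Y_new.numpy(), preds_new.numpy(), label="new samples")
lims = [Y_new.min().item(), Y_new.max().item()]
ax.plot(lims, lims, color="red", label="perfect prediction")
ax.set_xlabel("Actual value")
ax.set_ylabel("Predicted value")
ax.legend()
fig.savefig("predicted_vs_actual.png")  # hypothetical output file name
```

With noise of standard deviation 0.1, the error on new data should settle near the noise variance of about 0.01 rather than reach zero.
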