PyTorch torch.nn Reference Manual
torch.nn.Dropout is a PyTorch module used for regularization.
During training it randomly zeroes input elements with probability p and scales the surviving elements by 1/(1-p), which reduces co-adaptation between neurons and helps limit overfitting.
Function Definition
torch.nn.Dropout(p=0.5, inplace=False)
Parameter Description:
- p (float): The probability of an element being zeroed. Must be between 0 and 1. Default is 0.5.
- inplace (bool): Whether to perform the operation in-place. Default is False.
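To make the role of p concrete, the sketch below reimplements the training-time behavior by hand. The helper name manual_dropout is hypothetical and is only meant to illustrate the idea (zero each element with probability p, then rescale the survivors by 1/(1-p)); it is not PyTorch's actual implementation.

```python
import torch

def manual_dropout(x: torch.Tensor, p: float = 0.5, training: bool = True) -> torch.Tensor:
    """Hypothetical helper illustrating inverted dropout (not PyTorch's own code)."""
    if not training or p == 0.0:
        return x  # evaluation mode or p=0: pass the input through unchanged
    if p == 1.0:
        return torch.zeros_like(x)  # everything is dropped
    # Keep each element independently with probability 1 - p
    keep_mask = (torch.rand_like(x) >= p).to(x.dtype)
    # Rescale survivors so the expected value of each element is unchanged
    return x * keep_mask / (1.0 - p)

torch.manual_seed(0)
x = torch.ones(10000)
y = manual_dropout(x, p=0.25)

survivors = y[y != 0]
mean_value = y.mean().item()
print("Value of surviving elements:", survivors[0].item())
print("Mean of output:", mean_value)
```

With p=0.25, every surviving element becomes 1/0.75, and the mean of the output stays close to the mean of the input.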
Usage Examples
Example 1: Basic Usage
Create and use a Dropout layer:
import torch
import torch.nn as nn
# Create Dropout layer with 50% dropout rate
dropout = nn.Dropout(p=0.5)
# Training mode (Dropout active)
dropout.train()
# Create input
input_tensor = torch.ones(1,10)
print("Input:", input_tensor.squeeze().tolist())
# Forward pass multiple times to observe randomness
for i in range(3):
    output = dropout(input_tensor)
    print(f"Output {i+1}:", output.squeeze().tolist())
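Each call zeroes a different random subset of the elements, about half of them on average. The surviving elements are scaled to 2.0, that is 1/(1-p), so the expected value of each element is unchanged.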
Example 2: Training vs Evaluation Mode
Dropout behaves differently in training and evaluation modes:
import torch
import torch.nn as nn
dropout = nn.Dropout(p=0.5)
# Training mode
dropout.train()
train_output = dropout(torch.ones(4,10))
print("Training mode - Activation ratio:", (train_output != 0).float().mean().item())
# Evaluation mode
dropout.eval()
eval_output = dropout(torch.ones(4,10))
print("Evaluation mode - Activation ratio:", (eval_output != 0).float().mean().item())
print("Evaluation mode output:", eval_output.tolist())
In evaluation mode, Dropout is a no-op: the output equals the input, so the activation ratio is 1.0.
Example 3: Using in Neural Networks
A typical fully connected network with Dropout:
import torch
import torch.nn as nn
class DropoutNet(nn.Module):
    def __init__(self, input_dim=784, hidden_dim=256, output_dim=10, dropout_rate=0.5):
        super().__init__()
        self.fc1 = nn.Linear(input_dim, hidden_dim)
        self.dropout1 = nn.Dropout(p=dropout_rate)
        self.fc2 = nn.Linear(hidden_dim, hidden_dim)
        self.dropout2 = nn.Dropout(p=dropout_rate)
        self.fc3 = nn.Linear(hidden_dim, output_dim)
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.relu(self.fc1(x))
        x = self.dropout1(x)  # First Dropout
        x = self.relu(self.fc2(x))
        x = self.dropout2(x)  # Second Dropout
        x = self.fc3(x)
        return x

model = DropoutNet()
# Training mode
model.train()
input_data = torch.randn(32, 784)
output = model(input_data)
print("Training mode output shape:", output.shape)
# Evaluation mode
model.eval()
output = model(input_data)
print("Evaluation mode output shape:", output.shape)
Example 4: Using Dropout2d in CNNs
nn.Dropout2d zeroes entire channels (whole feature maps) instead of individual elements:
import torch
import torch.nn as nn
# Dropout2d drops entire channels
dropout2d = nn.Dropout2d(p=0.5)
# Input: batch=1, channels=4, height=4, width=4
input_tensor = torch.ones(1, 4, 4, 4)
dropout2d.train()
output = dropout2d(input_tensor)
print("Dropout2d output shape:", output.shape)
print("Number of non-zero channels:", (output.sum(dim=(2, 3)) != 0).sum().item())
Example 5: Effect of Different Dropout Rates
The fraction of elements that survive tracks 1 - p:
import torch
import torch.nn as nn
for p in [0.1, 0.3, 0.5, 0.7]:
    dropout = nn.Dropout(p=p)
    dropout.train()
    # Run multiple times and average
    total_active = 0
    for _ in range(100):
        output = dropout(torch.ones(1000))
        total_active += (output != 0).float().sum().item()
    avg_active = total_active / 100 / 1000
    print(f"p={p} - Average activation ratio: {avg_active:.2%} (Expected: {1-p:.2%})")
Comparison of Dropout Types
| Type | Dropout Method | Applicable Scenarios |
|---|---|---|
| nn.Dropout | Randomly zeroes individual elements | Fully connected layers, feature vectors |
| nn.Dropout2d | Randomly zeroes entire channels | Convolutional layer feature maps |
| nn.Dropout3d | Randomly zeroes entire 3D channels | 3D convolutional features |
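The difference between element-wise and channel-wise dropout can be checked directly. The sketch below applies nn.Dropout and nn.Dropout2d to the same 4D tensor of ones and tests whether each feature map is uniform (all zeros or all the same scaled value). The helper is_uniform and the tensor shape are choices made for this illustration only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.ones(8, 16, 5, 5)  # batch, channels, height, width

elementwise = nn.Dropout(p=0.5).train()
channelwise = nn.Dropout2d(p=0.5).train()

out_elem = elementwise(x)
out_chan = channelwise(x)

def is_uniform(feature_maps: torch.Tensor) -> torch.Tensor:
    """Return a (batch, channels) bool tensor: True where a feature map holds a single value."""
    flat = feature_maps.flatten(start_dim=2)
    return (flat == flat[..., :1]).all(dim=-1)

uniform_elem = is_uniform(out_elem)
uniform_chan = is_uniform(out_chan)

print("Uniform feature maps with nn.Dropout:  ", uniform_elem.sum().item(), "of", uniform_elem.numel())
print("Uniform feature maps with nn.Dropout2d:", uniform_chan.sum().item(), "of", uniform_chan.numel())
```

With nn.Dropout2d every feature map is either kept whole or dropped whole, whereas nn.Dropout leaves a mix of zeros and scaled values inside each map.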
Common Questions
Q1: How to choose the Dropout rate?
- 0.1-0.3: Light regularization, suitable for large datasets
- 0.4-0.5: Common default value
- 0.5+: Stronger regularization, suitable for small datasets
Q2: Where should Dropout be placed?
Usually placed after the activation function that follows a fully connected layer. Placing it before the activation function is also possible.
Q3: Should Dropout be turned off during evaluation?
Yes. Call model.eval() before evaluation; it switches every Dropout layer in the model to pass-through mode. Call model.train() to re-enable Dropout before resuming training.
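A short sketch of what model.eval() does, assuming a small nn.Sequential model built only for this illustration: it flips the training flag on every submodule, which is the flag nn.Dropout checks. Note that torch.no_grad() only disables gradient tracking; it does not turn off Dropout by itself.

```python
import torch
import torch.nn as nn

# Illustrative model; layer sizes are arbitrary
model = nn.Sequential(
    nn.Linear(8, 16),
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(16, 2),
)
x = torch.randn(4, 8)

model.train()
train_flags = [m.training for m in model.modules()]

model.eval()
eval_flags = [m.training for m in model.modules()]

# In evaluation mode, repeated forward passes give identical results
with torch.no_grad():
    first = model(x)
    second = model(x)
deterministic = torch.equal(first, second)

print("training flags after model.train():", train_flags)
print("training flags after model.eval(): ", eval_flags)
print("Repeated eval passes identical:", deterministic)
```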
Usage Scenarios
nn.Dropout is mainly used in the following scenarios:
- Preventing Overfitting: Reduces dependency between neurons
- Model Ensembling: Approximates the effect of multiple networks
- Fully Connected Layers: Most commonly used in FC layers
- Feature Dropping: Improves model robustness
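The model ensembling view listed above can be illustrated numerically. In the sketch below, each training-mode pass samples a different "thinned" network, and the average over many such passes is compared with a single evaluation-mode pass. The layer sizes and sample count are arbitrary choices for this example. The match is close here because the layer after Dropout is linear; for deeper nonlinear networks the evaluation-mode pass is only an approximation of the ensemble average.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
dropout = nn.Dropout(p=0.5)
linear = nn.Linear(20, 1)
x = torch.randn(1, 20)

with torch.no_grad():
    # Average the predictions of many randomly thinned networks
    dropout.train()
    samples = torch.stack([linear(dropout(x)) for _ in range(5000)])
    averaged = samples.mean(dim=0)

    # Single deterministic pass with Dropout disabled
    dropout.eval()
    single_pass = linear(dropout(x))

gap = (averaged - single_pass).abs().item()
print("Average of stochastic passes:", averaged.item())
print("Single evaluation-mode pass: ", single_pass.item())
print("Absolute gap:", gap)
```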
Note: Dropout should be active during training (model.train()) and disabled during testing (model.eval()); otherwise, predictions will vary randomly from one call to the next.