PyTorch torch.nn.TransformerEncoderLayer

📅 2026-06-22 | 📂 PyTorch

# PyTorch torch.nn.TransformerEncoderLayer Function

* * *

`nn.TransformerEncoderLayer` is a single layer of the Transformer encoder. It consists of a multi-head self-attention block followed by a feed-forward network, each wrapped in a residual connection and layer normalization, and it is the basic building block of a complete encoder.

### Function Definition

```python
torch.nn.TransformerEncoderLayer(d_model, nhead, dim_feedforward=2048, dropout=0.1, activation='relu', batch_first=False)
```

The signature above lists the most commonly used arguments. Note that the defaults are `activation='relu'` and `batch_first=False`.

### Parameters

* `d_model`: Model dimension, i.e. the feature size of the input and output. It must be divisible by `nhead`.
* `nhead`: Number of attention heads
* `dim_feedforward`: Hidden dimension of the feed-forward network (FFN), default `2048`
* `dropout`: Dropout probability, default `0.1`
* `activation`: Activation used in the FFN, either `'relu'` (default) or `'gelu'`, or a callable
* `batch_first`: If `True`, input and output tensors have shape `(batch, seq, feature)`; with the default `False` they have shape `(seq, batch, feature)`

* * *

## Usage Examples

### Example 1: Basic Usage

```python
import torch
import torch.nn as nn

# Encoder layer; batch_first=True means inputs are (batch, seq, feature)
layer = nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True)

# Input: batch of 32 sequences, 100 tokens each, 512 features per token
x = torch.randn(32, 100, 512)
output = layer(x)
print("Input:", x.shape, "-> Output:", output.shape)
```

The output has the same shape as the input.

### Example 2: Custom Activation Function

```python
import torch
import torch.nn as nn

# Use GELU instead of the default ReLU activation
layer_gelu = nn.TransformerEncoderLayer(d_model=256, nhead=4, activation='gelu', batch_first=True)

x = torch.randn(8, 50, 256)
out = layer_gelu(x)
print("GELU activation output:", out.shape)
```

### Example 3: Multi-layer Stacking

```python
import torch
import torch.nn as nn

# Stack multiple layers; each layer has its own independent weights.
# nn.Sequential passes only the input tensor along, so no masks can be used here.
layer = nn.TransformerEncoderLayer(256, 4, batch_first=True)
encoder = nn.Sequential(
    layer,
    nn.TransformerEncoderLayer(256, 4, batch_first=True),
    nn.TransformerEncoderLayer(256, 4, batch_first=True)
)

x = torch.randn(4, 30, 256)
out = encoder(x)
print("3-layer encoder output:", out.shape)
```

* * *

## Use Cases

* **Building encoders**
* **BERT and other models**
* **Text understanding**
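For the encoder-building use case, layers can also be stacked with `nn.TransformerEncoder` instead of `nn.Sequential`, which additionally lets a padding mask reach every layer. The following is a minimal sketch, not a definitive recipe: the dimensions, the two-sequence batch, and the mask values are illustrative assumptions rather than values from the examples above.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# One encoder layer; batch_first=True means inputs are (batch, seq, feature).
layer = nn.TransformerEncoderLayer(
    d_model=64, nhead=4, dim_feedforward=128, dropout=0.1, batch_first=True
)

# nn.TransformerEncoder deep-copies `layer` num_layers times,
# so the three stacked layers do not share weights.
encoder = nn.TransformerEncoder(layer, num_layers=3)
encoder.eval()  # disable dropout for deterministic inference

# Hypothetical batch: 2 sequences padded to length 5.
x = torch.randn(2, 5, 64)

# True marks padding positions that attention should ignore.
# Here the first sequence has 3 real tokens, the second has 5.
padding_mask = torch.tensor([
    [False, False, False, True, True],
    [False, False, False, False, False],
])

with torch.no_grad():
    out = encoder(x, src_key_padding_mask=padding_mask)

print("Output shape:", out.shape)  # torch.Size([2, 5, 64])
```

Unlike `nn.Sequential`, `nn.TransformerEncoder` forwards `mask` and `src_key_padding_mask` to each layer, which matters for batches of variable-length text.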

YouTip

