
# PyTorch torch.nn.Linear

`torch.nn.Linear` is a PyTorch module that creates a fully connected layer (also known as a linear layer), which applies an affine transformation to its input. It is one of the most fundamental and commonly used layers in neural networks, mapping input features to an output feature space.

### Function Definition

```python
torch.nn.Linear(in_features, out_features, bias=True)
```

**Parameter Description:**

* `in_features` (int): The dimension of the input features, i.e., the number of features output by the previous layer.
* `out_features` (int): The dimension of the output features, i.e., the number of features output by this layer.
* `bias` (bool): Whether to add a bias term. Defaults to `True`. If set to `False`, the layer does not learn a bias.

**Attributes:**

* `weight` (Tensor): A learnable weight matrix with shape (out_features, in_features).
* `bias` (Tensor): A learnable bias vector with shape (out_features,). If `bias=False`, this attribute is `None`.

### Mathematical Principle

`nn.Linear` computes:

$$y = xA^T + b$$

Where:

* `x` is the input tensor with shape (..., in_features)
* `A` is the weight matrix with shape (out_features, in_features)
* `b` is the bias vector with shape (out_features,)
* `y` is the output tensor with shape (..., out_features)

The symbol `...` stands for any number of leading dimensions. The linear transformation is applied to the last dimension only, and the leading dimensions are preserved.
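The formula can be checked directly against a layer's own parameters. The following is a minimal sketch, with arbitrarily chosen sizes and an illustrative tolerance, that recomputes the output by hand and compares it with the module's result:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # fixed seed so the check is repeatable

layer = nn.Linear(in_features=4, out_features=3)
x = torch.randn(2, 5, 4)  # leading dimensions (2, 5) are arbitrary

# Manual computation of y = x A^T + b using the layer's own parameters
manual = x @ layer.weight.T + layer.bias

output = layer(x)
matches = torch.allclose(output, manual, atol=1e-6)

print("Output shape:", output.shape)
print("Matches manual computation:", matches)
```

Only the last dimension changes (4 becomes 3); the leading dimensions pass through untouched.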
* * *

## Usage Examples

### Example 1: Basic Usage

Create a simple fully connected layer that maps a 10-dimensional input to a 5-dimensional output:

```python
import torch
import torch.nn as nn

# Create linear layer: input 10 dimensions, output 5 dimensions
linear_layer = nn.Linear(in_features=10, out_features=5, bias=True)

# Print weight and bias shapes
print("Weight shape:", linear_layer.weight.shape)  # torch.Size([5, 10])
print("Bias shape:", linear_layer.bias.shape)      # torch.Size([5])

# Create input tensor: batch_size=3, feature dimension=10
input_tensor = torch.randn(3, 10)

# Forward propagation
output = linear_layer(input_tensor)

print("Input shape:", input_tensor.shape)  # torch.Size([3, 10])
print("Output shape:", output.shape)       # torch.Size([3, 5])
print("Output data:\n", output)
```

Output:

```
Weight shape: torch.Size([5, 10])
Bias shape: torch.Size([5])
Input shape: torch.Size([3, 10])
Output shape: torch.Size([3, 5])
Output data:
 tensor([[-0.1838,  0.0607, -0.4879,  0.8981, -0.2098],
        [ 0.1513, -0.1873,  0.1866, -0.2448, -0.6012],
        [ 0.2915,  0.3053,  0.2532, -0.3372, -0.3968]], grad_fn=<...>)
```

In this example, we created a fully connected layer with a 10-dimensional input and a 5-dimensional output. The input tensor has shape (3, 10), where 3 is the batch size and 10 is the feature dimension. The output tensor has shape (3, 5). The printed values differ from run to run because the weights and the input are random.
### Example 2: Without Bias

Create a linear layer without a bias term:

```python
import torch
import torch.nn as nn

# Create linear layer without bias
linear_no_bias = nn.Linear(in_features=10, out_features=5, bias=False)

# With bias=False the bias attribute is None
print("Bias is None:", linear_no_bias.bias is None)  # True

# Forward propagation
input_tensor = torch.randn(3, 10)
output = linear_no_bias(input_tensor)

print("Output shape:", output.shape)  # torch.Size([3, 5])
print("Output:\n", output)
```

Output:

```
Bias is None: True
Output shape: torch.Size([3, 5])
Output:
 tensor([[-0.3312, -0.4113,  0.0257, -0.4876,  0.0780],
        [ 0.1513,  0.2459, -0.2983,  0.2456, -0.0727],
        [-0.0143,  0.3053,  0.1866, -0.3372,  0.2532]], grad_fn=<...>)
```

### Example 3: Multi-dimensional Input

`nn.Linear` accepts inputs with any number of dimensions and transforms only the last one:

```python
import torch
import torch.nn as nn

# Create linear layer
linear = nn.Linear(in_features=10, out_features=5)

# Process 2D input (batch, features)
input_2d = torch.randn(8, 10)
output_2d = linear(input_2d)
print("2D input -> output:", input_2d.shape, "->", output_2d.shape)

# Process 3D input (batch, seq, features)
input_3d = torch.randn(4, 6, 10)
output_3d = linear(input_3d)
print("3D input -> output:", input_3d.shape, "->", output_3d.shape)

# Process 4D input (batch, dim1, dim2, features)
input_4d = torch.randn(2, 3, 4, 10)
output_4d = linear(input_4d)
print("4D input -> output:", input_4d.shape, "->", output_4d.shape)
```

Output:

```
2D input -> output: torch.Size([8, 10]) -> torch.Size([8, 5])
3D input -> output: torch.Size([4, 6, 10]) -> torch.Size([4, 6, 5])
4D input -> output: torch.Size([2, 3, 4, 10]) -> torch.Size([2, 3, 4, 5])
```

### Example 4: Using in Neural Networks

In real neural networks, `nn.Linear` is usually combined with other layers:

```python
import torch
import torch.nn as nn

# Define a Multi-Layer Perceptron (MLP)
class MLP(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim):
        super().__init__()
        # First layer: input -> hidden layer
        self.fc1 = nn.Linear(input_dim, hidden_dim)
        # Activation function
        self.relu = nn.ReLU()
        # Second layer: hidden layer -> output
        self.fc2 = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        return x

# Create model
model = MLP(input_dim=784, hidden_dim=256, output_dim=10)

# Print model structure
print("Model structure:")
print(model)

# Test forward propagation
input_tensor = torch.randn(32, 784)  # batch_size=32, 28x28=784
output = model(input_tensor)

print("\nInput shape:", input_tensor.shape)  # torch.Size([32, 784])
print("Output shape:", output.shape)         # torch.Size([32, 10])
```

Output:

```
Model structure:
MLP(
  (fc1): Linear(in_features=784, out_features=256, bias=True)
  (relu): ReLU()
  (fc2): Linear(in_features=256, out_features=10, bias=True)
)

Input shape: torch.Size([32, 784])
Output shape: torch.Size([32, 10])
```

* * *

## Weight Initialization

By default, `nn.Linear` initializes `weight` and `bias` itself when the layer is constructed, so a new layer can be used without any extra setup.
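According to the PyTorch documentation, the default scheme draws both parameters from a uniform distribution bounded by ±1/sqrt(in_features). The sketch below is a minimal check of that bound on a freshly created layer; the layer sizes are arbitrary choices for illustration:

```python
import math

import torch
import torch.nn as nn

in_features, out_features = 64, 32  # arbitrary sizes for illustration
layer = nn.Linear(in_features, out_features)

# Bound documented for the default initialization: 1 / sqrt(in_features)
bound = 1.0 / math.sqrt(in_features)

# Every freshly initialized value should lie inside [-bound, bound]
weight_in_range = bool(layer.weight.abs().max() <= bound)
bias_in_range = bool(layer.bias.abs().max() <= bound)

print("Bound:", bound)
print("Weight within bound:", weight_in_range)
print("Bias within bound:", bias_in_range)
```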
You can also initialize the parameters manually:

```python
import torch
import torch.nn as nn

# Create linear layer
linear = nn.Linear(10, 5)

# Initialize weights using Xavier initialization
nn.init.xavier_uniform_(linear.weight)

# Initialize bias to zero
nn.init.zeros_(linear.bias)

# View initialized weights
print("Weights:\n", linear.weight.data)
print("Bias:", linear.bias.data)
```

* * *

## Difference from nn.functional.linear

PyTorch also provides a functional interface, `torch.nn.functional.linear`:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Method 1: Using nn.Module
linear_module = nn.Linear(10, 5)
output1 = linear_module(torch.randn(3, 10))

# Method 2: Using functional interface
weight = torch.randn(5, 10)
bias = torch.randn(5)
output2 = F.linear(torch.randn(3, 10), weight, bias)

print("nn.Module output shape:", output1.shape)
print("nn.functional output shape:", output2.shape)
```

The differences between the two:

* `nn.Linear` is a module class that stores the weight and bias as parameters, which makes training and saving models convenient.
* `F.linear` is a stateless function: the weight and bias must be passed in on every call. It is useful when the weights are created or managed outside a module.

> Note: Both perform the same mathematical operation, but `nn.Linear` is usually preferred when building neural networks because it registers its parameters automatically, so the optimizer can find and update them.

* * *

## Common Questions

### Q1: How do I check the number of parameters in a Linear layer?

For a linear layer with (in_features, out_features):

* Number of weight parameters: in_features * out_features
* Number of bias parameters: out_features (if bias=True)

```python
import torch
import torch.nn as nn

linear = nn.Linear(100, 50)

print("Total parameters:", sum(p.numel() for p in linear.parameters()))  # 5050
print("Weight parameters:", linear.weight.numel())  # 5000
print("Bias parameters:", linear.bias.numel())      # 50
```

### Q2: How do I freeze Linear layer parameters?
If you want to fix certain layers so they don't participate in training, you can set `requires_grad=False` on their parameters:

```python
import torch
import torch.nn as nn

linear = nn.Linear(10, 5)

# Freeze weight so it doesn't participate in gradient calculation
linear.weight.requires_grad = False

# Freeze bias
linear.bias.requires_grad = False

# Verify
print("Weight requires_grad:", linear.weight.requires_grad)
print("Bias requires_grad:", linear.bias.requires_grad)
```

* * *

## Usage Scenarios

`nn.Linear` is one of the most commonly used layers in neural networks. Its main application scenarios include:

* **Multi-Layer Perceptron (MLP)**: As a fully connected layer, mapping features to new feature spaces.
* **Classifier**: At the end of convolutional or recurrent networks, using fully connected layers for classification.
* **Feature Transformation**: Performing linear transformations on data to achieve dimensionality reduction or expansion.
* **Attention Mechanism**: Used in Transformers to generate the Q, K, V matrices.
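As an illustration of the attention use case listed above, here is a minimal single-head sketch in which three `nn.Linear` layers produce Q, K and V from the same input. The names `embed_dim`, `q_proj`, `k_proj` and `v_proj` are hypothetical, the sizes are arbitrary, and masking and multiple heads are left out, so this is not how any particular Transformer implementation is written:

```python
import torch
import torch.nn as nn

embed_dim = 16                    # hypothetical model width
x = torch.randn(2, 5, embed_dim)  # (batch, seq_len, embed_dim)

# Three independent projections; the names are illustrative only
q_proj = nn.Linear(embed_dim, embed_dim, bias=False)
k_proj = nn.Linear(embed_dim, embed_dim, bias=False)
v_proj = nn.Linear(embed_dim, embed_dim, bias=False)

# Each projection acts on the last dimension, so the shapes stay (2, 5, 16)
q, k, v = q_proj(x), k_proj(x), v_proj(x)

# Scaled dot-product attention over the sequence dimension
scores = q @ k.transpose(-2, -1) / embed_dim ** 0.5  # (2, 5, 5)
weights = scores.softmax(dim=-1)                     # each row sums to 1
attended = weights @ v                               # (2, 5, 16)

print("Attention weights shape:", weights.shape)
print("Attended output shape:", attended.shape)
```

Because `nn.Linear` transforms only the last dimension, the same three layers work unchanged for any batch size or sequence length.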
← Pytorch Torch Nn MselossPytorch Torch Nn Layernorm β†’