## PyTorch torch.nn.Bilinear
`torch.nn.Bilinear` is a built-in layer in PyTorch that applies a bilinear transformation to two incoming inputs. It is widely used for feature fusion, multi-modal learning, and modeling complex interactions between two distinct feature spaces.
---
## Mathematical Formulation
The bilinear layer applies the following transformation to the inputs $x_1$ and $x_2$:
$$y_i = x_1^T W_i x_2 + b_i$$
Where:
* $x_1$ is the first input tensor of shape $(*, H_{in1})$.
* $x_2$ is the second input tensor of shape $(*, H_{in2})$. All dimensions except the last must match those of $x_1$.
* $W_i$ is the weight matrix of shape $(H_{in1}, H_{in2})$ corresponding to the $i$-th output feature.
* $b_i$ is the bias term for the $i$-th output feature.
* $y_i$ is the $i$-th element of the output vector.
The complete learnable weight tensor $W$ has the shape $(H_{out}, H_{in1}, H_{in2})$, and the bias tensor $b$ has the shape $(H_{out})$.
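As a quick sanity check of the formula, the sketch below recomputes the layer output with `torch.einsum` and compares it with the module's own forward pass. The sizes (3, 4, 2) and the batch size of 5 are arbitrary values chosen for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Small, arbitrary sizes: H_in1 = 3, H_in2 = 4, H_out = 2
layer = nn.Bilinear(3, 4, 2)
x1 = torch.randn(5, 3)
x2 = torch.randn(5, 4)

# y[b, o] = sum over i, j of x1[b, i] * W[o, i, j] * x2[b, j], plus bias[o]
manual = torch.einsum("bi,oij,bj->bo", x1, layer.weight, x2) + layer.bias

# Compare against the built-in forward pass
matches = torch.allclose(layer(x1, x2), manual, atol=1e-5)
print("Manual result matches nn.Bilinear:", matches)
```

The index `o` in the einsum string plays the role of $i$ in the formula: each output feature has its own $(H_{in1}, H_{in2})$ slice of the weight tensor.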
---
## Syntax and Parameters
### Constructor
```python
torch.nn.Bilinear(in1_features, in2_features, out_features, bias=True, device=None, dtype=None)
```
### Parameters
| Parameter | Type | Description |
| :--- | :--- | :--- |
| `in1_features` | `int` | Size of each first input sample ($H_{in1}$). |
| `in2_features` | `int` | Size of each second input sample ($H_{in2}$). |
| `out_features` | `int` | Size of each output sample ($H_{out}$). |
| `bias` | `bool` | If set to `False`, the layer will not learn an additive bias. Default: `True`. |
| `device` | `torch.device` | Device on which the layer's parameters are created. Default: `None`. |
| `dtype` | `torch.dtype` | Data type of the layer's parameters. Default: `None`. |
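The optional arguments can be illustrated with a short sketch. The sizes (8, 6, 4) are arbitrary, and `float64` is chosen only to make the effect of `dtype` visible.

```python
import torch
import torch.nn as nn

# No additive bias, parameters stored in double precision
layer = nn.Bilinear(8, 6, 4, bias=False, dtype=torch.float64)

print("Bias       :", layer.bias)          # None when bias=False
print("Weight type:", layer.weight.dtype)  # torch.float64

# Inputs must use the same dtype as the parameters
x1 = torch.randn(2, 8, dtype=torch.float64)
x2 = torch.randn(2, 6, dtype=torch.float64)
out = layer(x1, x2)
print("Output     :", out.shape, out.dtype)
```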
---
## Code Examples
### Example 1: Basic Usage
This example demonstrates how to initialize a bilinear layer and pass two random input tensors through it.
```python
import torch
import torch.nn as nn
# Bilinear layer: two 100-dimensional inputs -> 50-dimensional output
bilinear = nn.Bilinear(100, 100, 50)
# Generate two random input tensors with batch size 4
x1 = torch.randn(4, 100)
x2 = torch.randn(4, 100)
# Forward pass
output = bilinear(x1, x2)
print("Input 1 shape:", x1.shape)
print("Input 2 shape:", x2.shape)
print("Output shape :", output.shape)
```
**Output:**
```text
Input 1 shape: torch.Size([4, 100])
Input 2 shape: torch.Size([4, 100])
Output shape : torch.Size([4, 50])
```
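Because the inputs have shape $(*, H_{in})$, the layer also accepts extra leading dimensions as long as they match between the two inputs. A minimal sketch, assuming a hypothetical (batch, sequence, features) layout:

```python
import torch
import torch.nn as nn

bilinear = nn.Bilinear(100, 100, 50)

# Two leading dimensions: batch size 2, sequence length 7
x1 = torch.randn(2, 7, 100)
x2 = torch.randn(2, 7, 100)

# The transformation is applied independently at each (batch, position) pair
output = bilinear(x1, x2)
print("Output shape:", output.shape)  # torch.Size([2, 7, 50])
```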
---
### Example 2: Feature Fusion in a Custom Module
Bilinear layers can fuse features from different modalities (e.g., combining text and image embeddings) because every output feature depends on pairwise products of the two inputs' components.
```python
import torch
import torch.nn as nn
# Custom network for bilinear feature fusion
class FusionNet(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        # Fuses two representations of size 'dim' into a single representation of size 'dim'
        self.bilinear = nn.Bilinear(dim, dim, dim)

    def forward(self, feat1, feat2):
        return self.bilinear(feat1, feat2)

# Instantiate the model
model = FusionNet(dim=128)
# Simulated feature vectors (e.g., batch size of 4, 128 features each)
f1 = torch.randn(4, 128)
f2 = torch.randn(4, 128)
# Perform fusion
output = model(f1, f2)
print("Fused Output shape:", output.shape)
```
**Output:**
```text
Fused Output shape: torch.Size([4, 128])
```
---
### Example 3: Inspecting Parameters and Weights
Because the weight tensor is 3-dimensional, the number of parameters grows with the product $H_{out} \times H_{in1} \times H_{in2}$. You can inspect the shapes of the weights and biases as follows:
```python
import torch
import torch.nn as nn
# Initialize a bilinear layer
bilinear = nn.Bilinear(100, 100, 50)
# Count and display parameters
total_params = sum(p.numel() for p in bilinear.parameters())
print("Total parameters:", total_params)
print("Weight shape :", bilinear.weight.shape) # (out_features, in1_features, in2_features)
print("Bias shape :", bilinear.bias.shape) # (out_features)
```
**Output:**
```text
Total parameters: 500050
Weight shape : torch.Size([50, 100, 100])
Bias shape : torch.Size([50])
```
---
## Common Use Cases
* **Multi-Modal Feature Fusion**: Merging representations from different sources, such as combining visual features from a CNN with textual features from a Transformer.
* **Attention Mechanisms**: Calculating compatibility scores between query and key vectors in advanced attention variants.
* **Relational and Interaction Modeling**: Predicting links or relations between two entities in graph neural networks or recommendation systems.
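The scoring use cases above can be sketched by setting `out_features=1`, so that each pair of vectors yields a single compatibility score. The names `scorer`, `queries`, and `keys` and the dimension 16 are illustrative assumptions, not part of any specific attention or link-prediction method.

```python
import torch
import torch.nn as nn

# One bilinear form: score = q^T W k + b
scorer = nn.Bilinear(16, 16, 1)

# Eight (query, key) pairs, scored row by row
queries = torch.randn(8, 16)
keys = torch.randn(8, 16)

# Drop the trailing singleton dimension to get one score per pair
scores = scorer(queries, keys).squeeze(-1)
print("Scores shape:", scores.shape)  # torch.Size([8])
```

Note that this scores matching rows only (pair 0 with pair 0, and so on); it does not compute all query-key combinations.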
---
## Considerations & Best Practices
> ⚠️ **Warning: Parameter Complexity**
>
> The weight of a bilinear layer has $H_{out} \times H_{in1} \times H_{in2}$ entries, plus $H_{out}$ for the bias. With large dimensions the weight tensor becomes very large: for example, $H_{in1} = H_{in2} = H_{out} = 1024$ gives about 1.07 billion weights (roughly 4 GiB in float32), leading to high memory consumption and a risk of overfitting.
>
> **Recommendation**: If you encounter performance bottlenecks, consider reducing the input dimensions using a linear layer (`nn.Linear`) before passing them to the bilinear layer.
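
The recommendation above can be sketched as follows. `ReducedBilinear` and its `reduced` argument are hypothetical names introduced for this illustration, and the sizes (1024 inputs, 64 reduced features, 128 outputs) are arbitrary.

```python
import torch
import torch.nn as nn

class ReducedBilinear(nn.Module):
    """Hypothetical module: project each input down, then apply nn.Bilinear."""

    def __init__(self, in1, in2, out, reduced=64):
        super().__init__()
        self.proj1 = nn.Linear(in1, reduced)
        self.proj2 = nn.Linear(in2, reduced)
        self.bilinear = nn.Bilinear(reduced, reduced, out)

    def forward(self, x1, x2):
        return self.bilinear(self.proj1(x1), self.proj2(x2))

model = ReducedBilinear(1024, 1024, 128)
reduced_params = sum(p.numel() for p in model.parameters())

# Parameter count of a direct nn.Bilinear(1024, 1024, 128),
# computed from the formula instead of allocating the layer
direct_params = 128 * 1024 * 1024 + 128

out = model(torch.randn(4, 1024), torch.randn(4, 1024))
print("Output shape      :", out.shape)
print("Reduced parameters:", reduced_params)
print("Direct parameters :", direct_params)
```

In this configuration the projected variant uses roughly 0.66 million parameters instead of about 134 million, at the cost of restricting the interaction to the reduced 64-dimensional spaces.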