## PyTorch torch.var Function
In PyTorch, `torch.var` is a built-in function used to calculate the variance of a tensor. Variance measures the spread of a set of numbers; specifically, it is the average of the squared differences from the mean. It is a fundamental metric in statistics and machine learning for understanding data dispersion.
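To make the definition concrete, here is a minimal sketch that computes the average squared difference from the mean by hand and compares it with `torch.var`. The tensor values are illustrative. Because `torch.var` divides by $N - 1$ by default, the sketch passes `correction=0` so that both results use the plain average.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0, 4.0, 5.0])

# Variance from the definition: the mean of the squared deviations.
mean = x.mean()
manual_population_var = ((x - mean) ** 2).mean()

# torch.var divides by N - 1 by default, so pass correction=0
# to get the same "plain average" as the manual computation.
builtin_population_var = torch.var(x, correction=0)

print(manual_population_var)
print(builtin_population_var)
```

Both values should agree up to floating-point rounding.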
---
### Syntax and Parameters
The signature of the `torch.var` function is as follows:
```python
torch.var(input, dim=None, *, correction=1, keepdim=False, out=None)
```
> **Note on `unbiased` vs `correction`:** In older PyTorch versions, the parameter `unbiased` (a boolean) was used. In modern PyTorch versions, it has been deprecated in favor of `correction` (an integer representing the degrees of freedom adjustment), though `unbiased` is still supported for backward compatibility.
#### Parameter Descriptions:
| Parameter | Type | Description |
| :--- | :--- | :--- |
| `input` | *Tensor* | The input tensor containing the values to calculate the variance for. |
| `dim` | *int or tuple of ints* | The dimension or dimensions along which to compute the variance. If `None`, the variance is calculated over all elements in the input tensor. |
| `correction` | *int* | Difference between the sample size and sample degrees of freedom. Defaults to `1` (Bessel's correction, which calculates the unbiased sample variance). Setting `correction=0` calculates the biased sample variance (population variance). |
| `unbiased` | *bool* | *(Deprecated)* If `True`, Bessel's correction is used (equivalent to `correction=1`). If `False`, the population variance is calculated (equivalent to `correction=0`). |
| `keepdim` | *bool* | Whether the output tensor has `dim` retained or not. If `True`, the output tensor will be of the same size as `input` except in the dimension(s) `dim` where it will be of size 1. |
| `out` | *Tensor* | The alternative output tensor in which to write the result. |
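The effect of `dim` and `keepdim` on the output shape is easiest to see on a small tensor. The sketch below uses an illustrative 2x3 tensor; the final standardization step is only an assumed use case showing why `keepdim=True` can be convenient for broadcasting.

```python
import torch

# Illustrative 2x3 tensor.
y = torch.tensor([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])

# Reducing along dim=1 drops that dimension: shape (2,).
reduced = torch.var(y, dim=1)

# keepdim=True keeps the reduced dimension with size 1: shape (2, 1).
kept = torch.var(y, dim=1, keepdim=True)

# A tuple of dims reduces over all listed dimensions at once.
# Here that covers every element, so it matches torch.var(y).
over_both = torch.var(y, dim=(0, 1))

# With keepdim=True the result broadcasts against the input,
# for example to standardize each row.
standardized = (y - y.mean(dim=1, keepdim=True)) / kept.sqrt()

print(reduced.shape, kept.shape, over_both.shape)
```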
---
### Code Examples
Here are practical examples demonstrating how to use `torch.var` in different scenarios.
```python
import torch
# Create a 1D tensor
x = torch.tensor([1.0, 2.0, 3.0, 4.0, 5.0])
# 1. Calculate the variance of all elements (unbiased by default)
print("Variance of x:", torch.var(x))
# Create a 2D tensor
y = torch.tensor([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])
# 2. Calculate variance along dim=0 (column-wise)
print("Variance along dim=0:", torch.var(y, dim=0))
# 3. Calculate variance along dim=1 (row-wise)
print("Variance along dim=1:", torch.var(y, dim=1))
```
#### Output:
```text
Variance of x: tensor(2.5000)
Variance along dim=0: tensor([4.5000, 4.5000, 4.5000])
Variance along dim=1: tensor([1., 1.])
```
---
### Key Considerations
#### 1. Unbiased vs. Biased Variance
By default, PyTorch calculates the **unbiased sample variance** (using Bessel's correction, where the denominator is $N - 1$).
If you want to calculate the **population variance** (where the denominator is $N$), you should set `correction=0` (or `unbiased=False` in older versions):
```python
# Unbiased sample variance (denominator is N - 1)
unbiased_var = torch.var(x, correction=1) # tensor(2.5000)
# Biased population variance (denominator is N)
biased_var = torch.var(x, correction=0) # tensor(2.)
```
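Both cases follow from a single formula in which `correction` (written $\delta$ here, a symbol chosen for this note) is subtracted from the sample size in the denominator:

$$
\operatorname{Var}(x) = \frac{1}{\max(0,\, N - \delta)} \sum_{i=1}^{N} (x_i - \bar{x})^2,
\qquad
\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i
$$

For the tensor `[1, 2, 3, 4, 5]`, the mean is $\bar{x} = 3$ and the squared deviations sum to $4 + 1 + 0 + 1 + 4 = 10$. With $\delta = 1$ the result is $10 / 4 = 2.5$, and with $\delta = 0$ it is $10 / 5 = 2$.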
#### 2. Data Type Requirements
The `torch.var` function requires the input tensor to have a floating-point or complex data type (e.g., `torch.float32`, `torch.float64`). If you pass an integer tensor, PyTorch raises a `RuntimeError`.
```python
# This will raise an error:
# z = torch.tensor([1, 2, 3, 4, 5])
# torch.var(z)
# Correct approach:
z = torch.tensor([1, 2, 3, 4, 5], dtype=torch.float32)
print(torch.var(z))
```
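When the integer tensor already exists, it can be converted instead of recreated. The following sketch catches the error and then casts with `Tensor.to`; the variable names are illustrative, and the exact error message may differ between PyTorch versions.

```python
import torch

# Integer literals produce an integer tensor (torch.int64 by default).
z_int = torch.tensor([1, 2, 3, 4, 5])

try:
    torch.var(z_int)
except RuntimeError as err:
    # The message wording is version-dependent, so it is only printed here.
    print("Integer input rejected:", err)

# Cast the existing tensor to a floating-point dtype before calling torch.var.
z_float = z_int.to(torch.float32)  # z_int.float() is an equivalent shorthand
result = torch.var(z_float)
print(result)
```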