Pytorch Torch Bincount

## PyTorch torch.bincount The `torch.bincount` function in PyTorch is used to count the frequency of each non-negative integer value in an input tensor. It returns a new 1-dimensional tensor where the $i$-th element represents the number of times the value $i$ appears in the input. This function is highly optimized and commonly used for tasks such as histogram computation, category frequency counting, and grouped statistics in machine learning workflows. --- ## Syntax and Parameters ### Function Definition ```python torch.bincount(input, weights=None, minlength=0) -> Tensor ``` ### Parameters | Parameter | Type | Description | | :--- | :--- | :--- | | `input` | *Tensor* | A 1-dimensional tensor containing non-negative integers. | | `weights` | *Tensor (Optional)* | An optional tensor of weights with the same shape as `input`. If specified, the output will accumulate the weights of the corresponding values instead of incrementing by 1. | | `minlength` | *int (Optional)* | The minimum number of bins in the output tensor. If the maximum value in `input` is less than `minlength - 1`, the output tensor will be padded with zeros up to `minlength`. Default is `0`. | ### Return Value * **Tensor**: A 1-dimensional tensor. Its length is $\max(\text{input}) + 1$ or `minlength`, whichever is larger. If `weights` is provided, the output tensor will share the same data type as `weights`; otherwise, it defaults to `torch.int64`. --- ## Code Examples ### 1. Basic Usage: Counting Frequencies In this basic example, we count the occurrences of each non-negative integer in a 1D tensor. ```python import torch # Input tensor with non-negative integers x = torch.tensor([0, 1, 1, 2, 2, 2, 3, 3, 4]) # Count occurrences counts = torch.bincount(x) print("Input Tensor:", x) print("Bincount Result:", counts) # Output: tensor([1, 2, 3, 2, 1]) # Explanation: # Value 0 appears 1 time -> index 0 is 1 # Value 1 appears 2 times -> index 1 is 2 # Value 2 appears 3 times -> index 2 is 3 # Value 3 appears 2 times -> index 3 is 2 # Value 4 appears 1 time -> index 4 is 1 ``` ### 2. Using the `weights` Parameter When `weights` is provided, the function sums up the weights associated with each value instead of counting occurrences. ```python import torch # Input tensor x = torch.tensor([0, 1, 1, 2, 2, 2]) # Weights corresponding to each element in x weights = torch.tensor([1.0, 2.0, 3.0, 1.0, 2.0, 3.0]) # Compute weighted counts weighted_counts = torch.bincount(x, weights=weights) print("Weighted Counts:", weighted_counts) # Output: tensor([1., 5., 6.]) # Explanation: # Value 0: weight is 1.0 -> sum = 1.0 # Value 1: weights are 2.0 + 3.0 -> sum = 5.0 # Value 2: weights are 1.0 + 2.0 + 3.0 -> sum = 6.0 ``` ### 3. Setting a Minimum Length (`minlength`) You can use `minlength` to ensure the output tensor has a fixed minimum size, even if the maximum value in the input is small. ```python import torch # Input tensor with a single element x = torch.tensor() # Set minlength to 5 to force the output tensor to have at least 5 bins counts = torch.bincount(x, minlength=5) print("Bincount with minlength=5:", counts) # Output: tensor([1, 0, 0, 0, 0]) ``` --- ## Important Considerations 1. **1D Tensor Requirement**: The `input` tensor must be 1-dimensional. If you have a multi-dimensional tensor, you must flatten it first using `.flatten()` or `.view(-1)` before passing it to `torch.bincount`. 2. **Non-negative Integers Only**: The `input` tensor must contain only non-negative integers (values $\ge 0$). Passing negative integers or floating-point numbers will result in a `RuntimeError`. 3. **Memory Usage**: The size of the output tensor is determined by the maximum value in the `input` tensor. If your input contains a very large integer (e.g., `1,000,000,000`), `torch.bincount` will attempt to allocate an output tensor of that size, which can easily lead to an Out-Of-Memory (OOM) error.

YouTip

Pytorch Torch Bincount

📂 Categories