# PyTorch `torch.randint`
The `torch.randint` function is a fundamental utility in PyTorch used to generate tensors filled with random integers. These integers are drawn uniformly from a specified discrete range.
In deep learning and scientific computing, `torch.randint` is widely used for tasks such as:
* **Data Augmentation**: Generating random cropping coordinates, rotation angles, or image transformations.
* **Stochastic Masking**: Creating binary or multi-class masks for dropout-like operations or sequence masking in NLP (e.g., masked language modeling).
* **Dataset Sampling**: Generating random indices to sample batches manually. The indices are drawn with replacement, so the same index can appear more than once.
* **Synthetic Data Generation**: Creating mock categorical labels or discrete features for testing neural network architectures.
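
As a rough illustration of the use cases listed above, the following sketch draws random crop offsets, mini-batch indices, and mock labels. The tensor shapes, the crop size, and names such as `images` and `dataset` are hypothetical and chosen only for this example.

```python
import torch

torch.manual_seed(0)

# Hypothetical batch: 8 RGB images of 32x32 pixels.
images = torch.rand(8, 3, 32, 32)
crop = 24  # hypothetical crop size

# Random top-left corner that keeps the crop inside the image.
# The largest valid offset is 32 - 24 = 8, and `high` is exclusive, so high = 9.
top = torch.randint(0, 32 - crop + 1, size=(1,)).item()
left = torch.randint(0, 32 - crop + 1, size=(1,)).item()
cropped = images[:, :, top:top + crop, left:left + crop]

# Random indices for a mini-batch, sampled with replacement.
dataset = torch.arange(100)
idx = torch.randint(0, len(dataset), size=(16,))
batch = dataset[idx]

# Mock class labels for a 10-class classifier.
labels = torch.randint(0, 10, size=(16,))

print(cropped.shape, batch.shape, labels.dtype)
```

Because `torch.randint` samples with replacement, `idx` may contain duplicates. When every element must appear exactly once, `torch.randperm` is the usual choice.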
---
## Syntax and Parameters
The function generates random integers in the half-open interval $[low, high)$, meaning the lower bound is inclusive, while the upper bound is exclusive.
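
One way to check the half-open interval empirically is to draw many samples and inspect the observed extremes. This is a minimal sketch; the bounds `2` and `5` and the sample count are arbitrary choices.

```python
import torch

# Draw many samples so that every value in the range is very likely to appear.
samples = torch.randint(low=2, high=5, size=(10_000,))

# The lower bound can appear in the output; the upper bound never does.
print("min:", samples.min().item())
print("max:", samples.max().item())
print("distinct values:", torch.unique(samples).tolist())
```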
### Function Signature
```python
torch.randint(low=0, high, size, *, generator=None, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) -> Tensor
```
### Parameters
| Parameter | Type | Required / Optional | Description |
| :--- | :--- | :--- | :--- |
| `low` | `int` | Optional | The lowest integer to be drawn from the distribution. Default: `0`. |
| `high` | `int` | **Required** | One above the highest integer to be drawn from the distribution. |
| `size` | `tuple` or `list` | **Required** | A tuple or list defining the shape of the output tensor (e.g., `(3, 4)`). |
| `generator` | `torch.Generator` | Optional | A pseudorandom number generator for reproducibility. |
| `out` | `Tensor` | Optional | The output tensor in which to write the result. |
| `dtype` | `torch.dtype` | Optional | The desired data type of the returned tensor. Default: `torch.int64` (LongTensor). |
| `layout` | `torch.layout` | Optional | The desired layout of the returned Tensor. Default: `torch.strided`. |
| `device` | `torch.device` | Optional | The desired device of the returned tensor (e.g., `'cpu'`, `'cuda'`). Default: current device. |
| `requires_grad`| `bool` | Optional | If PyTorch should record operations on the returned tensor. Default: `False`. |
### Input and Output Shapes
* **Input**: `torch.randint` takes no input tensor. The `size` argument alone determines the output shape and accepts any number of dimensions (e.g., 1D, 2D, 3D, or higher-order tensors).
* **Output**: A tensor of shape `size` containing random integers in the range $[low, high)$ with the specified `dtype` and allocated on the specified `device`.
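
The relationship between `size` and the output tensor can be sketched as follows. The variable names are illustrative only.

```python
import torch

# `size` may be a tuple or a list; each entry is the length of one dimension.
vector = torch.randint(0, 10, size=(5,))
matrix = torch.randint(0, 10, size=[2, 3])
volume = torch.randint(0, 10, size=(2, 3, 4), dtype=torch.int32)

for name, t in [("vector", vector), ("matrix", matrix), ("volume", volume)]:
    print(name, tuple(t.shape), t.dtype)
```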
---
## Code Examples
Below is a comprehensive, executable script demonstrating the common use cases of `torch.randint`, including basic generation, device allocation, and reproducibility.
```python
import torch
# Set a global seed for reproducibility of basic operations
torch.manual_seed(42)
# ---------------------------------------------------------
# Example 1: Basic Usage (Default low=0, default dtype=int64)
# ---------------------------------------------------------
# Generates a 1D tensor of 5 elements with values in range [0, 10)
tensor_1d = torch.randint(high=10, size=(5,))
print("1. Basic 1D Tensor [0, 10):")
print(tensor_1d)
print(f"Dtype: {tensor_1d.dtype}\n")
# ---------------------------------------------------------
# Example 2: Specifying both Low and High Bounds
# ---------------------------------------------------------
# Generates a 2D tensor of shape (3, 4) with values in range [-5, 5)
tensor_2d = torch.randint(low=-5, high=5, size=(3, 4))
print("2. 2D Tensor with custom range [-5, 5):")
print(tensor_2d)
print(f"Shape: {tensor_2d.shape}\n")
# ---------------------------------------------------------
# Example 3: Changing Data Type (dtype)
# ---------------------------------------------------------
# By default, PyTorch uses int64. Passing dtype selects another integer type, such as int32 or uint8.
tensor_uint8 = torch.randint(low=0, high=256, size=(2, 2), dtype=torch.uint8)
print("3. Tensor with uint8 data type (useful for pixel values):")
print(tensor_uint8)
print(f"Dtype: {tensor_uint8.dtype}\n")
# ---------------------------------------------------------
# Example 4: Target Device Allocation (GPU/CUDA)
# ---------------------------------------------------------
# Check if CUDA is available and assign device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
tensor_device = torch.randint(low=1, high=100, size=(3,), device=device)
print(f"4. Tensor allocated on device: {device}")
print(tensor_device)
print(f"Device property: {tensor_device.device}\n")
# ---------------------------------------------------------
# Example 5: Reproducibility using torch.Generator
# ---------------------------------------------------------
# Using a local generator ensures reproducibility without affecting global state
gen1 = torch.Generator().manual_seed(101)
gen2 = torch.Generator().manual_seed(101)
tensor_gen1 = torch.randint(0, 100, size=(4,), generator=gen1)
tensor_gen2 = torch.randint(0, 100, size=(4,), generator=gen2)
print("5. Reproducibility with Local Generators:")
print(f"Generator 1 Output: {tensor_gen1}")
print(f"Generator 2 Output: {tensor_gen2}")
assert torch.equal(tensor_gen1, tensor_gen2), "Tensors must be identical!"
```
---
## Best Practices and Common Pitfalls
### 1. Remember the Exclusive Upper Bound
A common mistake is assuming the `high` parameter is inclusive. If you need to generate random dice rolls (values from 1 to 6), setting `high=6` will only yield values up to 5. You must set `high=7`.
```python
# Incorrect: This will only generate values: 1, 2, 3, 4, 5
incorrect_dice = torch.randint(low=1, high=6, size=(10,))
# Correct: This generates values: 1, 2, 3, 4, 5, 6
correct_dice = torch.randint(low=1, high=7, size=(10,))
```
### 2. Match Dtype to Downstream Requirements
By default, `torch.randint` returns a 64-bit integer tensor (`torch.int64`, also available under the alias `torch.long`). If you are using these integers as indices for embedding layers (`torch.nn.Embedding`) or as class-index targets for loss functions like `torch.nn.CrossEntropyLoss`, `int64` is correct and expected.
However, if you are using these integers as mask values or dummy image data, `int64` consumes twice the memory of `int32` and eight times the memory of `uint8`. Always specify the optimal `dtype` for your use case to save GPU memory.
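
The memory difference can be measured directly with `element_size()` and `nelement()`. This is a small sketch; the shape `(1024, 1024)` is an arbitrary choice that makes the arithmetic easy to follow.

```python
import torch

shape = (1024, 1024)
sizes = {}

for dtype in (torch.int64, torch.int32, torch.uint8):
    mask = torch.randint(0, 2, size=shape, dtype=dtype)
    # Bytes per element times number of elements gives the storage size.
    sizes[dtype] = mask.element_size() * mask.nelement()
    print(f"{dtype}: {sizes[dtype] / 2**20:.0f} MiB")
```

With this shape the three tensors occupy 8 MiB, 4 MiB, and 1 MiB respectively. Note that `uint8` can only hold values from 0 to 255, so it is suitable only when `high` is at most 256.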
### 3. Avoid `requires_grad=True` on Integer Tensors
PyTorch does not support calculating gradients with respect to discrete integer tensors because integer operations are non-differentiable. Attempting to set `requires_grad=True` on an integer tensor will result in a runtime error:
```python
# This will raise a RuntimeError: Only Tensors of floating point and complex dtype can require gradients
try:
    grad_tensor = torch.randint(0, 10, size=(5,), requires_grad=True)
except RuntimeError as e:
    print(f"Error caught: {e}")
```
If you need to pass gradients through a sampling process, consider using continuous approximations (like the Gumbel-Softmax trick) instead of discrete integer sampling.
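
A minimal sketch of that alternative, assuming `torch.nn.functional.gumbel_softmax` with the straight-through option `hard=True`. The logits, the category count, and the `scores` vector are hypothetical and exist only to give the gradient something to flow through.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical learnable logits over 6 categories for a batch of 4.
# All-zero logits correspond to a uniform distribution over the categories.
logits = torch.zeros(4, 6, requires_grad=True)

# hard=True yields one-hot samples in the forward pass, while gradients
# are taken through the soft relaxation (straight-through estimator).
one_hot = F.gumbel_softmax(logits, tau=1.0, hard=True)

# Integer category ids, for inspection only; argmax itself is not differentiable.
category_ids = one_hot.argmax(dim=-1)

# A differentiable downstream quantity built from hypothetical per-category scores.
scores = torch.arange(6, dtype=torch.float32)
loss = (one_hot * scores).sum()
loss.backward()

print(category_ids)
print(logits.grad.shape)
```

Unlike `torch.randint`, this approach samples from a distribution defined by floating-point logits, which is what allows `logits.grad` to be populated.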