# Pandas Arrays, Scalars, and Data Types Reference
For most data types, Pandas uses NumPy arrays as the underlying storage objects, which are contained within Index, Series, or DataFrame.
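For certain data types, Pandas extends NumPy's type system with its own extension types. Each of these types has a string alias (for example `"Int64"`, `"boolean"`, or `"string"`); the aliases are listed in the dtypes section of the Pandas user guide.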
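
The difference between NumPy-backed and extension-backed storage can be checked directly. The following is a minimal sketch; the variable names `plain` and `nullable` are illustrative only.

```python
import numpy as np
import pandas as pd

# A Series built from plain integers is backed by a NumPy array.
plain = pd.Series([1, 2, 3])

# Requesting the "Int64" alias selects a Pandas extension type instead.
nullable = pd.Series([1, 2, None], dtype="Int64")

print(isinstance(plain.dtype, np.dtype))                             # NumPy dtype
print(isinstance(nullable.dtype, pd.api.extensions.ExtensionDtype))  # extension dtype
print(type(plain.to_numpy()).__name__)                               # ndarray
```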
### **Pandas Arrays**
| **Class/Method** | **Description** |
| --- | --- |
| `pd.array(data, dtype)` | Create a Pandas array (`ExtensionArray`). |
| `pd.Series.array` | Return the underlying array (`ExtensionArray`) of the Series. |
| `pd.arrays.IntegerArray` | Array for storing integer data (supports missing values). |
| `pd.arrays.BooleanArray` | Array for storing boolean data (supports missing values). |
| `pd.arrays.StringArray` | Array for storing string data (supports missing values). |
| `pd.arrays.IntervalArray` | Array for storing interval data. |
| `pd.arrays.DatetimeArray` | Array for storing datetime data. |
| `pd.arrays.TimedeltaArray` | Array for storing timedelta data. |
| `pd.arrays.PeriodArray` | Array for storing period data. |
| `pd.arrays.SparseArray` | Array for storing sparse data. |
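
As a rough illustration of the table above, the sketch below builds a few extension arrays with `pd.array` and reads the array that backs a Series. The sample values are made up for the example.

```python
import pandas as pd

# With no dtype given, pd.array infers a nullable extension type.
ints = pd.array([1, 2, None])           # IntegerArray with dtype Int64
flags = pd.array([True, None, False])   # BooleanArray with dtype boolean

# A dtype (or its string alias) can also be requested explicitly.
texts = pd.array(["a", None], dtype="string")

print(type(ints).__name__, ints.dtype)
print(type(flags).__name__, flags.dtype)
print(texts.isna())

# Series.array exposes the extension array that backs a Series.
s = pd.Series([1, 2, None], dtype="Int64")
print(type(s.array).__name__)
```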
* * *
### **Pandas Scalars**
| **Class/Method** | **Description** |
| --- | --- |
| `pd.NA` | Scalar representing a missing value in the nullable extension types (similar in role to `NaN`). |
| `pd.Timestamp` | Scalar representing a timestamp. |
| `pd.Timedelta` | Scalar representing a timedelta. |
| `pd.Period` | Scalar representing a period. |
| `pd.Interval` | Scalar representing an interval. |
| `pd.Categorical` | Array-like container for categorical data (not a scalar; its elements are ordinary values drawn from a fixed set of categories). |
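
A short sketch of how these scalars behave, assuming illustrative dates and bounds chosen only for this example:

```python
import pandas as pd

ts = pd.Timestamp("2023-01-01 12:30")
delta = pd.Timedelta(days=1, hours=6)

# Adding a Timedelta to a Timestamp yields another Timestamp.
later = ts + delta
print(later)

# A Period covers a span of time; an Interval covers a range of values.
period = pd.Period("2023-01", freq="M")
interval = pd.Interval(0, 5, closed="right")   # the interval (0, 5]
print(period)
print(3 in interval, 0 in interval)

# pd.NA propagates through arithmetic instead of raising an error.
result = pd.NA + 1
print(result is pd.NA)
```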
* * *
### **Pandas Data Types**
| **Class/Method** | **Description** |
| --- | --- |
| `pd.StringDtype()` | String data type (supports missing values). |
| `pd.BooleanDtype()` | Boolean data type (supports missing values). |
| `pd.Int8Dtype()` | 8-bit integer data type (supports missing values). |
| `pd.Int16Dtype()` | 16-bit integer data type (supports missing values). |
| `pd.Int32Dtype()` | 32-bit integer data type (supports missing values). |
| `pd.Int64Dtype()` | 64-bit integer data type (supports missing values). |
| `pd.Float32Dtype()` | 32-bit floating-point data type (supports missing values). |
| `pd.Float64Dtype()` | 64-bit floating-point data type (supports missing values). |
| `pd.CategoricalDtype()` | Categorical data type. |
| `pd.DatetimeTZDtype(tz="UTC")` | Datetime data type with a time zone (the `tz` argument is required). |
| `pd.PeriodDtype(freq)` | Period data type for a given frequency (e.g., `freq="M"`). |
| `pd.IntervalDtype()` | Interval data type. |
| `pd.SparseDtype()` | Sparse data type. |
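
The data types above can be passed either as objects or as their string aliases. The following sketch shows both forms; the category labels are invented for the example.

```python
import pandas as pd

# Without an extension dtype, a missing value forces a float64 column.
default = pd.Series([1, None, 3])
print(default.dtype)

# The "Int64" alias is equivalent to passing pd.Int64Dtype().
s_int = pd.Series([1, None, 3], dtype="Int64")
print(s_int.dtype == pd.Int64Dtype())
print(s_int.isna().tolist())

# CategoricalDtype fixes the allowed categories and their order.
cat_dtype = pd.CategoricalDtype(categories=["low", "high"], ordered=True)
s_cat = pd.Series(["low", "high", "low"], dtype=cat_dtype)
print(s_cat.cat.codes.tolist())

# DatetimeTZDtype needs an explicit time zone.
tz_dtype = pd.DatetimeTZDtype(tz="UTC")
print(tz_dtype)
```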
* * *
### **Common Methods**
#### **Array Methods**
| **Method** | **Description** |
| --- | --- |
| `array.take(indices)` | Extract elements from the array based on indices. |
| `array.copy()` | Copy the array. |
| `array.isna()` | Check for missing values in the array. |
| `array.fillna(value)` | Fill missing values with a specified value. |
| `array.unique()` | Return unique values in the array. |
| `array.value_counts()` | Return the frequency of each value in the array. |
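
The array methods listed above can be tried on a small nullable integer array. This is only a sketch; the values are arbitrary.

```python
import pandas as pd

arr = pd.array([3, 1, None, 3], dtype="Int64")

picked = arr.take([0, 3])      # elements at positions 0 and 3
mask = arr.isna()              # True where a value is missing
filled = arr.fillna(0)         # new array with missing values replaced
uniques = arr.unique()         # distinct values, including the missing one
counts = arr.value_counts()    # Series of frequencies (missing values dropped)

# copy() returns an independent array, so edits do not touch the original.
dup = arr.copy()
dup[0] = 99

print(picked)
print(mask)
print(filled)
print(uniques)
print(counts)
print(arr[0], dup[0])
```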
#### **Scalar Methods**
| **Method** | **Description** |
| --- | --- |
| `timestamp.to_pydatetime()` | Convert `Timestamp` to a Python `datetime` object. |
| `timedelta.total_seconds()` | Convert `Timedelta` to total seconds. |
| `period.start_time` | Return the start time of the `Period`. |
| `period.end_time` | Return the end time of the `Period`. |
| `interval.left` | Return the left boundary of the `Interval`. |
| `interval.right` | Return the right boundary of the `Interval`. |
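
A brief sketch of the scalar methods and attributes above, using made-up values:

```python
import datetime
import pandas as pd

# Timestamp -> standard library datetime
ts = pd.Timestamp("2023-01-01 08:00")
py_dt = ts.to_pydatetime()
print(type(py_dt) is datetime.datetime)

# Timedelta -> duration in seconds, as a float
td = pd.Timedelta(hours=1, minutes=30)
print(td.total_seconds())

# A monthly Period spans from its start_time to its end_time.
p = pd.Period("2023-03", freq="M")
print(p.start_time)
print(p.end_time)

# Interval boundaries
iv = pd.Interval(left=0, right=10)
print(iv.left, iv.right)
```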
#### **Data Type Methods**
| **Method** | **Description** |
| --- | --- |
| `dtype.name` | Return the name of the data type. |
| `dtype.kind` | Return the kind of the data type (e.g., `i` for integer, `f` for float). |
| `dtype.construct_array_type()` | Return the array class associated with the data type. |
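
The attributes above can be inspected on any extension dtype. The sketch below uses `pd.Int64Dtype()`; `pd.api.types.pandas_dtype` is included as one way to resolve a string alias into a dtype object.

```python
import pandas as pd

dtype = pd.Int64Dtype()

print(dtype.name)    # Int64
print(dtype.kind)    # i

# construct_array_type() returns the array class, not an instance.
array_cls = dtype.construct_array_type()
print(array_cls.__name__)

# A string alias resolves to the matching dtype object.
resolved = pd.api.types.pandas_dtype("boolean")
print(resolved == pd.BooleanDtype())
```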
* * *
## Examples
```python
import pandas as pd

# Create a Pandas array
arr = pd.array([1, 2, None], dtype=pd.Int64Dtype())
print(arr)

# Use Pandas scalars
ts = pd.Timestamp('2023-01-01')
print(ts.year)  # Output the year

# Use Pandas data types
dtype = pd.StringDtype()
print(dtype.name)  # Output the data type name
```
* * *
For more detailed information, please refer to the [official Pandas documentation](https://pandas.pydata.org/docs/reference/arrays.html).