Pandas DataFrame API Reference
A DataFrame is a two-dimensional labeled data structure. You can think of it as an Excel spreadsheet, a SQL table, or a dictionary of Series objects that share one index.
The tables below summarize the most commonly used parts of the Pandas DataFrame API:
### **DataFrame Constructor**
| **Method** | **Description** |
| --- | --- |
| `pd.DataFrame(data, index, columns, dtype, copy)` | Creates a DataFrame object, supporting custom data, index, column names, and data types. |
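
As a minimal sketch of the constructor described above, the following builds a DataFrame from a dictionary of lists. The column names and values are made-up sample data, not part of the original reference.

```python
import pandas as pd

# Hypothetical sample data: a dict mapping column names to lists of values
data = {"name": ["Ann", "Bob", "Cy"], "age": [28, 35, 42]}

# index supplies custom row labels;
# columns selects and orders the dict keys that become columns
df = pd.DataFrame(data, index=["a", "b", "c"], columns=["age", "name"])

print(df)
```

When `data` is a dictionary, `columns` acts as a selection and ordering of its keys rather than a renaming.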
* * *
### **DataFrame Attributes**
| **Attribute** | **Description** |
| --- | --- |
| `DataFrame.values` | Returns the data part of the DataFrame (numpy array). |
| `DataFrame.index` | Returns the row index of the DataFrame. |
| `DataFrame.columns` | Returns the column names of the DataFrame. |
| `DataFrame.dtypes` | Returns the data type of each column. |
| `DataFrame.shape` | Returns the shape of the DataFrame (in tuple form). |
| `DataFrame.size` | Returns the total number of elements in the DataFrame. |
| `DataFrame.empty` | Checks if the DataFrame is empty. |
| `DataFrame.ndim` | Returns the number of dimensions of the DataFrame (always 2). |
| `DataFrame.T` | Returns the transpose of the DataFrame. |
| `DataFrame.axes` | Returns a list of row index and column names. |
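| `DataFrame.memory_usage()` | Returns the memory usage of each column in bytes. Unlike the other entries in this table, this is a method and must be called with parentheses. |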
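
The attributes above can be inspected on a small frame, as in this short sketch. The column names `x` and `y` are illustrative assumptions.

```python
import pandas as pd

# Hypothetical 3x2 frame with one integer and one float column
df = pd.DataFrame({"x": [1, 2, 3], "y": [0.5, 1.5, 2.5]})

print(df.shape)    # tuple of (number of rows, number of columns)
print(df.size)     # total element count: rows * columns
print(df.ndim)     # number of dimensions
print(df.dtypes)   # one dtype per column
print(df.empty)    # True only if the frame has no elements
print(df.T.shape)  # the transpose swaps rows and columns
print(df.axes)     # list holding the row index and the column index
print(df.memory_usage())  # a method, so it needs parentheses
```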
* * *
### **DataFrame Methods**
#### **Data Viewing**
| **Method** | **Description** |
| --- | --- |
| `DataFrame.head(n=5)` | Returns the first n rows of data. |
| `DataFrame.tail(n=5)` | Returns the last n rows of data. |
| `DataFrame.describe()` | Returns statistical summary of the DataFrame (such as count, mean, standard deviation, etc.). |
| `DataFrame.info()` | Prints brief information about the DataFrame (such as column names, data types, non-null value counts, etc.). |
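
A quick sketch of the viewing methods above, using a made-up ten-row frame:

```python
import pandas as pd

# Hypothetical data: the numbers 0..9 and their squares
df = pd.DataFrame({"n": range(10), "sq": [i * i for i in range(10)]})

first = df.head(3)       # first three rows
last = df.tail(2)        # last two rows
summary = df.describe()  # count, mean, std, min, quartiles, max per numeric column

print(first)
print(last)
print(summary)
df.info()                # prints to stdout and returns None
```

Note that `info()` prints its report directly instead of returning it, so there is nothing useful to assign from it.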
#### **Missing Value Handling**
| **Method** | **Description** |
| --- | --- |
| `DataFrame.isnull()` | Checks whether each element is a missing value (NaN, None, or NaT). |
| `DataFrame.notnull()` | Checks if each element is not a missing value. |
| `DataFrame.dropna()` | Deletes rows or columns containing missing values. |
| `DataFrame.fillna(value)` | Fills missing values with the specified value. |
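
The missing-value methods above might be combined as in the following sketch. The data is invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing value in each column
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, 5.0, 6.0]})

mask = df.isnull()     # boolean frame, True where a value is missing
dropped = df.dropna()  # drops every row that contains a missing value
filled = df.fillna(0)  # replaces missing values with 0

print(mask)
print(dropped)
print(filled)
```

Both `dropna()` and `fillna()` return new objects by default and leave `df` unchanged.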
#### **Data Operations**
| **Method** | **Description** |
| --- | --- |
| `DataFrame.drop()` | Deletes specified rows or columns. |
| `DataFrame.rename()` | Renames row index or column names. |
| `DataFrame.set_index()` | Sets a specified column as the index. |
| `DataFrame.reset_index()` | Resets the index. |
| `DataFrame.sort_values()` | Sorts by values. |
| `DataFrame.sort_index()` | Sorts by index. |
| `DataFrame.replace()` | Replaces values in the DataFrame. |
| `DataFrame.append()` | Appends the rows of another DataFrame. Deprecated in pandas 1.4 and removed in pandas 2.0; use `pd.concat()` instead. |
| `DataFrame.join()` | Joins another DataFrame based on index or columns. |
| `DataFrame.merge()` | Merges another DataFrame based on specified columns. |
| `pd.concat()` | Concatenates multiple DataFrames along the specified axis. This is a top-level pandas function, not a DataFrame method. |
| `DataFrame.update()` | Updates the current DataFrame with values from another DataFrame. |
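| `DataFrame.pivot()` | Reshapes data from long to wide format using the given index and column values, without aggregation (use `DataFrame.pivot_table()` to aggregate). |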
| `DataFrame.melt()` | Converts wide format data to long format data. |
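
The sketch below exercises a few of the operations above on two small, made-up frames. Concatenation is done here with the top-level `pd.concat` function.

```python
import pandas as pd

# Hypothetical frames that share an "id" column
left = pd.DataFrame({"id": [1, 2, 3], "score": [70, 90, 80]})
right = pd.DataFrame({"id": [2, 3, 4], "city": ["Oslo", "Lima", "Pune"]})

# merge performs an inner join by default, keeping ids present in both frames
merged = left.merge(right, on="id")

# Stack the rows of two frames and renumber the index
stacked = pd.concat([left, left], ignore_index=True)

# Sort rows by a column, highest score first
ranked = left.sort_values("score", ascending=False)

# Rename a column, then drop another one
renamed = left.rename(columns={"score": "points"}).drop(columns=["id"])

# Wide to long: one row per (id, metric) pair
long_form = left.melt(id_vars="id", var_name="metric", value_name="value")

print(merged)
print(long_form)
```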
#### **Data Selection**
| **Method** | **Description** |
| --- | --- |
| `DataFrame.loc[]` | Selects data by label. |
| `DataFrame.iloc[]` | Selects data by position. |
| `DataFrame.at[]` | Selects a single value by label. |
| `DataFrame.iat[]` | Selects a single value by position. |
| `DataFrame.filter()` | Selects data based on column names. |
| `DataFrame.get()` | Gets the value of the specified column. |
| `DataFrame.query()` | Queries data based on conditions. |
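
To make the difference between label-based and position-based selection concrete, here is a small sketch. The frame and its labels are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical frame with string row labels
df = pd.DataFrame(
    {"age": [28, 35, 42], "city": ["Oslo", "Lima", "Pune"]},
    index=["a", "b", "c"],
)

by_label = df.loc["b", "age"]  # row label "b", column label "age"
by_pos = df.iloc[0, 1]         # first row, second column
scalar = df.at["c", "city"]    # fast access to a single value by label
older = df.query("age > 30")   # rows where the condition holds
cols = df.filter(like="ag")    # columns whose name contains "ag"
missing = df.get("salary")     # returns None because the column is absent

print(by_label, by_pos, scalar)
print(older)
```

`get()` is useful when a column may not exist, since indexing with `df["salary"]` would raise a `KeyError` instead.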
#### **Data Conversion**
| **Method** | **Description** |
| --- | --- |
| `DataFrame.astype()` | Converts the DataFrame to the specified data type. |
| `DataFrame.apply()` | Applies a function to rows or columns of the DataFrame. |
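| `DataFrame.applymap()` | Applies a function to each element of the DataFrame. Deprecated since pandas 2.1 in favor of `DataFrame.map()`. |
| `DataFrame.map()` | Applies a function to each element of the DataFrame (pandas 2.1 and later). For a single column, use `Series.map()`. |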
| `DataFrame.to_dict()` | Converts the DataFrame to a dictionary. |
| `DataFrame.to_numpy()` | Converts the DataFrame to a numpy array. |
| `DataFrame.to_csv()` | Saves the DataFrame as a CSV file. |
| `DataFrame.to_excel()` | Saves the DataFrame as an Excel file. |
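
A few of the conversion methods above, sketched on invented data. Element-wise mapping is left out here because the method name depends on the installed pandas version.

```python
import pandas as pd

# Hypothetical frame where quantities arrived as strings
df = pd.DataFrame({"qty": ["1", "2", "3"], "price": [1.5, 2.0, 2.5]})

# Convert one column to integers
df["qty"] = df["qty"].astype(int)

# axis=1 passes each row to the function as a Series
totals = df.apply(lambda row: row["qty"] * row["price"], axis=1)

records = df.to_dict(orient="records")  # list with one dict per row
arr = df.to_numpy()                     # 2-D numpy array
csv_text = df.to_csv(index=False)       # returns a string when no path is given

print(totals)
print(records)
print(csv_text)
```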
#### **Statistical Calculations**
| **Method** | **Description** |
| --- | --- |
| `DataFrame.sum()` | Returns the sum of each column. |
| `DataFrame.mean()` | Returns the mean of each column. |
| `DataFrame.median()` | Returns the median of each column. |
| `DataFrame.min()` | Returns the minimum value of each column. |
| `DataFrame.max()` | Returns the maximum value of each column. |
| `DataFrame.std()` | Returns the standard deviation of each column. |
| `DataFrame.var()` | Returns the variance of each column. |
| `DataFrame.count()` | Returns the count of non-missing values for each column. |
| `DataFrame.corr()` | Returns the correlation coefficient matrix between columns. |
| `DataFrame.cov()` | Returns the covariance matrix between columns. |
| `DataFrame.mode()` | Returns the mode of each column. |
| `DataFrame.quantile()` | Returns the quantile of each column. |
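
The statistics above work column by column by default. The following sketch, on made-up numbers, also shows `axis=1` for row-wise results.

```python
import pandas as pd

# Hypothetical frame where y is exactly twice x
df = pd.DataFrame({"x": [1, 2, 3, 4], "y": [2, 4, 6, 8]})

means = df.mean()          # one value per column
row_sums = df.sum(axis=1)  # axis=1 aggregates across columns, one value per row
corr = df.corr()           # Pearson correlation matrix by default
median = df.quantile(0.5)  # the 0.5 quantile of each column

print(means)
print(row_sums)
print(corr)
print(median)
```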
#### **Time Series Operations**
| **Method** | **Description** |
| --- | --- |
| `Series.dt` | Accesses datetime properties. This accessor belongs to a Series (a single datetime-typed column such as `df["date"].dt`), not to the DataFrame itself. |
| `DataFrame.resample()` | Resamples time series data. |
| `DataFrame.shift()` | Shifts the data by a given number of periods along the index, filling the vacated positions with missing values. |
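
A brief sketch of the time series operations above, assuming a made-up daily series. Note that the `.dt` accessor is reached through a datetime Series rather than through the DataFrame.

```python
import pandas as pd

# Hypothetical daily data starting on 2024-01-01 (a Monday)
idx = pd.date_range("2024-01-01", periods=6, freq="D")
df = pd.DataFrame({"value": [1, 2, 3, 4, 5, 6]}, index=idx)

two_day = df.resample("2D").sum()  # sum the values in two-day bins
lagged = df.shift(1)               # move values down one row; the first becomes NaN

# .dt works on a Series of datetimes
dates = pd.Series(idx)
weekdays = dates.dt.day_name()

print(two_day)
print(lagged)
print(weekdays)
```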
#### **String Operations**
| **Method**