Pandas General Functions

Pandas provides a large number of functions for data processing and analysis. Here are some commonly used ones:

### **General Functions**

| **Function** | **Description** |
| --- | --- |
| `pd.isna(obj)` | Detect missing values. |
| `pd.notna(obj)` | Detect non-missing values. |
| `pd.concat(objs, axis)` | Concatenate objects along an axis. |
| `pd.merge(left, right, on)` | Join DataFrames on key columns. |
| `pd.get_dummies(data)` | One-hot encode categorical variables. |
| `pd.cut(x, bins)` | Bin continuous data into intervals. |
| `pd.qcut(x, q)` | Bin data by quantiles. |
| `pd.to_numeric(arg)` | Convert to a numeric type. |
| `pd.to_datetime(arg)` | Convert to datetime. |
| `pd.unique(values)` | Get unique values. |
| `pd.value_counts(values)` | Count value frequencies (deprecated since pandas 2.1; prefer `Series.value_counts()`). |
| `pd.factorize(values)` | Encode values as integer codes. |
| `pd.crosstab(index, columns)` | Cross-tabulation. |
| `pd.pivot_table(data)` | Pivot table. |
| `pd.melt(frame)` | Reshape wide to long. |

* * *

### **Data Reading and Writing (IO)**

| Function | Description |
| --- | --- |
| `pd.read_csv()` | Read a CSV file. |
| `pd.read_excel()` | Read an Excel file. |
| `pd.read_json()` | Read JSON. |
| `pd.read_html()` | Parse HTML tables. |
| `pd.read_sql()` | Read from a database. |
| `df.to_csv()` | Write to CSV. |
| `df.to_excel()` | Write to Excel. |
| `df.to_json()` | Write to JSON. |
| `df.to_parquet()` | Write to Parquet. |

* * *

### **Data Cleaning**

| Function | Description |
| --- | --- |
| `df.dropna()` | Drop rows or columns with missing values. |
| `df.fillna()` | Fill missing values. |
| `df.replace()` | Replace values. |
| `df.drop_duplicates()` | Remove duplicate rows. |
| `df.astype()` | Convert data types. |
| `df.rename()` | Rename columns or index labels. |
| `df.sort_values()` | Sort by values. |
| `df.reset_index()` | Reset the index. |

* * *

### **Data Selection and Filtering**

| Function | Description |
| --- | --- |
| `df.head()` | First few rows. |
| `df.tail()` | Last few rows. |
| `df.loc[]` | Label-based indexing. |
| `df.iloc[]` | Integer-position-based indexing. |
| `df.query()` | Filter rows with a boolean expression. |
| `df.filter()` | Select columns (or rows) by label. |

* * *

### **Grouping and Aggregation**

| Function | Description |
| --- | --- |
| `df.groupby()` | Group rows by key. |
| `groupby.sum()` | Sum per group. |
| `groupby.mean()` | Mean per group. |
| `groupby.agg()` | Apply one or more aggregations per group. |
| `groupby.transform()` | Per-group transform that keeps the original shape. |

* * *

### **Math and Statistical Functions**

| Function | Description |
| --- | --- |
| `Series.sum()` | Sum. |
| `Series.mean()` | Mean. |
| `Series.median()` | Median. |
| `Series.std()` | Standard deviation. |
| `Series.var()` | Variance. |
| `Series.corr()` | Correlation coefficient with another Series. |
| `Series.quantile()` | Quantile. |
| `Series.cumsum()` | Cumulative sum. |

* * *

### **String Processing**

| Function | Description |
| --- | --- |
| `Series.str.lower()` | Convert to lowercase. |
| `Series.str.upper()` | Convert to uppercase. |
| `Series.str.strip()` | Remove leading and trailing whitespace. |
| `Series.str.replace()` | Replace a substring or pattern. |
| `Series.str.contains()` | Test whether each string contains a pattern. |
| `Series.str.split()` | Split strings. |
| `Series.str.len()` | String length. |

* * *

### **Time Series**

| Function | Description |
| --- | --- |
| `pd.date_range()` | Generate a range of dates. |
| `pd.Timestamp()` | A single timestamp. |
| `pd.Timedelta()` | A time difference. |
| `Series.dt.year` | Year. |
| `Series.dt.month` | Month. |
| `Series.dt.day` | Day. |
| `Series.dt.weekday` | Day of the week (Monday = 0). |

* * *

### **Data Reshaping**

| Function | Description |
| --- | --- |
| `df.pivot()` | Pivot (reshape without aggregation). |
| `df.pivot_table()` | Pivot table (with aggregation). |
| `df.stack()` | Move columns into the row index. |
| `df.unstack()` | Move a row-index level into columns. |
| `pd.melt()` | Reshape wide to long. |
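The reshaping functions listed above are easier to follow on a small table. The following is a minimal sketch, assuming a made-up `wide_df` of student scores (the data and all column names are hypothetical, chosen only for illustration): `pd.melt()` turns the wide table into a long one, and `df.pivot()` or `unstack()` turns it back.

```python
import pandas as pd

# Hypothetical wide table: one row per student, one column per subject
wide_df = pd.DataFrame({
    "student": ["Ann", "Bob"],
    "math": [90, 75],
    "art": [85, 95],
})

# Wide to long: each (student, subject) pair becomes its own row
long_df = pd.melt(wide_df, id_vars="student",
                  var_name="subject", value_name="score")

# Long back to wide with pivot(): subjects return as columns
by_pivot = long_df.pivot(index="student", columns="subject", values="score")

# Same reshape with unstack(): move the inner index level into columns
by_unstack = long_df.set_index(["student", "subject"])["score"].unstack()

print(long_df)
print(by_pivot)
print(by_unstack)
```

`pivot()` raises an error when an index/column pair occurs more than once; in that case `pivot_table()` with an aggregation function is the usual choice.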
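* * *

## Example

```python
import pandas as pd

# General functions: flag missing values element-wise
s = pd.Series([1, 2, 3, None])
print(pd.isna(s))

# Math: sum() skips missing values by default
print(s.sum())

# String: vectorized uppercase through the .str accessor
s_str = pd.Series(['a', 'b'])
print(s_str.str.upper())

# Time: the .dt accessor exists on a Series, not on the DatetimeIndex
# that pd.to_datetime returns for a list, so wrap the result in a Series
dates = pd.Series(pd.to_datetime(['2023-01-01']))
print(dates.dt.month)
```

* * *

For more detailed information, refer to the official pandas documentation: https://pandas.pydata.org/docs/reference/general_functions.html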
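The example above does not touch the grouping functions, so here is a minimal sketch of the difference between `groupby.agg()` and `groupby.transform()`. The `sales` table and its column names are hypothetical, used only for illustration.

```python
import pandas as pd

# Hypothetical sales records
sales = pd.DataFrame({
    "region": ["east", "east", "west", "west", "west"],
    "amount": [100, 300, 50, 150, 100],
})

grouped = sales.groupby("region")["amount"]

# agg() collapses each group to a single row per region
summary = grouped.agg(["sum", "mean"])

# transform() returns one value per original row, aligned to the input
sales["region_mean"] = grouped.transform("mean")
sales["share"] = sales["amount"] / grouped.transform("sum")

print(summary)
print(sales)
```

Because `transform()` keeps the original row count, its result can be assigned straight back as a new column, whereas the output of `agg()` would first need a merge on `region`.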