pd.notna() is a function in the Pandas library that detects non-missing values. It checks each element of the input object and returns a boolean result: True if the element is present, False if it is missing.
It is one of the most commonly used functions in data cleaning, because it quickly identifies the valid values in a dataset before further processing and analysis.

Word breakdown: "not" means negation and "na" stands for "not available", so together notna means "not a missing value".

Basic Syntax and Parameters

pd.notna() is a top-level function in the Pandas library. It can be called directly as pd.notna(obj), or through the .notna() method of a Series or DataFrame.

Syntax Format

pd.notna(obj)

Parameter Description

- Parameter: obj
  - Type: Any Python object, such as a Series, DataFrame, list, array, or scalar value.
  - Description: The object to check for non-missing values. Missing values in Pandas include None, NaN, NaT (the missing value for datetime-like types), and pandas.NA.

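As a quick check of the description above, the following minimal sketch passes each of the four missing-value markers to pd.notna(), and then a few values that may look "empty" but are ordinary values. The dictionary names (markers, ordinary) are illustrative only.

```python
import numpy as np
import pandas as pd

# The four missing-value markers named in the parameter description
markers = {
    "None": None,
    "np.nan": np.nan,
    "pd.NaT": pd.NaT,
    "pd.NA": pd.NA,
}
marker_results = {label: pd.notna(value) for label, value in markers.items()}
print(marker_results)  # every entry is False

# Values that look "empty" but are not missing-value markers
ordinary = {"empty string": "", "zero": 0, "False": False}
ordinary_results = {label: pd.notna(value) for label, value in ordinary.items()}
print(ordinary_results)  # every entry is True
```

An empty string or a zero therefore has to be handled separately if your data uses it to stand for "no value".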
Function Description

- Return Value: A boolean object with the same shape as the input. For a Series or DataFrame, the result is an object of the same type; for a list or NumPy array, it is a NumPy boolean array; for a scalar, it is a single boolean value.
- Effect: Returns True for non-missing values and False for missing values.

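The relationship between input type and return type can be sketched as follows. The variable names are illustrative, and the inputs are small made-up values chosen only to show the shape of each result.

```python
import numpy as np
import pandas as pd

scalar_result = pd.notna(np.nan)                             # single boolean
array_result = pd.notna(np.array([1.0, np.nan]))             # NumPy boolean array
series_result = pd.notna(pd.Series([1.0, np.nan]))           # boolean Series
frame_result = pd.notna(pd.DataFrame({"a": [1.0, np.nan]}))  # boolean DataFrame

# Print the type name of each result
for label, result in [
    ("scalar", scalar_result),
    ("array", array_result),
    ("Series", series_result),
    ("DataFrame", frame_result),
]:
    print(f"{label}: {type(result).__name__}")
```

Because the Series and DataFrame results keep the original index and columns, they can be used directly as masks on the object they were computed from.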
Examples

Let's go through a series of examples, from simple to complex, to cover the usage of pd.notna().

Example 1: Basic Usage - Detecting Missing Values in Scalars and Lists

Example

import pandas as pd
import numpy as np

# 1. Check whether scalar values are non-missing
print("=== Scalar Detection ===")
print(f"pd.notna(10): {pd.notna(10)}")  # Normal number → True
print(f"pd.notna('tutorial'): {pd.notna('tutorial')}")  # Normal string → True
print(f"pd.notna(None): {pd.notna(None)}")  # None → False
print(f"pd.notna(np.nan): {pd.notna(np.nan)}")  # np.nan → False

# 2. Check for missing values in a list
print("\n=== List Detection ===")
data_list = [1, 2, np.nan, 'tutorial', None, 5]
result = pd.notna(data_list)  # NumPy boolean array
print(f"Original list: {data_list}")
print(f"Detection result: {result.tolist()}")  # Convert to a list for easy viewing

Expected Output:

=== Scalar Detection ===
pd.notna(10): True
pd.notna('tutorial'): True
pd.notna(None): False
pd.notna(np.nan): False

=== List Detection ===
Original list: [1, 2, nan, 'tutorial', None, 5]
Detection result: [True, True, False, True, False, True]

Code Explanation:

- pd.notna(10) and pd.notna('tutorial') check an ordinary number and an ordinary string, returning True.
- pd.notna(None) and pd.notna(np.nan) check the missing-value markers of Python and NumPy, returning False.
- Calling pd.notna() on a list returns a NumPy boolean array in which missing values are marked as False; .tolist() converts it to a plain Python list.

Example 2: Detecting Missing Values in a Series

When working with Series data, pd.notna() can quickly locate valid data.

Example

import pandas as pd
import numpy as np

# Create a Series containing missing values
s = pd.Series([1, 2, np.nan, 4, None, 'tutorial', np.nan])
print("=== Original Series ===")
print(s)
print(f"\nType: {type(s).__name__}")

# Use pd.notna() to detect
print("\n=== pd.notna() Detection Result ===")
result = pd.notna(s)
print(result)

# Use the Series' notna() method (equivalent effect)
print("\n=== s.notna() Method Detection ===")
print(s.notna())

# Filter out non-missing values
print("\n=== Filtering Non-Missing Values ===")
print(s[s.notna()])

Expected Output:

=== Original Series ===
0           1
1           2
2         NaN
3           4
4        None
5    tutorial
6         NaN
dtype: object

Type: Series

=== pd.notna() Detection Result ===
0     True
1     True
2    False
3     True
4    False
5     True
6    False
dtype: bool

=== s.notna() Method Detection ===
0     True
1     True
2    False
3     True
4    False
5     True
6    False
dtype: bool

=== Filtering Non-Missing Values ===
0           1
1           2
3           4
5    tutorial
dtype: object

Code Explanation:

- Both pd.notna(s) and s.notna() detect missing values in a Series and return a boolean Series.
- Boolean indexing with s[s.notna()] keeps only the non-missing elements, which is very useful in data cleaning.

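For comparison, the sketch below puts the boolean-mask filter next to Series.dropna(), a built-in pandas method that removes missing entries. It reuses the Series from Example 2; the names filtered and dropped are illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2, np.nan, 4, None, 'tutorial', np.nan])

filtered = s[s.notna()]  # boolean-mask filtering, as in Example 2
dropped = s.dropna()     # built-in method that drops missing entries

print(filtered.equals(dropped))  # True
print(filtered.index.tolist())   # [0, 1, 3, 5]
```

The explicit mask is the more flexible form: the same s.notna() result can be combined with other conditions or counted with .sum(), whereas dropna() only removes entries.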
Example 3: Detecting Missing Values in a DataFrame

When handling tabular data, pd.notna() can quickly give you an overview of data completeness.

Example

import pandas as pd
import numpy as np

# Create a DataFrame with missing values
df = pd.DataFrame({
    'name': ['Alice', 'Bob', None, 'Diana'],
    'age': [25, np.nan, 30, 28],
    'score': [85, 90, np.nan, 95]
})
print("=== Original DataFrame ===")
print(df)

# Detect the entire DataFrame
print("\n=== pd.notna() Detection Result ===")
print(pd.notna(df))

# Count non-missing values per column
print("\n=== Number of Non-Missing Values Per Column ===")
print(df.notna().sum())

# Count non-missing values per row
print("\n=== Number of Non-Missing Values Per Row ===")
print(df.notna().sum(axis=1))

# Calculate missing value ratio
print("\n=== Missing Value Ratio ===")
missing_ratio = df.isna().mean()
print(missing_ratio)

Expected Output:

=== Original DataFrame ===
    name   age  score
0  Alice  25.0   85.0
1    Bob   NaN   90.0
2   None  30.0    NaN
3  Diana  28.0   95.0

=== pd.notna() Detection Result ===
    name    age  score
0   True   True   True
1   True  False   True
2  False   True  False
3   True   True   True

=== Number of Non-Missing Values Per Column ===
name     3
age      3
score    3
dtype: int64

=== Number of Non-Missing Values Per Row ===
0    3
1    2
2    1
3    3
dtype: int64

=== Missing Value Ratio ===
name     0.25
age      0.25
score    0.25
dtype: float64

Code Explanation:

- pd.notna(df) returns a boolean DataFrame with the same shape as the original, indicating whether each position holds a non-missing value.
- df.notna().sum() counts the non-missing values per column; passing axis=1 counts them per row instead.
- df.isna().mean() calculates the proportion of missing values per column, which helps assess data quality.

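Beyond counting, the boolean result can also be used to select rows. The following is a minimal sketch using the DataFrame from Example 3; the variable names (has_age, complete_rows, has_name_and_score) are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bob', None, 'Diana'],
    'age': [25, np.nan, 30, 28],
    'score': [85, 90, np.nan, 95],
})

# Rows where one specific column is present
has_age = df[df['age'].notna()]

# Rows where every column is present
complete_rows = df[df.notna().all(axis=1)]

# Rows where both 'name' and 'score' are present
has_name_and_score = df[df['name'].notna() & df['score'].notna()]

print(has_age.index.tolist())             # [0, 2, 3]
print(complete_rows.index.tolist())       # [0, 3]
print(has_name_and_score.index.tolist())  # [0, 1, 3]
```

Combining column masks with & keeps the condition explicit, which is helpful when only some columns are required for a later analysis step.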
Tip: pd.notna() and pd.isna() are opposite functions. Where pd.notna() returns True, pd.isna() returns False, and vice versa.
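
The complementary relationship in the tip can be verified directly. This is a small sketch with a made-up Series; valid_mask and missing_mask are illustrative names.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0, None])

valid_mask = pd.notna(s)
missing_mask = pd.isna(s)

# The two masks are element-wise complements of each other
print(valid_mask.equals(~missing_mask))  # True

# Together they account for every element exactly once
print(int(valid_mask.sum()) + int(missing_mask.sum()) == len(s))  # True
```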