
Pandas Df Tail

`tail()` is a method of Pandas DataFrame and Series for viewing the end of a dataset. It returns the last n rows, so you can quickly inspect the final and most recent records without scanning the entire dataset.

In time series analysis, `tail()` is particularly useful because the most recent data is usually what matters: recent stock prices, sales figures, or sensor readings. It complements `head()`, and together they form the basic tools for data exploration.

* * *

## Basic Syntax and Parameters

`tail()` is a method of DataFrame and Series, called with the dot operator `.`. Its usage is identical to `head()`, except that it looks at the opposite end of the data.

### Syntax Format

```python
DataFrame.tail(n=5)
Series.tail(n=5)
```

### Parameter Description

| Parameter | Type | Required | Description | Default Value |
| --- | --- | --- | --- | --- |
| n | int | Optional | Number of rows to return, counted from the end. If n exceeds the total number of rows, all data is returned. | 5 |

### Return Value Description

* **Return Type**: When the caller is a DataFrame, it returns a DataFrame; when the caller is a Series, it returns a Series.
* **Number of Rows Returned**: At most n rows are returned. If the data has fewer than n rows, all of it is returned.
* **Index Preservation**: The returned data keeps the index labels of the original object.

* * *

## Examples

The following examples walk through the usage of `tail()`.

### Example 1: Basic Usage - View Last Few Rows of DataFrame

Create a DataFrame and use `tail()` to view the end data.
```python
import pandas as pd

# Create an example DataFrame
data = {
    'name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve', 'Frank', 'Grace', 'Henry', 'Iris', 'Jack'],
    'age': [18, 19, 17, 18, 20, 19, 18, 17, 19, 18],
    'score': [85, 92, 78, 90, 88, 95, 82, 76, 89, 91],
    'grade': ['A', 'A', 'B', 'A', 'B', 'A', 'B', 'C', 'B', 'A']
}
df = pd.DataFrame(data)

# Default returns the last 5 rows
print("Default last 5 rows:")
print(df.tail())

# Specify returning the last 3 rows
print("Last 3 rows:")
print(df.tail(3))

# Return the last 7 rows
print("Last 7 rows:")
print(df.tail(7))
```

**Output Result:**

```
Default last 5 rows:
    name  age  score  grade
5  Frank   19     95      A
6  Grace   18     82      B
7  Henry   17     76      C
8   Iris   19     89      B
9   Jack   18     91      A
Last 3 rows:
    name  age  score  grade
7  Henry   17     76      C
8   Iris   19     89      B
9   Jack   18     91      A
Last 7 rows:
    name  age  score  grade
3  David   18     90      A
4    Eve   20     88      B
5  Frank   19     95      A
6  Grace   18     82      B
7  Henry   17     76      C
8   Iris   19     89      B
9   Jack   18     91      A
```

**Code Explanation:**

1. The DataFrame has 10 rows of data with indices from 0 to 9.
2. `df.tail()` returns the last 5 rows, with indices 5 to 9.
3. `df.tail(3)` returns only the last 3 rows, with indices 7, 8, and 9.
4. The index values are those of the original DataFrame; they are not renumbered starting from 0.

### Example 2: View Last Few Rows of Series

`tail()` also works on Series objects.

```python
import pandas as pd

# Create a Series
s = pd.Series([10, 20, 30, 40, 50, 60, 70, 80, 90, 100])

# View the last 3 elements
print("Last 3 elements:")
print(s.tail(3))

# Create a Series with custom index
s2 = pd.Series([100, 200, 300, 400, 500], index=['a', 'b', 'c', 'd', 'e'])
print("Last 2 elements of indexed Series:")
print(s2.tail(2))
```

**Output Result:**

```
Last 3 elements:
7     80
8     90
9    100
dtype: int64
Last 2 elements of indexed Series:
d    400
e    500
dtype: int64
```

**Code Explanation:**

* The `tail()` method of a Series returns the last n elements while preserving the original index.
* It works the same way on a Series with a custom index, returning the elements at the end along with their labels.
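The index behaviour noted in both examples can be checked directly. The sketch below uses a small made-up Series (the name `readings` and its values are assumptions for illustration only) to show that `tail()` keeps the original labels, that `reset_index(drop=True)` can renumber them from 0 when that is wanted, and that asking for more rows than exist simply returns everything.

```python
import pandas as pd

# Hypothetical data, used only for illustration
readings = pd.Series([10, 20, 30, 40, 50])

# tail() keeps the labels of the original Series
last_two = readings.tail(2)
print(list(last_two.index))      # [3, 4]

# Renumber from 0 when positional labels are preferred
renumbered = last_two.reset_index(drop=True)
print(list(renumbered.index))    # [0, 1]

# n larger than the length returns the whole Series
print(len(readings.tail(100)))   # 5
```

Whether to renumber depends on the task: keeping the original labels makes it easy to trace rows back to the source data, while a fresh 0-based index is sometimes more convenient for display.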
### Example 3: Application in Time Series Data

When dealing with time series data, `tail()` is especially useful for quickly viewing the latest records.

```python
import pandas as pd
import numpy as np

# Create a time series DataFrame simulating stock data
np.random.seed(42)
dates = pd.date_range('2024-01-01', periods=30, freq='D')
prices = 100 + np.cumsum(np.random.randn(30))  # Simulate stock price trend
stock_df = pd.DataFrame({
    'date': dates,
    'price': prices.round(2),
    'volume': np.random.randint(1000, 10000, 30)
})

print("Comparison of first few and last few rows of full data:")
print("First 5 rows:")
print(stock_df.head())
print("Last 5 rows:")
print(stock_df.tail())

# Use tail to view data from the latest days
latest_data = stock_df.tail(7)
print("Latest 7 days' data:")
print(latest_data)

# Further analysis can be performed on the tail result
print("Average price over the last 7 days:", round(latest_data['price'].mean(), 2))
print("Total trading volume over the last 7 days:", latest_data['volume'].sum())
```

**Output Result:**

The figures below are illustrative; the numbers from an actual run will differ.

```
Comparison of first few and last few rows of full data:
First 5 rows:
        date   price  volume
0 2024-01-01  100.34    5234
1 2024-01-02   99.87    3421
2 2024-01-03  100.21    6789
3 2024-01-04   98.45    4321
4 2024-01-05   99.12    5654
Last 5 rows:
         date   price  volume
25 2024-01-26  102.45    4321
26 2024-01-27  103.12    5678
27 2024-01-28  101.89    6543
28 2024-01-29  104.23    3211
29 2024-01-30  103.87    4532
Latest 7 days' data:
         date   price  volume
23 2024-01-24  101.56    4321
24 2024-01-25  102.01    5432
25 2024-01-26  102.45    4321
26 2024-01-27  103.12    5678
27 2024-01-28  101.89    6543
28 2024-01-29  104.23    3211
29 2024-01-30  103.87    4532
Average price over the last 7 days: 102.73
Total trading volume over the last 7 days: 34038
```

**Code Explanation:**

1. A DataFrame containing 30 days of simulated stock price data is created.
2. `tail()` quickly locates the latest records.
3. Further functions can be applied to the data returned by `tail()` for calculations and analysis.
4. This pattern is very practical in real-time data analysis scenarios.

* * *

## Notes

* `tail()` does not modify the original DataFrame or Series; it returns a new object.
* When n is 0, an empty DataFrame or empty Series is returned. When n is negative, `tail()` returns all rows except the first |n| rows.
* `tail()` works on data that is already loaded in memory. Selecting the last rows is cheap, but it does not avoid reading the whole file when the DataFrame is first created.
* `tail()` and `head()` are often used together to inspect both ends of the data.

> Tip: In the exploratory phase of data analysis, use `head()` and `tail()` together to quickly check whether the data is sorted as expected and whether it looks complete.

* * *

## Summary

`tail()` is the Pandas method for viewing the end of a dataset, complementing `head()`. It is particularly useful in time series analysis, real-time data monitoring, log analysis, and similar scenarios. With `head()` and `tail()` you can quickly inspect the beginning and end of a dataset, which is a crucial first step in data exploration. Combined with methods such as `info()` and `describe()`, they give a fuller picture of the data's characteristics and distribution.
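The boundary values of `n` discussed in the Notes can be checked with a short sketch. The DataFrame `df_demo` and its column `value` are hypothetical names chosen for illustration; the negative-`n` behaviour shown follows the pandas documentation, which describes it as returning every row except the first `|n|`. The last step is one possible way to preview both ends at once with `pd.concat`, not the only one.

```python
import pandas as pd

# Hypothetical six-row DataFrame, used only for illustration
df_demo = pd.DataFrame({'value': [1, 2, 3, 4, 5, 6]})

# n = 0 gives an empty result
print(len(df_demo.tail(0)))                   # 0

# Negative n drops the first |n| rows and keeps the rest
print(df_demo.tail(-2)['value'].tolist())     # [3, 4, 5, 6]

# The original DataFrame is left unchanged
print(len(df_demo))                           # 6

# Preview both ends of the data in one object
preview = pd.concat([df_demo.head(2), df_demo.tail(2)])
print(preview['value'].tolist())              # [1, 2, 5, 6]
```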
← Pandas Df IlocPandas Df Reset Index β†’