
# Pandas Datetime

Pandas provides tools for working with dates and times: converting strings to datetime types, generating date ranges, and running date-based calculations and analyses.

* * *

## Creating Date and Time

### date_range Function

```python
import pandas as pd

# Daily range: 10 consecutive days
dates = pd.date_range("2024-01-01", periods=10, freq="D")
print("Daily:")
print(dates)
print()

# Month-end range. pandas 2.2+ spells this alias "ME"; older versions use "M".
dates_month = pd.date_range("2024-01-01", periods=12, freq="ME")
print("Monthly:")
print(dates_month)
print()

# Hourly range. pandas 2.2+ spells this alias "h"; older versions use "H".
dates_hour = pd.date_range("2024-01-01 00:00", periods=24, freq="h")
print("Hourly (first 5):")
print(dates_hour[:5])
```

### DatetimeIndex and DatetimeArray

```python
import pandas as pd

# Build a DatetimeIndex from Timestamp objects
dt = pd.DatetimeIndex([
    pd.Timestamp("2024-01-01"),
    pd.Timestamp("2024-01-02"),
    pd.Timestamp("2024-01-03"),
])
print("DatetimeIndex:")
print(dt)
print()

# Create a Series with a time index
s = pd.Series(
    [100, 200, 300],
    index=pd.date_range("2024-01-01", periods=3, freq="D"),
)
print("Series with date index:")
print(s)
```

* * *

## Converting Strings to Dates

### pd.to_datetime

```python
import pandas as pd

# Dates written in several different string formats
dates_str = ["2024-01-01", "2024/01/02", "01-03-2024", "20240104"]

# Parse each element on its own. format="mixed" requires pandas 2.0+;
# without it, pandas 2.x infers one format from the first element and
# raises on strings that do not match. "01-03-2024" is read month-first.
dt = pd.to_datetime(dates_str, format="mixed")
print("Format inferred per element:")
print(dt)
print()

# Specify the format when every string follows the same layout
dt2 = pd.to_datetime(["2024-01-01", "2024-01-02"], format="%Y-%m-%d")
print("Specified format:")
print(dt2)
print()

# Handle invalid values: errors="coerce" turns them into NaT
dt3 = pd.to_datetime(["2024-01-01", "invalid", "2024-01-03"], errors="coerce")
print("Handle invalid values (converted to NaT):")
print(dt3)
```

### read_csv Automatic Date Parsing

```python
import pandas as pd
from io import StringIO

# Simulate CSV data
csv_data = """Date,Sales
2024-01-01,100
2024-01-02,200
2024-01-03,150
"""

# Method 1: Read, then convert the column
df = pd.read_csv(StringIO(csv_data))
df["Date"] = pd.to_datetime(df["Date"])
print("Convert after reading:")
print(df.dtypes)
print()

# Method 2: Parse during reading
df2 = pd.read_csv(StringIO(csv_data), parse_dates=["Date"])
print("Parse during reading:")
print(df2.dtypes)
```

* * *

## Accessing Date Properties

Once a Series has a datetime dtype, the `.dt` accessor exposes its date and time components.

```python
import pandas as pd

# Create a date Series
s = pd.Series(pd.date_range("2024-01-15", periods=5, freq="D"))
print("Date Series:")
print(s)
print()

# Extract year/month/day
print("Extract year:")
print(s.dt.year)
print()
print("Extract month:")
print(s.dt.month)
print()
print("Extract day:")
print(s.dt.day)
print()

# Extract day of week (0=Monday, 6=Sunday)
print("Day of week (number):")
print(s.dt.dayofweek)
print()

# Extract day name
print("Day name:")
print(s.dt.day_name())
```

### More Date Properties

| Property | Description | Example |
| --- | --- | --- |
| `year` | Year | 2024 |
| `month` | Month (1-12) | 1 |
| `day` | Day (1-31) | 15 |
| `hour` | Hour (0-23) | 10 |
| `minute` | Minute (0-59) | 30 |
| `dayofweek` | Day of week (0=Monday, 6=Sunday) | 0 |
| `quarter` | Quarter (1-4) | 1 |
| `is_month_start` | Is first day of month | True/False |
| `is_month_end` | Is last day of month | True/False |

* * *

## Date Operations

### Date Addition and Subtraction

```python
import pandas as pd

# Create a date
date = pd.Timestamp("2024-01-15")
print(f"Base date: {date}")
print()

# Add/subtract days
print(f"+3 days: {date + pd.Timedelta(days=3)}")
print(f"-5 days: {date - pd.Timedelta(days=5)}")
print()

# Date difference (a Timedelta)
date1 = pd.Timestamp("2024-01-01")
date2 = pd.Timestamp("2024-01-15")
print(f"Date difference: {date2 - date1}")
print(f"Days difference: {(date2 - date1).days}")
print()

# The same arithmetic applies element-wise to a whole date range
dates = pd.date_range("2024-01-01", periods=5, freq="D")
print("Date + 3 days:")
print(dates + pd.Timedelta(days=3))
```

### Date Offsets

```python
import pandas as pd

date = pd.Timestamp("2024-01-15")
print(f"Base date: {date}")
print()

# Offsets roll forward to the next matching boundary:
# MonthBegin(1) gives the start of the NEXT month (2024-02-01),
# MonthEnd(1) gives the end of the CURRENT month (2024-01-31).
print(f"Start of next month: {date + pd.offsets.MonthBegin(1)}")
print(f"End of current month: {date + pd.offsets.MonthEnd(1)}")
print()

# Calendar-aware year and month offsets
print(f"Add 1 year: {date + pd.DateOffset(years=1)}")
print(f"Subtract 1 month: {date + pd.DateOffset(months=-1)}")
print()

# Weekday anchor (0=Monday)
print(f"Next Monday: {date + pd.offsets.Week(weekday=0)}")
```

* * *

## Timezone Handling

```python
import pandas as pd

# Create timezone-naive timestamps ("h" is the pandas 2.2+ spelling of "H")
dates = pd.date_range("2024-01-01 10:00", periods=3, freq="h")
print("Without timezone:")
print(dates)
print()

# Attach a timezone to naive timestamps
dates_utc = dates.tz_localize("UTC")
print("Set UTC timezone:")
print(dates_utc)
print()

# Convert to another timezone
dates_shanghai = dates_utc.tz_convert("Asia/Shanghai")
print("Converted to Shanghai timezone:")
print(dates_shanghai)
```

* * *

## Practical Example: Sales Data Analysis

```python
import pandas as pd

# Simulate 30 days of sales data
df = pd.DataFrame({
    "Date": pd.date_range("2024-01-01", periods=30, freq="D"),
    "Sales": [100, 150, 120, 180, 200, 90, 80] * 4 + [100, 100],
})
# Already datetime here; real data usually arrives as strings
df["Date"] = pd.to_datetime(df["Date"])
print("Sales data:")
print(df.head(10))
print()

# Group by weekday
print("Average sales by weekday:")
weekday_sales = df.groupby(df["Date"].dt.day_name())["Sales"].mean()
print(weekday_sales)
print()

# Group by month
print("Monthly summary:")
df["Month"] = df["Date"].dt.to_period("M")
monthly_sales = df.groupby("Month")["Sales"].sum()
print(monthly_sales)
print()

# Calculate 7-day rolling average (column name chosen here for illustration)
df["Sales_7d_avg"] = df["Sales"].rolling(window=7).mean()
print("Added 7-day rolling average:")
print(df.head(10))
```

* * *

## Common Issues

**1. Inconsistent Date Formats**

When using `pd.to_datetime`, pass the `format` parameter to state the expected layout explicitly and avoid parsing errors or misread dates.

**2. Timezone Confusion**

When handling cross-timezone data, make sure all times share one timezone, or convert everything to UTC.

**3. Date Operations Using Timedelta**

Use `pd.Timedelta` for date addition and subtraction instead of plain integer arithmetic.

> When working with time series data, convert date strings to datetime types early so that Pandas' date features are available for the rest of the analysis.
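
The first two issues above can be sketched in a few lines. This is a minimal illustration rather than a complete cleaning routine: the sample strings and the variable names `raw`, `bad_rows`, and `mixed_tz` are made up for the example. It pairs an explicit `format` with `errors="coerce"` to surface strings that do not match the expected layout, and it uses `utc=True` to bring timestamps with different UTC offsets onto a single timezone.

```python
import pandas as pd

# Hypothetical raw input: the last string does not follow the expected layout
raw = pd.Series(["2024-01-01", "2024-01-02", "03/01/2024"])

# An explicit format plus errors="coerce" turns non-matching strings into NaT
# instead of guessing at their meaning
parsed = pd.to_datetime(raw, format="%Y-%m-%d", errors="coerce")

# Rows that failed to parse can then be inspected or fixed separately
bad_rows = raw[parsed.isna()]
print("Rows that did not match the format:")
print(bad_rows)
print()

# Hypothetical timestamps carrying different UTC offsets
mixed_tz = ["2024-01-01 10:00+08:00", "2024-01-01 03:00+01:00"]

# utc=True converts every value to UTC, so the results are directly comparable
utc_times = pd.to_datetime(mixed_tz, utc=True)
print("Normalized to UTC:")
print(utc_times)
```

Both sample timestamps in `mixed_tz` describe the same instant, which only becomes obvious after they are normalized to UTC.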