YouTip LogoYouTip

Python Anagram Check

## Python Anagram Check: How to Determine if Two Strings are Anagrams An **anagram** is a word or phrase formed by rearranging the letters of a different word or phrase, typically using all the original letters exactly once. For example, the words **"listen"** and **"silent"** are anagrams of each other. In Python, we can determine if two strings are anagrams by comparing their character compositions. This tutorial covers the most common and efficient ways to perform an anagram check in Python. --- ## Method 1: Using the Sorting Approach The most intuitive way to check if two strings are anagrams is to sort both strings and compare them. If they contain the exact same characters with the exact same frequencies, their sorted representations will be identical. ### Code Example ```python def is_anagram(str1, str2): # Remove whitespace and convert strings to lowercase str1 = str1.replace(" ", "").lower() str2 = str2.replace(" ", "").lower() # If lengths are different, they cannot be anagrams if len(str1) != len(str2): return False # Sort the characters of both strings and compare return sorted(str1) == sorted(str2) # Test cases print(is_anagram("listen", "silent")) # Output: True print(is_anagram("triangle", "integral")) # Output: True print(is_anagram("apple", "pale")) # Output: False ``` ### Code Explanation 1. **Normalization**: `str1.replace(" ", "").lower()` removes any spaces and converts the string to lowercase. This ensures that the comparison is case-insensitive and ignores spacing (which is useful for multi-word anagrams like "New York" and "nowhere"). 2. **Length Check**: `if len(str1) != len(str2):` acts as a guard clause. If the two processed strings do not have the same length, they cannot be anagrams, allowing the function to return `False` early. 3. **Sorting and Comparison**: `sorted(str1) == sorted(str2)` converts the strings into sorted lists of characters (e.g., `"listen"` becomes `['e', 'i', 'l', 'n', 's', 't']`). If the sorted lists are identical, the function returns `True`. ### Output ```text True True False ``` --- ## Method 2: Using `collections.Counter` (Optimal Performance) While the sorting method is simple and clean, its time complexity is $O(N \log N)$ due to the sorting step. For a more efficient $O(N)$ linear time complexity solution, you can use Python's built-in `Counter` class from the `collections` module. This counts the frequency of each character in both strings and compares the counts. ### Code Example ```python from collections import Counter def is_anagram_optimized(str1, str2): # Remove whitespace and convert to lowercase str1 = str1.replace(" ", "").lower() str2 = str2.replace(" ", "").lower() # Compare character frequencies return Counter(str1) == Counter(str2) # Test cases print(is_anagram_optimized("astronomer", "moon starer")) # Output: True print(is_anagram_optimized("hello", "billion")) # Output: False ``` ### Why use `Counter`? * **Time Complexity**: $O(N)$ because it only requires a single pass through each string to count character occurrences. * **Space Complexity**: $O(K)$ where $K$ is the number of unique characters in the strings. --- ## Considerations & Best Practices When implementing an anagram check in real-world applications, consider the following edge cases: * **Case Sensitivity**: Decide whether "Listen" and "Silent" should be considered anagrams. Standard practice is to convert both strings to lowercase using `.lower()` or `.casefold()` before comparison. * **Whitespace and Punctuation**: Phrases like "Dormitory" and "Dirty room" are anagrams. You should strip out spaces and punctuation marks if you want to support multi-word phrase anagrams. * **Character Encoding**: If you are dealing with Unicode characters (e.g., accented letters like `Γ©` or non-Latin scripts), ensure proper normalization using the `unicodedata` module if necessary.
← Python Find SubstringPython List Squares β†’