YouTip LogoYouTip

Python Longest Substring

## Finding the Longest Common Substring in Python The **Longest Common Substring (LCS)** problem is a classic computer science challenge. Given two strings, the goal is to find the longest sequence of characters that appears in both strings in the exact same order and consecutively. Unlike *subsequences*, which do not require characters to be adjacent, *substrings* must be contiguous. In this tutorial, we will explore how to solve this problem efficiently in Python using **Dynamic Programming (DP)**. --- ## The Dynamic Programming Approach Dynamic programming solves complex problems by breaking them down into simpler subproblems. For the Longest Common Substring problem, we can define a 2D array (or table) `dp` where: * `dp` represents the length of the longest common suffix of the substrings `s1[0...i-1]` and `s2[0...j-1]`. ### State Transition Formula 1. If the characters at the current positions match (`s1 == s2`), we extend the common substring found so far: $$\text{dp} = \text{dp} + 1$$ 2. If the characters do not match (`s1 != s2`), the continuity is broken, so we reset the length: $$\text{dp} = 0$$ By keeping track of the maximum value in this `dp` table and its position, we can easily extract the longest common substring. --- ## Python Implementation Below is the complete Python implementation using the dynamic programming approach: ```python def longest_common_substring(s1: str, s2: str) -> str: m = len(s1) n = len(s2) # Create a 2D array initialized with 0s to store lengths of common substrings dp = [ * (n + 1) for _ in range(m + 1)] max_length = 0 # Keeps track of the length of the longest common substring end_pos = 0 # Keeps track of the ending index of the longest common substring in s1 # Fill the DP table for i in range(1, m + 1): for j in range(1, n + 1): if s1 == s2: dp = dp + 1 # Update the maximum length and ending position if a longer substring is found if dp > max_length: max_length = dp end_pos = i else: dp = 0 # Extract and return the longest common substring using slicing return s1[end_pos - max_length:end_pos] # Test the function s1 = "abcdef" s2 = "zbcdf" result = longest_common_substring(s1, s2) print("The longest common substring is:", result) ``` ### Output ```text The longest common substring is: bcd ``` --- ## Code Explanation 1. **DP Table Initialization**: We initialize a 2D list `dp` of size `(m + 1) x (n + 1)` with zeros. The extra row and column handle the base cases where one of the strings is empty. 2. **Nested Loops**: We iterate through each character of `s1` (index `i`) and `s2` (index `j`). 3. **Character Comparison**: * If `s1 == s2`, we add `1` to the diagonal value `dp`. * If they do not match, we set `dp` to `0`. 4. **Tracking the Maximum**: The variable `max_length` stores the length of the longest match found so far, and `end_pos` stores the ending index in `s1`. 5. **String Slicing**: Finally, we use Python's slice notation `s1[end_pos - max_length : end_pos]` to extract and return the actual substring. --- ## Complexity Analysis * **Time Complexity**: $\mathcal{O}(m \times n)$, where $m$ and $n$ are the lengths of the two strings. We use nested loops to traverse the DP table of size $(m+1) \times (n+1)$. * **Space Complexity**: $\mathcal{O}(m \times n)$ to store the 2D DP table. > **Tip:** If you only need the *length* of the longest common substring and not the substring itself, the space complexity can be optimized to $\mathcal{O}(n)$ by keeping track of only the current and previous rows of the DP table.
← Python Count KeysPython Find Character β†’