Sorting Algorithms
A sorting algorithm (English: Sorting algorithm) is an algorithm that can sort a set of data according to a specific sorting method. After sorting, the data can be placed in an ordered array.
Sorting refers to the process of rearranging a set of data (such as an array or list) in a specific order (ascending or descending).
The basis for sorting is usually a certain **key** of the data elements, such as the size of numbers or the lexicographical order of strings.
Sorting is one of the most fundamental and core algorithmic problems in computer science. Whether it's organizing a contact list, analyzing exam scores, or optimizing database queries, sorting algorithms are ubiquitous.
Understanding different sorting algorithms not only helps us solve practical problems but also provides an excellent way to delve into the design and analysis of algorithms.
Classification of Sorting Algorithms
Sorting algorithms can be classified from multiple dimensions. The most common classification is based on their **time complexity** and **space complexity**.
| Classification Dimension | Category | Description | Typical Algorithms |
|---|---|---|---|
| Time Complexity | O(nΒ²) Level | Simple and intuitive, but low efficiency, suitable for small-scale data. | Bubble Sort, Selection Sort, Insertion Sort |
| O(n log n) Level | High efficiency, preferred for handling large-scale data. | Quick Sort, Merge Sort, Heap Sort | |
| Space Complexity | In-place Sorting | Only uses constant extra space during sorting. | Bubble, Selection, Insertion, Shell, Heap Sort, Quick Sort |
| Non-in-place Sorting | Requires additional space comparable to the size of the data during sorting. | Merge Sort, Counting Sort, Bucket Sort, Radix Sort | |
| Stability | Stable Sorting | Equal elements maintain their relative order after sorting. | Bubble, Insertion, Merge, Counting, Bucket, Radix Sort |
| Unstable Sorting | Equal elements may change their relative order after sorting. | Selection, Shell, Heap, Quick Sort |
Key Concept Explanations:
- Time Complexity: Measures how the execution time of an algorithm grows with the increase in data size.
- Space Complexity: Measures the amount of additional storage space required by the algorithm during execution.
- Stability: For elements with equal values, whether their relative order remains unchanged after sorting. This is very important in multi-keyword sorting.
The following flowchart shows how to make an initial choice among several classic algorithms based on different requirements:
O(nΒ²) Level Basic Sorting Algorithms
These algorithms have simple ideas and are the starting point for understanding sorting logic.
Bubble Sort
Core Idea: Repeatedly "traverse" the sequence to be sorted, compare two adjacent elements each time, and swap them if they are in the wrong order. Each traversal will bubble the largest (or smallest) element in the unsorted part to its correct position.
Algorithm Steps:
- Compare adjacent elements. If the first is greater than the second (ascending), swap them.
- Repeat this process for every pair of adjacent elements, from the start to the end. After this step, the last element will be the largest number.
- Repeat the above steps for all elements, except the last one (since each round determines one final element).
- Continue repeating the above steps for fewer and fewer elements until no more pairs need to be compared.
Code Implementation:
## Instance
def bubble_sort(arr):
"""
Bubble Sort (Ascending)
:param arr: List to be sorted
"""
n = len(arr)
# Outer loop controls the number of passes (n-1 passes)
for i in range(n - 1):
# Inner loop compares and swaps adjacent elements
# After each pass, the last i+1 elements are sorted, so the comparison range is 0 to n-1-i
for j in range(0, n - 1 - i):
if arr > arr[j + 1]: # If the previous element is larger
# Swap the positions of the two elements
arr, arr[j + 1] = arr[j + 1], arr
return arr
# Test Data
test_data = [64, 34, 25, 12, 22, 11, 90]
print("Original Array:", test_data)
sorted_data = bubble_sort(test_data.copy()) # Use a copy to avoid modifying the original data
print("After Bubble Sort:", sorted_data)
Output:
Original Array: [64, 34, 25, 12, 22, 11, 90] After Bubble Sort: [11, 12, 22, 25, 34, 64, 90]
Performance Analysis:
- Time Complexity: Best case O(n) (when already sorted, can be optimized), worst and average case O(nΒ²).
- Space Complexity: O(1), in-place sorting.
- Stability: Stable.
Selection Sort
Core Idea: In the unsorted sequence, find the minimum (or maximum) element and place it at the beginning of the sorted sequence. Then continue finding the minimum (maximum) element from the remaining unsorted elements and place it at the end of the sorted sequence. Repeat this process until all elements are sorted.
Algorithm Steps:
- Initial state: The entire sequence is unsorted.
- At the i-th round (i starts from 0), the unsorted region is `arr[i...n-1]`. Find the record with the smallest key in this region, `arr`.
- Swap `arr` with the first record in the unsorted region, `arr`. Now, `arr[0...i]` forms a sorted region.
- After n-1 rounds, the array is sorted.
Code Implementation:
## Instance
def selection_sort(arr):
"""
Selection Sort (Ascending)
:param arr: List to be sorted
"""
n = len(arr)
for i in range(n):
# Assume the element at index i is the smallest
min_index = i
# Search for a smaller element in the range from i+1 to n-1
for j in range(i + 1, n):
if arr < arr:
min_index = j # Update the index of the smallest element
# Swap the found smallest element with the element at position i
arr, arr = arr, arr
return arr
# Test
test_data = [64, 25, 12, 22, 11]
print("Original Array:", test_data)
print("After Selection Sort:", selection_sort(test_data.copy()))
Output:
Original Array: [64, 25, 12, 22, 11] After Selection Sort: [11, 12, 22, 25, 64]
Performance Analysis:
- Time Complexity: Always O(nΒ²), because regardless of whether the data is sorted, complete comparisons are needed.
- Space Complexity: O(1), in-place sorting.
- Stability: **Unstable**. For example, in the sequence `[5, 8, 5, 2, 9]`, the first `5` and `2` will be swapped in the first round, breaking the original order of the two `5`s.
Insertion Sort
Core Idea: By building an ordered sequence, take the next element and scan backward through the already sorted elements to find the appropriate position and insert it. This is similar to how we organize playing cards.
Algorithm Steps:
- Treat the first element as the sorted sequence.
- Take the next element and scan backward through the sorted elements.
- If the current element (already sorted) is greater than the new element, move it to the next position.
- Repeat step 3 until the correct position for the new element is found.
- Insert the new element into the found position.
- Repeat steps 2-5 until all elements are processed.
Code Implementation:
## Instance
def insertion_sort(arr):
"""
Insertion Sort (Ascending)
:param arr: List to be sorted
"""
n = len(arr)
# Start from the second element (index 1), as the first element is considered sorted
for i in range(1, n):
key = arr # Current element to be inserted
j = i - 1 # Index of the last element in the sorted sequence
# Scan backward through the sorted sequence to find the insertion position for key
# At the same time, move elements larger than key one position to the right
while j >= 0 and key < arr:
arr[j + 1] = arr
j -= 1
# Insert key into the correct position
arr[j + 1] = key
return arr
# Test
test_data = [12, 11, 13, 5, 6]
print("Original Array:", test_data)
print("After Insertion Sort:", insertion_sort(test_data.copy()))
Output:
Original Array: [12, 11, 13, 5, 6] After Insertion Sort: [5, 6, 11, 12, 13]
Performance Analysis:
- Time Complexity: Best case O(n) (already sorted), worst and average case O(nΒ²).
- Space Complexity: O(1), in-place sorting.
- Stability: Stable.
- Features: Very efficient for small-scale or nearly sorted data. It is the algorithm used in advanced sorting algorithms (like Timsort) when dealing with small data sizes.
O(n log n) Level Efficient Sorting Algorithms
When the data size increases, O(nΒ²) algorithms become very slow. The following more efficient algorithms are the mainstays in engineering practice.
Quick Sort
Core Idea: **Divide and Conquer**. Choose a pivot element, partition the records to be sorted into two independent parts through one pass of sorting, where the keys of one part are all less than those of the other part, and then recursively sort these two parts to achieve a fully sorted sequence.
Algorithm Steps:
- Select Pivot: Pick an element from the sequence as the "pivot".
- Partition Operation: Rearrange the sequence so that all elements smaller than the pivot are placed before it, and all elements larger than the pivot are placed after it (elements equal to the pivot can go on either side). After this partitioning, the pivot is in its correct position in the sequence. This is called the partition operation.
- Recursive Sorting: Recursively sort the subarrays before and after the pivot.
Code Implementation:
## Instance
def quick_sort(arr):
"""
Quick Sort (Ascending) - Main Function
"""
def _quick_sort(arr, low, high):
if low < high:
# pi is the partition index, arr is now in the correct position
pi = partition(arr, low, high)
# Recursively sort the subarrays before and after the partition
_quick_sort(arr, low, pi - 1)
_quick_sort(arr, pi + 1, high)
def partition(arr, low, high):
"""
Partition function, choose the last element as the pivot
:return: Index of the pivot's correct final position
"""
pivot = arr # Choose pivot
i = low - 1 # Points to the end of the subarray with elements less than the pivot
for j in range(low, high):
# If the current element is less than or equal to the pivot
if arr <= pivot:
i += 1
arr, arr = arr, arr # Swap
# Place the pivot in its correct position (i+1)
arr[i + 1], arr = arr, arr[i + 1]
return i + 1
_quick_sort(arr, 0, len(arr) - 1)
return arr
# Test
test_data = [10, 80, 30, 90, 40, 50, 70]
print("Original Array:", test_data)
print("After Quick Sort:", quick_sort(test_data.copy()))
Output:
Original Array: [10, 80, 30, 90, 40, 50, 70] After Quick Sort: [10, 30, 40, 50, 70, 80, 90]
Performance Analysis:
- Time Complexity: Average case O(n log n), worst case O(nΒ²) (when the array is already sorted or reversed, and the pivot is poorly chosen). Random pivot selection or "median-of-three" method can greatly avoid the worst case.
- Space Complexity: Average O(log n) (recursive call stack), worst O(n).
- Stability: **Unstable**.
Merge Sort
Core Idea: A typical application of **divide and conquer**. Merge already sorted sub-sequences to obtain a completely sorted sequence. That is, first make each sub-sequence sorted, then merge the sub-sequences.
Algorithm Steps:
- Divide: Split the current interval into two halves, i.e., find the midpoint `mid = (low + high)/2`.
- Solve: Recursively sort the two sub-intervals `arr[low...mid]` and `arr[mid+1...high]`. The recursive termination condition is when the sub-interval length is 1 (considered naturally sorted).
- Merge: Merge the two already sorted sub-arrays into one sorted array.
Code Implementation:
## Instance
def merge_sort(arr):
"""
Merge Sort (Ascending) - Main Function
"""
if len(arr) > 1:
mid = len(arr) // 2 # Find the middle point to split the array
left_half = arr[:mid]
right_half = arr[mid:]
# Recursively sort the left and right halves
merge_sort(left_half)
merge_sort(right_half)
# Merge the two sorted sub-arrays
i = j = k = 0
# Compare elements from both sub-arrays and put the smaller one into the original array
while i < len(left_half) and j < len(right_half):
if left_half < right_half:
arr = left_half
i += 1
else:
arr = right_half
j += 1
k += 1
# Check for any remaining elements (left half or right half)
while i < len(left_half):
arr = left_half
i += 1
k += 1
while j < len(right_half):
arr
YouTip