Python Statistics
## Python 3.x statistics Module
Statistics is a core tool in data analysis and scientific computing.
Python provides a built-in `statistics` module for basic statistical calculations. This article introduces the module's functions and their usage so that beginners can quickly start using it for basic statistical analysis.
The `statistics` module provides many commonly used statistical functions, including the mean, median, variance, and standard deviation.
To use these functions, first import the module:
import statistics
View the contents of the statistics module:
>>> import statistics
>>> dir(statistics)
['Counter','Decimal','Fraction','NormalDist','StatisticsError','__all__','__builtins__','__cached__','__doc__','__file__','__loader__','__name__','__package__','__spec__','_coerce','_convert','_exact_ratio','_fail_neg','_find_lteq','_find_rteq','_isfinite','_normal_dist_inv_cdf','_ss','_sum','bisect_left','bisect_right','erf','exp','fabs','fmean','fsum','geometric_mean','groupby','harmonic_mean','hypot','itemgetter','log','math','mean','median','median_grouped','median_high','median_low','mode','multimode','numbers','pstdev','pvariance','quantiles','random','sqrt','stdev','tau','variance']
* * *
## Common Statistical Functions
### Mean
The mean is the average value of all numbers in a dataset. The `statistics` module provides the `mean()` function to calculate the mean.
## Example
data = [1, 2, 3, 4, 5]
mean_value = statistics.mean(data)
print("Mean:", mean_value)
Output:
Mean: 3
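The result above prints as `3` rather than `3.0` because `mean()` tries to preserve the input type. As a minimal sketch, the snippet below contrasts it with `fmean()` (listed in the `dir()` output above and available from Python 3.8), which always works in floats. The variable names are illustrative only.

```python
import statistics
from fractions import Fraction

data = [1, 2, 3, 4, 5]

# mean() keeps an exact integer result as an int
exact_mean = statistics.mean(data)

# fmean() converts the data to float and always returns a float
float_mean = statistics.fmean(data)

# mean() also accepts Fraction values and keeps the result exact
fraction_mean = statistics.mean([Fraction(1, 3), Fraction(2, 3)])

print(exact_mean)     # 3
print(float_mean)     # 3.0
print(fraction_mean)  # 1/2
```

`fmean()` is usually the faster choice when exact types do not matter.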
### Median
The median is the value at the middle position when a dataset is arranged in order. The `statistics` module provides the `median()` function to calculate the median.
## Example
data = [1, 2, 3, 4, 5]
median_value = statistics.median(data)
print("Median:", median_value)
Output:
Median: 3
If the length of the dataset is even, the `median()` function will automatically calculate the average of the two middle numbers.
## Example
data = [1, 2, 3, 4]
median_value = statistics.median(data)
print("Median:", median_value)
Output:
Median: 2.5
### Mode
The mode is the value that appears most frequently in a dataset. The `statistics` module provides the `mode()` function to calculate the mode.
## Example
data = [1, 2, 2, 3, 4]
mode_value = statistics.mode(data)
print("Mode:", mode_value)
Output:
Mode: 2
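In Python 3.8 and later, if several values are tied for most frequent, `mode()` returns the first one encountered, so a dataset with no duplicate values simply returns its first element. A `StatisticsError` exception is raised only when the dataset is empty. (Before Python 3.8, `mode()` raised `StatisticsError` whenever there was no single most frequent value.)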
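How `mode()` handles ties and empty input can be checked directly. The following is a small sketch that assumes Python 3.8 or later; it also uses `multimode()`, which appears in the `dir()` listing above. The sample data is made up for illustration.

```python
import statistics

# Two values tie for most frequent: mode() returns the first one seen
tied = [1, 1, 2, 2, 3]
first_mode = statistics.mode(tied)

# multimode() returns every most-frequent value, in order of first appearance
all_modes = statistics.multimode(tied)

# An empty dataset is the case that raises StatisticsError
try:
    statistics.mode([])
    empty_raised = False
except statistics.StatisticsError:
    empty_raised = True

print(first_mode)    # 1
print(all_modes)     # [1, 2]
print(empty_raised)  # True
```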
### Variance
Variance is a measure of how spread out the values in a dataset are. The `statistics` module provides the `variance()` function to calculate the sample variance (dividing by n - 1); use `pvariance()` for the population variance (dividing by n).
## Example
data = [1, 2, 3, 4, 5]
variance_value = statistics.variance(data)
print("Variance:", variance_value)
Output:
Variance: 2.5
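To see where the value 2.5 comes from, the calculation can be reproduced by hand. This is a sketch for illustration, not how the module is implemented internally; it also shows the population variant `pvariance()`, which divides by n instead of n - 1.

```python
import statistics

data = [1, 2, 3, 4, 5]
n = len(data)
mean_value = sum(data) / n

# Sum of squared deviations from the mean: 4 + 1 + 0 + 1 + 4
squared_deviations = sum((x - mean_value) ** 2 for x in data)

# Sample variance divides by n - 1, matching variance()
manual_sample_variance = squared_deviations / (n - 1)

# Population variance divides by n, matching pvariance()
manual_population_variance = squared_deviations / n

print(manual_sample_variance == statistics.variance(data))       # True
print(manual_population_variance == statistics.pvariance(data))  # True
```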
### Standard Deviation
Standard deviation is the square root of variance and measures the dispersion of a dataset in the same units as the data. The `statistics` module provides the `stdev()` function to calculate the sample standard deviation; `pstdev()` calculates the population standard deviation.
## Example
data = [1, 2, 3, 4, 5]
stdev_value = statistics.stdev(data)
print("Standard Deviation:", stdev_value)
Output:
Standard Deviation: 1.5811388300841898
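The relationship between standard deviation and variance can be verified with a short check. The sketch below compares `stdev()` and `pstdev()` against the square roots of `variance()` and `pvariance()`, using `math.isclose` to allow for floating-point rounding.

```python
import math
import statistics

data = [1, 2, 3, 4, 5]

sample_sd = statistics.stdev(data)       # based on n - 1
population_sd = statistics.pstdev(data)  # based on n

# Each standard deviation is the square root of the matching variance
sample_matches = math.isclose(sample_sd, math.sqrt(statistics.variance(data)))
population_matches = math.isclose(population_sd, math.sqrt(statistics.pvariance(data)))

print(sample_matches, population_matches)  # True True
print(sample_sd > population_sd)           # True
```

The sample version is always the larger of the two for the same data because it divides by a smaller number.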
### Harmonic Mean
The harmonic mean is the reciprocal of the arithmetic mean of the reciprocals of the data. It is suited to averaging rates and ratios, such as speeds. The `statistics` module provides the `harmonic_mean()` function to calculate the harmonic mean.
## Example
data = [1, 2, 4]
harmonic_mean_value = statistics.harmonic_mean(data)
print("Harmonic Mean:", harmonic_mean_value)
Output:
Harmonic Mean: 1.7142857142857142
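A typical use is averaging speeds over equal distances. The example below is a sketch built on a hypothetical trip (the distances and speeds are assumptions chosen for illustration) and cross-checks the harmonic mean against total distance divided by total time.

```python
import math
import statistics

# Hypothetical trip: the same distance driven at 40 km/h and then at 60 km/h
speeds = [40, 60]

arithmetic = statistics.mean(speeds)         # overstates the average speed
harmonic = statistics.harmonic_mean(speeds)  # the actual average speed

# Cross-check with two legs of 120 km each: 3 hours plus 2 hours
leg_distance = 120
total_time = leg_distance / 40 + leg_distance / 60
check = (2 * leg_distance) / total_time

print(arithmetic)                     # 50
print(math.isclose(harmonic, check))  # True
```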
### Geometric Mean
The geometric mean is an average used for calculating growth rates or ratios. The `statistics` module provides the `geometric_mean()` function to calculate the geometric mean.
## Example
data = [1, 2, 4]
geometric_mean_value = statistics.geometric_mean(data)
print("Geometric Mean:", geometric_mean_value)
Output:
Geometric Mean: 2.0
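For growth rates, the geometric mean gives the single factor that, applied repeatedly, produces the same overall growth. The sketch below uses made-up yearly growth factors and assumes Python 3.8 or later for `math.prod`.

```python
import math
import statistics

# Hypothetical yearly growth factors: +10%, +20%, -10%
growth_factors = [1.10, 1.20, 0.90]

average_factor = statistics.geometric_mean(growth_factors)

# Compounding the average factor over three years matches the actual growth
compounded_actual = math.prod(growth_factors)
compounded_average = average_factor ** len(growth_factors)

print(math.isclose(compounded_actual, compounded_average))  # True
```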
* * *
## Other Common Functions
### Median Low and Median High
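The `statistics` module also provides the `median_low()` and `median_high()` functions. When the dataset has an even number of values, `median_low()` returns the smaller of the two middle values and `median_high()` returns the larger; with an odd number of values, both return the middle value.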
## Example
data = [1, 2, 3, 4]
median_low_value = statistics.median_low(data)
median_high_value = statistics.median_high(data)
print("Median Low:", median_low_value)
print("Median High:", median_high_value)
Output:
Median Low: 2
Median High: 3
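Unlike `median()`, these two functions always return a value that actually occurs in the data, which can matter for discrete data. A brief sketch comparing the three on even-length and odd-length lists:

```python
import statistics

even = [1, 2, 3, 4]
odd = [1, 2, 3, 4, 5]

# Even count: median() interpolates, the low/high variants pick real data points
even_results = (
    statistics.median(even),
    statistics.median_low(even),
    statistics.median_high(even),
)

# Odd count: all three return the single middle value
odd_results = (
    statistics.median(odd),
    statistics.median_low(odd),
    statistics.median_high(odd),
)

print(even_results)  # (2.5, 2, 3)
print(odd_results)   # (3, 3, 3)
```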
### Quantiles
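Quantiles are cut points that divide a dataset into intervals of equal probability. The `statistics` module provides the `quantiles()` function, which returns the n - 1 cut points that split the data into n groups; with `n=4` these are the quartiles.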
## Example
data = [1, 2, 3, 4, 5]
quantiles_value = statistics.quantiles(data, n=4)
print("Quartiles:", quantiles_value)
Output:
Quartiles: [1.5, 3.0, 4.5]
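The result depends on the `method` parameter of `quantiles()`, which defaults to `'exclusive'`. As a sketch, the snippet below runs the same data through both methods; `'inclusive'` treats the data as the whole population, so its minimum and maximum count as the 0th and 100th percentiles.

```python
import statistics

data = [1, 2, 3, 4, 5]

# Default method: the data is treated as a sample from a larger population
exclusive_quartiles = statistics.quantiles(data, n=4, method='exclusive')

# Inclusive method: the data is treated as the full population
inclusive_quartiles = statistics.quantiles(data, n=4, method='inclusive')

print(exclusive_quartiles)  # [1.5, 3.0, 4.5]
print(inclusive_quartiles)  # [2.0, 3.0, 4.0]
```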
* * *
## statistics Module Methods
| Method | Description |
| --- | --- |
| [statistics.harmonic_mean()](#) | Calculates the harmonic mean of the given dataset |
| [statistics.mean()](#) | Calculates the mean of the dataset |
| [statistics.median()](#) | Calculates the median of the dataset |
| [statistics.median_grouped()](#) | Calculates the median of grouped (binned) continuous data |
| [statistics.median_high()](#) | Calculates the high median of the given dataset |
| [statistics.median_low()](#) | Calculates the low median of the given dataset |
| [statistics.mode()](#) | Calculates the mode of the dataset (the value that appears most frequently) |
| [statistics.pstdev()](#) | Calculates the population standard deviation of the given dataset |
| [statistics.stdev()](#) | Calculates the sample standard deviation of the dataset |
| [statistics.pvariance()](http