Percentiles indicate the percentage of scores that fall below a certain value. An individual with an IQ of 120, for instance, is at the 91st percentile, which means that their IQ is higher than that of 91% of other people.
This article will discuss some methods to calculate percentiles in Python.
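To make the definition above concrete, here is a minimal sketch that computes the percentage of scores falling below a given value. The function name percentile_rank and the sample scores are illustrative assumptions, not part of any library.

```python
def percentile_rank(scores, value):
    """Return the percentage of scores strictly below value."""
    if not scores:
        raise ValueError("scores must not be empty")
    below = sum(1 for s in scores if s < value)
    return 100 * below / len(scores)


# Hypothetical sample data, made up for illustration
scores = [85, 90, 95, 100, 100, 105, 110, 115, 120, 130]
print(percentile_rank(scores, 120))  # 8 of 10 scores are below 120 -> 80.0
```

The library functions discussed in the rest of this article answer the inverse question: given a percentile, which score sits at it.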
scipy Package
This package provides the scoreatpercentile() function, which calculates the score of the input series at a given percentile. The syntax of the scoreatpercentile() function is given below:
scipy.stats.scoreatpercentile(a, per, limit=(), interpolation_method='fraction', axis=None)
In the scoreatpercentile() function, the parameter a represents a 1-D array, and per specifies the percentile, ranging from 0 to 100. The remaining parameters are optional. The NumPy library is used here only to generate the numbers whose percentile we calculate.
The complete example code is given below.
    from scipy import stats
    import numpy as np

    # Input data: the integers 0 through 99
    array = np.arange(100)

    # Score at the 50th percentile (the median)
    percentile = stats.scoreatpercentile(array, 50)
    print("The percentile is:", percentile)
Output:
The percentile is: 49.5
NumPy Package
This package has a percentile() function that calculates the percentile of a given array. The syntax of the percentile() function is given below.
numpy.percentile(a, q, axis=None, out=None, overwrite_input=False, interpolation='linear', keepdims=False)
The parameter q is the percentile to compute, between 0 and 100, and a is the input array. The other parameters are optional.
The complete example code is given below.
    import numpy as np

    arry = np.array([4, 6, 8, 10, 12])

    # 50th percentile (the median) of the array
    percentile = np.percentile(arry, 50)
    print("The percentile is:", percentile)
Output:
The percentile is: 8.0
math Package
The math package, with its basic ceil function, can be used to calculate different percentiles with the nearest-rank method.
The complete example code is given below.
    import math

    arry = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

    def calculate_percentile(arry, percentile):
        size = len(arry)
        # 1-based rank of the percentile; clamp to 1 so that percentile=0
        # returns the minimum instead of wrapping around to index -1
        rank = max(1, math.ceil(size * percentile / 100))
        return sorted(arry)[rank - 1]

    percentile_25 = calculate_percentile(arry, 25)
    percentile_50 = calculate_percentile(arry, 50)
    percentile_75 = calculate_percentile(arry, 75)

    print("The 25th percentile is:", percentile_25)
    print("The 50th percentile is:", percentile_50)
    print("The 75th percentile is:", percentile_75)
math.ceil(x) rounds x up and returns the smallest integer greater than or equal to x, while the sorted function returns the values of the array in ascending order.
Output:
    The 25th percentile is: 3
    The 50th percentile is: 5
    The 75th percentile is: 8
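The ceil-based approach above is commonly called the nearest-rank method: it always returns a value that is actually present in the data, so its result can differ from methods that interpolate. The sketch below compares it with statistics.median from the standard library. The helper name nearest_rank and its input checks are assumptions added for illustration.

```python
import math
import statistics


def nearest_rank(values, percentile):
    """Nearest-rank percentile: always returns an element of values."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must be between 0 and 100")
    ordered = sorted(values)
    # 1-based rank; clamp to 1 so that percentile=0 maps to the minimum
    rank = max(1, math.ceil(len(ordered) * percentile / 100))
    return ordered[rank - 1]


data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(nearest_rank(data, 50))    # 5, an actual data point
print(statistics.median(data))   # 5.5, the mean of the two middle values
print(nearest_rank(data, 0))     # 1, the minimum
print(nearest_rank(data, 100))   # 10, the maximum
```

This explains why the 50th percentile here is 5, while an interpolating method reports 5.5 for the same ten values.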
statistics Package
The quantiles() function in the statistics package divides the data into n intervals of equal probability and returns a list of the n - 1 cut points that separate them. The syntax of this function is given below.
statistics.quantiles(data, *, n=4, method='exclusive')
The complete example code is given below.
    from statistics import quantiles

    data = [1, 2, 3, 4, 5]

    # n=4 splits the data into quartiles, giving three cut points
    percentile = quantiles(data, n=4)
    print("The Percentile is:", percentile)
Output:
The Percentile is: [1.5, 3.0, 4.5]
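Because quantiles() returns a list of cut points rather than a single value, a specific percentile can be read off by asking for n=100 and indexing into the result. The following is a sketch under that assumption. The helper name percentile_from_quantiles is hypothetical, and method='inclusive' is chosen here because it treats the data as the whole population rather than a sample.

```python
import math
from statistics import quantiles


def percentile_from_quantiles(data, p, method="inclusive"):
    """Return the p-th percentile (integer p in 1..99) via statistics.quantiles."""
    if not 1 <= p <= 99:
        raise ValueError("p must be an integer between 1 and 99")
    # n=100 yields 99 cut points; the p-th percentile sits at index p - 1
    return quantiles(data, n=100, method=method)[p - 1]


data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
for p in (10, 50, 75):
    print(p, percentile_from_quantiles(data, p))

# The two methods give different quartiles for the same data
print(quantiles([1, 2, 3, 4, 5], n=4, method="exclusive"))  # [1.5, 3.0, 4.5]
print(quantiles([1, 2, 3, 4, 5], n=4, method="inclusive"))  # [2.0, 3.0, 4.0]
```

The choice of method matters most for small data sets, where the default exclusive method can place the outer cut points noticeably further from the centre.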
We can calculate different percentiles using the interpolation mode of numpy.percentile(). The interpolation modes are linear, lower, higher, midpoint and nearest. They determine the result when the desired percentile lies between two data points, i and j. The lower mode returns i, the higher mode returns j, and the linear mode returns i + (j - i) * fraction, where fraction is the fractional part of the index surrounded by i and j. The midpoint mode returns (i + j) / 2, and the nearest mode returns whichever of i and j is nearest.
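The rules for these modes can be sketched in plain Python to show where each result comes from. This is an illustrative approximation, not NumPy's actual implementation: the function name interpolated_percentile is hypothetical, and it assumes the fractional index is computed as (n - 1) * q / 100 over the sorted data.

```python
import math


def interpolated_percentile(values, q, mode="linear"):
    """Percentile of values at q (0..100) using the named interpolation mode."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= q <= 100:
        raise ValueError("q must be between 0 and 100")
    ordered = sorted(values)
    # Fractional index into the sorted data (assumed convention, see lead-in)
    position = (len(ordered) - 1) * q / 100
    lower_index = math.floor(position)
    upper_index = math.ceil(position)
    i, j = ordered[lower_index], ordered[upper_index]
    fraction = position - lower_index

    if mode == "linear":
        return i + (j - i) * fraction
    if mode == "lower":
        return i
    if mode == "higher":
        return j
    if mode == "midpoint":
        return (i + j) / 2
    if mode == "nearest":
        # Python's round() resolves exact ties to the even index
        return ordered[round(position)]
    raise ValueError(f"unknown mode: {mode}")


data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
for mode in ("linear", "lower", "higher", "midpoint", "nearest"):
    results = [interpolated_percentile(data, q, mode) for q in (10, 50, 75)]
    print(mode, results)
```

For the ten-element array used in this article, the 10th percentile falls at fractional index 0.9, between the values 1 and 2, which is why the lower, higher and midpoint modes report 1, 2 and 1.5 respectively.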
The complete example code for linear interpolation mode is given below.
    import numpy as np

    arry = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

    print('percentiles using interpolation = ', "linear")
    percentile_10 = np.percentile(arry, 10, interpolation='linear')
    percentile_50 = np.percentile(arry, 50, interpolation='linear')
    percentile_75 = np.percentile(arry, 75, interpolation='linear')
    print('percentile_10 = ', percentile_10, ', median = ', percentile_50, ' and percentile_75 = ', percentile_75)
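We use the numpy.percentile() function with the additional interpolation parameter. You can see that we get float values for this interpolation. Since NumPy 1.22, this parameter is named method, and the interpolation keyword is deprecated.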
Output:
    percentiles using interpolation =  linear
    percentile_10 =  1.9 , median =  5.5  and percentile_75 =  7.75
The complete example code for lower interpolation mode is given below.
    import numpy as np

    arry = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

    print('percentiles using interpolation = ', "lower")
    percentile_10 = np.percentile(arry, 10, interpolation='lower')
    percentile_50 = np.percentile(arry, 50, interpolation='lower')
    percentile_75 = np.percentile(arry, 75, interpolation='lower')
    print('percentile_10 = ', percentile_10, ', median = ', percentile_50, ' and percentile_75 = ', percentile_75)
Output:
    percentiles using interpolation =  lower
    percentile_10 =  1 , median =  5  and percentile_75 =  7
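You can see that each percentile is rounded down to the lower of the two neighbouring data points.
The higher mode does the opposite: it returns the larger of the two neighbouring data points, so each percentile is rounded up to a value that is present in the array.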
The complete example code for higher interpolation mode is given below.
    import numpy as np

    arry = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

    print('percentiles using interpolation = ', "higher")
    percentile_10 = np.percentile(arry, 10, interpolation='higher')
    percentile_50 = np.percentile(arry, 50, interpolation='higher')
    percentile_75 = np.percentile(arry, 75, interpolation='higher')
    print('percentile_10 = ', percentile_10, ', median = ', percentile_50, ' and percentile_75 = ', percentile_75)
Output:
    percentiles using interpolation =  higher
    percentile_10 =  2 , median =  6  and percentile_75 =  8
The midpoint mode returns the average of the two data points on either side of each percentile.
The complete example code for midpoint interpolation mode is given below.
    import numpy as np

    arry = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

    print('percentiles using interpolation = ', "midpoint")
    percentile_10 = np.percentile(arry, 10, interpolation='midpoint')
    percentile_50 = np.percentile(arry, 50, interpolation='midpoint')
    percentile_75 = np.percentile(arry, 75, interpolation='midpoint')
    print('percentile_10 = ', percentile_10, ', median = ', percentile_50, ' and percentile_75 = ', percentile_75)
Output:
    percentiles using interpolation =  midpoint
    percentile_10 =  1.5 , median =  5.5  and percentile_75 =  7.5