# Applying stats.percentileofscore to every row by column

By : Jojo Jacob
Date : October 17 2020, 06:10 PM
Use Series.rank with pct=True; this is the equivalent of stats.percentileofscore with the default kind='rank'.
code :
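``````
import pandas as pd

# Sample column; the values match the scores printed in the comparison output.
df = pd.DataFrame([1, 5, 34, 5, 67, 8, 98])

df[0].rank(pct=True)*100
#0     14.285714
#1     35.714286
#2     71.428571
#3     35.714286
#4     85.714286
#5     57.142857
#6    100.000000
#Name: 0, dtype: float64
``````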
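For comparison, the same percentiles computed one value at a time with stats.percentileofscore:
``````
from scipy import stats

# Series.iteritems() was removed in pandas 2.0; Series.items() is the replacement.
for idx, val in df[0].items():
    print(f'{val}: {stats.percentileofscore(df[0], score=val)}')

#1: 14.285714285714286
#5: 35.714285714285715
#34: 71.42857142857143
#5: 35.714285714285715
#67: 85.71428571428571
#8: 57.142857142857146
#98: 100.0
``````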
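The same idea extends to every column of a DataFrame at once. The following is a minimal sketch, assuming a small made-up frame with two numeric columns (the column names `a` and `b` and the second column's values are illustrative, not from the original question). It checks that DataFrame.rank with pct=True agrees with calling stats.percentileofscore on each value of each column:

```python
import numpy as np
import pandas as pd
from scipy import stats

# Hypothetical data: column "a" reuses the values above, "b" is made up.
df = pd.DataFrame({
    "a": [1, 5, 34, 5, 67, 8, 98],
    "b": [3, 3, 9, 1, 4, 4, 7],
})

# Vectorized: percentile rank of every value within its own column.
via_rank = df.rank(pct=True) * 100

# Slow reference: one percentileofscore call per value, column by column.
via_scipy = df.apply(
    lambda col: col.map(lambda v: stats.percentileofscore(col, v, kind="rank"))
)

print(np.allclose(via_rank.to_numpy(), via_scipy.to_numpy()))
```

The rank-based version avoids one SciPy call per cell, which is likely to matter on larger frames.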

## Getting PostgreSQL percent_rank and scipy.stats.percentileofscore results to match

By : Sushma Girish
Date : March 29 2020, 07:55 AM
You can use scipy.stats.rankdata. The following example reproduces the result shown at http://docs.aws.amazon.com/redshift/latest/dg/r_WF_PERCENT_RANK.html:
code :
``````
In [12]: import numpy as np

In [13]: from scipy.stats import rankdata

In [14]: values = np.array([15, 20, 20, 20, 30, 30, 40])
``````
Compute the ranks with method='min', so that tied values share the lowest rank:
``````
In [15]: rank = rankdata(values, method='min')

In [16]: rank
Out[16]: array([1, 2, 2, 2, 5, 5, 7])
``````
Then rescale the ranks to the range [0, 1]:
``````
In [17]: (rank - 1) / (len(values) - 1)
Out[17]:
array([ 0.        ,  0.16666667,  0.16666667,  0.16666667,  0.66666667,
        0.66666667,  1.        ])
``````
The same result can be obtained from percentileofscore with kind='strict', after rescaling by n/(n-1):
``````
In [87]: import numpy as np

In [88]: from scipy.stats import percentileofscore

In [89]: values = np.array([15, 20, 20, 20, 30, 30, 40])

In [90]: n = len(values)
``````
``````
In [91]: [n*percentileofscore(values, val, kind='strict')/100/(n-1) for val in values]
Out[91]:
[0.0,
 0.16666666666666666,
 0.16666666666666666,
 0.16666666666666666,
 0.66666666666666663,
 0.66666666666666663,
 1.0]
``````
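The rankdata approach can be wrapped in a small helper. This is only a sketch: the function name `percent_rank` is made up here, and returning zeros for fewer than two values is a choice made to avoid dividing by zero rather than something stated in the answer.

```python
import numpy as np
from scipy.stats import rankdata


def percent_rank(values):
    """Return (rank - 1) / (n - 1) using 'min' ranks for ties."""
    values = np.asarray(values)
    n = len(values)
    if n < 2:
        # Assumption of this sketch: avoid 0/0 for empty or single-value input.
        return np.zeros(n)
    ranks = rankdata(values, method="min")
    return (ranks - 1) / (n - 1)


result = percent_rank([15, 20, 20, 20, 30, 30, 40])
print(result)
```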

## What is the significance of t-stats value while applying ttest_ind on two pandas series?

By : xiaodiu
Date : March 29 2020, 07:55 AM
ttest_ind returns two values: the calculated t-statistic and the two-tailed p-value.
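As a minimal sketch of reading those two values, the example below runs ttest_ind on two pandas Series. The sample numbers and the 0.05 threshold are invented for illustration and are not from the original question.

```python
import pandas as pd
from scipy import stats

# Hypothetical measurements for two independent groups.
group_a = pd.Series([5.1, 4.9, 5.6, 5.8, 6.0, 5.3])
group_b = pd.Series([6.9, 7.1, 6.5, 7.4, 6.8, 7.0])

# The result unpacks into the t-statistic and the two-tailed p-value.
t_stat, p_value = stats.ttest_ind(group_a, group_b)

# The sign of t shows which sample mean is larger (negative: group_a < group_b);
# its magnitude is the mean difference measured in standard-error units.
print(f"t = {t_stat:.3f}, p = {p_value:.5f}")

# A small p-value suggests the observed difference is unlikely under the
# null hypothesis that both groups share the same mean.
significant = p_value < 0.05
```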

## Weighted version of scipy percentileofscore

By : MannyFreshCode
Date : March 29 2020, 07:55 AM
I'd like to pass weights to scipy.stats.percentileofscore. The following function should do the job.
code :
``````
import numpy as np

def weighted_percentile_of_score(a, weights, score, kind='weak'):
    npa = np.array(a)
    npw = np.array(weights)

    if kind == 'rank':  # Equivalent to 'weak' since we have weights.
        kind = 'weak'

    if kind in ['strict', 'mean']:
        # Share of the total weight carried by values strictly below the score.
        indx = npa < score
        strict = float(100 * npw[indx].sum() / npw.sum())
        if kind == 'strict':
            return strict

    if kind in ['weak', 'mean']:
        # Share of the total weight carried by values at or below the score.
        indx = npa <= score
        weak = float(100 * npw[indx].sum() / npw.sum())
        if kind == 'weak':
            return weak

    if kind == 'mean':
        return (strict + weak) / 2

    raise ValueError("kind must be 'rank', 'weak', 'strict' or 'mean'")

a = [1, 2, 3, 4]
weights = [2, 2, 3, 3]
print(weighted_percentile_of_score(a, weights, 3))  # 70.0 as desired.
``````
``````
[weighted_percentile_of_score(a, weights, val) for val in a]
# [20.0, 40.0, 70.0, 100.0]
``````
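When the weighted percentile is needed for every value in the data, the per-value loop can be replaced by a single broadcast comparison. This is a hedged sketch of the 'weak' variant only; the function name `weighted_weak_percentiles` is made up here, and it builds an n-by-n mask, so it suits moderate sizes.

```python
import numpy as np
from scipy.stats import percentileofscore


def weighted_weak_percentiles(a, weights):
    """Weighted 'weak' percentile of each value of a within a itself."""
    a = np.asarray(a)
    w = np.asarray(weights, dtype=float)
    # mask[i, j] is True when a[j] <= a[i].
    mask = a[None, :] <= a[:, None]
    # Sum the weights of all values at or below each a[i].
    return 100 * (mask * w).sum(axis=1) / w.sum()


a = [1, 2, 3, 4]
weights = [2, 2, 3, 3]
weighted = weighted_weak_percentiles(a, weights)
print(weighted)

# Sanity check: equal weights should agree with SciPy's unweighted 'weak' kind.
unweighted = weighted_weak_percentiles(a, [1, 1, 1, 1])
reference = [percentileofscore(a, v, kind="weak") for v in a]
```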

## Python numpy percentile vs scipy percentileofscore

By : user3154361
Date : March 29 2020, 07:55 AM
I think you're misunderstanding what percentileofscore and percentile actually do. They are not inverses of each other.
code :
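``````
import numpy as np

# df (the data) and z1 (a percentile between 0 and 100) come from the question.
ap = np.asarray(sorted(df))
Nx = df.shape[0]

# Fractional position of the requested percentile in the sorted data.
indices = z1 / 100 * (Nx - 1)
indices_below = np.floor(indices).astype(int)
indices_above = indices_below + 1

# Linear interpolation weights for the two neighbouring values.
weight_above = indices - indices_below
weight_below = 1 - weight_above

x1 = ap[indices_below] * weight_below   # 57.50000000000004
x2 = ap[indices_above] * weight_above   # 12.499999999999956

x1 + x2
``````
``````
70.0
``````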
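A small round trip illustrates why the two functions are not inverses. The five-element array below is made up for this sketch; it relies on the default kind='rank' for percentileofscore and the default linear interpolation of np.percentile.

```python
import numpy as np
from scipy.stats import percentileofscore

data = np.array([1, 2, 3, 4, 5])

# Percentile rank of the score 3: two values below, one match, n = 5.
pct = percentileofscore(data, 3)

# np.percentile interpolates at position pct / 100 * (n - 1) in the sorted data.
back = np.percentile(data, pct)

print(pct)   # 60.0
print(back)  # close to 3.4, not the original score 3
```

percentileofscore counts how many values fall at or below the score, while np.percentile interpolates between positions 0 and n - 1, so feeding one into the other generally does not return the starting value.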

## Pandas describe vs scipy.stats percentileofscore with NaN?

By : 万明辉
Date : March 29 2020, 07:55 AM
scipy.stats.percentileofscore does not ignore nan, nor does it check for the value and handle it in some special way. It is just another floating point value in your data. This means the behavior of percentileofscore with data containing nan is undefined, because of the behavior of nan in comparisons:
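As a minimal illustration (the array and the score 3.0 are made up for this sketch), every ordering comparison against nan evaluates to False, so the counts of values below and at-or-above a score no longer add up to the sample size:

```python
import numpy as np

data = np.array([1.0, 2.0, np.nan, 4.0])
score = 3.0

# nan is neither below nor at-or-above the score: both comparisons are False.
below = np.count_nonzero(data < score)
at_or_above = np.count_nonzero(data >= score)

print(below, at_or_above, len(data))  # 2 1 4

# One possible workaround: drop nan explicitly before computing percentiles.
clean = data[~np.isnan(data)]
```

Dropping nan first mirrors what pandas describe does by default, which is one reason the two libraries can disagree on the same data.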