Getting PostgreSQL percent_rank and scipy.stats.percentileofscore results to match
By : Sushma Girish
Date : March 29 2020, 07:55 AM
You can use scipy.stats.rankdata. The following example reproduces the result shown at http://docs.aws.amazon.com/redshift/latest/dg/r_WF_PERCENT_RANK.html. code :
In [12]: import numpy as np
In [13]: from scipy.stats import rankdata
In [14]: values = np.array([15, 20, 20, 20, 30, 30, 40])
In [15]: rank = rankdata(values, method='min')
In [16]: rank
Out[16]: array([1, 2, 2, 2, 5, 5, 7])
In [17]: (rank - 1) / (len(values) - 1)
Out[17]:
array([ 0. , 0.16666667, 0.16666667, 0.16666667, 0.66666667,
0.66666667, 1. ])
The same result can be obtained with scipy.stats.percentileofscore:
In [87]: import numpy as np
In [88]: from scipy.stats import percentileofscore
In [89]: values = np.array([15, 20, 20, 20, 30, 30, 40])
In [90]: n = len(values)
In [91]: [n*percentileofscore(values, val, kind='strict')/100/(n - 1) for val in values]
Out[91]:
[0.0,
0.16666666666666666,
0.16666666666666666,
0.16666666666666666,
0.66666666666666663,
0.66666666666666663,
1.0]
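Both sessions agree for the same reason: the 'min' rank of a value, minus one, is simply the number of values strictly below it. A minimal pure-Python sketch of that idea follows. The helper name percent_rank_by_count is hypothetical, and returning 0.0 for a single-value input is an assumption made here to avoid dividing by zero.

```python
def percent_rank_by_count(values):
    """Return (number of values strictly below v) / (n - 1) for each v."""
    n = len(values)
    if n < 2:
        # Assumption: a lone value gets 0.0 instead of a division by zero.
        return [0.0] * n
    return [sum(1 for other in values if other < v) / (n - 1) for v in values]


values = [15, 20, 20, 20, 30, 30, 40]
print(percent_rank_by_count(values))
```

This should print the same seven values as the rankdata and percentileofscore sessions above.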

What is the significance of t-stats value while applying ttest_ind on two pandas series?
By : xiaodiu
Date : March 29 2020, 07:55 AM
scipy.stats.ttest_ind returns two outputs: the calculated t-statistic and the two-tailed p-value.
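To make the t-statistic less of a black box, here is a rough sketch that computes it by hand for the equal-variance case, which is what ttest_ind assumes by default (equal_var=True). The function name pooled_t_statistic and the sample data are made up for illustration. It works on any sequence of numbers, so a pandas Series could be passed the same way.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Student's t statistic for two independent samples, assuming equal variances."""
    na, nb = len(a), len(b)
    # Pooled sample variance: weighted average of the two unbiased variances.
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    standard_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / standard_error


a = [1, 2, 3, 4, 5]
b = [2, 3, 4, 5, 6]
print(pooled_t_statistic(a, b))  # -1.0
```

The sign says which sample has the larger mean (negative here because a sits below b), and the magnitude says how many standard errors apart the two means are. The p-value then converts that distance into a probability under the null hypothesis of equal means.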

Weighted version of scipy percentileofscore
By : MannyFreshCode
Date : March 29 2020, 07:55 AM
I'd like to pass weights to scipy.stats.percentileofscore. This should do the job. code :
import numpy as np

def weighted_percentile_of_score(a, weights, score, kind='weak'):
    npa = np.array(a)
    npw = np.array(weights)
    total = float(npw.sum())
    if kind == 'rank':  # Equivalent to 'weak' since we have weights.
        kind = 'weak'
    if kind in ['strict', 'mean']:
        # Share of the total weight carried by values strictly below score.
        indx = npa < score
        strict = 100 * float(npw[indx].sum()) / total
    if kind == 'strict':
        return strict
    if kind in ['weak', 'mean']:
        # Share of the total weight carried by values at or below score.
        indx = npa <= score
        weak = 100 * float(npw[indx].sum()) / total
    if kind == 'weak':
        return weak
    if kind == 'mean':
        return (strict + weak) / 2
    raise ValueError("kind must be 'rank', 'weak', 'strict' or 'mean'")

a = [1, 2, 3, 4]
weights = [2, 2, 3, 3]
print(weighted_percentile_of_score(a, weights, 3))  # 70.0 as desired.
print([weighted_percentile_of_score(a, weights, val) for val in a])
# [20.0, 40.0, 70.0, 100.0]
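One way to sanity-check the weighted numbers, at least for integer weights, is to repeat each value as many times as its weight and compute an ordinary unweighted percentile. The sketch below does that in plain Python; weak_percentile is a hypothetical helper written for this check, not a scipy function.

```python
def weak_percentile(data, score):
    """Unweighted 'weak' percentile: share of values <= score, in percent."""
    return 100 * sum(1 for x in data if x <= score) / len(data)


a = [1, 2, 3, 4]
weights = [2, 2, 3, 3]
# Repeat each value as many times as its (integer) weight says.
expanded = [x for x, w in zip(a, weights) for _ in range(w)]
print(expanded)                      # [1, 1, 2, 2, 3, 3, 3, 4, 4, 4]
print(weak_percentile(expanded, 3))  # 70.0
```

The expanded list gives the same 70.0 for a score of 3, which is the behavior the weighted function is meant to reproduce without building the longer list.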

Python numpy percentile vs scipy percentileofscore
By : user3154361
Date : March 29 2020, 07:55 AM
I think you're misunderstanding what percentileofscore and percentile actually do: they are not inverses of each other. By default, np.percentile interpolates linearly between the two neighboring sorted values, roughly like this. code :
# df (the data) and z1 (the requested percentile, 0-100) are assumed to be defined already.
ap = np.asarray(sorted(df))
Nx = df.shape[0]
indices = z1 / 100 * (Nx - 1)
indices_below = np.floor(indices).astype(int)
indices_above = indices_below + 1
weight_above = indices - indices_below
weight_below = 1 - weight_above
x1 = ap[indices_below] * weight_below # 57.50000000000004
x2 = ap[indices_above] * weight_above # 12.499999999999956
x1 + x2
# 70.0
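A small round trip makes the "not inverses" point concrete. The sketch below uses two hand-written helpers (linear_percentile and weak_percentile_of_score are illustrative names, not the numpy or scipy implementations) and a made-up four-element data set.

```python
import math


def linear_percentile(data, q):
    """Value at percentile q (0-100), interpolating linearly between neighbors."""
    ordered = sorted(data)
    position = q / 100 * (len(ordered) - 1)
    below = math.floor(position)
    above = min(below + 1, len(ordered) - 1)  # clamp so q = 100 stays in range
    weight_above = position - below
    return ordered[below] * (1 - weight_above) + ordered[above] * weight_above


def weak_percentile_of_score(data, score):
    """Percentage of values that are <= score."""
    return 100 * sum(1 for x in data if x <= score) / len(data)


data = [10, 20, 30, 40]
value = linear_percentile(data, 70)           # roughly 31, between 30 and 40
back = weak_percentile_of_score(data, value)  # 75.0, not 70
print(value, back)
```

The percentile lookup interpolates to a value that is not in the data, while the percentile of a score only counts how many data points fall at or below it, so going there and back lands on 75 rather than 70.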

Pandas describe vs scipy.stats percentileofscore with NaN?
By : 万明辉
Date : March 29 2020, 07:55 AM
scipy.stats.percentileofscore does not ignore nan, nor does it check for the value and handle it in some special way. It is just another floating point value in your data. This means the behavior of percentileofscore with data containing nan is undefined, because ordering comparisons involving nan always evaluate to False.
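The comparison behavior can be seen without scipy at all. The sketch below uses a hand-rolled "percent of values at or below the score" count as a stand-in. It is not scipy's actual implementation, and the data is made up.

```python
import math

nan = float("nan")
print(nan < 1.0, nan > 1.0, nan == nan)  # False False False

data = [1.0, 2.0, nan, 4.0]
# Hand-rolled "percent of values <= score"; nan never passes the comparison
# but still counts toward the total.
weak = 100 * sum(1 for x in data if x <= 4.0) / len(data)
print(weak)  # 75.0, although 4.0 is the largest real value

# Dropping nan first gives the answer most people expect.
clean = [x for x in data if not math.isnan(x)]
weak_clean = 100 * sum(1 for x in clean if x <= 4.0) / len(clean)
print(weak_clean)  # 100.0
```

Filtering out nan before computing the percentile is the safer route when the result should match tools that skip missing values.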

