Take the difference of all elements of a series with the previous ones in python pandas


By : Gaurav Singh
Date : October 16 2020, 06:10 PM
I have a dataframe with sorted values labeled by ids, and I want to take the difference between the value of the first element of an id and the value of the last element of each previous id. Depending on how many ids you have, the approach below works for up to a few thousand ids, since it builds a num_ids x num_ids matrix:
code :
import numpy as np
import pandas as pd

# enumerate ids; careful: the order must match the index of the groupby result below
ids = ['a', 'b', 'c']
num_ids = len(ids)

# compute first and last
f = df.groupby('id').value.agg(['first','last'])

# lower triangle mask
mask = np.array([[i>=j for j in range(num_ids)] for i in range(num_ids)])

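# compute diff of first and last, then mask
# (convert to NumPy first: recent pandas no longer allows [None, :] indexing on a Series)
diff = np.where(mask, None,
                f['first'].to_numpy()[None, :] - f['last'].to_numpy()[:, None])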
diff = pd.DataFrame(diff,
                    index = ids,
                    columns = ids)
# stack
diff.stack()
a  b    2
   c    4
b  c    1
dtype: object
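If an id can reappear later in the frame (for example a, b, c, a), group by runs of consecutive ids instead:
# create blocks of consecutive id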
blocks = df['id'].ne(df['id'].shift()).cumsum()

# groupby
groups = df.groupby(blocks)

# create first and last values
df['fv'] = groups.value.transform('first')
df['lv'] = groups.value.transform('last')

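# rebuild f and ids as above
# note the column names are now 'fv' and 'lv' instead of 'first' and 'last'
f = df[['id','fv', 'lv']].drop_duplicates()
ids = f['id'].values
num_ids = len(ids)
# then rerun the mask, diff and stack steps with f['fv'] and f['lv']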
a   b     2
    c     4
    a     5
b   c     1
    a     2
c   a     1
dtype: object
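
The block-based variant above can be sketched end to end as follows. The sample data is hypothetical (it is not from the original post) and was chosen so that the result matches the second output shown above. The sketch also swaps the boolean mask for `np.triu_indices`, which picks the same "earlier block, later block" pairs directly; treat it as one possible way to write this, not as the original author's code.

```python
import numpy as np
import pandas as pd

# Hypothetical data: id 'a' appears again after 'c'
df = pd.DataFrame({
    "id": ["a", "a", "b", "b", "c", "a", "a"],
    "value": [1, 3, 5, 6, 7, 8, 9],
})

# Label each run of consecutive equal ids as its own block
blocks = df["id"].ne(df["id"].shift()).cumsum().rename("block")

# One row per block: its id, first value and last value
f = df.groupby(blocks).agg(
    id=("id", "first"),
    fv=("value", "first"),
    lv=("value", "last"),
)
ids = f["id"].to_numpy()
first = f["fv"].to_numpy()
last = f["lv"].to_numpy()

# pairwise[i, j] = first value of block j minus last value of block i
pairwise = first[None, :] - last[:, None]

# Keep only the pairs where block i comes strictly before block j
rows, cols = np.triu_indices(len(ids), k=1)
result = pd.Series(
    pairwise[rows, cols],
    index=pd.MultiIndex.from_arrays([ids[rows], ids[cols]]),
)
print(result)
```

Like the masked version, this still materialises the full square matrix, so memory grows with the square of the number of blocks.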



Difference between dictionary and pandas series in Python


By : Deeptish Mukherjee
Date : March 29 2020, 07:55 AM
I have a requirement to keep data in key-value pairs. I searched and found two ways to do this in Python. Always read the docs first.
But since you asked:
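
As a rough illustration only: a minimal sketch of how a plain dict and a pandas Series behave when both hold the same key-value pairs. The names `prices`, `s`, `doubled_d`, `doubled_s` and `expensive` and the data are made up for this example.

```python
import pandas as pd

# Illustrative data (an assumption, not from the post)
prices = {"apple": 1.5, "pear": 2.0}
s = pd.Series(prices)

# Both support lookup by key
same_lookup = prices["apple"] == s["apple"]

# A dict needs an explicit comprehension for element-wise arithmetic ...
doubled_d = {k: v * 2 for k, v in prices.items()}
# ... while a Series applies the operation to every value at once
doubled_s = s * 2

# A Series can also be filtered with a boolean condition
expensive = s[s > 1.8]
print(expensive)
```

A dict is the lighter choice for pure lookups; a Series tends to pay off once you need arithmetic, filtering or alignment on the values.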

Pandas Time Series - Add difference from previous column to New Row


By : Ted Moon
Date : March 29 2020, 07:55 AM
I'm not sure I fully understand unless you provide the values you are looking for, but maybe this is it?
code :
df['ES_Inverse_price'] = df['Close_x'].shift(1) - df['ES_difference']
df
   close_p ticker
0    100.0   aapl
1    102.0   aapl
2    103.4   aapl
3    101.2   aapl
4    106.2   apple

df['es_difference'] = df['close_p'].diff()
   close_p ticker  es_difference
0    100.0   aapl            NaN
1    102.0   aapl            2.0
2    103.4   aapl            1.4
3    101.2   aapl           -2.2
4    106.2   apple            5.0

df['es_inverse_price'] = df['close_p']-df['es_difference'].cumsum() - df['es_difference'].cumsum()
   close_p ticker  es_difference  es_inverse_price
0    100.0   aapl            NaN               NaN
1    102.0   aapl            2.0              98.0
2    103.4   aapl            1.4              96.6
3    101.2   aapl           -2.2              98.8
4    106.2   aapl            5.0              93.8
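
For reference, `diff()` on a column is the current value minus the previous one. The sketch below reuses the `close_p` values from the example above and checks that against an explicit `shift(1)`; the variable name `manual` is an assumption introduced here.

```python
import pandas as pd

df = pd.DataFrame({"close_p": [100.0, 102.0, 103.4, 101.2, 106.2]})

# diff() subtracts the previous row from the current row; the first row has no predecessor
df["es_difference"] = df["close_p"].diff()

# The same computation written with an explicit shift
manual = df["close_p"] - df["close_p"].shift(1)

print(df)
```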

Python & Pandas - pd.Series difference between int32 and int64


By : Njb
Date : March 29 2020, 07:55 AM
They're semantically different: in the first version you pass a dict with a single scalar value, so the dtype becomes int64. In the second you pass a range, which can be trivially converted to a numpy array, and on platforms where NumPy's default integer is 32-bit this is int32:
code :
In[57]:
np.array(range(6)).dtype

Out[57]: dtype('int32')
subarr = np.array(arr, dtype=object, copy=copy)
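
If the inferred dtype matters, one option is to state it explicitly instead of relying on inference. A minimal sketch, with variable names chosen here for illustration; the inferred dtype of `s_default` depends on the platform and library versions, so it is printed but not assumed.

```python
import numpy as np
import pandas as pd

# Inferred dtype: may be int32 or int64 depending on platform and versions
s_default = pd.Series(range(6))

# Explicit dtypes are the same everywhere
s_int64 = pd.Series(range(6), dtype="int64")
s_int32 = pd.Series(range(6), dtype="int32")

print(s_default.dtype, s_int64.dtype, s_int32.dtype)
```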

python pandas time series count number of previous matches


By : user1842234
Date : March 29 2020, 07:55 AM
First group by client_employer, then select the client_name column and transform it with a map built from a dict whose keys are the unique client_name values and whose values are a running count (0, 1, 2, ...) of those unique values:
code :
df['employers_count'] = df.groupby(['client_employer'])['client_name'].transform(lambda x: x.map(dict(zip(x.unique(),range(x.nunique())))))

         date client_employer client_name  employers_count
0  2015-01-05       company A        John                0
1  2015-01-06       company B        Bill                0
2  2015-01-07       company B        Bill                0
3  2015-01-08       company A       Sarah                1
4  2015-01-09       company B        Alex                1
5  2015-01-10       company B       Brian                2
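
The same per-employer numbering can also be sketched with `pd.factorize`, which assigns codes 0, 1, 2, ... in order of first appearance. This is an alternative formulation, not the answer's original code; the frame below just rebuilds the example data shown above.

```python
import pandas as pd

df = pd.DataFrame({
    "date": ["2015-01-05", "2015-01-06", "2015-01-07",
             "2015-01-08", "2015-01-09", "2015-01-10"],
    "client_employer": ["company A", "company B", "company B",
                        "company A", "company B", "company B"],
    "client_name": ["John", "Bill", "Bill", "Sarah", "Alex", "Brian"],
})

# Within each employer, number the distinct client names in order of first appearance
df["employers_count"] = (
    df.groupby("client_employer")["client_name"]
      .transform(lambda x: pd.Series(pd.factorize(x)[0], index=x.index))
)
print(df)
```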

Drop element in numpy array (or pandas series) if difference to previous element is <N


By : user3250033
Date : March 29 2020, 07:55 AM
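I have a numpy array and want to drop every element whose difference to the previous element is below a threshold. If I understand correctly, you can do this with diff() on a Series: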
code :
s=pd.Series(a)
s[~(s.diff()<=2)]
Out[289]: 
0     0
1    10
2    19
4    30
5    40
7    49
dtype: int32
s[~(s.diff()<=2)].to_numpy()
Out[292]: array([ 0, 10, 19, 30, 40, 49])
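
A self-contained version of the snippet above, as a sketch. The input array `a` is an assumption: the original array is not shown, so the values below were picked to reproduce the output. Note that `diff()` compares each element with its predecessor in the original array, not with the last element that was kept.

```python
import numpy as np
import pandas as pd

# Assumed input (the original array is not shown in the post)
a = np.array([0, 10, 19, 20, 30, 40, 41, 49])

s = pd.Series(a)

# Keep elements whose difference to the previous element is NOT <= 2.
# The first element has a NaN diff, so the comparison is False and it is kept.
kept = s[~(s.diff() <= 2)]
result = kept.to_numpy()
print(result)
```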