Pandas Dataframe sum function with various column criteria


By : Regis Spindola
Date : September 15 2020, 01:00 AM
The idea is to chain each equality test with a bitwise OR (|) against a new condition that checks whether the argument is an empty string, so that an empty argument matches every row:
code :
def per_sum(startdate, enddate, fund, account, analysis):
    return df[(df.Datenum > startdate) &
              (df.Datenum < enddate) &
              ((df.Fund == fund) | (fund == '')) &
              ((df.Account == account) | (account == '')) &
              ((df.Analysis == analysis) | (analysis == ''))
              ].Cost.sum()

print(per_sum(20190000,20200000,'','',''))
2502.36

print(per_sum(20190000,20200000,'','','I2'))
2065.0

# Variant: empty-string date bounds are also treated as "no limit"
import numpy as np

def per_sum(startdate, enddate, fund, account, analysis):
    startdate = -np.inf if startdate == '' else startdate
    enddate = np.inf if enddate == '' else enddate
    return df[(df.Datenum > startdate) &
              (df.Datenum < enddate) &
              ((df.Fund == fund) | (fund == '')) &
              ((df.Account == account) | (account == '')) &
              ((df.Analysis == analysis) | (analysis == ''))
              ].Cost.sum()

print(per_sum('','','','',''))
2707.36
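
For a self-contained check of the pattern above, here is a minimal sketch on made-up data. The rows and values are invented for illustration and are not the original poster's data, and passing the frame in as a `frame` argument is an assumption of this sketch. It builds the mask step by step, which gives the same result as the chained `|` form but skips a comparison entirely when its argument is empty.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names follow the answer above.
df = pd.DataFrame({
    "Datenum":  [20190115, 20190320, 20190701, 20200105],
    "Fund":     ["F1", "F1", "F2", "F2"],
    "Account":  ["A1", "A2", "A1", "A2"],
    "Analysis": ["I1", "I2", "I2", "I1"],
    "Cost":     [100.0, 200.0, 300.0, 400.0],
})

def per_sum(frame, startdate="", enddate="", fund="", account="", analysis=""):
    """Sum Cost over the rows that satisfy every non-empty criterion."""
    # Empty date bounds mean "unbounded" on that side.
    lo = -np.inf if startdate == "" else startdate
    hi = np.inf if enddate == "" else enddate
    mask = (frame["Datenum"] > lo) & (frame["Datenum"] < hi)
    # Apply an equality filter only for the criteria that were supplied.
    for column, wanted in (("Fund", fund), ("Account", account), ("Analysis", analysis)):
        if wanted != "":
            mask &= frame[column] == wanted
    return frame.loc[mask, "Cost"].sum()

total_all = per_sum(df)
total_2019_i2 = per_sum(df, 20190000, 20200000, analysis="I2")
print(total_all, total_2019_i2)
```

Keyword defaults let a caller leave out any criterion instead of passing empty strings positionally.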



Get column and row index pairs of Pandas DataFrame matching some criteria


By : henry
Date : March 29 2020, 07:55 AM
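You can use NumPy's argwhere to get the positional (row, column) pairs of the matching cells, then map those positions through the index and columns to get the labels: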
code :
In [11]: np.argwhere(c2 > 0.8)
Out[11]: 
array([[1, 3],
       [1, 4],
       [3, 4]])
[(c2.index[i], c2.columns[j]) for i, j in np.argwhere(c2 > 0.8)]
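
The `c2` frame is not shown in the answer, so the following sketch uses a small invented frame (the values and the labels `r0`..`r2`, `a`..`c` are assumptions) to show the positions and the label pairs side by side. It calls `to_numpy()` first so that `np.argwhere` receives a plain boolean array.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for c2.
c2 = pd.DataFrame(
    [[0.10, 0.90, 0.20],
     [0.30, 0.40, 0.85],
     [0.95, 0.50, 0.60]],
    index=["r0", "r1", "r2"],
    columns=["a", "b", "c"],
)

# Positional (row, column) pairs where the condition holds, in row-major order.
positions = np.argwhere(c2.to_numpy() > 0.8)

# Translate the positions into (index label, column label) pairs.
label_pairs = [(c2.index[i], c2.columns[j]) for i, j in positions]
print(label_pairs)
```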

Filtering Pandas dataframe on two criteria where one column is a list


By : Pradeep Upadhyay
Date : March 29 2020, 07:55 AM
I have a pandas DataFrame with columns Project Type and Parts. I would like to know how many times part A is used in projects of Project Type 1. I tried .count(), but it doesn't return a single number. You can try something like this:
code :
sum(['A' in i for i in parts_df[parts_df['Project Type']=='Type 1']['Parts'].tolist()])
In[32]: parts_df = pd.DataFrame(data = [['Type 1', ['A', 'B']], ['Type 2', ['A']], ['Type 1', ['C']]], columns=['Project Type', 'Parts'])
In[33]: sum(['A' in i for i in parts_df[parts_df['Project Type']=='Type 1']['Parts'].tolist()])
Out[33]: 1
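
The same count can be sketched with boolean Series instead of a list comprehension, which keeps the two criteria separate and readable. This is one possible variant, not the answerer's code; the names `is_type_1` and `has_part_a` are introduced here for illustration. Note that it counts projects whose Parts list contains A, not repeated occurrences of A within one list.

```python
import pandas as pd

# Same sample frame as in the session above.
parts_df = pd.DataFrame(
    data=[["Type 1", ["A", "B"]], ["Type 2", ["A"]], ["Type 1", ["C"]]],
    columns=["Project Type", "Parts"],
)

# Criterion 1: the row belongs to Project Type 1.
is_type_1 = parts_df["Project Type"] == "Type 1"
# Criterion 2: the list stored in Parts contains "A".
has_part_a = parts_df["Parts"].apply(lambda parts: "A" in parts)

# True counts as 1, so summing the combined mask gives the number of matches.
count_a_in_type_1 = int((is_type_1 & has_part_a).sum())
print(count_a_in_type_1)
```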

Compute pandas DataFrame column for value between current row and a future row matching criteria


By : iUser
Date : March 29 2020, 07:55 AM
Use pd.merge_asof to bring the closest future measurement into a new column, then perform the subtraction.
code :
import pandas as pd

df = pd.merge_asof(df, 
                   df.loc[df.action != 0, ['measurement']], 
                   left_index=True, 
                   right_index=True, 
                   direction='forward',
                   allow_exact_matches=False,  # True if you want same row matches
                   suffixes=['', '_future'])

df['distance_to_action'] = df.measurement - df.measurement_future
    measurement  action  measurement_future  distance_to_action
0           101       0               313.0              -212.0
1           322       0               313.0                 9.0
2           313       1               234.0                79.0
3           454       0               234.0               220.0
4           511       0               234.0               277.0
5           234      -1               134.0               100.0
6           122       0               134.0               -12.0
7           134       1               432.0              -298.0
8           222       0               432.0              -210.0
9           321       0               432.0              -111.0
10          221       0               432.0              -211.0
11          432      -1                 NaN                 NaN
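
To reproduce the table above end to end, the input frame can be rebuilt from its `measurement` and `action` columns. This is a sketch: the names `future` and `merged` are introduced here, and the suffixes are passed as a tuple rather than a list.

```python
import pandas as pd

# Input columns copied from the printed result above.
df = pd.DataFrame({
    "measurement": [101, 322, 313, 454, 511, 234, 122, 134, 222, 321, 221, 432],
    "action":      [0, 0, 1, 0, 0, -1, 0, 1, 0, 0, 0, -1],
})

# Rows where an action happened; only their measurement is needed.
future = df.loc[df["action"] != 0, ["measurement"]]

# For each row, look forward along the index to the next action row.
merged = pd.merge_asof(
    df,
    future,
    left_index=True,
    right_index=True,
    direction="forward",
    allow_exact_matches=False,  # an action row looks past itself
    suffixes=("", "_future"),
)
merged["distance_to_action"] = merged["measurement"] - merged["measurement_future"]
print(merged)
```

The last row has no later action row, so both new columns are NaN there.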

Vectorized way to find first column matching criteria in a Pandas DataFrame


By : user2253441
Date : March 29 2020, 07:55 AM
You can use NumPy's argmax, but you need to overwrite the rows where the condition is never met, because argmax returns 0 for an all-False row. The idxmax line is an equivalent pandas alternative:
code :
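mask = df.lt(0.5)
# Option 1: NumPy argmax on the underlying boolean array
df['first'] = np.where(mask.any(axis=1), df.columns[mask.values.argmax(axis=1)], 'No Match')
# Option 2 (same result): pandas idxmax
df['first'] = np.where(mask.any(axis=1), mask.idxmax(axis=1), 'No Match')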

print(df)

          A         B         C     first
0  0.548814  0.791725  0.978618  No Match
1  0.715189  0.528895  0.799159  No Match
2  0.602763  0.568045  0.461479         C
3  0.544883  0.925597  0.780529  No Match
4  0.423655  0.071036  0.118274         A
5  0.645894  0.087129  0.639921         B
6  0.437587  0.020218  0.143353         A
7  0.891773  0.832620  0.944669  No Match
8  0.963663  0.778157  0.521848  No Match
9  0.383442  0.870012  0.414662         A
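
A runnable sketch of the idxmax variant, using four of the rows printed above (original rows 0, 2, 4 and 5) so the result can be compared with the table. The result is kept in a separate `first` array here rather than written back into the frame.

```python
import numpy as np
import pandas as pd

# Rows 0, 2, 4 and 5 of the printed frame above.
df = pd.DataFrame({
    "A": [0.548814, 0.602763, 0.423655, 0.645894],
    "B": [0.791725, 0.568045, 0.071036, 0.087129],
    "C": [0.978618, 0.461479, 0.118274, 0.639921],
})

# True where the cell satisfies the criterion.
mask = df.lt(0.5)

# idxmax returns the first column holding the row maximum (the first True);
# rows with no True at all are overwritten with 'No Match'.
first = np.where(mask.any(axis=1), mask.idxmax(axis=1), "No Match")
print(list(first))
```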

Apply function to dataframe column based on criteria of a different dataframe while ensuring no duplicates


By : user3631866
Date : March 29 2020, 07:55 AM
Will this work for you? I added three or four lines to your code: two lists, lst and lst2, where lst2 accumulates the names that have already been chosen. Before a random name is returned for the New_Name column, the function checks that it is not already in lst2, which avoids duplicate names in the final df.
code :
lst2 = []  # names already handed out, shared across calls

def random_word(num):
    # candidate names for this Number
    numDF = nameDF[nameDF['Number'] == num]
    lst = numDF['Name'].tolist()
    # pick one random name that has not been used yet
    x = np.random.choice([i for i in lst if i not in lst2], 1)[0]
    lst2.append(x)
    return x

df['New_Name'] = df['Number'].apply(random_word)
df
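
The same idea can be sketched without module-level state by keeping the set of used names inside a closure. This is an alternative to the answer's approach, not a restatement of it: `make_unique_picker`, its `seed` parameter, and the sample `nameDF` and `df` contents are all hypothetical. It also raises a clear error when a Number runs out of unused names, a case in which `np.random.choice` would otherwise fail on an empty list.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names follow the answer above.
nameDF = pd.DataFrame({
    "Number": [1, 1, 1, 2, 2],
    "Name":   ["ann", "bob", "cy", "dee", "eve"],
})
df = pd.DataFrame({"Number": [1, 2, 1, 2, 1]})

def make_unique_picker(names, seed=None):
    """Return a function that picks a not-yet-used name for a given Number."""
    rng = np.random.default_rng(seed)
    used = set()  # names already handed out by this picker

    def pick(num):
        pool = names.loc[names["Number"] == num, "Name"]
        candidates = [name for name in pool if name not in used]
        if not candidates:
            raise ValueError(f"no unused names left for Number {num}")
        choice = str(rng.choice(candidates))
        used.add(choice)
        return choice

    return pick

df["New_Name"] = df["Number"].apply(make_unique_picker(nameDF, seed=0))
print(df)
```

Each call to `make_unique_picker` starts with a fresh `used` set, so rerunning the assignment does not depend on leftovers from a previous run.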