I would like to create a column of boolean values based on the evaluation of one other column using pandas. Ideally, I would like to do it with syntax similar to what I've copied, but if that's impossible, I'm open to other suggestions.
df is a pandas DataFrame, and AggRow is a column of integers.
So, I have data for AggRow that have a range of values. I can successfully create a new column, conditionmet, based on a single criterion, like so, if I want conditionmet to be True wherever AggRow is less than or equal to 6001:
conditionmet = df['AggRow'] <= 6001
But if I want conditionmet to be true if AggRow is <= 6001 or between 10001 and 10009, inclusive, I am having trouble. The following expression yields conditionmet = True only for the first condition, i.e., where AggRow <= 6001, apparently ignoring what I'm telling it about 10001 and 10009.
conditionmet = ((df['AggRow'] <= 6001) | ((df['AggRow'] >= 10001) & (df['AggRow'] <= 10009)))
How do I make conditionmet = True wherever AggRow is <= 6001, or is both >= 10001 and <= 10009? Again, I'd like an answer that uses similar syntax, if possible. Thank you.
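For reference, here is a minimal sketch of the combined condition on made-up integer data. The sample values and the name conditionmet_alt are assumptions for illustration only, not the real AggRow contents. On data like this the expression does flag both ranges, so it may be worth checking whether AggRow is really stored as integers and whether any rows actually fall between 10001 and 10009.

```python
import pandas as pd

# Made-up sample values; the real AggRow data is not shown in the post.
df = pd.DataFrame(
    {"AggRow": [5000, 6001, 6002, 10000, 10001, 10005, 10009, 10010]}
)

# Combined condition: <= 6001, or between 10001 and 10009 inclusive.
conditionmet = (df["AggRow"] <= 6001) | (
    (df["AggRow"] >= 10001) & (df["AggRow"] <= 10009)
)

# Equivalent form using Series.between, which includes both endpoints by default.
conditionmet_alt = (df["AggRow"] <= 6001) | df["AggRow"].between(10001, 10009)

df["conditionmet"] = conditionmet
print(df)
```

If the real column turns out to hold strings rather than integers, comparisons like these would not behave as intended, which is one possible (unconfirmed) explanation for the result described above.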
Try using .loc; this populates the column 'conditionmet' with 'True' wherever your conditions are met. Note that the quoted 'True' is a string; drop the quotes if you want an actual boolean.
df.loc[(df['AggRow'] <= 6001) | ((df['AggRow'] <= 10009) & (df['AggRow'] >= 10001)), 'conditionmet'] = 'True'
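As a rough sketch of the same .loc pattern on made-up data, using the boolean True instead of a string (the sample values and the name mask are assumptions for illustration):

```python
import pandas as pd

# Made-up sample values; the real AggRow data is not shown in the post.
df = pd.DataFrame({"AggRow": [5000, 6002, 10001, 10009, 10010]})

# Boolean mask: <= 6001, or between 10001 and 10009 inclusive.
mask = (df["AggRow"] <= 6001) | (
    (df["AggRow"] >= 10001) & (df["AggRow"] <= 10009)
)

# Assign True only to the matching rows; this creates the column if it
# does not exist yet, and rows outside the mask are left as NaN.
df.loc[mask, "conditionmet"] = True
print(df)
```

Rows that do not match the mask end up as NaN rather than False.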
You can also fill the NaNs (i.e., every record where the conditions above were not met) with 'False' if you wish: