Monday, August 29, 2016
One Line Code Challenge: Special Merge
Suppose we have two data frames:
In[29]: df_data
Out[29]:
A B C val_0
0 Y M 1 1
1 Y F 1 2
2 N M 1 3
3 N F 1 4
4 Y M 2 5
5 Y F 2 4
6 N M 2 3
7 N F 2 2
In[30]: df_rule
Out[30]:
A B C val_1
0 Y NaN NaN 100
1 N F NaN 200
2 N M NaN 300
In the context of an ordinary merge on ['A','B','C'], there is no matches because there are NaN in the df_rule.
In the context of a special merge, we allow NaN to be anything. That is to say
(Y,M,1) can match with (Y,M,NaN) (or (Y,NaN,1)etc)
The task is to achieve this special merge.
Solution:
df_rule.groupby(df_rule.apply(lambda x: '-'.join(x[x.notnull()].index.values), axis=1)).apply(lambda df: pd.merge(df, df_data,on=df.dropna(axis=1,how='all').columns.intersection(df_data.columns).tolist(), how='left',suffixes=('_to_remove','')))[df_rule.columns.union(df_data.columns)]
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment