python - pandas DataFrame: group by several columns, apply a function and map the result back
Here is an example:
    import numpy as np
    import pandas as pd

    np.random.seed(1)
    df = pd.DataFrame({"x": np.random.random(size=10), "y": np.arange(10)})
    df["z"] = np.where(df.x < 0.5, 0, 1)  # 0 where x < 0.5, otherwise 1
    print(df)
It gives the following result:
              x  y  z
    0  0.417022  0  0
    1  0.720324  1  1
    2  0.000114  2  0
    3  0.302333  3  0
    4  0.146756  4  0
    5  0.092339  5  0
    6  0.186260  6  0
    7  0.345561  7  0
    8  0.396767  8  0
    9  0.538817  9  1
I want to add a new column mean to df, holding the mean values of the x column grouped by the y and z columns. I know how to compute the mean values:

    tmp = df.groupby(["y", "z"]).mean()

However, I can't find out how to map the results back to a new column mean.
Use transform to add the result of a groupby operation as a column. transform returns a Series whose index is aligned with the original df:
    In [15]: df['mean'] = df.groupby(["y", "z"])["x"].transform('mean')
             df
    Out[15]:
              x  y  z      mean
    0  0.423578  0  0  0.423578
    1  0.270675  1  0  0.270675
    2  0.707611  2  1  0.707611
    3  0.589192  3  1  0.589192
    4  0.768653  4  1  0.768653
    5  0.420577  5  0  0.420577
    6  0.930490  6  1  0.930490
    7  0.380576  7  0  0.380576
    8  0.055940  8  0  0.055940
    9  0.678355  9  1  0.678355
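In the example above, y is unique in every row, so each (y, z) group contains a single row and the mean simply equals x. The sketch below uses small made-up data with repeating (y, z) pairs to make the effect of transform easier to see. The data values and the column name mean_joined are assumptions for illustration only. It also shows one possible alternative, aggregating first and joining the result back on the group keys, which should give the same values.

```python
import pandas as pd

# Hypothetical data: the (y, z) pairs repeat, so each group has two rows.
df = pd.DataFrame({
    "x": [1.0, 3.0, 2.0, 4.0, 10.0, 20.0],
    "y": [0, 0, 1, 1, 2, 2],
    "z": [0, 0, 0, 0, 1, 1],
})

# Select the "x" column before calling transform so the result is a
# Series with the same index as df; it can be assigned directly.
df["mean"] = df.groupby(["y", "z"])["x"].transform("mean")

# Alternative: aggregate to one value per group, then join that Series
# back onto df using the group keys.
group_means = df.groupby(["y", "z"])["x"].mean().rename("mean_joined")
df = df.join(group_means, on=["y", "z"])

print(df)
```

Both columns carry the group mean repeated on every row of the group. transform is usually the shorter option when only one derived column is needed; the join form may be convenient when the aggregated table is needed on its own as well.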