python - pandas DataFrame: group by several columns, apply a function and map the result back


Here is an example:

import numpy as np
import pandas as pd

np.random.seed(1)
df = pd.DataFrame({"x": np.random.random(size=10),
                   "y": np.arange(10)})
# z is 0 where x < 0.5 and 1 otherwise
df["z"] = np.where(df.x < 0.5, 0, 1)
print(df)

It gives the following result:

          x  y  z
0  0.417022  0  0
1  0.720324  1  1
2  0.000114  2  0
3  0.302333  3  0
4  0.146756  4  0
5  0.092339  5  0
6  0.186260  6  0
7  0.345561  7  0
8  0.396767  8  0
9  0.538817  9  1

I want to add a new column, mean, to df that holds the mean of the x column, grouped by the y and z columns. I know how to compute the mean values:

tmp = df.groupby(["y", "z"]).mean() 

However, I can't figure out how to map the results back into a new column named mean.

Use transform to add the result of a groupby operation as a column. transform returns a Series whose index is aligned with the original df:

In [15]:
df['mean'] = df.groupby(["y", "z"])['x'].transform('mean')
df

Out[15]:
          x  y  z      mean
0  0.423578  0  0  0.423578
1  0.270675  1  0  0.270675
2  0.707611  2  1  0.707611
3  0.589192  3  1  0.589192
4  0.768653  4  1  0.768653
5  0.420577  5  0  0.420577
6  0.930490  6  1  0.930490
7  0.380576  7  0  0.380576
8  0.055940  8  0  0.055940
9  0.678355  9  1  0.678355
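
In the output above every (y, z) group contains a single row, because y is unique, so the mean simply repeats x. The sketch below uses a small made-up frame (the values are hypothetical, not from the question) in which each group has two rows, to show how transform broadcasts the group mean back to every member row. It also shows what should be an equivalent route, aggregating and then merging on the keys; the name mean_merged is only illustrative.

```python
import pandas as pd

# Hypothetical toy data: each (y, z) group has two rows,
# so the group mean differs from the individual x values.
df = pd.DataFrame({
    "x": [1.0, 3.0, 10.0, 20.0, 5.0, 7.0],
    "y": [0, 0, 1, 1, 0, 0],
    "z": [0, 0, 0, 0, 1, 1],
})

# transform returns one value per original row, indexed like df,
# so it can be assigned directly as a new column.
df["mean"] = df.groupby(["y", "z"])["x"].transform("mean")

# Alternative route: aggregate to one row per group, then merge the
# per-group means back onto the original rows using the group keys.
group_means = (
    df.groupby(["y", "z"], as_index=False)["x"].mean()
      .rename(columns={"x": "mean_merged"})
)
merged = df.merge(group_means, on=["y", "z"], how="left")

print(merged)
```

transform is usually the shorter option when the result only needs to sit next to the original rows; the merge route may be handy when the per-group table is also needed on its own.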
