Python pandas DataFrame: fill NaNs with a conditional mean

I have the following DataFrame:

import numpy as np
import pandas as pd

df = pd.DataFrame(data={'Cat': ['A', 'A', 'A', 'B', 'B', 'A', 'B'],
                        'Vals': [1, 2, 3, 4, 5, np.nan, np.nan]})

  Cat  Vals
0   A     1
1   A     2
2   A     3
3   B     4
4   B     5
5   A   NaN
6   B   NaN

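I would like the rows at index 5 and 6 to be filled with the mean of 'Vals' conditional on the 'Cat' column, i.e. 2 (the mean of group A) and 4.5 (the mean of group B).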

The following code works fine:

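# Mean of 'Vals' per category (NaNs are skipped by default)
means = df.groupby('Cat').Vals.mean()

# Fill each missing value with the mean of its own category
for i in df[df.Vals.isnull()].index:
    df.loc[i, 'Vals'] = means[df.loc[i].Cat]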

  Cat  Vals
0   A     1
1   A     2
2   A     3
3   B     4
4   B     5
5   A     2
6   B   4.5

But I am looking for something better, along the lines of this pseudocode:

df.Vals.fillna(df.Vals.mean(Conditionally to column 'Cat'))

Edit: I found the following, which is one line shorter, but I am still not satisfied with it:

means = df.groupby('Cat').Vals.mean()
df.Vals = df.apply(lambda x: means[x.Cat] if pd.isnull(x.Vals) else x.Vals, axis=1)
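For reference, one vectorized approach that seems to match the fillna-style call wished for above is to combine `groupby(...).transform('mean')` with `fillna`. The sketch below is only a possible answer, not a definitive one. It rebuilds the example DataFrame from the question, and the names `group_means`, `filled` and `filled_via_map` are introduced here purely for illustration.

```python
import numpy as np
import pandas as pd

# Same example data as in the question
df = pd.DataFrame(data={'Cat': ['A', 'A', 'A', 'B', 'B', 'A', 'B'],
                        'Vals': [1, 2, 3, 4, 5, np.nan, np.nan]})

# transform('mean') returns a Series aligned with df's index, where every row
# holds the mean of its own 'Cat' group (NaNs are skipped when averaging)
group_means = df.groupby('Cat')['Vals'].transform('mean')

# fillna with an index-aligned Series replaces only the missing entries
filled = df['Vals'].fillna(group_means)

# Alternative sketch: look up each row's group mean via map, then fill
means = df.groupby('Cat')['Vals'].mean()
filled_via_map = df['Vals'].fillna(df['Cat'].map(means))

print(filled)
```

Assigning the result back with `df['Vals'] = filled` would update the column. Both variants avoid the Python-level loop and the row-wise `apply`, which generally matters more as the DataFrame grows.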