I'm trying to impute the NA values in the engine_capacity column with the median engine_capacity of the same car_model.
For every NA value in the nancap dataframe, I want to substitute the median engine_capacity from the cap dataframe, but only for the row with the matching car_model. I tried the following code, but it didn't work. (Sorry if my question is not clear.)
import pandas as pd

url = 'https://raw.githubusercontent.com/YousefAlotaibi/saudi_used_cars_price_prediciton/main/data/cars_cleaned_data.csv'
df = pd.read_csv(url)
df.head()
# median engine_capacity per car_model
cap = df.groupby('car_model')['engine_capacity'].median().reset_index()
nancap = df[['engine_capacity', 'car_model']]
for i, z in nancap.itertuples(index=False):
    if i.is_integer() == False:  # if NA
        for c, ca in cap.itertuples(index=False):
            if c == z:  # if car_model c in cap == car_model z in nancap
                i = ca  # assign median engine capacity which is ca to i
🟢 Solution
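One reason the loop in the question has no effect: `i = ca` only rebinds the local loop variable, so nothing is ever written back to the dataframe. In addition, `i.is_integer() == False` is true for any non-whole number such as 2.5, not just for NaN. A common way to do this without a loop is `groupby(...).transform('median')` followed by `fillna`. Below is a minimal sketch of that approach. The small dataframe is a made-up stand-in for the real CSV (its values are assumptions for illustration), and only the column names `car_model` and `engine_capacity` come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real dataset; the values are invented.
df = pd.DataFrame({
    "car_model": ["Camry", "Camry", "Camry", "Accent", "Accent", "Patrol"],
    "engine_capacity": [2.5, np.nan, 3.5, 1.6, np.nan, np.nan],
})

# Median engine_capacity within each car_model group.
# transform returns a Series aligned with df's index, one value per row.
group_median = df.groupby("car_model")["engine_capacity"].transform("median")

# Replace only the missing values with their group's median.
# A group with no known capacity at all has a NaN median, so it stays NaN.
df["engine_capacity"] = df["engine_capacity"].fillna(group_median)

print(df)
```

Models whose rows are all NA (like the hypothetical "Patrol" row here) remain NA after this step, so they would need a separate fallback, for example the overall median, if that is acceptable for the data.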