I am trying to produce partial dependence plots for a machine learning analysis (using XGBClassifier). Right now I am using PDPIsolate from pdpbox. Instead of only producing the plot automatically with the built-in plotting, I would like to create a dataframe with two columns, one for the feature value and one for the predicted value, so that I can plot it on my own later. However, I can't figure out how to extract the values that PDPIsolate uses to create the plot.
Right now my code is:
from pdpbox import pdp

# Partial dependence of the fitted model on the 'mei' feature
pdp_iso = pdp.PDPIsolate(
    model=model,
    df=x_train_scaled,
    model_features=x_train_scaled.columns.tolist(),
    feature='mei',
    feature_name='mei',
)
I would like to create an output dataframe like this:
pdp_data = pd.DataFrame({
    'feature_values': pdp_iso.results,  # this doesn't work/exist
    'pdp': pdp_iso.pdp,  # this doesn't work/exist
    'feature_name': 'mei'
})
Does PDPIsolate have the functionality to do something like this? Or is there another package that can do this?
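To make the target concrete, here is a minimal sketch of the table I'm after, computed by hand rather than through pdpbox. It assumes a binary classifier that exposes `predict_proba` and a grid built from percentiles of the feature; `manual_pdp` and `num_grid_points` are hypothetical names used only for illustration, and the grid pdpbox actually uses may differ.

```python
import numpy as np
import pandas as pd


def manual_pdp(model, df, feature, num_grid_points=10):
    """Hand-rolled partial dependence for one feature (hypothetical helper, not pdpbox)."""
    # Grid of feature values: evenly spaced percentiles of the observed data.
    grid = np.unique(np.percentile(df[feature], np.linspace(0, 100, num_grid_points)))
    pdp_values = []
    for value in grid:
        df_mod = df.copy()
        df_mod[feature] = value  # set every row to the same feature value
        # Average predicted probability of the positive class (binary classifier assumed).
        pdp_values.append(model.predict_proba(df_mod)[:, 1].mean())
    return pd.DataFrame({
        "feature_values": grid,
        "pdp": pdp_values,
        "feature_name": feature,
    })
```

With the names from my code above, the call would be `pdp_data = manual_pdp(model, x_train_scaled, 'mei')`. Ideally I would get the same kind of table straight from the PDPIsolate result instead of recomputing it.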