AssertionError: Times should be within the range of event times to avoid exterpolation #138

@rvandewater

Hi,

Thank you for creating this package.

I am encountering an error when building a survival regression model on my own dataset (traceback below). I am following the "Survival Regression with Auton-Survival" notebook with the Cox proportional hazards model (code below the error). My dataset is preprocessed from eICU, and the maximum time value is 168 in the train, test, and validation splits.

What I tried: replacing the 168 in the validation set with 167 gives the same error. I also checked the original example, and the situation appears to be the same there: the maximum time in validation equals the maximum time in training, yet no error is thrown.

Thank you for your help.

  nonnumeric_cols = [col for (col, dtype) in df.dtypes.iteritems() if dtype.name == "category" or dtype.kind not in "biuf"]

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
Cell In[44], line 22
     20     # Obtain survival probabilities for validation set and compute the Integrated Brier Score 
     21     predictions_val = model.predict_survival(x_val, times)
---> 22     metric_val = survival_regression_metric('ibs', y_val, predictions_val, times, y_tr)
     23     models.append([metric_val, model])
     25 # Select the best model based on the mean metric value computed for the validation set

File ~/projects/auton-survival/auton_survival/metrics.py:215, in survival_regression_metric(metric, outcomes, predictions, times, outcomes_train, n_bootstrap, random_seed)
    211     outcomes_train = outcomes
    212     warnings.warn("You are are evaluating model performance on the \
    213 same data used to estimate the censoring distribution.")
--> 215   assert max(times) < outcomes_train.time.max(), "Times should \
    216 be within the range of event times to avoid exterpolation."
    217   assert max(times) <= outcomes.time.max(), "Times \
    218 must be within the range of event times."
    220   survival_train = util.Surv.from_dataframe('event', 'time', outcomes_train)

AssertionError: Times should be within the range of event times to avoid exterpolation.

import numpy as np
from auton_survival.estimators import SurvivalModel
from auton_survival.metrics import survival_regression_metric
from sklearn.model_selection import ParameterGrid

# Define parameters for tuning the model
param_grid = {'l2' : [1e-3, 1e-4]}
params = ParameterGrid(param_grid)

# Define the times for model evaluation
times = np.quantile(y_tr['time'][y_tr['event']==1], np.linspace(0.1, 1, 10)).tolist()

# Perform hyperparameter tuning 
models = []
for param in params:
    model = SurvivalModel('cph', random_seed=2, l2=param['l2'])
    
    # The fit method is called to train the model
    model.fit(x_tr, y_tr)

    # Obtain survival probabilities for validation set and compute the Integrated Brier Score 
    predictions_val = model.predict_survival(x_val, times)
    metric_val = survival_regression_metric('ibs', y_val, predictions_val, times, y_tr)
    models.append([metric_val, model])
    
# Select the best model based on the mean metric value computed for the validation set
metric_vals = [i[0] for i in models]
first_min_idx = metric_vals.index(min(metric_vals))
model = models[first_min_idx][1]
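
A possible reading of the traceback, offered as a sketch rather than a confirmed diagnosis: the first assertion compares `max(times)` against the training outcomes (`y_tr`) with a strict `<`, so editing the validation set would not affect it. The quantile grid above ends at 1, and the q = 1 quantile of the event times is their maximum, so if the largest training time (168) belongs to an observed event, `max(times)` equals `outcomes_train.time.max()` and the strict check fails. The helper `filter_eval_times` and the toy numbers below are hypothetical and not part of auton-survival; they only mirror the two range checks shown in the traceback.

```python
def filter_eval_times(times, train_times, eval_times):
    """Keep only evaluation times that satisfy both range checks from the traceback.

    Hypothetical helper (not part of auton-survival):
      - strictly below the largest training time  (first assertion, '<')
      - at or below the largest evaluation time   (second assertion, '<=')
    """
    upper_train = max(train_times)
    upper_eval = max(eval_times)
    return [t for t in times if t < upper_train and t <= upper_eval]


# Toy data, assumed for illustration: both splits top out at 168.
train_times = [12.0, 48.0, 96.0, 168.0]
val_times = [24.0, 72.0, 120.0, 168.0]

# A grid whose last entry mimics the q = 1 quantile, i.e. the maximum training time.
times = [24.0, 72.0, 120.0, max(train_times)]

# The strict check from the traceback fails on the unfiltered grid.
fails_strict_check = not (max(times) < max(train_times))

# Dropping the offending time leaves a grid that passes both checks.
kept = filter_eval_times(times, train_times, val_times)
print(fails_strict_check, kept)
```

In the snippet from this issue, a comparable change might be to end the quantile grid below 1 (for example `np.linspace(0.1, 0.9, 9)`), assuming that dropping the final evaluation time is acceptable for the analysis.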
