Skip to contents

Trains and evaluates one or more registered diagnostic models on a given dataset.

Usage

models_dia(
  data,
  model = "all_dia",
  tune = FALSE,
  seed = 123,
  threshold_choices = "default",
  positive_label_value = 1,
  negative_label_value = 0,
  new_positive_label = "Positive",
  new_negative_label = "Negative"
)

Arguments

data

A data frame where the first column is the sample ID, the second is the outcome label, and subsequent columns are features.

model

A character string or vector of character strings, specifying which models to run. Use "all_dia" to run all registered models.

tune

Logical, whether to enable hyperparameter tuning for individual models.

seed

An integer, for reproducibility of random processes.

threshold_choices

A character string (e.g., "f1", "youden", "default") or a numeric value (0-1), or a named list/vector allowing different threshold strategies/values for each model.

positive_label_value

A numeric or character value in the raw data representing the positive class.

negative_label_value

A numeric or character value in the raw data representing the negative class.

new_positive_label

A character string, the desired factor level name for the positive class (e.g., "Positive").

new_negative_label

A character string, the desired factor level name for the negative class (e.g., "Negative").

Value

A named list, where each element corresponds to a run model and contains its trained model_object, sample_score data frame, and evaluation_metrics.

Examples

# \donttest{
# This example assumes your package includes a dataset named 'train_dia'.
# If not, you should create a toy data frame similar to the one below.
#
# train_dia <- data.frame(
#   ID = paste0("Patient", 1:100),
#   Disease_Status = sample(c(0, 1), 100, replace = TRUE),
#   FeatureA = rnorm(100),
#   FeatureB = runif(100)
# )

# Ensure the 'train_dia' dataset is available in the environment
# For example, if it is exported by your package:
# data(train_dia)

# Check if 'train_dia' exists, otherwise skip the example
if (exists("train_dia")) {
  # 1. Initialize the modeling system
  initialize_modeling_system_dia()

  # 2. Run selected models
  results <- models_dia(
    data = train_dia,
    model = c("rf", "lasso"), # Run only Random Forest and Lasso
    threshold_choices = list(rf = "f1", lasso = 0.6), # Different thresholds
    positive_label_value = 1,
    negative_label_value = 0,
    new_positive_label = "Case",
    new_negative_label = "Control",
    seed = 42
  )

  # 3. Print summaries
  for (model_name in names(results)) {
    print_model_summary_dia(model_name, results[[model_name]])
  }
}
#> Diagnostic modeling system already initialized.
#> Running model: rf
#> Warning: ci.auc() of a ROC curve with AUC == 1 is always 1-1 and can be misleading.
#> Running model: lasso
#> 
#> --- rf Model (on Training Data) Metrics ---
#> 
#> AUROC: 1.0000 (95% CI: 1.0000 - 1.0000)
#> AUPRC: 1.0000
#> Accuracy: 1.0000
#> F1: 1.0000
#> Precision: 1.0000
#> Recall: 1.0000
#> Specificity: 1.0000
#> --------------------------------------------------
#> 
#> --- lasso Model (on Training Data) Metrics ---
#> 
#> AUROC: 0.9946 (95% CI: 0.9910 - 0.9983)
#> AUPRC: 0.9995
#> Accuracy: 0.9722
#> F1: 0.9847
#> Precision: 0.9822
#> Recall: 0.9872
#> Specificity: 0.8250
#> --------------------------------------------------
# }