Implements a Voting ensemble, combining predictions from multiple base models through soft or hard voting.
Usage
voting_dia(
results_all_models,
data,
type = c("soft", "hard"),
weight_metric = "AUROC",
top = 5,
seed = 789,
threshold_choices = "f1",
positive_label_value = 1,
negative_label_value = 0,
new_positive_label = "Positive",
new_negative_label = "Negative"
)
Arguments
- results_all_models
A list of results from
models_dia()
, containing trained base model objects and their evaluation metrics.- data
A data frame where the first column is the sample ID, the second is the outcome label, and subsequent columns are features. Used for evaluation.
- type
A character string, "soft" for weighted average of probabilities or "hard" for majority class voting.
- weight_metric
A character string, the metric to use for weighting base models in soft voting (e.g., "AUROC", "F1"). Ignored for hard voting.
- top
An integer, the number of top-performing base models (ranked by
weight_metric
) to include in the ensemble.- seed
An integer, for reproducibility.
- threshold_choices
A character string (e.g., "f1", "youden", "default") or a numeric value (0-1) for determining the evaluation threshold for the ensemble.
- positive_label_value
A numeric or character value in the raw data representing the positive class.
- negative_label_value
A numeric or character value in the raw data representing the negative class.
- new_positive_label
A character string, the desired factor level name for the positive class (e.g., "Positive").
- new_negative_label
A character string, the desired factor level name for the negative class (e.g., "Negative").
Examples
# \donttest{
# 1. Initialize the modeling system
initialize_modeling_system_dia()
#> Diagnostic modeling system already initialized.
# 2. Create a toy dataset for demonstration
set.seed(42)
data_toy <- data.frame(
ID = paste0("Sample", 1:60),
Status = sample(c(0, 1), 60, replace = TRUE),
Feat1 = rnorm(60),
Feat2 = runif(60)
)
# 3. Generate mock base model results (as if from models_dia)
base_model_results <- models_dia(
data = data_toy,
model = c("rf", "lasso"),
seed = 123
)
#> Running model: rf
#> Warning: ci.auc() of a ROC curve with AUC == 1 is always 1-1 and can be misleading.
#> Running model: lasso
# 4. Run the soft voting ensemble
soft_voting_results <- voting_dia(
results_all_models = base_model_results,
data = data_toy,
type = "soft",
weight_metric = "AUROC",
top = 2,
threshold_choices = "f1"
)
#> Running Voting model: Voting_dia (type: soft)
#> Warning: ci.auc() of a ROC curve with AUC == 1 is always 1-1 and can be misleading.
print_model_summary_dia("Soft Voting", soft_voting_results)
#>
#> --- Soft Voting Model (on Training Data) Metrics ---
#> Ensemble Type: Voting (Type: soft, Weight Metric: AUROC, Base models used: rf, lasso)
#>
#> AUROC: 1.0000 (95% CI: 1.0000 - 1.0000)
#> AUPRC: 1.0000
#> Accuracy: 1.0000
#> F1: 1.0000
#> Precision: 1.0000
#> Recall: 1.0000
#> Specificity: 1.0000
#> --------------------------------------------------
# }