Title: | Explainers for Regression Models in HIV Research |
---|---|
Description: | A dedicated viral-explainer model tool designed to empower researchers in the field of HIV research, particularly in viral load and CD4 (Cluster of Differentiation 4) lymphocytes regression modeling. It draws inspiration from the 'tidymodels' framework for rigorous model building by Max Kuhn and Hadley Wickham (2020) <https://www.tidymodels.org>, and from the 'DALEXtra' tool for explainability by Przemyslaw Biecek (2020) <doi:10.48550/arXiv.2009.13248>. It aims to facilitate interpretable and reproducible research in biostatistics and computational biology for the benefit of understanding HIV dynamics. |
Authors: | Juan Pablo Acuña González [aut, cre] |
Maintainer: | Juan Pablo Acuña González <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.3.1 |
Built: | 2025-01-03 03:58:33 UTC |
Source: | https://github.com/juanv66x/viralx |
glob_cr_vis generates a visualization of the global feature importance of a Cubist Rules (CR) model trained on HIV data with specified hyperparameters.
glob_cr_vis(vip_featured, hiv_data, cr_hyperparameters, vip_train, v_train)
vip_featured | The name of the response variable to explain. |
hiv_data | The training dataset containing predictor variables and the response variable. |
cr_hyperparameters | A list of hyperparameters for the CR model; the example for this function passes neighbors and committees. |
vip_train | The dataset used for training the CR model. |
v_train | The response variable used for training the CR model. |
A visualization of global feature importance for the CR model.
## Not run:
library(dplyr)
library(rsample)
library(rules)
library(Cubist)
set.seed(123)
hiv_data <- train2
cr_hyperparameters <- list(neighbors = 5, committees = 58)
# Response variable and predictor columns
vip_featured <- c("cd_2022")
vip_features <- c("cd_2019", "vl_2019", "cd_2021", "vl_2021", "vl_2022")
vip_train <- train2 |>
  dplyr::select(rsample::all_of(vip_features))
v_train <- train2 |>
  dplyr::select(rsample::all_of(vip_featured))
glob_cr_vis(vip_featured, hiv_data, cr_hyperparameters, vip_train, v_train)
## End(Not run)
glob_knn_vis generates a visualization of the global feature importance of a K-Nearest Neighbors (KNN) model trained on HIV data with specified hyperparameters.
glob_knn_vis(vip_featured, hiv_data, knn_hyperparameters, vip_train, v_train)
vip_featured | The name of the response variable to explain. |
hiv_data | The training dataset containing predictor variables and the response variable. |
knn_hyperparameters | A list of hyperparameters for the KNN model; the example for this function passes neighbors, weight_func, and dist_power. |
vip_train | The dataset used for training the KNN model. |
v_train | The response variable used for training the KNN model. |
A visualization of global feature importance for the KNN model.
## Not run:
library(dplyr)
library(rsample)
set.seed(123)
hiv_data <- train2
knn_hyperparameters <- list(neighbors = 5, weight_func = "optimal", dist_power = 0.3304783)
# Response variable and predictor columns
vip_featured <- "cd_2022"
vip_features <- c("cd_2019", "vl_2019", "cd_2021", "vl_2021", "vl_2022")
vip_train <- train2 |>
  select(all_of(vip_features))
v_train <- train2 |>
  select(all_of(vip_featured))
glob_knn_vis(vip_featured, hiv_data, knn_hyperparameters, vip_train, v_train)
## End(Not run)
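To make the KNN hyperparameters more concrete, the following base R sketch shows what neighbors and dist_power could control, assuming they carry the meaning they have in parsnip's nearest_neighbor() (number of neighbors and Minkowski distance exponent). It is not the package's implementation: the helpers minkowski() and knn_predict() are hypothetical, predictors are left unscaled, and neighbors are averaged with equal weights rather than with the "optimal" kernel used in the examples. The toy values are the first eight entries of the example vectors in this manual.

```r
# Hypothetical helper: Minkowski distance with exponent p (assumed role of dist_power).
minkowski <- function(a, b, p) sum(abs(a - b)^p)^(1 / p)

# Hypothetical helper: unweighted KNN regression for a single new observation.
knn_predict <- function(train_x, train_y, new_x, neighbors = 5, dist_power = 2) {
  # Distance from every training row to the new observation
  d <- apply(train_x, 1, minkowski, b = new_x, p = dist_power)
  # Indices of the closest rows
  nearest <- order(d)[seq_len(neighbors)]
  # Prediction is the plain average of the neighbors' responses
  mean(train_y[nearest])
}

# Toy data: two predictors and the response
train_x <- cbind(
  cd_2019 = c(824, 169, 342, 423, 441, 507, 559, 173),
  cd_2021 = c(992, 275, 331, 454, 479, 553, 496, 230)
)
train_y <- c(700, 127, 127, 547, 547, 547, 777, 149)

new_x <- c(cd_2019 = 450, cd_2021 = 470)
pred <- knn_predict(train_x, train_y, new_x, neighbors = 3, dist_power = 2)
print(pred)
```

Changing dist_power alters which rows count as nearest, and increasing neighbors smooths the prediction toward the overall mean of the response.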
The glob_nn_vis function generates a global visualization of SHAP (SHapley Additive exPlanations) values for a neural network model. It utilizes the DALEXtra package to explain the model's predictions and then creates a global SHAP visualization.
glob_nn_vis(vip_featured, hiv_data, hu, plty, epo, vip_train, v_train)
vip_featured | A character value specifying the featured variable of interest. |
hiv_data | A data frame containing the HIV research data used for model training. |
hu | A numeric value specifying the number of hidden units in the neural network model. |
plty | A numeric value specifying the penalty parameter for the neural network model. |
epo | A numeric value specifying the number of epochs (training iterations) for the neural network model. |
vip_train | A data frame containing the training data used to fit the neural network model. |
v_train | A numeric vector representing the response variable corresponding to the training data. |
A global visualization of SHAP values for the specified neural network model.
## Not run:
library(dplyr)
library(rsample)
cd_2019 <- c(824, 169, 342, 423, 441, 507, 559, 173, 764, 780, 244, 527, 417, 800, 602, 494, 345, 780, 780, 527, 556, 559, 238, 288, 244, 353, 169, 556, 824, 169, 342, 423, 441, 507, 559)
vl_2019 <- c(40, 11388, 38961, 40, 75, 4095, 103, 11388, 46, 103, 11388, 40, 0, 11388, 0, 4095, 40, 93, 49, 49, 49, 4095, 6837, 38961, 38961, 0, 0, 93, 40, 11388, 38961, 40, 75, 4095, 103)
cd_2021 <- c(992, 275, 331, 454, 479, 553, 496, 230, 605, 432, 170, 670, 238, 238, 634, 422, 429, 513, 327, 465, 479, 661, 382, 364, 109, 398, 209, 1960, 992, 275, 331, 454, 479, 553, 496)
vl_2021 <- c(80, 1690, 5113, 71, 289, 3063, 0, 262, 0, 15089, 13016, 1513, 60, 60, 49248, 159308, 56, 0, 516675, 49, 237, 84, 292, 414, 26176, 62, 126, 93, 80, 1690, 5113, 71, 289, 3063, 0)
cd_2022 <- c(700, 127, 127, 547, 547, 547, 777, 149, 628, 614, 253, 918, 326, 326, 574, 361, 253, 726, 659, 596, 427, 447, 326, 253, 248, 326, 260, 918, 700, 127, 127, 547, 547, 547, 777)
vl_2022 <- c(0, 0, 53250, 0, 40, 1901, 0, 955, 0, 0, 0, 0, 40, 0, 49248, 159308, 56, 0, 516675, 49, 237, 0, 23601, 0, 40, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
x <- cbind(cd_2019, vl_2019, cd_2021, vl_2021, cd_2022, vl_2022) |>
  as.data.frame()
set.seed(123)
hi_data <- rsample::initial_split(x)
set.seed(123)
hiv_data <- hi_data |>
  rsample::training()
# Neural network hyperparameters: hidden units, penalty, epochs
hu <- 5
plty <- 1.131656e-09
epo <- 176
vip_featured <- c("cd_2022")
vip_features <- c("cd_2019", "vl_2019", "cd_2021", "vl_2021", "vl_2022")
set.seed(123)
vi_train <- rsample::initial_split(x)
set.seed(123)
vip_train <- vi_train |>
  rsample::training() |>
  dplyr::select(rsample::all_of(vip_features))
v_train <- vi_train |>
  rsample::training() |>
  dplyr::select(rsample::all_of(vip_featured))
glob_nn_vis(vip_featured, hiv_data, hu, plty, epo, vip_train, v_train)
## End(Not run)
This dataset contains training data for viral load explainer models. It includes CD4 and viral load measurements for different years.
data(train2)
A tibble (data frame) with 25 rows and 6 columns.
To explore more rows of this dataset, you can use the print(n = ...) function.
Juan Pablo Acuña González [email protected]
data(train2)
train2
Explains the predictions of a K-Nearest Neighbors (KNN) model for CD4 and viral load data using the DALEX and DALEXtra packages. It provides insights into the specified variable's impact on the KNN model's predictions.
viralx_knn(vip_featured, hiv_data, knn_hyperparameters, vip_train, vip_new)
vip_featured | The name of the variable to be explained. |
hiv_data | The data frame containing the CD4 and viral load data. |
knn_hyperparameters | A list of hyperparameters for the KNN model; the example for this function passes neighbors, weight_func, and dist_power. |
vip_train | The training data used for creating the explainer object. |
vip_new | A new observation for which to generate explanations. |
A data frame containing explanations for the specified variable.
## Not run:
hiv_data <- train2
knn_hyperparameters <- list(neighbors = 5, weight_func = "optimal", dist_power = 0.3304783)
vip_featured <- "cd_2022"
vip_train <- hiv_data
# Explain the first training observation
vip_new <- vip_train[1, ]
viralx_knn(vip_featured, hiv_data, knn_hyperparameters, vip_train, vip_new)
## End(Not run)
viralx_knn_glob calculates global feature importance for a K-Nearest Neighbors (KNN) model trained on HIV data with specified hyperparameters.
viralx_knn_glob(vip_featured, hiv_data, knn_hyperparameters, vip_train, v_train)
vip_featured | The name of the response variable to explain. |
hiv_data | The training dataset containing predictor variables and the response variable. |
knn_hyperparameters | A list of hyperparameters for the KNN model; the example for this function passes neighbors, weight_func, and dist_power. |
vip_train | The dataset used for training the KNN model. |
v_train | The response variable used for training the KNN model. |
A list of global feature importance measures for each predictor variable.
## Not run:
library(dplyr)
library(rsample)
set.seed(123)
hiv_data <- train2
knn_hyperparameters <- list(neighbors = 5, weight_func = "optimal", dist_power = 0.3304783)
# Response variable and predictor columns
vip_featured <- "cd_2022"
vip_features <- c("cd_2019", "vl_2019", "cd_2021", "vl_2021", "vl_2022")
vip_train <- train2 |>
  select(all_of(vip_features))
v_train <- train2 |>
  select(all_of(vip_featured))
viralx_knn_glob(vip_featured, hiv_data, knn_hyperparameters, vip_train, v_train)
## End(Not run)
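Global feature importance is commonly computed by permutation: shuffle one predictor, measure how much the model's loss increases, and repeat for each predictor. The base R sketch below illustrates that idea only. Whether viralx_knn_glob uses exactly this procedure is an assumption, the model is a linear regression stand-in rather than KNN, the data are synthetic, and the helpers rmse() and perm_importance() are hypothetical names.

```r
set.seed(123)

# Synthetic data; column names mirror this manual's examples, values are made up.
n <- 60
toy <- data.frame(
  cd_2019 = rnorm(n, 500, 150),
  cd_2021 = rnorm(n, 500, 150),
  vl_2021 = rexp(n, 1 / 1000)
)
toy$cd_2022 <- 0.2 * toy$cd_2019 + 0.7 * toy$cd_2021 + rnorm(n, 0, 20)

# Stand-in model (assumption: any fitted regression model with predict() would do)
fit <- lm(cd_2022 ~ cd_2019 + cd_2021 + vl_2021, data = toy)

# Root mean squared error of the model on a data set
rmse <- function(model, data) {
  sqrt(mean((data$cd_2022 - predict(model, newdata = data))^2))
}

# Hypothetical helper: mean RMSE increase after shuffling each predictor
perm_importance <- function(model, data, predictors, n_rep = 20) {
  baseline <- rmse(model, data)
  sapply(predictors, function(p) {
    losses <- replicate(n_rep, {
      shuffled <- data
      shuffled[[p]] <- sample(shuffled[[p]])  # break the link between p and the response
      rmse(model, shuffled)
    })
    mean(losses) - baseline
  })
}

importance <- perm_importance(fit, toy, c("cd_2019", "cd_2021", "vl_2021"))
print(sort(importance, decreasing = TRUE))
```

A predictor the model relies on heavily shows a large loss increase when shuffled, while an irrelevant one stays near zero.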
viralx_knn_shap calculates SHAP (SHapley Additive exPlanations) values for a K-Nearest Neighbors (KNN) model trained on HIV data with specified hyperparameters.
viralx_knn_shap(vip_featured, hiv_data, knn_hyperparameters, vip_train, vip_new, orderings)
vip_featured | The name of the response variable to explain. |
hiv_data | The training dataset containing predictor variables and the response variable. |
knn_hyperparameters | A list of hyperparameters for the KNN model; the example for this function passes neighbors, weight_func, and dist_power. |
vip_train | The dataset used for training the KNN model. |
vip_new | The dataset for which SHAP values are calculated. |
orderings | The number of orderings for SHAP value calculations. |
A list of SHAP values for each observation in vip_new.
## Not run:
set.seed(123)
hiv_data <- train2
knn_hyperparameters <- list(neighbors = 5, weight_func = "optimal", dist_power = 0.3304783)
vip_featured <- "cd_2022"
vip_train <- hiv_data
# Explain the first training observation using 20 random orderings
vip_new <- vip_train[1, ]
orderings <- 20
viralx_knn_shap(vip_featured, hiv_data, knn_hyperparameters, vip_train, vip_new, orderings)
## End(Not run)
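The orderings argument suggests that SHAP values are approximated by averaging contributions over random orderings of the predictors. The base R sketch below illustrates that general technique under stated assumptions: the model is a linear regression stand-in, the data are synthetic, and shap_by_orderings() is a hypothetical helper, not a function from viralx, DALEX, or DALEXtra.

```r
set.seed(123)

# Synthetic data; column names mirror this manual's examples, values are made up.
n <- 50
toy <- data.frame(
  cd_2019 = rnorm(n, 500, 150),
  cd_2021 = rnorm(n, 500, 150),
  vl_2021 = rexp(n, 1 / 1000)
)
toy$cd_2022 <- 0.2 * toy$cd_2019 + 0.7 * toy$cd_2021 -
  0.01 * toy$vl_2021 + rnorm(n, 0, 20)

predictors <- c("cd_2019", "cd_2021", "vl_2021")
fit <- lm(cd_2022 ~ cd_2019 + cd_2021 + vl_2021, data = toy)
new_obs <- toy[1, predictors]

# Hypothetical helper: average per-feature contributions over random orderings
shap_by_orderings <- function(model, background, new_obs, predictors, orderings = 20) {
  contrib <- matrix(0, nrow = orderings, ncol = length(predictors),
                    dimnames = list(NULL, predictors))
  for (i in seq_len(orderings)) {
    current <- background
    previous <- mean(predict(model, newdata = current))
    # Walk through the predictors in a random order
    for (p in sample(predictors)) {
      current[[p]] <- new_obs[[p]]  # fix predictor p at the new observation's value
      updated <- mean(predict(model, newdata = current))
      contrib[i, p] <- updated - previous  # change in mean prediction attributed to p
      previous <- updated
    }
  }
  colMeans(contrib)
}

shap <- shap_by_orderings(fit, toy[predictors], new_obs, predictors, orderings = 20)
print(shap)
```

By construction the contributions sum to the difference between the prediction for the new observation and the average prediction over the background data. More orderings reduce sampling noise for models with interactions, at the cost of more predictions.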
Visualizes SHAP (Shapley Additive Explanations) values for a KNN (K-Nearest Neighbor) model by employing the DALEXtra and DALEX packages to provide visual insights into the impact of a specified variable on the model's predictions.
viralx_knn_vis(vip_featured, hiv_data, knn_hyperparameters, vip_train, vip_new, orderings)
vip_featured | The name of the response variable to explain. |
hiv_data | The training dataset containing predictor variables and the response variable. |
knn_hyperparameters | A list of hyperparameters for the KNN model; the example for this function passes neighbors, weight_func, and dist_power. |
vip_train | The dataset used for training the KNN model. |
vip_new | The dataset for which SHAP values are calculated. |
orderings | The number of orderings for SHAP value calculations. |
A list of SHAP values for each observation in vip_new.
## Not run:
set.seed(123)
hiv_data <- train2
knn_hyperparameters <- list(neighbors = 5, weight_func = "optimal", dist_power = 0.3304783)
vip_featured <- "cd_2022"
vip_train <- hiv_data
# Visualize SHAP values for the first training observation
vip_new <- vip_train[1, ]
orderings <- 20
viralx_knn_vis(vip_featured, hiv_data, knn_hyperparameters, vip_train, vip_new, orderings)
## End(Not run)
Explains the predictions of a Multivariate Adaptive Regression Splines (MARS) model for viral load or CD4 counts using the DALEX and DALEXtra tools.
viralx_mars(vip_featured, hiv_data, nt, pd, pru, vip_train, vip_new)
vip_featured | A character value |
hiv_data | A data frame |
nt | A numeric value |
pd | A numeric value |
pru | A character value |
vip_train | A data frame |
vip_new | A numeric vector |
A data frame
## Not run:
library(dplyr)
library(rsample)
cd_2019 <- c(824, 169, 342, 423, 441, 507, 559, 173, 764, 780, 244, 527, 417, 800, 602, 494, 345, 780, 780, 527, 556, 559, 238, 288, 244, 353, 169, 556, 824, 169, 342, 423, 441, 507, 559)
vl_2019 <- c(40, 11388, 38961, 40, 75, 4095, 103, 11388, 46, 103, 11388, 40, 0, 11388, 0, 4095, 40, 93, 49, 49, 49, 4095, 6837, 38961, 38961, 0, 0, 93, 40, 11388, 38961, 40, 75, 4095, 103)
cd_2021 <- c(992, 275, 331, 454, 479, 553, 496, 230, 605, 432, 170, 670, 238, 238, 634, 422, 429, 513, 327, 465, 479, 661, 382, 364, 109, 398, 209, 1960, 992, 275, 331, 454, 479, 553, 496)
vl_2021 <- c(80, 1690, 5113, 71, 289, 3063, 0, 262, 0, 15089, 13016, 1513, 60, 60, 49248, 159308, 56, 0, 516675, 49, 237, 84, 292, 414, 26176, 62, 126, 93, 80, 1690, 5113, 71, 289, 3063, 0)
cd_2022 <- c(700, 127, 127, 547, 547, 547, 777, 149, 628, 614, 253, 918, 326, 326, 574, 361, 253, 726, 659, 596, 427, 447, 326, 253, 248, 326, 260, 918, 700, 127, 127, 547, 547, 547, 777)
vl_2022 <- c(0, 0, 53250, 0, 40, 1901, 0, 955, 0, 0, 0, 0, 40, 0, 49248, 159308, 56, 0, 516675, 49, 237, 0, 23601, 0, 40, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
x <- cbind(cd_2019, vl_2019, cd_2021, vl_2021, cd_2022, vl_2022) |>
  as.data.frame()
set.seed(123)
hi_data <- rsample::initial_split(x)
set.seed(123)
hiv_data <- hi_data |>
  rsample::training()
# MARS hyperparameters
nt <- 3
pd <- 1
pru <- "none"
vip_featured <- c("cd_2022")
vip_features <- c("cd_2019", "vl_2019", "cd_2021", "vl_2021", "vl_2022")
set.seed(123)
vi_train <- rsample::initial_split(x)
set.seed(123)
vip_train <- vi_train |>
  rsample::training() |>
  dplyr::select(rsample::all_of(vip_features))
# Explain the first training observation
vip_new <- vip_train[1, ]
viralx_mars(vip_featured, hiv_data, nt, pd, pru, vip_train, vip_new)
## End(Not run)
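As background for the MARS functions, a MARS model is built from hinge basis functions of the form max(0, x - c) and max(0, c - x), which let a linear fit bend at a knot c. The base R sketch below fits such a basis with lm() on synthetic data. It assumes the knot location is already known, whereas a real MARS fit searches for knots and prunes terms, and the helper hinge() is a hypothetical name rather than part of viralx.

```r
set.seed(123)

# Hypothetical helper: hinge function, direction = 1 gives max(0, x - knot),
# direction = -1 gives max(0, knot - x)
hinge <- function(x, knot, direction = 1) pmax(0, direction * (x - knot))

# Synthetic response: flat at 2 up to x = 4, then rising with slope 1.5
x <- seq(0, 10, length.out = 100)
y <- ifelse(x < 4, 2, 2 + 1.5 * (x - 4)) + rnorm(100, 0, 0.1)

# Build the two mirrored hinge terms at the (assumed known) knot
basis <- data.frame(
  y = y,
  h_right = hinge(x, knot = 4),
  h_left = hinge(x, knot = 4, direction = -1)
)

# An ordinary linear model on the hinge terms yields a piecewise-linear fit
fit <- lm(y ~ h_right + h_left, data = basis)
print(coef(fit))
```

The coefficient on h_right recovers the slope after the knot, and the coefficient on h_left stays near zero because the response is flat before it.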
Explains the predictions of a MARS (Multivariate Adaptive Regression Splines) model using SHAP (Shapley Additive Explanations) values. It utilizes the DALEXtra and DALEX packages to provide SHAP-based explanations for the specified model.
viralx_mars_shap(vip_featured, hiv_data, nt, pd, pru, vip_train, vip_new, orderings)
vip_featured | A character value |
hiv_data | A data frame |
nt | A numeric value |
pd | A numeric value |
pru | A character value |
vip_train | A data frame |
vip_new | A numeric vector |
orderings | A numeric value |
A data frame
## Not run:
library(dplyr)
library(rsample)
cd_2019 <- c(824, 169, 342, 423, 441, 507, 559, 173, 764, 780, 244, 527, 417, 800, 602, 494, 345, 780, 780, 527, 556, 559, 238, 288, 244, 353, 169, 556, 824, 169, 342, 423, 441, 507, 559)
vl_2019 <- c(40, 11388, 38961, 40, 75, 4095, 103, 11388, 46, 103, 11388, 40, 0, 11388, 0, 4095, 40, 93, 49, 49, 49, 4095, 6837, 38961, 38961, 0, 0, 93, 40, 11388, 38961, 40, 75, 4095, 103)
cd_2021 <- c(992, 275, 331, 454, 479, 553, 496, 230, 605, 432, 170, 670, 238, 238, 634, 422, 429, 513, 327, 465, 479, 661, 382, 364, 109, 398, 209, 1960, 992, 275, 331, 454, 479, 553, 496)
vl_2021 <- c(80, 1690, 5113, 71, 289, 3063, 0, 262, 0, 15089, 13016, 1513, 60, 60, 49248, 159308, 56, 0, 516675, 49, 237, 84, 292, 414, 26176, 62, 126, 93, 80, 1690, 5113, 71, 289, 3063, 0)
cd_2022 <- c(700, 127, 127, 547, 547, 547, 777, 149, 628, 614, 253, 918, 326, 326, 574, 361, 253, 726, 659, 596, 427, 447, 326, 253, 248, 326, 260, 918, 700, 127, 127, 547, 547, 547, 777)
vl_2022 <- c(0, 0, 53250, 0, 40, 1901, 0, 955, 0, 0, 0, 0, 40, 0, 49248, 159308, 56, 0, 516675, 49, 237, 0, 23601, 0, 40, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
x <- cbind(cd_2019, vl_2019, cd_2021, vl_2021, cd_2022, vl_2022) |>
  as.data.frame()
set.seed(123)
hi_data <- rsample::initial_split(x)
set.seed(123)
hiv_data <- hi_data |>
  rsample::training()
# MARS hyperparameters
nt <- 3
pd <- 1
pru <- "none"
vip_featured <- c("cd_2022")
vip_features <- c("cd_2019", "vl_2019", "cd_2021", "vl_2021", "vl_2022")
set.seed(123)
vi_train <- rsample::initial_split(x)
set.seed(123)
vip_train <- vi_train |>
  rsample::training() |>
  dplyr::select(rsample::all_of(vip_features))
# Explain the first training observation using 20 random orderings
vip_new <- vip_train[1, ]
orderings <- 20
viralx_mars_shap(vip_featured, hiv_data, nt, pd, pru, vip_train, vip_new, orderings)
## End(Not run)
Visualizes SHAP (Shapley Additive Explanations) values for a MARS (Multivariate Adaptive Regression Splines) model by employing the DALEXtra and DALEX packages to provide visual insights into the impact of a specified variable on the model's predictions.
viralx_mars_vis(vip_featured, hiv_data, nt, pd, pru, vip_train, vip_new, orderings)
vip_featured | A character value |
hiv_data | A data frame |
nt | A numeric value |
pd | A numeric value |
pru | A character value |
vip_train | A data frame |
vip_new | A numeric vector |
orderings | A numeric value |
A ggplot object
## Not run:
library(dplyr)
library(rsample)
cd_2019 <- c(824, 169, 342, 423, 441, 507, 559, 173, 764, 780, 244, 527, 417, 800, 602, 494, 345, 780, 780, 527, 556, 559, 238, 288, 244, 353, 169, 556, 824, 169, 342, 423, 441, 507, 559)
vl_2019 <- c(40, 11388, 38961, 40, 75, 4095, 103, 11388, 46, 103, 11388, 40, 0, 11388, 0, 4095, 40, 93, 49, 49, 49, 4095, 6837, 38961, 38961, 0, 0, 93, 40, 11388, 38961, 40, 75, 4095, 103)
cd_2021 <- c(992, 275, 331, 454, 479, 553, 496, 230, 605, 432, 170, 670, 238, 238, 634, 422, 429, 513, 327, 465, 479, 661, 382, 364, 109, 398, 209, 1960, 992, 275, 331, 454, 479, 553, 496)
vl_2021 <- c(80, 1690, 5113, 71, 289, 3063, 0, 262, 0, 15089, 13016, 1513, 60, 60, 49248, 159308, 56, 0, 516675, 49, 237, 84, 292, 414, 26176, 62, 126, 93, 80, 1690, 5113, 71, 289, 3063, 0)
cd_2022 <- c(700, 127, 127, 547, 547, 547, 777, 149, 628, 614, 253, 918, 326, 326, 574, 361, 253, 726, 659, 596, 427, 447, 326, 253, 248, 326, 260, 918, 700, 127, 127, 547, 547, 547, 777)
vl_2022 <- c(0, 0, 53250, 0, 40, 1901, 0, 955, 0, 0, 0, 0, 40, 0, 49248, 159308, 56, 0, 516675, 49, 237, 0, 23601, 0, 40, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
x <- cbind(cd_2019, vl_2019, cd_2021, vl_2021, cd_2022, vl_2022) |>
  as.data.frame()
set.seed(123)
hi_data <- rsample::initial_split(x)
set.seed(123)
hiv_data <- hi_data |>
  rsample::training()
# MARS hyperparameters
nt <- 3
pd <- 1
pru <- "none"
vip_featured <- c("cd_2022")
vip_features <- c("cd_2019", "vl_2019", "cd_2021", "vl_2021", "vl_2022")
set.seed(123)
vi_train <- rsample::initial_split(x)
set.seed(123)
vip_train <- vi_train |>
  rsample::training() |>
  dplyr::select(rsample::all_of(vip_features))
# Visualize SHAP values for the first training observation
vip_new <- vip_train[1, ]
orderings <- 20
viralx_mars_vis(vip_featured, hiv_data, nt, pd, pru, vip_train, vip_new, orderings)
## End(Not run)
Explains the predictions of a neural network regression model for viral load or CD4 counts using the DALEX and DALEXtra tools.
viralx_nn(vip_featured, hiv_data, hu, plty, epo, vip_train, vip_new)
vip_featured | A character value |
hiv_data | A data frame |
hu | A numeric value |
plty | A numeric value |
epo | A numeric value |
vip_train | A data frame |
vip_new | A numeric vector |
A data frame
## Not run:
library(dplyr)
library(rsample)
cd_2019 <- c(824, 169, 342, 423, 441, 507, 559, 173, 764, 780, 244, 527, 417, 800, 602, 494, 345, 780, 780, 527, 556, 559, 238, 288, 244, 353, 169, 556, 824, 169, 342, 423, 441, 507, 559)
vl_2019 <- c(40, 11388, 38961, 40, 75, 4095, 103, 11388, 46, 103, 11388, 40, 0, 11388, 0, 4095, 40, 93, 49, 49, 49, 4095, 6837, 38961, 38961, 0, 0, 93, 40, 11388, 38961, 40, 75, 4095, 103)
cd_2021 <- c(992, 275, 331, 454, 479, 553, 496, 230, 605, 432, 170, 670, 238, 238, 634, 422, 429, 513, 327, 465, 479, 661, 382, 364, 109, 398, 209, 1960, 992, 275, 331, 454, 479, 553, 496)
vl_2021 <- c(80, 1690, 5113, 71, 289, 3063, 0, 262, 0, 15089, 13016, 1513, 60, 60, 49248, 159308, 56, 0, 516675, 49, 237, 84, 292, 414, 26176, 62, 126, 93, 80, 1690, 5113, 71, 289, 3063, 0)
cd_2022 <- c(700, 127, 127, 547, 547, 547, 777, 149, 628, 614, 253, 918, 326, 326, 574, 361, 253, 726, 659, 596, 427, 447, 326, 253, 248, 326, 260, 918, 700, 127, 127, 547, 547, 547, 777)
vl_2022 <- c(0, 0, 53250, 0, 40, 1901, 0, 955, 0, 0, 0, 0, 40, 0, 49248, 159308, 56, 0, 516675, 49, 237, 0, 23601, 0, 40, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
x <- cbind(cd_2019, vl_2019, cd_2021, vl_2021, cd_2022, vl_2022) |>
  as.data.frame()
set.seed(123)
hi_data <- rsample::initial_split(x)
set.seed(123)
hiv_data <- hi_data |>
  rsample::training()
# Neural network hyperparameters
hu <- 5
plty <- 1.131656e-09
epo <- 176
vip_featured <- c("cd_2022")
vip_features <- c("cd_2019", "vl_2019", "cd_2021", "vl_2021", "vl_2022")
set.seed(123)
vi_train <- rsample::initial_split(x)
set.seed(123)
vip_train <- vi_train |>
  rsample::training() |>
  dplyr::select(rsample::all_of(vip_features))
# Explain the first training observation
vip_new <- vip_train[1, ]
viralx_nn(vip_featured, hiv_data, hu, plty, epo, vip_train, vip_new)
## End(Not run)
The viralx_nn_glob function is designed to provide global explanations for the specified neural network model.
viralx_nn_glob(vip_featured, hiv_data, hu, plty, epo, vip_train, v_train)
vip_featured |
A character value specifying the variable of interest for which you want to explain predictions. |
hiv_data |
A data frame containing the dataset used for training the neural network model. |
hu |
A numeric value representing the number of hidden units in the neural network. |
plty |
A numeric value representing the penalty term for the neural network model. |
epo |
A numeric value specifying the number of epochs for training the neural network. |
vip_train |
A data frame containing the training data used for generating global explanations. |
v_train |
A numeric vector representing the target variable for the global explanations. |
A list containing global explanations for the specified neural network model.
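As a rough illustration of what a global explanation measures, the sketch below computes permutation feature importance in base R. It is not the package's implementation: a linear model stands in for the neural network, the data are simulated, and the names `toy`, `cd_prev`, `vl_prev`, `noise`, `rmse`, and `permutation_importance` are hypothetical. The assumption is that the global explanation is permutation-based, which is the default behaviour of `DALEX::model_parts`.

```r
# Permutation importance sketch: how much does the loss grow when one
# predictor is shuffled, breaking its link with the response?
set.seed(123)
n <- 200
toy <- data.frame(
  cd_prev = rnorm(n, mean = 500, sd = 150),   # hypothetical earlier CD4 count
  vl_prev = rnorm(n, mean = 1000, sd = 300),  # hypothetical earlier viral load
  noise   = rnorm(n)                          # predictor unrelated to the response
)
toy$cd_next <- 0.8 * toy$cd_prev - 0.05 * toy$vl_prev + rnorm(n, sd = 20)

# Stand-in model (the package fits a neural network instead)
fit <- lm(cd_next ~ cd_prev + vl_prev + noise, data = toy)

# Root mean squared error of the model on a data set
rmse <- function(model, data) {
  sqrt(mean((data$cd_next - predict(model, newdata = data))^2))
}

# Mean increase in RMSE over n_rep shuffles of each feature
permutation_importance <- function(model, data, features, n_rep = 10) {
  baseline <- rmse(model, data)
  sapply(features, function(f) {
    losses <- replicate(n_rep, {
      shuffled <- data
      shuffled[[f]] <- sample(shuffled[[f]])
      rmse(model, shuffled)
    })
    mean(losses) - baseline
  })
}

importance <- permutation_importance(fit, toy, c("cd_prev", "vl_prev", "noise"))
print(sort(importance, decreasing = TRUE))
```

A larger increase in loss means the model relies more heavily on that predictor; a value near zero means shuffling it barely changes the predictions.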
## Not run:
library(dplyr)
library(rsample)
# Yearly CD4 counts (cd_*) and viral loads (vl_*)
cd_2019 <- c(824, 169, 342, 423, 441, 507, 559, 173, 764, 780, 244, 527, 417, 800, 602, 494, 345, 780, 780, 527, 556, 559, 238, 288, 244, 353, 169, 556, 824, 169, 342, 423, 441, 507, 559)
vl_2019 <- c(40, 11388, 38961, 40, 75, 4095, 103, 11388, 46, 103, 11388, 40, 0, 11388, 0, 4095, 40, 93, 49, 49, 49, 4095, 6837, 38961, 38961, 0, 0, 93, 40, 11388, 38961, 40, 75, 4095, 103)
cd_2021 <- c(992, 275, 331, 454, 479, 553, 496, 230, 605, 432, 170, 670, 238, 238, 634, 422, 429, 513, 327, 465, 479, 661, 382, 364, 109, 398, 209, 1960, 992, 275, 331, 454, 479, 553, 496)
vl_2021 <- c(80, 1690, 5113, 71, 289, 3063, 0, 262, 0, 15089, 13016, 1513, 60, 60, 49248, 159308, 56, 0, 516675, 49, 237, 84, 292, 414, 26176, 62, 126, 93, 80, 1690, 5113, 71, 289, 3063, 0)
cd_2022 <- c(700, 127, 127, 547, 547, 547, 777, 149, 628, 614, 253, 918, 326, 326, 574, 361, 253, 726, 659, 596, 427, 447, 326, 253, 248, 326, 260, 918, 700, 127, 127, 547, 547, 547, 777)
vl_2022 <- c(0, 0, 53250, 0, 40, 1901, 0, 955, 0, 0, 0, 0, 40, 0, 49248, 159308, 56, 0, 516675, 49, 237, 0, 23601, 0, 40, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
x <- cbind(cd_2019, vl_2019, cd_2021, vl_2021, cd_2022, vl_2022) |>
  as.data.frame()
# Training split used to fit the model
set.seed(123)
hi_data <- rsample::initial_split(x)
set.seed(123)
hiv_data <- hi_data |>
  rsample::training()
# Hyperparameters: hidden units, penalty, epochs
hu <- 5
plty <- 1.131656e-09
epo <- 176
vip_featured <- c("cd_2022")
vip_features <- c("cd_2019", "vl_2019", "cd_2021", "vl_2021", "vl_2022")
# Predictor columns and response column of the training split
set.seed(123)
vi_train <- rsample::initial_split(x)
set.seed(123)
vip_train <- vi_train |>
  rsample::training() |>
  dplyr::select(rsample::all_of(vip_features))
v_train <- vi_train |>
  rsample::training() |>
  dplyr::select(rsample::all_of(vip_featured))
viralx_nn_glob(vip_featured, hiv_data, hu, plty, epo, vip_train, v_train)
## End(Not run)
Explains the predictions of a neural network model using SHAP (Shapley Additive Explanations) values. It utilizes the DALEXtra and DALEX packages to provide SHAP-based explanations for the specified model.
viralx_nn_shap(vip_featured, hiv_data, hu, plty, epo, vip_train, vip_new, orderings)
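vip_featured |
A character value specifying the variable of interest for which you want to explain predictions. |
hiv_data |
A data frame containing the dataset used for training the neural network model. |
hu |
A numeric value representing the number of hidden units in the neural network. |
plty |
A numeric value representing the penalty term for the neural network model. |
epo |
A numeric value specifying the number of epochs for training the neural network. |
vip_train |
A data frame containing the training data used for generating explanations. |
vip_new |
The new observation whose prediction is explained. Documented as a numeric vector; the example passes a one-row data frame (vip_train[1,]). |
orderings |
A numeric value giving the number of variable orderings used when estimating SHAP values (20 in the example). |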
A data frame
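To make the role of `orderings` concrete, the following base-R sketch estimates SHAP values by averaging sequential contributions over random variable orderings, which is the approach DALEX takes for `type = "shap"`. It is a sketch under stated assumptions, not the package's code: a linear model with an interaction stands in for the neural network, the data are simulated, and `toy`, `mean_pred`, and `shap_by_orderings` are hypothetical names.

```r
# SHAP sketch: average each variable's contribution over random orderings.
set.seed(123)
n <- 100
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
toy$y <- 2 * toy$x1 - toy$x2 + 0.5 * toy$x1 * toy$x3 + rnorm(n, sd = 0.1)

# Stand-in model with an interaction, so contributions depend on the ordering
fit <- lm(y ~ x1 * x3 + x2, data = toy)
features <- c("x1", "x2", "x3")
new_obs <- toy[1, features]  # observation to explain

# Mean prediction when the variables in `known` are fixed at new_obs's values
mean_pred <- function(known) {
  d <- toy[, features]
  for (f in known) d[[f]] <- new_obs[[f]]
  mean(predict(fit, newdata = d))
}

shap_by_orderings <- function(orderings) {
  contrib <- matrix(0, nrow = orderings, ncol = length(features),
                    dimnames = list(NULL, features))
  for (b in seq_len(orderings)) {
    known <- character(0)
    prev <- mean_pred(known)
    for (f in sample(features)) {   # one random ordering of the variables
      known <- c(known, f)
      cur <- mean_pred(known)
      contrib[b, f] <- cur - prev   # change caused by fixing f
      prev <- cur
    }
  }
  colMeans(contrib)                 # average over all orderings
}

shap <- shap_by_orderings(orderings = 20)
print(shap)

# The contributions sum to the prediction minus the average prediction
gap <- as.numeric(predict(fit, newdata = new_obs)) - mean_pred(character(0))
print(all.equal(sum(shap), gap))
```

More orderings give a more stable estimate at a proportionally higher cost, since every ordering requires one pass over the data per variable.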
## Not run:
library(dplyr)
library(rsample)
# Yearly CD4 counts (cd_*) and viral loads (vl_*)
cd_2019 <- c(824, 169, 342, 423, 441, 507, 559, 173, 764, 780, 244, 527, 417, 800, 602, 494, 345, 780, 780, 527, 556, 559, 238, 288, 244, 353, 169, 556, 824, 169, 342, 423, 441, 507, 559)
vl_2019 <- c(40, 11388, 38961, 40, 75, 4095, 103, 11388, 46, 103, 11388, 40, 0, 11388, 0, 4095, 40, 93, 49, 49, 49, 4095, 6837, 38961, 38961, 0, 0, 93, 40, 11388, 38961, 40, 75, 4095, 103)
cd_2021 <- c(992, 275, 331, 454, 479, 553, 496, 230, 605, 432, 170, 670, 238, 238, 634, 422, 429, 513, 327, 465, 479, 661, 382, 364, 109, 398, 209, 1960, 992, 275, 331, 454, 479, 553, 496)
vl_2021 <- c(80, 1690, 5113, 71, 289, 3063, 0, 262, 0, 15089, 13016, 1513, 60, 60, 49248, 159308, 56, 0, 516675, 49, 237, 84, 292, 414, 26176, 62, 126, 93, 80, 1690, 5113, 71, 289, 3063, 0)
cd_2022 <- c(700, 127, 127, 547, 547, 547, 777, 149, 628, 614, 253, 918, 326, 326, 574, 361, 253, 726, 659, 596, 427, 447, 326, 253, 248, 326, 260, 918, 700, 127, 127, 547, 547, 547, 777)
vl_2022 <- c(0, 0, 53250, 0, 40, 1901, 0, 955, 0, 0, 0, 0, 40, 0, 49248, 159308, 56, 0, 516675, 49, 237, 0, 23601, 0, 40, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
x <- cbind(cd_2019, vl_2019, cd_2021, vl_2021, cd_2022, vl_2022) |>
  as.data.frame()
# Training split used to fit the model
set.seed(123)
hi_data <- rsample::initial_split(x)
set.seed(123)
hiv_data <- hi_data |>
  rsample::training()
# Hyperparameters: hidden units, penalty, epochs
hu <- 5
plty <- 1.131656e-09
epo <- 176
vip_featured <- c("cd_2022")
vip_features <- c("cd_2019", "vl_2019", "cd_2021", "vl_2021", "vl_2022")
# Predictor columns of the training split
set.seed(123)
vi_train <- rsample::initial_split(x)
set.seed(123)
vip_train <- vi_train |>
  rsample::training() |>
  dplyr::select(rsample::all_of(vip_features))
# Observation to explain and number of orderings
vip_new <- vip_train[1,]
orderings <- 20
viralx_nn_shap(vip_featured, hiv_data, hu, plty, epo, vip_train, vip_new, orderings)
## End(Not run)
Visualizes SHAP (Shapley Additive Explanations) values for a neural network model by employing the DALEXtra and DALEX packages to provide visual insights into the impact of a specified variable on the model's predictions.
viralx_nn_vis(vip_featured, hiv_data, hu, plty, epo, vip_train, vip_new, orderings)
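vip_featured |
A character value specifying the variable of interest for which you want to explain predictions. |
hiv_data |
A data frame containing the dataset used for training the neural network model. |
hu |
A numeric value representing the number of hidden units in the neural network. |
plty |
A numeric value representing the penalty term for the neural network model. |
epo |
A numeric value specifying the number of epochs for training the neural network. |
vip_train |
A data frame containing the training data used for generating explanations. |
vip_new |
The new observation whose prediction is explained. Documented as a numeric vector; the example passes a one-row data frame (vip_train[1,]). |
orderings |
A numeric value giving the number of variable orderings used when estimating SHAP values (20 in the example). |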
A ggplot object visualizing the SHAP values for the explained observation.
## Not run:
library(dplyr)
library(rsample)
# Yearly CD4 counts (cd_*) and viral loads (vl_*)
cd_2019 <- c(824, 169, 342, 423, 441, 507, 559, 173, 764, 780, 244, 527, 417, 800, 602, 494, 345, 780, 780, 527, 556, 559, 238, 288, 244, 353, 169, 556, 824, 169, 342, 423, 441, 507, 559)
vl_2019 <- c(40, 11388, 38961, 40, 75, 4095, 103, 11388, 46, 103, 11388, 40, 0, 11388, 0, 4095, 40, 93, 49, 49, 49, 4095, 6837, 38961, 38961, 0, 0, 93, 40, 11388, 38961, 40, 75, 4095, 103)
cd_2021 <- c(992, 275, 331, 454, 479, 553, 496, 230, 605, 432, 170, 670, 238, 238, 634, 422, 429, 513, 327, 465, 479, 661, 382, 364, 109, 398, 209, 1960, 992, 275, 331, 454, 479, 553, 496)
vl_2021 <- c(80, 1690, 5113, 71, 289, 3063, 0, 262, 0, 15089, 13016, 1513, 60, 60, 49248, 159308, 56, 0, 516675, 49, 237, 84, 292, 414, 26176, 62, 126, 93, 80, 1690, 5113, 71, 289, 3063, 0)
cd_2022 <- c(700, 127, 127, 547, 547, 547, 777, 149, 628, 614, 253, 918, 326, 326, 574, 361, 253, 726, 659, 596, 427, 447, 326, 253, 248, 326, 260, 918, 700, 127, 127, 547, 547, 547, 777)
vl_2022 <- c(0, 0, 53250, 0, 40, 1901, 0, 955, 0, 0, 0, 0, 40, 0, 49248, 159308, 56, 0, 516675, 49, 237, 0, 23601, 0, 40, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
x <- cbind(cd_2019, vl_2019, cd_2021, vl_2021, cd_2022, vl_2022) |>
  as.data.frame()
# Training split used to fit the model
set.seed(123)
hi_data <- rsample::initial_split(x)
set.seed(123)
hiv_data <- hi_data |>
  rsample::training()
# Hyperparameters: hidden units, penalty, epochs
hu <- 5
plty <- 1.131656e-09
epo <- 176
vip_featured <- c("cd_2022")
vip_features <- c("cd_2019", "vl_2019", "cd_2021", "vl_2021", "vl_2022")
# Predictor columns of the training split
set.seed(123)
vi_train <- rsample::initial_split(x)
set.seed(123)
vip_train <- vi_train |>
  rsample::training() |>
  dplyr::select(rsample::all_of(vip_features))
# Observation to explain and number of orderings
vip_new <- vip_train[1,]
orderings <- 20
viralx_nn_vis(vip_featured, hiv_data, hu, plty, epo, vip_train, vip_new, orderings)
## End(Not run)