├── .ignore ├── man-roxygen ├── param_task.R ├── param_ydt.R ├── param_fselector.R ├── param_id.R ├── param_extra.R ├── param_param_set.R ├── param_callbacks.R ├── param_store_models.R ├── param_learner.R ├── param_rush.R ├── param_terminator.R ├── param_measure.R ├── field_id.R ├── param_store_benchmark_result.R ├── param_term_evals.R ├── param_term_time.R ├── param_inst_async.R ├── param_check_values.R ├── param_label.R ├── param_measures.R ├── param_search_space.R ├── param_packages.R ├── section_dictionary_fselectors.R ├── param_properties.R ├── param_man.R ├── param_codomain.R ├── param_xdt.R ├── param_store_fselect_instance.R ├── param_resampling.R ├── param_ties_method.R └── example.R ├── man ├── figures │ └── logo.png ├── mlr3fselect.async_freeze_archive.Rd ├── mlr3fselect.internal_tuning.Rd ├── reexports.Rd ├── mlr3fselect.backup.Rd ├── assert_async_fselect_callback.Rd ├── mlr3fselect_assertions.Rd ├── faggregate.Rd ├── mlr3fselect.one_se_rule.Rd ├── mlr3fselect.svm_rfe.Rd ├── mlr3fselect-package.Rd ├── fs.Rd ├── mlr_fselectors.Rd ├── extract_inner_fselect_results.Rd ├── extract_inner_fselect_archives.Rd ├── ObjectiveFSelectAsync.Rd ├── ContextAsyncFSelect.Rd ├── ContextBatchFSelect.Rd ├── fselect_nested.Rd ├── mlr_fselectors_async_design_points.Rd ├── embedded_ensemble_fselect.Rd ├── mlr_fselectors_async_random_search.Rd ├── FSelectorBatchFromOptimizerBatch.Rd ├── FSelectorAsync.Rd ├── CallbackBatchFSelect.Rd ├── CallbackAsyncFSelect.Rd ├── mlr_fselectors_async_exhaustive_search.Rd └── callback_batch_fselect.Rd ├── tests ├── testthat │ ├── teardown.R │ ├── test_FSelectorGeneticSearch.R │ ├── helper.R │ ├── test_FSelectorRandomSearch.R │ ├── setup.R │ ├── test_mlr_fselectors.R │ ├── test_fselect_nested.R │ ├── test_FSelectorBatchDesignPoints.R │ ├── test_FSelectorAsyncRandomSearch.R │ ├── test_FSelectorAsyncExhaustiveSearch.R │ ├── test_FSelectorAsyncDesignPoints.R │ ├── test_FSelectorExhaustiveSearch.R │ ├── test_ArchiveAsyncFSelectFrozen.R │ ├── 
test_FSelectorSequential.R │ ├── test_fsi_async.R │ ├── test_FSelectInstanceMultiCrit.R │ ├── test_auto_fselector.R │ ├── test_fsi.R │ ├── test_fselect.R │ ├── test_FSelectorShadowVariableSearch.R │ └── test_embedded_ensemble_fselect.R └── testthat.R ├── R ├── reexports.R ├── helper.R ├── FSelectorAsyncDesignPoints.R ├── FSelectorAsync.R ├── zzz.R ├── mlr_fselectors.R ├── auto_fselector.R ├── assertions.R ├── FSelectorAsyncRandomSearch.R ├── FSelectorBatchFromOptimizerBatch.R ├── FSelectorBatchGeneticSearch.R ├── FSelectorAsyncFromOptimizerAsync.R ├── FSelectorBatchExhaustiveSearch.R ├── FSelectorBatchDesignPoints.R ├── fselect_nested.R ├── ContextAsyncFSelect.R ├── ContextBatchFSelect.R ├── FSelectorAsyncExhaustiveSearch.R ├── ObjectiveFSelect.R ├── faggregate.R ├── FSelectorBatchRandomSearch.R ├── ObjectiveFSelectAsync.R ├── extract_inner_fselect_results.R ├── extract_inner_fselect_archives.R ├── FSelectorBatch.R ├── FSelectInstanceAsyncSingleCrit.R ├── FSelectInstanceAsyncMultiCrit.R ├── embedded_ensemble_fselect.R └── FSelectorBatchSequential.R ├── .editorconfig ├── .Rbuildignore ├── mlr3fselect.Rproj ├── .lintr ├── inst ├── WORDLIST └── testthat │ ├── helper_expectations.R │ ├── helper_fselector.R │ └── helper_misc.R ├── pkgdown └── _pkgdown.yml ├── .github └── workflows │ ├── pkgdown.yml │ ├── r-cmd-check.yml │ ├── no-suggest-cmd-check.yml │ └── dev-cmd-check.yml ├── attic └── test_FSelectorEvolutionary.R ├── NAMESPACE ├── DESCRIPTION └── .gitignore /.ignore: -------------------------------------------------------------------------------- 1 | man/ 2 | docs/ 3 | inst/doc/ 4 | attic/ 5 | vignettes/*.html 6 | -------------------------------------------------------------------------------- /man-roxygen/param_task.R: -------------------------------------------------------------------------------- 1 | #' @param task ([mlr3::Task])\cr 2 | #' Task to operate on. 
3 | -------------------------------------------------------------------------------- /man-roxygen/param_ydt.R: -------------------------------------------------------------------------------- 1 | #' @param ydt ([data.table::data.table()])\cr 2 | #' Optimal outcome. 3 | -------------------------------------------------------------------------------- /man/figures/logo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/mlr-org/mlr3fselect/HEAD/man/figures/logo.png -------------------------------------------------------------------------------- /man-roxygen/param_fselector.R: -------------------------------------------------------------------------------- 1 | #' @param fselector ([FSelector])\cr 2 | #' Optimization algorithm. 3 | -------------------------------------------------------------------------------- /man-roxygen/param_id.R: -------------------------------------------------------------------------------- 1 | #' @param id (`character(1)`)\cr 2 | #' Identifier for the new instance. 3 | -------------------------------------------------------------------------------- /man-roxygen/param_extra.R: -------------------------------------------------------------------------------- 1 | #' @param extra (`data.table::data.table()`)\cr 2 | #' Additional information. 3 | -------------------------------------------------------------------------------- /man-roxygen/param_param_set.R: -------------------------------------------------------------------------------- 1 | #' @param param_set ([paradox::ParamSet])\cr 2 | #' Set of control parameters. 3 | -------------------------------------------------------------------------------- /man-roxygen/param_callbacks.R: -------------------------------------------------------------------------------- 1 | #' @param callbacks (list of [CallbackBatchFSelect])\cr 2 | #' List of callbacks. 
3 | -------------------------------------------------------------------------------- /man-roxygen/param_store_models.R: -------------------------------------------------------------------------------- 1 | #' @param store_models (`logical(1)`)\cr 2 | #' Store models in benchmark result? 3 | -------------------------------------------------------------------------------- /man-roxygen/param_learner.R: -------------------------------------------------------------------------------- 1 | #' @param learner ([mlr3::Learner])\cr 2 | #' Learner to optimize the feature subset for. 3 | -------------------------------------------------------------------------------- /man-roxygen/param_rush.R: -------------------------------------------------------------------------------- 1 | #' @param rush (`Rush`)\cr 2 | #' If a rush instance is supplied, the optimization runs without batches. 3 | -------------------------------------------------------------------------------- /man-roxygen/param_terminator.R: -------------------------------------------------------------------------------- 1 | #' @param terminator ([bbotk::Terminator])\cr 2 | #' Stop criterion of the feature selection. 3 | -------------------------------------------------------------------------------- /man-roxygen/param_measure.R: -------------------------------------------------------------------------------- 1 | #' @param measure ([mlr3::Measure])\cr 2 | #' Measure to optimize. If `NULL`, default measure is used. 3 | -------------------------------------------------------------------------------- /man-roxygen/field_id.R: -------------------------------------------------------------------------------- 1 | #' @field id (`character(1)`)\cr 2 | #' Identifier of the object. 3 | #' Used in tables, plot and text output. 
4 | -------------------------------------------------------------------------------- /man-roxygen/param_store_benchmark_result.R: -------------------------------------------------------------------------------- 1 | #' @param store_benchmark_result (`logical(1)`)\cr 2 | #' Store benchmark result in archive? 3 | -------------------------------------------------------------------------------- /tests/testthat/teardown.R: -------------------------------------------------------------------------------- 1 | options(old_opts) 2 | lg_mlr3$set_threshold(old_threshold_mlr3) 3 | lg_rush$set_threshold(old_threshold_rush) 4 | 5 | -------------------------------------------------------------------------------- /man-roxygen/param_term_evals.R: -------------------------------------------------------------------------------- 1 | #' @param term_evals (`integer(1)`)\cr 2 | #' Number of allowed evaluations. 3 | #' Ignored if `terminator` is passed. 4 | -------------------------------------------------------------------------------- /man-roxygen/param_term_time.R: -------------------------------------------------------------------------------- 1 | #' @param term_time (`integer(1)`)\cr 2 | #' Maximum allowed time in seconds. 3 | #' Ignored if `terminator` is passed. 4 | -------------------------------------------------------------------------------- /man-roxygen/param_inst_async.R: -------------------------------------------------------------------------------- 1 | #' @param inst ([FSelectInstanceAsyncSingleCrit] | [FSelectInstanceAsyncMultiCrit] )\cr 2 | #' The feature selection instance. 3 | -------------------------------------------------------------------------------- /man-roxygen/param_check_values.R: -------------------------------------------------------------------------------- 1 | #' @param check_values (`logical(1)`)\cr 2 | #' Check the parameters before the evaluation and the results for 3 | #' validity? 
4 | -------------------------------------------------------------------------------- /man-roxygen/param_label.R: -------------------------------------------------------------------------------- 1 | #' @param label (`character(1)`)\cr 2 | #' Label for this object. 3 | #' Can be used in tables, plot and text output instead of the ID. 4 | -------------------------------------------------------------------------------- /man-roxygen/param_measures.R: -------------------------------------------------------------------------------- 1 | #' @param measures (list of [mlr3::Measure])\cr 2 | #' Measures to optimize. 3 | #' If `NULL`, \CRANpkg{mlr3}'s default measure is used. 4 | -------------------------------------------------------------------------------- /man-roxygen/param_search_space.R: -------------------------------------------------------------------------------- 1 | #' @param search_space ([paradox::ParamSet])\cr 2 | #' Search space. 3 | #' Internally created from provided [mlr3::Task] by instance. 4 | 5 | -------------------------------------------------------------------------------- /tests/testthat.R: -------------------------------------------------------------------------------- 1 | if (requireNamespace("testthat", quietly = TRUE)) { 2 | library(testthat) 3 | library(checkmate) 4 | library(mlr3fselect) 5 | test_check("mlr3fselect") 6 | } 7 | -------------------------------------------------------------------------------- /tests/testthat/test_FSelectorGeneticSearch.R: -------------------------------------------------------------------------------- 1 | skip_if_not_installed("genalg") 2 | 3 | test_that("default parameters work", { 4 | test_fselector("genetic_search", term_evals = 10) 5 | }) 6 | -------------------------------------------------------------------------------- /man-roxygen/param_packages.R: -------------------------------------------------------------------------------- 1 | #' @param packages (`character()`)\cr 2 | #' Set of required packages. 
3 | #' Note that these packages will be loaded via [requireNamespace()], and are not attached. 4 | -------------------------------------------------------------------------------- /man-roxygen/section_dictionary_fselectors.R: -------------------------------------------------------------------------------- 1 | #' @section Dictionary: 2 | #' This [FSelector] can be instantiated with the associated sugar function [fs()]: 3 | #' ``` 4 | #' fs("<%= id %>") 5 | #' ``` 6 | -------------------------------------------------------------------------------- /man-roxygen/param_properties.R: -------------------------------------------------------------------------------- 1 | #' @param properties (`character()`)\cr 2 | #' Set of properties of the fselector. 3 | #' Must be a subset of [`mlr_reflections$fselect_properties`][mlr3::mlr_reflections]. 4 | -------------------------------------------------------------------------------- /man-roxygen/param_man.R: -------------------------------------------------------------------------------- 1 | #' @param man (`character(1)`)\cr 2 | #' String in the format `[pkg]::[topic]` pointing to a manual page for this object. 3 | #' The referenced help package can be opened via method `$help()`. 4 | -------------------------------------------------------------------------------- /R/reexports.R: -------------------------------------------------------------------------------- 1 | #' @export 2 | bbotk::mlr_terminators 3 | 4 | #' @export 5 | bbotk::trm 6 | 7 | #' @export 8 | bbotk::trms 9 | 10 | #' @export 11 | mlr3misc::mlr_callbacks 12 | 13 | #' @export 14 | mlr3misc::clbk 15 | 16 | #' @export 17 | mlr3misc::clbks 18 | -------------------------------------------------------------------------------- /man-roxygen/param_codomain.R: -------------------------------------------------------------------------------- 1 | #' @param codomain ([paradox::ParamSet])\cr 2 | #' Specifies codomain of function. 
3 | #' Most importantly the tags of each output "Parameter" define whether it should 4 | #' be minimized or maximized. The default is to minimize each component. 5 | -------------------------------------------------------------------------------- /man-roxygen/param_xdt.R: -------------------------------------------------------------------------------- 1 | #' @param xdt (`data.table::data.table()`)\cr 2 | #' x values as `data.table`. Each row is one point. Contains the value in 3 | #' the *search space* of the [FSelectInstanceBatchMultiCrit] object. Can contain 4 | #' additional columns for extra information. 5 | -------------------------------------------------------------------------------- /man-roxygen/param_store_fselect_instance.R: -------------------------------------------------------------------------------- 1 | #' @param store_fselect_instance (`logical(1)`)\cr 2 | #' If `TRUE` (default), stores the internally created [FSelectInstanceBatchSingleCrit] with all intermediate results in slot `$fselect_instance`. 3 | #' Is set to `TRUE` if `store_models = TRUE`. 4 | -------------------------------------------------------------------------------- /man-roxygen/param_resampling.R: -------------------------------------------------------------------------------- 1 | #' @param resampling ([mlr3::Resampling])\cr 2 | #' Resampling that is used to evaluate the performance of the feature subsets. 3 | #' Uninstantiated resamplings are instantiated during construction so that all feature subsets are evaluated on the same data splits. 4 | #' Already instantiated resamplings are kept unchanged. 
5 | -------------------------------------------------------------------------------- /.editorconfig: -------------------------------------------------------------------------------- 1 | # See http://editorconfig.org 2 | root = true 3 | 4 | [*] 5 | charset = utf-8 6 | end_of_line = lf 7 | insert_final_newline = true 8 | indent_style = space 9 | trim_trailing_whitespace = true 10 | 11 | [*.{r,R,md,Rmd}] 12 | indent_size = 2 13 | 14 | [*.{c,h}] 15 | indent_size = 4 16 | 17 | [*.{cpp,hpp}] 18 | indent_size = 4 19 | 20 | [{NEWS.md,DESCRIPTION,LICENSE}] 21 | max_line_length = 80 22 | -------------------------------------------------------------------------------- /tests/testthat/helper.R: -------------------------------------------------------------------------------- 1 | # nolint start 2 | library(mlr3) 3 | library(mlr3misc) 4 | library(paradox) 5 | library(R6) 6 | library(data.table) 7 | 8 | lapply(list.files(system.file("testthat", package = "mlr3"), pattern = "^helper.*\\.[rR]", full.names = TRUE), source) 9 | lapply(list.files(system.file("testthat", package = "mlr3fselect"), pattern = "^helper.*\\.[rR]", full.names = TRUE), source) 10 | # nolint end 11 | -------------------------------------------------------------------------------- /.Rbuildignore: -------------------------------------------------------------------------------- 1 | 2 | ^README\.Rmd$ 3 | ^README\.html$ 4 | ^LICENSE$ 5 | ^\.github$ 6 | ^codecov\.yml$ 7 | ^tic\.R$ 8 | ^appveyor\.yml$ 9 | ^\.travis\.yml$ 10 | ^.*\.Rproj$ 11 | ^\.Rproj\.user$ 12 | ^\.editorconfig$ 13 | ^\.ignore$ 14 | ^docs$ 15 | ^pkgdown$ 16 | ^renv$ 17 | ^renv\.lock$ 18 | ^\.ccache$ 19 | ^clang-.* 20 | man-roxygen 21 | ^\.lintr$ 22 | ^\.vscode$ 23 | ^attic$ 24 | ^cran-comments\.md$ 25 | ^CRAN-RELEASE$ 26 | ^CRAN-SUBMISSION$ 27 | ^revdep$ 28 | -------------------------------------------------------------------------------- /tests/testthat/test_FSelectorRandomSearch.R: 
-------------------------------------------------------------------------------- 1 | test_that("default parameters work", { 2 | test_fselector("random_search", batch_size = 5, term_evals = 10) 3 | }) 4 | 5 | test_that("max_features parameter works", { 6 | z = test_fselector("random_search", max_features = 1, term_evals = 10) 7 | a = z$inst$archive$data 8 | expect_feature_number(a[, 1:4], n = 1) 9 | }) 10 | 11 | test_that("multi-crit works", { 12 | test_fselector_2D("random_search", batch_size = 5, term_evals = 10) 13 | }) 14 | -------------------------------------------------------------------------------- /man/mlr3fselect.async_freeze_archive.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mlr_callbacks.R 3 | \name{mlr3fselect.async_freeze_archive} 4 | \alias{mlr3fselect.async_freeze_archive} 5 | \title{Freeze Archive Callback} 6 | \description{ 7 | This \link{CallbackAsyncFSelect} freezes the \link{ArchiveAsyncFSelect} to \link{ArchiveAsyncFSelectFrozen} after the optimization has finished. 
8 | } 9 | \examples{ 10 | clbk("mlr3fselect.async_freeze_archive") 11 | } 12 | -------------------------------------------------------------------------------- /mlr3fselect.Rproj: -------------------------------------------------------------------------------- 1 | Version: 1.0 2 | ProjectId: 525fdb42-84c9-412c-ab6a-7ca5904c3d37 3 | 4 | RestoreWorkspace: No 5 | SaveWorkspace: No 6 | AlwaysSaveHistory: Default 7 | 8 | EnableCodeIndexing: Yes 9 | UseSpacesForTab: Yes 10 | NumSpacesForTab: 2 11 | Encoding: UTF-8 12 | 13 | RnwWeave: Sweave 14 | LaTeX: pdfLaTeX 15 | 16 | AutoAppendNewline: Yes 17 | StripTrailingWhitespace: Yes 18 | LineEndingConversion: Posix 19 | 20 | BuildType: Package 21 | PackageUseDevtools: Yes 22 | PackageRoxygenize: rd,collate,namespace 23 | -------------------------------------------------------------------------------- /man-roxygen/param_ties_method.R: -------------------------------------------------------------------------------- 1 | #' @param ties_method (`character(1)`)\cr 2 | #' The method to break ties when selecting sets while optimizing and when selecting the best set. 3 | #' Can be `"least_features"` or `"random"`. 4 | #' The option `"least_features"` (default) selects the feature set with the least features. 5 | #' If there are multiple best feature sets with the same number of features, one is selected randomly. 6 | #' The `random` method returns a random feature set from the best feature sets. 7 | #' Ignored if multiple measures are used. 
8 | -------------------------------------------------------------------------------- /man/mlr3fselect.internal_tuning.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mlr_callbacks.R 3 | \name{mlr3fselect.internal_tuning} 4 | \alias{mlr3fselect.internal_tuning} 5 | \title{Internal Tuning Callback} 6 | \description{ 7 | This callback runs internal tuning alongside the feature selection. 8 | The internal tuning values are aggregated and stored in the results. 9 | The final model is trained with the best feature set and the tuned value. 10 | } 11 | \examples{ 12 | clbk("mlr3fselect.internal_tuning") 13 | } 14 | -------------------------------------------------------------------------------- /.lintr: -------------------------------------------------------------------------------- 1 | linters: linters_with_defaults( 2 | # lintr defaults: https://github.com/jimhester/lintr#available-linters 3 | # the following setup changes/removes certain linters 4 | assignment_linter = NULL, # do not force using <- for assignments 5 | object_name_linter = object_name_linter(c("snake_case", "CamelCase")), # only allow snake case and camel case object names 6 | cyclocomp_linter = NULL, # do not check function complexity 7 | commented_code_linter = NULL, # allow code in comments 8 | line_length_linter = NULL 9 | ) 10 | 11 | -------------------------------------------------------------------------------- /tests/testthat/setup.R: -------------------------------------------------------------------------------- 1 | library("mlr3") 2 | library("checkmate") 3 | 4 | old_opts = options( 5 | warnPartialMatchArgs = TRUE, 6 | warnPartialMatchAttr = TRUE, 7 | warnPartialMatchDollar = TRUE 8 | ) 9 | 10 | # https://github.com/HenrikBengtsson/Wishlist-for-R/issues/88 11 | old_opts = lapply(old_opts, function(x) if (is.null(x)) FALSE else x) 12 | 13 | lg_mlr3 = lgr::get_logger("mlr3") 
14 | lg_rush = lgr::get_logger("rush") 15 | 16 | old_threshold_mlr3 = lg_mlr3$threshold 17 | old_threshold_rush = lg_rush$threshold 18 | 19 | lg_mlr3$set_threshold(0) 20 | lg_rush$set_threshold(0) 21 | 22 | 23 | -------------------------------------------------------------------------------- /tests/testthat/test_mlr_fselectors.R: -------------------------------------------------------------------------------- 1 | test_that("mlr_fselectors", { 2 | expect_dictionary(mlr_fselectors, min_items = 1L) 3 | keys = mlr_fselectors$keys() 4 | 5 | for (key in keys) { 6 | fselector = fs(key) 7 | expect_r6(fselector, "FSelector") 8 | } 9 | }) 10 | 11 | test_that("mlr_fselectors sugar", { 12 | expect_class(fs("random_search"), "FSelector") 13 | expect_class(fss(c("random_search", "random_search")), "list") 14 | }) 15 | 16 | test_that("as.data.table objects parameter", { 17 | tab = as.data.table(mlr_fselectors, objects = TRUE) 18 | expect_data_table(tab) 19 | expect_list(tab$object, "FSelector", any.missing = FALSE) 20 | }) 21 | -------------------------------------------------------------------------------- /tests/testthat/test_fselect_nested.R: -------------------------------------------------------------------------------- 1 | test_that("fselect_nested function works", { 2 | rr = fselect_nested(fselector = fs("random_search"), task = tsk("pima"), learner = lrn("classif.rpart"), 3 | inner_resampling = rsmp ("holdout"), outer_resampling = rsmp("cv", folds = 3), measure = msr("classif.ce"), 4 | term_evals = 2) 5 | 6 | expect_resample_result(rr) 7 | expect_equal(rr$resampling$id, "cv") 8 | expect_equal(rr$resampling$iters, 3) 9 | expect_data_table(extract_inner_fselect_results(rr), nrows = 3) 10 | expect_class(rr$learners[[1]], "AutoFSelector") 11 | expect_equal(rr$learners[[1]]$fselect_instance$objective$resampling$id, "holdout") 12 | }) 13 | -------------------------------------------------------------------------------- /tests/testthat/test_FSelectorBatchDesignPoints.R: 
-------------------------------------------------------------------------------- 1 | test_that("default parameters work", { 2 | design = data.table( 3 | x1 = c(TRUE, FALSE), 4 | x2 = c(TRUE, FALSE), 5 | x3 = c(FALSE, TRUE), 6 | x4 = c(FALSE, TRUE)) 7 | 8 | z = test_fselector("design_points", design = design) 9 | a = z$inst$archive$data 10 | expect_equal(a[, 1:4], design) 11 | }) 12 | 13 | test_that("multi-crit works", { 14 | design = data.table( 15 | x1 = c(TRUE, FALSE), 16 | x2 = c(TRUE, FALSE), 17 | x3 = c(FALSE, TRUE), 18 | x4 = c(FALSE, TRUE)) 19 | 20 | z = test_fselector_2D("design_points", design = design) 21 | a = z$inst$archive$data 22 | expect_equal(a[, 1:4], design) 23 | }) 24 | -------------------------------------------------------------------------------- /man/reexports.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/reexports.R 3 | \docType{import} 4 | \name{reexports} 5 | \alias{reexports} 6 | \alias{mlr_terminators} 7 | \alias{trm} 8 | \alias{trms} 9 | \alias{mlr_callbacks} 10 | \alias{clbk} 11 | \alias{clbks} 12 | \title{Objects exported from other packages} 13 | \keyword{internal} 14 | \description{ 15 | These objects are imported from other packages. Follow the links 16 | below to see their documentation. 
17 | 18 | \describe{ 19 | \item{bbotk}{\code{\link[bbotk]{mlr_terminators}}, \code{\link[bbotk]{trm}}, \code{\link[bbotk:trm]{trms}}} 20 | 21 | \item{mlr3misc}{\code{\link[mlr3misc]{clbk}}, \code{\link[mlr3misc:clbk]{clbks}}, \code{\link[mlr3misc]{mlr_callbacks}}} 22 | }} 23 | 24 | -------------------------------------------------------------------------------- /inst/WORDLIST: -------------------------------------------------------------------------------- 1 | ArchiveBatchFSelect 2 | AutoFSelector 3 | BenchmarkResult 4 | Bengio 5 | Bergstra 6 | CallbackBatchFSelect 7 | CallbackBatchFSelects 8 | Codomain 9 | ContextBatchFSelect 10 | ContextBatch 11 | FSelect 12 | FSelectInstanceBatchMultiCrit 13 | FSelectInstanceBatchSingleCrit 14 | FSelector 15 | FSelectorBatchFromOptimizerBatch 16 | FSelectorBatchSequential 17 | FSelectorBatchShadowVariableSearch 18 | FSelectors 19 | Hepp 20 | Mattermost 21 | Mayr 22 | ORCID 23 | ObjectiveFSelect 24 | ParamSet 25 | Pseudovariables 26 | RFE 27 | ResampleResult 28 | StackOverflow 29 | Stefanski 30 | Uninstantiated 31 | bbotk 32 | cheatsheet 33 | cloneable 34 | cmd 35 | codomain 36 | dev 37 | fselect 38 | fselector 39 | fselectors 40 | getters 41 | iteratively 42 | mlr 43 | parallelize 44 | permutated 45 | saveRDS 46 | th 47 | -------------------------------------------------------------------------------- /tests/testthat/test_FSelectorAsyncRandomSearch.R: -------------------------------------------------------------------------------- 1 | test_that("FSelectorAsyncRandomSearch works", { 2 | skip_on_cran() 3 | skip_if_not_installed("rush") 4 | flush_redis() 5 | 6 | fselector = fs("async_random_search") 7 | expect_class(fselector, "FSelectorAsync") 8 | 9 | on.exit(mirai::daemons(0)) 10 | mirai::daemons(2) 11 | rush::rush_plan(n_workers = 2, worker_type = "remote") 12 | instance = fsi_async( 13 | task = TEST_MAKE_TSK(), 14 | learner = lrn("regr.rpart"), 15 | resampling = rsmp("holdout"), 16 | measures = msr("dummy"), 17 | terminator = 
trm("evals", n_evals = 5L) 18 | ) 19 | 20 | expect_data_table(fselector$optimize(instance), nrows = 1) 21 | expect_data_table(instance$archive$data, min.rows = 5) 22 | 23 | expect_rush_reset(instance$rush, type = "kill") 24 | }) 25 | -------------------------------------------------------------------------------- /inst/testthat/helper_expectations.R: -------------------------------------------------------------------------------- 1 | expect_fselector = function(fselector) { 2 | expect_r6(fselector, "FSelector", 3 | public = c("optimize", "param_set", "properties", "packages"), 4 | private = c(".optimize", ".assign_result")) 5 | } 6 | 7 | expect_best_features = function(res, features) { 8 | expect_set_equal(names(res)[as.logical(res)], features) 9 | } 10 | 11 | expect_feature_number = function(features, n) { 12 | res = rowSums(features) 13 | expect_set_equal(res, n) 14 | } 15 | 16 | expect_max_features = function(features, n) { 17 | res = max(rowSums(features)) 18 | expect_set_equal(res, n) 19 | } 20 | 21 | expect_features = function(res, identical_to = NULL, must_include = NULL) { 22 | expect_names(names(res)[as.logical(res)], must.include = must_include, identical.to = identical_to) 23 | } 24 | -------------------------------------------------------------------------------- /man/mlr3fselect.backup.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mlr_callbacks.R 3 | \name{mlr3fselect.backup} 4 | \alias{mlr3fselect.backup} 5 | \title{Backup Benchmark Result Callback} 6 | \description{ 7 | This \link{CallbackBatchFSelect} writes the \link[mlr3:BenchmarkResult]{mlr3::BenchmarkResult} after each batch to disk. 
8 | } 9 | \examples{ 10 | clbk("mlr3fselect.backup", path = "backup.rds") 11 | 12 | # Run feature selection on the Pima Indians Diabetes data set 13 | instance = fselect( 14 | fselector = fs("random_search"), 15 | task = tsk("pima"), 16 | learner = lrn("classif.rpart"), 17 | resampling = rsmp("holdout"), 18 | measures = msr("classif.ce"), 19 | term_evals = 4, 20 | callbacks = clbk("mlr3fselect.backup", path = tempfile(fileext = ".rds"))) 21 | } 22 | -------------------------------------------------------------------------------- /tests/testthat/test_FSelectorAsyncExhaustiveSearch.R: -------------------------------------------------------------------------------- 1 | test_that("FSelectorAsyncExhaustiveSearch works", { 2 | skip_on_cran() 3 | skip_if_not_installed("rush") 4 | flush_redis() 5 | 6 | fselector = fs("async_exhaustive_search") 7 | expect_class(fselector, "FSelectorAsync") 8 | 9 | on.exit(mirai::daemons(0)) 10 | mirai::daemons(2) 11 | rush::rush_plan(n_workers = 2, worker_type = "remote") 12 | instance = fsi_async( 13 | task = TEST_MAKE_TSK(), 14 | learner = lrn("regr.rpart"), 15 | resampling = rsmp("holdout"), 16 | measures = msr("dummy"), 17 | terminator = trm("evals", n_evals = 15) 18 | ) 19 | 20 | expect_data_table(fselector$optimize(instance), nrows = 1) 21 | expect_data_table(instance$archive$data, nrows = 15) 22 | 23 | expect_rush_reset(instance$rush, type = "kill") 24 | }) 25 | -------------------------------------------------------------------------------- /man/assert_async_fselect_callback.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/CallbackAsyncFSelect.R 3 | \name{assert_async_fselect_callback} 4 | \alias{assert_async_fselect_callback} 5 | \alias{assert_async_fselect_callbacks} 6 | \title{Assertions for Callbacks} 7 | \usage{ 8 | assert_async_fselect_callback(callback, null_ok = FALSE) 9 | 10 | 
assert_async_fselect_callbacks(callbacks) 11 | } 12 | \arguments{ 13 | \item{callback}{(\link{CallbackAsyncFSelect}).} 14 | 15 | \item{null_ok}{(\code{logical(1)})\cr 16 | If \code{TRUE}, \code{NULL} is allowed.} 17 | 18 | \item{callbacks}{(list of \link{CallbackAsyncFSelect}).} 19 | } 20 | \value{ 21 | \link{CallbackAsyncFSelect} | List of \link{CallbackAsyncFSelect}s. 22 | } 23 | \description{ 24 | Assertions for \link{CallbackAsyncFSelect} class. 25 | } 26 | -------------------------------------------------------------------------------- /R/helper.R: -------------------------------------------------------------------------------- 1 | task_to_domain = function(task) { 2 | params = rep(list(p_lgl()), length(task$feature_names)) 3 | names(params) = task$feature_names 4 | do.call(ps, params) 5 | } 6 | 7 | measures_to_codomain = function(measures) { 8 | measures = as_measures(measures) 9 | domains = map(measures, function(s) { 10 | if ("set_id" %in% names(ps())) { 11 | # old paradox 12 | get("ParamDbl")$new(id = s$id, tags = ifelse(s$minimize, "minimize", "maximize")) 13 | } else { 14 | p_dbl(tags = ifelse(s$minimize, "minimize", "maximize")) 15 | } 16 | }) 17 | names(domains) = ids(measures) 18 | Codomain$new(domains) 19 | } 20 | 21 | extract_runtime = function(resample_result) { 22 | runtimes = map_dbl(get_private(resample_result)$.data$learner_states(get_private(resample_result)$.view), function(state) { 23 | state$train_time + state$predict_time 24 | }) 25 | sum(runtimes) 26 | } 27 | -------------------------------------------------------------------------------- /man-roxygen/example.R: -------------------------------------------------------------------------------- 1 | <% if (id == "genetic_search") { -%> 2 | #' @examplesIf mlr3misc::require_namespaces("genalg", quietly = TRUE) 3 | <% } else { -%> 4 | #' @examples 5 | <% } -%> 6 | #' # Feature Selection 7 | #' \donttest{ 8 | #' 9 | #' # retrieve task and load learner 10 | #' task = tsk("penguins") 11 | #' learner = 
lrn("classif.rpart") 12 | #' 13 | #' # run feature selection on the Palmer Penguins data set 14 | #' instance = fselect( 15 | #' fselector = fs("<%= id %>"), 16 | #' task = task, 17 | #' learner = learner, 18 | #' resampling = rsmp("holdout"), 19 | #' measure = msr("classif.ce"), 20 | #' term_evals = 10 21 | #' ) 22 | #' 23 | #' # best performing feature set 24 | #' instance$result 25 | #' 26 | #' # all evaluated feature sets 27 | #' as.data.table(instance$archive) 28 | #' 29 | #' # subset the task and fit the final model 30 | #' task$select(instance$result_feature_set) 31 | #' learner$train(task) 32 | #' } 33 | -------------------------------------------------------------------------------- /tests/testthat/test_FSelectorAsyncDesignPoints.R: -------------------------------------------------------------------------------- 1 | test_that("FSelectorAsyncDesignPoints works", { 2 | skip_on_cran() 3 | skip_if_not_installed("rush") 4 | flush_redis() 5 | 6 | on.exit(mirai::daemons(0)) 7 | mirai::daemons(2) 8 | rush::rush_plan(n_workers = 2, worker_type = "remote") 9 | instance = fsi_async( 10 | task = TEST_MAKE_TSK(), 11 | learner = lrn("regr.rpart"), 12 | resampling = rsmp("holdout"), 13 | measures = msr("dummy"), 14 | terminator = trm("evals", n_evals = 2), 15 | store_benchmark_result = FALSE 16 | ) 17 | 18 | design = data.table( 19 | x1 = c(TRUE, FALSE), 20 | x2 = c(TRUE, FALSE), 21 | x3 = c(FALSE, TRUE), 22 | x4 = c(FALSE, TRUE)) 23 | 24 | fselector = fs("async_design_points", design = design) 25 | expect_data_table(fselector$optimize(instance), nrows = 1) 26 | 27 | expect_data_table(instance$archive$data, nrows = 2) 28 | expect_rush_reset(instance$rush, type = "kill") 29 | }) 30 | -------------------------------------------------------------------------------- /R/FSelectorAsyncDesignPoints.R: -------------------------------------------------------------------------------- 1 | #' @title Feature Selection with Asynchronous Design Points 2 | #' 3 | #' @name 
mlr_fselectors_async_design_points 4 | #' 5 | #' @description 6 | #' Subclass for asynchronous design points feature selection. 7 | #' 8 | #' @templateVar id async_design_points 9 | #' @template section_dictionary_fselectors 10 | #' 11 | #' @inheritSection bbotk::OptimizerAsyncDesignPoints Parameters 12 | #' 13 | #' @family FSelectorAsync 14 | #' @export 15 | FSelectorAsyncDesignPoints = R6Class("FSelectorAsyncDesignPoints", 16 | inherit = FSelectorAsyncFromOptimizerAsync, 17 | public = list( 18 | 19 | #' @description 20 | #' Creates a new instance of this [R6][R6::R6Class] class. 21 | initialize = function() { 22 | super$initialize( 23 | optimizer = OptimizerAsyncDesignPoints$new(), 24 | man = "mlr3fselect::mlr_fselectors_async_design_points" 25 | ) 26 | } 27 | ) 28 | ) 29 | 30 | mlr_fselectors$add("async_design_points", FSelectorAsyncDesignPoints) 31 | -------------------------------------------------------------------------------- /pkgdown/_pkgdown.yml: -------------------------------------------------------------------------------- 1 | url: https://mlr3fselect.mlr-org.com 2 | 3 | template: 4 | bootstrap: 5 5 | light-switch: true 6 | math-rendering: mathjax 7 | package: mlr3pkgdowntemplate 8 | 9 | development: 10 | mode: auto 11 | version_label: default 12 | version_tooltip: "Version" 13 | 14 | toc: 15 | depth: 3 16 | 17 | navbar: 18 | structure: 19 | left: [reference, news, book] 20 | right: [search, github, mattermost, stackoverflow, rss, lightswitch] 21 | components: 22 | home: ~ 23 | reference: 24 | icon: fa fa-file-alt 25 | text: Reference 26 | href: reference/index.html 27 | mattermost: 28 | icon: fa fa-comments 29 | href: https://lmmisld-lmu-stats-slds.srv.mwn.de/mlr_invite/ 30 | book: 31 | text: mlr3book 32 | icon: fa fa-link 33 | href: https://mlr3book.mlr-org.com 34 | stackoverflow: 35 | icon: fab fa-stack-overflow 36 | href: https://stackoverflow.com/questions/tagged/mlr 37 | rss: 38 | icon: fa-rss 39 | href: https://mlr-org.com/ 40 | 
-------------------------------------------------------------------------------- /man/mlr3fselect_assertions.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/assertions.R 3 | \name{mlr3fselect_assertions} 4 | \alias{mlr3fselect_assertions} 5 | \alias{assert_fselectors} 6 | \alias{assert_fselector_async} 7 | \alias{assert_fselector_batch} 8 | \alias{assert_fselect_instance} 9 | \alias{assert_fselect_instance_async} 10 | \alias{assert_fselect_instance_batch} 11 | \title{Assertion for mlr3fselect objects} 12 | \usage{ 13 | assert_fselectors(fselectors) 14 | 15 | assert_fselector_async(fselector) 16 | 17 | assert_fselector_batch(fselector) 18 | 19 | assert_fselect_instance(inst) 20 | 21 | assert_fselect_instance_async(inst) 22 | 23 | assert_fselect_instance_batch(inst) 24 | } 25 | \arguments{ 26 | \item{fselectors}{(list of \link{FSelector}).} 27 | 28 | \item{fselector}{(\link{FSelectorBatch}).} 29 | 30 | \item{inst}{(\link{FSelectInstanceBatchSingleCrit} | \link{FSelectInstanceBatchMultiCrit}).} 31 | } 32 | \description{ 33 | Most assertion functions ensure the right class attribute, and optionally additional properties. 
34 | } 35 | \keyword{internal} 36 | -------------------------------------------------------------------------------- /man/faggregate.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/faggregate.R 3 | \name{faggregate} 4 | \alias{faggregate} 5 | \title{Fast Aggregation of ResampleResults and BenchmarkResults} 6 | \usage{ 7 | faggregate(obj, measure, conditions = FALSE) 8 | } 9 | \arguments{ 10 | \item{obj}{(\link[mlr3:ResampleResult]{mlr3::ResampleResult} | \link[mlr3:BenchmarkResult]{mlr3::BenchmarkResult}).} 11 | 12 | \item{measure}{(\link[mlr3:Measure]{mlr3::Measure}).} 13 | 14 | \item{conditions}{(\code{logical(1)})\cr 15 | If \code{TRUE}, the function returns the number of warnings and the number of errors.} 16 | } 17 | \value{ 18 | (\code{\link[data.table:data.table]{data.table::data.table()}}) 19 | } 20 | \description{ 21 | Aggregates a \link[mlr3:ResampleResult]{mlr3::ResampleResult} or \link[mlr3:BenchmarkResult]{mlr3::BenchmarkResult} for a single simple measure. 22 | Returns the aggregated score for each resample result. 23 | } 24 | \details{ 25 | This function is faster than \verb{$aggregate()} because it does not reassemble the resampling results. 26 | It only works on simple measures which do not require the task, learner, model or train set to be available. 
27 | } 28 | -------------------------------------------------------------------------------- /tests/testthat/test_FSelectorExhaustiveSearch.R: -------------------------------------------------------------------------------- 1 | test_that("default parameters work", { 2 | z = test_fselector("exhaustive_search") 3 | a = z$inst$archive$data 4 | 5 | expect_feature_number(a[seq(4), list(x1, x2, x3, x4)], n = 1) 6 | expect_feature_number(a[6:10, list(x1, x2, x3, x4)], n = 2) 7 | expect_feature_number(a[11:14, list(x1, x2, x3, x4)], n = 3) 8 | expect_feature_number(a[15, list(x1, x2, x3, x4)], n = 4) 9 | r = z$inst$result_x_search_space 10 | expect_equal(r, data.table(x1 = TRUE, x2 = TRUE, x3 = TRUE, x4 = FALSE)) 11 | }) 12 | 13 | test_that("max_features parameter works", { 14 | z = test_fselector("exhaustive_search", max_features = 2) 15 | a = z$inst$archive$data 16 | 17 | expect_max_features(a[,list(x1, x2, x3, x4)], n = 2) 18 | r = z$inst$result_x_search_space 19 | expect_equal(r, data.table(x1 = TRUE, x2 = TRUE, x3 = FALSE, x4 = FALSE)) 20 | }) 21 | 22 | test_that("multi-crit works", { 23 | test_fselector_2D("exhaustive_search") 24 | }) 25 | 26 | test_that("batch_size parameter works", { 27 | z = test_fselector("exhaustive_search", batch_size = 2) 28 | a = z$inst$archive$data 29 | 30 | expect_data_table(a[list(1), , on = "batch_nr"], nrows = 2) 31 | }) 32 | -------------------------------------------------------------------------------- /man/mlr3fselect.one_se_rule.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mlr_callbacks.R 3 | \name{mlr3fselect.one_se_rule} 4 | \alias{mlr3fselect.one_se_rule} 5 | \title{One Standard Error Rule Callback} 6 | \source{ 7 | Kuhn, Max, Johnson, Kjell (2013). 8 | \dQuote{Applied Predictive Modeling.} 9 | In chapter Over-Fitting and Model Tuning, 61--92. 10 | Springer New York, New York, NY. 
11 | ISBN 978-1-4614-6849-3. 12 | } 13 | \description{ 14 | Selects the smallest feature set within one standard error of the best as the result. 15 | If there are multiple such feature sets with the same number of features, the first one is selected. 16 | If the sets have exactly the same performance but different number of features, 17 | the one with the smallest number of features is selected. 18 | } 19 | \examples{ 20 | clbk("mlr3fselect.one_se_rule") 21 | 22 | # Run feature selection on the pima data set with the callback 23 | instance = fselect( 24 | fselector = fs("random_search"), 25 | task = tsk("pima"), 26 | learner = lrn("classif.rpart"), 27 | resampling = rsmp("cv", folds = 3), 28 | measures = msr("classif.ce"), 29 | term_evals = 10, 30 | callbacks = clbk("mlr3fselect.one_se_rule")) 31 | # Smallest feature set within one standard error of the best 32 | instance$result 33 | } 34 | -------------------------------------------------------------------------------- /tests/testthat/test_ArchiveAsyncFSelectFrozen.R: -------------------------------------------------------------------------------- 1 | test_that("ArchiveAsyncFSelectFrozen works", { 2 | skip_on_cran() 3 | skip_if_not_installed("rush") 4 | flush_redis() 5 | 6 | mirai::daemons(2) 7 | rush::rush_plan(n_workers = 2, worker_type = "remote") 8 | 9 | instance = fsi_async( 10 | task = tsk("pima"), 11 | learner = lrn("classif.rpart"), 12 | resampling = rsmp("cv", folds = 3), 13 | measures = msr("classif.ce"), 14 | terminator = trm("evals", n_evals = 20), 15 | store_benchmark_result = TRUE 16 | ) 17 | fselector = fs("async_random_search") 18 | fselector$optimize(instance) 19 | 20 | archive = instance$archive 21 | frozen_archive = ArchiveAsyncFSelectFrozen$new(archive) 22 | 23 | expect_data_table(frozen_archive$data) 24 | expect_data_table(frozen_archive$queued_data) 25 | expect_data_table(frozen_archive$running_data) 26 | expect_data_table(frozen_archive$finished_data) 27 | 
expect_data_table(frozen_archive$failed_data) 28 | expect_number(frozen_archive$n_queued) 29 | expect_number(frozen_archive$n_running) 30 | expect_number(frozen_archive$n_finished) 31 | expect_number(frozen_archive$n_failed) 32 | expect_number(frozen_archive$n_evals) 33 | expect_benchmark_result(frozen_archive$benchmark_result) 34 | 35 | expect_data_table(as.data.table(frozen_archive)) 36 | expect_rush_reset(instance$rush) 37 | }) 38 | -------------------------------------------------------------------------------- /.github/workflows/pkgdown.yml: -------------------------------------------------------------------------------- 1 | # pkgdown workflow of the mlr3 ecosystem v0.1.0 2 | # https://github.com/mlr-org/actions 3 | on: 4 | push: 5 | branches: 6 | - main 7 | pull_request: 8 | branches: 9 | - main 10 | release: 11 | types: 12 | - published 13 | workflow_dispatch: 14 | 15 | name: pkgdown 16 | 17 | jobs: 18 | pkgdown: 19 | runs-on: ubuntu-latest 20 | 21 | concurrency: 22 | group: pkgdown-${{ github.event_name != 'pull_request' || github.run_id }} 23 | env: 24 | GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} 25 | steps: 26 | - uses: actions/checkout@v3 27 | 28 | - uses: r-lib/actions/setup-pandoc@v2 29 | 30 | - uses: r-lib/actions/setup-r@v2 31 | 32 | - uses: r-lib/actions/setup-r-dependencies@v2 33 | with: 34 | extra-packages: any::pkgdown, local::. 
35 | needs: website 36 | 37 | - name: Install template 38 | run: pak::pkg_install("mlr-org/mlr3pkgdowntemplate") 39 | shell: Rscript {0} 40 | 41 | - name: Build site 42 | run: pkgdown::build_site_github_pages(new_process = FALSE, install = FALSE) 43 | shell: Rscript {0} 44 | 45 | - name: Deploy 46 | if: github.event_name != 'pull_request' 47 | uses: JamesIves/github-pages-deploy-action@v4.4.1 48 | with: 49 | clean: false 50 | branch: gh-pages 51 | folder: docs 52 | -------------------------------------------------------------------------------- /R/FSelectorAsync.R: -------------------------------------------------------------------------------- 1 | #' @title Class for Asynchronous Feature Selection Algorithms 2 | #' 3 | #' @include mlr_fselectors.R 4 | #' 5 | #' @description 6 | #' The [FSelectorAsync] implements the asynchronous optimization algorithm. 7 | #' 8 | #' @details 9 | #' [FSelectorAsync] is an abstract base class that implements the base functionality each asynchronous fselector must provide. 10 | #' 11 | #' @inheritSection FSelector Resources 12 | #' 13 | #' @template param_id 14 | #' @template param_param_set 15 | #' @template param_properties 16 | #' @template param_packages 17 | #' @template param_label 18 | #' @template param_man 19 | #' 20 | #' @export 21 | FSelectorAsync = R6Class("FSelectorAsync", 22 | inherit = FSelector, 23 | public = list( 24 | 25 | #' @description 26 | #' Performs the feature selection on a [FSelectInstanceAsyncSingleCrit] or [FSelectInstanceAsyncMultiCrit] until termination. 27 | #' The single evaluations will be written into the [ArchiveAsyncFSelect] that resides in the [FSelectInstanceAsyncSingleCrit]/[FSelectInstanceAsyncMultiCrit]. 28 | #' The result will be written into the instance object. 29 | #' 30 | #' @param inst ([FSelectInstanceAsyncSingleCrit] | [FSelectInstanceAsyncMultiCrit]). 
31 | #' 32 | #' @return [data.table::data.table()] 33 | optimize = function(inst) { 34 | assert_fselect_instance_async(inst) 35 | optimize_async_default(inst, self) 36 | } 37 | ) 38 | ) 39 | -------------------------------------------------------------------------------- /man/mlr3fselect.svm_rfe.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mlr_callbacks.R 3 | \name{mlr3fselect.svm_rfe} 4 | \alias{mlr3fselect.svm_rfe} 5 | \title{SVM-RFE Callback} 6 | \source{ 7 | Guyon I, Weston J, Barnhill S, Vapnik V (2002). 8 | \dQuote{Gene Selection for Cancer Classification using Support Vector Machines.} 9 | \emph{Machine Learning}, \bold{46}(1), 389--422. 10 | ISSN 1573-0565, \doi{10.1023/A:1012487302797}. 11 | } 12 | \description{ 13 | Runs a recursive feature elimination with a \link[mlr3learners:mlr_learners_classif.svm]{mlr3learners::LearnerClassifSVM}. 14 | The SVM must be configured with \code{type = "C-classification"} and \code{kernel = "linear"}. 
15 | } 16 | \examples{ 17 | \dontshow{if (requireNamespace("mlr3learners", quietly = TRUE)) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} 18 | clbk("mlr3fselect.svm_rfe") 19 | 20 | library(mlr3learners) 21 | 22 | # Create instance with classification svm with linear kernel 23 | instance = fsi( 24 | task = tsk("sonar"), 25 | learner = lrn("classif.svm", type = "C-classification", kernel = "linear"), 26 | resampling = rsmp("cv", folds = 3), 27 | measures = msr("classif.ce"), 28 | terminator = trm("none"), 29 | callbacks = clbk("mlr3fselect.svm_rfe"), 30 | store_models = TRUE 31 | ) 32 | 33 | fselector = fs("rfe", feature_number = 5, n_features = 10) 34 | 35 | # Run recursive feature elimination on the Sonar data set 36 | fselector$optimize(instance) 37 | \dontshow{\}) # examplesIf} 38 | } 39 | -------------------------------------------------------------------------------- /R/zzz.R: -------------------------------------------------------------------------------- 1 | #' @import data.table 2 | #' @import checkmate 3 | #' @import cli 4 | #' @import paradox 5 | #' @import mlr3misc 6 | #' @import mlr3 7 | #' @import bbotk 8 | #' @importFrom R6 R6Class 9 | #' @importFrom utils combn head tail packageVersion 10 | #' @importFrom stats sd 11 | "_PACKAGE" 12 | 13 | .onLoad = function(libname, pkgname) { 14 | # nocov start 15 | utils::globalVariables(c("super", "self", "n_features", "errors")) 16 | 17 | # reflections 18 | x = utils::getFromNamespace("bbotk_reflections", ns = "bbotk") 19 | x$optimizer_properties = c(x$optimizer_properties, "requires_model") 20 | 21 | x = utils::getFromNamespace("mlr_reflections", ns = "mlr3") 22 | walk(names(x$task_col_roles), function(task_type) { 23 | x$task_col_roles[[task_type]] = unique(c(x$task_col_roles[[task_type]], "always_included")) 24 | }) 25 | 26 | x$loaded_packages = c(x$loaded_packages, "mlr3fselect") 27 | 28 | # callbacks 29 | x = utils::getFromNamespace("mlr_callbacks", ns = "mlr3misc") 30 | 
x$add("mlr3fselect.backup", load_callback_backup) 31 | x$add("mlr3fselect.svm_rfe", load_callback_svm_rfe) 32 | x$add("mlr3fselect.one_se_rule", load_callback_one_se_rule) 33 | x$add("mlr3fselect.internal_tuning", load_callback_internal_tuning) 34 | x$add("mlr3fselect.async_freeze_archive", load_callback_freeze_archive) 35 | 36 | assign("lg", lgr::get_logger("mlr3/bbotk"), envir = parent.env(environment())) 37 | if (Sys.getenv("IN_PKGDOWN") == "true") { 38 | lg$set_threshold("warn") 39 | } 40 | } # nocov end 41 | 42 | leanify_package() 43 | -------------------------------------------------------------------------------- /man/mlr3fselect-package.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/zzz.R 3 | \docType{package} 4 | \name{mlr3fselect-package} 5 | \alias{mlr3fselect} 6 | \alias{mlr3fselect-package} 7 | \title{mlr3fselect: Feature Selection for 'mlr3'} 8 | \description{ 9 | \if{html}{\figure{logo.png}{options: style='float: right' alt='logo' width='120'}} 10 | 11 | Feature selection package of the 'mlr3' ecosystem. It selects the optimal feature set for any 'mlr3' learner. The package works with several optimization algorithms e.g. Random Search, Recursive Feature Elimination, and Genetic Search. Moreover, it can automatically optimize learners and estimate the performance of optimized feature sets with nested resampling. 
12 | } 13 | \seealso{ 14 | Useful links: 15 | \itemize{ 16 | \item \url{https://mlr3fselect.mlr-org.com} 17 | \item \url{https://github.com/mlr-org/mlr3fselect} 18 | \item Report bugs at \url{https://github.com/mlr-org/mlr3fselect/issues} 19 | } 20 | 21 | } 22 | \author{ 23 | \strong{Maintainer}: Marc Becker \email{marcbecker@posteo.de} (\href{https://orcid.org/0000-0002-8115-0400}{ORCID}) 24 | 25 | Authors: 26 | \itemize{ 27 | \item Patrick Schratz \email{patrick.schratz@gmail.com} (\href{https://orcid.org/0000-0003-0748-6624}{ORCID}) 28 | \item Michel Lang \email{michellang@gmail.com} (\href{https://orcid.org/0000-0001-9754-0393}{ORCID}) 29 | \item Bernd Bischl \email{bernd_bischl@gmx.net} (\href{https://orcid.org/0000-0001-6002-6980}{ORCID}) 30 | \item John Zobolas \email{bblodfon@gmail.com} (\href{https://orcid.org/0000-0002-3609-8674}{ORCID}) 31 | } 32 | 33 | } 34 | -------------------------------------------------------------------------------- /attic/test_FSelectorEvolutionary.R: -------------------------------------------------------------------------------- 1 | context("FSelectEvolutionary") 2 | 3 | test_that("FSelectEvolutionary", { 4 | z = test_fselector("evolutionary", mu = 4, lambda = 8, term_evals = 12) 5 | }) 6 | 7 | test_that("FSelectEvolutionary - Initial solution", { 8 | z = test_fselector("evolutionary", mu = 4, lambda = 8, 9 | initial.solutions = list(c(1, 1, 1, 0)), term_evals = 12) 10 | r = z$inst$result_x_search_space 11 | expect_equal(r, data.table(x1 = TRUE, 12 | x2 = TRUE, 13 | x3 = TRUE, 14 | x4 = FALSE)) 15 | }) 16 | 17 | test_that("FSelectEvolutionary - Parent selector", { 18 | test_fselector("evolutionary", mu = 4, lambda = 8, 19 | parent.selector = "selRoulette", term_evals = 12) 20 | test_fselector("evolutionary", mu = 10, lambda = 8, 21 | parent.selector = "selGreedy", term_evals = 12) 22 | }) 23 | 24 | test_that("FSelectEvolutionary - Survival selector", { 25 | test_fselector("evolutionary", mu = 4, lambda = 8, 26 | 
survival.selector = "selRoulette", term_evals = 12) 27 | }) 28 | 29 | test_that("FSelectEvolutionary - Survival strategy", { 30 | test_fselector("evolutionary", mu = 4, lambda = 8, survival.strategy = "comma", 31 | n.elite = 1, term_evals = 12) 32 | }) 33 | 34 | test_that("FSelectEvolutionary - Task with no features as initial solution", { 35 | test_fselector("evolutionary", mu = 4, lambda = 8, 36 | initial.solutions = list(c(0, 0, 0, 0)), term_evals = 10) 37 | }) 38 | 39 | test_that("FSelectEvolutionary flips random bit if feature set is empty", { 40 | test_fselector("evolutionary", mu = 4, lambda = 8, 41 | initial.solutions = list(c(0, 0, 0, 0)), term_evals = 12) 42 | }) 43 | -------------------------------------------------------------------------------- /man/fs.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/sugar.R 3 | \name{fs} 4 | \alias{fs} 5 | \alias{fss} 6 | \title{Syntactic Sugar for Feature Selection Objects Construction} 7 | \usage{ 8 | fs(.key, ...) 9 | 10 | fss(.keys, ...) 11 | } 12 | \arguments{ 13 | \item{.key}{(\code{character(1)})\cr 14 | Key passed to the respective \link[mlr3misc:Dictionary]{dictionary} to retrieve the object.} 15 | 16 | \item{...}{(any)\cr 17 | Additional arguments.} 18 | 19 | \item{.keys}{(\code{character()})\cr 20 | Keys passed to the respective \link[mlr3misc:Dictionary]{dictionary} to retrieve multiple objects.} 21 | } 22 | \value{ 23 | \link[R6:R6Class]{R6::R6Class} object of the respective type, or a list of \link[R6:R6Class]{R6::R6Class} objects for the plural versions. 24 | } 25 | \description{ 26 | Functions to retrieve objects, set parameters and assign to fields in one go. 
27 | Relies on \code{\link[mlr3misc:dictionary_sugar_get]{mlr3misc::dictionary_sugar_get()}} to extract objects from the respective \link[mlr3misc:Dictionary]{mlr3misc::Dictionary}: 28 | \itemize{ 29 | \item \code{fs()} for a \link{FSelector} from \link{mlr_fselectors}. 30 | \item \code{fss()} for a list of \link[=FSelector]{FSelectors} from \link{mlr_fselectors}. 31 | \item \code{trm()} for a \link[bbotk:Terminator]{bbotk::Terminator} from \link{mlr_terminators}. 32 | \item \code{trms()} for a list of \link[bbotk:Terminator]{Terminators} from \link{mlr_terminators}. 33 | } 34 | } 35 | \examples{ 36 | # random search fselector with batch size of 5 37 | fs("random_search", batch_size = 5) 38 | 39 | # run time terminator with 20 seconds 40 | trm("run_time", secs = 20) 41 | } 42 | -------------------------------------------------------------------------------- /tests/testthat/test_FSelectorSequential.R: -------------------------------------------------------------------------------- 1 | test_that("default parameters works", { 2 | z = test_fselector("sequential") 3 | a = z$inst$archive$data 4 | expect_feature_number(a[batch_nr == 1, 1:4], n = 1) 5 | expect_feature_number(a[batch_nr == 2, 1:4], n = 2) 6 | expect_feature_number(a[batch_nr == 3, 1:4], n = 3) 7 | expect_feature_number(a[batch_nr == 4, 1:4], n = 4) 8 | }) 9 | 10 | test_that("sbs strategy works", { 11 | z = test_fselector("sequential", strategy = "sbs") 12 | a = z$inst$archive$data 13 | expect_feature_number(a[batch_nr == 1, 1:4], n = 4) 14 | expect_feature_number(a[batch_nr == 2, 1:4], n = 3) 15 | expect_feature_number(a[batch_nr == 3, 1:4], n = 2) 16 | expect_feature_number(a[batch_nr == 4, 1:4], n = 1) 17 | }) 18 | 19 | test_that("sfs strategy works with max_features parameter", { 20 | z = test_fselector("sequential", max_features = 2) 21 | a = z$inst$archive$data 22 | expect_max_features(a[, 1:4], n = 2) 23 | }) 24 | 25 | test_that("sbs strategy works with max_features parameter", { 26 | z = 
test_fselector("sequential", max_features = 2, strategy = "sbs") 27 | a = z$inst$archive$data 28 | expect_max_features(a[, 1:4], n = 2) 29 | }) 30 | 31 | test_that("optimization_path method works", { 32 | z = test_fselector("sequential") 33 | op = z$fselector$optimization_path(z$inst) 34 | expect_data_table(op, nrows = 4, ncols = 6) 35 | expect_equal(op$dummy, c(1, 2, 4, 3)) 36 | }) 37 | 38 | test_that("optimization_path method works with included uhash", { 39 | z = test_fselector("sequential") 40 | op = z$fselector$optimization_path(z$inst, include_uhash = TRUE) 41 | expect_data_table(op) 42 | expect_names(names(op), must.include = "uhash") 43 | expect_equal(op$dummy, c(1, 2, 4, 3)) 44 | }) 45 | -------------------------------------------------------------------------------- /tests/testthat/test_fsi_async.R: -------------------------------------------------------------------------------- 1 | test_that("fsi_async function creates a FSelectInstanceAsyncSingleCrit", { 2 | skip_on_cran() 3 | skip_if_not_installed("rush") 4 | flush_redis() 5 | 6 | instance = fsi_async( 7 | task = tsk("pima"), 8 | learner = lrn("classif.rpart"), 9 | resampling = rsmp("holdout"), 10 | measures = msr("classif.ce"), 11 | terminator = trm("evals", n_evals = 2)) 12 | expect_class(instance, "FSelectInstanceAsyncSingleCrit") 13 | }) 14 | 15 | test_that("fsi_async function creates a FSelectInstanceAsyncMultiCrit", { 16 | skip_on_cran() 17 | skip_if_not_installed("rush") 18 | flush_redis() 19 | 20 | instance = fsi_async( 21 | task = tsk("pima"), 22 | learner = lrn("classif.rpart"), 23 | resampling = rsmp("holdout"), 24 | measures = msrs(c("classif.ce", "classif.acc")), 25 | terminator = trm("evals", n_evals = 2)) 26 | expect_class(instance, "FSelectInstanceAsyncMultiCrit") 27 | }) 28 | 29 | test_that("fsi_async interface is equal to FSelectInstanceAsyncSingleCrit", { 30 | skip_on_cran() 31 | skip_if_not_installed("rush") 32 | flush_redis() 33 | 34 | fsi_args = formalArgs(fsi_async) 35 | 
fsi_args[fsi_args == "measures"] = "measure" 36 | instance_args = formalArgs(FSelectInstanceAsyncSingleCrit$public_methods$initialize) 37 | 38 | expect_equal(fsi_args, instance_args) 39 | }) 40 | 41 | test_that("fsi_async interface is equal to FSelectInstanceAsyncMultiCrit", { 42 | skip_on_cran() 43 | skip_if_not_installed("rush") 44 | flush_redis() 45 | 46 | fsi_args = formalArgs(fsi_async) 47 | fsi_args = fsi_args[fsi_args != "ties_method"] 48 | instance_args = formalArgs(FSelectInstanceAsyncMultiCrit$public_methods$initialize) 49 | 50 | expect_equal(fsi_args, instance_args) 51 | }) 52 | -------------------------------------------------------------------------------- /R/mlr_fselectors.R: -------------------------------------------------------------------------------- 1 | #' @title Dictionary of FSelectors 2 | #' 3 | #' @usage NULL 4 | #' @format [R6::R6Class] object inheriting from [mlr3misc::Dictionary]. 5 | #' 6 | #' @description 7 | #' A [mlr3misc::Dictionary] storing objects of class [FSelector]. 8 | #' Each fselector has an associated help page, see `mlr_fselectors_[id]`. 9 | #' 10 | #' For a more convenient way to retrieve and construct fselectors, see [fs()]/[fss()]. 11 | #' 12 | #' @section Methods: 13 | #' See [mlr3misc::Dictionary]. 14 | #' 15 | #' @section S3 methods: 16 | #' * `as.data.table(dict, ..., objects = FALSE)`\cr 17 | #' [mlr3misc::Dictionary] -> [data.table::data.table()]\cr 18 | #' Returns a [data.table::data.table()] with fields "key", "label", "properties" and "packages" as columns. 19 | #' If `objects` is set to `TRUE`, the constructed objects are returned in the list column named `object`. 
20 | #' 21 | #' @family Dictionary 22 | #' @family FSelector 23 | #' @seealso 24 | #' Sugar functions: [fs()], [fss()] 25 | #' 26 | #' @export 27 | #' @examples 28 | #' as.data.table(mlr_fselectors) 29 | #' mlr_fselectors$get("random_search") 30 | #' fs("random_search") 31 | mlr_fselectors = R6Class("DictionaryFSelector", 32 | inherit = Dictionary, 33 | cloneable = FALSE 34 | )$new() 35 | 36 | #' @export 37 | as.data.table.DictionaryFSelector = function(x, ..., objects = FALSE) { 38 | assert_flag(objects) 39 | 40 | setkeyv(map_dtr(x$keys(), function(key) { 41 | t = withCallingHandlers(x$get(key), 42 | packageNotFoundWarning = function(w) invokeRestart("muffleWarning")) 43 | insert_named( 44 | list(key = key, label = t$label, properties = list(t$properties), packages = list(t$packages)), 45 | if (objects) list(object = list(t)) 46 | ) 47 | }, .fill = TRUE), "key")[] 48 | } 49 | -------------------------------------------------------------------------------- /.github/workflows/r-cmd-check.yml: -------------------------------------------------------------------------------- 1 | # r cmd check workflow of the mlr3 ecosystem v0.4.0 2 | # https://github.com/mlr-org/actions 3 | # modified to use supercharge/redis-github-action@1.7.0 4 | on: 5 | workflow_dispatch: 6 | inputs: 7 | debug_enabled: 8 | type: boolean 9 | description: 'Run the build with tmate debugging enabled' 10 | required: false 11 | default: false 12 | push: 13 | branches: 14 | - main 15 | pull_request: 16 | branches: 17 | - main 18 | 19 | name: r-cmd-check 20 | 21 | jobs: 22 | r-cmd-check: 23 | runs-on: ${{ matrix.config.os }} 24 | 25 | name: ${{ matrix.config.os }} (${{ matrix.config.r }}) 26 | 27 | env: 28 | GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} 29 | 30 | strategy: 31 | fail-fast: false 32 | matrix: 33 | config: 34 | - {os: ubuntu-latest, r: 'devel'} 35 | - {os: ubuntu-latest, r: 'release'} 36 | 37 | steps: 38 | - uses: actions/checkout@v3 39 | 40 | - uses: r-lib/actions/setup-pandoc@v2 41 | 42 | - 
uses: r-lib/actions/setup-r@v2 43 | with: 44 | r-version: ${{ matrix.config.r }} 45 | 46 | - uses: supercharge/redis-github-action@1.7.0 47 | with: 48 | redis-version: 7 49 | 50 | - uses: r-lib/actions/setup-r-dependencies@v2 51 | with: 52 | extra-packages: any::rcmdcheck 53 | needs: check 54 | 55 | - uses: mxschmitt/action-tmate@v3 56 | if: ${{ github.event_name == 'workflow_dispatch' && inputs.debug_enabled }} 57 | with: 58 | limit-access-to-actor: true 59 | 60 | - uses: r-lib/actions/check-r-package@v2 61 | with: 62 | args: 'c("--no-manual", "--as-cran")' 63 | error-on: '"note"' 64 | -------------------------------------------------------------------------------- /.github/workflows/no-suggest-cmd-check.yml: -------------------------------------------------------------------------------- 1 | # r cmd check workflow without suggests of the mlr3 ecosystem v0.3.1 2 | # https://github.com/mlr-org/actions 3 | # modified to use supercharge/redis-github-action@1.7.0 4 | on: 5 | workflow_dispatch: 6 | inputs: 7 | debug_enabled: 8 | type: boolean 9 | description: 'Run the build with tmate debugging enabled' 10 | required: false 11 | default: false 12 | push: 13 | branches: 14 | - main 15 | pull_request: 16 | branches: 17 | - main 18 | 19 | name: no-suggest-cmd-check 20 | 21 | jobs: 22 | no-suggest-cmd-check: 23 | runs-on: ${{ matrix.config.os }} 24 | 25 | name: ${{ matrix.config.os }} (${{ matrix.config.r }}) 26 | 27 | env: 28 | GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} 29 | 30 | strategy: 31 | fail-fast: false 32 | matrix: 33 | config: 34 | - {os: ubuntu-latest, r: 'release'} 35 | 36 | steps: 37 | - uses: actions/checkout@v5 38 | 39 | - uses: r-lib/actions/setup-pandoc@v2 40 | 41 | - uses: r-lib/actions/setup-r@v2 42 | with: 43 | r-version: ${{ matrix.config.r }} 44 | 45 | - uses: supercharge/redis-github-action@1.7.0 46 | with: 47 | redis-version: 7 48 | 49 | - uses: r-lib/actions/setup-r-dependencies@v2 50 | with: 51 | extra-packages: | 52 | any::rcmdcheck 53 | 
any::testthat 54 | any::knitr 55 | any::rmarkdown 56 | needs: check 57 | dependencies: '"hard"' 58 | cache: false 59 | 60 | - uses: mxschmitt/action-tmate@v3 61 | if: ${{ github.event_name == 'workflow_dispatch' && inputs.debug_enabled }} 62 | with: 63 | limit-access-to-actor: true 64 | 65 | - uses: r-lib/actions/check-r-package@v2 66 | with: 67 | args: 'c("--no-manual", "--as-cran")' 68 | -------------------------------------------------------------------------------- /man/mlr_fselectors.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mlr_fselectors.R 3 | \docType{data} 4 | \name{mlr_fselectors} 5 | \alias{mlr_fselectors} 6 | \title{Dictionary of FSelectors} 7 | \format{ 8 | \link[R6:R6Class]{R6::R6Class} object inheriting from \link[mlr3misc:Dictionary]{mlr3misc::Dictionary}. 9 | } 10 | \description{ 11 | A \link[mlr3misc:Dictionary]{mlr3misc::Dictionary} storing objects of class \link{FSelector}. 12 | Each fselector has an associated help page, see \code{mlr_fselectors_[id]}. 13 | 14 | For a more convenient way to retrieve and construct fselectors, see \code{\link[=fs]{fs()}}/\code{\link[=fss]{fss()}}. 15 | } 16 | \section{Methods}{ 17 | 18 | See \link[mlr3misc:Dictionary]{mlr3misc::Dictionary}. 19 | } 20 | 21 | \section{S3 methods}{ 22 | 23 | \itemize{ 24 | \item \code{as.data.table(dict, ..., objects = FALSE)}\cr 25 | \link[mlr3misc:Dictionary]{mlr3misc::Dictionary} -> \code{\link[data.table:data.table]{data.table::data.table()}}\cr 26 | Returns a \code{\link[data.table:data.table]{data.table::data.table()}} with fields "key", "label", "properties" and "packages" as columns. 27 | If \code{objects} is set to \code{TRUE}, the constructed objects are returned in the list column named \code{object}. 
28 | } 29 | } 30 | 31 | \examples{ 32 | as.data.table(mlr_fselectors) 33 | mlr_fselectors$get("random_search") 34 | fs("random_search") 35 | } 36 | \seealso{ 37 | Sugar functions: \code{\link[=fs]{fs()}}, \code{\link[=fss]{fss()}} 38 | 39 | Other FSelector: 40 | \code{\link{FSelector}}, 41 | \code{\link{mlr_fselectors_design_points}}, 42 | \code{\link{mlr_fselectors_exhaustive_search}}, 43 | \code{\link{mlr_fselectors_genetic_search}}, 44 | \code{\link{mlr_fselectors_random_search}}, 45 | \code{\link{mlr_fselectors_rfe}}, 46 | \code{\link{mlr_fselectors_rfecv}}, 47 | \code{\link{mlr_fselectors_sequential}}, 48 | \code{\link{mlr_fselectors_shadow_variable_search}} 49 | } 50 | \concept{Dictionary} 51 | \concept{FSelector} 52 | \keyword{datasets} 53 | -------------------------------------------------------------------------------- /R/auto_fselector.R: -------------------------------------------------------------------------------- 1 | #' @title Function for Automatic Feature Selection 2 | #' 3 | #' @inherit AutoFSelector description 4 | #' @inheritSection AutoFSelector Resources 5 | #' @inherit AutoFSelector details 6 | #' @inheritSection AutoFSelector Nested Resampling 7 | #' 8 | #' @return [AutoFSelector]. 
9 | #' 10 | #' @template param_fselector 11 | #' @template param_learner 12 | #' @template param_resampling 13 | #' @template param_measure 14 | #' @template param_term_evals 15 | #' @template param_term_time 16 | #' @template param_terminator 17 | #' @template param_store_fselect_instance 18 | #' @template param_store_benchmark_result 19 | #' @template param_store_models 20 | #' @template param_check_values 21 | #' @template param_callbacks 22 | #' @template param_ties_method 23 | #' @template param_rush 24 | #' @template param_id 25 | #' 26 | #' @export 27 | #' @examples 28 | #' afs = auto_fselector( 29 | #' fselector = fs("random_search"), 30 | #' learner = lrn("classif.rpart"), 31 | #' resampling = rsmp("holdout"), 32 | #' measure = msr("classif.ce"), 33 | #' term_evals = 4) 34 | #' 35 | #' afs$train(tsk("pima")) 36 | auto_fselector = function( 37 | fselector, 38 | learner, 39 | resampling, 40 | measure = NULL, 41 | term_evals = NULL, 42 | term_time = NULL, 43 | terminator = NULL, 44 | store_fselect_instance = TRUE, 45 | store_benchmark_result = TRUE, 46 | store_models = FALSE, 47 | check_values = FALSE, 48 | callbacks = NULL, 49 | ties_method = "least_features", 50 | rush = NULL, 51 | id = NULL 52 | ) { 53 | terminator = terminator %??% terminator_selection(term_evals, term_time) 54 | 55 | AutoFSelector$new( 56 | fselector = fselector, 57 | learner = learner, 58 | resampling = resampling, 59 | measure = measure, 60 | terminator = terminator, 61 | store_fselect_instance = store_fselect_instance, 62 | store_benchmark_result = store_benchmark_result, 63 | store_models = store_models, 64 | check_values = check_values, 65 | callbacks = callbacks, 66 | ties_method = ties_method, 67 | rush = rush, 68 | id = id) 69 | } 70 | -------------------------------------------------------------------------------- /R/assertions.R: -------------------------------------------------------------------------------- 1 | #' @title Assertion for mlr3fselect objects 2 | #' 3 | #' 
@description 4 | #' Most assertion functions ensure the right class attribute, and optionally additional properties. 5 | #' 6 | #' @name mlr3fselect_assertions 7 | #' @keywords internal 8 | NULL 9 | 10 | assert_fselector = function(fselector) { 11 | assert_r6(fselector, "FSelector") 12 | } 13 | 14 | #' @export 15 | #' @param fselectors (list of [FSelector]). 16 | #' @rdname mlr3fselect_assertions 17 | assert_fselectors = function(fselectors) { 18 | invisible(lapply(fselectors, assert_fselector)) 19 | } 20 | 21 | #' @export 22 | #' @param fselector (`FSelectorAsync`). 23 | #' @rdname mlr3fselect_assertions 24 | assert_fselector_async = function(fselector) { 25 | assert_r6(fselector, "FSelectorAsync") 26 | } 27 | 28 | #' @export 29 | #' @param fselector ([FSelectorBatch]). 30 | #' @rdname mlr3fselect_assertions 31 | assert_fselector_batch = function(fselector) { 32 | assert_r6(fselector, "FSelectorBatch") 33 | } 34 | 35 | #' @export 36 | #' @param inst ([FSelectInstanceBatchSingleCrit] | [FSelectInstanceBatchMultiCrit] | `FSelectInstanceAsyncSingleCrit` | `FSelectInstanceAsyncMultiCrit`). 37 | #' @rdname mlr3fselect_assertions 38 | assert_fselect_instance = function(inst) { 39 | assert_multi_class(inst, c( 40 | "FSelectInstanceBatchSingleCrit", 41 | "FSelectInstanceBatchMultiCrit", 42 | "FSelectInstanceAsyncSingleCrit", 43 | "FSelectInstanceAsyncMultiCrit")) 44 | } 45 | 46 | #' @export 47 | #' @param inst (`FSelectInstanceAsyncSingleCrit` | `FSelectInstanceAsyncMultiCrit`). 48 | #' @rdname mlr3fselect_assertions 49 | assert_fselect_instance_async = function(inst) { 50 | assert_multi_class(inst, c( 51 | "FSelectInstanceAsyncSingleCrit", 52 | "FSelectInstanceAsyncMultiCrit")) 53 | } 54 | 55 | #' @export 56 | #' @param inst ([FSelectInstanceBatchSingleCrit] | [FSelectInstanceBatchMultiCrit]). 
57 | #' @rdname mlr3fselect_assertions 58 | assert_fselect_instance_batch = function(inst) { 59 | assert_multi_class(inst, c( 60 | "FSelectInstanceBatchSingleCrit", 61 | "FSelectInstanceBatchMultiCrit")) 62 | } 63 | 64 | -------------------------------------------------------------------------------- /inst/testthat/helper_fselector.R: -------------------------------------------------------------------------------- 1 | test_fselector = function(.key, ..., term_evals = NULL, store_models = FALSE) { 2 | fselector = fs(.key, ...) 3 | expect_fselector(fselector) 4 | expect_man_exists(fselector$man) 5 | 6 | inst = fselect( 7 | fselector = fselector, 8 | task = TEST_MAKE_TSK(), 9 | learner = lrn("regr.rpart"), 10 | resampling = rsmp("holdout"), 11 | measures = msr("dummy"), 12 | term_evals = term_evals, 13 | store_models = store_models 14 | ) 15 | 16 | # result checks 17 | archive = inst$archive 18 | expect_data_table(inst$result, nrows = 1) 19 | expect_names(names(inst$result), must.include = c("x1", "x2", "x3", "x4", "features", "n_features", "dummy")) 20 | expect_subset(inst$result$features[[1]], c("x1", "x2", "x3", "x4")) 21 | expect_data_table(inst$result_x_search_space, nrows = 1, ncols = 4, types = "logical") 22 | expect_names(names(inst$result_x_search_space), identical.to = c("x1", "x2", "x3", "x4")) 23 | expect_names(names(inst$result_y), identical.to = "dummy") 24 | 25 | list(fselector = fselector, inst = inst) 26 | } 27 | 28 | test_fselector_2D = function(.key, ..., term_evals = NULL, store_models = FALSE) { 29 | fselector = fs(.key, ...) 
30 | expect_fselector(fselector) 31 | expect_man_exists(fselector$man) 32 | 33 | inst = fselect( 34 | fselector = fselector, 35 | task = TEST_MAKE_TSK(), 36 | learner = lrn("regr.rpart"), 37 | resampling = rsmp("holdout"), 38 | measures = msrs(c("regr.rmse", "regr.mse")), 39 | term_evals = term_evals, 40 | store_models = store_models 41 | ) 42 | 43 | # result checks 44 | expect_names(names(inst$result), identical.to = c("x1", "x2", "x3", "x4", "features", "n_features", "regr.rmse", "regr.mse")) 45 | expect_subset(inst$result$features[[1]], c("x1", "x2", "x3", "x4")) 46 | expect_data_table(inst$result_x_search_space, types = "logical") 47 | expect_names(names(inst$result_x_search_space), identical.to = c("x1", "x2", "x3", "x4")) 48 | expect_names(names(inst$result_y), identical.to = c("regr.rmse", "regr.mse")) 49 | 50 | list(fselector = fselector, inst = inst) 51 | } 52 | -------------------------------------------------------------------------------- /R/FSelectorAsyncRandomSearch.R: -------------------------------------------------------------------------------- 1 | #' @title Feature Selection with Asynchronous Random Search 2 | #' 3 | #' @include mlr_fselectors.R 4 | #' @name mlr_fselectors_async_random_search 5 | #' 6 | #' @description 7 | #' Feature selection using Asynchronous Random Search Algorithm. 8 | #' 9 | #' @templateVar id async_random_search 10 | #' @template section_dictionary_fselectors 11 | #' 12 | #' @section Control Parameters: 13 | #' \describe{ 14 | #' \item{`max_features`}{`integer(1)`\cr 15 | #' Maximum number of features. 16 | #' By default, number of features in [mlr3::Task].} 17 | #' } 18 | #' 19 | #' @source 20 | #' `r format_bib("bergstra_2012")` 21 | #' 22 | #' @family FSelectorAsync 23 | #' @export 24 | FSelectorAsyncRandomSearch = R6Class("FSelectorAsyncRandomSearch", 25 | inherit = FSelectorAsync, 26 | public = list( 27 | 28 | #' @description 29 | #' Creates a new instance of this [R6][R6::R6Class] class. 
30 | initialize = function() { 31 | ps = ps( 32 | max_features = p_int(lower = 1L) 33 | ) 34 | 35 | super$initialize( 36 | id = "async_random_search", 37 | param_set = ps, 38 | properties = c("single-crit", "multi-crit"), 39 | label = "Asynchronous Random Search", 40 | man = "mlr3fselect::mlr_fselectors_async_random_search" 41 | ) 42 | } 43 | ), 44 | 45 | private = list( 46 | .optimize = function(inst) { 47 | pars = self$param_set$values 48 | feature_names = inst$archive$cols_x 49 | max_features = pars$max_features %??% length(feature_names) 50 | 51 | # usually the queue is empty but callbacks might have added points 52 | get_private(inst)$.eval_queue() 53 | 54 | while (!inst$is_terminated) { 55 | # sample new points 56 | n = sample.int(max_features, 1L) 57 | x = sample.int(length(feature_names), n) 58 | xs = as.list(set_names(replace(logical(length(feature_names)), x, TRUE), feature_names)) 59 | # evaluate 60 | get_private(inst)$.eval_point(xs) 61 | } 62 | } 63 | ) 64 | ) 65 | 66 | mlr_fselectors$add("async_random_search", FSelectorAsyncRandomSearch) 67 | -------------------------------------------------------------------------------- /.github/workflows/dev-cmd-check.yml: -------------------------------------------------------------------------------- 1 | # dev cmd check workflow of the mlr3 ecosystem v0.4.0 2 | # https://github.com/mlr-org/actions 3 | # modified to use supercharge/redis-github-action@1.7.0 4 | on: 5 | workflow_dispatch: 6 | inputs: 7 | debug_enabled: 8 | type: boolean 9 | description: 'Run the build with tmate debugging enabled' 10 | required: false 11 | default: false 12 | push: 13 | branches: 14 | - main 15 | pull_request: 16 | branches: 17 | - main 18 | 19 | name: dev-check 20 | 21 | jobs: 22 | check-package: 23 | runs-on: ${{ matrix.config.os }} 24 | 25 | name: ${{ matrix.config.dev-package }} 26 | 27 | env: 28 | GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} 29 | 30 | strategy: 31 | fail-fast: false 32 | matrix: 33 | config: 34 | - {os: 
ubuntu-latest, r: 'release', dev-package: 'mlr-org/bbotk'} 35 | - {os: ubuntu-latest, r: 'release', dev-package: 'mlr-org/mlr3'} 36 | - {os: ubuntu-latest, r: 'release', dev-package: 'mlr-org/mlr3misc'} 37 | - {os: ubuntu-latest, r: 'release', dev-package: 'mlr-org/paradox'} 38 | 39 | steps: 40 | - uses: actions/checkout@v3 41 | 42 | - uses: r-lib/actions/setup-pandoc@v2 43 | 44 | - uses: r-lib/actions/setup-r@v2 45 | with: 46 | r-version: ${{ matrix.config.r }} 47 | 48 | - uses: supercharge/redis-github-action@1.7.0 49 | with: 50 | redis-version: 7 51 | 52 | - uses: r-lib/actions/setup-r-dependencies@v2 53 | with: 54 | extra-packages: any::rcmdcheck 55 | needs: check 56 | 57 | 58 | - name: Install dev versions 59 | run: pak::pkg_install('${{ matrix.config.dev-package }}') 60 | shell: Rscript {0} 61 | 62 | - uses: mxschmitt/action-tmate@v3 63 | if: ${{ github.event_name == 'workflow_dispatch' && inputs.debug_enabled }} 64 | with: 65 | limit-access-to-actor: true 66 | 67 | - uses: r-lib/actions/check-r-package@v2 68 | with: 69 | args: 'c("--no-manual", "--as-cran")' 70 | error-on: '"note"' 71 | -------------------------------------------------------------------------------- /R/FSelectorBatchFromOptimizerBatch.R: -------------------------------------------------------------------------------- 1 | #' @title FSelectorBatchFromOptimizerBatch 2 | #' 3 | #' @description 4 | #' Internally used to transform [bbotk::Optimizer] to [FSelector]. 5 | #' 6 | #' @template param_man 7 | #' 8 | #' @keywords internal 9 | #' @export 10 | FSelectorBatchFromOptimizerBatch = R6Class("FSelectorBatchFromOptimizerBatch", 11 | inherit = FSelectorBatch, 12 | public = list( 13 | 14 | #' @description 15 | #' Creates a new instance of this [R6][R6::R6Class] class. 16 | #' 17 | #' @param optimizer [bbotk::Optimizer]\cr 18 | #' Optimizer that is called.
19 | initialize = function(optimizer, man = NA_character_) { 20 | private$.optimizer = assert_optimizer(optimizer) 21 | packages = union("mlr3fselect", optimizer$packages) 22 | assert_string(man, na.ok = TRUE) 23 | 24 | super$initialize( 25 | id = if ("id" %in% names(optimizer)) optimizer$id else "optimizer", 26 | param_set = optimizer$param_set, 27 | properties = optimizer$properties, 28 | packages = packages, 29 | label = optimizer$label, 30 | man = man 31 | ) 32 | }, 33 | 34 | #' @description 35 | #' Performs the feature selection on a [FSelectInstanceBatchSingleCrit] / 36 | #' [FSelectInstanceBatchMultiCrit] until termination. 37 | #' 38 | #' @param inst ([FSelectInstanceBatchSingleCrit] | [FSelectInstanceBatchMultiCrit]). 39 | #' 40 | #' @return [data.table::data.table]. 41 | optimize = function(inst) { 42 | # We check for both classes since there is no FSelectInstance super 43 | # class anymore and OptimInstance would not ensure that we are in the 44 | # scope of mlr3fselect 45 | assert_fselect_instance_batch(inst) 46 | result = private$.optimizer$optimize(inst) 47 | inst$objective$.__enclos_env__$private$.xss = NULL 48 | inst$objective$.__enclos_env__$private$.design = NULL 49 | inst$objective$.__enclos_env__$private$.benchmark_result = NULL 50 | inst$objective$.__enclos_env__$private$.aggregated_performance = NULL 51 | return(result) 52 | } 53 | ), 54 | 55 | private = list( 56 | .optimizer = NULL 57 | ) 58 | ) 59 | -------------------------------------------------------------------------------- /R/FSelectorBatchGeneticSearch.R: -------------------------------------------------------------------------------- 1 | #' @title Feature Selection with Genetic Search 2 | #' 3 | #' @include mlr_fselectors.R 4 | #' @name mlr_fselectors_genetic_search 5 | #' 6 | #' @description 7 | #' Feature selection using the Genetic Algorithm from the package \CRANpkg{genalg}. 
8 | #' 9 | #' @templateVar id genetic_search 10 | #' @template section_dictionary_fselectors 11 | #' 12 | #' @section Control Parameters: 13 | #' For the meaning of the control parameters, see [genalg::rbga.bin()]. 14 | #' [genalg::rbga.bin()] internally terminates after `iters` iterations. 15 | #' We set `iters = 100000` to allow the termination via our terminators. 16 | #' If more iterations are needed, set `iters` to a higher value in the parameter set. 17 | #' 18 | #' @family FSelector 19 | #' @export 20 | #' @template example 21 | FSelectorBatchGeneticSearch = R6Class("FSelectorBatchGeneticSearch", 22 | inherit = FSelectorBatch, 23 | public = list( 24 | 25 | #' @description 26 | #' Creates a new instance of this [R6][R6::R6Class] class. 27 | initialize = function() { 28 | ps = ps( 29 | suggestion = p_uty(), 30 | popSize = p_int(lower = 5L, default = 200L), 31 | mutationChance = p_dbl(lower = 0, upper = 1), 32 | elitism = p_int(lower = 1L), 33 | zeroToOneRatio = p_int(lower = 1, default = 10L), 34 | iters = p_int(lower = 1, default = 100000L) 35 | ) 36 | ps$values$iters = 100000L 37 | 38 | super$initialize( 39 | id = "genetic_search", 40 | param_set = ps, 41 | properties = "single-crit", 42 | packages = "genalg", 43 | label = "Genetic Search", 44 | man = "mlr3fselect::mlr_fselectors_genetic_search" 45 | ) 46 | } 47 | ), 48 | private = list( 49 | .optimize = function(inst) { 50 | pars = self$param_set$values 51 | if (is.null(pars$mutationChance)) pars$mutationChance = NA 52 | if (is.null(pars$elitism)) pars$elitism = NA 53 | n = inst$objective$domain$length 54 | 55 | mlr3misc::invoke(genalg::rbga.bin, size = n, evalFunc = inst$objective_function, .args = pars) 56 | } 57 | ) 58 | ) 59 | 60 | mlr_fselectors$add("genetic_search", FSelectorBatchGeneticSearch) 61 | -------------------------------------------------------------------------------- /R/FSelectorAsyncFromOptimizerAsync.R: -------------------------------------------------------------------------------- 1 |
#' @title FSelectorAsyncFromOptimizerAsync 2 | #' 3 | #' @description 4 | #' Internally used to transform [bbotk::Optimizer] to [FSelector]. 5 | #' 6 | #' @template param_man 7 | #' 8 | #' @keywords internal 9 | #' @export 10 | FSelectorAsyncFromOptimizerAsync = R6Class("FSelectorAsyncFromOptimizerAsync", 11 | inherit = FSelectorAsync, 12 | public = list( 13 | 14 | #' @description 15 | #' Creates a new instance of this [R6][R6::R6Class] class. 16 | #' 17 | #' @param optimizer [bbotk::Optimizer]\cr 18 | #' Optimizer that is called. 19 | initialize = function(optimizer, man = NA_character_) { 20 | private$.optimizer = assert_optimizer_async(optimizer) 21 | packages = union("mlr3fselect", optimizer$packages) 22 | assert_string(man, na.ok = TRUE) 23 | 24 | super$initialize( 25 | id = if ("id" %in% names(optimizer)) optimizer$id else "fselector", 26 | param_set = optimizer$param_set, 27 | properties = optimizer$properties, 28 | packages = packages, 29 | label = optimizer$label, 30 | man = man 31 | ) 32 | }, 33 | 34 | #' @description 35 | #' Performs the feature selection on a [FSelectInstanceAsyncSingleCrit] / 36 | #' [FSelectInstanceAsyncMultiCrit] until termination. The single evaluations and 37 | #' the final results will be written into the [ArchiveAsyncFSelect] that 38 | #' resides in the [FSelectInstanceAsyncSingleCrit]/[FSelectInstanceAsyncMultiCrit]. 39 | #' The final result is returned. 40 | #' 41 | #' @param inst ([FSelectInstanceAsyncSingleCrit] | [FSelectInstanceAsyncMultiCrit]). 42 | #' 43 | #' @return [data.table::data.table]. 44 | optimize = function(inst) { 45 | assert_fselect_instance_async(inst) 46 | private$.optimizer$optimize(inst) 47 | } 48 | ), 49 | 50 | active = list( 51 | 52 | #' @field param_set ([paradox::ParamSet])\cr 53 | #' Set of control parameters. 
54 | param_set = function(rhs) { 55 | if (!missing(rhs) && !identical(rhs, private$.optimizer$param_set)) { 56 | stop("$param_set is read-only.") 57 | } 58 | private$.optimizer$param_set 59 | } 60 | ), 61 | 62 | private = list( 63 | .optimizer = NULL 64 | ) 65 | ) 66 | 67 | 68 | -------------------------------------------------------------------------------- /R/FSelectorBatchExhaustiveSearch.R: -------------------------------------------------------------------------------- 1 | #' @title Feature Selection with Exhaustive Search 2 | #' 3 | #' @include mlr_fselectors.R 4 | #' @name mlr_fselectors_exhaustive_search 5 | #' 6 | #' @description 7 | #' Feature Selection using the Exhaustive Search Algorithm. 8 | #' Exhaustive Search generates all possible feature sets. 9 | #' 10 | #' @details 11 | #' The feature selection terminates itself when all feature sets are evaluated. 12 | #' It is not necessary to set a termination criterion. 13 | #' 14 | #' @templateVar id exhaustive_search 15 | #' @template section_dictionary_fselectors 16 | #' 17 | #' @section Control Parameters: 18 | #' \describe{ 19 | #' \item{`max_features`}{`integer(1)`\cr 20 | #' Maximum number of features. 21 | #' By default, number of features in [mlr3::Task].} 22 | #' } 23 | #' 24 | #' @family FSelector 25 | #' @export 26 | #' @template example 27 | FSelectorBatchExhaustiveSearch = R6Class("FSelectorBatchExhaustiveSearch", 28 | inherit = FSelectorBatch, 29 | public = list( 30 | 31 | #' @description 32 | #' Creates a new instance of this [R6][R6::R6Class] class. 
33 | initialize = function() { 34 | ps = ps( 35 | max_features = p_int(lower = 1L), 36 | batch_size = p_int(lower = 1L, tags = "required") 37 | ) 38 | ps$values = list(batch_size = 10L) 39 | 40 | super$initialize( 41 | id = "exhaustive_search", 42 | param_set = ps, 43 | properties = c("single-crit", "multi-crit"), 44 | label = "Exhaustive Search", 45 | man = "mlr3fselect::mlr_fselectors_exhaustive_search") 46 | } 47 | ), 48 | private = list( 49 | .optimize = function(inst) { 50 | pars = self$param_set$values 51 | feature_names = inst$archive$cols_x 52 | n_features = length(feature_names) 53 | 54 | fun = function(i, state) { 55 | state[i] = TRUE 56 | as.list(state) 57 | } 58 | 59 | states = set_col_names(rbindlist(unlist(map(seq(pars$max_features %??% n_features), function(n) { 60 | combn(n_features, n, fun, simplify = FALSE, state = logical(n_features)) 61 | }), recursive = FALSE)), feature_names) 62 | 63 | chunks = split(seq_row(states), ceiling(seq_along(seq_row(states)) / pars$batch_size)) 64 | walk(chunks, function(row_ids) inst$eval_batch(states[row_ids])) 65 | } 66 | ) 67 | ) 68 | 69 | mlr_fselectors$add("exhaustive_search", FSelectorBatchExhaustiveSearch) 70 | -------------------------------------------------------------------------------- /R/FSelectorBatchDesignPoints.R: -------------------------------------------------------------------------------- 1 | #' @title Feature Selection with Design Points 2 | #' 3 | #' @include mlr_fselectors.R 4 | #' @name mlr_fselectors_design_points 5 | #' 6 | #' @description 7 | #' Feature selection using user-defined feature sets. 8 | #' 9 | #' @details 10 | #' The feature sets are evaluated in order as given. 11 | #' 12 | #' The feature selection terminates itself when all feature sets are evaluated. 13 | #' It is not necessary to set a termination criterion. 
14 | #' 15 | #' @templateVar id design_points 16 | #' @template section_dictionary_fselectors 17 | #' 18 | #' @inheritSection bbotk::OptimizerBatchDesignPoints Parameters 19 | #' 20 | #' @family FSelector 21 | #' @export 22 | #' @examples 23 | #' # Feature Selection 24 | #' \donttest{ 25 | #' 26 | #' # retrieve task and load learner 27 | #' task = tsk("pima") 28 | #' learner = lrn("classif.rpart") 29 | #' 30 | #' # create design 31 | #' design = mlr3misc::rowwise_table( 32 | #' ~age, ~glucose, ~insulin, ~mass, ~pedigree, ~pregnant, ~pressure, ~triceps, 33 | #' TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, FALSE, TRUE, 34 | #' TRUE, TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, FALSE, 35 | #' TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, 36 | #' TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, TRUE 37 | #' ) 38 | #' 39 | #' # run feature selection on the Pima Indians diabetes data set 40 | #' instance = fselect( 41 | #' fselector = fs("design_points", design = design), 42 | #' task = task, 43 | #' learner = learner, 44 | #' resampling = rsmp("holdout"), 45 | #' measure = msr("classif.ce") 46 | #' ) 47 | #' 48 | #' # best performing feature set 49 | #' instance$result 50 | #' 51 | #' # all evaluated feature sets 52 | #' as.data.table(instance$archive) 53 | #' 54 | #' # subset the task and fit the final model 55 | #' task$select(instance$result_feature_set) 56 | #' learner$train(task) 57 | #' } 58 | FSelectorBatchDesignPoints = R6Class("FSelectorBatchDesignPoints", 59 | inherit = FSelectorBatchFromOptimizerBatch, 60 | public = list( 61 | 62 | #' @description 63 | #' Creates a new instance of this [R6][R6::R6Class] class. 
64 | initialize = function() { 65 | super$initialize( 66 | optimizer = OptimizerBatchDesignPoints$new(), 67 | man = "mlr3fselect::mlr_fselectors_design_points" 68 | ) 69 | } 70 | ) 71 | ) 72 | 73 | mlr_fselectors$add("design_points", FSelectorBatchDesignPoints) 74 | -------------------------------------------------------------------------------- /man/extract_inner_fselect_results.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/extract_inner_fselect_results.R 3 | \name{extract_inner_fselect_results} 4 | \alias{extract_inner_fselect_results} 5 | \title{Extract Inner Feature Selection Results} 6 | \usage{ 7 | extract_inner_fselect_results(x, fselect_instance, ...) 8 | } 9 | \arguments{ 10 | \item{x}{(\link[mlr3:ResampleResult]{mlr3::ResampleResult} | \link[mlr3:BenchmarkResult]{mlr3::BenchmarkResult}).} 11 | 12 | \item{fselect_instance}{(\code{logical(1)})\cr 13 | If \code{TRUE}, instances are added to the table.} 14 | 15 | \item{...}{(any)\cr 16 | Additional arguments.} 17 | } 18 | \value{ 19 | \code{\link[data.table:data.table]{data.table::data.table()}}. 20 | } 21 | \description{ 22 | Extract inner feature selection results of nested resampling. 23 | Implemented for \link[mlr3:ResampleResult]{mlr3::ResampleResult} and \link[mlr3:BenchmarkResult]{mlr3::BenchmarkResult}. 24 | } 25 | \details{ 26 | The function iterates over the \link{AutoFSelector} objects and binds the feature selection results to a \code{\link[data.table:data.table]{data.table::data.table()}}. 27 | \link{AutoFSelector} must be initialized with \code{store_fselect_instance = TRUE} and \code{resample()} or \code{benchmark()} must be called with \code{store_models = TRUE}. 28 | Optionally, the instance can be added for each iteration. 
29 | } 30 | \section{Data structure}{ 31 | 32 | 33 | The returned data table has the following columns: 34 | \itemize{ 35 | \item \code{experiment} (integer(1))\cr 36 | Index, giving the corresponding row number in the original benchmark grid. 37 | \item \code{iteration} (integer(1))\cr 38 | Iteration of the outer resampling. 39 | \item One column for each feature of the task. 40 | \item One column for each performance measure. 41 | \item \code{features} (character())\cr 42 | Vector of the selected features. 43 | \item \code{task_id} (\code{character(1)}). 44 | \item \code{learner_id} (\code{character(1)}). 45 | \item \code{resampling_id} (\code{character(1)}). 46 | } 47 | } 48 | 49 | \examples{ 50 | # Nested Resampling on Iris Data Set 51 | 52 | # create auto fselector 53 | at = auto_fselector( 54 | fselector = fs("random_search"), 55 | learner = lrn("classif.rpart"), 56 | resampling = rsmp("holdout"), 57 | measure = msr("classif.ce"), 58 | term_evals = 4) 59 | 60 | resampling_outer = rsmp("cv", folds = 2) 61 | rr = resample(tsk("iris"), at, resampling_outer, store_models = TRUE) 62 | 63 | # extract inner results 64 | extract_inner_fselect_results(rr) 65 | } 66 | -------------------------------------------------------------------------------- /tests/testthat/test_FSelectInstanceMultiCrit.R: -------------------------------------------------------------------------------- 1 | test_that("empty FSelectInstanceBatchMultiCrit works", { 2 | inst = TEST_MAKE_INST_2D() 3 | 4 | expect_data_table(inst$archive$data, nrows = 0L) 5 | expect_identical(inst$archive$n_evals, 0L) 6 | expect_identical(inst$archive$n_batch, 0L) 7 | expect_null(inst$result) 8 | }) 9 | 10 | test_that("eval_batch works", { 11 | inst = TEST_MAKE_INST_2D() 12 | 13 | xdt = data.table(x1 = list(TRUE, FALSE), x2 = list(FALSE, TRUE), 14 | x3 = list(TRUE, TRUE), x4 = list(TRUE, TRUE)) 15 | 16 | z = inst$eval_batch(xdt) 17 | expect_named(z, c("regr.mse", "regr.rmse")) 18 |
expect_identical(inst$archive$n_evals, 2L) 19 | expect_data_table(z, nrows = 2L) 20 | 21 | z = inst$eval_batch(xdt) 22 | expect_named(z, c("regr.mse", "regr.rmse")) 23 | expect_identical(inst$archive$n_evals, 4L) 24 | expect_data_table(z, nrows = 2L) 25 | }) 26 | 27 | test_that("objective_function works", { 28 | inst = TEST_MAKE_INST_2D() 29 | y = inst$objective_function(c(1, 1, 0, 0)) 30 | expect_named(y, c("regr.mse", "regr.rmse")) 31 | }) 32 | 33 | test_that("store_benchmark_result flag works", { 34 | inst = TEST_MAKE_INST_2D(store_benchmark_result = FALSE) 35 | xdt = data.table(x1 = list(TRUE, FALSE), x2 = list(FALSE, TRUE), 36 | x3 = list(TRUE, TRUE), x4 = list(TRUE, TRUE)) 37 | inst$eval_batch(xdt) 38 | 39 | expect_true("uhashes" %nin% colnames(inst$archive$data)) 40 | 41 | inst = TEST_MAKE_INST_2D(store_benchmark_result = TRUE) 42 | xdt = data.table(x1 = list(TRUE, FALSE), x2 = list(FALSE, TRUE), 43 | x3 = list(TRUE, TRUE), x4 = list(TRUE, TRUE)) 44 | inst$eval_batch(xdt) 45 | expect_r6(inst$archive$benchmark_result, "BenchmarkResult") 46 | }) 47 | 48 | test_that("result$features works", { 49 | inst = TEST_MAKE_INST_2D(store_benchmark_result = FALSE) 50 | xdt = data.table(x1 = list(TRUE, FALSE), x2 = list(FALSE, TRUE), 51 | x3 = list(TRUE, TRUE), x4 = list(TRUE, TRUE)) 52 | ydt = data.table(regr.mse = list(0.1, 0.2), 53 | regr.rmse = list(0.2, 0.1)) 54 | inst$assign_result(xdt, ydt) 55 | expect_list(inst$result_feature_set) 56 | expect_character(inst$result_feature_set[[1]]) 57 | 58 | inst = TEST_MAKE_INST_2D(store_benchmark_result = FALSE) 59 | xdt = data.table(x1 = list(TRUE), x2 = list(FALSE), 60 | x3 = list(TRUE), x4 = list(TRUE)) 61 | ydt = data.table(regr.mse = list(0.1), 62 | regr.rmse = list(0.2)) 63 | inst$assign_result(xdt, ydt) 64 | expect_list(inst$result_feature_set) 65 | expect_character(inst$result_feature_set[[1]]) 66 | }) 67 | -------------------------------------------------------------------------------- /R/fselect_nested.R: 
-------------------------------------------------------------------------------- 1 | #' @title Function for Nested Resampling 2 | #' 3 | #' @description 4 | #' Function to conduct nested resampling. 5 | #' 6 | #' @param inner_resampling ([mlr3::Resampling])\cr 7 | #' Resampling used for the inner loop. 8 | #' @param outer_resampling ([mlr3::Resampling])\cr 9 | #' Resampling used for the outer loop. 10 | #' 11 | #' @return [mlr3::ResampleResult] 12 | #' 13 | #' @template param_fselector 14 | #' @template param_task 15 | #' @template param_learner 16 | #' @template param_measure 17 | #' @template param_term_evals 18 | #' @template param_term_time 19 | #' @template param_terminator 20 | #' @template param_store_fselect_instance 21 | #' @template param_store_benchmark_result 22 | #' @template param_store_models 23 | #' @template param_check_values 24 | #' @template param_callbacks 25 | #' @template param_ties_method 26 | #' 27 | #' @export 28 | #' @examples 29 | #' # Nested resampling on Palmer Penguins data set 30 | #' rr = fselect_nested( 31 | #' fselector = fs("random_search"), 32 | #' task = tsk("penguins"), 33 | #' learner = lrn("classif.rpart"), 34 | #' inner_resampling = rsmp("holdout"), 35 | #' outer_resampling = rsmp("cv", folds = 2), 36 | #' measure = msr("classif.ce"), 37 | #' term_evals = 4) 38 | #' 39 | #' # Performance scores estimated on the outer resampling 40 | #' rr$score() 41 | #' 42 | #' # Unbiased performance estimate of the final model trained on the full data set 43 | #' rr$aggregate() 44 | fselect_nested = function( 45 | fselector, 46 | task, 47 | learner, 48 | inner_resampling, 49 | outer_resampling, 50 | measure = NULL, 51 | term_evals = NULL, 52 | term_time = NULL, 53 | terminator = NULL, 54 | store_fselect_instance = TRUE, 55 | store_benchmark_result = TRUE, 56 | store_models = FALSE, 57 | check_values = FALSE, 58 | callbacks = NULL, 59 | ties_method = "least_features" 60 | ) { 61 | assert_task(task) 62 | assert_resampling(inner_resampling) 63 |
assert_resampling(outer_resampling) 64 | terminator = terminator %??% terminator_selection(term_evals, term_time) 65 | 66 | afs = auto_fselector( 67 | learner = learner, 68 | resampling = inner_resampling, 69 | measure = measure, 70 | terminator = terminator, 71 | fselector = fselector, 72 | store_fselect_instance = store_fselect_instance, 73 | store_benchmark_result = store_benchmark_result, 74 | store_models = store_models, 75 | check_values = check_values, 76 | callbacks = callbacks, 77 | ties_method = ties_method) 78 | 79 | resample(task, afs, outer_resampling, store_models = TRUE) 80 | } 81 | -------------------------------------------------------------------------------- /inst/testthat/helper_misc.R: -------------------------------------------------------------------------------- 1 | TEST_MAKE_TSK = function(n = 4L) { 2 | x = set_names(map_dtc(seq(n), function(x) rnorm(100L)), paste0("x", seq(n))) 3 | y = rnorm(100) 4 | TaskRegr$new(id = "mlr3fselect", backend = cbind(x, y), target = "y") 5 | } 6 | 7 | TEST_MAKE_INST_1D = function(n = 4L, folds = 2L, store_models = TRUE, store_benchmark_result = TRUE, 8 | measure = msr("dummy"), terminator = trm("evals", n_evals = 10)) { 9 | FSelectInstanceBatchSingleCrit$new( 10 | task = TEST_MAKE_TSK(n), 11 | learner = lrn("regr.rpart"), 12 | resampling = rsmp("cv", folds = folds), 13 | measure = measure, 14 | terminator = terminator, 15 | store_models = store_models, 16 | store_benchmark_result = store_benchmark_result) 17 | } 18 | 19 | TEST_MAKE_INST_2D = function(n = 4L, folds = 2L, store_models = FALSE, store_benchmark_result = TRUE) { 20 | FSelectInstanceBatchMultiCrit$new( 21 | task = TEST_MAKE_TSK(n), 22 | learner = lrn("regr.rpart"), 23 | resampling = rsmp("cv", folds = folds), 24 | measures = msrs(c("regr.mse", "regr.rmse")), 25 | terminator = trm("evals", n_evals = 10), 26 | store_models = store_models, 27 | store_benchmark_result = store_benchmark_result) 28 | } 29 | 30 | MeasureDummy = R6Class("MeasureDummy", inherit =
MeasureRegr, 31 | public = list( 32 | initialize = function(score_design = NULL, minimize = FALSE) { 33 | if (is.null(score_design)) { 34 | score_design = data.table( 35 | score = c(1, 2, 4, 3), 36 | features = list("x1", c("x1", "x2"), c("x1", "x2", "x3"), c("x1", "x2", "x3", "x4")) 37 | ) 38 | } 39 | private$.score_design = score_design 40 | super$initialize(id = "dummy", range = c(0, 4), minimize = minimize, properties = c("requires_task", "requires_learner")) 41 | } 42 | ), 43 | private = list( 44 | .score = function(prediction, learner, task, ...) { 45 | score = private$.score_design[sapply(get("features"), identical, task$feature_names), score] 46 | if (length(score) == 0) 0 else score 47 | }, 48 | 49 | .score_design = NULL 50 | ) 51 | ) 52 | mlr3::mlr_measures$add("dummy", MeasureDummy) 53 | 54 | flush_redis = function() { 55 | config = redux::redis_config() 56 | r = redux::hiredis(config) 57 | r$FLUSHDB() 58 | } 59 | 60 | expect_rush_reset = function(rush, type = "kill") { 61 | rush$reset(type = type) 62 | # Sys.sleep(1) 63 | # keys = rush$connector$command(c("KEYS", "*")) 64 | # if (!test_list(keys, len = 0)) { 65 | # stopf("Found keys in redis after reset: %s", keys) 66 | # } 67 | mirai::daemons(0) 68 | } 69 | -------------------------------------------------------------------------------- /R/ContextAsyncFSelect.R: -------------------------------------------------------------------------------- 1 | #' @title Asynchronous Feature Selection Context 2 | #' 3 | #' @description 4 | #' A [CallbackAsyncFSelect] accesses and modifies data during the optimization via the `ContextAsyncFSelect`. 5 | #' See the section on active bindings for a list of modifiable objects. 6 | #' See [callback_async_fselect()] for a list of stages that access `ContextAsyncFSelect`. 7 | #' 8 | #' @details 9 | #' Changes to `$instance` and `$optimizer` in the stages executed on the workers are not reflected in the main process. 
10 | #' 11 | #' @template param_inst_async 12 | #' @template param_fselector 13 | #' 14 | #' @export 15 | ContextAsyncFSelect = R6Class("ContextAsyncFSelect", 16 | inherit = ContextAsync, 17 | public = list( 18 | 19 | #' @field auto_fselector ([AutoFSelector])\cr 20 | #' The [AutoFSelector] instance. 21 | auto_fselector = NULL 22 | ), 23 | 24 | active = list( 25 | 26 | #' @field xs_objective (`list()`)\cr 27 | #' The feature subset currently evaluated. 28 | xs_objective = function(rhs) { 29 | if (missing(rhs)) { 30 | return(get_private(self$instance$objective)$.xs) 31 | } else { 32 | self$instance$objective$.__enclos_env__$private$.xs = rhs 33 | } 34 | }, 35 | 36 | #' @field resample_result ([mlr3::BenchmarkResult])\cr 37 | #' The resample result of the feature subset currently evaluated. 38 | resample_result = function(rhs) { 39 | if (missing(rhs)) { 40 | return(get_private(self$instance$objective)$.resample_result) 41 | } else { 42 | self$instance$objective$.__enclos_env__$private$.resample_result = rhs 43 | } 44 | }, 45 | 46 | #' @field aggregated_performance (`list()`)\cr 47 | #' Aggregated performance scores and training time of the evaluated feature subset. 48 | #' This list is passed to the archive. 49 | #' A callback can add additional elements which are also written to the archive. 50 | aggregated_performance = function(rhs) { 51 | if (missing(rhs)) { 52 | return(get_private(self$instance$objective)$.aggregated_performance) 53 | } else { 54 | self$instance$objective$.__enclos_env__$private$.aggregated_performance = rhs 55 | } 56 | }, 57 | 58 | #' @field result_feature_set (character())\cr 59 | #' The feature set passed to `instance$assign_result()`. 
60 | result_feature_set = function(rhs) { 61 | if (missing(rhs)) { 62 | return(get_private(self$instance)$.result_feature_set) 63 | } else { 64 | self$instance$.__enclos_env__$private$.result_feature_set = rhs 65 | } 66 | } 67 | ) 68 | ) 69 | -------------------------------------------------------------------------------- /R/ContextBatchFSelect.R: -------------------------------------------------------------------------------- 1 | #' @title Evaluation Context 2 | #' 3 | #' @description 4 | #' The [ContextBatchFSelect] allows [CallbackBatchFSelect]s to access and modify data while a batch of feature sets is evaluated. 5 | #' See the section on active bindings for a list of modifiable objects. 6 | #' See [callback_batch_fselect()] for a list of stages that access [ContextBatchFSelect]. 7 | #' 8 | #' @details 9 | #' This context is re-created each time a new batch of feature sets is evaluated. 10 | #' Changes to `$objective_fselect`, `$design`, and `$benchmark_result` are discarded after the function is finished. 11 | #' Modifications to the data table in `$aggregated_performance` are written to the archive. 12 | #' Any number of columns can be added. 13 | #' 14 | #' @export 15 | ContextBatchFSelect = R6Class("ContextBatchFSelect", 16 | inherit = ContextBatch, 17 | public = list( 18 | 19 | #' @field auto_fselector ([AutoFSelector])\cr 20 | #' The [AutoFSelector] instance. 21 | auto_fselector = NULL 22 | ), 23 | 24 | active = list( 25 | #' @field xss (list())\cr 26 | #' The feature sets of the latest batch. 27 | xss = function(rhs) { 28 | if (missing(rhs)) { 29 | return(get_private(self$instance$objective)$.xss) 30 | } else { 31 | get_private(self$instance$objective)$.xss = rhs 32 | } 33 | }, 34 | 35 | #' @field design ([data.table::data.table])\cr 36 | #' The benchmark design of the latest batch.
37 | design = function(rhs) { 38 | if (missing(rhs)) { 39 | return(get_private(self$instance$objective)$.design) 40 | } else { 41 | get_private(self$instance$objective)$.design = rhs 42 | } 43 | }, 44 | 45 | #' @field benchmark_result ([mlr3::BenchmarkResult])\cr 46 | #' The benchmark result of the latest batch. 47 | benchmark_result = function(rhs) { 48 | if (missing(rhs)) { 49 | return(get_private(self$instance$objective)$.benchmark_result) 50 | } else { 51 | get_private(self$instance$objective)$.benchmark_result = rhs 52 | } 53 | }, 54 | 55 | #' @field aggregated_performance ([data.table::data.table])\cr 56 | #' Aggregated performance scores and training time of the latest batch. 57 | #' This data table is passed to the archive. 58 | #' A callback can add additional columns which are also written to the archive. 59 | aggregated_performance = function(rhs) { 60 | if (missing(rhs)) { 61 | return(get_private(self$instance$objective)$.aggregated_performance) 62 | } else { 63 | get_private(self$instance$objective)$.aggregated_performance = rhs 64 | } 65 | } 66 | ) 67 | ) 68 | -------------------------------------------------------------------------------- /R/FSelectorAsyncExhaustiveSearch.R: -------------------------------------------------------------------------------- 1 | #' @title Feature Selection with Asynchronous Exhaustive Search 2 | #' 3 | #' @include FSelectorAsync.R 4 | #' @name mlr_fselectors_async_exhaustive_search 5 | #' 6 | #' @description 7 | #' Feature Selection using the Asynchronous Exhaustive Search Algorithm. 8 | #' Exhaustive Search generates all possible feature sets. 9 | #' The feature sets are evaluated asynchronously. 10 | #' 11 | #' @details 12 | #' The feature selection terminates itself when all feature sets are evaluated. 13 | #' It is not necessary to set a termination criterion. 
14 | #' 15 | #' @templateVar id async_exhaustive_search 16 | #' @template section_dictionary_fselectors 17 | #' 18 | #' @section Control Parameters: 19 | #' \describe{ 20 | #' \item{`max_features`}{`integer(1)`\cr 21 | #' Maximum number of features. 22 | #' By default, number of features in [mlr3::Task].} 23 | #' } 24 | #' 25 | #' @family FSelectorAsync 26 | #' @export 27 | FSelectorAsyncExhaustiveSearch = R6Class("FSelectorAsyncExhaustiveSearch", 28 | inherit = FSelectorAsync, 29 | public = list( 30 | 31 | #' @description 32 | #' Creates a new instance of this [R6][R6::R6Class] class. 33 | initialize = function() { 34 | ps = ps( 35 | max_features = p_int(lower = 1L) 36 | ) 37 | 38 | super$initialize( 39 | id = "async_exhaustive_search", 40 | param_set = ps, 41 | properties = c("single-crit", "multi-crit", "async"), 42 | packages = "rush", 43 | label = "Asynchronous Exhaustive Search", 44 | man = "mlr3fselect::mlr_fselectors_async_exhaustive_search") 45 | }, 46 | 47 | #' @description 48 | #' Starts the asynchronous optimization. 49 | #' 50 | #' @param inst ([FSelectInstanceAsyncSingleCrit] | [FSelectInstanceAsyncMultiCrit]). 51 | #' @return [data.table::data.table]. 
52 | optimize = function(inst) { 53 | pars = self$param_set$values 54 | feature_names = inst$archive$cols_x 55 | n_features = length(feature_names) 56 | 57 | fun = function(i, state) { 58 | state[i] = TRUE 59 | as.list(state) 60 | } 61 | 62 | states = set_col_names(rbindlist(unlist(map(seq(pars$max_features %??% n_features), function(n) { 63 | combn(n_features, n, fun, simplify = FALSE, state = logical(n_features)) 64 | }), recursive = FALSE)), feature_names) 65 | 66 | optimize_async_default(inst, self, states) 67 | } 68 | ), 69 | 70 | private = list( 71 | .optimize = function(inst) { 72 | # evaluate feature sets 73 | get_private(inst)$.eval_queue() 74 | } 75 | ) 76 | ) 77 | 78 | mlr_fselectors$add("async_exhaustive_search", FSelectorAsyncExhaustiveSearch) 79 | -------------------------------------------------------------------------------- /tests/testthat/test_auto_fselector.R: -------------------------------------------------------------------------------- 1 | test_that("auto_fselector function works", { 2 | afs = auto_fselector(fselector = fs("random_search", batch_size = 10), learner = lrn("classif.rpart"), resampling = rsmp ("holdout"), 3 | measure = msr("classif.ce"), term_evals = 50) 4 | 5 | expect_class(afs, "AutoFSelector") 6 | expect_class(afs$instance_args$terminator, "TerminatorEvals") 7 | 8 | afs = auto_fselector(fselector = fs("random_search", batch_size = 10), learner = lrn("classif.rpart"), resampling = rsmp ("holdout"), 9 | measure = msr("classif.ce"), term_time = 50) 10 | 11 | expect_class(afs, "AutoFSelector") 12 | expect_class(afs$instance_args$terminator, "TerminatorRunTime") 13 | 14 | afs = auto_fselector(fselector = fs("random_search", batch_size = 10), learner = lrn("classif.rpart"), resampling = rsmp ("holdout"), 15 | measure = msr("classif.ce"), term_evals = 10, term_time = 50) 16 | 17 | expect_class(afs, "AutoFSelector") 18 | expect_class(afs$instance_args$terminator, "TerminatorCombo") 19 | }) 20 | 21 | # Async 
------------------------------------------------------------------------ 22 | 23 | test_that("async auto fselector works", { 24 | skip_on_cran() 25 | skip_if_not_installed("rush") 26 | flush_redis() 27 | 28 | mirai::daemons(2) 29 | rush::rush_plan(n_workers = 2, worker_type = "remote") 30 | 31 | afs = auto_fselector( 32 | fselector = fs("async_random_search"), 33 | learner = lrn("classif.rpart"), 34 | resampling = rsmp("cv", folds = 3), 35 | measure = msr("classif.ce"), 36 | terminator = trm("evals", n_evals = 3) 37 | ) 38 | 39 | expect_class(afs, "AutoFSelector") 40 | afs$train(tsk("pima")) 41 | 42 | expect_class(afs$fselect_instance, "FSelectInstanceAsyncSingleCrit") 43 | expect_rush_reset(afs$fselect_instance$rush, type = "kill") 44 | }) 45 | 46 | test_that("async auto fselector works with rush controller", { 47 | skip_on_cran() 48 | skip_if_not_installed("rush") 49 | flush_redis() 50 | 51 | on.exit(mirai::daemons(0)) 52 | mirai::daemons(2) 53 | rush::rush_plan(n_workers = 2, worker_type = "remote") 54 | rush = rush::rsh(network_id = "fselect_network") 55 | 56 | afs = auto_fselector( 57 | fselector = fs("async_random_search"), 58 | learner = lrn("classif.rpart"), 59 | resampling = rsmp("cv", folds = 3), 60 | measure = msr("classif.ce"), 61 | terminator = trm("evals", n_evals = 3), 62 | rush = rush 63 | ) 64 | 65 | expect_class(afs, "AutoFSelector") 66 | expect_class(afs$instance_args$rush, "Rush") 67 | afs$train(tsk("pima")) 68 | 69 | expect_class(afs$fselect_instance, "FSelectInstanceAsyncSingleCrit") 70 | expect_rush_reset(afs$fselect_instance$rush, type = "kill") 71 | }) 72 | -------------------------------------------------------------------------------- /R/ObjectiveFSelect.R: -------------------------------------------------------------------------------- 1 | #' @title Class for Feature Selection Objective 2 | #' 3 | #' @description 4 | #' Stores the objective function that estimates the performance of feature subsets. 
5 | #' This class is usually constructed internally by the [FSelectInstanceBatchSingleCrit] / [FSelectInstanceBatchMultiCrit]. 6 | #' 7 | #' @template param_task 8 | #' @template param_learner 9 | #' @template param_resampling 10 | #' @template param_measures 11 | #' @template param_store_models 12 | #' @template param_check_values 13 | #' @template param_store_benchmark_result 14 | #' @template param_callbacks 15 | #' 16 | #' @export 17 | ObjectiveFSelect = R6Class("ObjectiveFSelect", 18 | inherit = Objective, 19 | public = list( 20 | 21 | #' @field task ([mlr3::Task]). 22 | task = NULL, 23 | 24 | #' @field learner ([mlr3::Learner]). 25 | learner = NULL, 26 | 27 | #' @field resampling ([mlr3::Resampling]). 28 | resampling = NULL, 29 | 30 | #' @field measures (list of [mlr3::Measure]). 31 | measures = NULL, 32 | 33 | #' @field store_models (`logical(1)`). 34 | store_models = NULL, 35 | 36 | #' @field store_benchmark_result (`logical(1)`). 37 | store_benchmark_result = NULL, 38 | 39 | #' @field callbacks (List of [CallbackBatchFSelect]s). 40 | callbacks = NULL, 41 | 42 | #' @description 43 | #' Creates a new instance of this [R6][R6::R6Class] class. 
44 | initialize = function( 45 | task, 46 | learner, 47 | resampling, 48 | measures, 49 | check_values = TRUE, 50 | store_benchmark_result = TRUE, 51 | store_models = FALSE, 52 | callbacks = NULL 53 | ) { 54 | self$task = assert_task(as_task(task, clone = TRUE)) 55 | self$learner = assert_learner(as_learner(learner, clone = TRUE), task = self$task) 56 | self$measures = assert_measures(as_measures(measures, clone = TRUE), task = self$task, learner = self$learner) 57 | self$store_models = assert_flag(store_models) 58 | self$store_benchmark_result = assert_flag(store_benchmark_result) || self$store_models 59 | self$callbacks = assert_callbacks(as_callbacks(callbacks)) 60 | 61 | super$initialize( 62 | id = sprintf("%s_on_%s", self$learner$id, self$task$id), 63 | properties = "noisy", 64 | domain = task_to_domain(self$task), 65 | codomain = measures_to_codomain(self$measures), 66 | constants = ps(resampling = p_uty()), 67 | check_values = check_values) 68 | 69 | # set resamplings in constants 70 | resampling = assert_resampling(as_resampling(resampling, clone = TRUE)) 71 | if (!resampling$is_instantiated) resampling$instantiate(task) 72 | self$resampling = resampling 73 | self$constants$values$resampling = list(resampling) 74 | } 75 | ) 76 | ) 77 | -------------------------------------------------------------------------------- /tests/testthat/test_fsi.R: -------------------------------------------------------------------------------- 1 | test_that("fsi function creates a FSelectInstanceBatchSingleCrit", { 2 | instance = fsi( 3 | task = tsk("pima"), 4 | learner = lrn("classif.rpart"), 5 | resampling = rsmp ("holdout"), 6 | measures = msr("classif.ce"), 7 | terminator = trm("evals", n_evals = 2)) 8 | expect_class(instance, "FSelectInstanceBatchSingleCrit") 9 | }) 10 | 11 | test_that("fsi function creates a FSelectInstanceBatchMultiCrit", { 12 | instance = fsi( 13 | task = tsk("pima"), 14 | learner = lrn("classif.rpart"), 15 | resampling = rsmp ("holdout"), 16 | measures 
= msrs(c("classif.ce", "classif.acc")), 17 | terminator = trm("evals", n_evals = 2)) 18 | expect_class(instance, "FSelectInstanceBatchMultiCrit") 19 | }) 20 | 21 | test_that("fsi and FSelectInstanceBatchSingleCrit are equal", { 22 | fsi_args = formalArgs(fsi) 23 | fsi_args[fsi_args == "measures"] = "measure" 24 | 25 | expect_equal(fsi_args, formalArgs(FSelectInstanceBatchSingleCrit$public_methods$initialize)) 26 | 27 | task = tsk("pima") 28 | learner = lrn("classif.rpart") 29 | resampling = rsmp ("holdout") 30 | measures = msr("classif.ce") 31 | terminator = trm("evals", n_evals = 2) 32 | store_benchmark_result = FALSE 33 | store_models = TRUE 34 | check_values = TRUE 35 | callbacks = clbk("mlr3fselect.backup") 36 | resampling$instantiate(task) 37 | 38 | instance_1 = FSelectInstanceBatchSingleCrit$new(task, learner, resampling, measures, terminator, store_benchmark_result, store_models, check_values, callbacks) 39 | instance_2 = fsi(task, learner, resampling, measures, terminator, store_benchmark_result, store_models, check_values, callbacks) 40 | 41 | suppressWarnings(expect_equal(instance_1, instance_2)) 42 | }) 43 | 44 | test_that("fsi and FSelectInstanceBatchMultiCrit are equal", { 45 | fsi_args = formalArgs(fsi) 46 | fsi_args = fsi_args[fsi_args != "ties_method"] 47 | 48 | expect_equal(fsi_args, formalArgs(FSelectInstanceBatchMultiCrit$public_methods$initialize)) 49 | 50 | task = tsk("pima") 51 | learner = lrn("classif.rpart") 52 | resampling = rsmp ("holdout") 53 | measures = msrs(c("classif.ce", "classif.acc")) 54 | terminator = trm("evals", n_evals = 2) 55 | store_benchmark_result = FALSE 56 | store_models = TRUE 57 | check_values = TRUE 58 | callbacks = clbk("mlr3fselect.backup") 59 | resampling$instantiate(task) 60 | 61 | instance_1 = FSelectInstanceBatchMultiCrit$new(task, learner, resampling, measures, terminator, store_benchmark_result, store_models, check_values, callbacks) 62 | instance_2 = fsi(task, learner, resampling, measures, terminator, 
store_benchmark_result, store_models, check_values, callbacks) 63 | 64 | suppressWarnings(expect_equal(instance_1, instance_2)) 65 | }) 66 | 67 | -------------------------------------------------------------------------------- /man/extract_inner_fselect_archives.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/extract_inner_fselect_archives.R 3 | \name{extract_inner_fselect_archives} 4 | \alias{extract_inner_fselect_archives} 5 | \title{Extract Inner Feature Selection Archives} 6 | \usage{ 7 | extract_inner_fselect_archives(x, exclude_columns = "uhash") 8 | } 9 | \arguments{ 10 | \item{x}{(\link[mlr3:ResampleResult]{mlr3::ResampleResult} | \link[mlr3:BenchmarkResult]{mlr3::BenchmarkResult}).} 11 | 12 | \item{exclude_columns}{(\code{character()})\cr 13 | Exclude columns from result table. Set to \code{NULL} if no column should be 14 | excluded.} 15 | } 16 | \value{ 17 | \code{\link[data.table:data.table]{data.table::data.table()}}. 18 | } 19 | \description{ 20 | Extract inner feature selection archives of nested resampling. 21 | Implemented for \link[mlr3:ResampleResult]{mlr3::ResampleResult} and \link[mlr3:BenchmarkResult]{mlr3::BenchmarkResult}. 22 | The function iterates over the \link{AutoFSelector} objects and binds the archives to a \code{\link[data.table:data.table]{data.table::data.table()}}. 23 | \link{AutoFSelector} must be initialized with \code{store_fselect_instance = TRUE} and \code{resample()} or \code{benchmark()} must be called with \code{store_models = TRUE}. 24 | } 25 | \section{Data structure}{ 26 | 27 | 28 | The returned data table has the following columns: 29 | \itemize{ 30 | \item \code{experiment} (integer(1))\cr 31 | Index, giving the according row number in the original benchmark grid. 32 | \item \code{iteration} (integer(1))\cr 33 | Iteration of the outer resampling. 34 | \item One column for each feature of the task. 
35 | \item One column for each performance measure. 36 | \item \code{runtime_learners} (\code{numeric(1)})\cr 37 | Sum of training and predict times logged in learners per 38 | \link[mlr3:ResampleResult]{mlr3::ResampleResult} / evaluation. This does not include potential 39 | overhead time. 40 | \item \code{timestamp} (\code{POSIXct})\cr 41 | Time stamp when the evaluation was logged into the archive. 42 | \item \code{batch_nr} (\code{integer(1)})\cr 43 | Feature sets are evaluated in batches. Each batch has a unique batch 44 | number. 45 | \item \code{resample_result} (\link[mlr3:ResampleResult]{mlr3::ResampleResult})\cr 46 | Resample result of the inner resampling. 47 | \item \code{task_id} (\code{character(1)}). 48 | \item \code{learner_id} (\code{character(1)}). 49 | \item \code{resampling_id} (\code{character(1)}). 50 | } 51 | } 52 | 53 | \examples{ 54 | # Nested Resampling on Palmer Penguins Data Set 55 | 56 | # create auto fselector 57 | at = auto_fselector( 58 | fselector = fs("random_search"), 59 | learner = lrn("classif.rpart"), 60 | resampling = rsmp ("holdout"), 61 | measure = msr("classif.ce"), 62 | term_evals = 4) 63 | 64 | resampling_outer = rsmp("cv", folds = 2) 65 | rr = resample(tsk("penguins"), at, resampling_outer, store_models = TRUE) 66 | 67 | # extract inner archives 68 | extract_inner_fselect_archives(rr) 69 | } 70 | -------------------------------------------------------------------------------- /NAMESPACE: -------------------------------------------------------------------------------- 1 | # Generated by roxygen2: do not edit by hand 2 | 3 | S3method(as.data.table,ArchiveAsyncFSelect) 4 | S3method(as.data.table,ArchiveAsyncFSelectFrozen) 5 | S3method(as.data.table,ArchiveBatchFSelect) 6 | S3method(as.data.table,DictionaryFSelector) 7 | S3method(as.data.table,EnsembleFSResult) 8 | S3method(c,EnsembleFSResult) 9 | S3method(extract_inner_fselect_archives,BenchmarkResult) 10 | S3method(extract_inner_fselect_archives,ResampleResult) 11 | 
S3method(extract_inner_fselect_results,BenchmarkResult) 12 | S3method(extract_inner_fselect_results,ResampleResult) 13 | export(ArchiveAsyncFSelect) 14 | export(ArchiveAsyncFSelectFrozen) 15 | export(ArchiveBatchFSelect) 16 | export(AutoFSelector) 17 | export(CallbackAsyncFSelect) 18 | export(ContextAsyncFSelect) 19 | export(ContextBatchFSelect) 20 | export(EnsembleFSResult) 21 | export(FSelectInstanceAsyncMultiCrit) 22 | export(FSelectInstanceAsyncSingleCrit) 23 | export(FSelectInstanceBatchMultiCrit) 24 | export(FSelectInstanceBatchSingleCrit) 25 | export(FSelector) 26 | export(FSelectorAsync) 27 | export(FSelectorAsyncDesignPoints) 28 | export(FSelectorAsyncExhaustiveSearch) 29 | export(FSelectorAsyncFromOptimizerAsync) 30 | export(FSelectorAsyncRandomSearch) 31 | export(FSelectorBatch) 32 | export(FSelectorBatchDesignPoints) 33 | export(FSelectorBatchExhaustiveSearch) 34 | export(FSelectorBatchFromOptimizerBatch) 35 | export(FSelectorBatchGeneticSearch) 36 | export(FSelectorBatchRFE) 37 | export(FSelectorBatchRFECV) 38 | export(FSelectorBatchRandomSearch) 39 | export(FSelectorBatchSequential) 40 | export(FSelectorBatchShadowVariableSearch) 41 | export(ObjectiveFSelect) 42 | export(ObjectiveFSelectAsync) 43 | export(ObjectiveFSelectBatch) 44 | export(assert_async_fselect_callback) 45 | export(assert_async_fselect_callbacks) 46 | export(assert_fselect_instance) 47 | export(assert_fselect_instance_async) 48 | export(assert_fselect_instance_batch) 49 | export(assert_fselector_async) 50 | export(assert_fselector_batch) 51 | export(assert_fselectors) 52 | export(auto_fselector) 53 | export(callback_async_fselect) 54 | export(callback_batch_fselect) 55 | export(clbk) 56 | export(clbks) 57 | export(embedded_ensemble_fselect) 58 | export(ensemble_fselect) 59 | export(extract_inner_fselect_archives) 60 | export(extract_inner_fselect_results) 61 | export(faggregate) 62 | export(fs) 63 | export(fselect) 64 | export(fselect_nested) 65 | export(fsi) 66 | export(fsi_async) 67 
| export(fss) 68 | export(mlr_callbacks) 69 | export(mlr_fselectors) 70 | export(mlr_terminators) 71 | export(trm) 72 | export(trms) 73 | import(bbotk) 74 | import(checkmate) 75 | import(cli) 76 | import(data.table) 77 | import(mlr3) 78 | import(mlr3misc) 79 | import(paradox) 80 | importFrom(R6,R6Class) 81 | importFrom(bbotk,mlr_terminators) 82 | importFrom(bbotk,trm) 83 | importFrom(bbotk,trms) 84 | importFrom(mlr3misc,clbk) 85 | importFrom(mlr3misc,clbks) 86 | importFrom(mlr3misc,mlr_callbacks) 87 | importFrom(stats,sd) 88 | importFrom(utils,bibentry) 89 | importFrom(utils,combn) 90 | importFrom(utils,head) 91 | importFrom(utils,packageVersion) 92 | importFrom(utils,tail) 93 | -------------------------------------------------------------------------------- /man/ObjectiveFSelectAsync.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/ObjectiveFSelectAsync.R 3 | \name{ObjectiveFSelectAsync} 4 | \alias{ObjectiveFSelectAsync} 5 | \title{Class for Feature Selection Objective} 6 | \description{ 7 | Stores the objective function that estimates the performance of feature subsets. 8 | This class is usually constructed internally by the \link{FSelectInstanceAsyncSingleCrit} or \link{FSelectInstanceAsyncMultiCrit}. 9 | } 10 | \section{Super classes}{ 11 | \code{\link[bbotk:Objective]{bbotk::Objective}} -> \code{\link[mlr3fselect:ObjectiveFSelect]{mlr3fselect::ObjectiveFSelect}} -> \code{ObjectiveFSelectAsync} 12 | } 13 | \section{Methods}{ 14 | \subsection{Public methods}{ 15 | \itemize{ 16 | \item \href{#method-ObjectiveFSelectAsync-clone}{\code{ObjectiveFSelectAsync$clone()}} 17 | } 18 | } 19 | \if{html}{\out{ 20 |
Inherited methods 21 | 30 |
31 | }} 32 | \if{html}{\out{
}} 33 | \if{html}{\out{}} 34 | \if{latex}{\out{\hypertarget{method-ObjectiveFSelectAsync-clone}{}}} 35 | \subsection{Method \code{clone()}}{ 36 | The objects of this class are cloneable with this method. 37 | \subsection{Usage}{ 38 | \if{html}{\out{
}}\preformatted{ObjectiveFSelectAsync$clone(deep = FALSE)}\if{html}{\out{
}} 39 | } 40 | 41 | \subsection{Arguments}{ 42 | \if{html}{\out{
}} 43 | \describe{ 44 | \item{\code{deep}}{Whether to make a deep clone.} 45 | } 46 | \if{html}{\out{
}} 47 | } 48 | } 49 | } 50 | -------------------------------------------------------------------------------- /R/faggregate.R: -------------------------------------------------------------------------------- 1 | #' @title Fast Aggregation of ResampleResults and BenchmarkResults 2 | #' 3 | #' @description 4 | #' Aggregates a [mlr3::ResampleResult] or [mlr3::BenchmarkResult] for a single simple measure. 5 | #' Returns the aggregated score for each resample result. 6 | #' 7 | #' @details 8 | #' This function is faster than `$aggregate()` because it does not reassemble the resampling results. 9 | #' It only works on simple measures which do not require the task, learner, model or train set to be available. 10 | #' 11 | #' @param obj ([mlr3::ResampleResult] | [mlr3::BenchmarkResult]). 12 | #' @param measure ([mlr3::Measure]). 13 | #' @param conditions (`logical(1)`)\cr 14 | #' If `TRUE`, the function returns the number of warnings and the number of errors. 15 | #' 16 | #' @return ([data.table::data.table()]) 17 | #' 18 | #' @export 19 | faggregate = function(obj, measure, conditions = FALSE) { 20 | tab = fscore(obj, measure, conditions = conditions) 21 | aggregator = measure$aggregator %??% mean 22 | if (conditions) { 23 | set_names(tab[, list( 24 | score = aggregator(get(measure$id)), 25 | warnings = sum(warnings), 26 | errors = sum(errors)), 27 | by = "uhash"], c("uhash", measure$id, "warnings", "errors"))[, -c("uhash"), with = FALSE] 28 | } else { 29 | set_names(tab[, list(score = aggregator(get(measure$id))), by = "uhash"], c("uhash", measure$id))[, -c("uhash"), with = FALSE] 30 | } 31 | } 32 | 33 | fscore = function(obj, measure, conditions = FALSE) { 34 | data = get_private(obj)$.data$data 35 | # sort by uhash 36 | tab = data$fact[data$uhashes, c("iteration", "prediction", "uhash", "learner_state"), with = FALSE] 37 | set(tab, j = measure$id, value = map_dbl(tab$prediction, fscore_single_measure, measure = measure)) 38 | cns = c("uhash", measure$id) 39 | if 
(conditions) { 40 | set(tab, j = "warnings", value = map_int(tab$learner_state, function(s) sum(s$log$class == "warning"))) 41 | set(tab, j = "errors", value = map_int(tab$learner_state, function(s) sum(s$log$class == "error"))) 42 | cns = c(cns, "warnings", "errors") 43 | } 44 | tab[, cns, with = FALSE] 45 | } 46 | 47 | fscore_single_measure = function(prediction, measure) { 48 | # no predict sets 49 | if (!length(measure$predict_sets)) { 50 | score = get_private(measure)$.score(prediction = NULL, task = NULL) 51 | return(score) 52 | } 53 | 54 | # merge multiple predictions (on different predict sets) to a single one 55 | if (is.list(prediction)) { 56 | ii = match(measure$predict_sets, names(prediction)) 57 | if (anyMissing(ii)) { 58 | return(NaN) 59 | } 60 | prediction = do.call(c, prediction[ii]) 61 | } 62 | 63 | # convert pdata to regular prediction 64 | prediction = as_prediction(prediction, check = FALSE) 65 | 66 | if (is.null(prediction) && length(measure$predict_sets)) { 67 | return(NaN) 68 | } 69 | 70 | if (!is_scalar_na(measure$predict_type) && measure$predict_type %nin% prediction$predict_types) { 71 | return(NaN) 72 | } 73 | 74 | get_private(measure)$.score(prediction = prediction, task = NULL, weights = if (measure$use_weights == "use") prediction$weights) 75 | } 76 | -------------------------------------------------------------------------------- /tests/testthat/test_fselect.R: -------------------------------------------------------------------------------- 1 | test_that("fselect function works with single measure", { 2 | instance = fselect(fselector = fs("random_search", batch_size = 1), task = tsk("pima"), learner = lrn("classif.rpart"), resampling = rsmp ("holdout"), 3 | measures = msr("classif.ce"), term_evals = 2) 4 | 5 | expect_class(instance, "FSelectInstanceBatchSingleCrit") 6 | expect_data_table(instance$archive$data, nrows = 2) 7 | expect_class(instance$terminator, "TerminatorEvals") 8 | }) 9 | 10 | test_that("fselect function works with 
multiple measures", { 11 | instance = fselect(fselector = fs("random_search", batch_size = 1), task = tsk("pima"), learner = lrn("classif.rpart"), resampling = rsmp ("holdout"), 12 | measures = msrs(c("classif.ce", "classif.acc")), term_evals = 2) 13 | 14 | expect_class(instance, "FSelectInstanceBatchMultiCrit") 15 | expect_data_table(instance$archive$data, nrows = 2) 16 | expect_class(instance$terminator, "TerminatorEvals") 17 | }) 18 | 19 | test_that("fselect function accepts string input for method", { 20 | instance = fselect(fselector = fs("random_search", batch_size = 1), task = tsk("pima"), learner = lrn("classif.rpart"), resampling = rsmp ("holdout"), 21 | measures = msr("classif.ce"), term_evals = 2) 22 | 23 | expect_class(instance, "FSelectInstanceBatchSingleCrit") 24 | expect_data_table(instance$archive$data, nrows = 2) 25 | expect_class(instance$terminator, "TerminatorEvals") 26 | }) 27 | 28 | test_that("fselect interface is equal to FSelectInstanceBatchSingleCrit", { 29 | fselect_args = formalArgs(fselect) 30 | fselect_args = fselect_args[fselect_args != "fselector"] 31 | fselect_args[fselect_args == "measures"] = "measure" 32 | 33 | instance_args = formalArgs(FSelectInstanceBatchSingleCrit$public_methods$initialize) 34 | instance_args = c(instance_args, "term_evals", "term_time", "rush") 35 | 36 | expect_set_equal(fselect_args, instance_args) 37 | }) 38 | 39 | test_that("fselect interface is equal to FSelectInstanceBatchMultiCrit", { 40 | fselect_args = formalArgs(fselect) 41 | fselect_args = fselect_args[fselect_args %nin% c("fselector", "ties_method")] 42 | 43 | instance_args = formalArgs(FSelectInstanceBatchMultiCrit$public_methods$initialize) 44 | instance_args = c(instance_args, "term_evals", "term_time", "rush") 45 | 46 | expect_set_equal(fselect_args, instance_args) 47 | }) 48 | 49 | test_that("fselect interface is equal to FSelectInstanceAsyncSingleCrit", { 50 | fselect_args = formalArgs(fselect) 51 | fselect_args = fselect_args[fselect_args 
%nin% c("fselector")] 52 | fselect_args[fselect_args == "measures"] = "measure" 53 | 54 | instance_args = formalArgs(FSelectInstanceAsyncSingleCrit$public_methods$initialize) 55 | instance_args = c(instance_args, "term_evals", "term_time") 56 | 57 | expect_set_equal(fselect_args, instance_args) 58 | }) 59 | 60 | test_that("fselect interface is equal to FSelectInstanceAsyncMultiCrit", { 61 | fselect_args = formalArgs(fselect) 62 | fselect_args = fselect_args[fselect_args %nin% c("fselector", "ties_method")] 63 | 64 | instance_args = formalArgs(FSelectInstanceAsyncMultiCrit$public_methods$initialize) 65 | instance_args = c(instance_args, "term_evals", "term_time") 66 | 67 | expect_set_equal(fselect_args, instance_args) 68 | }) 69 | -------------------------------------------------------------------------------- /R/FSelectorBatchRandomSearch.R: -------------------------------------------------------------------------------- 1 | #' @title Feature Selection with Random Search 2 | #' 3 | #' @include mlr_fselectors.R 4 | #' @name mlr_fselectors_random_search 5 | #' 6 | #' @description 7 | #' Feature selection using Random Search Algorithm. 8 | #' 9 | #' @details 10 | #' The feature sets are randomly drawn. 11 | #' The sets are evaluated in batches of size `batch_size`. 12 | #' Larger batches mean we can parallelize more, smaller batches imply a more fine-grained checking of termination criteria. 13 | #' 14 | #' @templateVar id random_search 15 | #' @template section_dictionary_fselectors 16 | #' 17 | #' @section Control Parameters: 18 | #' \describe{ 19 | #' \item{`max_features`}{`integer(1)`\cr 20 | #' Maximum number of features. 
21 | #' By default, number of features in [mlr3::Task].} 22 | #' \item{`batch_size`}{`integer(1)`\cr 23 | #' Maximum number of feature sets to try in a batch.} 24 | #' } 25 | #' 26 | #' @source 27 | #' `r format_bib("bergstra_2012")` 28 | #' 29 | #' @family FSelector 30 | #' @export 31 | #' @examples 32 | #' # Feature Selection 33 | #' \donttest{ 34 | #' 35 | #' # retrieve task and load learner 36 | #' task = tsk("penguins") 37 | #' learner = lrn("classif.rpart") 38 | #' 39 | #' # run feature selection on the Palmer Penguins data set 40 | #' instance = fselect( 41 | #' fselector = fs("random_search"), 42 | #' task = task, 43 | #' learner = learner, 44 | #' resampling = rsmp("holdout"), 45 | #' measure = msr("classif.ce"), 46 | #' term_evals = 10 47 | #' ) 48 | #' 49 | #' # best performing feature subset 50 | #' instance$result 51 | #' 52 | #' # all evaluated feature subsets 53 | #' as.data.table(instance$archive) 54 | #' 55 | #' # subset the task and fit the final model 56 | #' task$select(instance$result_feature_set) 57 | #' learner$train(task) 58 | #' } 59 | FSelectorBatchRandomSearch = R6Class("FSelectorBatchRandomSearch", 60 | inherit = FSelectorBatch, 61 | public = list( 62 | 63 | #' @description 64 | #' Creates a new instance of this [R6][R6::R6Class] class. 
65 | initialize = function() { 66 | ps = ps( 67 | max_features = p_int(lower = 1L), 68 | batch_size = p_int(lower = 1L, tags = "required") 69 | ) 70 | ps$values = list(batch_size = 10L) 71 | 72 | super$initialize( 73 | id = "random_search", 74 | param_set = ps, 75 | properties = c("single-crit", "multi-crit"), 76 | label = "Random Search", 77 | man = "mlr3fselect::mlr_fselectors_random_search" 78 | ) 79 | } 80 | ), 81 | 82 | private = list( 83 | .optimize = function(inst) { 84 | pars = self$param_set$values 85 | feature_names = inst$archive$cols_x 86 | max_features = pars$max_features %??% length(feature_names) 87 | 88 | repeat { 89 | X = t(replicate(pars$batch_size, { 90 | n = sample.int(max_features, 1L) 91 | x = sample.int(length(feature_names), n) 92 | replace(logical(length(feature_names)), x, TRUE) 93 | })) 94 | colnames(X) = feature_names 95 | inst$eval_batch(as.data.table(X)) 96 | } 97 | } 98 | ) 99 | ) 100 | 101 | mlr_fselectors$add("random_search", FSelectorBatchRandomSearch) 102 | -------------------------------------------------------------------------------- /R/ObjectiveFSelectAsync.R: -------------------------------------------------------------------------------- 1 | #' @title Class for Feature Selection Objective 2 | #' 3 | #' @description 4 | #' Stores the objective function that estimates the performance of feature subsets. 5 | #' This class is usually constructed internally by the [FSelectInstanceAsyncSingleCrit] or [FSelectInstanceAsyncMultiCrit]. 
6 | #' 7 | #' @template param_task 8 | #' @template param_learner 9 | #' @template param_resampling 10 | #' @template param_measures 11 | #' @template param_store_models 12 | #' @template param_check_values 13 | #' @template param_store_benchmark_result 14 | #' @template param_callbacks 15 | #' 16 | #' @export 17 | ObjectiveFSelectAsync = R6Class("ObjectiveFSelectAsync", 18 | inherit = ObjectiveFSelect, 19 | private = list( 20 | .eval = function(xs, resampling) { 21 | lg$debug("Evaluating feature subset %s", as_short_string(xs)) 22 | 23 | # restore features 24 | all_features = self$task$feature_names 25 | on.exit(self$task$set_col_roles(all_features, "feature")) 26 | 27 | # select features 28 | private$.xs = xs 29 | call_back("on_eval_after_xs", self$callbacks, self$context) 30 | self$task$select(names(private$.xs)[as.logical(private$.xs)]) 31 | 32 | lg$debug("Resampling feature subset") 33 | 34 | # resample feature subset 35 | private$.resample_result = resample(self$task, self$learner, self$resampling, store_models = self$store_models, clone = character(0), callbacks = self$callbacks) 36 | call_back("on_eval_after_resample", self$callbacks, self$context) 37 | 38 | lg$debug("Aggregating performance") 39 | 40 | # aggregate performance 41 | private$.aggregated_performance = if (length(self$measures) == 1 && all(c("requires_task", "requires_learner", "requires_model", "requires_train_set") %nin% self$measures[[1]]$properties)) { 42 | lg$debug("Fast aggregation on measure %s", self$measures[[1]]$id) 43 | 44 | as.list(faggregate(private$.resample_result, self$measures[[1]], conditions = FALSE)) 45 | } else { 46 | lg$debug("Slow aggregation on measures %s", paste(map(self$measures, "id"), collapse = ", ")) 47 | 48 | as.list(private$.resample_result$aggregate(self$measures)) 49 | } 50 | 51 | lg$debug("Aggregated performance %s", as_short_string(private$.aggregated_performance)) 52 | 53 | # add runtime, errors and warnings 54 | warnings = 
sum(map_int(get_private(private$.resample_result)$.data$learner_states(), function(s) sum(s$log$class == "warning"))) 55 | errors = sum(map_int(get_private(private$.resample_result)$.data$learner_states(), function(s) sum(s$log$class == "error"))) 56 | runtime_learners = extract_runtime(private$.resample_result) 57 | 58 | private$.aggregated_performance = c(private$.aggregated_performance, list(runtime_learners = runtime_learners, warnings = warnings, errors = errors)) 59 | 60 | # add benchmark result and models 61 | if (self$store_benchmark_result) { 62 | lg$debug("Storing resample result") 63 | private$.aggregated_performance = c(private$.aggregated_performance, list(resample_result = list(private$.resample_result))) 64 | } 65 | 66 | call_back("on_eval_before_archive", self$callbacks, self$context) 67 | private$.aggregated_performance 68 | }, 69 | 70 | .xs = NULL, 71 | .resample_result = NULL, 72 | .aggregated_performance = NULL 73 | ) 74 | ) 75 | -------------------------------------------------------------------------------- /man/ContextAsyncFSelect.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/ContextAsyncFSelect.R 3 | \name{ContextAsyncFSelect} 4 | \alias{ContextAsyncFSelect} 5 | \title{Asynchronous Feature Selection Context} 6 | \description{ 7 | A \link{CallbackAsyncFSelect} accesses and modifies data during the optimization via the \code{ContextAsyncFSelect}. 8 | See the section on active bindings for a list of modifiable objects. 9 | See \code{\link[=callback_async_fselect]{callback_async_fselect()}} for a list of stages that access \code{ContextAsyncFSelect}. 10 | } 11 | \details{ 12 | Changes to \verb{$instance} and \verb{$optimizer} in the stages executed on the workers are not reflected in the main process. 
13 | } 14 | \section{Super classes}{ 15 | \code{\link[mlr3misc:Context]{mlr3misc::Context}} -> \code{\link[bbotk:ContextAsync]{bbotk::ContextAsync}} -> \code{ContextAsyncFSelect} 16 | } 17 | \section{Public fields}{ 18 | \if{html}{\out{
}} 19 | \describe{ 20 | \item{\code{auto_fselector}}{(\link{AutoFSelector})\cr 21 | The \link{AutoFSelector} instance.} 22 | } 23 | \if{html}{\out{
}} 24 | } 25 | \section{Active bindings}{ 26 | \if{html}{\out{
}} 27 | \describe{ 28 | \item{\code{xs_objective}}{(\code{list()})\cr 29 | The feature subset currently evaluated.} 30 | 31 | \item{\code{resample_result}}{(\link[mlr3:BenchmarkResult]{mlr3::BenchmarkResult})\cr 32 | The resample result of the feature subset currently evaluated.} 33 | 34 | \item{\code{aggregated_performance}}{(\code{list()})\cr 35 | Aggregated performance scores and training time of the evaluated feature subset. 36 | This list is passed to the archive. 37 | A callback can add additional elements which are also written to the archive.} 38 | 39 | \item{\code{result_feature_set}}{(character())\cr 40 | The feature set passed to \code{instance$assign_result()}.} 41 | } 42 | \if{html}{\out{
}} 43 | } 44 | \section{Methods}{ 45 | \subsection{Public methods}{ 46 | \itemize{ 47 | \item \href{#method-ContextAsyncFSelect-clone}{\code{ContextAsyncFSelect$clone()}} 48 | } 49 | } 50 | \if{html}{\out{ 51 |
Inherited methods 52 | 57 |
58 | }} 59 | \if{html}{\out{
}} 60 | \if{html}{\out{}} 61 | \if{latex}{\out{\hypertarget{method-ContextAsyncFSelect-clone}{}}} 62 | \subsection{Method \code{clone()}}{ 63 | The objects of this class are cloneable with this method. 64 | \subsection{Usage}{ 65 | \if{html}{\out{
}}\preformatted{ContextAsyncFSelect$clone(deep = FALSE)}\if{html}{\out{
}} 66 | } 67 | 68 | \subsection{Arguments}{ 69 | \if{html}{\out{
}} 70 | \describe{ 71 | \item{\code{deep}}{Whether to make a deep clone.} 72 | } 73 | \if{html}{\out{
}} 74 | } 75 | } 76 | } 77 | -------------------------------------------------------------------------------- /tests/testthat/test_FSelectorShadowVariableSearch.R: -------------------------------------------------------------------------------- 1 | test_that("default parameters work", { 2 | z = test_fselector("shadow_variable_search", store_models = TRUE) 3 | 4 | expect_best_features(z$inst$archive$best(batch = 1)[, 1:8], "x1") 5 | expect_best_features(z$inst$archive$best(batch = 2)[, 1:8], c("x1", "x2")) 6 | expect_best_features(z$inst$archive$best(batch = 3)[, 1:8], c("x1", "x2", "x3")) 7 | expect_best_features(z$inst$archive$best(batch = 4)[, 1:8], c("x1", "x2", "x3", "x4")) 8 | }) 9 | 10 | test_that("task is permuted", { 11 | instance = TEST_MAKE_INST_1D(terminator = trm("none")) 12 | task = instance$objective$task$clone() 13 | fselector = fs("shadow_variable_search") 14 | fselector$optimize(instance) 15 | 16 | task_permuted = as.data.table(instance$archive)$resample_result[[1]]$task 17 | expect_set_equal(task_permuted$backend$colnames, c("y", "x1", "x2", "x3", "x4", "..row_id", "permuted__x1", "permuted__x2", "permuted__x3", "permuted__x4")) 18 | expect_equal(task_permuted$data(cols = c("y", "x1", "x2", "x3", "x4"))[, 1:5], task$data()) 19 | expect_false(isTRUE(all.equal(task_permuted$data(cols = c("permuted__x1", "permuted__x2", "permuted__x3", "permuted__x4")), task$data()[, 2:5]))) 20 | }) 21 | 22 | test_that("first selected feature is a shadow variable works", { 23 | score_design = data.table(score = 1, features = "permuted__x1") 24 | instance = TEST_MAKE_INST_1D(measure = msr("dummy", score_design = score_design), terminator = trm("none")) 25 | fselector = fs("shadow_variable_search") 26 | expect_error(fselector$optimize(instance), regexp = "The first selected feature is a shadow variable.") 27 | }) 28 | 29 | test_that("second selected feature is a shadow variable works", { 30 | score_design = data.table(score = c(1, 2), features = list("x1", c("x1", 
"permuted__x1"))) 31 | instance = TEST_MAKE_INST_1D(measure = msr("dummy", score_design = score_design), terminator = trm("none")) 32 | task = instance$objective$task$clone() 33 | domain = instance$objective$domain$clone() 34 | fselector = fs("shadow_variable_search") 35 | fselector$optimize(instance) 36 | 37 | # two batches are evaluated, but the second batch is removed because the best feature subset contains a shadow variable 38 | expect_equal(instance$archive$n_batch, 1) 39 | # expect that the best result is the one without a shadow variable 40 | expect_equal(instance$result$features, list("x1")) 41 | expect_equal(instance$result_y, c("dummy" = 1)) 42 | # check that domain and search space are restored 43 | expect_equal(instance$search_space, domain) 44 | expect_equal(instance$objective$domain, domain) 45 | # check that task is restored 46 | suppressWarnings(expect_equal(instance$objective$task, task)) 47 | }) 48 | 49 | test_that("search is terminated by terminator works", { 50 | instance = TEST_MAKE_INST_1D(terminator = trm("evals", n_evals = 15)) 51 | task = instance$objective$task$clone() 52 | domain = instance$objective$domain$clone() 53 | fselector = fs("shadow_variable_search") 54 | fselector$optimize(instance) 55 | 56 | # check that the last batch is not removed because the best feature subset contains no shadow variable 57 | expect_equal(instance$archive$n_batch, 2) 58 | expect_equal(instance$result$features, list(c("x1", "x2"))) 59 | expect_equal(instance$result_y, c("dummy" = 2)) 60 | # check that domain and search space are restored 61 | expect_equal(instance$search_space, domain) 62 | expect_equal(instance$objective$domain, domain) 63 | # check that task is restored 64 | suppressWarnings(expect_equal(instance$objective$task, task)) 65 | }) 66 | -------------------------------------------------------------------------------- /man/ContextBatchFSelect.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: 
do not edit by hand 2 | % Please edit documentation in R/ContextBatchFSelect.R 3 | \name{ContextBatchFSelect} 4 | \alias{ContextBatchFSelect} 5 | \title{Evaluation Context} 6 | \description{ 7 | The \link{ContextBatchFSelect} allows \link{CallbackBatchFSelect}s to access and modify data while a batch of feature sets is evaluated. 8 | See the section on active bindings for a list of modifiable objects. 9 | See \code{\link[=callback_batch_fselect]{callback_batch_fselect()}} for a list of stages that access \link{ContextBatchFSelect}. 10 | } 11 | \details{ 12 | This context is re-created each time a new batch of feature sets is evaluated. 13 | Changes to \verb{$objective_fselect}, \verb{$design}, and \verb{$benchmark_result} are discarded after the function is finished. 14 | Modifications to the data table in \verb{$aggregated_performance} are written to the archive. 15 | Any number of columns can be added. 16 | } 17 | \section{Super classes}{ 18 | \code{\link[mlr3misc:Context]{mlr3misc::Context}} -> \code{\link[bbotk:ContextBatch]{bbotk::ContextBatch}} -> \code{ContextBatchFSelect} 19 | } 20 | \section{Public fields}{ 21 | \if{html}{\out{
}} 22 | \describe{ 23 | \item{\code{auto_fselector}}{(\link{AutoFSelector})\cr 24 | The \link{AutoFSelector} instance.} 25 | } 26 | \if{html}{\out{
}} 27 | } 28 | \section{Active bindings}{ 29 | \if{html}{\out{
}} 30 | \describe{ 31 | \item{\code{xss}}{(list())\cr 32 | The feature sets of the latest batch.} 33 | 34 | \item{\code{design}}{(\link[data.table:data.table]{data.table::data.table})\cr 35 | The benchmark design of the latest batch.} 36 | 37 | \item{\code{benchmark_result}}{(\link[mlr3:BenchmarkResult]{mlr3::BenchmarkResult})\cr 38 | The benchmark result of the latest batch.} 39 | 40 | \item{\code{aggregated_performance}}{(\link[data.table:data.table]{data.table::data.table})\cr 41 | Aggregated performance scores and training time of the latest batch. 42 | This data table is passed to the archive. 43 | A callback can add additional columns which are also written to the archive.} 44 | } 45 | \if{html}{\out{
}} 46 | } 47 | \section{Methods}{ 48 | \subsection{Public methods}{ 49 | \itemize{ 50 | \item \href{#method-ContextBatchFSelect-clone}{\code{ContextBatchFSelect$clone()}} 51 | } 52 | } 53 | \if{html}{\out{ 54 |
Inherited methods 55 | 60 |
61 | }} 62 | \if{html}{\out{
}} 63 | \if{html}{\out{}} 64 | \if{latex}{\out{\hypertarget{method-ContextBatchFSelect-clone}{}}} 65 | \subsection{Method \code{clone()}}{ 66 | The objects of this class are cloneable with this method. 67 | \subsection{Usage}{ 68 | \if{html}{\out{
}}\preformatted{ContextBatchFSelect$clone(deep = FALSE)}\if{html}{\out{
}} 69 | } 70 | 71 | \subsection{Arguments}{ 72 | \if{html}{\out{
}} 73 | \describe{ 74 | \item{\code{deep}}{Whether to make a deep clone.} 75 | } 76 | \if{html}{\out{
}} 77 | } 78 | } 79 | } 80 | -------------------------------------------------------------------------------- /man/fselect_nested.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/fselect_nested.R 3 | \name{fselect_nested} 4 | \alias{fselect_nested} 5 | \title{Function for Nested Resampling} 6 | \usage{ 7 | fselect_nested( 8 | fselector, 9 | task, 10 | learner, 11 | inner_resampling, 12 | outer_resampling, 13 | measure = NULL, 14 | term_evals = NULL, 15 | term_time = NULL, 16 | terminator = NULL, 17 | store_fselect_instance = TRUE, 18 | store_benchmark_result = TRUE, 19 | store_models = FALSE, 20 | check_values = FALSE, 21 | callbacks = NULL, 22 | ties_method = "least_features" 23 | ) 24 | } 25 | \arguments{ 26 | \item{fselector}{(\link{FSelector})\cr 27 | Optimization algorithm.} 28 | 29 | \item{task}{(\link[mlr3:Task]{mlr3::Task})\cr 30 | Task to operate on.} 31 | 32 | \item{learner}{(\link[mlr3:Learner]{mlr3::Learner})\cr 33 | Learner to optimize the feature subset for.} 34 | 35 | \item{inner_resampling}{(\link[mlr3:Resampling]{mlr3::Resampling})\cr 36 | Resampling used for the inner loop.} 37 | 38 | \item{outer_resampling}{(\link[mlr3:Resampling]{mlr3::Resampling})\cr 39 | Resampling used for the outer loop.} 40 | 41 | \item{measure}{(\link[mlr3:Measure]{mlr3::Measure})\cr 42 | Measure to optimize. If \code{NULL}, the default measure is used.} 43 | 44 | \item{term_evals}{(\code{integer(1)})\cr 45 | Number of allowed evaluations. 46 | Ignored if \code{terminator} is passed.} 47 | 48 | \item{term_time}{(\code{integer(1)})\cr 49 | Maximum allowed time in seconds. 
50 | Ignored if \code{terminator} is passed.} 51 | 52 | \item{terminator}{(\link[bbotk:Terminator]{bbotk::Terminator})\cr 53 | Stop criterion of the feature selection.} 54 | 55 | \item{store_fselect_instance}{(\code{logical(1)})\cr 56 | If \code{TRUE} (default), stores the internally created \link{FSelectInstanceBatchSingleCrit} with all intermediate results in slot \verb{$fselect_instance}. 57 | Is set to \code{TRUE} if \code{store_models = TRUE}.} 58 | 59 | \item{store_benchmark_result}{(\code{logical(1)})\cr 60 | Store benchmark result in archive?} 61 | 62 | \item{store_models}{(\code{logical(1)})\cr 63 | Store models in benchmark result?} 64 | 65 | \item{check_values}{(\code{logical(1)})\cr 66 | Check the parameters before the evaluation and the results for 67 | validity?} 68 | 69 | \item{callbacks}{(list of \link{CallbackBatchFSelect})\cr 70 | List of callbacks.} 71 | 72 | \item{ties_method}{(\code{character(1)})\cr 73 | The method to break ties when selecting sets while optimizing and when selecting the best set. 74 | Can be \code{"least_features"} or \code{"random"}. 75 | The option \code{"least_features"} (default) selects the feature set with the fewest features. 76 | If there are multiple best feature sets with the same number of features, one is selected randomly. 77 | The \code{"random"} method returns a random feature set from the best feature sets. 78 | Ignored if multiple measures are used.} 79 | } 80 | \value{ 81 | \link[mlr3:ResampleResult]{mlr3::ResampleResult} 82 | } 83 | \description{ 84 | Function to conduct nested resampling. 
85 | } 86 | \examples{ 87 | # Nested resampling on Palmer Penguins data set 88 | rr = fselect_nested( 89 | fselector = fs("random_search"), 90 | task = tsk("penguins"), 91 | learner = lrn("classif.rpart"), 92 | inner_resampling = rsmp ("holdout"), 93 | outer_resampling = rsmp("cv", folds = 2), 94 | measure = msr("classif.ce"), 95 | term_evals = 4) 96 | 97 | # Performance scores estimated on the outer resampling 98 | rr$score() 99 | 100 | # Unbiased performance of the final model trained on the full data set 101 | rr$aggregate() 102 | } 103 | -------------------------------------------------------------------------------- /DESCRIPTION: -------------------------------------------------------------------------------- 1 | Package: mlr3fselect 2 | Title: Feature Selection for 'mlr3' 3 | Version: 1.5.0 4 | Authors@R: c( 5 | person("Marc", "Becker", , "marcbecker@posteo.de", role = c("aut", "cre"), 6 | comment = c(ORCID = "0000-0002-8115-0400")), 7 | person("Patrick", "Schratz", , "patrick.schratz@gmail.com", role = "aut", 8 | comment = c(ORCID = "0000-0003-0748-6624")), 9 | person("Michel", "Lang", , "michellang@gmail.com", role = "aut", 10 | comment = c(ORCID = "0000-0001-9754-0393")), 11 | person("Bernd", "Bischl", , "bernd_bischl@gmx.net", role = "aut", 12 | comment = c(ORCID = "0000-0001-6002-6980")), 13 | person("John", "Zobolas", , "bblodfon@gmail.com", role = "aut", 14 | comment = c(ORCID = "0000-0002-3609-8674")) 15 | ) 16 | Description: Feature selection package of the 'mlr3' ecosystem. It selects 17 | the optimal feature set for any 'mlr3' learner. The package works with 18 | several optimization algorithms e.g. Random Search, Recursive Feature 19 | Elimination, and Genetic Search. Moreover, it can automatically 20 | optimize learners and estimate the performance of optimized feature 21 | sets with nested resampling. 
22 | License: LGPL-3 23 | URL: https://mlr3fselect.mlr-org.com, 24 | https://github.com/mlr-org/mlr3fselect 25 | BugReports: https://github.com/mlr-org/mlr3fselect/issues 26 | Depends: 27 | mlr3 (>= 1.0.1), 28 | R (>= 3.1.0) 29 | Imports: 30 | bbotk (>= 1.8.1), 31 | checkmate (>= 2.0.0), 32 | cli, 33 | data.table, 34 | lgr, 35 | mlr3misc (>= 0.15.1), 36 | paradox (>= 1.0.0), 37 | R6, 38 | stabm 39 | Suggests: 40 | e1071, 41 | fastVoteR, 42 | genalg, 43 | mirai, 44 | mlr3learners, 45 | mlr3pipelines, 46 | rpart, 47 | rush (>= 0.4.1), 48 | testthat (>= 3.0.0) 49 | Config/testthat/edition: 3 50 | Config/testthat/parallel: false 51 | Encoding: UTF-8 52 | Language: en-US 53 | NeedsCompilation: no 54 | Roxygen: list(markdown = TRUE) 55 | RoxygenNote: 7.3.2 56 | Collate: 57 | 'ArchiveAsyncFSelect.R' 58 | 'ArchiveAsyncFSelectFrozen.R' 59 | 'ArchiveBatchFSelect.R' 60 | 'AutoFSelector.R' 61 | 'CallbackAsyncFSelect.R' 62 | 'CallbackBatchFSelect.R' 63 | 'ContextAsyncFSelect.R' 64 | 'ContextBatchFSelect.R' 65 | 'EnsembleFSResult.R' 66 | 'FSelectInstanceAsyncSingleCrit.R' 67 | 'FSelectInstanceAsyncMultiCrit.R' 68 | 'FSelectInstanceBatchSingleCrit.R' 69 | 'FSelectInstanceBatchMultiCrit.R' 70 | 'mlr_fselectors.R' 71 | 'FSelector.R' 72 | 'FSelectorAsync.R' 73 | 'FSelectorAsyncDesignPoints.R' 74 | 'FSelectorAsyncExhaustiveSearch.R' 75 | 'FSelectorAsyncFromOptimizerAsync.R' 76 | 'FSelectorAsyncRandomSearch.R' 77 | 'FSelectorBatch.R' 78 | 'FSelectorBatchDesignPoints.R' 79 | 'FSelectorBatchExhaustiveSearch.R' 80 | 'FSelectorBatchFromOptimizerBatch.R' 81 | 'FSelectorBatchGeneticSearch.R' 82 | 'FSelectorBatchRFE.R' 83 | 'FSelectorBatchRFECV.R' 84 | 'FSelectorBatchRandomSearch.R' 85 | 'FSelectorBatchSequential.R' 86 | 'FSelectorBatchShadowVariableSearch.R' 87 | 'ObjectiveFSelect.R' 88 | 'ObjectiveFSelectAsync.R' 89 | 'ObjectiveFSelectBatch.R' 90 | 'assertions.R' 91 | 'auto_fselector.R' 92 | 'bibentries.R' 93 | 'embedded_ensemble_fselect.R' 94 | 'ensemble_fselect.R' 95 | 
'extract_inner_fselect_archives.R' 96 | 'extract_inner_fselect_results.R' 97 | 'faggregate.R' 98 | 'fselect.R' 99 | 'fselect_nested.R' 100 | 'helper.R' 101 | 'mlr_callbacks.R' 102 | 'reexports.R' 103 | 'sugar.R' 104 | 'zzz.R' 105 | -------------------------------------------------------------------------------- /man/mlr_fselectors_async_design_points.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/FSelectorAsyncDesignPoints.R 3 | \name{mlr_fselectors_async_design_points} 4 | \alias{mlr_fselectors_async_design_points} 5 | \alias{FSelectorAsyncDesignPoints} 6 | \title{Feature Selection with Asynchronous Design Points} 7 | \description{ 8 | Subclass for asynchronous design points feature selection. 9 | } 10 | \section{Dictionary}{ 11 | 12 | This \link{FSelector} can be instantiated with the associated sugar function \code{\link[=fs]{fs()}}: 13 | 14 | \if{html}{\out{
}}\preformatted{fs("async_design_points") 15 | }\if{html}{\out{
}} 16 | } 17 | 18 | \section{Parameters}{ 19 | 20 | 21 | \describe{ 22 | \item{\code{design}}{\link[data.table:data.table]{data.table::data.table}\cr 23 | Design points to try in search, one per row.} 24 | } 25 | 26 | } 27 | 28 | \seealso{ 29 | Other FSelectorAsync: 30 | \code{\link{mlr_fselectors_async_exhaustive_search}}, 31 | \code{\link{mlr_fselectors_async_random_search}} 32 | } 33 | \concept{FSelectorAsync} 34 | \section{Super classes}{ 35 | \code{\link[mlr3fselect:FSelector]{mlr3fselect::FSelector}} -> \code{\link[mlr3fselect:FSelectorAsync]{mlr3fselect::FSelectorAsync}} -> \code{\link[mlr3fselect:FSelectorAsyncFromOptimizerAsync]{mlr3fselect::FSelectorAsyncFromOptimizerAsync}} -> \code{FSelectorAsyncDesignPoints} 36 | } 37 | \section{Methods}{ 38 | \subsection{Public methods}{ 39 | \itemize{ 40 | \item \href{#method-FSelectorAsyncDesignPoints-new}{\code{FSelectorAsyncDesignPoints$new()}} 41 | \item \href{#method-FSelectorAsyncDesignPoints-clone}{\code{FSelectorAsyncDesignPoints$clone()}} 42 | } 43 | } 44 | \if{html}{\out{ 45 |
Inherited methods 46 | 52 |
53 | }} 54 | \if{html}{\out{
}} 55 | \if{html}{\out{}} 56 | \if{latex}{\out{\hypertarget{method-FSelectorAsyncDesignPoints-new}{}}} 57 | \subsection{Method \code{new()}}{ 58 | Creates a new instance of this \link[R6:R6Class]{R6} class. 59 | \subsection{Usage}{ 60 | \if{html}{\out{
}}\preformatted{FSelectorAsyncDesignPoints$new()}\if{html}{\out{
}} 61 | } 62 | 63 | } 64 | \if{html}{\out{
}} 65 | \if{html}{\out{}} 66 | \if{latex}{\out{\hypertarget{method-FSelectorAsyncDesignPoints-clone}{}}} 67 | \subsection{Method \code{clone()}}{ 68 | The objects of this class are cloneable with this method. 69 | \subsection{Usage}{ 70 | \if{html}{\out{
}}\preformatted{FSelectorAsyncDesignPoints$clone(deep = FALSE)}\if{html}{\out{
}} 71 | } 72 | 73 | \subsection{Arguments}{ 74 | \if{html}{\out{
}} 75 | \describe{ 76 | \item{\code{deep}}{Whether to make a deep clone.} 77 | } 78 | \if{html}{\out{
}} 79 | } 80 | } 81 | } 82 | -------------------------------------------------------------------------------- /man/embedded_ensemble_fselect.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/embedded_ensemble_fselect.R 3 | \name{embedded_ensemble_fselect} 4 | \alias{embedded_ensemble_fselect} 5 | \title{Embedded Ensemble Feature Selection} 6 | \source{ 7 | Meinshausen, Nicolai, Buhlmann, Peter (2010). 8 | \dQuote{Stability Selection.} 9 | \emph{Journal of the Royal Statistical Society Series B: Statistical Methodology}, \bold{72}(4), 417--473. 10 | ISSN 1369-7412, \doi{10.1111/J.1467-9868.2010.00740.X}, 0809.2932. 11 | 12 | Hedou, Julien, Maric, Ivana, Bellan, Gregoire, Einhaus, Jakob, Gaudilliere, K. D, Ladant, Xavier F, Verdonk, Franck, Stelzer, A. I, Feyaerts, Dorien, Tsai, S. A, Ganio, A. E, Sabayev, Maximilian, Gillard, Joshua, Amar, Jonas, Cambriel, Amelie, Oskotsky, T. T, Roldan, Alennie, Golob, L. J, Sirota, Marina, Bonham, A. T, Sato, Masaki, Diop, Maigane, Durand, Xavier, Angst, S. M, Stevenson, K. D, Aghaeepour, Nima, Montanari, Andrea, Gaudilliere, Brice (2024). 13 | \dQuote{Discovery of sparse, reliable omic biomarkers with Stabl.} 14 | \emph{Nature Biotechnology 2024}, 1--13. 15 | ISSN 1546-1696, \doi{10.1038/s41587-023-02033-x}, \url{https://www.nature.com/articles/s41587-023-02033-x}. 16 | } 17 | \usage{ 18 | embedded_ensemble_fselect( 19 | task, 20 | learners, 21 | init_resampling, 22 | measure, 23 | store_benchmark_result = TRUE 24 | ) 25 | } 26 | \arguments{ 27 | \item{task}{(\link[mlr3:Task]{mlr3::Task})\cr 28 | Task to operate on.} 29 | 30 | \item{learners}{(list of \link[mlr3:Learner]{mlr3::Learner})\cr 31 | The learners to be used for feature selection. 32 | All learners must have the \code{selected_features} property, i.e. implement 33 | embedded feature selection (e.g. 
regularized models).} 34 | 35 | \item{init_resampling}{(\link[mlr3:Resampling]{mlr3::Resampling})\cr 36 | The initial resampling strategy of the data, from which each train set 37 | will be passed on to the learners and each test set will be used for 38 | prediction. 39 | Can only be \link[mlr3:mlr_resamplings_subsampling]{mlr3::ResamplingSubsampling} or \link[mlr3:mlr_resamplings_bootstrap]{mlr3::ResamplingBootstrap}.} 40 | 41 | \item{measure}{(\link[mlr3:Measure]{mlr3::Measure})\cr 42 | The measure used to score each learner on the test sets generated by 43 | \code{init_resampling}. 44 | If \code{NULL}, default measure is used.} 45 | 46 | \item{store_benchmark_result}{(\code{logical(1)})\cr 47 | Whether to store the benchmark result in \link{EnsembleFSResult} or not.} 48 | } 49 | \value{ 50 | an \link{EnsembleFSResult} object. 51 | } 52 | \description{ 53 | Ensemble feature selection using multiple learners. 54 | The ensemble feature selection method is designed to identify the most predictive features from a given dataset by leveraging multiple machine learning models and resampling techniques. 55 | Returns an \link{EnsembleFSResult}. 56 | } 57 | \details{ 58 | The method begins by applying an initial resampling technique specified by the user, to create \strong{multiple subsamples} from the original dataset (train/test splits). 59 | This resampling process helps in generating diverse subsets of data for robust feature selection. 60 | 61 | For each subsample (train set) generated in the previous step, the method applies learners 62 | that support \strong{embedded feature selection}. 63 | These learners are then scored on their ability to predict on the resampled 64 | test sets, storing the selected features during training, for each 65 | combination of subsample and learner. 66 | 67 | Results are stored in an \link{EnsembleFSResult}. 
68 | } 69 | \examples{ 70 | \donttest{ 71 | eefsr = embedded_ensemble_fselect( 72 | task = tsk("sonar"), 73 | learners = lrns(c("classif.rpart", "classif.featureless")), 74 | init_resampling = rsmp("subsampling", repeats = 5), 75 | measure = msr("classif.ce") 76 | ) 77 | eefsr 78 | } 79 | } 80 | -------------------------------------------------------------------------------- /R/extract_inner_fselect_results.R: -------------------------------------------------------------------------------- 1 | #' @title Extract Inner Feature Selection Results 2 | #' 3 | #' @description 4 | #' Extract inner feature selection results of nested resampling. 5 | #' Implemented for [mlr3::ResampleResult] and [mlr3::BenchmarkResult]. 6 | #' 7 | #' @details 8 | #' The function iterates over the [AutoFSelector] objects and binds the feature selection results to a [data.table::data.table()]. 9 | #' [AutoFSelector] must be initialized with `store_fselect_instance = TRUE` and `resample()` or `benchmark()` must be called with `store_models = TRUE`. 10 | #' Optionally, the instance can be added for each iteration. 11 | #' 12 | #' @section Data structure: 13 | #' 14 | #' The returned data table has the following columns: 15 | #' 16 | #' * `experiment` (integer(1))\cr 17 | #' Index giving the corresponding row number in the original benchmark grid. 18 | #' * `iteration` (integer(1))\cr 19 | #' Iteration of the outer resampling. 20 | #' * One column for each feature of the task. 21 | #' * One column for each performance measure. 22 | #' * `features` (character())\cr 23 | #' Vector of the selected features. 24 | #' * `task_id` (`character(1)`). 25 | #' * `learner_id` (`character(1)`). 26 | #' * `resampling_id` (`character(1)`). 27 | #' 28 | #' @param x ([mlr3::ResampleResult] | [mlr3::BenchmarkResult]). 29 | #' @param fselect_instance (`logical(1)`)\cr 30 | #' If `TRUE`, instances are added to the table. 31 | #' @param ... (any)\cr 32 | #' Additional arguments. 
33 | #' 34 | #' @return [data.table::data.table()]. 35 | #' 36 | #' @export 37 | #' @examples 38 | #' # Nested Resampling on Iris Data Set 39 | #' 40 | #' # create auto fselector 41 | #' at = auto_fselector( 42 | #' fselector = fs("random_search"), 43 | #' learner = lrn("classif.rpart"), 44 | #' resampling = rsmp("holdout"), 45 | #' measure = msr("classif.ce"), 46 | #' term_evals = 4) 47 | #' 48 | #' resampling_outer = rsmp("cv", folds = 2) 49 | #' rr = resample(tsk("iris"), at, resampling_outer, store_models = TRUE) 50 | #' 51 | #' # extract inner results 52 | #' extract_inner_fselect_results(rr) 53 | extract_inner_fselect_results = function(x, fselect_instance, ...) { 54 | UseMethod("extract_inner_fselect_results", x) 55 | } 56 | 57 | #' @export 58 | extract_inner_fselect_results.ResampleResult = function(x, fselect_instance = FALSE, ...) { 59 | rr = assert_resample_result(x) 60 | if (is.null(rr$learners[[1]]$model$fselect_instance)) { 61 | return(data.table()) 62 | } 63 | tab = imap_dtr(rr$learners, function(learner, i) { 64 | data = setalloccol(learner$fselect_result) 65 | set(data, j = "iteration", value = i) 66 | if (fselect_instance) set(data, j = "fselect_instance", value = list(learner$fselect_instance)) 67 | data 68 | }) 69 | tab[, "task_id" := rr$task$id] 70 | tab[, "learner_id" := rr$learner$id] 71 | tab[, "resampling_id" := rr$resampling$id] 72 | cols_x = rr$learners[[1]]$archive$cols_x 73 | cols_y = rr$learners[[1]]$archive$cols_y 74 | setcolorder(tab, c("iteration", cols_x, cols_y)) 75 | tab 76 | } 77 | 78 | #' @export 79 | extract_inner_fselect_results.BenchmarkResult = function(x, fselect_instance = FALSE, ...) 
{ 80 | bmr = assert_benchmark_result(x) 81 | tab = imap_dtr(bmr$resample_results$resample_result, function(rr, i) { 82 | data = extract_inner_fselect_results(rr, fselect_instance = fselect_instance) 83 | if (nrow(data) > 0) set(data, j = "experiment", value = i) 84 | }, .fill = TRUE) 85 | # reorder dt 86 | if (nrow(tab) > 0) { 87 | cols_x = unique(unlist(map(unique(tab$experiment), function(i) bmr$resample_results$resample_result[[i]]$learners[[1]]$archive$cols_x))) 88 | cols_y = unique(unlist(map(unique(tab$experiment), function(i) bmr$resample_results$resample_result[[i]]$learners[[1]]$archive$cols_y))) 89 | setcolorder(tab, unique(c("experiment", "iteration", cols_x, cols_y))) 90 | } 91 | tab 92 | } 93 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # File created using '.gitignore Generator' for Visual Studio Code: https://bit.ly/vscode-gig 2 | # Created by https://www.toptal.com/developers/gitignore/api/windows,visualstudiocode,r,macos,linux 3 | # Edit at https://www.toptal.com/developers/gitignore?templates=windows,visualstudiocode,r,macos,linux 4 | 5 | ### Linux ### 6 | *~ 7 | 8 | # temporary files which can be created if a process still has a handle open of a deleted file 9 | .fuse_hidden* 10 | 11 | # KDE directory preferences 12 | .directory 13 | 14 | # Linux trash folder which might appear on any partition or disk 15 | .Trash-* 16 | 17 | # .nfs files are created when an open file is removed but is still being accessed 18 | .nfs* 19 | 20 | ### macOS ### 21 | # General 22 | .DS_Store 23 | .AppleDouble 24 | .LSOverride 25 | 26 | # Icon must end with two \r 27 | Icon 28 | 29 | 30 | # Thumbnails 31 | ._* 32 | 33 | # Files that might appear in the root of a volume 34 | .DocumentRevisions-V100 35 | .fseventsd 36 | .Spotlight-V100 37 | .TemporaryItems 38 | .Trashes 39 | .VolumeIcon.icns 40 | .com.apple.timemachine.donotpresent 41 | 
42 | # Directories potentially created on remote AFP share 43 | .AppleDB 44 | .AppleDesktop 45 | Network Trash Folder 46 | Temporary Items 47 | .apdisk 48 | 49 | ### macOS Patch ### 50 | # iCloud generated files 51 | *.icloud 52 | 53 | ### R ### 54 | # History files 55 | .Rhistory 56 | .Rapp.history 57 | 58 | # Session Data files 59 | .RData 60 | .RDataTmp 61 | 62 | # User-specific files 63 | .Ruserdata 64 | 65 | # Example code in package build process 66 | *-Ex.R 67 | 68 | # Output files from R CMD build 69 | /*.tar.gz 70 | 71 | # Output files from R CMD check 72 | /*.Rcheck/ 73 | 74 | # RStudio files 75 | .Rproj.user/ 76 | 77 | # produced vignettes 78 | vignettes/*.html 79 | vignettes/*.pdf 80 | 81 | # OAuth2 token, see https://github.com/hadley/httr/releases/tag/v0.3 82 | .httr-oauth 83 | 84 | # knitr and R markdown default cache directories 85 | *_cache/ 86 | /cache/ 87 | 88 | # Temporary files created by R markdown 89 | *.utf8.md 90 | *.knit.md 91 | 92 | # R Environment Variables 93 | .Renviron 94 | 95 | # pkgdown site 96 | docs/ 97 | 98 | # translation temp files 99 | po/*~ 100 | 101 | # RStudio Connect folder 102 | rsconnect/ 103 | 104 | ### R.Bookdown Stack ### 105 | # R package: bookdown caching files 106 | /*_files/ 107 | 108 | ### VisualStudioCode ### 109 | .vscode/* 110 | !.vscode/settings.json 111 | !.vscode/tasks.json 112 | !.vscode/launch.json 113 | !.vscode/extensions.json 114 | !.vscode/*.code-snippets 115 | 116 | # Local History for Visual Studio Code 117 | .history/ 118 | 119 | # Built Visual Studio Code Extensions 120 | *.vsix 121 | 122 | ### VisualStudioCode Patch ### 123 | # Ignore all local history of files 124 | .history 125 | .ionide 126 | 127 | ### Windows ### 128 | # Windows thumbnail cache files 129 | Thumbs.db 130 | Thumbs.db:encryptable 131 | ehthumbs.db 132 | ehthumbs_vista.db 133 | 134 | # Dump file 135 | *.stackdump 136 | 137 | # Folder config file 138 | [Dd]esktop.ini 139 | 140 | # Recycle Bin used on file shares 141 | 
$RECYCLE.BIN/ 142 | 143 | # Windows Installer files 144 | *.cab 145 | *.msi 146 | *.msix 147 | *.msm 148 | *.msp 149 | 150 | # Windows shortcuts 151 | *.lnk 152 | 153 | # End of https://www.toptal.com/developers/gitignore/api/windows,visualstudiocode,r,macos,linux 154 | 155 | # Custom rules (everything added below won't be overridden by 'Generate .gitignore File' if you use 'Update' option) 156 | 157 | # R 158 | .Rprofile 159 | README.html 160 | src/*.o 161 | src/*.so 162 | src/*.dll 163 | 164 | # CRAN 165 | cran-comments.md 166 | CRAN-RELEASE 167 | CRAN-SUBMISSION 168 | 169 | # pkgdown 170 | docs/ 171 | 172 | # renv 173 | renv/ 174 | renv.lock 175 | 176 | # vscode 177 | .vscode 178 | 179 | # revdep 180 | revdep/ 181 | 182 | # misc 183 | Meta/ 184 | -------------------------------------------------------------------------------- /man/mlr_fselectors_async_random_search.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/FSelectorAsyncRandomSearch.R 3 | \name{mlr_fselectors_async_random_search} 4 | \alias{mlr_fselectors_async_random_search} 5 | \alias{FSelectorAsyncRandomSearch} 6 | \title{Feature Selection with Asynchronous Random Search} 7 | \source{ 8 | Bergstra J, Bengio Y (2012). 9 | \dQuote{Random Search for Hyper-Parameter Optimization.} 10 | \emph{Journal of Machine Learning Research}, \bold{13}(10), 281--305. 11 | \url{https://jmlr.csail.mit.edu/papers/v13/bergstra12a.html}. 12 | } 13 | \description{ 14 | Feature selection using the asynchronous random search algorithm. 15 | } 16 | \section{Dictionary}{ 17 | 18 | This \link{FSelector} can be instantiated with the associated sugar function \code{\link[=fs]{fs()}}: 19 | 20 | \if{html}{\out{
}}\preformatted{fs("async_random_search") 21 | }\if{html}{\out{
}} 22 | } 23 | 24 | \section{Control Parameters}{ 25 | 26 | \describe{ 27 | \item{\code{max_features}}{\code{integer(1)}\cr 28 | Maximum number of features. 29 | By default, number of features in \link[mlr3:Task]{mlr3::Task}.} 30 | } 31 | } 32 | 33 | \seealso{ 34 | Other FSelectorAsync: 35 | \code{\link{mlr_fselectors_async_design_points}}, 36 | \code{\link{mlr_fselectors_async_exhaustive_search}} 37 | } 38 | \concept{FSelectorAsync} 39 | \section{Super classes}{ 40 | \code{\link[mlr3fselect:FSelector]{mlr3fselect::FSelector}} -> \code{\link[mlr3fselect:FSelectorAsync]{mlr3fselect::FSelectorAsync}} -> \code{FSelectorAsyncRandomSearch} 41 | } 42 | \section{Methods}{ 43 | \subsection{Public methods}{ 44 | \itemize{ 45 | \item \href{#method-FSelectorAsyncRandomSearch-new}{\code{FSelectorAsyncRandomSearch$new()}} 46 | \item \href{#method-FSelectorAsyncRandomSearch-clone}{\code{FSelectorAsyncRandomSearch$clone()}} 47 | } 48 | } 49 | \if{html}{\out{ 50 |
Inherited methods 51 | 57 |
58 | }} 59 | \if{html}{\out{
}} 60 | \if{html}{\out{}} 61 | \if{latex}{\out{\hypertarget{method-FSelectorAsyncRandomSearch-new}{}}} 62 | \subsection{Method \code{new()}}{ 63 | Creates a new instance of this \link[R6:R6Class]{R6} class. 64 | \subsection{Usage}{ 65 | \if{html}{\out{
}}\preformatted{FSelectorAsyncRandomSearch$new()}\if{html}{\out{
}} 66 | } 67 | 68 | } 69 | \if{html}{\out{
}} 70 | \if{html}{\out{}} 71 | \if{latex}{\out{\hypertarget{method-FSelectorAsyncRandomSearch-clone}{}}} 72 | \subsection{Method \code{clone()}}{ 73 | The objects of this class are cloneable with this method. 74 | \subsection{Usage}{ 75 | \if{html}{\out{
}}\preformatted{FSelectorAsyncRandomSearch$clone(deep = FALSE)}\if{html}{\out{
}} 76 | } 77 | 78 | \subsection{Arguments}{ 79 | \if{html}{\out{
}} 80 | \describe{ 81 | \item{\code{deep}}{Whether to make a deep clone.} 82 | } 83 | \if{html}{\out{
}} 84 | } 85 | } 86 | } 87 | -------------------------------------------------------------------------------- /R/extract_inner_fselect_archives.R: -------------------------------------------------------------------------------- 1 | #' @title Extract Inner Feature Selection Archives 2 | #' 3 | #' @description 4 | #' Extract inner feature selection archives of nested resampling. 5 | #' Implemented for [mlr3::ResampleResult] and [mlr3::BenchmarkResult]. 6 | #' The function iterates over the [AutoFSelector] objects and binds the archives to a [data.table::data.table()]. 7 | #' [AutoFSelector] must be initialized with `store_fselect_instance = TRUE` and `resample()` or `benchmark()` must be called with `store_models = TRUE`. 8 | #' 9 | #' @section Data structure: 10 | #' 11 | #' The returned data table has the following columns: 12 | #' 13 | #' * `experiment` (`integer(1)`)\cr 14 | #' Index giving the corresponding row number in the original benchmark grid. 15 | #' * `iteration` (`integer(1)`)\cr 16 | #' Iteration of the outer resampling. 17 | #' * One column for each feature of the task. 18 | #' * One column for each performance measure. 19 | #' * `runtime_learners` (`numeric(1)`)\cr 20 | #' Sum of training and predict times logged in learners per 21 | #' [mlr3::ResampleResult] / evaluation. This does not include potential 22 | #' overhead time. 23 | #' * `timestamp` (`POSIXct`)\cr 24 | #' Time stamp when the evaluation was logged into the archive. 25 | #' * `batch_nr` (`integer(1)`)\cr 26 | #' Feature sets are evaluated in batches. Each batch has a unique batch 27 | #' number. 28 | #' * `resample_result` ([mlr3::ResampleResult])\cr 29 | #' Resample result of the inner resampling. 30 | #' * `task_id` (`character(1)`). 31 | #' * `learner_id` (`character(1)`). 32 | #' * `resampling_id` (`character(1)`). 33 | #' 34 | #' @param x ([mlr3::ResampleResult] | [mlr3::BenchmarkResult]). 35 | #' @param exclude_columns (`character()`)\cr 36 | #' Exclude columns from result table. 
Set to `NULL` if no column should be 37 | #' excluded. 38 | #' @return [data.table::data.table()]. 39 | #' 40 | #' @export 41 | #' @examples 42 | #' # Nested Resampling on Palmer Penguins Data Set 43 | #' 44 | #' # create auto fselector 45 | #' at = auto_fselector( 46 | #' fselector = fs("random_search"), 47 | #' learner = lrn("classif.rpart"), 48 | #' resampling = rsmp("holdout"), 49 | #' measure = msr("classif.ce"), 50 | #' term_evals = 4) 51 | #' 52 | #' resampling_outer = rsmp("cv", folds = 2) 53 | #' rr = resample(tsk("penguins"), at, resampling_outer, store_models = TRUE) 54 | #' 55 | #' # extract inner archives 56 | #' extract_inner_fselect_archives(rr) 57 | extract_inner_fselect_archives = function(x, exclude_columns = "uhash") { 58 | UseMethod("extract_inner_fselect_archives") 59 | } 60 | 61 | #' @export 62 | extract_inner_fselect_archives.ResampleResult = function(x, exclude_columns = "uhash") { 63 | rr = assert_resample_result(x) 64 | if (is.null(rr$learners[[1]]$model$fselect_instance)) { 65 | return(data.table()) 66 | } 67 | tab = imap_dtr(rr$learners, function(learner, i) { 68 | data = as.data.table(learner$archive, exclude_columns = exclude_columns) 69 | set(data, j = "iteration", value = i) 70 | }) 71 | tab[, "task_id" := rr$task$id] 72 | tab[, "learner_id" := rr$learner$id] 73 | tab[, "resampling_id" := rr$resampling$id] 74 | cols_x = rr$learners[[1]]$archive$cols_x 75 | cols_y = rr$learners[[1]]$archive$cols_y 76 | setcolorder(tab, c("iteration", cols_x, cols_y)) 77 | tab 78 | } 79 | 80 | #' @export 81 | extract_inner_fselect_archives.BenchmarkResult = function(x, exclude_columns = "uhash") { 82 | bmr = assert_benchmark_result(x) 83 | tab = imap_dtr(bmr$resample_results$resample_result, function(rr, i) { 84 | data = extract_inner_fselect_archives(rr, exclude_columns) 85 | if (nrow(data) > 0) set(data, j = "experiment", value = i) 86 | }, .fill = TRUE) 87 | 88 | if (nrow(tab) > 0) { 89 | # reorder dt 90 | cols_x = unique(unlist(map(unique(tab$experiment), 
function(i) bmr$resample_results$resample_result[[i]]$learners[[1]]$archive$cols_x))) 91 | cols_y = unique(unlist(map(unique(tab$experiment), function(i) bmr$resample_results$resample_result[[i]]$learners[[1]]$archive$cols_y))) 92 | setcolorder(tab, c("experiment", "iteration", cols_x, cols_y)) 93 | } 94 | tab 95 | } 96 | -------------------------------------------------------------------------------- /man/FSelectorBatchFromOptimizerBatch.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/FSelectorBatchFromOptimizerBatch.R 3 | \name{FSelectorBatchFromOptimizerBatch} 4 | \alias{FSelectorBatchFromOptimizerBatch} 5 | \title{FSelectorBatchFromOptimizerBatch} 6 | \description{ 7 | Internally used to transform \link[bbotk:Optimizer]{bbotk::Optimizer} to \link{FSelector}. 8 | } 9 | \keyword{internal} 10 | \section{Super classes}{ 11 | \code{\link[mlr3fselect:FSelector]{mlr3fselect::FSelector}} -> \code{\link[mlr3fselect:FSelectorBatch]{mlr3fselect::FSelectorBatch}} -> \code{FSelectorBatchFromOptimizerBatch} 12 | } 13 | \section{Methods}{ 14 | \subsection{Public methods}{ 15 | \itemize{ 16 | \item \href{#method-FSelectorBatchFromOptimizerBatch-new}{\code{FSelectorBatchFromOptimizerBatch$new()}} 17 | \item \href{#method-FSelectorBatchFromOptimizerBatch-optimize}{\code{FSelectorBatchFromOptimizerBatch$optimize()}} 18 | \item \href{#method-FSelectorBatchFromOptimizerBatch-clone}{\code{FSelectorBatchFromOptimizerBatch$clone()}} 19 | } 20 | } 21 | \if{html}{\out{ 22 |
Inherited methods 23 | 28 |
29 | }} 30 | \if{html}{\out{
}} 31 | \if{html}{\out{}} 32 | \if{latex}{\out{\hypertarget{method-FSelectorBatchFromOptimizerBatch-new}{}}} 33 | \subsection{Method \code{new()}}{ 34 | Creates a new instance of this \link[R6:R6Class]{R6} class. 35 | \subsection{Usage}{ 36 | \if{html}{\out{
}}\preformatted{FSelectorBatchFromOptimizerBatch$new(optimizer, man = NA_character_)}\if{html}{\out{
}} 37 | } 38 | 39 | \subsection{Arguments}{ 40 | \if{html}{\out{
}} 41 | \describe{ 42 | \item{\code{optimizer}}{\link[bbotk:Optimizer]{bbotk::Optimizer}\cr 43 | Optimizer that is called.} 44 | 45 | \item{\code{man}}{(\code{character(1)})\cr 46 | String in the format \verb{[pkg]::[topic]} pointing to a manual page for this object. 47 | The referenced help package can be opened via method \verb{$help()}.} 48 | } 49 | \if{html}{\out{
}} 50 | } 51 | } 52 | \if{html}{\out{
}} 53 | \if{html}{\out{}} 54 | \if{latex}{\out{\hypertarget{method-FSelectorBatchFromOptimizerBatch-optimize}{}}} 55 | \subsection{Method \code{optimize()}}{ 56 | Performs the feature selection on a \link{FSelectInstanceBatchSingleCrit} / 57 | \link{FSelectInstanceBatchMultiCrit} until termination. 58 | \subsection{Usage}{ 59 | \if{html}{\out{
}}\preformatted{FSelectorBatchFromOptimizerBatch$optimize(inst)}\if{html}{\out{
}} 60 | } 61 | 62 | \subsection{Arguments}{ 63 | \if{html}{\out{
}} 64 | \describe{ 65 | \item{\code{inst}}{(\link{FSelectInstanceBatchSingleCrit} | \link{FSelectInstanceBatchMultiCrit}).} 66 | } 67 | \if{html}{\out{
}} 68 | } 69 | \subsection{Returns}{ 70 | \link[data.table:data.table]{data.table::data.table}. 71 | } 72 | } 73 | \if{html}{\out{
}} 74 | \if{html}{\out{}} 75 | \if{latex}{\out{\hypertarget{method-FSelectorBatchFromOptimizerBatch-clone}{}}} 76 | \subsection{Method \code{clone()}}{ 77 | The objects of this class are cloneable with this method. 78 | \subsection{Usage}{ 79 | \if{html}{\out{
}}\preformatted{FSelectorBatchFromOptimizerBatch$clone(deep = FALSE)}\if{html}{\out{
}} 80 | } 81 | 82 | \subsection{Arguments}{ 83 | \if{html}{\out{
}} 84 | \describe{ 85 | \item{\code{deep}}{Whether to make a deep clone.} 86 | } 87 | \if{html}{\out{
}} 88 | } 89 | } 90 | } 91 | -------------------------------------------------------------------------------- /man/FSelectorAsync.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/FSelectorAsync.R 3 | \name{FSelectorAsync} 4 | \alias{FSelectorAsync} 5 | \title{Class for Asynchronous Feature Selection Algorithms} 6 | \description{ 7 | The \link{FSelectorAsync} implements the asynchronous optimization algorithm. 8 | } 9 | \details{ 10 | \link{FSelectorAsync} is an abstract base class that implements the base functionality each asynchronous fselector must provide. 11 | } 12 | \section{Resources}{ 13 | 14 | There are several sections about feature selection in the \href{https://mlr3book.mlr-org.com}{mlr3book}. 15 | \itemize{ 16 | \item Learn more about \href{https://mlr3book.mlr-org.com/chapters/chapter6/feature_selection.html#the-fselector-class}{fselectors}. 17 | } 18 | 19 | The \href{https://mlr-org.com/gallery.html}{gallery} features a collection of case studies and demos about optimization. 20 | \itemize{ 21 | \item Utilize the built-in feature importance of models with \href{https://mlr-org.com/gallery/optimization/2023-02-07-recursive-feature-elimination/}{Recursive Feature Elimination}. 22 | \item Run a feature selection with \href{https://mlr-org.com/gallery/optimization/2023-02-01-shadow-variable-search/}{Shadow Variable Search}. 23 | } 24 | } 25 | 26 | \section{Super class}{ 27 | \code{\link[mlr3fselect:FSelector]{mlr3fselect::FSelector}} -> \code{FSelectorAsync} 28 | } 29 | \section{Methods}{ 30 | \subsection{Public methods}{ 31 | \itemize{ 32 | \item \href{#method-FSelectorAsync-optimize}{\code{FSelectorAsync$optimize()}} 33 | \item \href{#method-FSelectorAsync-clone}{\code{FSelectorAsync$clone()}} 34 | } 35 | } 36 | \if{html}{\out{ 37 |
Inherited methods 38 | 44 |
45 | }} 46 | \if{html}{\out{
}} 47 | \if{html}{\out{}} 48 | \if{latex}{\out{\hypertarget{method-FSelectorAsync-optimize}{}}} 49 | \subsection{Method \code{optimize()}}{ 50 | Performs the feature selection on a \link{FSelectInstanceAsyncSingleCrit} or \link{FSelectInstanceAsyncMultiCrit} until termination. 51 | The single evaluations will be written into the \link{ArchiveAsyncFSelect} that resides in the \link{FSelectInstanceAsyncSingleCrit}/\link{FSelectInstanceAsyncMultiCrit}. 52 | The result will be written into the instance object. 53 | \subsection{Usage}{ 54 | \if{html}{\out{
}}\preformatted{FSelectorAsync$optimize(inst)}\if{html}{\out{
}} 55 | } 56 | 57 | \subsection{Arguments}{ 58 | \if{html}{\out{
}} 59 | \describe{ 60 | \item{\code{inst}}{(\link{FSelectInstanceAsyncSingleCrit} | \link{FSelectInstanceAsyncMultiCrit}).} 61 | } 62 | \if{html}{\out{
}} 63 | } 64 | \subsection{Returns}{ 65 | \code{\link[data.table:data.table]{data.table::data.table()}} 66 | } 67 | } 68 | \if{html}{\out{
}} 69 | \if{html}{\out{}} 70 | \if{latex}{\out{\hypertarget{method-FSelectorAsync-clone}{}}} 71 | \subsection{Method \code{clone()}}{ 72 | The objects of this class are cloneable with this method. 73 | \subsection{Usage}{ 74 | \if{html}{\out{
}}\preformatted{FSelectorAsync$clone(deep = FALSE)}\if{html}{\out{
}} 75 | } 76 | 77 | \subsection{Arguments}{ 78 | \if{html}{\out{
}} 79 | \describe{ 80 | \item{\code{deep}}{Whether to make a deep clone.} 81 | } 82 | \if{html}{\out{
}} 83 | } 84 | } 85 | } 86 | -------------------------------------------------------------------------------- /R/FSelectorBatch.R: -------------------------------------------------------------------------------- 1 | #' @title Class for Batch Feature Selection Algorithms 2 | #' 3 | #' @include mlr_fselectors.R 4 | #' 5 | #' @description 6 | #' The [FSelectorBatch] implements the optimization algorithm. 7 | #' 8 | #' @details 9 | #' [FSelectorBatch] is an abstract base class that implements the base functionality each fselector must provide. 10 | #' A subclass is implemented in the following way: 11 | #' * Inherit from FSelectorBatch. 12 | #' * Specify the private abstract method `$.optimize()` and use it to call into your optimizer. 13 | #' * You need to call `instance$eval_batch()` to evaluate design points. 14 | #' * The batch evaluation is requested at the [FSelectInstanceBatchSingleCrit]/[FSelectInstanceBatchMultiCrit] object `instance`, so each batch is possibly executed in parallel via [mlr3::benchmark()], and all evaluations are stored inside of `instance$archive`. 15 | #' * Before the batch evaluation, the [bbotk::Terminator] is checked, and if it is positive, an exception of class `"terminated_error"` is generated. 16 | #' In the latter case the current batch of evaluations is still stored in `instance`, but the numeric scores are not sent back to the handling optimizer as it has lost execution control. 17 | #' * After such an exception was caught we select the best set from `instance$archive` and return it. 18 | #' * Note that therefore more points than specified by the [bbotk::Terminator] may be evaluated, as the Terminator is only checked before a batch evaluation, and not in-between evaluation in a batch. 19 | #' How many more depends on the setting of the batch size. 20 | #' * Overwrite the private super-method `.assign_result()` if you want to decide how to estimate the final set in the instance and its estimated performance. 
21 | #' The default behavior is: We pick the best resample experiment, regarding the given measure, then assign its set and aggregated performance to the instance. 22 | #' 23 | #' @section Private Methods: 24 | #' * `.optimize(instance)` -> `NULL`\cr 25 | #' Abstract base method. Implement to specify feature selection of your subclass. 26 | #' See technical details sections. 27 | #' * `.assign_result(instance)` -> `NULL`\cr 28 | #' Abstract base method. Implement to specify how the final feature subset is selected. 29 | #' See technical details sections. 30 | #' 31 | #' @inheritSection FSelector Resources 32 | #' 33 | #' @template param_id 34 | #' @template param_param_set 35 | #' @template param_properties 36 | #' @template param_packages 37 | #' @template param_label 38 | #' @template param_man 39 | #' 40 | #' @export 41 | FSelectorBatch = R6Class("FSelectorBatch", 42 | inherit = FSelector, 43 | public = list( 44 | 45 | #' @description 46 | #' Creates a new instance of this [R6][R6::R6Class] class. 47 | initialize = function( 48 | id = "fselector_batch", 49 | param_set, 50 | properties, 51 | packages = character(), 52 | label = NA_character_, 53 | man = NA_character_ 54 | ) { 55 | super$initialize( 56 | id = id, 57 | param_set = param_set, 58 | properties = properties, 59 | packages = packages, 60 | label = label, 61 | man = man 62 | ) 63 | }, 64 | 65 | #' @description 66 | #' Performs the feature selection on a [FSelectInstanceBatchSingleCrit] or [FSelectInstanceBatchMultiCrit] until termination. 67 | #' The single evaluations will be written into the [ArchiveBatchFSelect] that resides in the [FSelectInstanceBatchSingleCrit] / [FSelectInstanceBatchMultiCrit]. 68 | #' The result will be written into the instance object. 69 | #' 70 | #' @param inst ([FSelectInstanceBatchSingleCrit] | [FSelectInstanceBatchMultiCrit]). 71 | #' 72 | #' @return [data.table::data.table()]. 
73 | optimize = function(inst) { 74 | assert_fselect_instance_batch(inst) 75 | if ("requires_model" %in% self$properties) inst$objective$.__enclos_env__$private$.model_required = TRUE 76 | result = optimize_batch_default(inst, self) 77 | inst$objective$.__enclos_env__$private$.xss = NULL 78 | inst$objective$.__enclos_env__$private$.design = NULL 79 | inst$objective$.__enclos_env__$private$.benchmark_result = NULL 80 | inst$objective$.__enclos_env__$private$.aggregated_performance = NULL 81 | return(result) 82 | } 83 | ) 84 | ) 85 | -------------------------------------------------------------------------------- /man/CallbackBatchFSelect.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/CallbackBatchFSelect.R 3 | \name{CallbackBatchFSelect} 4 | \alias{CallbackBatchFSelect} 5 | \title{Create Feature Selection Callback} 6 | \description{ 7 | Specialized \link[bbotk:CallbackBatch]{bbotk::CallbackBatch} for feature selection. 8 | Callbacks allow customizing the behavior of processes in mlr3fselect. 9 | The \code{\link[=callback_batch_fselect]{callback_batch_fselect()}} function creates a \link{CallbackBatchFSelect}. 10 | Predefined callbacks are stored in the \link[mlr3misc:Dictionary]{dictionary} \link{mlr_callbacks} and can be retrieved with \code{\link[=clbk]{clbk()}}. 11 | For more information on callbacks see \code{\link[=callback_batch_fselect]{callback_batch_fselect()}}. 12 | } 13 | \examples{ 14 | # Write archive to disk 15 | callback_batch_fselect("mlr3fselect.backup", 16 | on_optimization_end = function(callback, context) { 17 | saveRDS(context$instance$archive, "archive.rds") 18 | } 19 | ) 20 | } 21 | \section{Super classes}{ 22 | \code{\link[mlr3misc:Callback]{mlr3misc::Callback}} -> \code{\link[bbotk:CallbackBatch]{bbotk::CallbackBatch}} -> \code{CallbackBatchFSelect} 23 | } 24 | \section{Public fields}{ 25 | \if{html}{\out{
}} 26 | \describe{ 27 | \item{\code{on_eval_after_design}}{(\verb{function()})\cr 28 | Stage called after design is created. 29 | Called in \code{ObjectiveFSelectBatch$eval_many()}.} 30 | 31 | \item{\code{on_eval_after_benchmark}}{(\verb{function()})\cr 32 | Stage called after feature sets are evaluated. 33 | Called in \code{ObjectiveFSelectBatch$eval_many()}.} 34 | 35 | \item{\code{on_eval_before_archive}}{(\verb{function()})\cr 36 | Stage called before performance values are written to the archive. 37 | Called in \code{ObjectiveFSelectBatch$eval_many()}.} 38 | 39 | \item{\code{on_auto_fselector_before_final_model}}{(\verb{function()})\cr 40 | Stage called before the final model is trained. 41 | Called in \code{AutoFSelector$train()}. 42 | This stage is called after the optimization has finished and the final model is trained with the best feature set found.} 43 | 44 | \item{\code{on_auto_fselector_after_final_model}}{(\verb{function()})\cr 45 | Stage called after the final model is trained. 46 | Called in \code{AutoFSelector$train()}. 47 | This stage is called after the final model is trained with the best feature set found.} 48 | } 49 | \if{html}{\out{
}} 50 | } 51 | \section{Methods}{ 52 | \subsection{Public methods}{ 53 | \itemize{ 54 | \item \href{#method-CallbackBatchFSelect-clone}{\code{CallbackBatchFSelect$clone()}} 55 | } 56 | } 57 | \if{html}{\out{ 58 |
Inherited methods 59 | 66 |
67 | }} 68 | \if{html}{\out{
}} 69 | \if{html}{\out{}} 70 | \if{latex}{\out{\hypertarget{method-CallbackBatchFSelect-clone}{}}} 71 | \subsection{Method \code{clone()}}{ 72 | The objects of this class are cloneable with this method. 73 | \subsection{Usage}{ 74 | \if{html}{\out{
}}\preformatted{CallbackBatchFSelect$clone(deep = FALSE)}\if{html}{\out{
}} 75 | } 76 | 77 | \subsection{Arguments}{ 78 | \if{html}{\out{
}} 79 | \describe{ 80 | \item{\code{deep}}{Whether to make a deep clone.} 81 | } 82 | \if{html}{\out{
}} 83 | } 84 | } 85 | } 86 | -------------------------------------------------------------------------------- /man/CallbackAsyncFSelect.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/CallbackAsyncFSelect.R 3 | \name{CallbackAsyncFSelect} 4 | \alias{CallbackAsyncFSelect} 5 | \title{Asynchronous Feature Selection Callback} 6 | \description{ 7 | Specialized \link[bbotk:CallbackAsync]{bbotk::CallbackAsync} for asynchronous feature selection. 8 | Callbacks allow customizing the behavior of processes in mlr3fselect. 9 | The \code{\link[=callback_async_fselect]{callback_async_fselect()}} function creates a \link{CallbackAsyncFSelect}. 10 | Predefined callbacks are stored in the \link[mlr3misc:Dictionary]{dictionary} \link{mlr_callbacks} and can be retrieved with \code{\link[=clbk]{clbk()}}. 11 | For more information on feature selection callbacks see \code{\link[=callback_async_fselect]{callback_async_fselect()}}. 12 | } 13 | \section{Super classes}{ 14 | \code{\link[mlr3misc:Callback]{mlr3misc::Callback}} -> \code{\link[bbotk:CallbackAsync]{bbotk::CallbackAsync}} -> \code{CallbackAsyncFSelect} 15 | } 16 | \section{Public fields}{ 17 | \if{html}{\out{
}} 18 | \describe{ 19 | \item{\code{on_eval_after_xs}}{(\verb{function()})\cr 20 | Stage called after xs is passed. 21 | Called in \code{ObjectiveFSelectAsync$eval()}.} 22 | 23 | \item{\code{on_resample_begin}}{(\verb{function()})\cr 24 | Stage called at the beginning of an evaluation. 25 | Called in \code{workhorse()} (internal).} 26 | 27 | \item{\code{on_resample_before_train}}{(\verb{function()})\cr 28 | Stage called before training the learner. 29 | Called in \code{workhorse()} (internal).} 30 | 31 | \item{\code{on_resample_before_predict}}{(\verb{function()})\cr 32 | Stage called before predicting. 33 | Called in \code{workhorse()} (internal).} 34 | 35 | \item{\code{on_resample_end}}{(\verb{function()})\cr 36 | Stage called at the end of an evaluation. 37 | Called in \code{workhorse()} (internal).} 38 | 39 | \item{\code{on_eval_after_resample}}{(\verb{function()})\cr 40 | Stage called after feature subsets are evaluated. 41 | Called in \code{ObjectiveFSelectAsync$eval()}.} 42 | 43 | \item{\code{on_eval_before_archive}}{(\verb{function()})\cr 44 | Stage called before performance values are written to the archive. 45 | Called in \code{ObjectiveFSelectAsync$eval()}.} 46 | 47 | \item{\code{on_fselect_result_begin}}{(\verb{function()})\cr 48 | Stage called before the results are written. 49 | Called in \verb{FSelectInstance*$assign_result()}.} 50 | } 51 | \if{html}{\out{
}} 52 | } 53 | \section{Methods}{ 54 | \subsection{Public methods}{ 55 | \itemize{ 56 | \item \href{#method-CallbackAsyncFSelect-clone}{\code{CallbackAsyncFSelect$clone()}} 57 | } 58 | } 59 | \if{html}{\out{ 60 |
Inherited methods 61 | 68 |
69 | }} 70 | \if{html}{\out{
}} 71 | \if{html}{\out{}} 72 | \if{latex}{\out{\hypertarget{method-CallbackAsyncFSelect-clone}{}}} 73 | \subsection{Method \code{clone()}}{ 74 | The objects of this class are cloneable with this method. 75 | \subsection{Usage}{ 76 | \if{html}{\out{
}}\preformatted{CallbackAsyncFSelect$clone(deep = FALSE)}\if{html}{\out{
}} 77 | } 78 | 79 | \subsection{Arguments}{ 80 | \if{html}{\out{
}} 81 | \describe{ 82 | \item{\code{deep}}{Whether to make a deep clone.} 83 | } 84 | \if{html}{\out{
}} 85 | } 86 | } 87 | } 88 | -------------------------------------------------------------------------------- /R/FSelectInstanceAsyncSingleCrit.R: -------------------------------------------------------------------------------- 1 | #' @title Single Criterion Feature Selection with Rush 2 | #' 3 | #' @description 4 | #' The `FSelectInstanceAsyncSingleCrit` specifies a feature selection problem for a [FSelectorAsync]. 5 | #' The function [fsi_async()] creates a [FSelectInstanceAsyncSingleCrit] and the function [fselect()] creates an instance internally. 6 | #' 7 | #' @inheritSection FSelectInstanceBatchSingleCrit Default Measures 8 | #' @inheritSection ArchiveAsyncFSelect Analysis 9 | #' @inheritSection FSelectInstanceBatchSingleCrit Resources 10 | #' 11 | #' @template param_task 12 | #' @template param_learner 13 | #' @template param_resampling 14 | #' @template param_measure 15 | #' @template param_terminator 16 | #' @template param_store_benchmark_result 17 | #' @template param_store_models 18 | #' @template param_check_values 19 | #' @template param_callbacks 20 | #' @template param_rush 21 | #' @template param_ties_method 22 | #' 23 | #' @template param_xdt 24 | #' @template param_extra 25 | #' 26 | #' @export 27 | FSelectInstanceAsyncSingleCrit = R6Class("FSelectInstanceAsyncSingleCrit", 28 | inherit = OptimInstanceAsyncSingleCrit, 29 | public = list( 30 | 31 | #' @description 32 | #' Creates a new instance of this [R6][R6::R6Class] class. 
33 | initialize = function( 34 | task, 35 | learner, 36 | resampling, 37 | measure = NULL, 38 | terminator, 39 | store_benchmark_result = TRUE, 40 | store_models = FALSE, 41 | check_values = FALSE, 42 | callbacks = NULL, 43 | ties_method = "least_features", 44 | rush = NULL 45 | ) { 46 | require_namespaces("rush") 47 | learner = assert_learner(as_learner(learner, clone = TRUE)) 48 | callbacks = assert_async_fselect_callbacks(as_callbacks(callbacks)) 49 | 50 | if (is.null(rush)) rush = rush::rsh() 51 | 52 | # create codomain from measure 53 | measures = assert_measures(as_measures(measure, task_type = task$task_type), task = task, learner = learner) 54 | codomain = measures_to_codomain(measures) 55 | 56 | # create search space from task 57 | search_space = task_to_domain(task) 58 | 59 | archive = ArchiveAsyncFSelect$new( 60 | search_space = search_space, 61 | codomain = codomain, 62 | rush = rush, 63 | ties_method = ties_method 64 | ) 65 | 66 | objective = ObjectiveFSelectAsync$new( 67 | task = task, 68 | learner = learner, 69 | resampling = resampling, 70 | measures = measures, 71 | store_benchmark_result = store_benchmark_result, 72 | store_models = store_models, 73 | check_values = check_values, 74 | callbacks = callbacks) 75 | 76 | super$initialize( 77 | objective, 78 | search_space, 79 | terminator, 80 | callbacks = callbacks, 81 | rush = rush, 82 | archive = archive) 83 | }, 84 | 85 | #' @description 86 | #' The [FSelectorAsync] object writes the best found point and estimated performance value here. 87 | #' For internal use. 88 | #' 89 | #' @param y (`numeric(1)`)\cr 90 | #' Optimal outcome. 91 | #' @param ... (`any`)\cr 92 | #' ignored. 93 | assign_result = function(xdt, y, extra = NULL, ...) 
{ 94 | # add feature names to result for easy task subsetting 95 | feature_names = self$objective$task$feature_names 96 | features = list(feature_names[as.logical(xdt[, feature_names, with = FALSE])]) 97 | set(xdt, j = "features", value = list(features)) 98 | set(xdt, j = "n_features", value = length(features[[1L]])) 99 | 100 | # assign for callbacks 101 | private$.result_xdt = xdt 102 | private$.result_y = y 103 | private$.result_extra = extra 104 | 105 | call_back("on_fselect_result_begin", self$objective$callbacks, self$objective$context) 106 | 107 | super$assign_result(private$.result_xdt, private$.result_y) 108 | if (!is.null(private$.result$x_domain)) set(private$.result, j = "x_domain", value = NULL) 109 | } 110 | ), 111 | 112 | private = list( 113 | # initialize context for optimization 114 | .initialize_context = function(optimizer) { 115 | context = ContextAsyncFSelect$new(self, optimizer) 116 | self$objective$context = context 117 | } 118 | ) 119 | ) 120 | -------------------------------------------------------------------------------- /R/FSelectInstanceAsyncMultiCrit.R: -------------------------------------------------------------------------------- 1 | #' @title Multi-Criteria Feature Selection with Rush 2 | #' 3 | #' @include FSelectInstanceAsyncSingleCrit.R ArchiveAsyncFSelect.R 4 | #' 5 | #' @description 6 | #' The `FSelectInstanceAsyncMultiCrit` specifies a feature selection problem for a [FSelectorAsync]. 7 | #' The function [fsi_async()] creates a [FSelectInstanceAsyncMultiCrit] and the function [fselect()] creates an instance internally. 
8 | #' 9 | #' @inheritSection FSelectInstanceBatchSingleCrit Default Measures 10 | #' @inheritSection ArchiveAsyncFSelect Analysis 11 | #' @inheritSection FSelectInstanceBatchSingleCrit Resources 12 | #' 13 | #' @template param_task 14 | #' @template param_learner 15 | #' @template param_resampling 16 | #' @template param_measures 17 | #' @template param_terminator 18 | #' @template param_store_benchmark_result 19 | #' @template param_store_models 20 | #' @template param_check_values 21 | #' @template param_callbacks 22 | #' @template param_rush 23 | #' 24 | #' @template param_xdt 25 | #' @template param_extra 26 | #' 27 | #' @export 28 | FSelectInstanceAsyncMultiCrit = R6Class("FSelectInstanceAsyncMultiCrit", 29 | inherit = OptimInstanceAsyncMultiCrit, 30 | public = list( 31 | 32 | #' @description 33 | #' Creates a new instance of this [R6][R6::R6Class] class. 34 | initialize = function( 35 | task, 36 | learner, 37 | resampling, 38 | measures, 39 | terminator, 40 | store_benchmark_result = TRUE, 41 | store_models = FALSE, 42 | check_values = FALSE, 43 | callbacks = NULL, 44 | rush = NULL 45 | ) { 46 | require_namespaces("rush") 47 | learner = assert_learner(as_learner(learner, clone = TRUE)) 48 | callbacks = assert_async_fselect_callbacks(as_callbacks(callbacks)) 49 | 50 | if (is.null(rush)) rush = rush::rsh() 51 | 52 | # create codomain from measures 53 | measures = assert_measures(as_measures(measures, task_type = task$task_type), task = task, learner = learner) 54 | codomain = measures_to_codomain(measures) 55 | 56 | # create search space from task 57 | search_space = task_to_domain(task) 58 | 59 | archive = ArchiveAsyncFSelect$new( 60 | search_space = search_space, 61 | codomain = codomain, 62 | rush = rush 63 | ) 64 | 65 | objective = ObjectiveFSelectAsync$new( 66 | task = task, 67 | learner = learner, 68 | resampling = resampling, 69 | measures = measures, 70 | store_benchmark_result = store_benchmark_result, 71 | store_models = store_models, 72 | 
check_values = check_values, 73 | callbacks = callbacks) 74 | 75 | super$initialize( 76 | objective = objective, 77 | search_space = search_space, 78 | terminator = terminator, 79 | callbacks = callbacks, 80 | archive = archive, 81 | rush = rush) 82 | }, 83 | 84 | #' @description 85 | #' The [FSelectorAsync] object writes the best found points and estimated performance values here (probably the Pareto set / front). 86 | #' For internal use. 87 | #' 88 | #' @param ydt (`numeric()`)\cr 89 | #' Optimal outcomes, e.g. the Pareto front. 90 | #' @param ... (`any`)\cr 91 | #' ignored. 92 | assign_result = function(xdt, ydt, extra = NULL, ...) { 93 | # add feature names to result for easy task subsetting 94 | feature_names = self$objective$task$feature_names 95 | features = map(seq_len(nrow(xdt)), function(i) { 96 | feature_names[as.logical(xdt[i, feature_names, with = FALSE])] 97 | }) 98 | set(xdt, j = "features", value = list(features)) 99 | set(xdt, j = "n_features", value = map_int(features, length)) 100 | 101 | # assign for callbacks 102 | private$.result_xdt = xdt 103 | private$.result_ydt = ydt 104 | private$.result_extra = extra 105 | 106 | call_back("on_fselect_result_begin", self$objective$callbacks, self$objective$context) 107 | 108 | super$assign_result(private$.result_xdt, private$.result_ydt) 109 | if (!is.null(private$.result$x_domain)) set(private$.result, j = "x_domain", value = NULL) 110 | } 111 | ), 112 | 113 | private = list( 114 | # initialize context for optimization 115 | .initialize_context = function(optimizer) { 116 | context = ContextAsyncFSelect$new(self, optimizer) 117 | self$objective$context = context 118 | } 119 | ) 120 | ) 121 | -------------------------------------------------------------------------------- /R/embedded_ensemble_fselect.R: -------------------------------------------------------------------------------- 1 | #' @title Embedded Ensemble Feature Selection 2 | #' 3 | #' @include CallbackBatchFSelect.R 4 | #' 5 | #' @description 
6 | #' Ensemble feature selection using multiple learners. 7 | #' The ensemble feature selection method is designed to identify the most predictive features from a given dataset by leveraging multiple machine learning models and resampling techniques. 8 | #' Returns an [EnsembleFSResult]. 9 | #' 10 | #' @details 11 | #' The method begins by applying an initial resampling technique specified by the user, to create **multiple subsamples** from the original dataset (train/test splits). 12 | #' This resampling process helps in generating diverse subsets of data for robust feature selection. 13 | #' 14 | #' For each subsample (train set) generated in the previous step, the method applies learners 15 | #' that support **embedded feature selection**. 16 | #' These learners are then scored on their ability to predict on the resampled 17 | #' test sets, storing the selected features during training, for each 18 | #' combination of subsample and learner. 19 | #' 20 | #' Results are stored in an [EnsembleFSResult]. 21 | #' 22 | #' @param learners (list of [mlr3::Learner])\cr 23 | #' The learners to be used for feature selection. 24 | #' All learners must have the `selected_features` property, i.e. implement 25 | #' embedded feature selection (e.g. regularized models). 26 | #' @param init_resampling ([mlr3::Resampling])\cr 27 | #' The initial resampling strategy of the data, from which each train set 28 | #' will be passed on to the learners and each test set will be used for 29 | #' prediction. 30 | #' Can only be [mlr3::ResamplingSubsampling] or [mlr3::ResamplingBootstrap]. 31 | #' @param measure ([mlr3::Measure])\cr 32 | #' The measure used to score each learner on the test sets generated by 33 | #' `init_resampling`. 34 | #' If `NULL`, default measure is used. 35 | #' @param store_benchmark_result (`logical(1)`)\cr 36 | #' Whether to store the benchmark result in [EnsembleFSResult] or not. 
37 | #' 38 | #' @template param_task 39 | #' 40 | #' @returns an [EnsembleFSResult] object. 41 | #' 42 | #' @source 43 | #' `r format_bib("meinshausen2010", "hedou2024")` 44 | #' @export 45 | #' @examples 46 | #' \donttest{ 47 | #' eefsr = embedded_ensemble_fselect( 48 | #' task = tsk("sonar"), 49 | #' learners = lrns(c("classif.rpart", "classif.featureless")), 50 | #' init_resampling = rsmp("subsampling", repeats = 5), 51 | #' measure = msr("classif.ce") 52 | #' ) 53 | #' eefsr 54 | #' } 55 | embedded_ensemble_fselect = function( 56 | task, 57 | learners, 58 | init_resampling, 59 | measure, 60 | store_benchmark_result = TRUE 61 | ) { 62 | assert_task(task) 63 | assert_learners(as_learners(learners), task = task, properties = "selected_features") 64 | assert_resampling(init_resampling) 65 | assert_choice(class(init_resampling)[1], choices = c("ResamplingBootstrap", "ResamplingSubsampling")) 66 | assert_measure(measure, task = task) 67 | assert_flag(store_benchmark_result) 68 | 69 | init_resampling$instantiate(task) 70 | 71 | design = benchmark_grid( 72 | tasks = task, 73 | learners = learners, 74 | resamplings = init_resampling 75 | ) 76 | 77 | bmr = benchmark(design, store_models = TRUE) 78 | 79 | trained_learners = bmr$score()$learner 80 | 81 | # extract selected features 82 | features = map(trained_learners, function(learner) { 83 | learner$selected_features() 84 | }) 85 | 86 | # extract n_features 87 | n_features = map_int(features, length) 88 | 89 | # extract scores on the test sets 90 | scores = bmr$score(measure) 91 | # remove `bmr_score` class 92 | class(scores) = c("data.table", "data.frame") 93 | 94 | set(scores, j = "features", value = features) 95 | set(scores, j = "n_features", value = n_features) 96 | setnames(scores, "iteration", "resampling_iteration") 97 | 98 | # remove R6 objects 99 | set(scores, j = "learner", value = NULL) 100 | set(scores, j = "task", value = NULL) 101 | set(scores, j = "resampling", value = NULL) 102 | set(scores, j = 
"prediction_test", value = NULL) 103 | set(scores, j = "task_id", value = NULL) 104 | set(scores, j = "nr", value = NULL) 105 | set(scores, j = "resampling_id", value = NULL) 106 | set(scores, j = "uhash", value = NULL) 107 | 108 | EnsembleFSResult$new( 109 | result = scores, 110 | features = task$feature_names, 111 | benchmark_result = if (store_benchmark_result) bmr, 112 | measure = measure 113 | ) 114 | } 115 | -------------------------------------------------------------------------------- /tests/testthat/test_embedded_ensemble_fselect.R: -------------------------------------------------------------------------------- 1 | test_that("embedded efs works", { 2 | task = tsk("sonar") 3 | with_seed(42, { 4 | efsr = embedded_ensemble_fselect( 5 | task = task, 6 | learners = lrns(c("classif.rpart", "classif.featureless")), 7 | init_resampling = rsmp("subsampling", repeats = 5), 8 | measure = msr("classif.ce") 9 | ) 10 | }) 11 | 12 | expect_character(efsr$man) 13 | expect_data_table(efsr$result, nrows = 10) 14 | expect_list(efsr$result$features, any.missing = FALSE, len = 10) 15 | expect_numeric(efsr$result$n_features, len = 10) 16 | expect_numeric(efsr$result$classif.ce, len = 10) 17 | expect_benchmark_result(efsr$benchmark_result) 18 | expect_measure(efsr$measure) 19 | expect_equal(efsr$measure$id, "classif.ce") 20 | expect_true(efsr$measure$minimize) # classification error 21 | expect_equal(efsr$n_learners, 2) 22 | expect_equal(efsr$n_resamples, 5) 23 | 24 | # stability 25 | expect_number(efsr$stability(stability_measure = "jaccard", stability_args = list(impute.na = 0))) 26 | expect_error(efsr$stability(stability_args = list(20)), "have names") 27 | stability = efsr$stability(stability_measure = "jaccard", stability_args = list(impute.na = 0), global = FALSE) 28 | expect_numeric(stability, len = 2) 29 | expect_equal(names(stability), c("classif.rpart", "classif.featureless")) 30 | 31 | # pareto_front 32 | pf = efsr$pareto_front() 33 | expect_data_table(pf, nrows = 
7) 34 | expect_equal(names(pf), c("n_features", "classif.ce")) 35 | pf_pred = efsr$pareto_front(type = "estimated") 36 | expect_data_table(pf_pred, nrows = max(efsr$result$n_features)) 37 | expect_equal(names(pf_pred), c("n_features", "classif.ce")) 38 | 39 | # knee_points 40 | kps = efsr$knee_points() 41 | expect_data_table(kps, nrows = 1) 42 | expect_equal(names(kps), c("n_features", "classif.ce")) 43 | kpse = efsr$knee_points(type = "estimated") 44 | expect_data_table(kpse, nrows = 1) 45 | expect_true(kps$n_features != kpse$n_features) 46 | 47 | # data.table conversion 48 | tab = as.data.table(efsr) 49 | expect_equal(names(tab), c("learner_id", "resampling_iteration", "classif.ce", 50 | "features", "n_features", 51 | "task", "learner", "resampling")) 52 | 53 | # cannot change to use inner_measure 54 | expect_error(efsr$set_active_measure(which = "inner"), "No inner_measure was defined") 55 | # changing to "outer" leaves us with the same measure 56 | efsr$set_active_measure(which = "outer") 57 | expect_equal(efsr$measure$id, "classif.ce") # classification error 58 | 59 | # default feature ranking 60 | skip_if_not_installed("fastVoteR") 61 | feature_ranking = efsr$feature_ranking() 62 | expect_data_table(feature_ranking, nrows = length(task$feature_names)) 63 | expect_equal(names(feature_ranking), c("feature", "score", "norm_score", "borda_score")) 64 | }) 65 | 66 | test_that("combine embedded efs results", { 67 | task = tsk("sonar") 68 | with_seed(42, { 69 | efsr1 = embedded_ensemble_fselect( 70 | task = task, 71 | learners = lrns(c("classif.rpart", "classif.featureless")), 72 | init_resampling = rsmp("subsampling", repeats = 2), 73 | measure = msr("classif.ce") 74 | ) 75 | }) 76 | 77 | with_seed(43, { 78 | efsr2 = embedded_ensemble_fselect( 79 | task = task, 80 | learners = lrns(c("classif.rpart", "classif.featureless")), 81 | init_resampling = rsmp("subsampling", repeats = 3), 82 | measure = msr("classif.ce") 83 | ) 84 | }) 85 | 86 | comb1 = efsr1$clone(deep = 
TRUE)$combine(efsr2) 87 | comb2 = c(efsr1, efsr2) 88 | 89 | expect_class(comb1, "EnsembleFSResult") 90 | expect_class(comb2, "EnsembleFSResult") 91 | expect_data_table(comb1$result, nrows = 10L) 92 | expect_data_table(comb2$result, nrows = 10L) 93 | expect_equal(comb1$n_learners, 2L) 94 | expect_equal(comb2$n_learners, 2L) 95 | expect_equal(get_private(comb1)$.measure$id, "classif.ce") 96 | expect_equal(get_private(comb2)$.measure$id, "classif.ce") 97 | expect_null(get_private(comb1)$.inner_measure) 98 | expect_null(get_private(comb2)$.inner_measure) 99 | assert_benchmark_result(comb1$benchmark_result) 100 | assert_benchmark_result(comb2$benchmark_result) 101 | expect_equal(comb1$benchmark_result$n_resample_results, 4L) 102 | expect_equal(comb2$benchmark_result$n_resample_results, 4L) 103 | expect_equal(nrow(get_private(comb1$benchmark_result)$.data$data$fact), 10L) 104 | expect_equal(nrow(get_private(comb2$benchmark_result)$.data$data$fact), 10L) 105 | }) 106 | -------------------------------------------------------------------------------- /R/FSelectorBatchSequential.R: -------------------------------------------------------------------------------- 1 | #' @title Feature Selection with Sequential Search 2 | #' 3 | #' @include mlr_fselectors.R 4 | #' @name mlr_fselectors_sequential 5 | #' 6 | #' @description 7 | #' Feature selection using the Sequential Search algorithm. 8 | #' 9 | #' @details 10 | #' Sequential forward selection (`strategy = sfs`) extends the feature set in each iteration with the feature that increases the model's performance the most. 11 | #' Sequential backward selection (`strategy = sbs`) follows the same idea but starts with all features and removes features from the set. 12 | #' 13 | #' The feature selection terminates itself when `min_features` or `max_features` is reached. 14 | #' It is not necessary to set a termination criterion. 
15 | #' 16 | #' @templateVar id sequential 17 | #' @template section_dictionary_fselectors 18 | #' 19 | #' @section Control Parameters: 20 | #' \describe{ 21 | #' \item{`min_features`}{`integer(1)`\cr 22 | #' Minimum number of features. By default, 1.} 23 | #' \item{`max_features`}{`integer(1)`\cr 24 | #' Maximum number of features. By default, the number of features in [mlr3::Task].} 25 | #' \item{`strategy`}{`character(1)`\cr 26 | #' Search method `sfs` (forward search) or `sbs` (backward search).} 27 | #' } 28 | #' 29 | #' @family FSelector 30 | #' @export 31 | #' @template example 32 | FSelectorBatchSequential = R6Class("FSelectorBatchSequential", 33 | inherit = FSelectorBatch, 34 | public = list( 35 | 36 | #' @description 37 | #' Creates a new instance of this [R6][R6::R6Class] class. 38 | initialize = function() { 39 | ps = ps( 40 | min_features = p_int(lower = 1, default = 1), 41 | max_features = p_int(lower = 1), 42 | strategy = p_fct(levels = c("sfs", "sbs"), default = "sfs") 43 | ) 44 | ps$values = list(strategy = "sfs", min_features = 1) 45 | 46 | super$initialize( 47 | id = "sequential", 48 | param_set = ps, 49 | properties = "single-crit", 50 | label = "Sequential Search", 51 | man = "mlr3fselect::mlr_fselectors_sequential" 52 | ) 53 | }, 54 | 55 | #' @description 56 | #' Returns the optimization path. 57 | #' 58 | #' @param inst ([FSelectInstanceBatchSingleCrit])\cr 59 | #' Instance optimized with [FSelectorBatchSequential]. 60 | #' @param include_uhash (`logical(1)`)\cr 61 | #' Include `uhash` column? 
62 | #' 63 | #' @return [data.table::data.table()] 64 | optimization_path = function(inst, include_uhash = FALSE) { 65 | archive = inst$archive 66 | if (archive$n_batch == 0L) { 67 | stop("No results stored in archive") 68 | } 69 | uhash = if (include_uhash) "uhash" else NULL 70 | res = archive$data[, head(.SD, 1), by = get("batch_nr")] 71 | res[, c(archive$cols_x, archive$cols_y, "batch_nr", uhash), with = FALSE] 72 | } 73 | ), 74 | private = list( 75 | .optimize = function(inst) { 76 | 77 | pars = self$param_set$values 78 | archive = inst$archive 79 | feature_names = inst$archive$cols_x 80 | 81 | if (is.null(pars$max_features)) { 82 | pars$max_features = length(feature_names) 83 | } 84 | 85 | # Initialize states for first batch 86 | m = if (self$param_set$values$strategy == "sfs") pars$min_features else pars$max_features 87 | combinations = combn(length(feature_names), m) 88 | states = map_dtr(seq_len(ncol(combinations)), function(j) { 89 | state = rep(FALSE, length(feature_names)) 90 | state[combinations[, j]] = TRUE 91 | set_names(as.list(state), feature_names) 92 | }) 93 | 94 | inst$eval_batch(states) 95 | 96 | repeat({ 97 | if (archive$n_batch == pars$max_features - pars$min_features + 1) break 98 | 99 | res = archive$best(batch = archive$n_batch) 100 | best_state = as.logical(res[, feature_names, with = FALSE]) 101 | 102 | # Generate new states based on best feature set 103 | x = ifelse(pars$strategy == "sfs", FALSE, TRUE) 104 | y = ifelse(pars$strategy == "sfs", TRUE, FALSE) 105 | z = if (pars$strategy == "sfs") !best_state else best_state 106 | 107 | states = map_dtr(seq_along(best_state)[z], function(i) { 108 | if (best_state[i] == x) { 109 | new_state = best_state 110 | new_state[i] = y 111 | set_names(as.list(new_state), feature_names) 112 | } 113 | }) 114 | 115 | inst$eval_batch(states) 116 | }) 117 | } 118 | ) 119 | ) 120 | 121 | mlr_fselectors$add("sequential", FSelectorBatchSequential) 122 | 
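A minimal usage sketch for the sequential search defined above, not part of the package sources. It assumes the `fselect()` helper and the `fs()` sugar function behave as documented elsewhere in this repository, and it uses the `iris` task and the `classif.rpart` learner from mlr3 purely for illustration. No terminator is passed because, per the documentation above, the search stops on its own once `max_features` is reached.

```r
library(mlr3)
library(mlr3fselect)

# Sequential forward selection ("sfs"), capped at three features.
# Task, learner and resampling are illustrative choices only.
fselector = fs("sequential", strategy = "sfs", max_features = 3)

instance = fselect(
  fselector = fselector,
  task = tsk("iris"),
  learner = lrn("classif.rpart"),
  resampling = rsmp("holdout"),
  measures = msr("classif.ce")
)

# Feature subset stored as the result of the search
instance$result_feature_set

# Optimization path as returned by the method defined above
fselector$optimization_path(instance)
```

Switching to `strategy = "sbs"` starts from `max_features` features and removes one feature per batch until `min_features` is reached.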
-------------------------------------------------------------------------------- /man/mlr_fselectors_async_exhaustive_search.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/FSelectorAsyncExhaustiveSearch.R 3 | \name{mlr_fselectors_async_exhaustive_search} 4 | \alias{mlr_fselectors_async_exhaustive_search} 5 | \alias{FSelectorAsyncExhaustiveSearch} 6 | \title{Feature Selection with Asynchronous Exhaustive Search} 7 | \description{ 8 | Feature Selection using the Asynchronous Exhaustive Search Algorithm. 9 | Exhaustive Search generates all possible feature sets. 10 | The feature sets are evaluated asynchronously. 11 | } 12 | \details{ 13 | The feature selection terminates itself when all feature sets are evaluated. 14 | It is not necessary to set a termination criterion. 15 | } 16 | \section{Dictionary}{ 17 | 18 | This \link{FSelector} can be instantiated with the associated sugar function \code{\link[=fs]{fs()}}: 19 | 20 | \if{html}{\out{
}}\preformatted{fs("async_exhaustive_search") 21 | }\if{html}{\out{
}} 22 | } 23 | 24 | \section{Control Parameters}{ 25 | 26 | \describe{ 27 | \item{\code{max_features}}{\code{integer(1)}\cr 28 | Maximum number of features. 29 | By default, number of features in \link[mlr3:Task]{mlr3::Task}.} 30 | } 31 | } 32 | 33 | \seealso{ 34 | Other FSelectorAsync: 35 | \code{\link{mlr_fselectors_async_design_points}}, 36 | \code{\link{mlr_fselectors_async_random_search}} 37 | } 38 | \concept{FSelectorAsync} 39 | \section{Super classes}{ 40 | \code{\link[mlr3fselect:FSelector]{mlr3fselect::FSelector}} -> \code{\link[mlr3fselect:FSelectorAsync]{mlr3fselect::FSelectorAsync}} -> \code{FSelectorAsyncExhaustiveSearch} 41 | } 42 | \section{Methods}{ 43 | \subsection{Public methods}{ 44 | \itemize{ 45 | \item \href{#method-FSelectorAsyncExhaustiveSearch-new}{\code{FSelectorAsyncExhaustiveSearch$new()}} 46 | \item \href{#method-FSelectorAsyncExhaustiveSearch-optimize}{\code{FSelectorAsyncExhaustiveSearch$optimize()}} 47 | \item \href{#method-FSelectorAsyncExhaustiveSearch-clone}{\code{FSelectorAsyncExhaustiveSearch$clone()}} 48 | } 49 | } 50 | \if{html}{\out{ 51 |
Inherited methods 52 | 57 |
58 | }} 59 | \if{html}{\out{
}} 60 | \if{html}{\out{}} 61 | \if{latex}{\out{\hypertarget{method-FSelectorAsyncExhaustiveSearch-new}{}}} 62 | \subsection{Method \code{new()}}{ 63 | Creates a new instance of this \link[R6:R6Class]{R6} class. 64 | \subsection{Usage}{ 65 | \if{html}{\out{
}}\preformatted{FSelectorAsyncExhaustiveSearch$new()}\if{html}{\out{
}} 66 | } 67 | 68 | } 69 | \if{html}{\out{
}} 70 | \if{html}{\out{}} 71 | \if{latex}{\out{\hypertarget{method-FSelectorAsyncExhaustiveSearch-optimize}{}}} 72 | \subsection{Method \code{optimize()}}{ 73 | Starts the asynchronous optimization. 74 | \subsection{Usage}{ 75 | \if{html}{\out{
}}\preformatted{FSelectorAsyncExhaustiveSearch$optimize(inst)}\if{html}{\out{
}} 76 | } 77 | 78 | \subsection{Arguments}{ 79 | \if{html}{\out{
}} 80 | \describe{ 81 | \item{\code{inst}}{(\link{FSelectInstanceAsyncSingleCrit} | \link{FSelectInstanceAsyncMultiCrit}).} 82 | } 83 | \if{html}{\out{
}} 84 | } 85 | \subsection{Returns}{ 86 | \link[data.table:data.table]{data.table::data.table}. 87 | } 88 | } 89 | \if{html}{\out{
}} 90 | \if{html}{\out{}} 91 | \if{latex}{\out{\hypertarget{method-FSelectorAsyncExhaustiveSearch-clone}{}}} 92 | \subsection{Method \code{clone()}}{ 93 | The objects of this class are cloneable with this method. 94 | \subsection{Usage}{ 95 | \if{html}{\out{
}}\preformatted{FSelectorAsyncExhaustiveSearch$clone(deep = FALSE)}\if{html}{\out{
}} 96 | } 97 | 98 | \subsection{Arguments}{ 99 | \if{html}{\out{
}} 100 | \describe{ 101 | \item{\code{deep}}{Whether to make a deep clone.} 102 | } 103 | \if{html}{\out{
}} 104 | } 105 | } 106 | } 107 | -------------------------------------------------------------------------------- /man/callback_batch_fselect.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/CallbackBatchFSelect.R 3 | \name{callback_batch_fselect} 4 | \alias{callback_batch_fselect} 5 | \title{Create Feature Selection Callback} 6 | \usage{ 7 | callback_batch_fselect( 8 | id, 9 | label = NA_character_, 10 | man = NA_character_, 11 | on_optimization_begin = NULL, 12 | on_optimizer_before_eval = NULL, 13 | on_eval_after_design = NULL, 14 | on_eval_after_benchmark = NULL, 15 | on_eval_before_archive = NULL, 16 | on_optimizer_after_eval = NULL, 17 | on_result = NULL, 18 | on_optimization_end = NULL, 19 | on_auto_fselector_before_final_model = NULL, 20 | on_auto_fselector_after_final_model = NULL 21 | ) 22 | } 23 | \arguments{ 24 | \item{id}{(\code{character(1)})\cr 25 | Identifier for the new instance.} 26 | 27 | \item{label}{(\code{character(1)})\cr 28 | Label for the new instance.} 29 | 30 | \item{man}{(\code{character(1)})\cr 31 | String in the format \verb{[pkg]::[topic]} pointing to a manual page for this object. 32 | The referenced help package can be opened via method \verb{$help()}.} 33 | 34 | \item{on_optimization_begin}{(\verb{function()})\cr 35 | Stage called at the beginning of the optimization. 36 | Called in \code{Optimizer$optimize()}.} 37 | 38 | \item{on_optimizer_before_eval}{(\verb{function()})\cr 39 | Stage called after the optimizer proposes points. 40 | Called in \code{OptimInstance$eval_batch()}.} 41 | 42 | \item{on_eval_after_design}{(\verb{function()})\cr 43 | Stage called after design is created. 44 | Called in \code{ObjectiveFSelectBatch$eval_many()}.} 45 | 46 | \item{on_eval_after_benchmark}{(\verb{function()})\cr 47 | Stage called after feature sets are evaluated. 
48 | Called in \code{ObjectiveFSelectBatch$eval_many()}.} 49 | 50 | \item{on_eval_before_archive}{(\verb{function()})\cr 51 | Stage called before performance values are written to the archive. 52 | Called in \code{ObjectiveFSelectBatch$eval_many()}.} 53 | 54 | \item{on_optimizer_after_eval}{(\verb{function()})\cr 55 | Stage called after points are evaluated. 56 | Called in \code{OptimInstance$eval_batch()}.} 57 | 58 | \item{on_result}{(\verb{function()})\cr 59 | Stage called after the result is written. 60 | Called in \code{OptimInstance$assign_result()}.} 61 | 62 | \item{on_optimization_end}{(\verb{function()})\cr 63 | Stage called at the end of the optimization. 64 | Called in \code{Optimizer$optimize()}.} 65 | 66 | \item{on_auto_fselector_before_final_model}{(\verb{function()})\cr 67 | Stage called before the final model is trained. 68 | Called in \code{AutoFSelector$train()}.} 69 | 70 | \item{on_auto_fselector_after_final_model}{(\verb{function()})\cr 71 | Stage called after the final model is trained. 72 | Called in \code{AutoFSelector$train()}.} 73 | } 74 | \description{ 75 | Function to create a \link{CallbackBatchFSelect}. 76 | Predefined callbacks are stored in the \link[mlr3misc:Dictionary]{dictionary} \link{mlr_callbacks} and can be retrieved with \code{\link[=clbk]{clbk()}}. 77 | 78 | Feature selection callbacks can be called from different stages of feature selection. 79 | The stages are prefixed with \verb{on_*}. 80 | The \verb{on_auto_fselector_*} stages are only available when the callback is used in an \link{AutoFSelector}. 81 | 82 | \if{html}{\out{
}}\preformatted{Start Automatic Feature Selection 83 | Start Feature Selection 84 | - on_optimization_begin 85 | Start FSelect Batch 86 | - on_optimizer_before_eval 87 | Start Evaluation 88 | - on_eval_after_design 89 | - on_eval_after_benchmark 90 | - on_eval_before_archive 91 | End Evaluation 92 | - on_optimizer_after_eval 93 | End FSelect Batch 94 | - on_result 95 | - on_optimization_end 96 | End Feature Selection 97 | - on_auto_fselector_before_final_model 98 | - on_auto_fselector_after_final_model 99 | End Automatic Feature Selection 100 | }\if{html}{\out{
}} 101 | 102 | See also the section on parameters for more information on the stages. 103 | A feature selection callback works with \link[bbotk:ContextBatch]{bbotk::ContextBatch} and \link{ContextBatchFSelect}. 104 | } 105 | \details{ 106 | When implementing a callback, each function must have two arguments named \code{callback} and \code{context}. 107 | A callback can write data to the state (\verb{$state}), e.g. settings that affect the callback itself. 108 | Avoid writing large data to the state. 109 | } 110 | \examples{ 111 | # Write archive to disk 112 | callback_batch_fselect("mlr3fselect.backup", 113 | on_optimization_end = function(callback, context) { 114 | saveRDS(context$instance$archive, "archive.rds") 115 | } 116 | ) 117 | } 118 | --------------------------------------------------------------------------------