These two objects can be used to compute importance scores based on a correlation coefficient.
Format
An object of class filtro::class_score_cor (inherits from filtro::class_score, S7_object) of length 1.
Value
An S7 object. The primary property of interest is in results. This
is a data frame of results that is populated by the fit() method and has
columns:
name: The name of the score (e.g., score_cor_pearson or score_cor_spearman).
score: The estimates for each predictor.
outcome: The name of the outcome column.
predictor: The names of the predictor inputs.
These data are accessed using object@results (see examples below).
Details
These objects are used when:
The predictors are numeric and the outcome is numeric.
In this case, a correlation coefficient (via stats::cov.wt()) is computed with
the proper variable roles. Values closer to 1 or -1 (i.e., abs(cor_pearson)
closer to 1) are associated with more important predictors.
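As a rough illustration of the idea above, the sketch below computes one Pearson correlation per predictor with stats::cov.wt(cor = TRUE) and ranks predictors by absolute value. It uses only base R and the built-in mtcars data; the variable names are chosen for illustration, and this is not filtro's internal implementation.

```r
# Illustrative data only: mtcars ships with base R.
outcome <- "mpg"
predictors <- c("wt", "hp", "qsec")

# One correlation per predictor, each computed against the outcome.
# With the default (equal) weights, cov.wt(cor = TRUE) matches stats::cor().
scores <- vapply(
  predictors,
  function(p) {
    stats::cov.wt(mtcars[, c(p, outcome)], cor = TRUE)$cor[1, 2]
  },
  numeric(1)
)

# Rank predictors by absolute correlation (larger means more important).
ranked <- sort(abs(scores), decreasing = TRUE)
ranked
```

The sign of each score is kept in scores; only the ranking uses the absolute value.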
Estimating the scores
In filtro, the score_* objects define a scoring method (e.g., data
input requirements, package dependencies, etc.). To compute the scores for
a specific data set, the fit() method is used. The main arguments for
these functions are:
object: A score class object (e.g., score_cor_pearson).
formula: A standard R formula with a single outcome on the left-hand side and one or more predictors (or .) on the right-hand side. The data are processed via stats::model.frame().
data: A data frame containing the relevant columns defined by the formula.
...: Further arguments passed to or from other methods.
case_weights: A quantitative vector of case weights that is the same length as the number of rows in data. The default of NULL indicates that there are no case weights.
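To make the variable roles concrete, the following base R sketch shows how a formula with `.` expands under stats::model.frame(): the outcome is the first column and the remaining columns are the predictors. The mtcars data and the case_weights vector are illustrative assumptions, not taken from filtro.

```r
# Outcome on the left-hand side; `.` stands for all other columns.
mf <- stats::model.frame(mpg ~ ., data = mtcars)

outcome_name <- names(mf)[1]      # the response column
predictor_names <- names(mf)[-1]  # every other column in the data

# Hypothetical case weights: one non-negative value per row of the data.
case_weights <- rep(1, nrow(mtcars))
length(case_weights) == nrow(mtcars)
```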
Missing values are removed for each predictor/outcome combination being scored.
In cases where the underlying computations fail, the scoring proceeds silently, and a missing value is given for the score.
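The behavior described above can be pictured with a small base R sketch. The helper safe_cor() is hypothetical and not part of filtro; it drops rows that are missing in either column and returns NA when the computation raises an error.

```r
# Hypothetical helper, not part of filtro: score one predictor/outcome pair.
safe_cor <- function(x, y) {
  keep <- stats::complete.cases(x, y)  # drop rows missing in either column
  tryCatch(
    stats::cor(x[keep], y[keep]),
    error = function(e) NA_real_       # failed computation gives a missing score
  )
}

x <- c(1, 2, NA, 4, 5)
y <- c(2, 4, 6, NA, 10)

safe_cor(x, y)             # uses only the three complete pairs
safe_cor(letters[1:5], y)  # cor() errors on non-numeric input, so the score is NA
```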
See also
Other class score metrics:
score_aov_pval,
score_imp_rf,
score_info_gain,
score_roc_auc,
score_xtab_pval_chisq
Examples
library(filtro)
library(dplyr)
ames <- modeldata::ames
# Pearson correlation
ames_cor_pearson_res <-
  score_cor_pearson |>
  fit(Sale_Price ~ ., data = ames)
ames_cor_pearson_res@results
#> # A tibble: 73 × 4
#>    name         score outcome    predictor
#>    <chr>        <dbl> <chr>      <chr>
#>  1 cor_pearson NA     Sale_Price MS_SubClass
#>  2 cor_pearson NA     Sale_Price MS_Zoning
#>  3 cor_pearson  0.202 Sale_Price Lot_Frontage
#>  4 cor_pearson  0.267 Sale_Price Lot_Area
#>  5 cor_pearson NA     Sale_Price Street
#>  6 cor_pearson NA     Sale_Price Alley
#>  7 cor_pearson NA     Sale_Price Lot_Shape
#>  8 cor_pearson NA     Sale_Price Land_Contour
#>  9 cor_pearson NA     Sale_Price Utilities
#> 10 cor_pearson NA     Sale_Price Lot_Config
#> # ℹ 63 more rows
# Spearman correlation
ames_cor_spearman_res <-
  score_cor_spearman |>
  fit(Sale_Price ~ ., data = ames)
ames_cor_spearman_res@results
#> # A tibble: 73 × 4
#>    name          score outcome    predictor
#>    <chr>         <dbl> <chr>      <chr>
#>  1 cor_spearman NA     Sale_Price MS_SubClass
#>  2 cor_spearman NA     Sale_Price MS_Zoning
#>  3 cor_spearman  0.228 Sale_Price Lot_Frontage
#>  4 cor_spearman  0.429 Sale_Price Lot_Area
#>  5 cor_spearman NA     Sale_Price Street
#>  6 cor_spearman NA     Sale_Price Alley
#>  7 cor_spearman NA     Sale_Price Lot_Shape
#>  8 cor_spearman NA     Sale_Price Land_Contour
#>  9 cor_spearman NA     Sale_Price Utilities
#> 10 cor_spearman NA     Sale_Price Lot_Config
#> # ℹ 63 more rows
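Once the results are available, a common next step is to drop predictors with a missing score and order the rest by absolute correlation. The sketch below does this in base R on a small data frame whose rows are copied from the Pearson output printed above; the object names res and scored are illustrative only.

```r
# Rows copied from the Pearson output printed above.
res <- data.frame(
  name = "cor_pearson",
  score = c(NA, NA, 0.202, 0.267),
  outcome = "Sale_Price",
  predictor = c("MS_SubClass", "MS_Zoning", "Lot_Frontage", "Lot_Area")
)

# Keep predictors that received a score, then order by absolute correlation.
scored <- res[!is.na(res$score), ]
scored <- scored[order(abs(scored$score), decreasing = TRUE), ]
scored$predictor
```

The same steps should apply to the full data frame in object@results, since it has the same four columns.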
