document sparse data usage in parsnip
EmilHvitfeldt committed Sep 4, 2024
1 parent f66a8f9 commit 44403fc
Showing 25 changed files with 166 additions and 0 deletions.
18 changes: 18 additions & 0 deletions R/sparsevctrs.R
@@ -1,3 +1,21 @@
#' Using sparse data with parsnip
#'
#' You can figure out whether a given model engine supports sparse data by
#' calling `get_encoding("name of model")` and looking at the `allow_sparse_x`
#' column.
#'
#' Using sparse data for model fitting and prediction shouldn't require any
#' additional configuration. Just pass in a sparse matrix, such as a `dgCMatrix`
#' from the `Matrix` package, or a sparse tibble from the `sparsevctrs` package
#' to the data argument of the respective [fit()], [fit_xy()], and [predict()].
#'
#' Models that don't support sparse data will try to convert to non-sparse data
#' with warnings. An informative error will be thrown if conversion isn't
#' possible.
#'
#' @name sparse_data
NULL

to_sparse_data_frame <- function(x, object) {
  if (methods::is(x, "sparseMatrix")) {
    if (allow_sparse(object)) {
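The `get_encoding()` check described in the new help page could look roughly like the sketch below. It assumes parsnip is installed and uses `"linear_reg"` purely as an illustrative model name; the object names `enc` and `sparse_engines` are hypothetical and not part of the commit.

```r
library(parsnip)

# One row per engine/mode registered for linear_reg(), with encoding options
enc <- get_encoding("linear_reg")

# Keep only the engines whose `allow_sparse_x` flag is TRUE
sparse_engines <- enc[enc$allow_sparse_x, c("model", "engine", "mode", "allow_sparse_x")]
print(sparse_engines)
```

Engines that are missing from `sparse_engines` fall under the conversion behaviour described in the help text above.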
1 change: 1 addition & 0 deletions _pkgdown.yml
@@ -90,6 +90,7 @@ reference:
- set_engine
- set_mode
- show_engines
- sparse_data
- tidy.model_fit
- translate
- starts_with("update")
8 changes: 8 additions & 0 deletions man/details_boost_tree_xgboost.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

8 changes: 8 additions & 0 deletions man/details_linear_reg_glmnet.Rd

8 changes: 8 additions & 0 deletions man/details_logistic_reg_LiblineaR.Rd

8 changes: 8 additions & 0 deletions man/details_logistic_reg_glmnet.Rd

8 changes: 8 additions & 0 deletions man/details_multinom_reg_glmnet.Rd

8 changes: 8 additions & 0 deletions man/details_rand_forest_ranger.Rd

8 changes: 8 additions & 0 deletions man/details_svm_linear_LiblineaR.Rd

5 changes: 5 additions & 0 deletions man/rmd/boost_tree_xgboost.Rmd
@@ -65,6 +65,11 @@ For classification, non-numeric outcomes (i.e., factors) are internally converte
```{r child = "template-uses-case-weights.Rmd"}
```

## Sparse Data

```{r child = "template-uses-sparse-data.Rmd"}
```

## Other details

### Interfacing with the `params` argument
5 changes: 5 additions & 0 deletions man/rmd/boost_tree_xgboost.md
@@ -116,6 +116,11 @@ This model can utilize case weights during model fitting. To use them, see the d

The `fit()` and `fit_xy()` arguments have arguments called `case_weights` that expect vectors of case weights.

## Sparse Data


This model can utilize sparse data during model fitting and prediction. Both sparse matrices such as dgCMatrix from the `Matrix` package and sparse tibbles from the `sparsevctrs` package are supported. See [sparse_data] for more information.

## Other details

### Interfacing with the `params` argument
5 changes: 5 additions & 0 deletions man/rmd/linear_reg_glmnet.Rmd
@@ -48,6 +48,11 @@ By default, [glmnet::glmnet()] uses the argument `standardize = TRUE` to center
```{r child = "template-uses-case-weights.Rmd"}
```

## Sparse Data

```{r child = "template-uses-sparse-data.Rmd"}
```

## Saving fitted model objects

```{r child = "template-butcher.Rmd"}
5 changes: 5 additions & 0 deletions man/rmd/linear_reg_glmnet.md
@@ -57,6 +57,11 @@ This model can utilize case weights during model fitting. To use them, see the d

The `fit()` and `fit_xy()` arguments have arguments called `case_weights` that expect vectors of case weights.

## Sparse Data


This model can utilize sparse data during model fitting and prediction. Both sparse matrices such as dgCMatrix from the `Matrix` package and sparse tibbles from the `sparsevctrs` package are supported. See [sparse_data] for more information.

## Saving fitted model objects


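As a rough illustration of the sentence added to the glmnet documentation, the sketch below fits and predicts on a `dgCMatrix` directly. It assumes a parsnip version that includes this sparse-data support, with the glmnet and Matrix packages installed; the simulated data, the penalty value, and the object names (`x`, `y`, `spec`, `fitted`, `preds`) are made up for the example.

```r
library(parsnip)
library(Matrix)

set.seed(1)

# Simulated sparse predictors: rsparsematrix() returns a dgCMatrix
x <- rsparsematrix(nrow = 100, ncol = 10, density = 0.2)
colnames(x) <- paste0("x", 1:10)

# Numeric outcome that depends on the first predictor
y <- as.numeric(2 * x[, 1] + rnorm(100))

# Arbitrary penalty, chosen only so that predict() has a single value to use
spec <- linear_reg(penalty = 0.01) %>%
  set_engine("glmnet")

# The sparse matrix is passed as-is; no manual conversion to a dense matrix
fitted <- fit_xy(spec, x = x, y = y)
preds <- predict(fitted, new_data = x)
```

The column names matter here: the predictors are matched by name when `predict()` is called, so an unnamed sparse matrix is best avoided.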
5 changes: 5 additions & 0 deletions man/rmd/logistic_reg_LiblineaR.Rmd
@@ -42,6 +42,11 @@ logistic_reg(penalty = double(1), mixture = double(1)) %>%
```{r child = "template-same-scale.Rmd"}
```

## Sparse Data

```{r child = "template-uses-sparse-data.Rmd"}
```

## Examples

The "Fitting and Predicting with parsnip" article contains [examples](https://parsnip.tidymodels.org/articles/articles/Examples.html#logistic-reg-LiblineaR) for `logistic_reg()` with the `"LiblineaR"` engine.
5 changes: 5 additions & 0 deletions man/rmd/logistic_reg_LiblineaR.md
@@ -49,6 +49,11 @@ Factor/categorical predictors need to be converted to numeric values (e.g., dumm
Predictors should have the same scale. One way to achieve this is to center and
scale each so that each predictor has mean zero and a variance of one.

## Sparse Data


This model can utilize sparse data during model fitting and prediction. Both sparse matrices such as dgCMatrix from the `Matrix` package and sparse tibbles from the `sparsevctrs` package are supported. See [sparse_data] for more information.

## Examples

The "Fitting and Predicting with parsnip" article contains [examples](https://parsnip.tidymodels.org/articles/articles/Examples.html#logistic-reg-LiblineaR) for `logistic_reg()` with the `"LiblineaR"` engine.
5 changes: 5 additions & 0 deletions man/rmd/logistic_reg_glmnet.Rmd
@@ -50,6 +50,11 @@ By default, [glmnet::glmnet()] uses the argument `standardize = TRUE` to center
```{r child = "template-uses-case-weights.Rmd"}
```

## Sparse Data

```{r child = "template-uses-sparse-data.Rmd"}
```

## Saving fitted model objects

```{r child = "template-butcher.Rmd"}
5 changes: 5 additions & 0 deletions man/rmd/logistic_reg_glmnet.md
@@ -59,6 +59,11 @@ This model can utilize case weights during model fitting. To use them, see the d

The `fit()` and `fit_xy()` arguments have arguments called `case_weights` that expect vectors of case weights.

## Sparse Data


This model can utilize sparse data during model fitting and prediction. Both sparse matrices such as dgCMatrix from the `Matrix` package and sparse tibbles from the `sparsevctrs` package are supported. See [sparse_data] for more information.

## Saving fitted model objects


5 changes: 5 additions & 0 deletions man/rmd/multinom_reg_glmnet.Rmd
@@ -54,6 +54,11 @@ The "Fitting and Predicting with parsnip" article contains [examples](https://pa
```{r child = "template-uses-case-weights.Rmd"}
```

## Sparse Data

```{r child = "template-uses-sparse-data.Rmd"}
```

## Saving fitted model objects

```{r child = "template-butcher.Rmd"}
5 changes: 5 additions & 0 deletions man/rmd/multinom_reg_glmnet.md
@@ -63,6 +63,11 @@ This model can utilize case weights during model fitting. To use them, see the d

The `fit()` and `fit_xy()` arguments have arguments called `case_weights` that expect vectors of case weights.

## Sparse Data


This model can utilize sparse data during model fitting and prediction. Both sparse matrices such as dgCMatrix from the `Matrix` package and sparse tibbles from the `sparsevctrs` package are supported. See [sparse_data] for more information.

## Saving fitted model objects


5 changes: 5 additions & 0 deletions man/rmd/rand_forest_ranger.Rmd
@@ -72,6 +72,11 @@ For `ranger` confidence intervals, the intervals are constructed using the form
```{r child = "template-uses-case-weights.Rmd"}
```

## Sparse Data

```{r child = "template-uses-sparse-data.Rmd"}
```

## Saving fitted model objects

```{r child = "template-butcher.Rmd"}
5 changes: 5 additions & 0 deletions man/rmd/rand_forest_ranger.md
@@ -103,6 +103,11 @@ This model can utilize case weights during model fitting. To use them, see the d

The `fit()` and `fit_xy()` arguments have arguments called `case_weights` that expect vectors of case weights.

## Sparse Data


This model can utilize sparse data during model fitting and prediction. Both sparse matrices such as dgCMatrix from the `Matrix` package and sparse tibbles from the `sparsevctrs` package are supported. See [sparse_data] for more information.

## Saving fitted model objects


5 changes: 5 additions & 0 deletions man/rmd/svm_linear_LiblineaR.Rmd
@@ -66,6 +66,11 @@ Note that the `LiblineaR` engine does not produce class probabilities. When opti
```{r child = "template-no-case-weights.Rmd"}
```

## Sparse Data

```{r child = "template-uses-sparse-data.Rmd"}
```

## Examples

The "Fitting and Predicting with parsnip" article contains [examples](https://parsnip.tidymodels.org/articles/articles/Examples.html#svm-linear-LiblineaR) for `svm_linear()` with the `"LiblineaR"` engine.
5 changes: 5 additions & 0 deletions man/rmd/svm_linear_LiblineaR.md
@@ -85,6 +85,11 @@ scale each so that each predictor has mean zero and a variance of one.

The underlying model implementation does not allow for case weights.

## Sparse Data


This model can utilize sparse data during model fitting and prediction. Both sparse matrices such as dgCMatrix from the `Matrix` package and sparse tibbles from the `sparsevctrs` package are supported. See [sparse_data] for more information.

## Examples

The "Fitting and Predicting with parsnip" article contains [examples](https://parsnip.tidymodels.org/articles/articles/Examples.html#svm-linear-LiblineaR) for `svm_linear()` with the `"LiblineaR"` engine.
1 change: 1 addition & 0 deletions man/rmd/template-uses-sparse-data.Rmd
@@ -0,0 +1 @@
This model can utilize sparse data during model fitting and prediction. Both sparse matrices such as dgCMatrix from the `Matrix` package and sparse tibbles from the `sparsevctrs` package are supported. See [sparse_data] for more information.
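The template mentions sparse tibbles alongside sparse matrices. A minimal sketch of building one from a `dgCMatrix` is shown below, assuming the sparsevctrs, Matrix, and tibble packages are installed; the simulated data and the names `x_mat`, `x_tbl`, and `is_sparse` are illustrative only.

```r
library(Matrix)
library(sparsevctrs)

set.seed(1)

# A small simulated dgCMatrix; column names are required for the conversion
x_mat <- rsparsematrix(nrow = 50, ncol = 5, density = 0.3)
colnames(x_mat) <- paste0("x", 1:5)

# Convert to a tibble whose columns are sparse vectors
x_tbl <- coerce_to_sparse_tibble(x_mat)

# Check, column by column, that the sparse representation was kept
is_sparse <- vapply(x_tbl, is_sparse_vector, logical(1))
print(is_sparse)
```

A tibble like `x_tbl` is the kind of object the template refers to when it says sparse tibbles can be passed to fitting and prediction.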
20 changes: 20 additions & 0 deletions man/sparse_data.Rd
