Revised model documentation #456

Merged
merged 85 commits into from
Jun 25, 2021
1e7f84c
re-organize linear_reg documentation
topepo Mar 20, 2021
509836c
search across parsnip adjacent packages for details files
topepo Mar 22, 2021
524cd49
added stan details
topepo Mar 22, 2021
7416ab3
Made clickable link for lm()
juliasilge Mar 30, 2021
0999884
Move some details down to highlight most important stuff
juliasilge Mar 30, 2021
4599543
Typo
juliasilge Mar 30, 2021
0a0326b
Spacing
juliasilge Mar 30, 2021
f1db675
Edits to wording for clarity
juliasilge Mar 30, 2021
2d424a2
doc refresh
topepo Apr 15, 2021
18201e9
extended glmnet documentation
topepo Apr 15, 2021
6dc73cc
Update man/rmd/glmnet-details.Rmd
topepo Apr 16, 2021
87e236a
Update R/linear_reg.R
topepo Apr 16, 2021
d98cbbd
Update R/aaa_models.R
topepo Apr 16, 2021
35b423a
better documentation based on review comments
topepo Apr 16, 2021
9a3aa27
Merge branch 'doc-test' of https://github.com/tidymodels/parsnip into…
topepo Apr 16, 2021
826b24b
move to underscore in file names
topepo Apr 16, 2021
9e6b912
use linked verions of function names
topepo Apr 16, 2021
bcbd1a5
more information on additional engines and tidymodels.org
topepo Apr 16, 2021
3db440c
expand the package exclusion list
topepo Apr 16, 2021
efe0e94
added boosted tree docs
topepo Apr 18, 2021
07d2bb8
reworked the parameter code to use tunable
topepo Apr 19, 2021
d8a3f69
early_stop re-added (required devel dials)
topepo Apr 21, 2021
f97d3db
Merge branch 'master' into doc-test
topepo Jun 1, 2021
d9f2543
minor linear_reg updates and use of templates
topepo Jun 1, 2021
f5868f8
Update man/rmd/glmnet-details.Rmd
topepo Jun 1, 2021
0b76e6c
Update man/rmd/glmnet-details.Rmd
topepo Jun 1, 2021
35707e7
update boosting pages
topepo Jun 1, 2021
2737fdb
better seeaslo and references
topepo Jun 1, 2021
777775c
decision_tree files
topepo Jun 1, 2021
ff89d99
fix failing test case
topepo Jun 1, 2021
3001f2e
logistic_reg files
topepo Jun 1, 2021
8c99d7d
mars files
topepo Jun 1, 2021
94d2e6f
mlp files
topepo Jun 1, 2021
749c270
multinomial files
topepo Jun 2, 2021
76df18c
fix some file names
topepo Jun 2, 2021
e625de4
un-needed files
topepo Jun 2, 2021
cf3730e
knn files
topepo Jun 2, 2021
958ecf8
rand_forest files
topepo Jun 2, 2021
8d79f7c
svm files
topepo Jun 2, 2021
2446b8f
cleaned up titles (no more "general interfaces")
topepo Jun 2, 2021
e336761
standardize on "specific engines only"
topepo Jun 2, 2021
09bb004
remove "Parameters can be represented by a placeholder" in examples
topepo Jun 2, 2021
13270a6
Update man/rmd/glmnet-details.Rmd
topepo Jun 2, 2021
cb36eaf
Update man/rmd/glmnet-details.Rmd
topepo Jun 2, 2021
e95a769
suggestions from Hannah
topepo Jun 2, 2021
62c5055
fixed a few bugs/typos
topepo Jun 2, 2021
66bff6b
bug fix
topepo Jun 3, 2021
a9efa89
dynamic @seealso
topepo Jun 3, 2021
a0a683b
added an overview of dynamic documentation bits.
topepo Jun 8, 2021
eea10fe
updated glmnet information
topepo Jun 9, 2021
83120fc
small doc updates for glmnet
topepo Jun 9, 2021
02e6561
Update NEWS.md
topepo Jun 10, 2021
de4a64d
prototype sections for worked examples
topepo Jun 10, 2021
600c23b
fix train.test indices
topepo Jun 11, 2021
b710818
more roxygenization
topepo Jun 15, 2021
0e35b38
added man-roxygen to build ignore
topepo Jun 15, 2021
bd45773
mode for null model
topepo Jun 15, 2021
d238ec2
examples for rand_forest() with engines ranger and randomForest
hfrick Jun 21, 2021
e5e0ee1
examples for svm_linear() with engines kernlab and LiblineaR
hfrick Jun 21, 2021
17c8722
set seed for reproducibility
hfrick Jun 21, 2021
d59fbc5
examples for `svm_poly()` and `svm_rbf()`
hfrick Jun 21, 2021
ce75f08
clean-up
hfrick Jun 21, 2021
4eb4130
add sentence about model spec
hfrick Jun 21, 2021
84e1926
add example for `multinom_reg()` with penguins
hfrick Jun 21, 2021
ed5938c
Need this for `devtools::document()` now
juliasilge Jun 22, 2021
105c74d
Edits to doc tools
juliasilge Jun 22, 2021
1d9e44c
Refine boosted tree docs
juliasilge Jun 22, 2021
698e914
Refine decision_tree() docs
juliasilge Jun 22, 2021
bf01bb4
Refine linear/logistic docs
juliasilge Jun 22, 2021
8ef2142
Refine mars, mlp, multinom (plus logistic again)
juliasilge Jun 22, 2021
88dc6b8
Finish refining model pages
juliasilge Jun 22, 2021
73fc19c
Refine details pages
juliasilge Jun 22, 2021
8c2c082
Finish up details pages, and document
juliasilge Jun 22, 2021
338ff70
tidy up examples of class prediction
hfrick Jun 23, 2021
1d3865f
Merge branch 'master' into doc-test
topepo Jun 24, 2021
a1b42ca
doc refresh after updating from master
topepo Jun 24, 2021
695a5d6
remove examples as they slow down `document()`
hfrick Jun 25, 2021
758b43f
remove another example
hfrick Jun 25, 2021
1e05cb5
update tree splitting template (and its name)
topepo Jun 25, 2021
112a5de
remove default engine text
topepo Jun 25, 2021
a5ae291
add default engine to list
topepo Jun 25, 2021
14c6749
maybe fix GHA issues when new dependencies are on CRAN
topepo Jun 25, 2021
db98176
remove multilevelmod reference
topepo Jun 25, 2021
aa93434
doc refresh
topepo Jun 25, 2021
8c730b3
missing comma
topepo Jun 25, 2021
3 changes: 2 additions & 1 deletion DESCRIPTION
@@ -53,4 +53,5 @@ Suggests:
nlme,
modeldata,
LiblineaR,
Matrix
Matrix,
dials
1 change: 1 addition & 0 deletions NAMESPACE
@@ -122,6 +122,7 @@ export(control_parsnip)
export(convert_stan_interval)
export(decision_tree)
export(eval_args)
export(find_engine_files)
export(fit)
export(fit.model_spec)
export(fit_control)
61 changes: 61 additions & 0 deletions R/aaa_models.R
@@ -911,3 +911,64 @@ get_encoding <- function(model) {
}
res
}

#' Tools for documenting packages
#' @param mod A character string for the model file
#' @param pkg The package that contains the model file
#' @return `find_engine_files()` returns a character string.
#' @name doc-tools
#' @keywords internal
#' @export
#' @examples
#' cat(find_engine_files("linear_reg"))
find_engine_files <- function(mod) {

  # Get available topics
  topic_names <- search_for_engine_docs(mod)
  if (length(topic_names) == 0) {
    return(character(0))
  }

  # Subset for our model function
  eng <- strsplit(topic_names, "-")
  eng <- purrr::map_chr(eng, ~ .x[length(.x)])
  eng <- tibble::tibble(engine = eng, topic = topic_names)

  # Combine them to keep the order in which they were registered
  all_eng <- get_from_env(mod)
  all_eng$.order <- 1:nrow(all_eng)
  eng <- dplyr::left_join(eng, all_eng, by = "engine")
  eng <- eng[order(eng$.order),]

  res <-
    glue::glue("  \\item \\code{\\link[=|eng$topic|]{|eng$engine|}} ",
               .open = "|", .close = "|")

  res <- paste0("\\itemize{\n", paste0(res, collapse = "\n"), "\n}")
  res
}

search_for_engine_docs <- function(mod) {
  all_deps <- get_from_env(paste0(mod, "_pkgs"))
  all_deps <- unlist(all_deps$pkg)
  all_deps <- unique(c("parsnip", all_deps))
  excl <- c("stats", "magrittr")
  all_deps <- all_deps[!(all_deps %in% excl)]
  res <- purrr::map(all_deps, parsnip:::find_details_topics, mod = mod)
  res <- unique(unlist(res))
  res
}

find_details_topics <- function(pkg, mod) {
  mod <- gsub("_", "-", mod)
  meta_loc <- system.file("Meta/Rd.rds", package = pkg)
  meta_loc <- meta_loc[meta_loc != ""]
  if (length(meta_loc) > 0) {
    topic_names <- readRDS(meta_loc)$Name
    res <- grep(paste0("details-", mod), topic_names, value = TRUE)
  } else {
    res <- character(0)
  }
  res
}
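
The string handling in `find_engine_files()` can be tried out without parsnip. What follows is a minimal base-R sketch of the same steps, not the package's implementation. The topic names and the registration order are made-up inputs, `merge()` stands in for `dplyr::left_join()`, and `sprintf()` stands in for the `glue::glue()` call in the diff.

```r
# Hypothetical topic names, shaped like what find_details_topics() looks for
topic_names <- c(
  "details-linear-reg-glmnet",
  "details-linear-reg-lm",
  "details-linear-reg-stan"
)

# Hypothetical engine registration order (an assumption for illustration)
registered <- data.frame(
  engine = c("lm", "glmnet", "stan"),
  stringsAsFactors = FALSE
)
registered$.order <- seq_len(nrow(registered))

# The engine name is the last "-"-separated piece of each topic name
pieces <- strsplit(topic_names, "-")
engine <- vapply(pieces, function(x) x[length(x)], character(1))
eng <- data.frame(engine = engine, topic = topic_names, stringsAsFactors = FALSE)

# Join on the engine name, then restore the registration order
eng <- merge(eng, registered, by = "engine", all.x = TRUE)
eng <- eng[order(eng$.order), ]

# Build one Rd \item per engine and wrap them in \itemize{}
items <- sprintf("  \\item \\code{\\link[=%s]{%s}} ", eng$topic, eng$engine)
rd <- paste0("\\itemize{\n", paste0(items, collapse = "\n"), "\n}")
cat(rd)
```

Because the engine is taken as the last hyphen-separated piece, this approach presumably relies on engine names that contain no hyphen themselves.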

9 changes: 9 additions & 0 deletions R/linear-reg-doc-glmnet.R
@@ -0,0 +1,9 @@
#' Linear regression via glmnet
#'
#' `glmnet()` uses regularized least squares to fit models with numeric outcomes.
#'
#' @includeRmd man/rmd/linear-reg-glmnet.Rmd details
#'
#' @name details-linear-reg-glmnet
#' @keywords internal
NULL
9 changes: 9 additions & 0 deletions R/linear-reg-doc-keras.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
#' Linear regression via keras/tensorflow
#'
#' This model uses regularized least squares to fit models with numeric outcomes.
#'
#' @includeRmd man/rmd/linear-reg-keras.Rmd details
#'
#' @name details-linear-reg-keras
#' @keywords internal
NULL
9 changes: 9 additions & 0 deletions R/linear-reg-doc-lm.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
#' Linear regression via lm
#'
#' [stats::lm()] uses ordinary least squares to fit models with numeric outcomes.
#'
#' @includeRmd man/rmd/linear-reg-lm.Rmd details
#'
#' @name details-linear-reg-lm
#' @keywords internal
NULL
10 changes: 10 additions & 0 deletions R/linear-reg-doc-spark.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
#' Linear regression via spark
#'
#' `sparklyr::ml_linear_regression()` uses regularized least squares to fit
#' models with numeric outcomes.
#'
#' @includeRmd man/rmd/linear-reg-spark.Rmd details
#'
#' @name details-linear-reg-spark
#' @keywords internal
NULL
9 changes: 9 additions & 0 deletions R/linear-reg-doc-stan.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
#' Linear regression via Bayesian Methods
#'
#' The `stan` engine estimates regression parameters using Bayesian estimation.
#'
#' @includeRmd man/rmd/linear-reg-stan.Rmd details
#'
#' @name details-linear-reg-stan
#' @keywords internal
NULL
68 changes: 16 additions & 52 deletions R/linear_reg.R
@@ -1,72 +1,36 @@
#' General Interface for Linear Regression Models
#'
#' `linear_reg()` is a way to generate a _specification_ of a model
#' before fitting and allows the model to be created using
#' different packages in R, Stan, keras, or via Spark. The main
#' arguments for the model are:
#' \itemize{
#' \item \code{penalty}: The total amount of regularization
#' in the model. Note that this must be zero for some engines.
#' \item \code{mixture}: The mixture amounts of different types of
#' regularization (see below). Note that this will be ignored for some engines.
#' }
#' These arguments are converted to their specific names at the
#' time that the model is fit. Other options and arguments can be
#' set using `set_engine()`. If left to their defaults
#' here (`NULL`), the values are taken from the underlying model
#' functions. If parameters need to be modified, `update()` can be used
#' in lieu of recreating the object from scratch.
#' @description
#'
#' `linear_reg()` defines a model that can predict numeric values from
#' predictors using a linear function.
#'
#' There are different ways to fit this model. Information about the available
#' _engines_ that can be used for fitting:
#'
#' \Sexpr[stage=render,results=rd]{parsnip:::find_engine_files("linear_reg")}
#'
#' @inheritParams boost_tree
#' @param mode A single character string for the type of model.
#' The only possible value for this model is "regression".
#' @param penalty A non-negative number representing the total
#' amount of regularization (`glmnet`, `keras`, and `spark` only).
#' For `keras` models, this corresponds to purely L2 regularization
#' (aka weight decay) while the other models can be a combination
#' of L1 and L2 (depending on the value of `mixture`; see below).
#' amount of regularization (specific engines only).
#' @param mixture A number between zero and one (inclusive) that is the
#' proportion of L1 regularization (i.e. lasso) in the model. When
#' `mixture = 1`, it is a pure lasso model while `mixture = 0` indicates that
#' ridge regression is being used. (`glmnet` and `spark` only).
#' ridge regression is being used (specific engines only).
#' @details
#' The data given to the function are not saved and are only used
#' to determine the _mode_ of the model. For `linear_reg()`, the
#' mode will always be "regression".
#'
#' The model can be created using the `fit()` function using the
#' following _engines_:
#' \itemize{
#' \item \pkg{R}: `"lm"` (the default) or `"glmnet"`
#' \item \pkg{Stan}: `"stan"`
#' \item \pkg{Spark}: `"spark"`
#' \item \pkg{keras}: `"keras"`
#' }
#'
#' For this model, other packages may add additional engines. Use
#' [show_engines()] to see the current set of engines.
#'
#' @includeRmd man/rmd/linear-reg.Rmd details
#' This function only defines what _type_ of model is being fit. Once an engine
#' is specified, the _method_ to fit the model is also defined.
#'
#' @note For models created using the spark engine, there are
#' several differences to consider. First, only the formula
#' interface to via `fit()` is available; using `fit_xy()` will
#' generate an error. Second, the predictions will always be in a
#' spark table format. The names will be the same as documented but
#' without the dots. Third, there is no equivalent to factor
#' columns in spark tables so class predictions are returned as
#' character columns. Fourth, to retain the model object for a new
#' R session (via `save()`), the `model$fit` element of the `parsnip`
#' object should be serialized via `ml_save(object$fit)` and
#' separately saved to disk. In a new session, the object can be
#' reloaded and reattached to the `parsnip` object.
#' The model is not trained or fit until the [fit.model_spec()] function is used
#' with the data.
#'
#' @seealso [fit()], [set_engine()]
#' @examples
#' show_engines("linear_reg")
#'
#' linear_reg()
#' # Parameters can be represented by a placeholder:
#' linear_reg(penalty = varying())
#' @export
#' @importFrom purrr map_lgl
linear_reg <-
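
The split that the revised documentation describes (the specification defines the type of model, the engine defines the fitting method, and nothing is trained until `fit()` is called) can be sketched as below. This assumes parsnip is installed and uses the `lm` engine, which needs only the stats package. The `mtcars` data and the formula are illustrative choices, not part of this PR.

```r
library(parsnip)

# Define the type of model; nothing is estimated at this point
spec <- linear_reg()

# Choose the method used to fit it
spec <- set_engine(spec, "lm")

# Training happens only when fit() is called with data
lm_fit <- fit(spec, mpg ~ wt + hp, data = mtcars)

# Predictions come back as a tibble with a .pred column
preds <- predict(lm_fit, new_data = head(mtcars, 3))
preds
```

The same specification could be paired with another engine by changing only the `set_engine()` call, provided that engine's package is installed.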
86 changes: 86 additions & 0 deletions man/details-linear-reg-glmnet.Rd
67 changes: 67 additions & 0 deletions man/details-linear-reg-keras.Rd