#' \item{weights}{The prior weights that were initially supplied.
#' Note that they are called \code{prior.weights} in the output of \code{\link[stats]{glm}}.}
#' \item{offset}{The used offset vector.}
#' \item{lambda}{The used penalty parameter: initially supplied by the user, or selected in-sample, out-of-sample or via k-fold cross-validation.}
#' \item{lambda}{The used penalty parameter: initially supplied by the user, or selected in-sample, out-of-sample or using cross-validation.}
#' \item{lambda1}{The used penalty parameter for the \eqn{L_1}-penalty in Sparse (Generalized) Fused Lasso or
#' Sparse Graph-Guided Fused Lasso is \eqn{\lambda \times \lambda_1}}
#' \item{lambda2}{The used penalty parameter for the \eqn{L_2}-penalty in Group (Generalized) Fused Lasso or
...
...
@@ -60,9 +60,9 @@
#' \item{obj.fun.reest}{Value of the objective function of the re-estimated model: minus the regularized scaled log-likelihood of the re-estimated model.}
#' \item{X.reest}{The model matrix used in the re-estimation, only returned when the argument \code{x.return} in \code{\link{glmsmurf}} or \code{\link{glmsmurf.fit}} is \code{TRUE}.}
#' When lambda is not given as input but selected in-sample, out-of-sample or using cross-validation,
#' i.e. the \code{lambda} argument in \code{\link{glmsmurf}} or \code{\link{glmsmurf.fit}} is a selection method,
#' i.e. the \code{lambda} argument in \code{\link{glmsmurf}} or \code{\link{glmsmurf.fit}} is a string describing the selection method,
#' the following components are also present:
#' \item{lambda.method}{Method (in-sample, out-of-sample or cross-validation) and measure (deviance, MSE, DSS, AIC, BIC or GCV score) used to select \code{lambda}.
#' \item{lambda.method}{Method (in-sample, out-of-sample or cross-validation (possibly with the one standard error rule)) and measure (AIC, BIC, GCV score, deviance, MSE or DSS) used to select \code{lambda}.
#' E.g. \code{"is.bic"} indicates in-sample selection of lambda with the BIC as measure.}
#' \item{lambda.vector}{Vector of \code{lambda} values that were considered in the selection process.}
#' \item{lambda.measures}{List with for each of the relevant measures a matrix containing for each considered value of \code{lambda} (rows)
#' @title Fit a Multi-Type Regularized GLM Using the SMuRF Algorithm
#'
#' @description SMuRF algorithm to fit a generalized linear model (GLM) via regularized maximum likelihood with a multi-type Lasso penalty.
#' @description SMuRF algorithm to fit a generalized linear model (GLM) with multiple types of predictors via regularized maximum likelihood.
#' \code{glmsmurf.fit} contains the fitting function for a given design matrix.
#'
#' @param formula A \code{\link[stats]{formula}} object describing the model to be fitted.
...
...
@@ -38,7 +38,7 @@
#' \item \code{"cv1se.dss"} (CV with one SE rule; DSS).
#' }
#' E.g. \code{"is.aic"} indicates in-sample selection of lambda with the AIC as measure.
#' When \code{lambda} is missing, it will be selected using cross-validation with the one standard error rule and the deviance as measure (\code{"cv1se.dev"}).
#' When \code{lambda} is missing or \code{NULL}, it will be selected using cross-validation with the one standard error rule and the deviance as measure (\code{"cv1se.dev"}).
#' @param lambda1 The penalty parameter for the \eqn{L_1}-penalty in Sparse (Generalized) Fused Lasso or Sparse Graph-Guided Fused Lasso is \eqn{\lambda \times \lambda_1}.
#' A positive numeric with default 0 (no extra \eqn{L_1}-penalty).
#' @param lambda2 The penalty parameter for the \eqn{L_2}-penalty in Group (Generalized) Fused Lasso or Group Graph-Guided Fused Lasso is \eqn{\lambda \times \lambda_2}. A positive numeric with default 0 (no extra \eqn{L_2}-penalty).
#' or a list with the penalty weight vector per predictor. This list should have length equal to the number of predictors and predictor names as element names.
#' or a list with the penalty weight vector per predictor. For \code{glmsmurf.fit}, only the second option (list) is possible. This list should have length equal to the number of predictors and predictor names as element names.
#' @param adj.matrix A named list containing the adjacency matrices (a.k.a. neighbor matrices) for each of the predictors with a Graph-Guided Fused Lasso penalty.
#' The list elements should have the names of the corresponding predictors. If only one predictor has a Graph-Guided Fused Lasso penalty,
#' it is also possible to only give the adjacency matrix itself (not in a list).
...
...
@@ -70,7 +70,9 @@
#' \item The estimated coefficients are rounded to 7 digits.
#' \item The cross-validation folds are not deterministic. The validation sample for selecting lambda out-of-sample is determined at random when no indices are provided
#' in 'validation.index' in the control object argument. In these cases, the selected value of lambda is hence not deterministic.
#' When selecting lambda in-sample, or out-of-sample when indices are provided in 'validation.index' in the control object argument, the resulting value of lambda is deterministic.
#' When selecting lambda in-sample, or out-of-sample when indices are provided in 'validation.index' in the control object argument, the selected value of lambda is deterministic.
#' \item The \code{glmsmurf} function can handle many use cases and is preferred for general use.
#' The \code{glmsmurf.fit} function requires a more thorough understanding of the package internals and should hence be used with care!
#' @param X Only for \code{glmsmurf.fit}: the design matrix including ones for the intercept. A \code{n} by \code{(p+1)} matrix which can
#' be of numeric matrix class (\code{\link[methods:StructureClasses]{matrix-class}}) or of class Matrix (\code{\link[Matrix]{Matrix-class}}) including sparse matrix class (\code{\link[Matrix]{dgCMatrix-class}}).
#' @param y Only for \code{glmsmurf.fit}: the response vector, a numeric vector of size \code{n}.
#' @param pen.cov Only for \code{glmsmurf.fit}: list with penalty type per predictor (covariate). A named list of strings with predictor names as element names.
#' Possible types: \code{"none"} (no penalty, e.g. for intercept), \code{"lasso"}, \code{"grouplasso"},
#' @param pen.cov Only for \code{glmsmurf.fit}: a list with the penalty type per predictor (covariate). A named list of strings with predictor names as element names.
#' Possible types: \code{"none"} (no penalty, e.g. for intercept), \code{"lasso"} (Lasso), \code{"grouplasso"} (Group Lasso),
#' \code{"2dflasso"} (2D Fused Lasso) or \code{"ggflasso"} (Graph-Guided Fused Lasso).
#' @param n.par.cov Only for \code{glmsmurf.fit}: list with number of parameters to estimate per predictor (covariate). A named list of strictly positive integers with predictor names as element names.
#' @param group.cov Only for \code{glmsmurf.fit}: list with group of each predictor (covariate) which is only used for the Group Lasso penalty.
#' @param n.par.cov Only for \code{glmsmurf.fit}: a list with the number of parameters to estimate per predictor (covariate). A named list of strictly positive integers with predictor names as element names.
#' @param group.cov Only for \code{glmsmurf.fit}: a list with the group of each predictor (covariate) which is only used for the Group Lasso penalty.
#' A named list of non-negative integers with predictor names as element names where 0 means no group.
#' @param refcat.cov Only for \code{glmsmurf.fit}: list with number of the reference category in the original order of the levels of each predictor (covariate).
#' @param refcat.cov Only for \code{glmsmurf.fit}: a list with the number of the reference category in the original order of the levels of each predictor (covariate).
#' When the predictor is not a factor or no reference category is present, it is equal to 0. This number will only be taken into account for a Fused Lasso, Generalized Fused Lasso or Graph-Guided Fused Lasso penalty
#' Note that there cannot be any unused levels in the interaction between \code{pred1} and \code{pred2}.
#'
#' When adding an interaction between \code{pred1} and \code{pred2} with a 2D Fused Lasso penalty, the 1D effects
#' should also be present in the model and the reference categories for the 1D predictors need to be the respective first levels. Alternatively, it is also allowed to add binned factors, of predictors
#' should also be present in the model and the reference categories for the 1D predictors need to be the respective first levels.
#' The reference level for the 2D predictor will then be the 2D level where at least one of the 1D components is equal to the 1D reference levels.
#' It is also allowed to add binned factors of predictors that are included in the model to the interaction.
#' They should have the original predictor name + '.binned' as predictor names.
#' For example: the original predictors 'age' and 'power' are included in the model and
#' the interaction of 'age.binned' and 'power.binned' can also be present in the model formula.
#' @param log.lambda Logical indicating if the logarithm of lambda is plotted on the x-axis, default is \code{TRUE}.
#' @param ... Additional arguments for the \code{\link[graphics]{plot}} function.
#'
#' @details This plot can only be made when lambda is selected in-sample, out-of-sample or using cross-validation,
#' @details This plot can only be made when lambda is selected in-sample, out-of-sample or using cross-validation (possibly with the one standard error rule),
#' see the \code{lambda} argument of \code{\link{glmsmurf}}.
#' @title Summary of a Multi-Type Regularized GLM Fitted Using the SMuRF Algorithm
#'
#' @description Function to print a summary of a \code{glmsmurf}object.
#' @description Function to print a summary of a \code{glmsmurf}-object.
#'
#' @param object An object of class '\code{\link[=glmsmurf-class]{glmsmurf}}', typically the result of a call to \code{\link{glmsmurf}} or \code{\link{glmsmurf.fit}}.
#' @param digits The number of significant digits used when printing, default is 3.
@@ -23,7 +23,7 @@ An object of class '\code{glmsmurf}' is a list with at least following component
\item{weights}{The prior weights that were initially supplied.
Note that they are called \code{prior.weights} in the output of \code{\link[stats]{glm}}.}
\item{offset}{The used offset vector.}
\item{lambda}{The used penalty parameter: initially supplied by the user, or selected in-sample, out-of-sample or via k-fold cross-validation.}
\item{lambda}{The used penalty parameter: initially supplied by the user, or selected in-sample, out-of-sample or using cross-validation.}
\item{lambda1}{The used penalty parameter for the \eqn{L_1}-penalty in Sparse (Generalized) Fused Lasso or
Sparse Graph-Guided Fused Lasso is \eqn{\lambda \times \lambda_1}}
\item{lambda2}{The used penalty parameter for the \eqn{L_2}-penalty in Group (Generalized) Fused Lasso or
...
...
@@ -62,9 +62,9 @@ the following components are also present:
\item{obj.fun.reest}{Value of the objective function of the re-estimated model: minus the regularized scaled log-likelihood of the re-estimated model.}
\item{X.reest}{The model matrix used in the re-estimation, only returned when the argument \code{x.return} in \code{\link{glmsmurf}} or \code{\link{glmsmurf.fit}} is \code{TRUE}.}
When lambda is not given as input but selected in-sample, out-of-sample or using cross-validation,
i.e. the \code{lambda} argument in \code{\link{glmsmurf}} or \code{\link{glmsmurf.fit}} is a selection method,
i.e. the \code{lambda} argument in \code{\link{glmsmurf}} or \code{\link{glmsmurf.fit}} is a string describing the selection method,
the following components are also present:
\item{lambda.method}{Method (in-sample, out-of-sample or cross-validation) and measure (deviance, MSE, DSS, AIC, BIC or GCV score) used to select \code{lambda}.
\item{lambda.method}{Method (in-sample, out-of-sample or cross-validation (possibly with the one standard error rule)) and measure (AIC, BIC, GCV score, deviance, MSE or DSS) used to select \code{lambda}.
E.g. \code{"is.bic"} indicates in-sample selection of lambda with the BIC as measure.}
\item{lambda.vector}{Vector of \code{lambda} values that were considered in the selection process.}
\item{lambda.measures}{List with for each of the relevant measures a matrix containing for each considered value of \code{lambda} (rows)
@@ -50,7 +50,7 @@ Offset(s) specified using the \code{formula} object will be ignored!}
\item \code{"cv1se.dss"} (CV with one SE rule; DSS).
}
E.g. \code{"is.aic"} indicates in-sample selection of lambda with the AIC as measure.
When \code{lambda} is missing, it will be selected using cross-validation with the one standard error rule and the deviance as measure (\code{"cv1se.dev"}).}
When \code{lambda} is missing or \code{NULL}, it will be selected using cross-validation with the one standard error rule and the deviance as measure (\code{"cv1se.dev"}).}
\item{lambda1}{The penalty parameter for the \eqn{L_1}-penalty in Sparse (Generalized) Fused Lasso or Sparse Graph-Guided Fused Lasso is \eqn{\lambda \times \lambda_1}.
A positive numeric with default 0 (no extra \eqn{L_1}-penalty).}
...
...
@@ -66,7 +66,7 @@ A positive numeric with default 0 (no extra \eqn{L_1}-penalty).}
or a list with the penalty weight vector per predictor. This list should have length equal to the number of predictors and predictor names as element names.}
or a list with the penalty weight vector per predictor. For \code{glmsmurf.fit}, only the second option (list) is possible. This list should have length equal to the number of predictors and predictor names as element names.}
\item{adj.matrix}{A named list containing the adjacency matrices (a.k.a. neighbor matrices) for each of the predictors with a Graph-Guided Fused Lasso penalty.
The list elements should have the names of the corresponding predictors. If only one predictor has a Graph-Guided Fused Lasso penalty,
...
...
@@ -88,17 +88,17 @@ be of numeric matrix class (\code{\link[methods:StructureClasses]{matrix-class}}
\item{y}{Only for \code{glmsmurf.fit}: the response vector, a numeric vector of size \code{n}.}
\item{pen.cov}{Only for \code{glmsmurf.fit}: list with penalty type per predictor (covariate). A named list of strings with predictor names as element names.
Possible types: \code{"none"} (no penalty, e.g. for intercept), \code{"lasso"}, \code{"grouplasso"},
\item{pen.cov}{Only for \code{glmsmurf.fit}: a list with the penalty type per predictor (covariate). A named list of strings with predictor names as element names.
Possible types: \code{"none"} (no penalty, e.g. for intercept), \code{"lasso"} (Lasso), \code{"grouplasso"} (Group Lasso),
\code{"2dflasso"} (2D Fused Lasso) or \code{"ggflasso"} (Graph-Guided Fused Lasso).}
\item{n.par.cov}{Only for \code{glmsmurf.fit}: list with number of parameters to estimate per predictor (covariate). A named list of strictly positive integers with predictor names as element names.}
\item{n.par.cov}{Only for \code{glmsmurf.fit}: a list with the number of parameters to estimate per predictor (covariate). A named list of strictly positive integers with predictor names as element names.}
\item{group.cov}{Only for \code{glmsmurf.fit}: list with group of each predictor (covariate) which is only used for the Group Lasso penalty.
\item{group.cov}{Only for \code{glmsmurf.fit}: a list with the group of each predictor (covariate) which is only used for the Group Lasso penalty.
A named list of non-negative integers with predictor names as element names where 0 means no group.}
\item{refcat.cov}{Only for \code{glmsmurf.fit}: list with number of the reference category in the original order of the levels of each predictor (covariate).
\item{refcat.cov}{Only for \code{glmsmurf.fit}: a list with the number of the reference category in the original order of the levels of each predictor (covariate).
When the predictor is not a factor or no reference category is present, it is equal to 0. This number will only be taken into account for a Fused Lasso, Generalized Fused Lasso or Graph-Guided Fused Lasso penalty
when a reference category is present.}
}
...
...
@@ -106,7 +106,7 @@ when a reference category is present.}
An object of class '\code{glmsmurf}' is returned. See \code{\link{glmsmurf-class}} for more details about this class and its generic functions.
}
\description{
SMuRF algorithm to fit a generalized linear model (GLM) via regularized maximum likelihood with a multi-type Lasso penalty.
SMuRF algorithm to fit a generalized linear model (GLM) with multiple types of predictors via regularized maximum likelihood.
\code{glmsmurf.fit} contains the fitting function for a given design matrix.
}
\details{
...
...
@@ -117,7 +117,9 @@ As a user, it is important to take the following into account:
\item The estimated coefficients are rounded to 7 digits.
\item The cross-validation folds are not deterministic. The validation sample for selecting lambda out-of-sample is determined at random when no indices are provided
in 'validation.index' in the control object argument. In these cases, the selected value of lambda is hence not deterministic.
When selecting lambda in-sample, or out-of-sample when indices are provided in 'validation.index' in the control object argument, the resulting value of lambda is deterministic.
When selecting lambda in-sample, or out-of-sample when indices are provided in 'validation.index' in the control object argument, the selected value of lambda is deterministic.
\item The \code{glmsmurf} function can handle many use cases and is preferred for general use.
The \code{glmsmurf.fit} function requires a more thorough understanding of the package internals and should hence be used with care!
@@ -47,7 +47,9 @@ If \code{pred2} is different from \code{NULL}, \code{pen} should be set to \code
Note that there cannot be any unused levels in the interaction between \code{pred1} and \code{pred2}.
When adding an interaction between \code{pred1} and \code{pred2} with a 2D Fused Lasso penalty, the 1D effects
should also be present in the model and the reference categories for the 1D predictors need to be the respective first levels. Alternatively, it is also allowed to add binned factors, of predictors
should also be present in the model and the reference categories for the 1D predictors need to be the respective first levels.
The reference level for the 2D predictor will then be the 2D level where at least one of the 1D components is equal to the 1D reference levels.
It is also allowed to add binned factors of predictors that are included in the model to the interaction.
They should have the original predictor name + '.binned' as predictor names.
For example: the original predictors 'age' and 'power' are included in the model and
the interaction of 'age.binned' and 'power.binned' can also be present in the model formula.