Package 'DALSM'

Title:	Nonparametric Double Additive Location-Scale Model (DALSM)
Description:	Fit of a double additive location-scale model with a nonparametric error distribution from possibly right- or interval-censored data. The additive terms in the location and dispersion submodels, as well as the unknown error distribution in the location-scale model, are estimated using Laplace P-splines. For more details, see Lambert (2021) <doi:10.1016/j.csda.2021.107250>.
Authors:	Philippe Lambert [aut, cre]
Maintainer:	Philippe Lambert <[email protected]>
License:	GPL-3
Version:	0.9.4
Built:	2025-03-07 20:23:01 UTC
Source:	https://github.com/plambertuliege/dalsm

Help Index

Akaike Information Criterion (AIC) of a DALSM object.
Bayesian Information Criterion (BIC) of a DALSM object.
Generation of a recentered cubic B-spline basis matrix in additive models
Fit a double additive location-scale model (DALSM) with a flexible error distribution
Extraction of the estimated additive terms in a DALSM.object
Income data
Object resulting from the fit of a double additive location-scale model (DALSM).
Object creation for density estimation from right- or interval-censored data
Object created by Dens1d to prepare for density estimation from censored data using densityLPS
Constrained density estimation from censored data using Laplace P-splines
Object generated by function densityLPS
Log-determinant of a positive-definite matrix
Generic function for computing log-evidence
Log-evidence of a DALSM object.
Generation of the penalty matrix for an additive model based on P-splines
Plot visual information on a DALSM.object
Plot the density estimate in a densLPS.object
Plot a credible region for a curve together with its point estimates
Prediction based on a DALSM model
Print summary information on a DALSM.object
Print a summary of the information in a densLPS.object
Specification of knots in a cubic P-spline model
Histogram from weighted data

Akaike Information Criterion (AIC) of a DALSM object.

Description

Akaike Information Criterion (AIC) for the fitted DALSM model in a DALSM.object.

Usage

## S3 method for class 'DALSM'
AIC(object, ..., k=2)
## S3 method for class 'DALSM'
AIC(object, ..., k=2)

Arguments

`object`	A `DALSM.object`.
`...`	Other optional DALSM objects.
`k`	The penalty per parameter to be used. (Default: k=2 for the classical AIC).

Details

Akaike information criterion for the fitted model in a DALSM object, with a penalty calculated using the total effective degrees of freedom, -2log(L) + 2*ED.tot, smaller values being preferred during model selection.

Value

The AIC of the fitted DALSM model in x.

Author(s)

Philippe Lambert [email protected]

References

Time-varying exogenous covariates with frequently changing values in double additive cure survival model: an application to fertility. Journal of the Royal Statistical Society, Series A. <doi:10.1093/jrsssa/qnaf035>

Examples

require(DALSM)
data(DALSM_IncomeData)
resp = DALSM_IncomeData[,1:2]
fit = DALSM(y=resp,
            formula1 = ~twoincomes+s(age)+s(eduyrs),
            formula2 = ~twoincomes+s(age)+s(eduyrs),
            data = DALSM_IncomeData)
AIC(fit)

require(DALSM)
data(DALSM_IncomeData)
resp = DALSM_IncomeData[,1:2]
fit = DALSM(y=resp,
            formula1 = ~twoincomes+s(age)+s(eduyrs),
            formula2 = ~twoincomes+s(age)+s(eduyrs),
            data = DALSM_IncomeData)
AIC(fit)

Bayesian Information Criterion (BIC) of a DALSM object.

Description

Bayesian Information Criterion (BIC) for the fitted DALSM model in a DALSM.object.

Usage

## S3 method for class 'DALSM'
BIC(object, ...)
## S3 method for class 'DALSM'
BIC(object, ...)

Arguments

`object`	A `DALSM.object`.
`...`	Other optional DALSM objects.

Details

Bayesian (Schwarz) information for the fitted model in a DALSM object, with a penalty calculated using the total effective degrees of freedom and the total number of observed or interval-censored data, -2log(L) + log(n.uncensored)*ED.tot, smaller values being preferred during model selection.

Value

The BIC of the fitted DALSM model in x.

Author(s)

Philippe Lambert [email protected]

References

Examples

require(DALSM)
data(DALSM_IncomeData)
resp = DALSM_IncomeData[,1:2]
fit = DALSM(y=resp,
            formula1 = ~twoincomes+s(age)+s(eduyrs),
            formula2 = ~twoincomes+s(age)+s(eduyrs),
            data = DALSM_IncomeData)
BIC(fit)

require(DALSM)
data(DALSM_IncomeData)
resp = DALSM_IncomeData[,1:2]
fit = DALSM(y=resp,
            formula1 = ~twoincomes+s(age)+s(eduyrs),
            formula2 = ~twoincomes+s(age)+s(eduyrs),
            data = DALSM_IncomeData)
BIC(fit)

Generation of a recentered cubic B-spline basis matrix in additive models

Description

Generation of a cubic B-spline basis matrix with recentered columns to handle the identifiability constraint in additive models. See Wood (CRC Press 2017, pp. 175-176) for more details.

Usage

centeredBasis.gen(x, knots, cm = NULL, pen.order = 2)
centeredBasis.gen(x, knots, cm = NULL, pen.order = 2)

Arguments

`x`	vector of values where to compute the "recentered" B-spline basis.
`knots`	vector of knots (that should cover the values in <x>).
`cm`	(Optional) values subtracted from each column of the original B-spline matrix.
`pen.order`	penalty order for the B-spline parameters (Default: 2).

Value

List containing

B : centered cubic B-spline matrix (with columns recentered to have mean 0 over equi-spaced x values on the range of the knots).
Dd : difference matrix (of order <pen.order>) for the associated centered B-spline matrix.
Pd : penalty matrix (of order <pen.order>) for the associated centered B-spline matrix.
K : number of centered B-splines in the basis.
cm : values subtracted from each column of the original B-spline matrix. By default, this is a vector containing the mean of each column in the original B-spline matrix.

Author(s)

Philippe Lambert [email protected]

References

Lambert, P. (2021). Fast Bayesian inference using Laplace approximations in nonparametric double additive location-scale models with right- and interval-censored data. Computational Statistics and Data Analysis, 161: 107250. <doi:10.1016/j.csda.2021.107250>

Examples

x = seq(0,1,by=.01)
knots = seq(0,1,length=5)
obj = centeredBasis.gen(x,knots)
matplot(x,obj$B,type="l",ylab="Centered B-splines")
colMeans(obj$B)
x = seq(0,1,by=.01)
knots = seq(0,1,length=5)
obj = centeredBasis.gen(x,knots)
matplot(x,obj$B,type="l",ylab="Centered B-splines")
colMeans(obj$B)

Fit a double additive location-scale model (DALSM) with a flexible error distribution

Description

Fit a location-scale regression model with a flexible error distribution and additive terms in location (=mean) and dispersion (= log(sd)) using Laplace P-splines from potentially right- and interval-censored response data.

Usage

DALSM(y, formula1,formula2, w, data,
       K1=10, K2=10, pen.order1=2, pen.order2=2,
       a.pen=1, b.pen=1e-4, lambda1.min=1, lambda2.min=1,
       phi.0=NULL,psi1.0=NULL,psi2.0=NULL,
       psi.method=c("LM","NR","BFGS","NM"),
       lambda0.0=NULL,lambda1.0=NULL,lambda2.0=NULL,
       lambda.method=c("LPS","LPS2","none"), logscale=FALSE,
       density.method=c("LPS","Schall"),
       REML=FALSE, diag.only=TRUE, Normality=FALSE, sandwich=TRUE,
       joint.computation=FALSE,
       K.error=20, rmin=NULL,rmax=NULL,
       ci.level=.95,
       grad.tol=1e-2, RDM.tol=1e-4, fun.tol=1e-3,
       iterlim=50,verbose=FALSE)
DALSM(y, formula1,formula2, w, data,
       K1=10, K2=10, pen.order1=2, pen.order2=2,
       a.pen=1, b.pen=1e-4, lambda1.min=1, lambda2.min=1,
       phi.0=NULL,psi1.0=NULL,psi2.0=NULL,
       psi.method=c("LM","NR","BFGS","NM"),
       lambda0.0=NULL,lambda1.0=NULL,lambda2.0=NULL,
       lambda.method=c("LPS","LPS2","none"), logscale=FALSE,
       density.method=c("LPS","Schall"),
       REML=FALSE, diag.only=TRUE, Normality=FALSE, sandwich=TRUE,
       joint.computation=FALSE,
       K.error=20, rmin=NULL,rmax=NULL,
       ci.level=.95,
       grad.tol=1e-2, RDM.tol=1e-4, fun.tol=1e-3,
       iterlim=50,verbose=FALSE)

Arguments

`y`	n-vector of responses or (nx2)-matrix/data.frame when interval-censoring (with the 2 elements giving the interval bounds) or right-censoring (when the element in the 2nd column equals +Inf).
`formula1`	model formula for location (i.e. for the conditional mean).
`formula2`	model formula for dispersion (i.e. for the log of the conditional standard deviation).
`w`	vector of length `n` containing (non-negative) weights (default: rep(1,n)).
`data`	data frame containing the model covariates.
`K1`	(optional) number of B-splines to describe a given additive term in the location submodel (default: 10).
`K2`	(optional) number of B-splines to describe a given additive term in the dispersion submodel (default: 10).
`pen.order1`	(optional) penalty order for the additive terms in the location submodel (default: 2).
`pen.order2`	(optional) penalty order for the additive terms in the dispersion submodel (default: 2).
`a.pen`	(optional) prior on penalty parameter $\tau$ is Gamma(a.pen=1,b.pen) (for additive terms) (default: 1).
`b.pen`	(optional) prior on penalty parameter $\tau$ is Gamma(a.pen,b.pen=1e-4) (for additive terms) (default: 1e-4).
`lambda1.min`	(optional) minimal value for the penalty parameters in the additive model for location (default: 1.0).
`lambda2.min`	(optional) minimal value for penalty parameters in the additive model for log-dispersion (default: 1.0).
`phi.0`	(optional) initial values for the spline parameters in the log-hazard of the standardized error distribution.
`psi1.0`	(optional) initial values for the location submodel parameters.
`psi2.0`	(optional) initial values for the dispersion submodel parameters.
`psi.method`	(optional) algorithm used for the computation of the conditional posterior mode of the regression and splines parameters. Possible choices are Levenberg-Marquardt ("LM"), Newton-Raphson ("NR"), Broyden–Fletcher–Goldfarb–Shanno ("BFGS") and Nelder-Mead ("NM") (default: "LM").
`lambda0.0`	(optional) initial value for the penalty parameter used during the estimation of the error density.
`lambda1.0`	(optional) initial value for the J1 penalty parameters of the additive terms in the location submodel.
`lambda2.0`	(optional) initial value for the J2 penalty parameters of the additive terms in the dispersion submodel.
`lambda.method`	(optional) algorithm used for the computation of the marginal posterior mode of the penalty parameters tuning the smoothness of the estimated additive terms. Possible choices are Laplace P-splines ("LPS" or "LPS2") or "none" if fixed quantities given by initial values (by default set at 100) are desired.
`logscale`	this parameter is only relevant when "LPS2" is chosen for `lambda.method`. Setting `logscale=TRUE` selects the penalty parameter on the log scale, maximizing the marginal posterior of its log-transformed value (default: FALSE).
`density.method`	algorithm used for the selection of the penalty parameter during density estimation. Possible choices are Laplace- P-splines ("LPS") or Fellner-Schall ("Schall").
`REML`	(optional) logical indicating if a REML correction is desired to estimate the dispersion parameters (default: FALSE).
`diag.only`	(optional) logical indicating if only the diagonal of the Hessian needs to be corrected during REML (default: TRUE).
`Normality`	(optional) logical indicating if Normality is assumed for the error term (default: FALSE).
`sandwich`	(optional) logical indicating if sandwich variance estimators are needed for the regression parameters for a NP error distribution (when Normality=FALSE) ; it is forced to be TRUE when Normality=TRUE.
`joint.computation`	indicates whether the computation of the effective degrees of freedom for the additive terms should be made jointly for the location and dispersion additive terms (default: FALSE).
`K.error`	(optional) number of B-splines to approximate the log of the error hazard on (rmin,rmax) (default: 20).
`rmin`	(optional) minimum value for the support of the standardized error distribution.
`rmax`	(optional) maximum value for the support of the standardized error distribution.
`ci.level`	(optional) nominal level for the reported credible intervals (default: .95)
`grad.tol`	tolerance threshold for the L2-norm of the gradient when monitoring convergence. (default: 1e-2).
`RDM.tol`	tolerance thershold for the Relative Damping Measure (= RDM) when monitoring convergence (default: 1e-4).
`fun.tol`	tolerance threshold for variations in the maximized function during the final iterations of posterior mode computation and convergence monitoring (default: 1e-3).
`iterlim`	(optional) maximum number of iterations (after which the algorithm is interrupted with a non-convergence diagnostic) (default: 50).
`verbose`	(optional) logical indicating whether estimation step details should be displayed (default: FALSE).

Value

An object of class DASLM containing several components from the fit, see DALSM.object for details. A summary can be printed using print.DALSM or plotted using plot.DALSM.

Author(s)

Philippe Lambert [email protected]

References

Examples

require(DALSM)
data(DALSM_IncomeData)
resp = DALSM_IncomeData[,1:2]
fit = DALSM(y=resp,
            formula1 = ~twoincomes+s(age)+s(eduyrs),
            formula2 = ~twoincomes+s(age)+s(eduyrs),
            data = DALSM_IncomeData)
print(fit)
plot(fit, select=0) ## Estimated error density
plot(fit, select=1:4, pages=1) ## Estimated additive terms

require(DALSM)
data(DALSM_IncomeData)
resp = DALSM_IncomeData[,1:2]
fit = DALSM(y=resp,
            formula1 = ~twoincomes+s(age)+s(eduyrs),
            formula2 = ~twoincomes+s(age)+s(eduyrs),
            data = DALSM_IncomeData)
print(fit)
plot(fit, select=0) ## Estimated error density
plot(fit, select=1:4, pages=1) ## Estimated additive terms

Extraction of the estimated additive terms in a `DALSM.object`

Description

Extract the estimated additive terms with their standard errors from a DALSM.object resulting from the fit of a DALSM model. In addition to the estimated functions in the location and dispersion submodels, their values on a regular grid covering the observed covariate values are reported together with credible intervals. The mean effective coverage of these pointwise credible intervals for the additive terms with respect to given (optional) reference functions (such as the ones for the 'true' additive terms used to generate data in a simulation study) can also be computed.

Usage

DALSM_additive(obj.DALSM, ngrid=101,
       true.loc=NULL, true.disp=NULL, ci.level=NULL,
       verbose=FALSE)
DALSM_additive(obj.DALSM, ngrid=101,
       true.loc=NULL, true.disp=NULL, ci.level=NULL,
       verbose=FALSE)

Arguments

`obj.DALSM`	a `DALSM.object`
`ngrid`	(optional) grid size of covariate values where the additive terms are calculated (default: 101).
`true.loc`	(optional) list of functions containing the 'true' additive terms in the location sub-model.
`true.disp`	(optional) list of functions containing the 'true' additive terms in the dispersion sub-model.
`ci.level`	(optional) level of credible intervals.
`verbose`	logical indicating whether the computed corverages should be printed out (default: TRUE).

Value

It returns an invisible list containing:

J1 : number of additive terms in the location sub-model.
labels.loc : labels of the additive terms in the location sub-model.
f.loc.grid : list of length J1 with, for each additive term, a list of length 3 with elements 'x': a vector of ngrid values for the covariate ; 'y.mat': a matrix with 3 columns (est,low,up) giving the additive term and its pointwise credible region ; 'y.mat2': a matrix with 3 columns (est,low,up) giving the additive term and its simultaneous credible region ; se: the standard error of the additive term on the x-grid.
f.loc : a list of length J1 with, for each additive term <x>, a list with f.loc$x: a function computing the additive term f.loc(x) for a given covariate value 'x' ; attributes(f.loc$x): support, label, range.
se.loc : a list of length J1 with, for each additive term <x>, a list with se.loc$x: a function computing the s.e. of f(x) for a given covariate value 'x' ; attributes(se.loc$x): support, label, range.
coverage.loc : if true.loc is provided: a vector of length J1 giving the average effective coverage of pointwise credible intervals for each of the additive terms in the location sub-model.
J2 : number of additive terms in the dispersion sub-model.
labels.disp : labels of the additive terms in the dispersion sub-model.
f.disp.grid : list of length J2 with, for each additive term, a list of length 3 with elements 'x': a vector of ngrid values for the covariate ; 'y.mat': a matrix with 3 columns (est,low,up) giving the additive term and its pointwise credible region ; 'y.mat2': a matrix with 3 columns (est,low,up) giving the additive term and its simultaneous credible region ; se: the standard error of the additive term on the x-grid.
f.disp : a list of length J2 with, for each additive term <x>, a list with f.disp$x: a function computing the additive term f.disp(x) for a given covariate value 'x' ; attributes(f.disp$x): support, label, range.
se.disp : a list of length J2 with, for each additive term <x>, a list with se.disp$x: a function computing the s.e. of f(x) for a given covariate value 'x' ; attributes(se.disp$x): support, label, range.
coverage.disp : if <true.disp> is provided: a vector of length J2 giving the average effective coverage of pointwise credible intervals for each of the additive terms in the dispersion sub-model.

Author(s)

Philippe Lambert [email protected]

References

Examples

require(DALSM)
data(DALSM_IncomeData)
resp = DALSM_IncomeData[,1:2]
fit = DALSM(y=resp,
            formula1 = ~twoincomes+s(age)+s(eduyrs),
            formula2 = ~twoincomes+s(age)+s(eduyrs),
            data = DALSM_IncomeData)
obj = DALSM_additive(fit)
str(obj)

## Visualize the estimated additive terms for Age
## ... in the location submodel
with(obj$f.loc.grid$age, matplot(x,y.mat,
                                 xlab="Age",ylab="f.loc(Age)",
                                 type="l",col=1,lty=c(1,2,2)))
## ... and in the dispersion submodel
with(obj$f.disp.grid$age, matplot(x,y.mat,
                                 xlab="Age",ylab="f.disp(Age)",
                                 type="l",col=1,lty=c(1,2,2)))
## Also report their values for selected age values
obj$f.loc$age(c(30,40,50)) ; obj$f.disp$age(c(30,40,50))
## ... together with their standard errors
obj$se.loc$age(c(30,40,50)) ; obj$se.disp$age(c(30,40,50))
require(DALSM)
data(DALSM_IncomeData)
resp = DALSM_IncomeData[,1:2]
fit = DALSM(y=resp,
            formula1 = ~twoincomes+s(age)+s(eduyrs),
            formula2 = ~twoincomes+s(age)+s(eduyrs),
            data = DALSM_IncomeData)
obj = DALSM_additive(fit)
str(obj)

## Visualize the estimated additive terms for Age
## ... in the location submodel
with(obj$f.loc.grid$age, matplot(x,y.mat,
                                 xlab="Age",ylab="f.loc(Age)",
                                 type="l",col=1,lty=c(1,2,2)))
## ... and in the dispersion submodel
with(obj$f.disp.grid$age, matplot(x,y.mat,
                                 xlab="Age",ylab="f.disp(Age)",
                                 type="l",col=1,lty=c(1,2,2)))
## Also report their values for selected age values
obj$f.loc$age(c(30,40,50)) ; obj$f.disp$age(c(30,40,50))
## ... together with their standard errors
obj$se.loc$age(c(30,40,50)) ; obj$se.disp$age(c(30,40,50))

Income data

Description

Income (in euros) available per person in 756 Belgian households (ESS 2016 ; Lambert 2021, Section 4) for respondents aged 25-55 when the main source of income comes from wages or salaries. The data were extracted from the European Social Survey (2016) and reworked to quantify the resources available at the individual level in the selected Belgian households.

Usage

data(DALSM_IncomeData)
data(DALSM_IncomeData)

Format

A data frame with 756 rows and 5 columns:

inc.low : lower bound for the individual income data.
inc.up : upper bound for the individual income data (Inf if right-censored).
twoincomes : 0: one-income household ; 1: two-income household.
age : age of the respondent.
eduyrs : number of years of education completed.

Author(s)

Philippe Lambert [email protected]

References

European Social Survey Round 8 Data (2016) Data file edition 2.1. NSD - Norwegian Centre for Research Data, Norway.

Object resulting from the fit of a double additive location-scale model (DALSM).

Description

An object returned by the DALSM function: this is a list with various components related to the fit of a double additive location-scale model using Laplace P-splines.

Value

A DALSM object has the following elements:

Essential part:

converged : logical convergence indicator.
derr : estimated standardized error distribution returned as a densLPS.object.
psi1 : estimated regression parameters for location (fixed effects, B-spline coefs for the J1 additive terms).
psi2 : estimated regression parameters for dispersion (fixed effects, B-spline coefs for the J2 additive terms).
fixed.loc : matrix with the estimated fixed effects (est,se,ci.low,ci.up) in the location sub-model.
fixed.disp : matrix with the estimated fixed effects (est,se,ci.low,ci.up) in the dispersion sub-model.
mu : n-vector with the fitted conditional mean.
sd : n-vector with the fitted conditional standard deviation.
dev: deviance for the fitted model.
llik: log-likelihood for the fitted model.
logEvid : log-evidence (or marginal likelihood) for the fitted model (conditionally on the selected penalty parameters).
AIC: Akaike Information Criterion for the fitted model.
BIC: Bayesian Information Criterion for the fitted model.
ED.tot: total effective degrees of freedom for the regression parts of the fitted model.

Additional elements:

data : the original data frame used when calling the DALSM function.
phi : estimated B-spline coefs for the log-hazard of the error distribution.
K.error : number of B-splines used to approximate the log of the error hazard.
rmin, rmax : minimum and maximum values for the support of the standardized error distribution.
knots.error : equidistant knots on (rmin,rmax) used to specify the B-spline basis for the log of the error hazard.
bread.psi1, Sand.psi1, Cov.psi1: estimated Variance-Covariance matrix for $\psi_1$ .
U.psi1 : gradient for $\psi_1$ .
bread.psi2, Sand.psi2, Cov.psi2: estimated Variance-Covariance matrix for $\psi_2$ .
U.psi2 : gradient for $\psi_2$ .
U.psi : gradient for $\psi=(\psi_1,\psi_2)$ .
Cov.psi : variance-covariance for $\psi=(\psi_1,\psi_2)$ .
regr1 : object generated by DesignFormula for the specified submodel for location.
regr2 : object generated by DesignFormula for the specified submodel for dispersion.
res : n-vector or nx2 matrix (if IC data) with the standardized residuals for the fitted model.
expctd.res : n-vector with observed standardized residual for a non RC unit, or their expected value if right-censored.
REML : logical indicating whether REML estimation was performed.
n : the sample size.
n.uncensored : number of non-censored response data.
event : n-vector of event indicators (1: non right-censored ; 0: right censoring).
is.IC : n-vector with interval censoring indicators.
n.IC : number of interval-censored response data.
n.RC : number of right-censored response data.
perc.obs : percentage of exactly observed response data.
perc.IC : percentage of interval-censored response data.
perc.RC : percentage of right-censored response data.
cred.int : nominal level for the reported credible intervals.
alpha : user-specified $\alpha$ with Bayesian $(1-\alpha)$ credible intervals reported.
sandwich : logical indicating if variance-covariance and standard errors computed using sandwich estimator in the NP case.
diag.only : logical indicating if the correction to the Hessian under REML only concerns diagonal elements.
iter : number of iterations.
elapsed.time : time required by the model fitting procedure.

If there are additive terms in the location submodel:

K1 : number of B-splines used to describe an additive term in the location submodel.
xi1 : matrix with the selected log penalty parameters for the J1 additive terms in the location submodel (point estimate, se, ci.low, ci.up.
U.xi1 : gradient for the log of the penalty parameters for the J1 additive terms in the location submodel.
U.lambda1 : gradient for the penalty parameters for the J1 additive terms in the location submodel.
Cov.xi1 : estimated Variance-Covariance matrix for the parameters involved in the J1 additive terms in the location submodel.
lambda1.min : minimal value for the penalty parameters in the additive submodel for location.
lambda1 : matrix with the selected penalty parameters for the J1 additive terms in the location submodel (point estimate, se, ci.low, ci.up).
ED1 : matrix with the effective dimensions for each of the J1 additive terms in the location submodel (point estimate,ci.low,ci.up).

If there are additive terms in the dispersion submodel:

K2 : number of B-splines used to describe an additive term in the dispersion submodel.
xi2 : matrix with the selected log penalty parameters for the J2 additive terms in the dispersion submodel (point estimate, se, ci.low, ci.up).
U.xi2 : gradient for the log of the penalty parameters for the J2 additive terms in the dispersion submodel.
U.lambda2 : gradient for the penalty parameters for the J2 additive terms in the dispersion submodel.
Cov.xi2 : estimated Variance-Covariance matrix for the parameters involved in the J2 additive terms in the dispersion submodel.
lambda2.min : minimal value for the penalty parameters in the additive submodel for dispersion.
lambda2 : matrix with the selected penalty parameters for the J2 additive terms in the dispersion submodel (point estimate, se, ci.low, ci.up).
ED2 : matrix with the effective dimensions for each of the J2 additive terms in the dispersion submodel (point estimate,ci.low,ci.up).

Author(s)

Philippe Lambert [email protected]

References

Object creation for density estimation from right- or interval-censored data

Description

Object creation for density estimation from right- or interval-censored data using function densityLPS.

Usage

Dens1d(y, event=NULL, w, ymin=NULL, ymax=NULL,
       K=25, equid.knots=TRUE, pen.order=2, nbins=501)
Dens1d(y, event=NULL, w, ymin=NULL, ymax=NULL,
       K=25, equid.knots=TRUE, pen.order=2, nbins=501)

Arguments

`y`	a n-vector (if no interval-censored data) or a nx2 matrix (left and right limits of the interval for IC data ; right limit set to Inf for right-censored data).
`event`	a n-vector of observation indicators (0: right-censored ; 1: exactly observed or interval-censored).
`w`	vector of length `n` containing (non-negative) weights (default: rep(1,n)).
`ymin`	left limit of the variable support.
`ymax`	right limit of the variable support.
`K`	number of B-splines in the basis to approximate the log-hazard.
`equid.knots`	logical indicating if equidistants knots are desired.
`pen.order`	penalty order when equidistant knots (otherwise: penalty matrix computed to penalize the second derivative).
`nbins`	number of small bins used for quadrature and approximations.

Value

A Dens1d.object, i.e. a list with summary measures and precomputed components required for density estimation using densityLPS.

Author(s)

Philippe Lambert [email protected]

References

Examples

library(DALSM)
data(DALSM_IncomeData)
resp = DALSM_IncomeData[,1:2]
head(resp,n=20)
temp = Dens1d(y=resp,ymin=0) ## Create Dens1d object from positive censored data
obj = densityLPS(temp) ## Density estimation from IC & RC data
plot(obj) ## Visualize the estimated density

library(DALSM)
data(DALSM_IncomeData)
resp = DALSM_IncomeData[,1:2]
head(resp,n=20)
temp = Dens1d(y=resp,ymin=0) ## Create Dens1d object from positive censored data
obj = densityLPS(temp) ## Density estimation from IC & RC data
plot(obj) ## Visualize the estimated density

Object created by `Dens1d` to prepare for density estimation from censored data using `densityLPS`

Description

An object returned by function Dens1d to prepare for density estimation with given mean and variance from censored data using function densityLPS.

Value

A Dens1d.object is a list with the following elements:

n : total (unweighted) sample size.
y : a n-vector if no interval-censored data, a nx2 matrix otherwise (for interval-censored data: left and right limits of the interval ; for right-censored data: the right limit is set to Inf).
event : a n-vector of observation indicators (0: right-censored ; 1: exactly observed or interval-censored).
w : a n-vector containing (non-negative) weights associated to each entry in y.
sw : total weighted sample size given by the sum of the weights associated to y.
ymin : left limit of the variable support.
ymax : right limit of the variable support.
is.uncensored : boolean n-vector indicating if the corresponding <y> value is not right- or interval-censored.
n.uncensored : total weighted number of uncensored data.
is.IC : boolean n-vector indicating if the corresponding y value is interval-censored.
n.IC : total weighted number of interval-censored data.
is.RC : n-vector of booleans indicating if the corresponding y value is right-censored.
n.RC : total weighted number of right-censored data.
ylow, yup : n-vector with the lower and upper limits of the interval data (when interval-censored). When y is exactly observed or right-censored, the two values are identical.
ymid : n-vector containing the mean of y.low and y.up.
K : number of B-splines in the basis used to model the log-hazard.
knots : vector of knots for the B-spline basis.
pen.order : penalty order.
Pd : penalty matrix in the P-spline model.
nbins : number of small bins used to partition the variable support.
bins : small bins of equal width used to partition the variable support (cf. binning).
ugrid : midpoints of the small bins.
dbins : width of a small bin.
Bbins : ((nbins +1) x K)-matrix containing the B-spline basis evaluated at the bin limits.
C : (n x nbins) matrix of event or censoring indicators $C_{ij}$ for unit 'i' and bin 'j'. For a unit with IC data, the bins with an non-empty intersection with the interval are indicated. When the unit is associated to a precise event time or censoring time in bin 'j', then $C_{ij}=1$ and 0 for other bins.
Bgrid : (nbins x K)-matrix containing the B-spline basis evaluated at the bin midpoints.
fgrid : nbins-vector with the estimated density values at the bin midpoints.
rgrid : nbins-vector with the number of units 'still at risk' of an "event" in a given bin.
BsB, ev : technical quantities used in the estimation procedure, see the code in Dens1d2.R for more details.

Author(s)

Philippe Lambert [email protected]

References

Constrained density estimation from censored data using Laplace P-splines

Description

P-spline estimation of the density (pdf), cumulative distribution (cdf), hazard and cumulative hazard functions from interval- or right-censored data under possible marginal mean and/or variance constraints. The penalty parameter $\tau$ tuning the smoothness of the log-hazard can be selected using the Laplace P-splines (LPS) method maximizing an approximation to the marginal posterior of $\tau$ (also named the 'evidence') or using Schall's method.

Usage

densityLPS(obj.data,
       is.density=TRUE, Mean0=NULL, Var0=NULL,
       fixed.penalty=FALSE, method=c("LPS","Schall"),
       fixed.phi=FALSE,phi.ref=NULL, phi0=NULL,tau0=exp(5),tau.min=.1,
       verbose=FALSE)
densityLPS(obj.data,
       is.density=TRUE, Mean0=NULL, Var0=NULL,
       fixed.penalty=FALSE, method=c("LPS","Schall"),
       fixed.phi=FALSE,phi.ref=NULL, phi0=NULL,tau0=exp(5),tau.min=.1,
       verbose=FALSE)

Arguments

`obj.data`	a list created from potentially right- or interval-censored data using `Dens1d`. It includes summary statistics, the assumed density support, the knots for the B-spline basis, etc.
`is.density`	(optional) logical indicating whether the estimated density should integrate to 1.0 over the range of the knots in obj.data$knots (default: TRUE).
`Mean0`	(optional) constrained value for the mean of the fitted density (defaut: NULL).
`Var0`	(optional) constrained value for the variance of the fitted density (defaut: NULL).
`fixed.penalty`	(optional) logical indicating whether the penalty parameter should be selected from the data (`fixed.penalty`=FALSE) or fixed (`fixed.penalty`=TRUE) at its initial value $\tau_0$ .
`method`	method used for the selection of the penalty parameter: "LPS" (by maximizing the marginal posterior for $\tau$ , cf. "Laplace P-splines") or "Schall" (Schall's method).
`fixed.phi`	(optional) logical indicating whether the spline parameters are fixed (`fixed.phi`=TRUE) or estimated from the data (default: `fixed.phi`=FALSE).
`phi.ref`	(optional) reference value for the spline parameters with respect to which deviations are penalized (default: zero vector).
`phi0`	starting value for the spline parameters (default: spline parameters corresponding to a Student density with 5 DF).
`tau0`	(optional) initial value for the penalty parameter $\tau$ (default: exp(5)).
`tau.min`	(optional) minimal value for the penalty parameter $\tau$ (default: .1).
`verbose`	(optional) logical indicating whether estimation step details should be displayed (default: FALSE).

Value

a densLPS.object containing the density estimation results.

Author(s)

Philippe Lambert [email protected]

References

Examples

library(DALSM)

## Example 1: estimation of the error density in a DALSM model
require(DALSM)
data(DALSM_IncomeData)
resp = DALSM_IncomeData[,1:2]
fit = DALSM(y=resp,
            formula1 = ~twoincomes+s(age)+s(eduyrs),
            formula2 = ~twoincomes+s(age)+s(eduyrs),
            data = DALSM_IncomeData)
plot(fit$derr)  ## Plot the estimated error density
print(fit$derr) ## ... and provide summary statistics for it

## Example 2: density estimation from censored income data
data(DALSM_IncomeData)
resp = DALSM_IncomeData[,1:2]
head(resp,n=20)
temp = Dens1d(y=resp,ymin=0) ## Create Dens1d object from positive censored data
obj = densityLPS(temp) ## Density estimation from IC & RC data
print(obj) ## Summary information on the estimated density
plot(obj,hist=TRUE) ## Visualize the estimated density
legend("topright",col=c("black","grey"),lwd=c(2,20),
       legend=c("Fitted density","Pseudo-data"),bty="n")

## Example 3: density estimation from simulated RC and IC data
## Data generation
set.seed(123)
n = 500 ## Sample size
x = rgamma(n,10,2) ## Exact (unobserved) data
width = runif(n,1,3) ## Width of the IC data (mean width = 2)
w = runif(n) ## Positioning of the exact data within the interval
xmat = cbind(pmax(0,x-w*width),x+(1-w)*width) ## Generated IC data
t.cens = rexp(n,1/15) ## Right-censoring values
idx.RC = (1:n)[t.cens<x] ## Id's of the right-censored units
xmat[idx.RC,] = cbind(t.cens[idx.RC],Inf) ## Data for RC units: (t.cens,Inf)
head(xmat,15)
## Density estimation with mean and variance constraints
obj.data = Dens1d(xmat,ymin=0) ## Prepare the data for estimation
obj = densityLPS(obj.data,Mean0=10/2,Var0=10/4) ## Density estimation
print(obj)
plot(obj) ## Plot the estimated density
curve(dgamma(x,10,2), ## ... and compare it to the true density (in red)
      add=TRUE,col="red",lwd=2,lty=2)
legend("topright",col=c("black","red"),lwd=c(2,2),lty=c(1,2),
       legend=c("Estimated density","True density"),bty="n")
## Same story for the cdf
with(obj, curve(pdist(x),ymin,ymax,lwd=2,xlab="",ylab="F(x)"))
curve(pgamma(x,10,2),add=TRUE,col="red",lwd=2,lty=2)
legend("right",col=c("black","red"),lwd=c(2,2),lty=c(1,2),
       legend=c("Estimated cdf","True cdf"),bty="n")

## Example 4: density estimation from weighted grouped data
set.seed(123)
n = 500 ## Sample size
y = rgamma(n,10,2) ## Generate raw data
brks = c(0,2,4,6,10,12) ## Limits of the grouped data
y.cat = cut(y,breaks=brks) ## Categorization of the raw data
ymat = cbind(low=head(brks,-1), up=brks[-1]) ## Different grouped data
w = table(y.cat) ## Weight associated to each group data
print(cbind(ymat,w)) ## Observed grouped data
obj.data = Dens1d(y=ymat,w=w,ymin=0,pen.order=2) ## Data object
obj = densityLPS(obj.data,Mean0=10/2,Var0=10/4)  ## Density estimation
print(obj)
plot(obj,lwd=2)
weighted.hist(rowMeans(ymat),breaks=brks,weight=w,freq=FALSE,
              main="Estimated density",border=TRUE,col="#D9D9D980",add=TRUE)
curve(dgamma(x,10,2),add=TRUE,col="red",lwd=2,lty=2)
legend("topright",col=c("black","red","grey"),lwd=c(2,2,10),lty=c(1,2,1),
       legend=c("Estimated density","True density","Observed grouped data"),bty="n")

library(DALSM)

## Example 1: estimation of the error density in a DALSM model
require(DALSM)
data(DALSM_IncomeData)
resp = DALSM_IncomeData[,1:2]
fit = DALSM(y=resp,
            formula1 = ~twoincomes+s(age)+s(eduyrs),
            formula2 = ~twoincomes+s(age)+s(eduyrs),
            data = DALSM_IncomeData)
plot(fit$derr)  ## Plot the estimated error density
print(fit$derr) ## ... and provide summary statistics for it

## Example 2: density estimation from censored income data
data(DALSM_IncomeData)
resp = DALSM_IncomeData[,1:2]
head(resp,n=20)
temp = Dens1d(y=resp,ymin=0) ## Create Dens1d object from positive censored data
obj = densityLPS(temp) ## Density estimation from IC & RC data
print(obj) ## Summary information on the estimated density
plot(obj,hist=TRUE) ## Visualize the estimated density
legend("topright",col=c("black","grey"),lwd=c(2,20),
       legend=c("Fitted density","Pseudo-data"),bty="n")

## Example 3: density estimation from simulated RC and IC data
## Data generation
set.seed(123)
n = 500 ## Sample size
x = rgamma(n,10,2) ## Exact (unobserved) data
width = runif(n,1,3) ## Width of the IC data (mean width = 2)
w = runif(n) ## Positioning of the exact data within the interval
xmat = cbind(pmax(0,x-w*width),x+(1-w)*width) ## Generated IC data
t.cens = rexp(n,1/15) ## Right-censoring values
idx.RC = (1:n)[t.cens<x] ## Id's of the right-censored units
xmat[idx.RC,] = cbind(t.cens[idx.RC],Inf) ## Data for RC units: (t.cens,Inf)
head(xmat,15)
## Density estimation with mean and variance constraints
obj.data = Dens1d(xmat,ymin=0) ## Prepare the data for estimation
obj = densityLPS(obj.data,Mean0=10/2,Var0=10/4) ## Density estimation
print(obj)
plot(obj) ## Plot the estimated density
curve(dgamma(x,10,2), ## ... and compare it to the true density (in red)
      add=TRUE,col="red",lwd=2,lty=2)
legend("topright",col=c("black","red"),lwd=c(2,2),lty=c(1,2),
       legend=c("Estimated density","True density"),bty="n")
## Same story for the cdf
with(obj, curve(pdist(x),ymin,ymax,lwd=2,xlab="",ylab="F(x)"))
curve(pgamma(x,10,2),add=TRUE,col="red",lwd=2,lty=2)
legend("right",col=c("black","red"),lwd=c(2,2),lty=c(1,2),
       legend=c("Estimated cdf","True cdf"),bty="n")

## Example 4: density estimation from weighted grouped data
set.seed(123)
n = 500 ## Sample size
y = rgamma(n,10,2) ## Generate raw data
brks = c(0,2,4,6,10,12) ## Limits of the grouped data
y.cat = cut(y,breaks=brks) ## Categorization of the raw data
ymat = cbind(low=head(brks,-1), up=brks[-1]) ## Different grouped data
w = table(y.cat) ## Weight associated to each group data
print(cbind(ymat,w)) ## Observed grouped data
obj.data = Dens1d(y=ymat,w=w,ymin=0,pen.order=2) ## Data object
obj = densityLPS(obj.data,Mean0=10/2,Var0=10/4)  ## Density estimation
print(obj)
plot(obj,lwd=2)
weighted.hist(rowMeans(ymat),breaks=brks,weight=w,freq=FALSE,
              main="Estimated density",border=TRUE,col="#D9D9D980",add=TRUE)
curve(dgamma(x,10,2),add=TRUE,col="red",lwd=2,lty=2)
legend("topright",col=c("black","red","grey"),lwd=c(2,2,10),lty=c(1,2,1),
       legend=c("Estimated density","True density","Observed grouped data"),bty="n")

Object generated by function `densityLPS`

Description

An object returned by function densityLPS: this is a list with various components related to the estimation of a density with given mean and variance from potentially right- or interval-censored data using Laplace P-splines.

Value

An object returned by densityLPS has the following elements: Essential part:

converged : logical convergence indicator.
ddist : fitted density function.
Hdist : fitted cumulative hazard function.
hdist : fitted hazard function.
pdist : fitted cumulative distribution function.
ymin, ymax : assumed values for the support of the distribution.
phi : estimated B-spline coefficients for the log-hazard of the error distribution.
U.phi : score of the Lagrangian G( $\phi|\omega$ ).
tau, ltau : selected penalty parameter and its logarithm.
est : vector containing the estimated/selected ( $\phi,\log\tau$ ) parameters.
fixed.phi : logical indicating whether the spline parameters were given fixed values or estimated from the data.
phi.ref : reference values for the spline parameters with respect to which $\phi$ is compared during penalization.
BWB : Hessian for $\phi$ without the penalty contribution.
Prec : Hessian or posterior precision matrix for $\phi$ .
Fisher : Fisher information for $\phi$ .
bins, ugrid, du : bins (of width 'du') and with midpoints 'ugrid' partitioning the support of the density.
h.grid, H.grid, dens.grid : hazard, cumulative hazard and density values at the grid midpoints 'ugrid'.
h.bins, H.bins, dens.bins : hazard, cumulative hazard and density values at the bin limits 'bins'.
expected : expected number of observations within each bin.
Finfty : integrated density value over the considered support.
Mean0, Var0 : when specified, constrained mean and variance values during estimation.
mean.dist, var.dist : mean and variance of the fitted density.
method : method used for penaly selection: "evidence" (by maximizing the marginal posterior for $\tau$ ) or "Schall" (Schall's method).
ed : effective number of (spline) parameters.
iterations : total number of iterations necessary for convergence.
elapsed.time : time required for convergence.

Additional elements: the content of the Dens1d.object used when densityLPS was called.

Author(s)

Philippe Lambert [email protected]

References

Log-determinant of a positive-definite matrix

Description

Log-determinant of a positive-definite matrix

Usage

logDet.fun(x)
logDet.fun(x)

Arguments

`x`	positive definite matrix.

Value

log of det(x).

Author(s)

Philippe Lambert [email protected]

References

Examples

A = matrix(1:4,ncol=2)
logDet.fun(A)

A = matrix(1:4,ncol=2)
logDet.fun(A)

Generic function for computing log-evidence

Description

This is a generic function returning the log-evidence for the class of the input object.

Usage

logEvid(object, ...)
logEvid(object, ...)

Arguments

`object`	An object for which log-evidence is to be computed.
`...`	Additional arguments (not used for now).

Log-evidence of a DALSM object.

Description

The log-evidence of the fitted DALSM model in a DALSM.object.

Usage

## S3 method for class 'DALSM'
logEvid(object, ...)
## S3 method for class 'DALSM'
logEvid(object, ...)

Arguments

`object`	A `DALSM.object`.
`...`	Optionally more DALSM objects.

Details

Provides the log-evidence (or log-marginal likelihood) of the fitted DALSM model in a given DALSM.object, where the evidence is the marginal posterior of the penalty parameters at their selected values.

Value

The log-evidence of the DALSM model in object.

Author(s)

Philippe Lambert [email protected]

References

Examples

require(DALSM)
data(DALSM_IncomeData)
resp = DALSM_IncomeData[,1:2]
fit = DALSM(y=resp,
            formula1 = ~twoincomes+s(age)+s(eduyrs),
            formula2 = ~twoincomes+s(age)+s(eduyrs),
            data = DALSM_IncomeData)
logEvid(fit)

require(DALSM)
data(DALSM_IncomeData)
resp = DALSM_IncomeData[,1:2]
fit = DALSM(y=resp,
            formula1 = ~twoincomes+s(age)+s(eduyrs),
            formula2 = ~twoincomes+s(age)+s(eduyrs),
            data = DALSM_IncomeData)
logEvid(fit)

Generation of the penalty matrix for an additive model based on P-splines

Description

Compute the penalty matrix associated to a vector containing fixed (non-penalized) parameters and equal-size sub-vectors of penalized B-spline parameters.

Usage

Pcal.fun(nfixed, lambda, Pd.x)
Pcal.fun(nfixed, lambda, Pd.x)

Arguments

`nfixed`	the number of fixed (i.e. non-penalized) parameters.
`lambda`	a vector of `p` penalty parameters where each component is associated to a sub-vector of spline parameters of length `J`.
`Pd.x`	a penalty matrix of size `J` associated to a given sub-vector of spline parameters.

Value

A block diagonal penalty matrix of size (nfixed+pJ) given by Blockdiag(diag(0,nfixed), diag(lambda).kron.Pd.x).

Author(s)

Philippe Lambert [email protected]

References

Examples

D = diff(diag(5),diff=2) ## Difference penalty matrix of order 2 for 5 P-spline parameters
P = t(D) %*% D ## Penalty matrix of order 2 for 5 B-spline parameters
lambda = c(100,10) ## Penalty parameters for 2 additive terms with 5 B-spline parameters each
Pcal.fun(3,lambda,P) ## Global penalty matrix for 3 additional non-penalized parameters
D = diff(diag(5),diff=2) ## Difference penalty matrix of order 2 for 5 P-spline parameters
P = t(D) %*% D ## Penalty matrix of order 2 for 5 B-spline parameters
lambda = c(100,10) ## Penalty parameters for 2 additive terms with 5 B-spline parameters each
Pcal.fun(3,lambda,P) ## Global penalty matrix for 3 additional non-penalized parameters

Plot visual information on a `DALSM.object`

Description

Visualize the estimated additive terms and error density corresponding to a Double Additive Location-Scale Model (DALSM) object.

Usage

## S3 method for class 'DALSM'
plot(x, ngrid=300, ci.level=.95, pages=0, select=NULL,
       fill=TRUE, pointwise=TRUE, mar=c(4,5,1,1),
       xlim0=NULL, ylim0=NULL, xlim1=NULL, ylim1=NULL, xlim2=NULL, ylim2=NULL,
       equal.ylims=TRUE, ...)
## S3 method for class 'DALSM'
plot(x, ngrid=300, ci.level=.95, pages=0, select=NULL,
       fill=TRUE, pointwise=TRUE, mar=c(4,5,1,1),
       xlim0=NULL, ylim0=NULL, xlim1=NULL, ylim1=NULL, xlim2=NULL, ylim2=NULL,
       equal.ylims=TRUE, ...)

Arguments

`x`	a `DALSM.object`.
`ngrid`	(optional) number of points to make the plots for the fitted additive terms (default: 300).
`ci.level`	(optional) nominal level for the plotted pointwise credible intervals (default: .95).
`pages`	The number of pages over which to spread the output. For example, if pages=1 then all terms will be plotted on one page with the layout performed automatically. Set to 0 to have the routine leave all graphics settings as they are (default: 0).
`select`	Allows the plot for a single model term to be selected for printing. e.g. if you just want the plot for the second smooth term set select=2. The plot of the standardized error density $f_0(z)$ is provided when select=0 (default: NULL).
`fill`	Logical indicating whether credible regions should be greyed out. (Default: TRUE).
`pointwise`	Logical indicating whether only pointwise credible intervals should be plotted for additive terms. When FALSE, simultaneous credible regions are also provided. (Default: TRUE).
`mar`	A numerical vector of the form c(bottom, left, top, right) which gives the number of lines of margin to be specified on the four sides of the plot. (Default: c(4,5,1,1)).
`xlim0`	Vector of length 2 specifying x-axis limits when plotting the estimated error density function $f_0(z)$ .
`ylim0`	Vector of length 2 specifying y-axis limits when plotting the estimated error density function $f_0(z)$ .
`xlim1`	Vector of length 2 specifying (common) x-axis limits when plotting the fitted additive term(s) entering the conditional mean submodel. (Default: NULL).
`ylim1`	Vector of length 2 specifying (common) y-axis limits when plotting the fitted additive term(s) entering the conditional mean submodel. (Default: NULL).
`xlim2`	Vector of length 2 specifying (common) x-axis limits when plotting the fitted additive term(s) in the log of the conditional standard deviation submodel. (Default: NULL).
`ylim2`	Vector of length 2 specifying (common) y-axis limits when plotting the fitted additive term(s) entering the conditional standard deviation submodel. (Default: NULL).
`equal.ylims`	logical indicating if the same y-limits must be used when plotting the fitted additive terms from the same submodel. It can be overriden by non-NULL values for `ylim1` or `ylim2`. (Default: TRUE).
`...`	additional generic plotting arguments.

Details

Plot the fitted additive terms and the estimated standardized error density contained in the DALSM.object x.

Value

In addition to the plots, an invisible list containing the following is returned:

J1 : number of additive terms in the location sub-model.
x.loc : a ngrid by J1 matrix containing a regular grid of ngrid covariate values where the corresponding additive term in location is evaluated.
f.loc : a ngrid by J1 matrix containing the J1 fitted location additive terms evaluated at x.loc.
se.loc : a ngrid by J1 matrix containing the the pointwise standard errors of the fitted location additive terms evaluated at x.loc.
J2 : number of additive terms in the dispersion sub-model.
x.disp : a ngrid by J2 matrix containing a regular grid of ngrid covariate values where the corresponding additive term in dispersion is evaluated.
f.disp : a ngrid by J2 matrix containing the J2 fitted dispersion additive terms evaluated at x.disp.
se.disp : a ngrid by J2 matrix containing the pointwise standard errors of the fitted dispersion additive terms evaluated at x.disp.

Author(s)

Philippe Lambert [email protected]

References

Examples

require(DALSM)
data(DALSM_IncomeData)
resp = DALSM_IncomeData[,1:2]
fit = DALSM(y=resp,
            formula1 = ~twoincomes+s(age)+s(eduyrs),
            formula2 = ~twoincomes+s(age)+s(eduyrs),
            data = DALSM_IncomeData)
plot(fit, select=0) ## Estimated error density
plot(fit, select=1:4, pages=1) ## Estimated additive terms

require(DALSM)
data(DALSM_IncomeData)
resp = DALSM_IncomeData[,1:2]
fit = DALSM(y=resp,
            formula1 = ~twoincomes+s(age)+s(eduyrs),
            formula2 = ~twoincomes+s(age)+s(eduyrs),
            data = DALSM_IncomeData)
plot(fit, select=0) ## Estimated error density
plot(fit, select=1:4, pages=1) ## Estimated additive terms

Plot the density estimate in a `densLPS.object`

Description

Plot the density estimate obtained by densityLPS from censored data with given mean and variance.

Usage

## S3 method for class 'densLPS'
plot(x,
       xlim=range(fit$bins),breaks=NULL,hist=FALSE,histRC=FALSE,
       xlab="",ylab="Density",main="",...)
## S3 method for class 'densLPS'
plot(x,
       xlim=range(fit$bins),breaks=NULL,hist=FALSE,histRC=FALSE,
       xlab="",ylab="Density",main="",...)

Arguments

`x`	a `densLPS.object`.
`xlim`	interval of values where the density should be plotted.
`breaks`	(Optional) breaks for the histogram of the observed residuals.
`hist`	Logical (Default: FALSE) indicating whether the histogram of the (pseudo-) data should be plotted with the estimated density.
`histRC`	Logical (Default: FALSE) indicating whether the histogram of the right-censored residuals should be highlighted.
`xlab`	Optional label for the x-axis (Defaut: empty).
`ylab`	Optional label for the y-axis (Default: "Density").
`main`	Plot main title (Default: "").
`...`	Optional additional plot parameters.

Value

No returned value (just plots).

Author(s)

Philippe Lambert [email protected]

References

Examples

require(DALSM)

## Example 1: density estimation from simulated IC data
n = 500 ## Sample size
x = 3 + rgamma(n,10,2) ## Exact generated data
width = runif(n,1,3) ## Width of the IC data (mean width = 2)
w = runif(n) ## Positioning of the exact data within the interval
xmat = cbind(x-w*width,x+(1-w)*width) ## Generated IC data
head(xmat)
obj.data = Dens1d(xmat,ymin=0) ## Prepare the data for estimation
## Density estimation with fixed mean and variance
obj = densityLPS(obj.data,Mean0=3+10/2,Var0=10/4)
plot(obj, hist=TRUE) ## Histogram of the pseudo-data with the density estimate
curve(dgamma(x-3,10,2), ## ... compared to the true density (in red)
      add=TRUE,col="red",lwd=2,lty=2)
legend("topright",col=c("black","red","grey"),lwd=c(2,2,10),lty=c(1,2,1),
       legend=c("Fitted density","True density","Pseudo-data"),bty="n")
print(obj) ## ... with summary statistics

## Example 2: estimation of the error density in a DALSM model
data(DALSM_IncomeData)
resp = DALSM_IncomeData[,1:2]
fit = DALSM(y=resp,
            formula1 = ~twoincomes+s(age)+s(eduyrs),
            formula2 = ~twoincomes+s(age)+s(eduyrs),
            data = DALSM_IncomeData)
plot(fit$derr, hist=TRUE)  ## Plot the estimated error density
legend("topright",col=c("black","grey"),lwd=c(2,10),lty=c(1,1),
       legend=c("Estimated error density","Pseudo-residuals"),bty="n")
print(fit$derr) ## ... and provide summary statistics for it
require(DALSM)

## Example 1: density estimation from simulated IC data
n = 500 ## Sample size
x = 3 + rgamma(n,10,2) ## Exact generated data
width = runif(n,1,3) ## Width of the IC data (mean width = 2)
w = runif(n) ## Positioning of the exact data within the interval
xmat = cbind(x-w*width,x+(1-w)*width) ## Generated IC data
head(xmat)
obj.data = Dens1d(xmat,ymin=0) ## Prepare the data for estimation
## Density estimation with fixed mean and variance
obj = densityLPS(obj.data,Mean0=3+10/2,Var0=10/4)
plot(obj, hist=TRUE) ## Histogram of the pseudo-data with the density estimate
curve(dgamma(x-3,10,2), ## ... compared to the true density (in red)
      add=TRUE,col="red",lwd=2,lty=2)
legend("topright",col=c("black","red","grey"),lwd=c(2,2,10),lty=c(1,2,1),
       legend=c("Fitted density","True density","Pseudo-data"),bty="n")
print(obj) ## ... with summary statistics

## Example 2: estimation of the error density in a DALSM model
data(DALSM_IncomeData)
resp = DALSM_IncomeData[,1:2]
fit = DALSM(y=resp,
            formula1 = ~twoincomes+s(age)+s(eduyrs),
            formula2 = ~twoincomes+s(age)+s(eduyrs),
            data = DALSM_IncomeData)
plot(fit$derr, hist=TRUE)  ## Plot the estimated error density
legend("topright",col=c("black","grey"),lwd=c(2,10),lty=c(1,1),
       legend=c("Estimated error density","Pseudo-residuals"),bty="n")
print(fit$derr) ## ... and provide summary statistics for it

Plot a credible region for a curve together with its point estimates

Description

Plot a credible region (in grey) for a curve together with its point estimates.

Usage

plotRegion(x,mat,
       add=FALSE, xlim=range(x), ylim=range(mat),
       colfill="#D9D9D980", lwd=2, xlab="", ylab="", main="", ...)
plotRegion(x,mat,
       add=FALSE, xlim=range(x), ylim=range(mat),
       colfill="#D9D9D980", lwd=2, xlab="", ylab="", main="", ...)

Arguments

`x`	vector of values where the curve is evaluated.
`mat`	cbind(f.hat,f.low,f.up) is a matrix containing the point estimates <f.hat> and the liming values <f.low> and <f.up> for the credible region.
`add`	Logical indicating if the plot must be added to the active plot (Default: FALSE).
`xlim`	range of <x> values for which the plot should be provided.
`ylim`	range of curve values that should be considered for the plot.
`colfill`	color used for filling the shaded region. (Default: "#D9D9D980").
`lwd`	line width for the plot (Default: 2).
`xlab`	x-label.
`ylab`	y-label.
`main`	main title.
`...`	optional plotting parameters.

Value

No returned value (just a plot)

Author(s)

Philippe Lambert [email protected]

References

Examples

require(DALSM)
data(DALSM_IncomeData)
resp = DALSM_IncomeData[,1:2]
fit = DALSM(y=resp,
            formula1 = ~twoincomes+s(age)+s(eduyrs),
            formula2 = ~twoincomes+s(age)+s(eduyrs),
            data = DALSM_IncomeData)
obj = DALSM_additive(fit)
## par(mfrow=c(1,2),mar=c(4,5,1,1))
with(obj$f.loc.grid$age, plotRegion(x, y.mat,
     xlab="age", ylab=expression('f'[1]^{~mu}*(age))))
with(obj$f.disp.grid$age, plotRegion(x, y.mat,
     xlab="age", ylab=expression('f'[1]^{~sigma}*(age))))
require(DALSM)
data(DALSM_IncomeData)
resp = DALSM_IncomeData[,1:2]
fit = DALSM(y=resp,
            formula1 = ~twoincomes+s(age)+s(eduyrs),
            formula2 = ~twoincomes+s(age)+s(eduyrs),
            data = DALSM_IncomeData)
obj = DALSM_additive(fit)
## par(mfrow=c(1,2),mar=c(4,5,1,1))
with(obj$f.loc.grid$age, plotRegion(x, y.mat,
     xlab="age", ylab=expression('f'[1]^{~mu}*(age))))
with(obj$f.disp.grid$age, plotRegion(x, y.mat,
     xlab="age", ylab=expression('f'[1]^{~sigma}*(age))))

Prediction based on a DALSM model

Description

Estimated conditional mean and standard deviation of the response based on a DALSM object for given covariate values in a data frame 'newdata'. Conditional quantiles can also be computed.

Usage

## S3 method for class 'DALSM'
predict(object, newdata, probs, ...)
## S3 method for class 'DALSM'
predict(object, newdata, probs, ...)

Arguments

`object`	a `DALSM.object`.
`newdata`	an optional data frame in which to look for variables with which to predict. If omitted, the covariate values in the original data frame used to fit the DALSM model are considered.
`probs`	probability levels of the requested conditional quantiles.
`...`	further arguments passed to or from other methods.

Value

Returns a list containing:

mu : estimated conditional mean.
sd : estimated conditional standard deviation.
quant : estimated quantiles (at probability level probs) of the fitted conditional response in the DALSM model.
qerr : quantiles (at probability level probs) of the fitted error distribution in the DALSM model.
probs : a reminder of the requested probability levels for the fitted quantiles.

Author(s)

Philippe Lambert [email protected]

References

Examples

require(DALSM)
data(DALSM_IncomeData)
resp = DALSM_IncomeData[,1:2]
fit = DALSM(y=resp,
            formula1 = ~twoincomes+s(age)+s(eduyrs),
            formula2 = ~twoincomes+s(age)+s(eduyrs),
            data = DALSM_IncomeData)
newdata = data.frame(twoincomes=c(1,0),age=c(40,50),eduyrs=c(18,12))
pred = predict(fit, data = DALSM_IncomeData, newdata=newdata, probs=c(.2,.5,.8))
with(pred, cbind(newdata, mu, sd, quant)) ## Predicted mu,sd and quantiles
require(DALSM)
data(DALSM_IncomeData)
resp = DALSM_IncomeData[,1:2]
fit = DALSM(y=resp,
            formula1 = ~twoincomes+s(age)+s(eduyrs),
            formula2 = ~twoincomes+s(age)+s(eduyrs),
            data = DALSM_IncomeData)
newdata = data.frame(twoincomes=c(1,0),age=c(40,50),eduyrs=c(18,12))
pred = predict(fit, data = DALSM_IncomeData, newdata=newdata, probs=c(.2,.5,.8))
with(pred, cbind(newdata, mu, sd, quant)) ## Predicted mu,sd and quantiles

Print summary information on a `DALSM.object`

Description

Print summary information on a DALSM.object.

Usage

## S3 method for class 'DALSM'
print(x, digits.est=3,digits.edf=2,digits.tst=2,digits.Pvalue=3,...)
## S3 method for class 'DALSM'
print(x, digits.est=3,digits.edf=2,digits.tst=2,digits.Pvalue=3,...)

Arguments

`x`	an object of class `DALSM.object`.
`digits.est`	number of digits when reporting parameter estimates (default: 3).
`digits.edf`	number of digits when reporting effective degrees of freedom (default: 2).
`digits.tst`	number of digits when reporting test statistics (default: 2).
`digits.Pvalue`	number of digits when reporting P-values (default: 3).
`...`	additional generic printing arguments.

Details

Provides summary measures on the estimation of the regression parameters and additive terms in the location and dispersion submodels corresponding to a DALSM.object generated by DALSM.

Value

No returned value (just printed summary).

Author(s)

Philippe Lambert [email protected]

References

Examples

require(DALSM)
data(DALSM_IncomeData)
resp = DALSM_IncomeData[,1:2]
fit = DALSM(y=resp,
            formula1 = ~twoincomes+s(age)+s(eduyrs),
            formula2 = ~twoincomes+s(age)+s(eduyrs),
            data = DALSM_IncomeData)
print(fit)

require(DALSM)
data(DALSM_IncomeData)
resp = DALSM_IncomeData[,1:2]
fit = DALSM(y=resp,
            formula1 = ~twoincomes+s(age)+s(eduyrs),
            formula2 = ~twoincomes+s(age)+s(eduyrs),
            data = DALSM_IncomeData)
print(fit)

Print a summary of the information in a `densLPS.object`

Description

Print summary information on the density estimate obtained by densityLPS from censored data with given mean and variance.

Usage

## S3 method for class 'densLPS'
print(x,...)
## S3 method for class 'densLPS'
print(x,...)

Arguments

`x`	a `densLPS.object`.
`...`	Optional additional print parameters.

Value

No returned value (just printed summary).

Author(s)

Philippe Lambert [email protected]

References

Examples

require(DALSM)
data(DALSM_IncomeData)
resp = DALSM_IncomeData[,1:2]
fit = DALSM(y=resp,
            formula1 = ~twoincomes+s(age)+s(eduyrs),
            formula2 = ~twoincomes+s(age)+s(eduyrs),
            data = DALSM_IncomeData)
plot(fit$derr)  ## Plot the estimated error density
print(fit$derr) ## ... and provide some descriptive elements on it

require(DALSM)
data(DALSM_IncomeData)
resp = DALSM_IncomeData[,1:2]
fit = DALSM(y=resp,
            formula1 = ~twoincomes+s(age)+s(eduyrs),
            formula2 = ~twoincomes+s(age)+s(eduyrs),
            data = DALSM_IncomeData)
plot(fit$derr)  ## Plot the estimated error density
print(fit$derr) ## ... and provide some descriptive elements on it

Specification of knots in a cubic P-spline model

Description

Specification of knots for a cubic B-spline basis with K elements in a P-spline model. The knots should support the data contained in vector x and are by default assumed equidistant. Alternatively, they can be based on the data quantiles. The penalty matrix of the selected penalty order (3 by default) is also returned.

Usage

qknots(x, xmin = NULL, xmax = NULL, equid.knots = TRUE, pen.order = 3, K = 25)
qknots(x, xmin = NULL, xmax = NULL, equid.knots = TRUE, pen.order = 3, K = 25)

Arguments

`x`	data that the knots should upport.
`xmin`	desired minimum value for the knots.
`xmax`	desired maximum value for the knots.
`equid.knots`	logical indicating if equidistant knots are desired (Default: TRUE). If FALSE, the quantile of `x` are used to select the knots.
`pen.order`	penalty order if `equid.knots` is `TRUE`.
`K`	number of B-splines in the basis.

Value

A list containing:

xmin : specified minimum value for the knots, except if $\min(x) < xmin$ , in which case the default value $\min(x)-sd(x)$ is returned.
xmin : specified maximum value for the knots, except if $xmax < \max(x)$ , in which case the default value $\max(x)+sd(x)$ is returned.
K : number of B-splines.
knots : equidistant knots on (xmin,xmax) if equidistant.knots is TRUE, based on quantiles of x otherwise.
Pd : $K\times K$ penalty matrix of order pen.order.
pen.order : a reminder of the requested penalty order (Default: 3).

Examples

x = runif(500)
obj = qknots(x,xmin=0,xmax=1,K=13)
print(obj)
x = runif(500)
obj = qknots(x,xmin=0,xmax=1,K=13)
print(obj)

Histogram from weighted data

Description

Computes a histogram from weighted data. If plot=TRUE, the resulting object of class histogram is plotted before it is returned.

Usage

weighted.hist(x, weight = NULL, breaks = "Sturges", freq = NULL,
probability = !freq, include.lowest = TRUE, right = TRUE, fuzz = 1e-07,
density = NULL, angle = 45, col = "lightgray", border = NULL,
main = paste("Histogram of", xname), xlim = range(breaks), ylim = NULL,
xlab = xname, ylab, axes = TRUE, plot = TRUE, labels = FALSE, nclass = NULL,
warn.unused = TRUE, ...)
weighted.hist(x, weight = NULL, breaks = "Sturges", freq = NULL,
probability = !freq, include.lowest = TRUE, right = TRUE, fuzz = 1e-07,
density = NULL, angle = 45, col = "lightgray", border = NULL,
main = paste("Histogram of", xname), xlim = range(breaks), ylim = NULL,
xlab = xname, ylab, axes = TRUE, plot = TRUE, labels = FALSE, nclass = NULL,
warn.unused = TRUE, ...)

Arguments

`x`	a vector of values for which the histogram is desired.
`weight`	optional vector of the same length as `x` defining the weights associated each each element in `x`. By default, it is `rep(1,length(x))`.
`breaks`	one of: a vector giving the breakpoints between histogram cells, a single number giving the number of cells for the histogram, a character string naming an algorithm to compute the number of cells (see ‘Details’), a function to compute the number of cells.
`freq`	logical; if `TRUE`, the histogram graphic is a representation of frequencies, the `counts` component of the result; if `FALSE`, probability densities, component density, are plotted (so that the histogram has a total area of one). Defaults to `TRUE` if and only if `breaks` are equidistant (and `probability` is not specified).
`probability`	an alias for `!freq`, for S compatibility.
`include.lowest`	logical; if `TRUE`, an `x[i]` equal to the `breaks` value will be included in the first (or last, for `right = FALSE`) bar. This will be ignored (with a warning) unless `breaks` is a vector.
`right`	logical; if `TRUE`, the histogram cells are right-closed (left open) intervals.
`fuzz`	non-negative number, for the case when the data is pretty and some observations `x[.]` are close but not exactly on a `break`. For counting fuzzy breaks proportional to `fuzz` are used. The default is occasionally suboptimal.
`density`	the density of shading lines, in lines per inch. The default value of `NULL` means that no shading lines are drawn. Non-positive values of `density` also inhibit the drawing of shading lines.
`angle`	the slope of shading lines, given as an angle in degrees (counter-clockwise).
`col`	a colour to be used to fill the bars.
`border`	the color of the border around the bars. The default is to use the standard foreground color.
`main`	main title.
`xlim`	range of `x` for plotting.
`ylim`	range of `y` values.
`xlab`	label for the horizontal axis.
`ylab`	label for the vertical axis.
`axes`	logical. If `TRUE` (default), axes are draw if the plot is drawn.
`plot`	logical. If `TRUE` (default), a histogram is plotted, otherwise a list of breaks and counts is returned. In the latter case, a warning is used if (typically graphical) arguments are specified that only apply to the `plot=TRUE` case.
`labels`	logical or character. Additionally draw labels on top of bars, if not `FALSE`; see `plot.histogram` in the `graphics` package.
`nclass`	numeric (integer). For S(-PLUS) compatibility only, `nclass` is equivalent to `breaks` for a scalar or character argument.
`warn.unused`	logical. If `plot=FALSE` and `warn.unused=TRUE`, a warning will be issued when graphical parameters are passed to `hist.default()`.
`...`	further arguments and graphical parameters passed to `plot.histogram` and thence to title and axis (if `plot=TRUE`).

Value

a list comparable to what hist.default produces and a plot the weighted histogram if plot=TRUE.

Author(s)

Philippe Lambert [email protected]. The code for this function is fully based on hist.default from the graphics package in R, with minor modifications to account for weights.

Examples

## Example 1
set.seed(123)
mround <- function(x,base) base*round(x/base) ## Rounding function
y = mround(rgamma(500,10,2), base=.5) ## Rounded data
tab = table(y) ; print(tab) ## Empirical distribution of the rounded data
## Histogram from the empirical distribution
res = weighted.hist(as.numeric(names(tab)), weight=c(tab), xlab="",main="")

## Example 2
## Generate categorized data
set.seed(123)
brks = c(0,3,5,7,9,15) ; ycat = cut(rgamma(500,10,2),breaks=brks)
tab = table(ycat) ## Empirical distribution of the categorized data
print(tab)
ymid = .5 * (brks[-6]+brks[-1]) ; w = c(tab) ## Midpoints and their weights
## Histogram from weighted data
res = weighted.hist(ymid,weight=w,breaks=brks,xlab="",main="")
## Example 1
set.seed(123)
mround <- function(x,base) base*round(x/base) ## Rounding function
y = mround(rgamma(500,10,2), base=.5) ## Rounded data
tab = table(y) ; print(tab) ## Empirical distribution of the rounded data
## Histogram from the empirical distribution
res = weighted.hist(as.numeric(names(tab)), weight=c(tab), xlab="",main="")

## Example 2
## Generate categorized data
set.seed(123)
brks = c(0,3,5,7,9,15) ; ycat = cut(rgamma(500,10,2),breaks=brks)
tab = table(ycat) ## Empirical distribution of the categorized data
print(tab)
ymid = .5 * (brks[-6]+brks[-1]) ; w = c(tab) ## Midpoints and their weights
## Histogram from weighted data
res = weighted.hist(ymid,weight=w,breaks=brks,xlab="",main="")

Package 'DALSM'

Help Index

Akaike Information Criterion (AIC) of a DALSM object.

Description

Usage

Arguments

Details

Value

Author(s)

References

See Also

Examples

Bayesian Information Criterion (BIC) of a DALSM object.

Description

Usage

Arguments

Details

Value

Author(s)

References

See Also

Examples

Generation of a recentered cubic B-spline basis matrix in additive models

Description

Usage

Arguments

Value

Author(s)

References

Examples

Fit a double additive location-scale model (DALSM) with a flexible error distribution

Description

Usage

Arguments

Value

Author(s)

References

See Also

Examples

Extraction of the estimated additive terms in a DALSM.object

Description

Usage

Arguments

Value

Author(s)

References

See Also

Examples

Income data

Description

Usage

Format

Author(s)

References

Object resulting from the fit of a double additive location-scale model (DALSM).

Description

Value

Author(s)

References

See Also

Object creation for density estimation from right- or interval-censored data

Description

Usage

Arguments

Value

Author(s)

References

See Also

Examples

Object created by Dens1d to prepare for density estimation from censored data using densityLPS

Description

Value

Author(s)

References

See Also

Constrained density estimation from censored data using Laplace P-splines

Description

Usage

Arguments

Value

Extraction of the estimated additive terms in a `DALSM.object`

Object created by `Dens1d` to prepare for density estimation from censored data using `densityLPS`

Object generated by function `densityLPS`

Plot visual information on a `DALSM.object`

Plot the density estimate in a `densLPS.object`