Title: | Density Estimation from GROuped Summary Statistics |
---|---|
Description: | Estimation of a density from grouped (tabulated) summary statistics evaluated in each of the big bins (or classes) partitioning the support of the variable. These statistics include class frequencies and central moments of orders one to four. The log-density is modelled using a linear combination of penalised B-splines. The multinomial log-likelihood involving the frequencies is combined with a roughness penalty based on the differences between the coefficients of neighbouring B-splines and with the log of a root-n approximation of the sampling density of the observed vector of central moments in each class. The resulting penalised log-likelihood is maximised using the EM algorithm to obtain an estimate of the spline parameters and, consequently, of the variable density and related quantities such as quantiles; see Lambert, P. (2021) <arXiv:2107.03883> for details. |
Authors: | Philippe Lambert [aut, cre] (Université de Liège / Université catholique de Louvain (Belgium)) |
Maintainer: | Philippe Lambert <[email protected]> |
License: | GPL-3 |
Version: | 0.9.0 |
Built: | 2025-02-14 05:41:45 UTC |
Source: | https://github.com/plambertuliege/degross |
Density function based on an object resulting from the estimation procedure in degross.
ddegross(x, degross.fit, phi)
x |
Scalar or vector where the fitted density must be evaluated. |
degross.fit |
A degross.object generated using degross and containing the density estimation results. |
phi |
(Optional) vector of spline parameters for the log density (default: |
A scalar or vector of the same length as x containing the value of the fitted density at x.
Philippe Lambert [email protected]
Lambert, P. (2021) Moment-based density and risk estimation from grouped summary statistics. arXiv:2107.03883.
degross.object, pdegross, qdegross, degross.
## Generate grouped data
sim = simDegrossData(n=1500, plotting=TRUE, choice=2)
## Create a degrossData object
obj.data = degrossData(Big.bins=sim$Big.bins, freq.j=sim$freq.j, m.j=sim$m.j)
print(obj.data)
## Estimate the density
obj.fit = degross(obj.data)
## Superpose the fitted density using the <ddegross> function
curve(ddegross(x,obj.fit),add=TRUE,lty="dashed")
legend("topright",lty="dashed",lwd=2,legend="Estimated",box.lty=0, inset=.04)
Estimation of a density from tabulated summary statistics evaluated within each of the big bins (or classes) partitioning the variable support. These statistics include class frequencies and central moments of orders one to four. The log-density is modelled using a linear combination of penalized B-splines. The multinomial log-likelihood involving the frequencies is combined with a roughness penalty based on differences of neighboring B-spline coefficients and with the log of a root-n approximation of the sampling density of the observed vector of central moments within each class. The resulting penalized log-likelihood is maximized using the EM algorithm to obtain an estimate of the spline parameters and, hence, of the variable density and related quantities such as quantiles; see Lambert (2021) for details.
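For orientation, the model described above can be written compactly. The display below is only a sketch under assumptions, not the package's exact specification: the basis functions b_k, the difference matrix D_r of order r (pen.order), the big bin frequencies n_j (freq.j) and the quadratic form of the penalty follow standard P-spline conventions that are assumed here. The moment term stands for the log of the root-n approximation mentioned above and is left unspecified; see Lambert (2021) for the definitive formulation.

```latex
f(x \mid \phi) = \frac{\exp\left(\sum_{k=1}^{K} b_k(x)\,\phi_k\right)}
                      {\int \exp\left(\sum_{k=1}^{K} b_k(u)\,\phi_k\right)\,du},
\qquad
\gamma_j(\phi) = \int_{\text{big bin } j} f(x \mid \phi)\,dx,
\qquad
\ell_p(\phi \mid \tau) = \sum_{j=1}^{J} n_j \log \gamma_j(\phi)
  \;-\; \frac{\tau}{2}\,\lVert D_r\,\phi \rVert^2
  \;+\; \text{(moment term)}.
```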
degross(degross.data, phi0 = NULL, tau0 = 1000, use.moments = rep(TRUE,4), freq.min = 20, diag.only=FALSE, penalize = TRUE, aa = 2, bb = 1e-06, pen.order = 3, fixed.tau = FALSE, plotting = FALSE, verbose = FALSE, iterlim=20)
degross.data |
A degrossData.object generated by degrossData. |
phi0 |
Starting value for the |
tau0 |
Starting value for the roughness penalty parameter. Default: 1000. |
use.moments |
Vector with 4 logicals indicating which tabulated sample moments to use as soft constraints. Defaults: |
freq.min |
Minimal big bin frequency required to use the corresponding observed moments as soft constraints. Default: |
diag.only |
Logical indicating whether to ignore the off-diagonal elements of the variance-covariance matrix of the sample central moments. Default: FALSE. |
penalize |
Logical indicating whether a roughness penalty of order |
aa |
Positive real giving the first parameter in the Gamma prior for |
bb |
Positive real giving the second parameter in the Gamma prior for |
pen.order |
Integer giving the order of the roughness penalty. Default: |
fixed.tau |
Logical indicating whether the roughness penalty parameter |
plotting |
Logical indicating whether a histogram of the data with the estimated density should be plotted. Default: FALSE. |
verbose |
Logical indicating whether details on the estimation progress should be displayed. Default: FALSE. |
iterlim |
Maximum number of iterations during the M-step. Default: 20. |
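The roughness penalty controlled by penalize, pen.order and the penalty parameter can be illustrated in base R. This is a minimal sketch assuming the usual P-spline form (tau/2 times the sum of squared differences of order pen.order of the spline coefficients); the package's internal computation may differ, and the values of K, tau and the two coefficient vectors below are hypothetical.

```r
## Hypothetical settings mirroring the defaults K = 25, pen.order = 3, tau0 = 1000
K <- 25
pen.order <- 3
tau <- 1000

## Difference matrix of order pen.order: (K - pen.order) rows, K columns
D <- diff(diag(K), differences = pen.order)
P <- crossprod(D)                       # penalty matrix D'D

## Assumed penalty form: tau/2 * phi' P phi
roughness <- function(phi) as.numeric(tau / 2 * t(phi) %*% P %*% phi)

phi.quad  <- (1:K)^2                    # quadratic in k: third differences vanish
phi.rough <- sin(1:K)                   # wiggly coefficients
pen.quad  <- roughness(phi.quad)
pen.rough <- roughness(phi.rough)
```

With a third-order penalty, coefficients that follow a quadratic trend are not penalized at all, whereas rapidly oscillating coefficients are; larger values of tau therefore pull the log-density towards a smoother shape.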
An object of class degross containing several components from the density estimation procedure. Details can be found in degross.object. A summary of its content can be printed using print.degross or plotted using plot.degross.
Philippe Lambert [email protected]
Lambert, P. (2021) Moment-based density and risk estimation from grouped summary statistics. arXiv:2107.03883.
degross.object, ddegross, pdegross, qdegross.
## Simulate grouped data
sim = simDegrossData(n=3500, plotting=TRUE,choice=2,J=3)
print(sim$true.density) ## Display density of the data generating mechanism
## Create a degrossData object
obj.data = with(sim, degrossData(Big.bins=Big.bins, freq.j=freq.j, m.j=m.j))
print(obj.data)
## Estimate the density underlying the grouped data
obj.fit = degross(obj.data)
## Plot the estimated density...
plot(obj.fit)
## ... and compare it with the ('target') density used to simulate the data
curve(sim$true.density(x),add=TRUE,col="red",lwd=2)
legend("topleft",
       legend=c("Observed freq.","Target density","Estimated density"),
       col=c("grey85","red","black"), lwd=c(10,2,2),
       lty=c("solid","solid","dashed"), box.lty=0, inset=.02)
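The EM algorithm mentioned in the description treats the unobserved small bin frequencies as missing data. The base R sketch below shows what a generic E-step for grouped data looks like: each big bin frequency is spread over its small bins proportionally to the current small bin probabilities. It is an illustration under assumptions, not the package's code: the bin limits, the frequencies and the Gamma stand-in for the current density are all hypothetical.

```r
## Hypothetical grouped data: 3 big bins on (0, 5)
Big.bins <- c(0, 1.5, 2.5, 5)
freq.j   <- c(300, 900, 300)

## Small bins used for quadrature (I = 300 as in the degrossData default)
I <- 300
small.bins <- seq(min(Big.bins), max(Big.bins), length.out = I + 1)
delta <- (max(Big.bins) - min(Big.bins)) / I
ui <- small.bins[-1] - delta / 2                 # small bin midpoints

## Stand-in for the current density estimate (assumption)
pi.i <- dgamma(ui, shape = 10, rate = 5)
pi.i <- pi.i / sum(pi.i)                         # small bin probabilities

## Big bin membership of each small bin and big bin probabilities
small.to.big <- findInterval(ui, Big.bins, rightmost.closed = TRUE)
gamma.j <- as.vector(tapply(pi.i, small.to.big, sum))

## E-step: expected small bin frequencies given the big bin frequencies
n.i <- freq.j[small.to.big] * pi.i / gamma.j[small.to.big]
```

By construction the expected small bin frequencies add up to the observed frequency within each big bin; an M-step would then update the spline parameters given n.i.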
Log-posterior (with gradient and Fisher information) for given spline parameters, small bin frequencies, tabulated sample moments and roughness penalty parameter. This function is maximized during the M-step of the EM algorithm to estimate the B-spline parameters entering the density specification.
degross_lpost(phi, tau, n.i, degross.data, use.moments = rep(TRUE,4), freq.min = 20, diag.only=FALSE, penalize = TRUE, aa = 2, bb = 1e-6, pen.order = 3)
phi |
Vector of K B-spline parameters |
tau |
Roughness penalty parameter. |
n.i |
Small bin frequencies. |
degross.data |
A degrossData.object created using the degrossData function. |
use.moments |
Vector with 4 logicals indicating which tabulated sample moments to use as soft constraints. Defaults: |
freq.min |
Minimal big bin frequency required to use the corresponding observed moments as soft constraints. Default: |
diag.only |
Logical indicating whether to ignore the off-diagonal elements of the variance-covariance matrix of the sample central moments. Default: FALSE. |
penalize |
Logical indicating whether a roughness penalty of order |
aa |
Positive real giving the first parameter in the Gamma prior for |
bb |
Positive real giving the second parameter in the Gamma prior for |
pen.order |
Integer giving the order of the roughness penalty. Default: |
A list containing:
lpost, lpost.ni: value of the log-posterior based on the given small bin frequencies n.i and the tabulated sample moments.
lpost.mj: value of the log-posterior based on the big bin frequencies degross.data$freq.j and the tabulated sample moments.
llik.ni: multinomial log-likelihood based on the given small bin frequencies n.i.
llik.mj: multinomial log-likelihood based on the big bin frequencies degross.data$freq.j.
moments.penalty: log of the joint (asymptotic) density for the observed sample moments.
penalty:
Score, Score.ni: score (w.r.t. phi) of lpost.ni.
Score.mj: score (w.r.t. phi) of lpost.mj.
Fisher, Fisher.ni: information matrix (w.r.t. phi) of lpost.ni.
Fisher.mj: information matrix (w.r.t. phi) of lpost.mj.
M.j: theoretical moments of the density (resulting from phi) within a big bin.
pi.i: small bin probabilities.
ui: small bin midpoints.
delta: width of the small bins.
gamma.j: big bin probabilities.
tau: reminder of the value of the roughness penalty parameter tau.
phi: reminder of the vector of spline parameters (defining the density).
n.i: reminder of the small bin frequencies given as input.
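Several of the returned quantities (pi.i, gamma.j, M.j, llik.mj) are linked through the small bin quadrature. The base R sketch below shows one plausible way these pieces fit together. It rests on assumptions: a Gamma density stands in for the spline-based density, the bin limits and frequencies are invented, only the first within-bin moment is computed, and the multinomial log-likelihood is given up to an additive constant.

```r
## Hypothetical setup: 3 big bins on (0, 5) and I = 300 small bins
Big.bins <- c(0, 1.5, 2.5, 5)
freq.j   <- c(300, 900, 300)                     # made-up big bin frequencies
I <- 300
small.bins <- seq(min(Big.bins), max(Big.bins), length.out = I + 1)
delta <- (max(Big.bins) - min(Big.bins)) / I     # width of the small bins
ui <- small.bins[-1] - delta / 2                 # small bin midpoints

## Small bin probabilities from a stand-in density (assumption)
pi.i <- dgamma(ui, shape = 10, rate = 5)
pi.i <- pi.i / sum(pi.i)                         # normalization by quadrature

## Big bin probabilities: sum of the small bin probabilities they contain
small.to.big <- findInterval(ui, Big.bins, rightmost.closed = TRUE)
gamma.j <- as.vector(tapply(pi.i, small.to.big, sum))

## Theoretical mean within each big bin (first column of an M.j-like matrix)
mean.j <- as.vector(tapply(pi.i * ui, small.to.big, sum)) / gamma.j

## Multinomial log-likelihood of the big bin frequencies (up to a constant)
llik.mj <- sum(freq.j * log(gamma.j))
```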
Philippe Lambert [email protected]
Lambert, P. (2021) Moment-based density and risk estimation from grouped summary statistics. arXiv:2107.03883.
degross_lpostBasic, degross, degross.object.
sim = simDegrossData(n=3500, plotting=TRUE,choice=2) ## Generate grouped data
obj.data = degrossData(Big.bins=sim$Big.bins, freq.j=sim$freq.j, m.j=sim$m.j)
print(obj.data)
obj.fit = degross(obj.data) ## Estimate the underlying density
## Evaluate the log-posterior at convergence
res = with(obj.fit, degross_lpost(phi, tau, n.i, obj.data, diag.only=diag.only))
print(res$Score) ## Score of the log posterior at convergence
Log-posterior for given spline parameters, big bin (and optional: small bin) frequencies, tabulated sample moments and roughness penalty parameter. Compared to degross_lpost, no Fisher information matrix is computed and the gradient evaluation is optional, with a resulting computational gain.
degross_lpostBasic(phi, tau, n.i, degross.data, use.moments = rep(TRUE,4), freq.min = 20, diag.only=FALSE, gradient=FALSE, penalize = TRUE, aa = 2, bb = 1e-6, pen.order = 3)
phi |
Vector of K B-spline parameters |
tau |
Roughness penalty parameter. |
n.i |
Small bin frequencies. |
degross.data |
A degrossData.object created using the degrossData function. |
use.moments |
Vector with 4 logicals indicating which tabulated sample moments to use as soft constraints. Defaults: |
freq.min |
Minimal big bin frequency required to use the corresponding observed moments as soft constraints. Default: |
diag.only |
Logical indicating whether to ignore the off-diagonal elements of the variance-covariance matrix of the sample central moments. Default: FALSE. |
gradient |
Logical indicating if the gradient (Score) of the |
penalize |
Logical indicating whether a roughness penalty of order |
aa |
Real giving the first parameter in the Gamma prior for |
bb |
Real giving the second parameter in the Gamma prior for |
pen.order |
Integer giving the order of the roughness penalty. Default: |
A list containing:
lpost.ni: value of the log-posterior based on the given small bin frequencies n.i and the tabulated sample moments.
lpost.mj: value of the log-posterior based on the big bin frequencies degross.data$freq.j and the tabulated sample moments.
llik.ni: multinomial log-likelihood based on the given small bin frequencies n.i.
llik.mj: multinomial log-likelihood based on the big bin frequencies degross.data$freq.j resulting from n.i.
moments.penalty: log of the joint (asymptotic) density for the observed sample moments.
penalty:
M.j: theoretical moments of the density (resulting from phi) within a big bin.
pi.i: small bin probabilities.
ui: small bin midpoints.
delta: width of the small bins.
gamma.j: big bin probabilities.
tau: reminder of the value of the roughness penalty parameter tau.
phi: reminder of the vector of spline parameters (defining the density).
n.i: reminder of the small bin frequencies given as input.
freq.j: reminder of the big bin frequencies in degross.data$freq.j.
Philippe Lambert [email protected]
Lambert, P. (2021) Moment-based density and risk estimation from grouped summary statistics. arXiv:2107.03883.
degross_lpost, degross, degross.object.
sim = simDegrossData(n=3500, plotting=TRUE,choice=2) ## Generate grouped data
obj.data = degrossData(Big.bins=sim$Big.bins, freq.j=sim$freq.j, m.j=sim$m.j)
print(obj.data)
obj.fit = degross(obj.data) ## Estimate the underlying density
phi.hat = obj.fit$phi ; tau.hat = obj.fit$tau
## Evaluate the log-posterior at convergence
res = degross_lpostBasic(phi=phi.hat, tau=tau.hat, degross.data=obj.data, gradient=TRUE)
print(res)
An object returned by the degross function is a list containing several components resulting from the density estimation procedure.
A degross object is a list containing, after convergence of the EM algorithm:
lpost & lpost.ni: value of the log-posterior for the complete data based on the expected small bin frequencies n.i at convergence of the EM algorithm.
lpost.mj: value of the log-posterior for the observed data based on the big bin frequencies freq.j.
llik.ni: log-likelihood for the complete data based on the estimated small bin frequencies n.i.
llik.mj: log-likelihood for the observed data based on the big bin frequencies freq.j.
moments.penalty: log of the joint (asymptotic) density for the observed sample moments.
penalty:
Score & Score.mj: score (w.r.t. phi) of the log of the observed joint posterior function.
Score.ni: score (w.r.t. phi) of the log-posterior for the complete data based on the expected small bin frequencies n.i at convergence of the EM algorithm.
Fisher & Fisher.ni: information matrix (w.r.t. phi) based on the log-posterior for the complete data, using the expected small bin frequencies n.i at convergence of the EM algorithm.
Fisher.mj: information matrix (w.r.t. phi) based on the log of the observed joint posterior function.
M.j: theoretical moments of the fitted density within a big bin.
pi.i: small bin probabilities (at convergence).
ui: small bin midpoints.
delta: width of the small bins.
gamma.j: big bin probabilities (at convergence).
tau: value of the roughness penalty parameter (tau0 if fixed.tau=TRUE, estimated otherwise).
phi: vector with the spline parameters (at convergence).
n.i: small bin frequencies under the estimated density (at convergence).
edf: the effective degrees of freedom (or effective number of spline parameters) (at convergence).
aic: -2*(llik.mj + moments.penalty) + 2*edf.
bic: -2*(llik.mj + moments.penalty) + log(n)*edf, with n the total sample size.
log.evidence: approximation to the log of the evidence.
degross.data: the degrossData object from which density estimation proceeded.
use.moments: vector of 4 logicals indicating which tabulated sample moments were used as soft constraints during estimation.
diag.only: logical indicating whether the off-diagonal elements of the variance-covariance matrix of the sample central moments were ignored. Default: FALSE.
logNormCst: log of the normalizing constant when evaluating the density.
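As a small worked illustration of the aic and bic components, the base R sketch below recomputes both criteria from their ingredients. All numbers are hypothetical, and the log(n) weight in the BIC is the standard definition, assumed here rather than taken from the package source.

```r
## Hypothetical values for the ingredients (not produced by degross)
llik.mj         <- -1500.2   # observed-data log-likelihood
moments.penalty <- -12.3     # log density of the observed sample moments
edf             <- 6.4       # effective degrees of freedom
n               <- 3500      # total sample size, e.g. sum(freq.j)

aic <- -2 * (llik.mj + moments.penalty) + 2 * edf
bic <- -2 * (llik.mj + moments.penalty) + log(n) * edf   # assumed standard BIC weight
```

Because log(n) exceeds 2 as soon as n is 8 or more, the BIC penalizes the effective number of parameters more heavily than the AIC for any realistic sample size.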
Philippe Lambert [email protected]
Lambert, P. (2021) Moment-based density and risk estimation from grouped summary statistics. arXiv:2107.03883.
degross, print.degross, plot.degross.
Creates a degrossData.object from the observed tabulated frequencies and central moments.
degrossData(Big.bins, freq.j, m.j, I=300, K=25)
Big.bins |
Vector of length |
freq.j |
The number of data observed within each big bin. |
m.j |
A matrix of dim |
I |
The number of small bins used for quadrature during the normalization of the density during its estimation. Default: |
K |
The desired number of B-splines in the basis used for density estimation. Default= |
A degrossData.object, i.e. a list containing:
small.bins: a vector of length I+1 with the small bin limits.
ui: the I midpoints of the small bins.
delta: width of the small bins.
I: the number of small bins.
B.i: a matrix of dim I by K with the B-spline basis evaluated at the small bin midpoints.
K: number of B-splines in the basis.
knots: equidistant knots supporting the B-spline basis.
Big.bins: vector of length J+1 with the limits of the J big bins containing the data used to produce the tabulated statistics.
freq.j: the number of data observed within each big bin.
m.j: a matrix of dim J by 4 giving the first 4 sample central moments within each big bin.
J: the number of big bins.
small.to.big: a vector of length I indicating to what big bin each element of ui belongs.
Philippe Lambert [email protected]
Lambert, P. (2021) Moment-based density and risk estimation from grouped summary statistics. arXiv:2107.03883.
sim = simDegrossData(n=3500, plotting=TRUE)
obj.data = degrossData(Big.bins=sim$Big.bins, freq.j=sim$freq.j, m.j=sim$m.j)
print(obj.data)
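When raw data are available (for instance to test the procedure), the three inputs Big.bins, freq.j and m.j can be tabulated in base R. The sketch below is an illustration under assumptions: the raw sample and bin limits are invented, the first column of m.j is taken to be the within-bin mean (as suggested by the Sigma_fun example, where mu[1] is the mean), and the central moments of orders 2 to 4 use the divisor n_j rather than n_j - 1.

```r
## Hypothetical raw sample (normally only the grouped summaries are available)
set.seed(1)
y <- rgamma(2000, shape = 10, rate = 5)

## Big bin limits for J = 3 classes, chosen to cover the whole sample
Big.bins <- c(0, 1.5, 2.5, ceiling(max(y)))
J <- length(Big.bins) - 1

## Allocate each observation to a big bin and count
bin <- cut(y, breaks = Big.bins, include.lowest = TRUE, labels = FALSE)
freq.j <- tabulate(bin, nbins = J)

## Within each big bin: mean, then central moments of orders 2, 3 and 4
m.j <- t(sapply(seq_len(J), function(j) {
  yj <- y[bin == j]
  mj <- mean(yj)
  c(mj, mean((yj - mj)^2), mean((yj - mj)^3), mean((yj - mj)^4))
}))

## With the degross package installed, these could then be passed on:
## obj.data = degrossData(Big.bins=Big.bins, freq.j=freq.j, m.j=m.j)
```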
degross.
An object returned by the degrossData function from tabulated frequencies and central moments of orders 1 to 4. It is used in a second step by degross to estimate the underlying density.
A list containing:
small.bins: a vector of length I+1 with the small bin limits.
ui: the I midpoints of the small bins.
delta: width of the small bins.
I: the number of small bins.
B.i: a matrix of dim I by K with the B-spline basis evaluated at the small bin midpoints.
K: number of B-splines in the basis.
knots: equidistant knots supporting the B-spline basis.
Big.bins: vector of length J+1 with the limits of the J big bins containing the data used to produce the tabulated statistics.
freq.j: the number of data observed within each big bin.
m.j: a matrix of dim J by 4 giving the first 4 sample central moments within each big bin.
J: the number of big bins.
small.to.big: a vector of length I indicating to what big bin each element of ui belongs.
Philippe Lambert [email protected]
Lambert, P. (2021) Moment-based density and risk estimation from grouped summary statistics. arXiv:2107.03883.
degrossData, print.degrossData.
Cumulative distribution function (cdf) based on an object resulting from the estimation procedure in degross.
pdegross(x, degross.fit, phi)
x |
Scalar or vector where the fitted cdf must be evaluated. |
degross.fit |
A |
phi |
(Optional) vector of spline parameters for the log density (default: |
A scalar or vector of the same length as x containing the value of the fitted cdf at x.
Philippe Lambert [email protected]
Lambert, P. (2021) Moment-based density and risk estimation from grouped summary statistics. arXiv:2107.03883.
degross.object, ddegross, qdegross, degross.
## Generate grouped data
sim = simDegrossData(n=3500, plotting=TRUE, choice=2)
## Create a degrossData object
obj.data = degrossData(Big.bins=sim$Big.bins, freq.j=sim$freq.j, m.j=sim$m.j)
print(obj.data)
## Estimate the density
obj.fit = degross(obj.data)
## Superpose the fitted cdf using the <pdegross> function
with(sim, curve(true.cdf(x),min(Big.bins),max(Big.bins),
                col="red",lwd=2, ylab="F(x)"))
curve(pdegross(x,obj.fit),add=TRUE,lty="dashed")
legend("topleft", legend=c("Target cdf","Estimated cdf"), lwd=2,
       lty=c("solid","dashed"), col=c("red","black"), box.lty=0, inset=.04)
Plot the density estimate corresponding to a degross object and superpose it on the observed histogram.
## S3 method for class 'degross'
plot(x, col="black", lwd=2, lty="dashed", xlab="", ylab="Density", main="",...)
x |
A degross.object generated by degross. |
col |
Color used for plotting the fitted density. |
lwd |
Line width for the fitted density curve. |
lty |
Line type for the fitted density curve. |
xlab |
Label on the x-axis. |
ylab |
Label on the y-axis. |
main |
Title for the generated graph. |
... |
Further arguments to be passed to |
A histogram based on the observed big bin frequencies with the fitted density superposed.
Philippe Lambert [email protected]
Lambert, P. (2021) Moment-based density and risk estimation from grouped summary statistics. arXiv:2107.03883.
degross, degross.object, print.degross.
sim = simDegrossData(n=3500, plotting=TRUE,choice=2) ## Generate grouped data
obj.data = degrossData(Big.bins=sim$Big.bins, freq.j=sim$freq.j, m.j=sim$m.j)
print(obj.data)
obj.fit = degross(obj.data) ## Estimate the underlying density
plot(obj.fit) ## Plot the fitted density with the data histogram
Print a summary of the information contained in a degross.object generated by degross for density estimation from tabulated frequency and central moment data.
## S3 method for class 'degross'
print(x, ...)
x |
A degross.object generated by degross. |
... |
Possible additional printing options. |
Print information on the fitted density corresponding to the degross.object x: the estimated central moments within each class (or big bin) are printed with global fit statistics. A summary of the observed data is also provided: it includes the total sample size, the numbers of small and big bins with their limits, and the number of B-splines used for density estimation with degross.
Philippe Lambert [email protected]
Lambert, P. (2021) Moment-based density and risk estimation from grouped summary statistics. arXiv:2107.03883.
sim = simDegrossData(n=3500, plotting=TRUE)
obj.data = degrossData(Big.bins=sim$Big.bins, freq.j=sim$freq.j, m.j=sim$m.j)
## Estimate the density underlying the grouped data
obj.fit = degross(obj.data)
print(obj.fit)
Print a summary of the information contained in a degrossData.object used by degross for density estimation from tabulated frequency and moment data.
## S3 method for class 'degrossData'
print(x, ...)
x |
A degrossData.object generated by degrossData. |
... |
Possible additional printing options for a matrix object. |
Print the tabulated summary statistics contained in the degrossData.object x, with additional information on the total sample size, the numbers of small and big bins with their limits, and the number of B-splines planned for density estimation using degross.
Philippe Lambert [email protected]
Lambert, P. (2021) Moment-based density and risk estimation from grouped summary statistics. arXiv:2107.03883.
sim = simDegrossData(n=3500, plotting=TRUE)
obj.data = degrossData(Big.bins=sim$Big.bins, freq.j=sim$freq.j, m.j=sim$m.j)
print(obj.data)
Quantile function based on an object resulting from the estimation procedure in degross.
qdegross(p, degross.fit, phi, get.se=FALSE, cred.level=.95, eps=1e-4)
p |
Scalar or vector of probabilities in (0,1) indicating the requested fitted quantiles Q(p) based on the density estimation results in |
degross.fit |
A |
phi |
(Optional) vector of spline parameters for the log density (default: |
get.se |
Logical indicating if standard errors for Q(p) are requested (default: FALSE). |
cred.level |
Level of credible intervals for Q(p) (default: .95). |
eps |
Precision with which each quantile should be computed (default: 1e-4). |
A scalar or vector x of the same length as p containing the values Q(p) at which the cdf pdegross(x,degross.fit) is equal to p.
When get.se is TRUE, a vector or a matrix containing the quantile estimate(s), standard errors and credible interval limits for Q(p) is provided.
Philippe Lambert [email protected]
Lambert, P. (2021) Moment-based density and risk estimation from grouped summary statistics. arXiv:2107.03883.
degross.object, ddegross, pdegross, degross.
## Generate grouped data
sim = simDegrossData(n=3500, plotting=TRUE, choice=2)
## Create a degrossData object
obj.data = degrossData(Big.bins=sim$Big.bins, freq.j=sim$freq.j, m.j=sim$m.j)
print(obj.data)
## Estimate the density
obj.fit = degross(obj.data)
## Corresponding fitted quantiles
p = c(.01,.05,seq(.1,.9,by=.1),.95,.99) ## Desired probabilities
Q.p = qdegross(p,obj.fit) ## Compute the desired quantiles
print(Q.p) ## Estimated quantiles
## Compute the standard error and a 90% credible interval for the 60% quantile
Q.60 = qdegross(.60,obj.fit,get.se=TRUE,cred.level=.90) ## Compute the desired quantile
print(Q.60) ## Estimated quantile, standard error and credible interval
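Since Q(p) is defined as the point where the fitted cdf equals p, a quantile can be obtained by numerically inverting any cdf to a precision eps. The base R sketch below does this by bisection. It is not claimed to be the method used inside qdegross: the helper quantile_by_bisection is hypothetical, and a Gamma cdf stands in for pdegross(x, degross.fit).

```r
## Hypothetical helper: invert a cdf on [lower, upper] by bisection
quantile_by_bisection <- function(p, cdf, lower, upper, eps = 1e-4) {
  sapply(p, function(pk) {
    lo <- lower
    hi <- upper
    ## Invariant: cdf(lo) < pk <= cdf(hi); halve the bracket until it is narrower than eps
    while (hi - lo > eps) {
      mid <- (lo + hi) / 2
      if (cdf(mid) < pk) lo <- mid else hi <- mid
    }
    (lo + hi) / 2
  })
}

## Stand-in cdf (assumption): Gamma(shape = 10, rate = 5) instead of a fitted degross cdf
cdf <- function(x) pgamma(x, shape = 10, rate = 5)
p <- c(.05, .5, .95)
Q.p <- quantile_by_bisection(p, cdf, lower = 0, upper = 10, eps = 1e-4)
```

The returned values differ from the exact quantiles by at most eps/2, which is the role the eps argument plays in this sketch.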
Variance-covariance of sample central moments (root-n approximation) given the vector mu with the theoretical moments of order 1 to 8. CAREFUL: the result must be divided by n (= sample size)!
Sigma_fun(mu)
mu |
Vector of length 8 with the first 8 theoretical central moments. |
Variance-covariance matrix of the first four sample central moments (CAREFUL: the result must still be divided by the sample size!)
Philippe Lambert [email protected]
Lambert, P. (2021) Moment-based density and risk estimation from grouped summary statistics. arXiv:2107.03883.
mu = numeric(8)
dfun = function(x) dgamma(x,10,5)
mu[1] = integrate(function(x) x*dfun(x),0,Inf)$val
for (j in 2:8) mu[j] = integrate(function(x) (x-mu[1])^j*dfun(x),0,Inf)$val
Sigma_fun(mu)
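The division by the sample size can be checked on one entry without calling the package. For the sample variance m2, the classical root-n approximation gives Var(m2) close to (mu4 - mu2^2)/n, which presumably corresponds to the second diagonal entry of Sigma_fun(mu) divided by n (an assumption about the layout of the returned matrix). The base R sketch below compares this value with a Monte Carlo estimate for the same Gamma(10, 5) distribution; the sample size, number of replicates and seed are arbitrary choices.

```r
set.seed(123)
n <- 500                                  # sample size (arbitrary)
shape <- 10
rate <- 5

## Exact central moments of a Gamma(shape, rate) distribution
mu2 <- shape / rate^2
mu4 <- 3 * shape * (shape + 2) / rate^4

## Root-n approximation of the variance of the sample variance: note the division by n
var.m2.asympt <- (mu4 - mu2^2) / n

## Monte Carlo estimate of the same variance
m2 <- replicate(2000, {
  y <- rgamma(n, shape, rate)
  mean((y - mean(y))^2)
})
var.m2.mc <- var(m2)

ratio <- var.m2.mc / var.m2.asympt        # should be close to 1
```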
Simulation of grouped data and their sample moments to illustrate the degross density estimation procedure
simDegrossData(n, plotting=TRUE, choice=2, J=3)
n |
Desired sample size |
plotting |
Logical indicating whether the histogram of the simulated data should be plotted. Default: TRUE |
choice |
Integer in 1:3 indicating from which mixture of distributions to generate the data |
J |
Number of big bins |
A list containing tabulated frequencies and central moments of degrees 1 to 4 for data generated using a mixture density. This list contains:
n: total sample size.
J: number of big bins.
Big.bins: vector of length J+1 with the big bin limits.
freq.j: vector of length J with the observed big bin frequencies.
m.j: J by 4 matrix with on each row the observed first four sample central moments within a given big bin.
true.density: density of the raw data generating mechanism (to be estimated from the observed grouped data).
true.cdf: cdf of the raw data generating mechanism (to be estimated from the observed grouped data).
Philippe Lambert [email protected]
Lambert, P. (2021) Moment-based density and risk estimation from grouped summary statistics. arXiv:2107.03883.
## Generate data
sim = simDegrossData(n=3500, plotting=TRUE, choice=2, J=3)
print(sim$true.density) ## Display density of the data generating mechanism
# Create a degrossData object
obj.data = with(sim, degrossData(Big.bins=Big.bins, freq.j=freq.j, m.j=m.j))
print(obj.data)