......@@ -14,6 +14,47 @@ Issues:
* t2 in bam(...,discrete=TRUE) - not treated as tensor products at
present, and reparameterization needs checking (also for bam).
* OpenBLAS 0.3.2/3 appears not to be thread safe at present - can
cause problems when using openmp multithreading.
1.8-25
** bam(...,discrete=TRUE) methods improved. Cross products now usually faster
(can be much faster) and code can now make better use of optimised BLAS.
* fix to 'fs' smooth.construct method and smooth2random method, to allow
constructor to be called without a "gamm" attribute set on the smooth spec
but still get a sensible result from smooth2random (albeit never using
sparse matrices). Useful for other packages using constructors and
smooth2random, for 'fs' smooths.
* The mrf smooth constructor contained an obsolete hack in which the term
dimension was set to 2 to avoid plotting when used as a te marginal. This
messed up side constraints for terms where an 'mrf' smooth was a main effect
and te marginal. Fixed.
* extract.lme.cov/2 documentation modified to cover NA handling properly, and
routines modified to not require data to be supplied.
* fix of efsudr bug whereby extended families with no extra parameters to
estimate could give incorrect results when using optimizer="efs" in 'gam'.
* negbin() corrected - it was declaring the log link to be canonical, leading
to poor convergence and slight misfit.
* predict.bam(...,discrete=TRUE) now handles na.action properly, rather than
always dropping NAs (see the sketch following these entries).
* Fix of a very obscure bug in which a very poor model of a small dataset could
end up with fewer `good' data than coefficients, breaking an assumption of the
C code and segfaulting.
* fix of null deviance computation bug introduced with extended families in
bam - null deviance was wrong for non-default methods.
* liu2 modified to deal with random effects estimated to be exactly 0, so
that summary.gam does not fail in this case.
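For illustration, a minimal sketch of the new predict.bam na.action behaviour
(simulated data; names and settings are illustrative, not from the sources):

library(mgcv)
set.seed(2)
dat <- gamSim(1, n = 2000, scale = 2)
b <- bam(y ~ s(x0) + s(x1), data = dat, discrete = TRUE)
nd <- dat[1:10, ]; nd$x0[3] <- NA        ## newdata containing an NA
## na.pass now propagates the NA row to the output instead of it always
## being dropped; na.omit would drop it...
fv <- predict(b, newdata = nd, discrete = TRUE, na.action = na.pass)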
1.8-24
* Extended Fellner Schall optimizer now available for all families with 'gam'
......@@ -37,7 +78,7 @@ present, and reparameterization needs checking (also for bam).
to avoid masking problems in obscure circumstances.
* 'mrf' smooth documentation modified to make it clearer how to specify 'nb',
and code modified so that it is now possible to speficy the neighbour
and code modified so that it is now possible to specify the neighbour
structure using names rather than indices.
* 'bfgs' fix to handle extended families.
......
Package: mgcv
Version: 1.8-24
Version: 1.8-25
Author: Simon Wood <simon.wood@r-project.org>
Maintainer: Simon Wood <simon.wood@r-project.org>
Title: Mixed GAM Computation Vehicle with Automatic Smoothness
......@@ -18,6 +18,6 @@ LazyLoad: yes
ByteCompile: yes
License: GPL (>= 2)
NeedsCompilation: yes
Packaged: 2018-06-18 13:08:20 UTC; sw283
Packaged: 2018-10-26 08:31:24 UTC; sw283
Repository: CRAN
Date/Publication: 2018-06-18 19:07:59 UTC
Date/Publication: 2018-10-26 14:50:06 UTC
3d3df9183a9b061dc8c4fce64fc30fef *ChangeLog
57abeb2a1390e7e8d3727fa3d69b589f *DESCRIPTION
7471749e7e75a48aba7d56c7d0451464 *ChangeLog
28c24bc01e4b54e0368fc857108a6efb *DESCRIPTION
eb723b61539feef013de476e68b5c50a *GPL-2
723a559e94082d3a7c70651cbcb08dd8 *NAMESPACE
6e5f0f0d0b69f7082e9ecd4aad1f1374 *R/bam.r
baf2b1ee09bd0ba3b653ab4247b15ce4 *NAMESPACE
459045bffc77934a4f287258f31f8aa6 *R/bam.r
de69a45fe00e28a0309193df283c5829 *R/coxph.r
6c79693fe31558339cd694883d0beeb1 *R/efam.r
a0cc498ed5dca0d7909a504465592aa1 *R/fast-REML.r
253a2ac16043c512a1e7a13b097648e4 *R/gam.fit3.r
747d780473bb28492d5aafda9d4f72ba *R/gam.fit4.r
8fb9318d7a272467924c1ef4cd4eb97e *R/fast-REML.r
a00617b7da9f9de5fff27f13b5086c76 *R/gam.fit3.r
5ae9e2ab9b0b30811a14e6be88f4588c *R/gam.fit4.r
1b620b840ca67c9759639bd6e53824e3 *R/gam.sim.r
f2368e22ccf048cf3d1e210e970013dd *R/gamlss.r
3dd7e563b5859efdee464aa1d762bad6 *R/gamm.r
ce69039c15d4db9cfda5d1309af8b242 *R/gamm.r
10facb791e4cfd123d183f05660119c6 *R/jagam.r
954b93360cb6d862f10a740c2e09424e *R/mgcv.r
2feca0dc9d354de7bc707c67a5891364 *R/misc.r
70308e442854adfe95980e3a6c05357f *R/mgcv.r
2cea3312b32a86d53e0304b1279da048 *R/misc.r
03772998ab05887f2eeae10dd6efe983 *R/mvam.r
9702ee7e24809b4a3f0edfd41c97cd2e *R/plots.r
af5ee4fe12423c780babb13ceb94e595 *R/smooth.r
eed19ceca4c4114023f12cda3f622659 *R/plots.r
530e432792d072e2587cbd0896b620cf *R/smooth.r
7398607a9ba7e85276bcf09df0d9344b *R/soap.r
bde1774ce7903cabc57b3196f8872ea8 *R/sparse.r
e468195a83fab90da8e760c2c3884bd3 *data/columb.polys.rda
......@@ -42,7 +42,7 @@ fd0cfd64be579f9fbecdbb7f2b8ec1eb *man/Sl.initial.repara.Rd
60670020425f8749b81a8d8c3f168880 *man/Sl.setup.Rd
69ae63833438a3af2963e39482f1d72f *man/Tweedie.Rd
8087ab00d10b44c99c37f49bf90e19cd *man/anova.gam.Rd
cf0848f120b2d770ae6d9705b53dbbd6 *man/bam.Rd
d235547d7f960f11c6ca2cb0ed571ef6 *man/bam.Rd
ab5e37c3bf8803de63b63c3bdc5909cd *man/bam.update.Rd
cf5f1ee0aab639c7c4b9b357434f15b2 *man/bandchol.Rd
745cbf31eb14fc1c5916fc634c74d998 *man/bug.reports.mgcv.Rd
......@@ -51,10 +51,10 @@ cf5f1ee0aab639c7c4b9b357434f15b2 *man/bandchol.Rd
c03748964ef606621418e428ae49b103 *man/columb.Rd
9906a83ce29a3b17044bc2a3c9940cee *man/concurvity.Rd
04d4285d6b0cea7f8b3f4cb7ec9a466c *man/coxph.Rd
fa4f9235186983c04d7d45f51adce210 *man/coxpht.Rd
9a1cabe4d6abc94326a04e90c218bc93 *man/coxpht.Rd
b78faf4ab9477183e7a3fbbd8801afeb *man/dDeta.Rd
0a6d4b3858cbb69a5d375ecd09282256 *man/exclude.too.far.Rd
4248593e3b2efd51e9501a0c845cb68d *man/extract.lme.cov.Rd
3add7e72948f304246a412b3a91b9fad *man/extract.lme.cov.Rd
6d377ab3f866a3ba15b63ba2a8ae47ff *man/family.mgcv.Rd
42534ae5dffc0a7f6806270c901cbdd4 *man/fix.family.link.Rd
b7830b485a29b13b520fd184e6717b0d *man/fixDependence.Rd
......@@ -62,7 +62,7 @@ e75719779e18c723ee1fd17e44e7901b *man/formXtViX.Rd
88888e966394c9f9792d9537341d053c *man/formula.gam.Rd
4da4d585b329769eb44f0c7a6e7dd554 *man/fs.test.Rd
6f405acde2d7b6f464cf45f5395113ba *man/full.score.Rd
370b3d285e928da33d4897a9a3345d24 *man/gam.Rd
22ddcbad33f08332cd15a6cb88b0a510 *man/gam.Rd
42d669d0f18eba79234b2e849d724863 *man/gam.check.Rd
6d1a075aab054f1e14e6f0f1f432626a *man/gam.control.Rd
afd2fdc49ac57b4b94ee233fa5da1d64 *man/gam.convergence.Rd
......@@ -81,7 +81,7 @@ ed77ce6e1b941625485706d7e307b816 *man/gamObject.Rd
a2593990d6a87f7b783d0435062dec02 *man/gamSim.Rd
e5d2541f32dab56972f58b0773eba50c *man/gamlss.etamu.Rd
c7f140d128d1d1d76909499900faf49e *man/gamlss.gH.Rd
84d8e9331f7febe62c3b6055896357e6 *man/gamm.Rd
623851672072e852e0c0ff03a96de0d6 *man/gamm.Rd
222535dd19201cfd929efd3305b13f43 *man/gaulss.Rd
398a5c12285401c1d37a8edb58780bc3 *man/get.var.Rd
a62dd487f34f476f7f830ed9d1bc58dc *man/gevlss.Rd
......@@ -129,7 +129,7 @@ c27a6b886929b1dc83bf4b90cae848f9 *man/polys.plot.Rd
41b1b0883f2419c8e2d2dd913fc7d677 *man/print.gam.Rd
6d0ce4e574fabceffdbedd46c91364cb *man/qq.gam.Rd
22b7dcbc8ff4096365fa98ce56b957c9 *man/rTweedie.Rd
420e3d95f4c38c73303536ee33bfc98b *man/random.effects.Rd
0d49bec9b5b3609fb6112d18ec650e85 *man/random.effects.Rd
c523210ae95cb9aaa0aaa1c37da1a4c5 *man/residuals.gam.Rd
3c747a8066bcc28ae706ccf74f903d3e *man/rig.Rd
4faef2a628f3c2c43c8a88552ff5a7df *man/rmvn.Rd
......@@ -143,7 +143,7 @@ d54f4042e212fca7704cf8428bdaea38 *man/single.index.Rd
fc7fba34b89fdf29f66744d1fdadda79 *man/smooth.construct.bs.smooth.spec.Rd
2f0463d1aca0b8798da6e681bd4c6e54 *man/smooth.construct.cr.smooth.spec.Rd
f5e6d0f5122f61c336827b3615482157 *man/smooth.construct.ds.smooth.spec.Rd
db75c958cbfb561914a3291ab58b9786 *man/smooth.construct.fs.smooth.spec.Rd
bd58515156b5e07c006e69f29aa830d1 *man/smooth.construct.fs.smooth.spec.Rd
92591aadf25c362bed2b07da4adbd8be *man/smooth.construct.gp.smooth.spec.Rd
a7a3cb2c62724e7fea5edc94027fc097 *man/smooth.construct.mrf.smooth.spec.Rd
2523b6cefa306210c00f3477853b7f07 *man/smooth.construct.ps.smooth.spec.Rd
......@@ -154,7 +154,7 @@ b45d8e71bda4ceca8203dffea577e441 *man/smooth.construct.re.smooth.spec.Rd
8672633a1fad8df3cb1f53d7fa883620 *man/smooth.construct.tensor.smooth.spec.Rd
c522c270c217e5b83cf8f3e95220a03f *man/smooth.construct.tp.smooth.spec.Rd
ae5e27524e37d57505754639455f18a5 *man/smooth.terms.Rd
4c49358e5e6a70d1eca69a2ccaa77609 *man/smooth2random.Rd
f642b1caecf3d2bcdbbc0a884e1d3fa5 *man/smooth2random.Rd
844f9653d74441293d05a24dd3e2876a *man/smoothCon.Rd
b55a396da77559dac553613146633f97 *man/sp.vcov.Rd
83bd8e097711bf5bd0fff09822743d43 *man/spasm.construct.Rd
......@@ -185,26 +185,26 @@ b85b3ef1a39c4e1346b93173be8914aa *po/mgcv.pot
fdbb7d250e7baa7d73f26837246480f3 *po/pl.po
03972284b3400cf82cacd5d2dc4b8cb3 *src/Makevars
031c78f98f40e2615bf4794cb8fbae91 *src/coxph.c
0d723ffa162b4cb0c2c0fa958ccb4edd *src/discrete.c
f6c4ba80ce1b71be7f4a44fac1b08c28 *src/gdi.c
4801447cdbc0638d0aaf8ff723ba4b6b *src/discrete.c
0aa17445bd208f941d79e5dd70181297 *src/gdi.c
2436f9b328e80370ce2203dbf1dd813c *src/general.h
2a049d4a776ae2d109fe6f61eaa44d96 *src/init.c
4bd936321fe26c1c77cb810f7f627258 *src/init.c
a9151b5236852eef9e6947590bfcf88a *src/magic.c
654ff83187dc0f7ef4e085f3348f70d2 *src/mat.c
81a300809799304235bd032949bc047a *src/mat.c
e4cef7f1753153fbab242d1c4d4f7e7f *src/matrix.c
de37b0972199b796654405efc007f25b *src/matrix.h
8df4b96961491d76989b50856237ee2d *src/mgcv.c
09ec4af09981cab11312cc3d6f6dfc5f *src/mgcv.h
97e3717e95a70b1470b4c3071e144d17 *src/misc.c
3b7f9c7eff3a15b27c4cc5bd475afe6d *src/mgcv.h
8e3efe7a0d6b9619671cf2f7471f2112 *src/misc.c
465b8790ca2dfb6e8c5635cacabf5460 *src/mvn.c
563938b7bb6504ab10df5376c4360220 *src/qp.c
073a4b5b0bc6e869c5b35478c47facf1 *src/qp.h
d5673b88f6f3d85c62a1337f49abba24 *src/soap.c
7ad1c265233960a282682a6c90e27057 *src/sparse-smooth.c
44c7ac70ff41f615d449fafd3280819c *src/sparse-smooth.c
fe0444bece322bc229e46b3d1c150779 *src/tprs.c
5bd85bf0319a7b7c755cf49c91a7cd94 *src/tprs.h
38e593a85a6fd0bb4fbed836f3361406 *tests/bam.R
71d55841bbcd1067f58471cf6bb4b27a *tests/coxpht.R
3677e01cc0f90ae5ef9cc44583b045c8 *tests/coxpht.R
fefd6fe58a089c4692652bc5c0bcc65c *tests/gam.R
fa2508c443bdc759a734df0d00ed735e *tests/mgcv-parallel.R
9c1a01e6ea9ce8855f5489bc67762ecb *tests/missing.data.R
......
......@@ -83,7 +83,7 @@ importFrom(stats,.checkMFClasses,.getXlevels,anova,approx,as.formula,
binomial,coef,contrasts,"contrasts<-",cooks.distance,cor,cov,
delete.response,dbeta,dgamma,dnorm,dpois,fitted,formula,gaussian,glm,
influence,logLik,lm,mad,
make.link,median,model.frame,model.offset,model.matrix,nlm,
make.link,median,model.frame,model.offset,model.matrix,na.action,nlm,
na.pass,napredict,na.omit,naresid,optim,pchisq,pnorm,pt,pf,
power,predict,printCoefmat,quantile,
qbeta,qbinom,qcauchy,qchisq,qnbinom,qgamma,qnorm,qpois,qqline,qqnorm,qqplot,
......
......@@ -845,17 +845,19 @@ Sl.iftChol <- function(Sl,XX,R,d,beta,piv,nt=1) {
Sl.fitChol <- function(Sl,XX,f,rho,yy=0,L=NULL,rho0=0,log.phi=0,phi.fixed=TRUE,
nobs=0,Mp=0,nt=1,tol=0,gamma=1) {
nobs=0,Mp=0,nt=c(1,1),tol=0,gamma=1) {
## given X'WX in XX and f=X'Wy solves the penalized least squares problem
## with penalty defined by Sl and rho, and evaluates a REML Newton step, the REML
## gradient and the estimated coefs bhat. If phi.fixed=FALSE then we need
## yy = y'Wy in order to get derivatives w.r.t. phi.
## NOTE: with an optimized BLAS nt==1 is likely to be much faster than
## nt > 1
tol <- as.numeric(tol)
rho <- if (is.null(L)) rho + rho0 else L%*%rho + rho0
if (length(rho)<length(rho0)) rho <- rho0 ## ncol(L)==0 or length(rho)==0
## get log|S|_+ without stability transform...
fixed <- rep(FALSE,length(rho))
ldS <- ldetS(Sl,rho,fixed,np=ncol(XX),root=FALSE,repara=FALSE,nt=nt)
ldS <- ldetS(Sl,rho,fixed,np=ncol(XX),root=FALSE,repara=FALSE,nt=nt[1])
## now the Cholesky factor of the penalized Hessian...
#XXp <- XX+crossprod(ldS$E) ## penalized Hessian
......@@ -864,7 +866,7 @@ Sl.fitChol <- function(Sl,XX,f,rho,yy=0,L=NULL,rho0=0,log.phi=0,phi.fixed=TRUE,
d <- diag(XXp);ind <- d<=0
d[ind] <- 1;d[!ind] <- sqrt(d[!ind])
#XXp <- t(XXp/d)/d ## diagonally precondition
R <- if (nt>1) pchol(t(XXp/d)/d,nt) else suppressWarnings(chol(t(XXp/d)/d,pivot=TRUE))
R <- if (nt[2]>1) pchol(t(XXp/d)/d,nt[2]) else suppressWarnings(chol(t(XXp/d)/d,pivot=TRUE))
r <- min(attr(R,"rank"),Rrank(R))
p <- ncol(XXp)
piv <- attr(R,"pivot") #;rp[rp] <- 1:p
......@@ -876,15 +878,17 @@ Sl.fitChol <- function(Sl,XX,f,rho,yy=0,L=NULL,rho0=0,log.phi=0,phi.fixed=TRUE,
beta[piv] <- backsolve(R,(forwardsolve(t(R),f[piv]/d[piv])))/d[piv]
## get component derivatives based on IFT (noting that ldS$Sl has s.p.s updated to current)
dift <- Sl.iftChol(ldS$Sl,XX,R,d,beta,piv,nt=nt)
dift <- Sl.iftChol(ldS$Sl,XX,R,d,beta,piv,nt=nt[1])
## now the derivatives of log|X'X+S|
P <- pbsi(R,nt=nt,copy=TRUE) ## invert R
PP <- matrix(0,p,p)
PP[piv,piv] <- if (nt==1) tcrossprod(P) else pRRt(P,nt) ## PP'
if (nt[2]>1) {
P <- pbsi(R,nt=nt[2],copy=TRUE) ## invert R
PP[piv,piv] <- pRRt(P,nt[2]) ## PP'
} else PP[piv,piv] <- chol2inv(R)
PP <- t(PP/d)/d
ldetXXS <- 2*sum(log(diag(R))+log(d[piv])) ## log|X'X+S|
dXXS <- d.detXXS(ldS$Sl,PP,nt=nt) ## derivs of log|X'X+S|
dXXS <- d.detXXS(ldS$Sl,PP,nt=nt[1]) ## derivs of log|X'X+S|
phi <- exp(log.phi)
......
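A quick sketch of the point made in the NOTE above, and of why nt is now a
2-vector with nt[2] controlling the Cholesky/inversion path (illustrative
timing only; with a tuned multi-threaded BLAS the plain chol path is often
faster):

library(mgcv)
set.seed(1); n <- 1000
A <- crossprod(matrix(runif(n * n), n, n))   ## a positive definite matrix
system.time(R1 <- chol(A, pivot = TRUE))     ## BLAS path, used when nt[2]==1
system.time(R2 <- mgcv:::pchol(A, nt = 2))   ## parallel path, used when nt[2]>1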
......@@ -296,7 +296,7 @@ gam.fit3 <- function (x, y, sp, Eb,UrS=list(),
eta <- .9 * eta + .1 * etaold
mu <- linkinv(eta)
}
zg <- rep(0,max(dim(x)))
for (iter in 1:control$maxit) { ## start of main fitting iteration
good <- weights > 0
var.val <- variance(mu)
......@@ -334,8 +334,10 @@ gam.fit3 <- function (x, y, sp, Eb,UrS=list(),
if (sum(good)<ncol(x)) stop("Not enough informative observations.")
if (control$trace) t1 <- proc.time()
oo <- .C(C_pls_fit1,y=as.double(z),X=as.double(x[good,]),w=as.double(w),wy=as.double(w*z),
E=as.double(Sr),Es=as.double(Eb),n=as.integer(sum(good)),
ng <- sum(good);zg[1:ng] <- z ## ensure y dim large enough for beta in all cases
oo <- .C(C_pls_fit1,y=as.double(zg),X=as.double(x[good,]),w=as.double(w),wy=as.double(w*z),
E=as.double(Sr),Es=as.double(Eb),n=as.integer(ng),
q=as.integer(ncol(x)),rE=as.integer(rows.E),eta=as.double(z),
penalty=as.double(1),rank.tol=as.double(rank.tol),nt=as.integer(control$nthreads),
use.wy=as.integer(0))
......@@ -344,9 +346,10 @@ gam.fit3 <- function (x, y, sp, Eb,UrS=list(),
if (!fisher&&oo$n<0) { ## likelihood indefinite - switch to Fisher for this step
z <- (eta - offset)[good] + (yg - mug)/mevg
w <- (weg * mevg^2)/var.mug
ng <- sum(good);zg[1:ng] <- z ## ensure y dim large enough for beta in all cases
if (control$trace) t1 <- proc.time()
oo <- .C(C_pls_fit1,y=as.double(z),X=as.double(x[good,]),w=as.double(w),wy=as.double(w*z),
E=as.double(Sr),Es=as.double(Eb),n=as.integer(sum(good)),
oo <- .C(C_pls_fit1,y=as.double(zg),X=as.double(x[good,]),w=as.double(w),wy=as.double(w*z),
E=as.double(Sr),Es=as.double(Eb),n=as.integer(ng),
q=as.integer(ncol(x)),rE=as.integer(rows.E),eta=as.double(z),
penalty=as.double(1),rank.tol=as.double(rank.tol),nt=as.integer(control$nthreads),
use.wy=as.integer(0))
......@@ -2633,7 +2636,7 @@ negbin <- function (theta = stop("'theta' must be specified"), link = "log") {
structure(list(family = famname, link = linktemp, linkfun = stats$linkfun,
linkinv = stats$linkinv, variance = variance,dvar=dvar,d2var=d2var,d3var=d3var, dev.resids = dev.resids,
aic = aic, mu.eta = stats$mu.eta, initialize = initialize,ls=ls,
validmu = validmu, valideta = stats$valideta,getTheta = getTheta,qf=qf,rd=rd,canonical="log"), class = "family")
validmu = validmu, valideta = stats$valideta,getTheta = getTheta,qf=qf,rd=rd,canonical=""), class = "family")
} ## negbin
......
......@@ -27,40 +27,53 @@ dDeta <- function(y,mu,wt,theta,fam,deriv=0) {
d$Deta4 <- r$Dmu4; d$Dth2 <- r$Dth2; d$Detath2 <- r$Dmuth2
d$Deta2th2 <- r$Dmu2th2; d$Deta3th <- r$Dmu3th
}
return(d)
}
} else {
ig1 <- fam$mu.eta(fam$linkfun(mu))
ig12 <- ig1^2
g2g <- fam$g2g(mu)
ig1 <- fam$mu.eta(fam$linkfun(mu))
ig12 <- ig1^2
g2g <- fam$g2g(mu)
## ig12 <- ig1^2;ig13 <- ig12 * ig1
d$Deta <- r$Dmu * ig1
d$Deta2 <- r$Dmu2*ig12 - r$Dmu*g2g*ig1
d$EDeta2 <- r$EDmu2*ig12
d$Deta.Deta2 <- r$Dmu/(r$Dmu2*ig1 - r$Dmu*g2g)
d$Deta.EDeta2 <- r$Dmu/(r$EDmu2*ig1)
if (deriv>0) {
ig13 <- ig12 * ig1
d$Dth <- r$Dth
d$Detath <- r$Dmuth * ig1
g3g <- fam$g3g(mu)
d$Deta3 <- r$Dmu3*ig13 - 3*r$Dmu2 * g2g * ig12 + r$Dmu * (3*g2g^2 - g3g)*ig1
if (!is.null(r$EDmu3)) d$EDeta3 <- r$EDmu3*ig13 - 3*r$EDmu2 * g2g * ig12 ## EDmu=0
d$Deta2th <- r$Dmu2th*ig12 - r$Dmuth*g2g*ig1
if (!is.null(r$EDmu2th)) d$EDeta2th <- r$EDmu2th*ig12 ##- r$EDmuth*g2g*ig1
}
if (deriv>1) {
g4g <- fam$g4g(mu)
d$Deta4 <- ig12^2*r$Dmu4 - 6*r$Dmu3*ig13*g2g + r$Dmu2*(15*g2g^2-4*g3g)*ig12 -
d$Deta <- r$Dmu * ig1
d$Deta2 <- r$Dmu2*ig12 - r$Dmu*g2g*ig1
d$EDeta2 <- r$EDmu2*ig12
d$Deta.Deta2 <- r$Dmu/(r$Dmu2*ig1 - r$Dmu*g2g)
d$Deta.EDeta2 <- r$Dmu/(r$EDmu2*ig1)
if (deriv>0) {
ig13 <- ig12 * ig1
d$Dth <- r$Dth
d$Detath <- r$Dmuth * ig1
g3g <- fam$g3g(mu)
d$Deta3 <- r$Dmu3*ig13 - 3*r$Dmu2 * g2g * ig12 + r$Dmu * (3*g2g^2 - g3g)*ig1
if (!is.null(r$EDmu3)) d$EDeta3 <- r$EDmu3*ig13 - 3*r$EDmu2 * g2g * ig12 ## EDmu=0
d$Deta2th <- r$Dmu2th*ig12 - r$Dmuth*g2g*ig1
if (!is.null(r$EDmu2th)) d$EDeta2th <- r$EDmu2th*ig12 ##- r$EDmuth*g2g*ig1
}
if (deriv>1) {
g4g <- fam$g4g(mu)
d$Deta4 <- ig12^2*r$Dmu4 - 6*r$Dmu3*ig13*g2g + r$Dmu2*(15*g2g^2-4*g3g)*ig12 -
r$Dmu*(15*g2g^3-10*g2g*g3g +g4g)*ig1
d$Dth2 <- r$Dth2
d$Detath2 <- r$Dmuth2 * ig1
d$Deta2th2 <- ig12*r$Dmu2th2 - r$Dmuth2*g2g*ig1
d$Deta3th <- ig13*r$Dmu3th - 3 *r$Dmu2th*g2g*ig12 + r$Dmuth*(3*g2g^2-g3g)*ig1
}
d$Dth2 <- r$Dth2
d$Detath2 <- r$Dmuth2 * ig1
d$Deta2th2 <- ig12*r$Dmu2th2 - r$Dmuth2*g2g*ig1
d$Deta3th <- ig13*r$Dmu3th - 3 *r$Dmu2th*g2g*ig12 + r$Dmuth*(3*g2g^2-g3g)*ig1
}
} ## end of non identity
good <- is.finite(d$Deta)&is.finite(d$Deta2)
if (deriv>0) {
if (length(theta)>1) good <- good&is.finite(rowSums(d$Dth))&is.finite(rowSums(d$Detath))&
is.finite(rowSums(d$Deta2th))&is.finite(d$Deta3) else
good <- good&is.finite(d$Dth)&is.finite(d$Detath)&is.finite(d$Deta2th)&is.finite(d$Deta3)
if (deriv>1) {
if (length(theta)==1) good <- good&is.finite(d$Dth2)&is.finite(d$Detath2)&is.finite(d$Deta2th2)&
is.finite(d$Deta3th)&is.finite(d$Deta4) else
good <- good&is.finite(rowSums(d$Dth2))&is.finite(rowSums(d$Detath2))&is.finite(rowSums(d$Deta2th2))&
is.finite(rowSums(d$Deta3th))&is.finite(d$Deta4)
}
}
d$good <- good
d
} ## dDeta
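For reference, a sketch of the chain rule identities the transformations above
implement (notation inferred from the code: ig1 = 1/g'(mu), g2g = g''(mu)/g'(mu)^2,
g3g = g'''(mu)/g'(mu)^3, where g is the link and Dmu etc. are deviance
derivatives w.r.t. mu):

## Deta  = Dmu/g'
## Deta2 = Dmu2/g'^2 - Dmu*g''/g'^3
## Deta3 = Dmu3/g'^3 - 3*Dmu2*g''/g'^4 + Dmu*(3*g''^2 - g'*g''')/g'^5
## e.g. for the log link g(mu) = log(mu): g' = 1/mu, g'' = -1/mu^2, so that
## ig1 = mu, g2g = -1 and hence Deta2 = Dmu2*mu^2 + Dmu*mu.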
......@@ -337,7 +350,8 @@ gam.fit4 <- function(x, y, sp, Eb,UrS=list(),
pdev <- sum(dev.resids(y, linkinv(x%*%start+as.numeric(offset)), weights,theta)) + t(start)%*%St%*%start
if (pdev>old.pdev) start <- mukeep <- etastart <- NULL
}
coefold <- null.coef ## set to default, may be replaced below
if (!is.null(mukeep)) mustart <- mukeep
## and now finalize initialization of mu and eta...
......@@ -356,35 +370,31 @@ gam.fit4 <- function(x, y, sp, Eb,UrS=list(),
else family$linkfun(mustart)
mu <- linkinv(eta);etaold <- eta
coefold <- null.coef
conv <- boundary <- FALSE
dd <- dDeta(y,mu,weights,theta,family,0) ## derivatives of deviance w.r.t. eta
w <- dd$Deta2 * .5;
wz <- w*(eta-offset) - .5*dd$Deta
z <- (eta-offset) - dd$Deta.Deta2
good <- is.finite(z)&is.finite(w)
zg <- rep(0,max(dim(x)))
for (iter in 1:control$maxit) { ## start of main fitting iteration
if (control$trace) cat(iter," ")
# dd <- dDeta(y,mu,weights,theta,family,0) ## derivatives of deviance w.r.t. eta
# w <- dd$Deta2 * .5;
# wz <- w*(eta-offset) - .5*dd$Deta
# z <- (eta-offset) - dd$Deta.Deta2
# good <- is.finite(z)&is.finite(w)
if (control$trace&sum(!good)>0) cat("\n",sum(!good)," not good\n")
if (sum(!good)) {
use.wy <- TRUE
good <- is.finite(w)&is.finite(wz)
z[!is.finite(z)] <- 0 ## avoid NaN in .C call - unused anyway
} else use.wy <- family$use.wz
if (sum(good)==0) stop("no good data in iteration")
ng <- sum(good)
zg[1:ng] <- z[good] ## ensure that y dimension large enough for coefs
oo <- .C(C_pls_fit1,
y=as.double(z[good]),X=as.double(x[good,]),w=as.double(w[good]),wy = as.double(wz[good]),
E=as.double(Sr),Es=as.double(Eb),n=as.integer(sum(good)),
y=as.double(zg),X=as.double(x[good,]),w=as.double(w[good]),wy = as.double(wz[good]),
E=as.double(Sr),Es=as.double(Eb),n=as.integer(ng),
q=as.integer(ncol(x)),rE=as.integer(rows.E),eta=as.double(z),
penalty=as.double(1),rank.tol=as.double(rank.tol),
nt=as.integer(control$nthreads),use.wy=as.integer(use.wy))
nt=as.integer(control$nthreads),use.wy=as.integer(use.wy))
posdef <- oo$n >= 0
if (!posdef) { ## then problem is indefinite - switch to +ve weights for this step
if (control$trace) cat("**using positive weights\n")
......@@ -402,10 +412,11 @@ gam.fit4 <- function(x, y, sp, Eb,UrS=list(),
good <- is.finite(w)&is.finite(wz)
z[!is.finite(z)] <- 0 ## avoid NaN in .C call - unused anyway
} else use.wy <- family$use.wz
ng <- sum(good)
zg[1:ng] <- z[good] ## ensure that y dimension large enough for coefs
oo <- .C(C_pls_fit1, ##C_pls_fit1,
y=as.double(z[good]),X=as.double(x[good,]),w=as.double(w[good]),wy = as.double(wz[good]),
E=as.double(Sr),Es=as.double(Eb),n=as.integer(sum(good)),
y=as.double(zg),X=as.double(x[good,]),w=as.double(w[good]),wy = as.double(wz[good]),
E=as.double(Sr),Es=as.double(Eb),n=as.integer(ng),
q=as.integer(ncol(x)),rE=as.integer(rows.E),eta=as.double(z),
penalty=as.double(1),rank.tol=as.double(rank.tol),
nt=as.integer(control$nthreads),use.wy=as.integer(use.wy))
......@@ -435,8 +446,7 @@ gam.fit4 <- function(x, y, sp, Eb,UrS=list(),
coefold <- null.coef
etaold <- null.eta
}
#warning("Step size truncated due to divergence",
# call. = FALSE)
ii <- 1
while (!is.finite(dev)) {
if (ii > control$maxit)
......@@ -456,8 +466,7 @@ gam.fit4 <- function(x, y, sp, Eb,UrS=list(),
## now step halve if mu or eta are out of bounds...
if (!(valideta(eta) && validmu(mu))) {
#warning("Step size truncated: out of bounds",
# call. = FALSE)
ii <- 1
while (!(valideta(eta) && validmu(mu))) {
if (ii > control$maxit)
......@@ -522,7 +531,7 @@ gam.fit4 <- function(x, y, sp, Eb,UrS=list(),
## Need to check coefs converged adequately, to ensure implicit differentiation
## ok. Testing coefs unchanged is problematic under rank deficiency (not guaranteed to
## drop same parameter every iteration!)
grad <- 2 * t(x[good,])%*%((w[good]*(x%*%start)[good]-wz[good]))+ 2*St%*%start
grad <- 2 * t(x[good,,drop=FALSE])%*%((w[good]*(x%*%start)[good]-wz[good]))+ 2*St%*%start
if (max(abs(grad)) > control$epsilon*max(abs(start+coefold))/2) {
old.pdev <- pdev ## not converged quite enough
coef <- coefold <- start
......@@ -562,8 +571,8 @@ gam.fit4 <- function(x, y, sp, Eb,UrS=list(),
wf <- pmax(0,dd$EDeta2 * .5) ## Fisher type weights
wz <- w*(eta-offset) - 0.5*dd$Deta ## Wz finite when w==0
gdi.type <- if (any(abs(w)<.Machine$double.xmin*1e20)||any(!is.finite(z))) 1 else 0
good <- is.finite(wz)&is.finite(w)
good <- is.finite(wz)&is.finite(w)&dd$good
if (sum(good)==0) stop("not enough finite derivatives")
residuals <- z - (eta - offset)
residuals[!is.finite(residuals)] <- NA
......@@ -599,8 +608,11 @@ gam.fit4 <- function(x, y, sp, Eb,UrS=list(),
}
}
if (scoreType=="EFS") scoreType <- "REML"
## gdi.type should probably be computed after dropping via good
gdi.type <- if (any(abs(w)<.Machine$double.xmin*1e20)||any(!is.finite(z))) 1 else 0
if (scoreType=="EFS") scoreType <- "REML"
oo <- .C(C_gdi2,
X=as.double(x[good,]),E=as.double(Sr),Es=as.double(Eb),rS=as.double(unlist(rS)),
U1 = as.double(U1),sp=as.double(exp(sp)),theta=as.double(theta),
......@@ -621,7 +633,6 @@ gam.fit4 <- function(x, y, sp, Eb,UrS=list(),
rSncol=as.integer(rSncol),deriv=as.integer(deriv),
fixed.penalty = as.integer(rp$fixed.penalty),nt=as.integer(control$nthreads),
type=as.integer(gdi.type),dVkk=as.double(rep(0,nSp^2)))
rV <- matrix(oo$rV,ncol(x),ncol(x)) ## rV%*%t(rV)*scale gives covariance matrix
rV <- T %*% rV
## derivatives of coefs w.r.t. sps etc...
......@@ -726,7 +737,7 @@ efsudr <- function(x,y,lsp,Eb,UrS,weights,family,offset=0,start=NULL,etastart=NU
nsp <- length(UrS)
if (inherits(family,"extended.family")) {
spind <- family$n.theta + 1:nsp
thind <- 1:family$n.theta
thind <- if (family$n.theta>0) 1:family$n.theta else rep(0,0)
} else {
thind <- rep(0,0)
spind <- 1:nsp ## index of smoothing params in lsp
......
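A generic usage sketch of the extended Fellner-Schall optimizer that efsudr
supports (simulated data; names and family purely illustrative, not from the
sources):

library(mgcv)
set.seed(3)
dat <- gamSim(1, n = 400, dist = "poisson", scale = .25)
b <- gam(y ~ s(x0) + s(x2), family = poisson, data = dat,
         method = "REML", optimizer = "efs")
b$outer.info   ## EFS iteration details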
......@@ -348,12 +348,18 @@ smooth2random.fs.interaction <- function(object,vnames,type=1) {
## penalties are not interwoven, but blocked (i.e. this ordering is
## as for gamm case).
if (object$fixed) return(list(fixed=TRUE,Xf=object$X))
##if (type == 2) require(Matrix)
## If smooth constructor was not called with "gamm" attribute set,
## then we need to reset model matrix to base model matrix.
if (!is.null(object$Xb)) {
object$X <- object$Xb
object$S <- object$base$S
if (!is.null(object$S.scale)&&length(object$S)>0) for (i in 1:length(object$S)) object$S[[i]] <- object$S[[i]]/object$S.scale[i]
}
colx <- ncol(object$X)
diagU <- rep(1,colx)
ind <- 1:colx
flev <- levels(object$fac)
n.lev <- length(flev)
## flev <- levels(object$fac)
n.lev <- length(object$flev)
if (type==2) {
## index which params in fit parameterization are penalized by each penalty.
## e.g. pen.ind==1 is TRUE for each param penalized by first penalty and
......@@ -383,19 +389,15 @@ smooth2random.fs.interaction <- function(object,vnames,type=1) {
attr(random[[i]],"group") <- object$fac ## factor supplied as part of term
attr(random[[i]],"Xr.name") <- term.name
attr(random[[i]],"Xr") <- X
# rind <- c(rind,k:(k+ncol(X)-1))
# rinc <- c(rinc,rep(ncol(X),ncol(X)))
# k <- k + n.lev * ncol(X)
} else { ## gamm4 form --- whole sparse matrices
Xr <- as(matrix(0,nrow(X),0),"dgCMatrix")
ii <- 0
for (j in 1:n.lev) { ## assemble full sparse model matrix
Xr <- cbind2(Xr,as(X*as.numeric(object$fac==flev[j]),"dgCMatrix"))
Xr <- cbind2(Xr,as(X*as.numeric(object$fac==object$flev[j]),"dgCMatrix"))
pen.ind[indi+ii] <- i;ii <- ii + colx
}
# rind <- c(rind,k:(k+ncol(Xr)-1))
# rinc <- c(rinc,rep(ncol(Xr),ncol(Xr)))
random[[i]] <- Xr
random[[i]] <- if (is.null(object$Xb)) Xr else as(Xr,"matrix")
names(random)[i] <- term.name
attr(random[[i]],"s.label") <- object$label
}
......@@ -784,7 +786,7 @@ varWeights.dfo <- function(b,data)
w
}
extract.lme.cov2<-function(b,data,start.level=1)
extract.lme.cov2<-function(b,data=NULL,start.level=1)
# function to extract the response data covariance matrix from an lme fitted
# model object b, fitted to the data in data. "inner" == "finest" grouping
# start.level is the r.e. grouping level at which to start the construction,
......@@ -799,6 +801,10 @@ extract.lme.cov2<-function(b,data,start.level=1)
# V is either returned as an array, if it's diagonal, a matrix if it is
# a full matrix or a list of matrices if it is block diagonal.
{ if (!inherits(b,"lme")) stop("object does not appear to be of class lme")
if (is.null(data)) {
na.act <- na.action(b)
data <- if (is.null(na.act)) b$data else b$data[-na.act,]
}
grps <- nlme::getGroups(b) # labels of the innermost groupings - in data frame order
n <- length(grps) # number of data
n.levels <- length(b$groups) # number of levels of grouping (random effects only)
......@@ -989,13 +995,17 @@ extract.lme.cov2<-function(b,data,start.level=1)
list(V=V,ind=Cind)
} ## extract.lme.cov2
extract.lme.cov<-function(b,data,start.level=1)
extract.lme.cov<-function(b,data=NULL,start.level=1)
# function to extract the response data covariance matrix from an lme fitted
# model object b, fitted to the data in data. "inner" == "finest" grouping
# start.level is the r.e. grouping level at which to start the construction,
# levels outer to this will not be included in the calculation - this is useful
# for gamm calculations
{ if (!inherits(b,"lme")) stop("object does not appear to be of class lme")
if (is.null(data)) {
na.act <- na.action(b)
data <- if (is.null(na.act)) b$data else b$data[-na.act,]
}
grps<-nlme::getGroups(b) # labels of the innermost groupings - in data frame order
n<-length(grps) # number of data
if (is.null(b$modelStruct$varStruct)) w<-rep(b$sigma,n) ###
......
......@@ -3208,6 +3208,8 @@ liu2 <- function(x, lambda, h = rep(1,length(lambda)),lower.tail=FALSE) {
lh <- lh*lambda
c3 <- sum(lh)
if (x<=0 || c2 <= 0) return(1)
s1 <- c3/c2^1.5
s2 <- sum(lh*lambda)/c2^2
......@@ -3222,6 +3224,7 @@ liu2 <- function(x, lambda, h = rep(1,length(lambda)),lower.tail=FALSE) {
} else {
a <- 1/s1
delta <- 0
if (c3==0) return(1)
l <- c2^3/c3^2
}
......
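A hedged sketch of the internal routine guarded above: liu2 approximates the
upper tail probability of a weighted sum of chi-squared variables, and per the
new guards now returns 1 rather than failing when all weights are zero (as
happens when a random effect is estimated to be exactly 0):

mgcv:::liu2(3, lambda = c(1.5, 1, .5)) ## approx P(sum_j lambda_j X_j > 3), X_j ~ chisq_1
mgcv:::liu2(3, lambda = rep(0, 2))     ## degenerate case: now returns 1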
......@@ -133,8 +133,12 @@ mvn.ll <- function(y,X,beta,dbeta=NULL) {
XWXd <- function(X,w,k,ks,ts,dt,v,qc,nthreads=1,drop=NULL,ar.stop=-1,ar.row=-1,ar.w=-1) {
## Form X'WX given weights in w and X in compressed form in list X.
## each element of X is a (marginal) model submatrix. Full version
## is given by X[[i]][k[,i],]. list X relates to length(ts) separate
## is given by X[[i]][k[,i],] (see below for summation convention).
## list X relates to length(ts) separate
## terms. ith term starts at matrix ts[i] and has dt[i] marginal matrices.
## For summation convention, k[,ks[j,1]:ks[j,2]] gives index columns
## for matrix j, thereby allowing summation over matrix covariates....
## i.e. for q in ks[j,1]:ks[j,2] sum up X[[j]][k[,q],]
## Terms with several marginals are tensor products and may have
## constraints (if qc[i]>1), stored as a householder vector in v[[i]].
## check ts and k index start (assumed 1 here)
......@@ -142,13 +146,24 @@ XWXd <- function(X,w,k,ks,ts,dt,v,qc,nthreads=1,drop=NULL,ar.stop=-1,ar.row=-1,a
m <- unlist(lapply(X,nrow));p <- unlist(lapply(X,ncol))
nx <- length(X);nt <- length(ts)
n <- length(w);pt <- 0;
for (i in 1:nt) pt <- pt + prod(p[ts[i]:(ts[i]+dt[i]-1)]) - as.numeric(qc[i]>0)
oo <- .C(C_XWXd,XWX =as.double(rep(0,pt^2)),X= as.double(unlist(X)),w=as.double(w),
for (i in 1:nt) pt <- pt + prod(p[ts[i]:(ts[i]+dt[i]-1)]) - as.numeric(qc[i]>0)
## block oriented code...
t0 <- system.time(oo <- .C(C_XWXd0,XWX =as.double(rep(0,(pt+nt)^2)),X= as.double(unlist(X)),w=as.double(w),
k=as.integer(k-1),ks=as.integer(ks-1),m=as.integer(m),p=as.integer(p), n=as.integer(n),
ns=as.integer(nx), ts=as.integer(ts-1), as.integer(dt), nt=as.integer(nt),
v = as.double(unlist(v)),qc=as.integer(qc),nthreads=as.integer(nthreads),
ar.stop=as.integer(ar.stop-1),ar.row=as.integer(ar.row-1),ar.weights=as.double(ar.w))
if (is.null(drop)) matrix(oo$XWX,pt,pt) else matrix(oo$XWX,pt,pt)[-drop,-drop]
ar.stop=as.integer(ar.stop-1),ar.row=as.integer(ar.row-1),ar.weights=as.double(ar.w)))
## old strictly level 2 code for comparison...
# t1 <- system.time(ooo <- .C(C_XWXd,XWX =as.double(rep(0,pt^2)),X= as.double(unlist(X)),w=as.double(w),
# k=as.integer(k-1),ks=as.integer(ks-1),m=as.integer(m),p=as.integer(p), n=as.integer(n),
# ns=as.integer(nx), ts=as.integer(ts-1), as.integer(dt), nt=as.integer(nt),
# v = as.double(unlist(v)),qc=as.integer(qc),nthreads=as.integer(nthreads),
# ar.stop=as.integer(ar.stop-1),ar.row=as.integer(ar.row-1),ar.weights=as.double(ar.w)))
# XWX <- matrix(oo$XWX[1:pt^2],pt,pt)
# XWX0 <- matrix(ooo$XWX[1:pt^2],pt,pt)
# plot(XWX0,XWX,pch=".",main=range(XWX-XWX0));abline(0,1,col=2)
if (is.null(drop)) matrix(oo$XWX[1:pt^2],pt,pt) else matrix(oo$XWX[1:pt^2],pt,pt)[-drop,-drop]
} ## XWXd
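A minimal pure-R sketch of the compressed storage idea used by XWXd (single
term, one marginal matrix, no constraint or summation convention; all names
illustrative): the full matrix X[[1]][k[,1],] need never be formed, since the
weights can be aggregated over the unique covariate rows first.

set.seed(1)
n <- 1000                            ## observations
xu <- seq(0, 1, length = 20)         ## 20 discrete covariate values
Xm <- cbind(1, xu, xu^2)             ## marginal model matrix at unique values
k <- matrix(sample(1:20, n, replace = TRUE), n, 1) ## row index for each obs
w <- runif(n)                        ## weights
Xfull <- Xm[k[, 1], ]                ## full n row model matrix (checking only)
XWX1 <- t(Xfull) %*% (w * Xfull)     ## direct X'WX
wa <- tapply(w, factor(k[, 1], levels = 1:20), sum) ## aggregated weights
wa[is.na(wa)] <- 0                   ## levels with no observations
XWX2 <- t(Xm) %*% (as.numeric(wa) * Xm) ## compressed X'WX
range(XWX1 - XWX2)                   ## agree to rounding error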
XWyd <- function(X,w,y,k,ks,ts,dt,v,qc,drop=NULL,ar.stop=-1,ar.row=-1,ar.w=-1) {
......@@ -266,7 +281,7 @@ pqr2 <- function(x,nt=1,nb=30) {
pbsi <- function(R,nt=1,copy=TRUE) {
## parallel back substitution inversion of upper triangular R
## library(mgcv); n <- 500;p<-400;x <- matrix(runif(n*p),n,p)
## qrx <- mgcv:::pqr2(x,2);R <- qr.R(qrx)
## qrx <- qr(x);R <- qr.R(qrx)
## system.time(Ri <- mgcv:::pbsi(R,2))
## system.time(Ri2 <- backsolve(R,diag(p)));range(Ri-Ri2)
if (copy) R <- R * 1 ## ensure that R modified only within pbsi
......@@ -274,11 +289,12 @@ pbsi <- function(R,nt=1,copy=TRUE) {
R
} ## pbsi
pchol <- function(A,nt=1,nb=30) {
pchol <- function(A,nt=1,nb=40) {
## parallel Choleski factorization.
## library(mgcv);
## set.seed(2);n <- 200;r <- 190;A <- tcrossprod(matrix(runif(n*r),n,r))
## system.time(R <- chol(A,pivot=TRUE));system.time(L <- mgcv:::pchol(A));range(R[1:r,]-L[1:r,])
## k <- 30;range(R[1:k,1:k]-L[1:k,1:k])
## system.time(L <- mgcv:::pchol(A,nt=2,nb=30))
## piv <- attr(L,"pivot");attr(L,"rank");range(crossprod(L)-A[piv,piv])
## should nb be obtained from 'ILAENV' as page 23 of Lucas 2004??
......@@ -306,7 +322,8 @@ pcrossprod <- function(A,trans=FALSE,nt=1,nb=30) {
pRRt <- function(R,nt=1) {
## parallel RR' for upper triangular R
## following creates index of lower triangular elements...
## n <- 4000;a <- rep(1:n,n);b <- rep(1:n,each=n)-1;which(a-b>0) -> ii;a[ii]+b[ii]*n->ii
## n <- 4000;a <- rep(1:n,n);b <- rep(1:n,each=n);which(a>=b) -> ii;a[ii]+(b[ii]-1)*n->ii ## lower
## n <- 4000;a <- rep(1:n,n);b <- rep(1:n,each=n);which(a<=b) -> ii;a[ii]+(b[ii]-1)*n->ii ## upper
## library(mgcv);R <- matrix(0,n,n);R[ii] <- runif(n*(n+1)/2)
## Note: A[a-b<=0] <- 0 zeroes upper triangle
## system.time(A <- mgcv:::pRRt(R,2))
......
......@@ -435,7 +435,7 @@ lolaxy <- function(lo,la,theta,phi) {
list(x=x[ind],y=y[ind])
} ## end of lolaxy
plot.sos.smooth <- function(x,P=NULL,data=NULL,label="",se1.mult=1,se2.mult=2,
plot.sos.smooth <- function(x,P=NULL,data=NULL,label="",se1.mult=2,se2.mult=1,
partial.resids=FALSE,rug=TRUE,se=TRUE,scale=-1,n=100,n2=40,n3=3,
pers=FALSE,theta=30,phi=30,jit=FALSE,xlab=NULL,ylab=NULL,main=NULL,
ylim=NULL,xlim=NULL,too.far=0.1,shade=FALSE,shade.col="gray80",
......@@ -638,7 +638,7 @@ polys.plot <- function(pc,z=NULL,scheme="heat",lab="",...) {
par(oldpar)
} ## polys.plot
plot.mrf.smooth <- function(x,P=NULL,data=NULL,label="",se1.mult=1,se2.mult=2,
plot.mrf.smooth <- function(x,P=NULL,data=NULL,label="",se1.mult=2,se2.mult=1,
partial.resids=FALSE,rug=TRUE,se=TRUE,scale=-1,n=100,n2=40,n3=3,
pers=FALSE,theta=30,phi=30,jit=FALSE,xlab=NULL,ylab=NULL,main=NULL,
ylim=NULL,xlim=NULL,too.far=0.1,shade=FALSE,shade.col="gray80",
......@@ -663,7 +663,7 @@ plot.mrf.smooth <- function(x,P=NULL,data=NULL,label="",se1.mult=1,se2.mult=2,
} ## end plot.mrf.smooth
plot.fs.interaction <- function(x,P=NULL,data=NULL,label="",se1.mult=1,se2.mult=2,
plot.fs.interaction <- function(x,P=NULL,data=NULL,label="",se1.mult=2,se2.mult=1,
partial.resids=FALSE,rug=TRUE,se=TRUE,scale=-1,n=100,n2=40,n3=3,
pers=FALSE,theta=30,phi=30,jit=FALSE,xlab=NULL,ylab=NULL,main=NULL,
ylim=NULL,xlim=NULL,too.far=0.1,shade=FALSE,shade.col="gray80",
......@@ -697,7 +697,7 @@ plot.fs.interaction <- function(x,P=NULL,data=NULL,label="",se1.mult=1,se2.mult=
}
} ## end plot.fs.interaction
plot.mgcv.smooth <- function(x,P=NULL,data=NULL,label="",se1.mult=1,se2.mult=2,
plot.mgcv.smooth <- function(x,P=NULL,data=NULL,label="",se1.mult=2,se2.mult=1,
partial.resids=FALSE,rug=TRUE,se=TRUE,scale=-1,n=100,n2=40,n3=3,
pers=FALSE,theta=30,phi=30,jit=FALSE,xlab=NULL,ylab=NULL,main=NULL,
ylim=NULL,xlim=NULL,too.far=0.1,shade=FALSE,shade.col="gray80",
......@@ -707,14 +707,18 @@ plot.mgcv.smooth <- function(x,P=NULL,data=NULL,label="",se1.mult=1,se2.mult=2,
## `x' is a smooth object, usually part of a `gam' fit. It has an attribute
## 'coefficients' containing the coefs for the smooth, but usually these
## are not needed.
## Usually this function is called twice: first to set up P, then, once the
## actual plot information including standard error bands has been computed, to
## actually plot...
## `P' is a list of plot data.
## If `P' is NULL then the routine should compute some of this plot data
## If `P' is NULL (first call) then the routine should compute some of this plot data
## and return without plotting...
## * X the matrix mapping the smooth's coefficients to the values at
## which the smooth must be computed for plotting.
## * The values against which to plot.
## * `exclude' indicates rows of X%*%p to set to NA for plotting -- NULL for none.
## * se TRUE if plotting of the term can use standard error information.
## * se.mult - the multiplier of the standard error used to plot confidence bands
## * scale TRUE if the term should be considered by plot.gam if a common
## y scale is required.
## * any raw data information.
......@@ -723,9 +727,9 @@ plot.mgcv.smooth <- function(x,P=NULL,data=NULL,label="",se1.mult=1,se2.mult=2,
## very little processing is done outside the routine, except for partial residual
## computations.
## Alternatively return P as NULL if x should not be plotted.
## If P is not NULL it will contain
## If P is not NULL (second call) it will contain the following...
## * fit - the values for plotting
## * se.fit - standard errors of fit (can be NULL)
## * se - standard errors of fit multiplied by se.mult
## * the values against which to plot
## * any raw data information
## * any partial.residuals
......@@ -1255,7 +1259,7 @@ plot.gam <- function(x,residuals=FALSE,rug=NULL,se=TRUE,pages=0,select=NULL,scal
se.fit <- sqrt(pmax(0,rowSums((X1%*%x$Vp)*X1)))
} else se.fit <- ## se in centred (or anyway unconstrained) space only
sqrt(pmax(0,rowSums((P$X%*%x$Vp[first:last,first:last,drop=FALSE])*P$X)))
if (!is.null(P$exclude)) P$se.fit[P$exclude] <- NA
if (!is.null(P$exclude)) se.fit[P$exclude] <- NA
} ## standard errors for fit completed
if (partial.resids) { P$p.resid <- fv.terms[,length(order)+i] + w.resid }
if (se && P$se) P$se <- se.fit*P$se.mult # Note multiplier
......
......@@ -2032,29 +2032,27 @@ smooth.construct.fs.smooth.spec <- function(object,data,knots) {
object$S[[i+1]] <- object$S[[1]]*0
object$S[[i+1]][object$rank+i,object$rank+i] <- 1
}
object$P <- rp$P ## X' = X%*%P, where X is original version
object$fterm <- fterm ## the factor name...
if (!is.factor(fac)) warning("no factor supplied to fs smooth")
object$flev <- levels(fac)
object$fac <- fac ## gamm should use this for grouping
## now full penalties ...
#fullS <- list()
#fullS[[1]] <- diag(rep(c(rp$D,rep(0,null.d)),nf)) ## range space penalties
#for (i in 1:null.d) { ## null space penalties
# um <- rep(0,ncol(rp$X));um[object$rank+i] <- 1
# fullS[[i+1]] <- diag(rep(um,nf))
#}
## Now the model matrix
if (gamm) { ## no duplication, gamm will handle this by nesting
if (object$fixed==TRUE) stop("\"fs\" terms can not be fixed here")
object$X <- rp$X
object$fac <- fac ## gamm should use this for grouping
#object$fac <- fac ## gamm should use this for grouping
object$te.ok <- 0 ## would break special handling
## rank??
} else { ## duplicate model matrix columns, and penalties...
nf <- length(object$flev)
## Store the base model matrix/S in case user wants to convert to r.e. but
## has not created it with a "gamm" attribute set on the object
object$Xb <- rp$X
object$base$S <- object$S
## creating the model matrix...
object$X <- rp$X * as.numeric(fac==object$flev[1])
if (nf>1) for (i in 2:nf) {
......@@ -2535,7 +2533,6 @@ smooth.construct.mrf.smooth.spec <- function(object, data, knots) {
if (length(ind)>0) for (i in length(ind):1) object$xt$polys[[ind[i]]] <- NULL
}
object$plot.me <- TRUE
object$dim <- 2 ## signal that it's really 2D here to avoid attempt to plot in te term
## polygon list in correct format
} else {
object$plot.me <- FALSE ## can't plot without polygon information
......
mgcv (1.8-25-1) unstable; urgency=medium
* New upstream release
* debian/control: Set Build-Depends: to current R version
* debian/control: Set Standards-Version: to current version
-- Dirk Eddelbuettel <edd@debian.org> Fri, 02 Nov 2018 17:24:23 -0500
mgcv (1.8-24-1) unstable; urgency=medium
* New upstream release
......
......@@ -2,8 +2,8 @@ Source: mgcv
Section: gnu-r
Priority: optional
Maintainer: Dirk Eddelbuettel <edd@debian.org>
Build-Depends: debhelper (>= 7.0.0), r-base-dev (>= 3.5.0), r-cran-nlme, r-cran-matrix, dh-r
Standards-Version: 4.1.4
Build-Depends: debhelper (>= 7.0.0), r-base-dev (>= 3.5.1), r-cran-nlme, r-cran-matrix, dh-r
Standards-Version: 4.2.1
Vcs-Browser: https://salsa.debian.org/edd/r-cran-mgcv
Vcs-Git: https://salsa.debian.org/edd/r-cran-mgcv.git
Homepage: https://cran.r-project.org/package=mgcv
......
......@@ -13,7 +13,7 @@ are designed for datasets containing upwards of several tens of thousands of dat
of \code{bam} is much lower memory footprint than \code{\link{gam}}, but it can also be much faster,
for large datasets. \code{bam} can also compute on a cluster set up by the \link[parallel]{parallel} package.
An alternative fitting approach (Wood et al. 2016) is provided by the \code{discrete==TRUE} method. In this case a method based on discretization of covariate values and C code level parallelization (controlled by the \code{nthreads} argument instead of the \code{cluster} argument) is used. This extends both the data set and model size that are practical.
An alternative fitting approach (Wood et al. 2017) is provided by the \code{discrete==TRUE} method. In this case a method based on discretization of covariate values and C code level parallelization (controlled by the \code{nthreads} argument instead of the \code{cluster} argument) is used. This extends both the data set and model size that are practical.
}
\usage{
......@@ -132,7 +132,7 @@ single machine). See details and example code.
}
\item{nthreads}{Number of threads to use for non-cluster computation (e.g. combining results from cluster nodes).
If \code{NA} set to \code{max(1,length(cluster))}.}
If \code{NA} set to \code{max(1,length(cluster))}. See details.}
\item{gc.level}{to keep the memory footprint down, it helps to call the garbage collector often, but this takes
a substantial amount of time. Setting this to zero means that garbage collection only happens when R decides it should. Setting to 2 gives frequent garbage collection. 1 is in between.
......@@ -187,7 +187,9 @@ can be used in the 1D case, and tensor product smooths (\code{te}) are typically
If \code{cluster} is provided as a cluster set up using \code{\link[parallel]{makeCluster}} (or \code{\link[parallel]{makeForkCluster}}) from the \code{parallel} package, then the rate limiting QR decomposition of the model matrix is performed in parallel using this cluster. Note that the speed ups are often not that great. On a multi-core machine it is usually best to set the cluster size to the number of physical cores, which is often less than what is reported by \code{\link[parallel]{detectCores}}. Using more than the number of physical cores can result in no speed up at all (or even a slow down). Note that a highly parallel BLAS may negate all advantage from using a cluster of cores. Computing in parallel of course requires more memory than computing in series. See examples.
When \code{discrete=TRUE} the covariate data are first discretized. Discretization takes place on a smooth by smooth basis, or in the case of tensor product smooths (or any smooth that can be represented as such, such as random effects), separately for each marginal smooth. The required spline bases are then evaluated at the discrete values, and stored, along with index vectors indicating which original observation they relate to. Fitting is by a version of performance oriented iteration/PQL using REML smoothing parameter selection on each iterative working model (as for the default method). The iteration is based on the derivatives of the REML score, without computing the score itself, allowing the expensive computations to be reduced to one parallel block Cholesky decomposition per iteration (plus two basic operations of equal cost, but easily parallelized). Unlike standard POI/PQL, only one step of the smoothing parameter update for the working model is taken at each step (rather than iterating to the optimal set of smoothing parameters for each working model). At each step a weighted model matrix crossproduct of the model matrix is required - this is efficiently computed from the pre-computed basis functions evaluated at the discretized covariate values. Efficient computation with tensor product terms means that some terms within a tensor product may be re-ordered for maximum efficiency. Parallel computation is controlled using the \code{nthreads} argument. For this method no cluster computation is used, and the \code{parallel} package is not required.
When \code{discrete=TRUE} the covariate data are first discretized. Discretization takes place on a smooth by smooth basis, or in the case of tensor product smooths (or any smooth that can be represented as such, such as random effects), separately for each marginal smooth. The required spline bases are then evaluated at the discrete values, and stored, along with index vectors indicating which original observation they relate to. Fitting is by a version of performance oriented iteration/PQL using REML smoothing parameter selection on each iterative working model (as for the default method). The iteration is based on the derivatives of the REML score, without computing the score itself, allowing the expensive computations to be reduced to one parallel block Cholesky decomposition per iteration (plus two basic operations of equal cost, but easily parallelized). Unlike standard POI/PQL, only one step of the smoothing parameter update for the working model is taken at each step (rather than iterating to the optimal set of smoothing parameters for each working model). At each step a weighted model matrix crossproduct of the model matrix is required - this is efficiently computed from the pre-computed basis functions evaluated at the discretized covariate values. Efficient computation with tensor product terms means that some terms within a tensor product may be re-ordered for maximum efficiency.
When \code{discrete=TRUE} parallel computation is controlled using the \code{nthreads} argument. For this method no cluster computation is used, and the \code{parallel} package is not required. Note that actual speed up from parallelization depends on the BLAS installed and your hardware. With the (R default) reference BLAS using several threads can make a substantial difference, but with a single threaded tuned BLAS, such as openblas, the effect is less marked (since cache use is typically optimized for one thread, and is then sub optimal for several). However the tuned BLAS is usually much faster than using the reference BLAS, however many threads you use. If you have a multi-threaded BLAS installed then you should leave \code{nthreads} at 1, since calling a multi-threaded BLAS from multiple threads usually slows things down: the only exception to this is that you might choose to form discrete matrix cross products (the main cost in the fitting routine) in a multi-threaded way, but use single threaded code for other computations: this can be achieved by e.g. \code{nthreads=c(2,1)}, which would use 2 threads for discrete inner products, and 1 for most code calling BLAS. Note that the basic reason that multi-threaded performance is often disappointing is that most computers are heavily memory bandwidth limited, not flop rate limited. It is hard to get data to one core fast enough, let alone trying to get data simultaneously to several cores.
\code{discrete=TRUE} will often produce identical results to the methods without discretization, since covariates often only take a modest number of discrete values anyway, so no approximation at all is involved in the discretization process. Even when some approximation is involved, the differences are often very small as the algorithms discretize marginally whenever possible. For example each margin of a tensor product smooth is discretized separately, rather than discretizing onto a grid of covariate values (for an equivalent isotropic smooth we would have to discretize onto a grid). The marginal approach allows quite fine scale discretization and hence very low approximation error. Note that when using the smooth \code{id} mechanism to link smoothing parameters, the discrete method cannot force the linked bases to be identical, so some differences to the non-discrete methods will be noticeable.
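A hedged sketch of the threading advice above (simulated data; thread counts
purely illustrative):

library(mgcv)
set.seed(5)
dat <- gamSim(1, n = 10000, scale = 2)
## reference BLAS: let mgcv thread all the discrete computations...
b1 <- bam(y ~ s(x0) + s(x1) + te(x2, x3), data = dat, discrete = TRUE,
          nthreads = 2)
## multi-threaded BLAS: thread only the discrete cross products...
b2 <- bam(y ~ s(x0) + s(x1) + te(x2, x3), data = dat, discrete = TRUE,
          nthreads = c(2, 1))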
......
......@@ -39,7 +39,8 @@ tdpois <- function(dat,event="z",et="futime",t="day",status="status1",
require(utils) ## for progress bar
te <- sort(unique(dat[[et]][dat[[status]]==1])) ## event times
sid <- unique(dat[[id]])
prg <- txtProgressBar(min = 0, max = length(sid), initial = 0,
inter <- interactive()
if (inter) prg <- txtProgressBar(min = 0, max = length(sid), initial = 0,
char = "=",width = NA, title="Progress", style = 3)
## create dataframe for poisson model data
dat[[event]] <- 0; start <- 1
......@@ -54,7 +55,7 @@ tdpois <- function(dat,event="z",et="futime",t="day",status="status1",
um[[et]] <- tr ## reset time to relevant event times
dap[start:(start-1+nrow(um)),] <- um ## copy to dap
start <- start + nrow(um)
setTxtProgressBar(prg, i)
if (inter) setTxtProgressBar(prg, i)
}
if (inter) close(prg)
dap[1:(start-1),]
......
......@@ -10,8 +10,8 @@ calculation. \code{extract.lme.cov} forms the full matrix explicitly:
\code{extract.lme.cov2} tries to be more economical than this.
}
\usage{
extract.lme.cov(b,data,start.level=1)
extract.lme.cov2(b,data,start.level=1)
extract.lme.cov(b,data=NULL,start.level=1)
extract.lme.cov2(b,data=NULL,start.level=1)
}
%- maybe also `usage' for other objects documented here.
......@@ -19,7 +19,8 @@ extract.lme.cov2(b,data,start.level=1)
\item{b}{ A fitted model object returned by a call to \code{\link[nlme]{lme}}}.
\item{data}{ The data frame/ model frame that was supplied to
\code{\link[nlme]{lme}}.}
\code{\link[nlme]{lme}}, but with any rows removed by the \code{na.action} dropped. Uses
the data stored in the model object if not supplied.}
\item{start.level}{The level of nesting at which to start including random
effects in the calculation. This is used to allow smooth terms to be estimated
......@@ -92,7 +93,8 @@ require(mgcv)
library(nlme)
data(Rail)
b <- lme(travel~1,Rail,~1|Rail)
extract.lme.cov(b,Rail)
extract.lme.cov(b)
extract.lme.cov2(b)
}
\keyword{models} \keyword{smooth} \keyword{regression}%-- one or more ..
......
......@@ -56,7 +56,7 @@ from \code{environment(formula)}: typically the environment from
which \code{gam} is called.}
\item{weights}{ prior weights on the contribution of the data to the log likelihood. Note that a weight of 2, for example,
is equivalent to having made exactly the same observation twice. If you want to reweight the contributions
is equivalent to having made exactly the same observation twice. If you want to re-weight the contributions
of each datum without changing the overall magnitude of the log likelihood, then you should normalize the weights
(e.g. \code{weights <- weights/mean(weights)}). }
......@@ -177,11 +177,11 @@ where the (independent) response variables \eqn{y_i \sim {\rm Poi }}{y_i~Poi}, a
requires constraints on the smooth functions. By default these are imposed automatically and require that the function sums to zero over the observed covariate values (the presence of a metric \code{by} variable is the only case which usually suppresses this).
If absolutely any smooth functions were allowed in model fitting then maximum likelihood
estimation of such models would invariably result in complex overfitting estimates of
estimation of such models would invariably result in complex over-fitting estimates of
\eqn{f_1}{f_1} and \eqn{f_2}{f_2}. For this reason the models are usually fit by
penalized likelihood
maximization, in which the model (negative log) likelihood is modified by the addition of
a penalty for each smooth function, penalizing its `wiggliness'. To control the tradeoff
a penalty for each smooth function, penalizing its `wiggliness'. To control the trade-off
between penalizing wiggliness and penalizing badness of fit each penalty is multiplied by
an associated smoothing parameter: how to estimate these parameters, and
how to practically represent the smooth functions are the main statistical questions
......@@ -214,10 +214,10 @@ coefficient matrices for each smooth term in the model formula, obtaining a mode
the strictly parametric part of the model formula, and combining these to obtain a
complete model matrix (/design matrix) and a set of penalty matrices for the smooth terms.
The linear identifiability constraints are also obtained at this point. The model is
fit using \code{\link{gam.fit}}, \code{\link{gam.fit3}} or varaints, which are modifications
fit using \code{\link{gam.fit}}, \code{\link{gam.fit3}} or variants, which are modifications
of \code{\link{glm.fit}}. The GAM
penalized likelihood maximization problem is solved by Penalized Iteratively
Reweighted Least Squares (P-IRLS) (see e.g. Wood 2000).
Re-weighted Least Squares (P-IRLS) (see e.g. Wood 2000).
Smoothing parameter selection is possible in one of two ways. (i)
`Performance iteration' uses the fact that at each P-IRLS step a working penalized linear model
is estimated, and the smoothing parameter estimation can be performed for each such working model.
......
......@@ -14,7 +14,7 @@ see \code{gamm4} from package \code{gamm4}.
Smooths are specified as in a call to \code{\link{gam}} as part of the fixed
effects model formula, but the wiggly components of the smooth are treated as
random effects. The random effects structures and correlation structures
availabale for \code{lme} are used to specify other random effects and
available for \code{lme} are used to specify other random effects and
correlations.
It is assumed that the random effects and correlation structures are employed
......@@ -58,7 +58,7 @@ is a GEE approach to correlation in the generalized case. See examples below.}
\item{family}{A \code{family} as used in a call to \code{\link{glm}} or \code{\link{gam}}.
The default \code{gaussian} with identity link causes \code{gamm} to fit by a direct call to
\code{\link[nlme]{lme}} procided there is no offset term, otherwise
\code{\link[nlme]{lme}} provided there is no offset term, otherwise
\code{gammPQL} is used.}
\item{data}{ A data frame or list containing the model response variable and
......@@ -195,7 +195,7 @@ Wang, Y. (1998) Mixed effects smoothing spline analysis of variance. J.R. Statis
\section{WARNINGS }{
\code{gamm} was a somewhat different argument list to \code{\link{gam}},
\code{gamm} has a somewhat different argument list to \code{\link{gam}},
\code{gam} arguments such as \code{gamma} supplied to \code{gamm} will
just be ignored.
......
......@@ -11,15 +11,15 @@ regression terms. This method can be used with \code{\link{gam}} by making use o
\code{\link{smooth.construct.re.smooth.spec}}, for full details. The basic idea is that, e.g., \code{s(x,z,g,bs="re")} generates an i.i.d. Gaussian
random effect with model matrix given by \code{model.matrix(~x:z:g-1)} --- in principle such terms can take any number of arguments. This simple
approach is sufficient for implementing a wide range of commonly used random effect structures. For example if \code{g} is a factor then
\code{s(g,bs="re")} produces a random coefficient for each level of \code{g}, with the random coefficients all modelled as i.i.d. normal. If \code{g} is a factor and \code{x} is numeric, then \code{s(x,g,bs="re")} produces an i.i.d. normal random slope relating the response to \code{x} for each level of \code{g}. If \code{h} is another factor then \code{s(h,g,bs="re")} produces the usual i.i.d. normal \code{g} - \code{h} interaction. Note that a rather useful approximate test for zero random effect is also implemented for tsuch terms based on Wood (2013). If the precision
\code{s(g,bs="re")} produces a random coefficient for each level of \code{g}, with the random coefficients all modelled as i.i.d. normal. If \code{g} is a factor and \code{x} is numeric, then \code{s(x,g,bs="re")} produces an i.i.d. normal random slope relating the response to \code{x} for each level of \code{g}. If \code{h} is another factor then \code{s(h,g,bs="re")} produces the usual i.i.d. normal \code{g} - \code{h} interaction. Note that a rather useful approximate test for zero random effect is also implemented for such terms based on Wood (2013). If the precision
matrix is known to within a multiplicative constant, then this can be supplied via the \code{xt} argument of \code{s}. See \link{smooth.construct.re.smooth.spec} for details and example.
Alternatively, but less straightforwardly, the \code{paraPen} argument to \code{\link{gam}} can be used:
see \code{\link{gam.models}}. If smoothing parameter estimation is by ML or REML (e.g. \code{gam(...,method="REML")}) then this approach is
a completely conventional likelihood based treatment of random effects.
\code{gam} can be slow for fitting models with large numbers of random effects, because it does not exploit the sparcity that is often a feature
of parametric random effects. It can not be used for models with more coefficients than data. However \code{gam} is often faster and more relaiable
\code{gam} can be slow for fitting models with large numbers of random effects, because it does not exploit the sparsity that is often a feature
of parametric random effects. It can not be used for models with more coefficients than data. However \code{gam} is often faster and more reliable
than \code{gamm} or \code{gamm4}, when the number of random effects is modest.
To facilitate the use of random effects with \code{gam}, \code{\link{gam.vcomp}} is a utility routine for converting
......
......@@ -20,7 +20,7 @@ for factor smooth interactions (including interactions of tensor product smooths
\arguments{
\item{object}{For the \code{smooth.construct} method a smooth specification object,
usually generated by a term \code{s(x,...,bs="fs",)}. For the \code{predict.Matrix} method
usually generated by a term \code{s(x,...,bs="fs")}. May have a \code{gamm} attribute: see details. For the \code{predict.Matrix} method
an object of class \code{"fs.interaction"} produced by the \code{smooth.construct} method.}
\item{data}{a list containing just the data (including any \code{by} variable) required by this term,
......@@ -30,7 +30,7 @@ an object of class \code{"fs.interaction"} produced by the \code{smooth.construc
}
\value{ An object of class \code{"fs.interaction"} or a matrix mapping the coefficients of the factor smooth interaction to the smooths themselves.
\value{ An object of class \code{"fs.interaction"} or a matrix mapping the coefficients of the factor smooth interaction to the smooths themselves. The contents of an \code{"fs.interaction"} object will depend on whether or not \code{smooth.construct} was called with an object having the \code{"gamm"} attribute: see below.
}
\details{This class produces a smooth for each level of a single factor variable. Within a \code{\link{gam}}
......@@ -42,10 +42,13 @@ The class is particularly useful for use with \code{\link{gamm}}, where estimati
the nesting of the smooth within the factor. Note however that: i) \code{gamm} only allows one conditioning
factor for smooths, so \code{s(x)+s(z,fac,bs="fs")+s(v,fac,bs="fs")} is OK, but \code{s(x)+s(z,fac1,bs="fs")+s(v,fac2,bs="fs")}
is not; ii) all additional random effects and correlation structures will be treated as nested within the factor
of the smooth factor interaction.
of the smooth factor interaction. To facilitate this, the constructor is called from \code{\link{gamm}} with an attribute
\code{"gamm"} attached to the smooth specification object. The resulting object differs from the one produced when the
attribute is not present.
Note that \code{gamm4} from the \code{gamm4} package suffers from none of the restrictions that apply to \code{gamm}, and
\code{"fs"} terms can be used without side-effects.
\code{"fs"} terms can be used without side-effects. Construcor is still called with a smooth specification object having a
\code{"gamm"} attribute.
Any singly penalized basis can be used to smooth at each factor level. The default is \code{"tp"}, but alternatives can
be supplied in the \code{xt} argument of \code{s} (e.g. \code{s(x,fac,bs="fs",xt="cr")} or
......
......@@ -20,13 +20,16 @@ smooth2random(object,vnames,type=1)
}
\details{There is a duality between smooths and random effects which means that smooths can be estimated using mixed modelling software. This function converts standard \code{mgcv} smooth objects to forms suitable for estimation by \code{lme}, for example. A service routine for \code{\link{gamm}} exported for use by package developers.
\details{There is a duality between smooths and random effects which means that smooths can be estimated using mixed modelling software. This function converts standard \code{mgcv} smooth objects to forms suitable for estimation by \code{lme}, for example. A service routine for \code{\link{gamm}} exported for use by package developers. See examples for creating prediction matrices for new data, corresponding to the random and fixed effect matrices returned when \code{type=2}.
}
\value{A list.
\item{rand}{ a list of random effects, including grouping factors, and
a fixed effects matrix. Grouping factors, model matrix and model matrix name attached as attributes, to each element.}
a fixed effects matrix. Grouping factors, model matrix and model
matrix name are attached as attributes to each element. Alternatively, for \code{type=2},
a list of random effect model matrices is returned, each corresponding to an i.i.d. Gaussian
random effect with a single variance component.
\item{trans.D}{A vector that transforms the coefficients, in the order [rand1, rand2, ..., fix], back to the original parameterization. If \code{NULL}, it is taken to be a vector of ones. \code{b.original = trans.U \%*\% (trans.D*b.fit)}.}
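For example (an illustrative sketch, where \code{re} denotes the \code{smooth2random}
result and \code{b.fit} a hypothetical fitted coefficient vector in the random effects
parameterization), the back-transformation above is simply:
\preformatted{
b <- re$trans.D * b.fit                          ## rescale
if (!is.null(re$trans.U)) b <- re$trans.U \%*\% b ## rotate back if needed
}
In general the \code{rind} re-ordering may also be required: see the examples below.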
......@@ -56,11 +59,57 @@ and Hall/CRC Press.
\seealso{ \code{\link{gamm}} }
\examples{
## Simple type 1 'lme' style...
library(mgcv)
x <- runif(30)
sm <- smoothCon(s(x),data.frame(x=x))[[1]]
smooth2random(sm,"")
## Now type 2 'lme4' style...
z <- runif(30)
dat <- data.frame(x=x,z=z)
sm <- smoothCon(t2(x,z),dat)[[1]]
re <- smooth2random(sm,"",2)
str(re)
## For prediction after fitting we might transform parameters back to
## original parameterization using 'rind', 'trans.D' and 'trans.U',
## and call PredictMat(sm,newdata) to get the prediction matrix to
## multiply these transformed parameters by.
## Alternatively we could obtain fixed and random effect Prediction
## matrices corresponding to the results from smooth2random, which
## can be used with the fit parameters without transforming them.
## The following shows how...
s2rPred <- function(sm,re,data) {
## Function to aid prediction from smooths represented as type==2
## random effects. re must be the result of smooth2random(sm,...,type=2).
X <- PredictMat(sm,data) ## get prediction matrix for new data
## transform to r.e. parameterization
if (!is.null(re$trans.U)) X <- X\%*\%re$trans.U
X <- t(t(X)*re$trans.D)
## re-order columns according to random effect re-ordering...
X[,re$rind] <- X[,re$pen.ind!=0]
## re-order penalization index in same way
pen.ind <- re$pen.ind; pen.ind[re$rind] <- pen.ind[pen.ind>0]
## start return object...
r <- list(rand=list(),Xf=X[,which(re$pen.ind==0),drop=FALSE])
for (i in seq_along(re$rand)) { ## loop over random effect matrices
r$rand[[i]] <- X[,which(pen.ind==i),drop=FALSE]
attr(r$rand[[i]],"s.label") <- attr(re$rand[[i]],"s.label")
}
names(r$rand) <- names(re$rand)
r
} ## s2rPred
## use function to obtain prediction random and fixed effect matrices
## for first 10 elements of 'dat'. Then confirm that these match the
## first 10 rows of the original model matrices, as they should...
r <- s2rPred(sm,re,dat[1:10,])
range(r$Xf-re$Xf[1:10,])
range(r$rand[[1]]-re$rand[[1]][1:10,])
}
\keyword{models} \keyword{smooth} \keyword{regression}%-- one or more ...
......
......@@ -29,6 +29,7 @@ R_CMethodDef CEntries[] = {
{"tri_chol",(DL_FUNC) tri_chol,4},
{"diagXVXt", (DL_FUNC) &diagXVXt,16},
{"XWXd", (DL_FUNC) &XWXd,18},
{"XWXd0", (DL_FUNC) &XWXd0,18},
{"XWyd", (DL_FUNC) &XWyd,18},
{"Xbd", (DL_FUNC) &Xbd,15},
{"vcorr", (DL_FUNC) &vcorr, 5},
......
......@@ -20,12 +20,20 @@
/* sed -i 's/old-text/new-text/g' *.c
is quite useful!!
*/
// For safe memory handling from R...
/* For safe memory handling from R... */
#define CALLOC R_chk_calloc
#define FREE R_chk_free
// Can reset to check for memory errors...
/* BUT, this can mess up valgrind memory error checking - problems are
sometimes missed because standard allocation is being circumvented. Errors can then
corrupt R memory management without detection, and trigger nothing until R
fails internally because of the corruption, which makes it look as if
R itself is generating the problem. Hence it is better to reset these for checking. Also sizing
errors in .C often generate no obvious valgrind error.*/
//#define CALLOC calloc
//#define FREE free
void *R_chk_calloc1(size_t nmemb,size_t size);
void magic(double *y,double *X,double *sp0,double *def_sp,double *S,double *H,double *L,
double *lsp0,double *gamma,double *scale, int *control,int *cS,double *rank_tol,
double *tol,double *b,double *rV,double *norm_const,int *n_score,int *nt);
......@@ -81,6 +89,9 @@ void mvn_ll(double *y,double *X,double *XX,double *beta,int *n,int *lpi,
void XWXd(double *XWX,double *X,double *w,int *k,int *ks, int *m,int *p, int *n, int *nx,
int *ts, int *dt, int *nt,double *v,int *qc,int *nthreads,int *ar_stop,
int *ar_row,double *ar_weights);
void XWXd0(double *XWX,double *X,double *w,int *k,int *ks, int *m,int *p, int *n, int *nx,
int *ts, int *dt, int *nt,double *v,int *qc,int *nthreads,int *ar_stop,
int *ar_row,double *ar_weights);
void XWyd(double *XWy,double *y,double *X,double *w,int *k, int *ks, int *m,int *p, int *n,
int *nx, int *ts, int *dt, int *nt,double *v,int *qc,
int *ar_stop,int *ar_row,double *ar_weights);
......@@ -109,7 +120,9 @@ void MinimumSeparation(double *x,int *n, int *d,double *t,int *m,double *dist);
void rksos(double *x,int *n,double *eps);
void pivoter(double *x,int *r,int *c,int *pivot, int *col, int *reverse);
/* Routines for linear algebra with direct access to linpack and lapack */
void row_squash(double *X,int rnew,int rold,int col);
void up2lo(double * A, int n);
void band_chol(double *B,int *n,int *k,int *info);
void tri_chol(double *ld,double *sd,int *n,int *info);
void mgcv_omp(int *a);
......
......@@ -21,8 +21,19 @@ USA. */
#include <math.h>
#include <R.h>
#include <Rmath.h>
#include <Rinternals.h>
#include <Rconfig.h>
#include "mgcv.h"
void *R_chk_calloc1(size_t nmemb,size_t size) {
/* checks for zero-sized memory allocation calls (note that nmemb is unsigned,
so a negative request from a caller would wrap rather than be caught here)...*/
if (nmemb==0) {
Rprintf("adjusting %d memory allocation\n",(int)nmemb);
nmemb++;
}
return(R_chk_calloc(nmemb,size));
}
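/* Usage sketch (illustrative only, not from the package sources): the wrapper can
   stand in for CALLOC where a zero-sized request is possible...
       double *x = (double *) R_chk_calloc1((size_t)n,sizeof(double));
       ... work with x ...
       FREE(x);   (memory comes from R_chk_calloc, so R_chk_free applies)
*/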
/* Compute reproducing kernel for spline on the sphere */
void rksos(double *x,int *n,double *eps) {
......@@ -676,7 +687,7 @@ void rwMatrix(int *stop,int *row,double *w,double *X,int *n,int *p,int *trans,do
w[stop[i-1]+1...stop[i]]. stop[-1]=-1 by convention.
stop is an n vector.
If (trans==0) the the operation on a column x is x'[i] += w[row[j]] * X[row[j]] over the
If (trans==0) the operation on a column x is x'[i] += w[row[j]] * X[row[j]] over the
j from stop[i-1]+1 to stop[i]. Otherwise the transposed operation
x'[row[j]] += w[row[j]] * x[i] is used with the same j range. x' zero at outset.
......
......@@ -765,7 +765,7 @@ SEXP Rkradius(SEXP kdr,SEXP Xr,SEXP xr,SEXP rr,SEXP offr) {
row of xr. off is an m+1 vector. Returns a vector ni such that ni[off[i]:(off[i+1]-1)]
contains the indices (rows) in Xr of the neighbours of the ith row of xr.
*/
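/* For example (illustrative sketch): the neighbours of row i of xr are traversed by
       for (j=off[i];j<off[i+1];j++) neighbour = ni[j];
   consistent with the packing described above. */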
double *X,*x,*dis,*r,*xx,*ddat;
double *X,*x,*r,*xx,*ddat;
kdtree_type *kd;
int *dim,m,d,*off,*nei,*list,nn,i,j,n_buff=0,nlist,*ni,nprot=1,*idat;
SEXP DIM,ptr,neir,IDAT;
......
......@@ -20,8 +20,10 @@ tdpois <- function(dat,event="z",et="futime",t="day",status="status1",
require(utils) ## for progress bar
te <- sort(unique(dat[[et]][dat[[status]]==1])) ## event times
sid <- unique(dat[[id]])
prg <- txtProgressBar(min = 0, max = length(sid), initial = 0,
char = "=",width = NA, title="Progress", style = 3)
inter <- interactive()
if (inter) prg <-
txtProgressBar(min = 0, max = length(sid), initial = 0,
char = "=",width = NA, title="Progress", style = 3)
## create dataframe for poisson model data
dat[[event]] <- 0; start <- 1
dap <- dat[rep(1:length(sid),length(te)),]
......@@ -35,9 +37,9 @@ tdpois <- function(dat,event="z",et="futime",t="day",status="status1",
um[[et]] <- tr ## reset time to relevant event times
dap[start:(start-1+nrow(um)),] <- um ## copy to dap
start <- start + nrow(um)
setTxtProgressBar(prg, i)
if (inter) setTxtProgressBar(prg, i)
}
close(prg)
if (inter) close(prg)
dap[1:(start-1),]
} ## tdpois
pbcseq$status1 <- as.numeric(pbcseq$status==2) ## death indicator
......