Commit a1ae9b9e authored by Dirk Eddelbuettel's avatar Dirk Eddelbuettel

Import Upstream version 1.8-22

parent 1b84d0b8
......@@ -4,6 +4,28 @@
Currently deprecated and liable to be removed:
- gam performance iteration (1.8-19, Sept 2017)
1.8-22
* Fix of bug whereby testing for OpenMP and nthreads>1 in bam, would fail if
OpenMP was missing.
1.8-21
* When functions were added to families within mgcv some very large
environments could end up attached to those functions, for no good reason.
The problem originated from the dispatch of the generic 'fix.family.link'
and then propagated via fix.family.var and fix.family.ls. This is now avoided,
resulting in smaller gam objects on disk and lower R memory usage.
Thanks to Niels Richard Hansen for uncovering this.
* Another bug fix for prediction from discrete fit bam models with an offset,
this time when there were more than 50000 data. Also fix to bam fitting when
the number of data was an integer multiple of the chunk size + 1.
* check.term was missing a 'stop' so that some unhandled nesting structures
in bam(...,discrete=TRUE) failed with an unhelpful error, instead of a
helpful one. Fixed.
1.8-20
* bam(,discrete=TRUE) could produce garbage with ti(x,z,k=c(6,5),mc=c(T,F))
......
Package: mgcv
Version: 1.8-20
Version: 1.8-22
Author: Simon Wood <simon.wood@r-project.org>
Maintainer: Simon Wood <simon.wood@r-project.org>
Title: Mixed GAM Computation Vehicle with Automatic Smoothness
......@@ -18,6 +18,6 @@ LazyLoad: yes
ByteCompile: yes
License: GPL (>= 2)
NeedsCompilation: yes
Packaged: 2017-09-09 07:02:31 UTC; sw283
Packaged: 2017-09-18 10:38:41 UTC; sw283
Repository: CRAN
Date/Publication: 2017-09-09 16:26:09 UTC
Date/Publication: 2017-09-19 00:27:56 UTC
c26bec397cf6ee2a88d4c295945b8cab *ChangeLog
ac713195702fe496609fec839dbdcc9f *DESCRIPTION
bc8bc8be9cec065f0c7ec0cf670516f2 *ChangeLog
ad2a852c227dd80d86b24784d12dba42 *DESCRIPTION
eb723b61539feef013de476e68b5c50a *GPL-2
3d1bd78b1a6f876c1160b7e64ea79cc0 *NAMESPACE
043efd04d042fba89ddd6781c0220ab9 *R/bam.r
9fa89dc9361930dca536d4e011e4e605 *NAMESPACE
b051474f30e8e779f2440e8ecd5bbd51 *R/bam.r
7b419683b0948cf6da009f614078fe90 *R/coxph.r
777a0d67a1f7fa14bf87bc312064061b *R/efam.r
f610b2695c525c2d9d8f8174e6760548 *R/fast-REML.r
4dc75c18c9feceb73d207b8ac01b2f5f *R/gam.fit3.r
60aa872ec8901e4dc6273e653f77c83f *R/gam.fit4.r
dfdb821247da3e780de0d4599b88735d *R/fast-REML.r
0a9cc97524994d1563b93769284266c4 *R/gam.fit3.r
f0fd93b43b36abf05e6666b4a71c20fa *R/gam.fit4.r
1b620b840ca67c9759639bd6e53824e3 *R/gam.sim.r
b99d9de0c46a230e081b9bbbbbb3c617 *R/gamlss.r
c934ba653cb1a904132b3f15a01e41c5 *R/gamm.r
10facb791e4cfd123d183f05660119c6 *R/jagam.r
48c3f7080c31084058aaf3de06015e06 *R/mgcv.r
b9084037d058ef5d787e2818fbc29bae *R/mgcv.r
2feca0dc9d354de7bc707c67a5891364 *R/misc.r
03772998ab05887f2eeae10dd6efe983 *R/mvam.r
ca13f63d1112d31258eb3c68464fa653 *R/plots.r
......@@ -42,7 +42,7 @@ fd0cfd64be579f9fbecdbb7f2b8ec1eb *man/Sl.initial.repara.Rd
60670020425f8749b81a8d8c3f168880 *man/Sl.setup.Rd
69ae63833438a3af2963e39482f1d72f *man/Tweedie.Rd
8087ab00d10b44c99c37f49bf90e19cd *man/anova.gam.Rd
a7e0ce83164f1e34d0a9d99d0f9853f3 *man/bam.Rd
6180a4e9ea206e1a350b993dade0a869 *man/bam.Rd
ab5e37c3bf8803de63b63c3bdc5909cd *man/bam.update.Rd
cf5f1ee0aab639c7c4b9b357434f15b2 *man/bandchol.Rd
745cbf31eb14fc1c5916fc634c74d998 *man/bug.reports.mgcv.Rd
......@@ -151,7 +151,7 @@ b45d8e71bda4ceca8203dffea577e441 *man/smooth.construct.re.smooth.spec.Rd
0bfe981f2c3e6ea5b8d5372076ccde53 *man/smooth.construct.sos.smooth.spec.Rd
3cb4e59f915c8d64b90754eaeeb5a86f *man/smooth.construct.t2.smooth.spec.Rd
8672633a1fad8df3cb1f53d7fa883620 *man/smooth.construct.tensor.smooth.spec.Rd
a088e69cf148d07a78ce95a69759c95c *man/smooth.construct.tp.smooth.spec.Rd
c522c270c217e5b83cf8f3e95220a03f *man/smooth.construct.tp.smooth.spec.Rd
ae5e27524e37d57505754639455f18a5 *man/smooth.terms.Rd
4c49358e5e6a70d1eca69a2ccaa77609 *man/smooth2random.Rd
844f9653d74441293d05a24dd3e2876a *man/smoothCon.Rd
......@@ -185,19 +185,19 @@ ed7cb61912e4990cb0076d4cdcf11da8 *po/pl.po
03972284b3400cf82cacd5d2dc4b8cb3 *src/Makevars
342aa30c8f6f1e99ffa2576a6f29d7ce *src/coxph.c
0d723ffa162b4cb0c2c0fa958ccb4edd *src/discrete.c
dba99f7d7cc412dd9255f6307c8c7fa7 *src/gdi.c
f6c4ba80ce1b71be7f4a44fac1b08c28 *src/gdi.c
2436f9b328e80370ce2203dbf1dd813c *src/general.h
6ea9eff0b4b804a4e3ebf830c0ecacb9 *src/init.c
a287f92033406194afe0301ee076fda8 *src/init.c
a9151b5236852eef9e6947590bfcf88a *src/magic.c
654ff83187dc0f7ef4e085f3348f70d2 *src/mat.c
0545dabf3524a110d616ea5e6373295d *src/matrix.c
6b781cbd5b9cfee68ad30bb7ce31ef3a *src/matrix.h
48cde0e19d5dd54b131ba66c777c0ec2 *src/mgcv.c
f2eacd39b434ddef7a3bf9d53fb6fe77 *src/mgcv.h
e4cef7f1753153fbab242d1c4d4f7e7f *src/matrix.c
de37b0972199b796654405efc007f25b *src/matrix.h
8df4b96961491d76989b50856237ee2d *src/mgcv.c
b6278c9a4a32bdc16487a96e6a0045b6 *src/mgcv.h
97e3717e95a70b1470b4c3071e144d17 *src/misc.c
465b8790ca2dfb6e8c5635cacabf5460 *src/mvn.c
8f480dc455f9ff011c3e8f059efec2c5 *src/qp.c
cd563899be5b09897d1bf36a7889caa0 *src/qp.h
563938b7bb6504ab10df5376c4360220 *src/qp.c
073a4b5b0bc6e869c5b35478c47facf1 *src/qp.h
d5673b88f6f3d85c62a1337f49abba24 *src/soap.c
dcac8c02b5f9be28d13efc834fc88d55 *src/sparse-smooth.c
fe0444bece322bc229e46b3d1c150779 *src/tprs.c
......
......@@ -85,7 +85,7 @@ influence,logLik,lm,mad,
make.link,median,model.frame,model.offset,model.matrix,nlm,
na.pass,napredict,na.omit,naresid,optim,pchisq,pnorm,pt,pf,
power,predict,printCoefmat,quantile,
qbeta,qbinom,qchisq,qnbinom,qgamma,qnorm,qpois,qqline,qqnorm,qqplot,
qbeta,qbinom,qcauchy,qchisq,qnbinom,qgamma,qnorm,qpois,qqline,qqnorm,qqplot,
reformulate,residuals,
rbeta,rbinom,rgamma,rnbinom,rnorm,rpois,runif,sd,
termplot,terms.formula,terms,uniroot,var,vcov,weights)
......
......@@ -109,8 +109,8 @@ qr.up <- function(arg) {
wt <- c(wt,w)
w <- sqrt(w)
## note assumption that nt=1 in following qr.update - i.e. each cluster node is strictly serial
if (b == 1) qrx <- qr.update(w*X[good,],w*z,use.chol=arg$use.chol)
else qrx <- qr.update(w*X[good,],w*z,qrx$R,qrx$f,qrx$y.norm2,use.chol=arg$use.chol)
if (b == 1) qrx <- qr.update(w*X[good,,drop=FALSE],w*z,use.chol=arg$use.chol)
else qrx <- qr.update(w*X[good,,drop=FALSE],w*z,qrx$R,qrx$f,qrx$y.norm2,use.chol=arg$use.chol)
rm(X);if(arg$gc.level>1) gc() ## X can be large: remove and reclaim
}
qrx$dev <- dev;qrx$wt <- wt;qrx$eta <- eta
......@@ -191,8 +191,8 @@ check.term <- function(term,rec) {
ii <- which(rec$vnames%in%term)
if (length(ii)) { ## at least one variable already discretized
if (length(term)==rec$d[min(ii)]) { ## dimensions match previous discretization
if (sum(!(term%in%rec$vnames[ii]))) ("bam can not discretize with this nesting structure")
else return(rec$ki[min(ii)]) ## all names match previous - retun index of previous
if (sum(!(term%in%rec$vnames[ii]))) stop("bam can not discretize with this nesting structure")
else return(rec$ki[min(ii)]) ## all names match previous - return index of previous
} else stop("bam can not discretize with this nesting structure")
} else return(0) ## no match
} ## check.term
......@@ -953,8 +953,8 @@ bgam.fit <- function (G, mf, chunk.size, gp ,scale ,gamma,method, coef=NULL,etas
wt[ind] <- w ## wt <- c(wt,w)
w <- sqrt(w)
## note that QR may be parallel using npt>1, even under serial accumulation...
if (b == 1) qrx <- qr.update(w*X[good,],w*z,use.chol=use.chol,nt=npt)
else qrx <- qr.update(w*X[good,],w*z,qrx$R,qrx$f,qrx$y.norm2,use.chol=use.chol,nt=npt)
if (b == 1) qrx <- qr.update(w*X[good,,drop=FALSE],w*z,use.chol=use.chol,nt=npt)
else qrx <- qr.update(w*X[good,,drop=FALSE],w*z,qrx$R,qrx$f,qrx$y.norm2,use.chol=use.chol,nt=npt)
rm(X);if(gc.level>1) gc() ## X can be large: remove and reclaim
}
if (use.chol) { ## post proc to get R and f...
......@@ -1855,7 +1855,7 @@ AR.resid <- function(rsd,rho=0,AR.start=NULL) {
bam <- function(formula,family=gaussian(),data=list(),weights=NULL,subset=NULL,na.action=na.omit,
offset=NULL,method="fREML",control=list(),select=FALSE,scale=0,gamma=1,knots=NULL,sp=NULL,
min.sp=NULL,paraPen=NULL,chunk.size=10000,rho=0,AR.start=NULL,discrete=FALSE,
cluster=NULL,nthreads=NA,gc.level=1,use.chol=FALSE,samfrac=1,coef=NULL,
cluster=NULL,nthreads=1,gc.level=1,use.chol=FALSE,samfrac=1,coef=NULL,
drop.unused.levels=TRUE,G=NULL,fit=TRUE,drop.intercept=NULL,...)
## Routine to fit an additive model to a large dataset. The model is stated in the formula,
......@@ -1894,7 +1894,7 @@ bam <- function(formula,family=gaussian(),data=list(),weights=NULL,subset=NULL,n
warning("discretization only available with fREML")
} else {
if (!is.null(cluster)) warning("discrete method does not use parallel cluster - use nthreads instead")
if (nthreads>1 && !mgcv.omp()) warning("openMP not available: single threaded computation only")
if (is.finite(nthreads) && nthreads>1 && !mgcv.omp()) warning("openMP not available: single threaded computation only")
}
}
if (inherits(family,"extended.family")) {
......
......@@ -474,7 +474,7 @@ Sl.addS <- function(Sl,A,rho) {
ind <- (Sl[[b]]$start:Sl[[b]]$stop)[Sl[[b]]$ind]
if (length(Sl[[b]]$S)==1) { ## singleton
B <- exp(rho[k]);diag <- -1
er <- .Call(C_mgcv_madi,A,B,ind,diag)
dummy <- .Call(C_mgcv_madi,A,B,ind,diag)
## diag(A)[ind] <- diag(A)[ind] + exp(rho[k]) ## penalty is identity times sp
k <- k + 1
} else {
......
......@@ -2195,79 +2195,71 @@ fix.family.link.extended.family <- function(fam) {
if (link=="identity") {
fam$g2g <- fam$g3g <- fam$g4g <-
function(mu) rep.int(0,length(mu))
return(fam)
}
if (link == "log") {
} else if (link == "log") {
fam$g2g <- function(mu) rep(-1,length(mu))
fam$g3g <- function(mu) rep(2,length(mu))
fam$g4g <- function(mu) rep(-6,length(mu))
return(fam)
}
if (link == "inverse") {
} else if (link == "inverse") {
## g'(mu) = -1/mu^2
fam$g2g <- function(mu) 2*mu ## g'' = 2/mu^3
fam$g3g <- function(mu) 6*mu^2 ## g''' = -6/mu^4
fam$g4g <- function(mu) 24*mu^3 ## g'''' = 24/mu^5
return(fam)
}
if (link == "logit") {
} else if (link == "logit") {
## g = log(mu/(1-mu)) g' = 1/(1-mu) + 1/mu = 1/(mu*(1-mu))
fam$g2g <- function(mu) mu^2 - (1-mu)^2 ## g'' = 1/(1 - mu)^2 - 1/mu^2
fam$g3g <- function(mu) 2*mu^3 + 2*(1-mu)^3 ## g''' = 2/(1 - mu)^3 + 2/mu^3
fam$g4g <- function(mu) 6*mu^4 - 6*(1-mu)^4 ## g'''' = 6/(1-mu)^4 - 6/mu^4
return(fam)
}
if (link == "sqrt") {
} else if (link == "sqrt") {
## g = sqrt(mu); g' = .5*mu^-.5
fam$g2g <- function(mu) - mu^-.5 ## g'' = -.25 * mu^-1.5
fam$g3g <- function(mu) 3 * mu^-1 ## g''' = .375 * mu^-2.5
fam$g4g <- function(mu) -15 * mu^-1.5 ## -0.9375 * mu^-3.5
return(fam)
}
if (link == "probit") {
} else if (link == "probit") {
## g(mu) = qnorm(mu); 1/g' = dmu/deta = 1/dnorm(eta)
fam$g2g <- function(mu) {
eta <- fam$linkfun(mu)
#eta <- fam$linkfun(mu)
eta <- qnorm(mu)
## g'' = eta/fam$mu.eta(eta)^2
eta
}
fam$g3g <- function(mu) {
eta <- fam$linkfun(mu)
#eta <- fam$linkfun(mu)
eta <- qnorm(mu)
## g''' = (1 + 2*eta^2)/fam$mu.eta(eta)^3
(1 + 2*eta^2)
}
fam$g4g <- function(mu) {
eta <- fam$linkfun(mu)
#eta <- fam$linkfun(mu)
eta <- qnorm(mu)
## g'''' = (7*eta + 6*eta^3)/fam$mu.eta(eta)^4
(7*eta + 6*eta^3)
}
return(fam)
} ## probit
if (link == "cauchit") {
} else if (link == "cauchit") {
## uses general result that if link is a quantile function then
## d mu / d eta = f(eta) where f is the density. Link derivative
## is one over this... repeated differentiation w.r.t. mu using chain
## rule gives results...
fam$g2g <- function(mu) {
eta <- fam$linkfun(mu)
#eta <- fam$linkfun(mu)
eta <- qcauchy(mu)
## g'' = 2*pi*pi*eta*(1+eta*eta)
eta/(1+eta*eta)
}
fam$g3g <- function(mu) {
eta <- fam$linkfun(mu)
#eta <- fam$linkfun(mu)
eta <- qcauchy(mu)
eta2 <- eta*eta
## g''' = 2*pi*pi*pi*(1+3*eta2)*(1+eta2)
(1+3*eta2)/(1+eta2)^2
}
fam$g4g <- function(mu) {
eta <- fam$linkfun(mu)
#eta <- fam$linkfun(mu)
eta <- qcauchy(mu)
eta2 <- eta*eta
## g'''' = 2*pi^4*(8*eta+12*eta2*eta)*(1+eta2)
((8+ 12*eta2)/(1+eta2)^2)*(eta/(1+eta2))
}
return(fam)
} ## cauchit
if (link == "cloglog") {
} else if (link == "cloglog") {
## g = log(-log(1-mu)), g' = -1/(log(1-mu)*(1-mu))
fam$g2g <- function(mu) { l1m <- log1p(-mu)
-l1m - 1
......@@ -2279,9 +2271,10 @@ fix.family.link.extended.family <- function(fam) {
l1m <- log1p(-mu)
-l1m*(l1m*(6*l1m+11)+12)-6
}
return(fam)
}
stop("link not implemented for extended families")
} else stop("link not implemented for extended families")
## avoid storing the calling environment of fix.family.link...
environment(fam$g2g) <- environment(fam$g3g) <- environment(fam$g4g) <- environment(fam$linkfun)
return(fam)
} ## fix.family.link.extended.family
fix.family.link.family <- function(fam)
......@@ -2300,58 +2293,52 @@ fix.family.link.family <- function(fam)
}
if (!is.null(fam$d2link)&&!is.null(fam$d3link)&&!is.null(fam$d4link)) return(fam)
link <- fam$link
if (length(link)>1) if (fam$family=="quasi") # then it's a power link
{ lambda <- log(fam$linkfun(exp(1))) ## the power, if > 0
if (lambda<=0) { fam$d2link <- function(mu) -1/mu^2
fam$d3link <- function(mu) 2/mu^3
fam$d4link <- function(mu) -6/mu^4
}
else { fam$d2link <- function(mu) lambda*(lambda-1)*mu^(lambda-2)
fam$d3link <- function(mu) (lambda-2)*(lambda-1)*lambda*mu^(lambda-3)
fam$d4link <- function(mu) (lambda-3)*(lambda-2)*(lambda-1)*lambda*mu^(lambda-4)
}
return(fam)
} else stop("unrecognized (vector?) link")
if (link=="identity") {
if (length(link)>1) {
if (fam$family=="quasi") # then it's a power link
{ lambda <- log(fam$linkfun(exp(1))) ## the power, if > 0
if (lambda<=0) { fam$d2link <- function(mu) -1/mu^2
fam$d3link <- function(mu) 2/mu^3
fam$d4link <- function(mu) -6/mu^4
} else { fam$d2link <- function(mu) lambda*(lambda-1)*mu^(lambda-2)
fam$d3link <- function(mu) (lambda-2)*(lambda-1)*lambda*mu^(lambda-3)
fam$d4link <- function(mu) (lambda-3)*(lambda-2)*(lambda-1)*lambda*mu^(lambda-4)
}
} else stop("unrecognized (vector?) link")
} else if (link=="identity") {
fam$d4link <- fam$d3link <- fam$d2link <-
function(mu) rep.int(0,length(mu))
return(fam)
}
if (link == "log") {
} else if (link == "log") {
fam$d2link <- function(mu) -1/mu^2
fam$d3link <- function(mu) 2/mu^3
fam$d4link <- function(mu) -6/mu^4
return(fam)
}
if (link == "inverse") {
} else if (link == "inverse") {
fam$d2link <- function(mu) 2/mu^3
fam$d3link <- function(mu) { mu <- mu*mu;-6/(mu*mu)}
fam$d4link <- function(mu) { mu2 <- mu*mu;24/(mu2*mu2*mu)}
return(fam)
}
if (link == "logit") {
} else if (link == "logit") {
fam$d2link <- function(mu) 1/(1 - mu)^2 - 1/mu^2
fam$d3link <- function(mu) 2/(1 - mu)^3 + 2/mu^3
fam$d4link <- function(mu) 6/(1-mu)^4 - 6/mu^4
return(fam)
}
if (link == "probit") {
} else if (link == "probit") {
fam$d2link <- function(mu) {
eta <- fam$linkfun(mu)
eta/fam$mu.eta(eta)^2
#eta <- fam$linkfun(mu)
eta <- qnorm(mu)
#eta/fam$mu.eta(eta)^2
eta/pmax(dnorm(eta), .Machine$double.eps)^2
}
fam$d3link <- function(mu) {
eta <- fam$linkfun(mu)
(1 + 2*eta^2)/fam$mu.eta(eta)^3
#eta <- fam$linkfun(mu)
eta <- qnorm(mu)
#(1 + 2*eta^2)/fam$mu.eta(eta)^3
(1 + 2*eta^2)/pmax(dnorm(eta), .Machine$double.eps)^3
}
fam$d4link <- function(mu) {
eta <- fam$linkfun(mu)
(7*eta + 6*eta^3)/fam$mu.eta(eta)^4
#eta <- fam$linkfun(mu)
eta <- qnorm(mu)
#(7*eta + 6*eta^3)/fam$mu.eta(eta)^4
(7*eta + 6*eta^3)/pmax(dnorm(eta), .Machine$double.eps)^4
}
return(fam)
}
if (link == "cloglog") {
} else if (link == "cloglog") {
## g = log(-log(1-mu)), g' = -1/(log(1-mu)*(1-mu))
fam$d2link <- function(mu) { l1m <- log1p(-mu)
-1/((1 - mu)^2*l1m) *(1+ 1/l1m)
......@@ -2365,52 +2352,59 @@ fix.family.link.family <- function(fam)
mu4 <- (1-mu)^4
( - 12 - 11 * l1m - 6 * l1m^2 - 6/l1m )/mu4 /l1m^3
}
return(fam)
}
if (link == "sqrt") {
} else if (link == "sqrt") {
fam$d2link <- function(mu) -.25 * mu^-1.5
fam$d3link <- function(mu) .375 * mu^-2.5
fam$d4link <- function(mu) -0.9375 * mu^-3.5
return(fam)
}
if (link == "cauchit") {
} else if (link == "cauchit") {
## uses general result that if link is a quantile function then
## d mu / d eta = f(eta) where f is the density. Link derivative
## is one over this... repeated differentiation w.r.t. mu using chain
## rule gives results...
fam$d2link <- function(mu) {
eta <- fam$linkfun(mu)
#eta <- fam$linkfun(mu)
eta <- qcauchy(mu)
2*pi*pi*eta*(1+eta*eta)
}
fam$d3link <- function(mu) {
eta <- fam$linkfun(mu)
#eta <- fam$linkfun(mu)
eta <- qcauchy(mu)
eta2 <- eta*eta
2*pi*pi*pi*(1+3*eta2)*(1+eta2)
}
fam$d4link <- function(mu) {
eta <- fam$linkfun(mu)
#eta <- fam$linkfun(mu)
eta <- qcauchy(mu)
eta2 <- eta*eta
2*pi^4*(8*eta+12*eta2*eta)*(1+eta2)
}
return(fam)
}
if (link == "1/mu^2") {
} else if (link == "1/mu^2") {
fam$d2link <- function(mu) 6 * mu^-4
fam$d3link <- function(mu) -24 * mu^-5
fam$d4link <- function(mu) 120 * mu^-6
return(fam)
}
if (substr(link,1,3)=="mu^") { ## it's a power link
} else if (substr(link,1,3)=="mu^") { ## it's a power link
## note that lambda <=0 gives log link so don't end up here
lambda <- get("lambda",environment(fam$linkfun))
fam$d2link <- function(mu) (lambda*(lambda-1)) * mu^{lambda-2}
fam$d3link <- function(mu) (lambda*(lambda-1)*(lambda-2)) * mu^{lambda-3}
fam$d4link <- function(mu) (lambda*(lambda-1)*(lambda-2)*(lambda-3)) * mu^{lambda-4}
return(fam)
}
stop("link not recognised")
} else stop("link not recognised")
## avoid giant environments being stored....
environment(fam$d2link) <- environment(fam$d3link) <- environment(fam$d4link) <- environment(fam$linkfun)
return(fam)
} ## fix.family.link.family
## NOTE: something horrible can happen here. The way method dispatch works, the
## environment attached to functions created in fix.family.link is the environment
## from which fix.family.link was called - and this whole environment is stored
## with the created function - in the gam context that means the model matrix is
## stored invisibly away for no useful purpose at all. pryr:::object_size will
## show the true stored size of an object with hidden environments. But environments
## of functions created in method functions should be set explicitly to something
## harmless (see ?environment for some possibilities, empty is rarely a good idea)
## 9/2017
fix.family.link <- function(fam) UseMethod("fix.family.link")
fix.family.var <- function(fam)
......@@ -2424,28 +2418,20 @@ fix.family.var <- function(fam)
fam$scale <- -1
if (family=="gaussian") {
fam$d3var <- fam$d2var <- fam$dvar <- function(mu) rep.int(0,length(mu))
return(fam)
}
if (family=="poisson"||family=="quasipoisson") {
} else if (family=="poisson"||family=="quasipoisson") {
fam$dvar <- function(mu) rep.int(1,length(mu))
fam$d3var <- fam$d2var <- function(mu) rep.int(0,length(mu))
if (family=="poisson") fam$scale <- 1
return(fam)
}
if (family=="binomial"||family=="quasibinomial") {
} else if (family=="binomial"||family=="quasibinomial") {
fam$dvar <- function(mu) 1-2*mu
fam$d2var <- function(mu) rep.int(-2,length(mu))
fam$d3var <- function(mu) rep.int(0,length(mu))
if (family=="binomial") fam$scale <- 1
return(fam)
}
if (family=="Gamma") {
} else if (family=="Gamma") {
fam$dvar <- function(mu) 2*mu
fam$d2var <- function(mu) rep.int(2,length(mu))
fam$d3var <- function(mu) rep.int(0,length(mu))
return(fam)
}
if (family=="quasi") {
} else if (family=="quasi") {
fam$dvar <- switch(fam$varfun,
constant = function(mu) rep.int(0,length(mu)),
"mu(1-mu)" = function(mu) 1-2*mu,
......@@ -2468,15 +2454,13 @@ fix.family.var <- function(fam)
"mu^2" = function(mu) rep.int(0,length(mu)),
"mu^3" = function(mu) rep.int(6,length(mu))
)
return(fam)
}
if (family=="inverse.gaussian") {
} else if (family=="inverse.gaussian") {
fam$dvar <- function(mu) 3*mu^2
fam$d2var <- function(mu) 6*mu
fam$d3var <- function(mu) rep.int(6,length(mu))
return(fam)
}
stop("family not recognised")
} else stop("family not recognised")
environment(fam$dvar) <- environment(fam$d2var) <- environment(fam$d3var) <- environment(fam$linkfun)
return(fam)
} ## fix.family.var
......@@ -2491,23 +2475,17 @@ fix.family.ls<-function(fam)
nobs <- sum(w>0)
c(-nobs*log(2*pi*scale)/2 + sum(log(w[w>0]))/2,-nobs/(2*scale),nobs/(2*scale*scale))
}
return(fam)
}
if (family=="poisson") {
} else if (family=="poisson") {
fam$ls <- function(y,w,n,scale) {
res <- rep(0,3)
res[1] <- sum(dpois(y,y,log=TRUE)*w)
res
}
return(fam)
}
if (family=="binomial") {
} else if (family=="binomial") {
fam$ls <- function(y,w,n,scale) {
c(-binomial()$aic(y,n,y,w,0)/2,0,0)
}
return(fam)
}
if (family=="Gamma") {
} else if (family=="Gamma") {
fam$ls <- function(y,w,n,scale) {
res <- rep(0,3)
y <- y[w>0];w <- w[w>0]
......@@ -2520,26 +2498,22 @@ fix.family.ls<-function(fam)
res[3] <- sum(k/w^2)
res
}
return(fam)
}
if (family=="quasi"||family=="quasipoisson"||family=="quasibinomial") {
} else if (family=="quasi"||family=="quasipoisson"||family=="quasibinomial") {
## fam$ls <- function(y,w,n,scale) rep(0,3)
## Uses extended quasi-likelihood form...
fam$ls <- function(y,w,n,scale) {
nobs <- sum(w>0)
c(-nobs*log(scale)/2 + sum(log(w[w>0]))/2,-nobs/(2*scale),nobs/(2*scale*scale))
}
return(fam)
}
if (family=="inverse.gaussian") {
} else if (family=="inverse.gaussian") {
fam$ls <- function(y,w,n,scale) {
nobs <- sum(w>0)
c(-sum(log(2*pi*scale*y^3))/2 + sum(log(w[w>0]))/2,-nobs/(2*scale),nobs/(2*scale*scale))
## c(-sum(w*log(2*pi*scale*y^3))/2,-sum(w)/(2*scale),sum(w)/(2*scale*scale))
}
return(fam)
}
stop("family not recognised")
} else stop("family not recognised")
environment(fam$ls) <- environment(fam$linkfun)
return(fam)
} ## fix.family.ls
fix.family <- function(fam) {
......
......@@ -570,27 +570,7 @@ gam.fit4 <- function(x, y, sp, Eb,UrS=list(),
rSncol=as.integer(rSncol),deriv=as.integer(deriv),
fixed.penalty = as.integer(rp$fixed.penalty),nt=as.integer(control$nthreads),
type=as.integer(gdi.type),dVkk=as.double(rep(0,nSp^2)))
## test code used to ensure type 0 and type 1 produce identical results, when both should work.
# oot <- .C(C_gdi2,
# X=as.double(x[good,]),E=as.double(Sr),Es=as.double(Eb),rS=as.double(unlist(rS)),
# U1 = as.double(U1),sp=as.double(exp(sp)),theta=as.double(theta),
# z=as.double(z),w=as.double(w),wz=as.double(wz),wf=as.double(wf),Dth=as.double(dd$Dth),
# Det=as.double(dd$Deta),
# Det2=as.double(dd$Deta2),Dth2=as.double(dd$Dth2),Det.th=as.double(dd$Detath),
# Det2.th=as.double(dd$Deta2th),Det3=as.double(dd$Deta3),Det.th2 = as.double(dd$Detath2),
# Det4 = as.double(dd$Deta4),Det3.th=as.double(dd$Deta3th), Deta2.th2=as.double(dd$Deta2th2),
# beta=as.double(coef),b1=as.double(rep(0,ntot*ncol(x))),w1=rep(0,ntot*length(z)),
# D1=as.double(rep(0,ntot)),D2=as.double(rep(0,ntot^2)),
# P=as.double(0),P1=as.double(rep(0,ntot)),P2 = as.double(rep(0,ntot^2)),
# ldet=as.double(1-2*(scoreType=="ML")),ldet1 = as.double(rep(0,ntot)),
# ldet2 = as.double(rep(0,ntot^2)),
# rV=as.double(rep(0,ncol(x)^2)),
# rank.tol=as.double(.Machine$double.eps^.75),rank.est=as.integer(0),
# n=as.integer(sum(good)),q=as.integer(ncol(x)),M=as.integer(nSp),
# n.theta=as.integer(length(theta)), Mp=as.integer(Mp),Enrow=as.integer(rows.E),
# rSncol=as.integer(rSncol),deriv=as.integer(deriv),
# fixed.penalty = as.integer(rp$fixed.penalty),nt=as.integer(control$nthreads),
# type=as.integer(1))
rV <- matrix(oo$rV,ncol(x),ncol(x)) ## rV%*%t(rV)*scale gives covariance matrix
rV <- T %*% rV
## derivatives of coefs w.r.t. sps etc...
......
......@@ -97,8 +97,7 @@ pcls <- function(M)
# M$sp - array of theta_i's
# Ain, bin and p are not in the object needed to call mgcv....
#
{ nar<-c(length(M$y),length(M$p),dim(M$Ain)[1],dim(M$C)[1],0)
H<-0
{ nar<-c(length(M$y),length(M$p),dim(M$Ain)[1],dim(M$C)[1])
## sanity checking ...
if (nrow(M$X)!=nar[1]) stop("nrow(M$X) != length(M$y)")
if (ncol(M$X)!=nar[2]) stop("ncol(M$X) != length(M$p)")
......@@ -140,13 +139,12 @@ pcls <- function(M)
M$C <- matrix(0,0,0)
M$p <- rep(0,ncol(M$X))
nar[2] <- length(M$p)
nar[4] <- 0
qra.exist <- TRUE
if (ncol(M$X)>nrow(M$X)) stop("Model matrix not full column rank")
}
}
o<-.C(C_RPCLS,as.double(M$X),as.double(M$p),as.double(M$y),as.double(M$w),as.double(M$Ain),as.double(M$bin)
,as.double(M$C),as.double(H),as.double(Sa),as.integer(M$off),as.integer(df),as.double(M$sp),
,as.double(M$C),as.double(Sa),as.integer(M$off),as.integer(df),as.double(M$sp),
as.integer(length(M$off)),as.integer(nar))
p <- array(o[[2]],length(M$p))
if (qra.exist) p <- qr.qy(qra,c(rep(0,j),p))
......@@ -2679,6 +2677,8 @@ predict.gam <- function(object,newdata,type="link",se.fit=FALSE,terms=NULL,exclu
nb <- length(object$coefficients)
}
if (type=="lpmatrix") block.size <- NULL ## nothing gained by blocking in this case - and offset handling easier this way
## split prediction into blocks, to avoid running out of memory
if (is.null(block.size)) {
## use one block as predicting using model frame
......
......@@ -21,7 +21,7 @@ bam(formula,family=gaussian(),data=list(),weights=NULL,subset=NULL,
na.action=na.omit, offset=NULL,method="fREML",control=list(),
select=FALSE,scale=0,gamma=1,knots=NULL,sp=NULL,min.sp=NULL,
paraPen=NULL,chunk.size=10000,rho=0,AR.start=NULL,discrete=FALSE,
cluster=NULL,nthreads=NA,gc.level=1,use.chol=FALSE,samfrac=1,
cluster=NULL,nthreads=1,gc.level=1,use.chol=FALSE,samfrac=1,
coef=NULL,drop.unused.levels=TRUE,G=NULL,fit=TRUE,drop.intercept=NULL,...)
}
%- maybe also `usage' for other objects documented here.
......@@ -135,7 +135,7 @@ single machine). See details and example code.
}
\item{nthreads}{Number of threads to use for non-cluster computation (e.g. combining results from cluster nodes).
if \code{NA} set to \code{max(1,length(cluster))}.}
If \code{NA} set to \code{max(1,length(cluster))}.}
\item{gc.level}{to keep the memory footprint down, it helps to call the garbage collector often, but this takes
a substatial amount of time. Setting this to zero means that garbage collection only happens when R decides it should. Setting to 2 gives frequent garbage collection. 1 is in between.}
......
......@@ -7,9 +7,9 @@
\description{\code{\link{gam}} can use isotropic smooths of any number of variables, specified via terms like
\code{s(x,z,bs="tp",m=3)} (or just \code{s(x,z)} as this is the default basis). These terms are based on thin plate
regression splines. \code{m} specifies the order of the derivatives in the thin plate spline penalty. If \code{m} is
a vector of length 2 and the second element is zero, then the penalty null space of the smooth is not included in
the smooth: this is useful if you need to test whether a smooth could be replaced by a linear term, for example.
regression splines. \code{m} specifies the order of the derivatives in the thin plate spline penalty.
If \code{m} is a vector of length 2 and the second element is zero, then the penalty null space of the smooth is not included in the smooth: this is useful if you need to test whether a smooth could be replaced by a linear term, or construct models with odd nesting structures.
Thin plate regression splines are constructed by starting with the
basis and penalty for a full thin plate spline and then truncating this basis in
......@@ -117,11 +117,23 @@ Wood, S.N. (2003) Thin plate regression splines. J.R.Statist.Soc.B 65(1):95-114
\examples{
require(mgcv); n <- 100; set.seed(2)
x <- runif(n); y <- x + x^2*.2 + rnorm(n) *.1
## is smooth significantly different from straight line?
summary(gam(y~s(x,m=c(2,0))+x,method="REML")) ## not quite
## is smooth significatly different from zero?
summary(gam(y~s(x),method="REML")) ## yes!
## Fool bam(...,discrete=TRUE) into (strange) nested
## model fit...
set.seed(2) ## simulate some data...
dat <- gamSim(1,n=400,dist="normal",scale=2)