The R in this document reproduces the results in the lecture on multilevel logistic regression. The data are from Skrondal & Rabe-Hesketh (2004), which were also analyzed by Zeger & Karim (1991), Diggle et al. (2002), but originally from Sommer et al. (1983).

Setup

The package that we’ll use

library(lme4)
library(lmerTest)
library(gee)    # for generalized estimating equations
library(MASS)   # For confidence limits for parameters

And read in the data

resp <- read.table(file="D:/Dropbox/edps587/lectures/10 logreg/respitory.txt", header=TRUE)

head(resp)
##       id resp cons age xero cosine sine female height stunted time age1 season
## 1 121013    0    1  31    0     -1    0      0     -3       0    1   31      2
## 2 121013    0    1  34    0      0   -1      0     -3       0    2   31      3
## 3 121013    0    1  37    0      1    0      0     -2       0    3   31      4
## 4 121013    0    1  40    0      0    1      0     -2       0    4   31      1
## 5 121013    1    1  43    0     -1    0      0     -2       0    5   31      2
## 6 121013    0    1  46    0      0   -1      0     -3       0    6   31      3
##   time2
## 1     1
## 2     4
## 3     9
## 4    16
## 5    25
## 6    36

Logistic regression

This is the example shown on page 26, except that in notes I give Wald

# glm = Generalized Linear model
model1 <- glm(resp ~ 1 + age, data=resp, family=binomial)
summary(model1)
## 
## Call:
## glm(formula = resp ~ 1 + age, family = binomial, data = resp)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.6204  -0.4988  -0.3940  -0.3135   2.6238  
## 
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -2.343573   0.105297 -22.257  < 2e-16 ***
## age         -0.024795   0.005558  -4.461 8.15e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 721.45  on 1199  degrees of freedom
## Residual deviance: 699.69  on 1198  degrees of freedom
## AIC: 703.69
## 
## Number of Fisher Scoring iterations: 5

Note that z^2 above equals Wald statistics in the lecture notes. These are compared to N(0,1) to get the p-value.

The following puts 95% confidence levels on parameters (i.e., the coefficients or beta’s)

round(confint(model1, level = 0.95),digits=2)
## Waiting for profiling to be done...
##             2.5 % 97.5 %
## (Intercept) -2.56  -2.14
## age         -0.04  -0.01

For interpetation of the results

odds <- exp(coefficients(model1))
round(odds,digits=2)
## (Intercept)         age 
##        0.10        0.98
odds2 <- 1/odds
round(odds2,digits=2) 
## (Intercept)         age 
##       10.42        1.03

Out of curiosity and as a check on whether variance function of mean (i.e., binomial distribution OK).

model1b <- glm(resp ~ 1 + age, data=resp, family=quasibinomial)
summary(model1b)
## 
## Call:
## glm(formula = resp ~ 1 + age, family = quasibinomial, data = resp)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.6204  -0.4988  -0.3940  -0.3135   2.6238  
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -2.343573   0.103890 -22.558  < 2e-16 ***
## age         -0.024795   0.005484  -4.522 6.75e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for quasibinomial family taken to be 0.9734542)
## 
##     Null deviance: 721.45  on 1199  degrees of freedom
## Residual deviance: 699.69  on 1198  degrees of freedom
## AIC: NA
## 
## Number of Fisher Scoring iterations: 5

Since the dispersion paramters is is close to 1, binomial is probably just fine for these data.

Generalized Estimating Equations

This yields a population averge model

model.gee <- gee(resp ~1 + age, id, data=resp,family=binomial,corstr="exchangeable")
## Beginning Cgee S-function, @(#) geeformula.q 4.13 98/01/27
## running glm to get initial regression estimate
## (Intercept)         age 
## -2.34357294 -0.02479512
summary(model.gee)
## 
##  GEE:  GENERALIZED LINEAR MODELS FOR DEPENDENT DATA
##  gee S-function, version 4.13 modified 98/01/27 (1998) 
## 
## Model:
##  Link:                      Logit 
##  Variance to Mean Relation: Binomial 
##  Correlation Structure:     Exchangeable 
## 
## Call:
## gee(formula = resp ~ 1 + age, id = id, data = resp, family = binomial, 
##     corstr = "exchangeable")
## 
## Summary of Residuals:
##         Min          1Q      Median          3Q         Max 
## -0.17380518 -0.11711642 -0.07547880 -0.04895995  0.96704630 
## 
## 
## Coefficients:
##                Estimate  Naive S.E.    Naive z Robust S.E.   Robust z
## (Intercept) -2.33553443 0.112748751 -20.714504 0.113371780 -20.600668
## age         -0.02426996 0.005886584  -4.122928 0.005143234  -4.718814
## 
## Estimated Scale Parameter:  0.9661839
## Number of Iterations:  2
## 
## Working Correlation
##            [,1]       [,2]       [,3]       [,4]       [,5]       [,6]
## [1,] 1.00000000 0.04999101 0.04999101 0.04999101 0.04999101 0.04999101
## [2,] 0.04999101 1.00000000 0.04999101 0.04999101 0.04999101 0.04999101
## [3,] 0.04999101 0.04999101 1.00000000 0.04999101 0.04999101 0.04999101
## [4,] 0.04999101 0.04999101 0.04999101 1.00000000 0.04999101 0.04999101
## [5,] 0.04999101 0.04999101 0.04999101 0.04999101 1.00000000 0.04999101
## [6,] 0.04999101 0.04999101 0.04999101 0.04999101 0.04999101 1.00000000

Random effects logistic regression via glmer

The command glmer is an option in the lme4 package.

LaPlace

Now for the multilevel model or random effects model using the default estimation method (i.e., LaPlace)

mod1.laplace <- glmer(resp ~ 1 + age  + (1 | id), data=resp, family=binomial)
summary(mod1.laplace)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: resp ~ 1 + age + (1 | id)
##    Data: resp
## 
##      AIC      BIC   logLik deviance df.resid 
##    696.7    711.9   -345.3    690.7     1197 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -0.8165 -0.2981 -0.2309 -0.1770  4.8868 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  id     (Intercept) 0.8949   0.946   
## Number of obs: 1200, groups:  id, 275
## 
## Fixed effects:
##              Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -2.676724   0.184338 -14.521  < 2e-16 ***
## age         -0.026678   0.006735  -3.961 7.47e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##     (Intr)
## age 0.163

MLE: quass quadrature

If you want MLE, glmer uses Guass quadrature — MLE gold standard. glmer will use adaptive guass quadrature and the number of quadrature points is indicated by nAGQ. These results are reported ~ page 50,

mod1.quad <- glmer(resp ~ 1 + age +  (1 | id),data=resp, family=binomial, nAGQ=10 )
summary(mod1.quad)
## Generalized linear mixed model fit by maximum likelihood (Adaptive
##   Gauss-Hermite Quadrature, nAGQ = 10) [glmerMod]
##  Family: binomial  ( logit )
## Formula: resp ~ 1 + age + (1 | id)
##    Data: resp
## 
##      AIC      BIC   logLik deviance df.resid 
##    698.6    713.8   -346.3    692.6     1197 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -0.7688 -0.3095 -0.2392 -0.1847  5.0157 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  id     (Intercept) 0.7273   0.8528  
## Number of obs: 1200, groups:  id, 275
## 
## Fixed effects:
##              Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -2.612985   0.172299 -15.165  < 2e-16 ***
## age         -0.026760   0.006592  -4.059 4.92e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##     (Intr)
## age 0.195

Computing the confidence intervals take a bit of time. The following gives 95% profile confidence intervals for all parameters

round(confint(mod1.quad),digits=4)
## Computing profile confidence intervals ...
##               2.5 %  97.5 %
## .sig01       0.3993  1.2727
## (Intercept) -2.9887 -2.3059
## age         -0.0403 -0.0142

For interpretations, we use the odds

odds <- exp(fixef(mod1.quad))
round(odds,digits=2)
## (Intercept)         age 
##        0.07        0.97

Extract R script from this document

This is not run here, but you can enter it into your console with correct path and file name.

setwd("D:/Dropbox/edps587/lectures/10 logreg")
knitr::purl("Rmarkdown_respitory_example.Rmd","R_script_respitory_example.txt")