There are multiple way/commands/functions for computing residuals in R when fitting glms. The one that I think that give ajusted standard residuals is rstandard(model, type=). To make sure, I wrote a function to compute these residuals so that I can compare results.

Pearson residuals are asymptotically \(\sim N(0,1)\); whereas, the ajusted standardized (Haberman) residuals are \(\sim N(0,1)\) and thus preferable.

See bottom of page 81 of Agresti (2019). An Introduction to Categorical Data Analysis, 3rd edition. Since I couldn’t find an R function to compute these, I wrote a small and easy to use function. The function is below and example usage.

We’ll the gss data from lecture 2 (on inference for two-way tables).

The Function for Haberman & Pearson Residuals

Input to the function is the model object and the dim equals the number of variables of the table (i.e., table dimension). The output will be a data frame with your original data, fitted values, h, pearson residuals and haberman residuals.

This function that I wrote should work on 2-way and higher-way tables. I also used this to check existing function to make sure that it give adjusted Standardized (Haberman) residualts.

stdresiduals <- function(object, dim){
  y <- object$data[, dim+1]
  h = hatvalues(object) 
  haberman <- (y - object$fitted.values) / sqrt(object$fitted.values*(1-h))
  pearson <-  (y - object$fitted.values)/ sqrt(object$fitted.values)
  results <- as.data.frame(cbind(object$data, object$fitted.values, h, pearson, haberman))
  return(results)
}

Example Usage

A useful function that we’ll use for viewing residuals (i.e., xtabs)

library(vcd)   # so that I can use xtabs
## Loading required package: grid

Fit A model

The Gss data from lecture notes:

( gss <- read.table("C:\\Users\\cja\\Dropbox\\edps 589\\2 Chi-square\\gss_data.txt", header=TRUE) )
##    fechld mapaid count
## 1       1      1    97
## 2       1      2    96
## 3       1      3    22
## 4       1      4    17
## 5       1      5     2
## 6       2      1   102
## 7       2      2   199
## 8       2      3    48
## 9       2      4    38
## 10      2      5     5
## 11      3      1    42
## 12      3      2   102
## 13      3      3    25
## 14      3      4    36
## 15      3      5     7
## 16      4      1     9
## 17      4      2    18
## 18      4      3     7
## 19      4      4    10
## 20      4      5     2

To correspond to lecture notes, fit a poisson regression model of independence (additive effects):

# -- Need factors
fechld.f <- as.factor(gss$fechld)
mapaid.f <- as.factor(gss$mapaid)
# -- Fit model of independence
glm0 <- glm(count ~ fechld.f + mapaid.f, data=gss, family=poisson)

Use the function:

myresiduals <- stdresiduals(glm0, 2)

round(xtabs(pearson ~ fechld + mapaid, data=myresiduals), digits=3)
##       mapaid
## fechld      1      2      3      4      5
##      1  3.789 -1.322 -0.962 -1.883 -1.086
##      2 -0.841  1.104  0.412 -1.014 -0.787
##      3 -2.319  0.248  0.109  2.393  1.615
##      4 -1.112 -0.774  0.735  2.069  1.279
round(xtabs(haberman ~ fechld + mapaid, data=myresiduals), digits=3)
##       mapaid
## fechld      1      2      3      4      5
##      1  5.218 -2.116 -1.193 -2.333 -1.278
##      2 -1.332  2.031  0.587 -1.444 -1.064
##      3 -3.140  0.391  0.133  2.917  1.869
##      4 -1.348 -1.091  0.802  2.258  1.326

Check rstandard(model, type=“pearson”)

gss$rstd <- rstandard(glm0, type="pearson")  
# -- in table form sometimes help to spot patterns and useful for interpretation
round(xtabs(rstd ~ fechld + mapaid, data=gss), digits=3)
##       mapaid
## fechld      1      2      3      4      5
##      1  5.218 -2.116 -1.193 -2.333 -1.278
##      2 -1.332  2.031  0.587 -1.444 -1.064
##      3 -3.140  0.391  0.133  2.917  1.869
##      4 -1.348 -1.091  0.802  2.258  1.326