There are multiple way/commands/functions for computing residuals in R when fitting glms. The one that I think that give ajusted standard residuals is rstandard(model, type=). To make sure, I wrote a function to compute these residuals so that I can compare results.
Pearson residuals are asymptotically \(\sim N(0,1)\); whereas, the ajusted standardized (Haberman) residuals are \(\sim N(0,1)\) and thus preferable.
See bottom of page 81 of Agresti (2019). An Introduction to Categorical Data Analysis, 3rd edition. Since I couldn’t find an R function to compute these, I wrote a small and easy to use function. The function is below and example usage.
We’ll the gss data from lecture 2 (on inference for two-way tables).
Input to the function is the model object and the dim equals the number of variables of the table (i.e., table dimension). The output will be a data frame with your original data, fitted values, h, pearson residuals and haberman residuals.
This function that I wrote should work on 2-way and higher-way tables. I also used this to check existing function to make sure that it give adjusted Standardized (Haberman) residualts.
stdresiduals <- function(object, dim){
y <- object$data[, dim+1]
h = hatvalues(object)
haberman <- (y - object$fitted.values) / sqrt(object$fitted.values*(1-h))
pearson <- (y - object$fitted.values)/ sqrt(object$fitted.values)
results <- as.data.frame(cbind(object$data, object$fitted.values, h, pearson, haberman))
return(results)
}
A useful function that we’ll use for viewing residuals (i.e., xtabs)
library(vcd) # so that I can use xtabs
## Loading required package: grid
The Gss data from lecture notes:
( gss <- read.table("C:\\Users\\cja\\Dropbox\\edps 589\\2 Chi-square\\gss_data.txt", header=TRUE) )
## fechld mapaid count
## 1 1 1 97
## 2 1 2 96
## 3 1 3 22
## 4 1 4 17
## 5 1 5 2
## 6 2 1 102
## 7 2 2 199
## 8 2 3 48
## 9 2 4 38
## 10 2 5 5
## 11 3 1 42
## 12 3 2 102
## 13 3 3 25
## 14 3 4 36
## 15 3 5 7
## 16 4 1 9
## 17 4 2 18
## 18 4 3 7
## 19 4 4 10
## 20 4 5 2
To correspond to lecture notes, fit a poisson regression model of independence (additive effects):
# -- Need factors
fechld.f <- as.factor(gss$fechld)
mapaid.f <- as.factor(gss$mapaid)
# -- Fit model of independence
glm0 <- glm(count ~ fechld.f + mapaid.f, data=gss, family=poisson)
myresiduals <- stdresiduals(glm0, 2)
round(xtabs(pearson ~ fechld + mapaid, data=myresiduals), digits=3)
## mapaid
## fechld 1 2 3 4 5
## 1 3.789 -1.322 -0.962 -1.883 -1.086
## 2 -0.841 1.104 0.412 -1.014 -0.787
## 3 -2.319 0.248 0.109 2.393 1.615
## 4 -1.112 -0.774 0.735 2.069 1.279
round(xtabs(haberman ~ fechld + mapaid, data=myresiduals), digits=3)
## mapaid
## fechld 1 2 3 4 5
## 1 5.218 -2.116 -1.193 -2.333 -1.278
## 2 -1.332 2.031 0.587 -1.444 -1.064
## 3 -3.140 0.391 0.133 2.917 1.869
## 4 -1.348 -1.091 0.802 2.258 1.326
gss$rstd <- rstandard(glm0, type="pearson")
# -- in table form sometimes help to spot patterns and useful for interpretation
round(xtabs(rstd ~ fechld + mapaid, data=gss), digits=3)
## mapaid
## fechld 1 2 3 4 5
## 1 5.218 -2.116 -1.193 -2.333 -1.278
## 2 -1.332 2.031 0.587 -1.444 -1.064
## 3 -3.140 0.391 0.133 2.917 1.869
## 4 -1.348 -1.091 0.802 2.258 1.326