Stata Press books

Generalized Linear Models and Extensions, Third Edition

James W. Hardin and Joseph M. Hilbe
Copyright 2012
ISBN-13: 978-1-59718-105-1
Pages 455; paperback
Price $58.00
See the back cover
Table of contents
Preface (pdf)
Author index (pdf)
Subject index (pdf)
Other supplementary materials provided by the authors
Download the datasets used in the book
Download the brochure (pdf)

Comment from the Stata Technical group

Generalized linear models (GLMs) extend linear regression to models with a non-Gaussian, or even discrete, response. GLM theory is predicated on the exponential family of distributions—a class so rich that it includes the commonly used logit, probit, and Poisson models. Although one can fit these models in Stata by using specialized commands (for example, logit for logit models), fitting them as GLMs with Stata’s glm command offers some advantages. For example, model diagnostics may be calculated and interpreted similarly regardless of the assumed distribution.
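To make the equivalence concrete, here is a minimal sketch of fitting the same binary-response model both ways; the variables y (a 0/1 outcome) and x1, x2 are hypothetical placeholders, not from the book's datasets:

```stata
* Fit a logit model with the specialized command
logit y x1 x2

* Fit the same model as a GLM: binomial family, logit link
* (point estimates and standard errors match the logit output)
glm y x1 x2, family(binomial) link(logit)
```

Because glm reports GLM-wide quantities such as the deviance and Pearson statistics regardless of family, the same diagnostic workflow carries over when the family is changed.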

This text thoroughly covers GLMs, both theoretically and computationally, with an emphasis on Stata. The theory consists of showing how the various GLMs are special cases of the exponential family, establishing general properties of this family of distributions, and deriving maximum likelihood (ML) estimators and standard errors. Hardin and Hilbe show how iteratively reweighted least squares, another method of parameter estimation, is a consequence of ML estimation using Fisher scoring. The authors also discuss different methods of estimating standard errors, including robust methods, robust methods with clustering, Newey–West, outer product of the gradient, bootstrap, and jackknife. The thorough coverage of model diagnostics includes measures of influence such as Cook’s distance, several forms of residuals, the Akaike and Bayesian information criteria, and various R2-type measures of explained variability.
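Several of the variance estimators discussed correspond directly to vce() options of Stata's glm command. A sketch, again using hypothetical variables y, x1, x2 and a hypothetical cluster identifier id:

```stata
* Poisson GLM with conventional (Hessian-based) standard errors
glm y x1 x2, family(poisson) link(log)

* Robust (sandwich) standard errors
glm y x1 x2, family(poisson) link(log) vce(robust)

* Robust standard errors allowing for within-cluster correlation
glm y x1 x2, family(poisson) link(log) vce(cluster id)

* Resampling-based variance estimates
glm y x1 x2, family(poisson) link(log) vce(bootstrap)
glm y x1 x2, family(poisson) link(log) vce(jackknife)
```

The point estimates are identical across these fits; only the estimated variance matrix, and hence the reported standard errors, changes.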

After presenting general theory, Hardin and Hilbe then break down each distribution. Each distribution has its own chapter that explains the computational details of applying the general theory to that particular distribution. Pseudocode plays a valuable role here, because it lets the authors describe computational algorithms relatively simply. Devoting an entire chapter to each distribution (or family, in GLM terms) also allows for the inclusion of real-data examples showing how Stata fits such models, as well as the presentation of certain diagnostics and analytical strategies that are unique to that family. The chapters on binary data and on count (Poisson) data are excellent in this regard. Hardin and Hilbe give ample attention to the problems of overdispersion and zero inflation in count-data models.
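A common first check for overdispersion in a count model is to compare the Pearson dispersion statistic with 1 and, if it is markedly larger, refit with a negative binomial family. A minimal sketch with hypothetical variables (the stored-result name e(dispers_p) and the family(nbinomial ml) syntax are as I recall them from Stata's glm; consult the glm documentation to confirm):

```stata
* Fit a Poisson GLM and inspect the Pearson dispersion statistic
glm y x1 x2, family(poisson) link(log)
display e(dispers_p)    // values well above 1 suggest overdispersion

* Refit as negative binomial, estimating the ancillary parameter by ML
glm y x1 x2, family(nbinomial ml) link(log)
```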

The final part of the text concerns extensions of GLMs, which come in three forms. First, the authors cover multinomial responses, both ordered and unordered. Although multinomial responses are not strictly a part of GLM, the theory is similar in that one can think of a multinomial response as an extension of a binary response. The examples presented in these chapters often use the authors’ own Stata programs, augmenting official Stata’s capabilities. Second, GLMs may be extended to clustered data through generalized estimating equations (GEEs), and one chapter covers GEE theory and examples. Finally, GLMs may be extended by programming one’s own family and link functions for use with Stata’s official glm command, and the authors detail this process.
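The GEE extension corresponds to Stata's xtgee command. A minimal sketch of a population-averaged logit model for clustered binary data, with a hypothetical cluster identifier id:

```stata
* Declare the panel/cluster structure
xtset id

* Logit GEE with an exchangeable within-cluster correlation structure
xtgee y x1 x2, family(binomial) link(logit) corr(exchangeable) vce(robust)
```

The family() and link() options mirror those of glm; corr() specifies the assumed working correlation within clusters.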

In addition to other enhancements—for example, a new section on marginal effects—the third edition contains several new extended GLMs, giving Stata users new ways to capture the complexity of count data. New count models include a three-parameter negative binomial known as NB-P, Poisson inverse Gaussian (PIG), zero-inflated generalized Poisson (ZIGP), a rewritten generalized Poisson, two- and three-component finite mixture models, and generalized censored Poisson and negative binomial models. This edition has a new chapter on simulation and data synthesis, and it also shows how to construct a wide variety of synthetic and Monte Carlo models throughout the book.

List of tables
List of figures
1 Introduction
1.1 Origins and motivation
1.2 Notational conventions
1.3 Applied or theoretical?
1.4 Road map
1.5 Installing the support materials
I Foundations of Generalized Linear Models
2 GLMs
2.1 Components
2.2 Assumptions
2.3 Exponential family
2.4 Example: Using an offset in a GLM
2.5 Summary
3 GLM estimation algorithms
3.1 Newton–Raphson (using the observed Hessian)
3.2 Starting values for Newton–Raphson
3.3 IRLS (using the expected Hessian)
3.4 Starting values for IRLS
3.5 Goodness of fit
3.6 Estimated variance matrices
3.6.1 Hessian
3.6.2 Outer product of the gradient
3.6.3 Sandwich
3.6.4 Modified sandwich
3.6.5 Unbiased sandwich
3.6.6 Modified unbiased sandwich
3.6.7 Weighted sandwich: Newey–West
3.6.8 Jackknife
  Usual jackknife
  One-step jackknife
  Weighted jackknife
  Variable jackknife
3.6.9 Bootstrap
  Usual bootstrap
  Grouped bootstrap
3.7 Estimation algorithms
3.8 Summary
4 Analysis of fit
4.1 Deviance
4.2 Diagnostics
4.2.1 Cook’s distance
4.2.2 Overdispersion
4.3 Assessing the link function
4.4 Residual analysis
4.4.1 Response residuals
4.4.2 Working residuals
4.4.3 Pearson residuals
4.4.4 Partial residuals
4.4.5 Anscombe residuals
4.4.6 Deviance residuals
4.4.7 Adjusted deviance residuals
4.4.8 Likelihood residuals
4.4.9 Score residuals
4.5 Checks for systematic departure from the model
4.6 Model statistics
4.6.1 Criterion measures
  AIC
  BIC
4.6.2 The interpretation of R2 in linear regression
  Percentage variance explained
  The ratio of variances
  A transformation of the likelihood ratio
  A transformation of the F test
  Squared correlation
4.6.3 Generalizations of linear regression R2 interpretations
  Efron’s pseudo-R2
  McFadden’s likelihood-ratio index
  Ben-Akiva and Lerman adjusted likelihood-ratio index
  McKelvey and Zavoina ratio of variances
  Transformation of likelihood ratio
  Cragg and Uhler normed measure
4.6.4 More R2 measures
  The count R2
  The adjusted count R2
  Veall and Zimmermann R2
  Cameron–Windmeijer R2
4.7 Marginal effects
4.7.1 Marginal effects for GLMs
4.7.2 Discrete change for GLMs
5 Data synthesis
5.1 Generating correlated data
5.2 Generating data from a specified population
5.2.1 Generating data for linear regression
5.2.2 Generating data for logistic regression
5.2.3 Generating data for probit regression
5.2.4 Generating data for cloglog regression
5.2.5 Generating data for Gaussian variance and log link
5.2.6 Generating underdispersed count data
5.3 Simulation
5.3.1 Heteroskedasticity in linear regression
5.3.2 Power analysis
5.3.3 Comparing fit of Poisson and negative binomial
5.3.4 Effect of omitted covariate on Efron’s R2 in Poisson regression
II Continuous Response Models
6 The Gaussian family
6.1 Derivation of the GLM Gaussian family
6.2 Derivation in terms of the mean
6.3 IRLS GLM algorithm (nonbinomial)
6.4 ML estimation
6.5 GLM log-normal models
6.6 Expected versus observed information matrix
6.7 Other Gaussian links
6.8 Example: Relation to OLS
6.9 Example: Beta-carotene
7 The gamma family
7.1 Derivation of the gamma model
7.2 Example: Reciprocal link
7.3 ML estimation
7.4 Log-gamma models
7.5 Identity-gamma models
7.6 Using the gamma model for survival analysis
8 The inverse Gaussian family
8.1 Derivation of the inverse Gaussian model
8.2 The inverse Gaussian algorithm
8.3 Maximum likelihood algorithm
8.4 Example: The canonical inverse Gaussian
8.5 Noncanonical links
9 The power family and link
9.1 Power links
9.2 Example: Power link
9.3 The power family
III Binomial Response Models
10 The binomial–logit family
10.1 Derivation of the binomial model
10.2 Derivation of the Bernoulli model
10.3 The binomial regression algorithm
10.4 Example: Logistic regression
10.4.1 Model producing logistic coefficients: The heart data
10.4.2 Model producing logistic odds ratios
10.5 GOF statistics
10.6 Proportional data
10.7 Interpretation of parameter estimates
11 The general binomial family
11.1 Noncanonical binomial models
11.2 Noncanonical binomial links (binary form)
11.3 The probit model
11.4 The clog-log and log-log models
11.5 Other links
11.6 Interpretation of coefficients
11.6.1 Identity link
11.6.2 Logit link
11.6.3 Log link
11.6.4 Log complement link
11.6.5 Summary
11.7 Generalized binomial regression
12 The problem of overdispersion
12.1 Overdispersion
12.2 Scaling of standard errors
12.3 Williams’ procedure
12.4 Robust standard errors
IV Count Response Models
13 The Poisson family
13.1 Count response regression models
13.2 Derivation of the Poisson algorithm
13.3 Poisson regression: Examples
13.4 Example: Testing overdispersion in the Poisson model
13.5 Using the Poisson model for survival analysis
13.6 Using offsets to compare models
13.7 Interpretation of coefficients
14 The negative binomial family
14.1 Constant overdispersion
14.2 Variable overdispersion
14.2.1 Derivation in terms of a Poisson–gamma mixture
14.2.2 Derivation in terms of the negative binomial probability function
14.2.3 The canonical link negative binomial parameterization
14.3 The log-negative binomial parameterization
14.4 Negative binomial examples
14.5 The geometric family
14.6 Interpretation of coefficients
15 Other count data models
15.1 Count response regression models
15.2 Zero-truncated models
15.3 Zero-inflated models
15.4 Hurdle models
15.5 Negative binomial(P) models
15.6 Heterogeneous negative binomial models
15.7 Generalized Poisson regression models
15.8 Poisson inverse Gaussian models
15.9 Censored count response models
15.10 Finite mixture models
V Multinomial Response Models
16 The ordered-response family
16.1 Interpretation of coefficients: Single binary predictor
16.2 Ordered outcomes for general link
16.3 Ordered outcomes for specific links
16.3.1 Ordered logit
16.3.2 Ordered probit
16.3.3 Ordered clog-log
16.3.4 Ordered log-log
16.3.5 Ordered cauchit
16.4 Generalized ordered outcome models
16.5 Example: Synthetic data
16.6 Example: Automobile data
16.7 Partial proportional-odds models
16.8 Continuation-ratio models
17 Unordered-response family
17.1 The multinomial logit model
17.1.1 Interpretation of coefficients: Single binary predictor
17.1.2 Example: Relation to logistic regression
17.1.3 Example: Relation to conditional logistic regression
17.1.4 Example: Extensions with conditional logistic regression
17.1.5 The independence of irrelevant alternatives
17.1.6 Example: Assessing the IIA
17.1.7 Interpreting coefficients
17.1.8 Example: Medical admissions—introduction
17.1.9 Example: Medical admissions—summary
17.2 The multinomial probit model
17.2.1 Example: A comparison of the models
17.2.2 Example: Comparing probit and multinomial probit
17.2.3 Example: Concluding remarks
VI Extensions to the GLM
18 Extending the likelihood
18.1 The quasilikelihood
18.2 Example: Wedderburn’s leaf blotch data
18.3 Generalized additive models
19 Clustered data
19.1 Generalization from individual to clustered data
19.2 Pooled estimators
19.3 Fixed effects
19.3.1 Unconditional fixed-effects estimators
19.3.2 Conditional fixed-effects estimators
19.4 Random effects
19.4.1 Maximum likelihood estimation
19.4.2 Gibbs sampling
19.5 GEEs
19.6 Other models
VII Stata Software
20 Programs for Stata
20.1 The glm command
20.1.1 Syntax
20.1.2 Description
20.1.3 Options
20.2 The predict command after glm
20.2.1 Syntax
20.2.2 Options
20.3 User-written programs
20.3.1 Global macros available for user-written programs
20.3.2 User-written variance functions
20.3.3 User-written programs for link functions
20.3.4 User-written programs for Newey–West weights
20.4 Remarks
20.4.1 Equivalent commands
20.4.2 Special comments on family(Gaussian) models
20.4.3 Special comments on family(binomial) models
20.4.4 Special comments on family(nbinomial) models
20.4.5 Special comment on family(gamma) link(log) models
A Tables
Author index
Subject index