Most large data sets that can be used for
rehabilitation-related research contain data that are inherently
'nested' or 'clustered.' Persons who see the same provider, are admitted
to the same hospital, or live in the same community share common
characteristics, experiences, and environmental influences. As a result,
individuals within a group or setting (commonly referred to as the
'context') tend to be more similar to each other, in terms of both
health determinants and health outcomes, than individuals chosen at
random from all groups. This correlation (dependency) among observations
violates the independence assumption of standard regression analysis,
which biases the standard errors of parameter estimates (typically
making them too small and overstating precision). Thus, regardless of whether
your specific research question includes factors from more than one
level, it is necessary to account for the hierarchical nature of
potentially correlated observations within these data sets.
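To give a sense of how much clustering can matter, the sketch below uses the standard design-effect approximation for a simple mean with equal-sized clusters, 1 + (m - 1) × ICC, where m is the cluster size and ICC is the intraclass correlation. The function names and the example figures (1,000 patients, 20 per hospital, ICC of 0.05) are illustrative assumptions, not values from any particular data set.

```python
import math


def design_effect(cluster_size: float, icc: float) -> float:
    """Variance inflation for a mean under equal-sized clusters: 1 + (m - 1) * ICC."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    return 1.0 + (cluster_size - 1.0) * icc


def effective_sample_size(n_total: float, cluster_size: float, icc: float) -> float:
    """Number of independent observations carrying the same information."""
    return n_total / design_effect(cluster_size, icc)


def se_inflation(cluster_size: float, icc: float) -> float:
    """Factor by which a naive (independence-based) standard error is too small."""
    return math.sqrt(design_effect(cluster_size, icc))


# Hypothetical example: 1,000 patients, 20 per hospital, ICC = 0.05
deff = design_effect(20, 0.05)
n_eff = effective_sample_size(1000, 20, 0.05)
inflation = se_inflation(20, 0.05)
print(f"design effect = {deff:.2f}, effective n = {n_eff:.0f}, SE inflation = {inflation:.2f}")
```

Under these assumed values, even a modest ICC nearly doubles the variance of the estimate, so the 1,000 clustered observations carry roughly the information of about 513 independent ones. The approximation is exact only for simple means with balanced clusters; for regression coefficients the impact depends on whether the predictor varies mainly within or between contexts.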
There are two basic approaches to working with nested data:
- Adjust standard errors of individual-level predictors
to account for the potential bias introduced by ignoring the nested
structure of the data, or
- Model the structure of the data and partition the variance attributable to the different levels.
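The second approach, partitioning variance across levels, can be illustrated with a minimal sketch. It uses the classical one-way ANOVA estimator for balanced groups rather than a full multilevel model, and the function name `partition_variance` and the `hospital_scores` data are hypothetical, made up purely for illustration.

```python
from statistics import fmean


def partition_variance(groups):
    """Split total variance into between-group and within-group components.

    Uses the one-way ANOVA (method-of-moments) estimator and assumes balanced
    data: every group must have the same number of observations.
    Returns (var_between, var_within, icc).
    """
    k = len(groups)
    m = len(groups[0]) if k else 0
    if k < 2 or m < 2 or any(len(g) != m for g in groups):
        raise ValueError("need at least 2 equal-sized groups with 2+ observations each")

    group_means = [fmean(g) for g in groups]
    grand_mean = fmean(group_means)  # equals the overall mean when groups are balanced

    # Mean squares between and within groups
    ms_between = m * sum((gm - grand_mean) ** 2 for gm in group_means) / (k - 1)
    ms_within = sum(
        (y - gm) ** 2 for g, gm in zip(groups, group_means) for y in g
    ) / (k * (m - 1))

    # Variance components; the between-group estimate is truncated at zero
    var_between = max((ms_between - ms_within) / m, 0.0)
    var_within = ms_within
    total = var_between + var_within
    icc = var_between / total if total > 0 else 0.0
    return var_between, var_within, icc


# Hypothetical outcome scores for three hospitals, three patients each
hospital_scores = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
var_between, var_within, icc = partition_variance(hospital_scores)
print(f"between = {var_between:.3f}, within = {var_within:.3f}, ICC = {icc:.3f}")
```

The resulting ICC is the share of total variance attributable to the grouping level. Real data sets are rarely balanced, so in practice these components would usually be estimated by fitting a multilevel model instead.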
Generalized estimating equations (GEE) can be used when
the objective is simply to adjust the standard errors. In contrast, a
multilevel model (a hierarchical linear model [HLM] for numerical
outcome variables, or a hierarchical generalized linear model [HGLM] for
categorical outcome variables) can accommodate either objective: controlling
for or modeling the correlated observations, including repeated
measures on the same subjects over time.
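As a rough illustration of how the two approaches differ, the simplest versions of each can be written out for a continuous outcome. The notation here is an assumption introduced for this sketch: person i in context j, one predictor x, a single random intercept, and an exchangeable working correlation (one common choice among several).

```latex
% Multilevel (random-intercept) model: the context effect u_j is modeled explicitly
y_{ij} = \beta_0 + \beta_1 x_{ij} + u_j + e_{ij}, \qquad
u_j \sim N(0, \tau^2), \quad e_{ij} \sim N(0, \sigma^2)

% Share of total variance attributable to the context (intraclass correlation)
\rho = \frac{\tau^2}{\tau^2 + \sigma^2}

% GEE: only the mean is modeled; within-context correlation is a nuisance parameter
E[y_{ij}] = \beta_0 + \beta_1 x_{ij}, \qquad
\operatorname{Corr}(y_{ij}, y_{ik}) = \alpha \quad (i \neq k)
```

In the multilevel formulation the between-context variance is itself an estimated quantity, which is what allows variance to be partitioned across levels. In the GEE formulation the correlation parameter serves only to obtain appropriate standard errors for the regression coefficients.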
- P Burton et al. (1998) Extending the simple linear
regression model to account for correlated responses: an introduction
to generalized estimating equations and multi-level mixed modeling.
Statist Med. 17: 1261-1291.
- H Goldstein et al. (2002) Tutorial in Biostatistics: Multilevel modeling of medical data. Statist Med. 21: 3291-3315.
- JA Hanley et al. (2003) Statistical analysis of
correlated data using generalized estimating equations: an
orientation. Am J Epidemiol. 157(4): 364-375.
- JB Bingenheimer & SW Raudenbush (2004) Statistical and
substantive inferences in public health: Issues in the application
of multilevel models. Annu Rev Public Health. 25: 53-77.
- J Merlo et al. (2005) A brief conceptual tutorial on
multilevel analysis in social epidemiology: Investigating contextual
phenomena in different groups of people. J Epidemiol Community Health.