Lemma 2 methodological research

Methodological research is being carried out in the following areas:

Handling missing data

Previous work (Carpenter and Goldstein 2004: Multilevel Modelling Newsletter 16 (2)) implements procedures based on methodological extensions that allow multivariate mixtures of normal, ordered categorical and unordered categorical responses, which can be defined at any level of a data hierarchy. The 2-level model is considered in detail, and a major application is multiple imputation for missing data. The authors use latent variable ideas to create an underlying set of latent multivariate normal responses: one normal response for each binary or ordered response variable, and a set of normal responses for each multicategory response variable. This reduces the analysis to a multivariate normal model, to which standard algorithmic steps can be applied in the estimation.
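The latent variable idea can be illustrated with a minimal sketch. It assumes a single latent normal value per response and a set of fixed cut points; the function names, the thresholds and the unit variance are illustrative assumptions, not taken from the cited work.

```python
import random
from bisect import bisect_right


def ordered_from_latent(z, thresholds):
    """Map a latent normal value z to an ordered category.

    thresholds must be sorted in increasing order. With K thresholds the
    result is an integer in 0..K. A binary response is the special case
    of a single threshold.
    """
    return bisect_right(thresholds, z)


def simulate_ordered(n, mean, thresholds, seed=0):
    """Simulate n ordered responses from an underlying N(mean, 1) variable."""
    rng = random.Random(seed)
    return [ordered_from_latent(rng.gauss(mean, 1.0), thresholds) for _ in range(n)]


# Binary response: one latent normal, one cut point (assumed at 0).
binary = simulate_ordered(1000, 0.0, [0.0])

# Four-category ordered response: one latent normal, three cut points.
ordered = simulate_ordered(1000, 0.0, [-1.0, 0.0, 1.0])
```

An unordered (multicategory) response would need a set of latent normals, one per category contrast, rather than the single latent value used here.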

In multiple imputation there are two models: the scientific model of interest (MOI) and the imputation model (IM). The basic idea is that all the variables present in the MOI form a set of response variables in the IM, which is then fitted, within a multilevel structure, with intercepts in the fixed part of the model. For a set of multivariate normal responses this is straightforward and, in addition, any missing responses are randomly imputed within an MCMC analysis. For variables that were originally non-normal, the imputed values are then transformed back to the original (ordered or unordered) scales, so that the imputed ‘complete’ datasets have all variables on their original scales.
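As a rough illustration of one imputation step, the sketch below draws a missing latent response from its conditional normal distribution given an observed response, then maps the draw back to an ordered scale. It is a simplification under stated assumptions: the bivariate normal parameters and the thresholds are treated as known and fixed, whereas in an MCMC analysis they would themselves be sampled. All names and values are hypothetical.

```python
import math
import random
from bisect import bisect_right


def draw_missing_given_observed(x_obs, mean, var, cov, rng):
    """Draw y given x = x_obs for a bivariate normal pair (x, y).

    mean = (mean_x, mean_y), var = (var_x, var_y), cov = cov(x, y).
    """
    mean_x, mean_y = mean
    var_x, var_y = var
    cond_mean = mean_y + (cov / var_x) * (x_obs - mean_x)
    cond_var = var_y - (cov * cov) / var_x
    return rng.gauss(cond_mean, math.sqrt(cond_var))


def to_ordered_scale(z, thresholds):
    """Transform an imputed latent normal value back to an ordered category."""
    return bisect_right(thresholds, z)


rng = random.Random(1)

# Observed normal response x = 1.2; the ordered response is missing.
latent_draw = draw_missing_given_observed(
    x_obs=1.2, mean=(0.0, 0.0), var=(1.0, 1.0), cov=0.6, rng=rng
)

# Assumed cut points for a three-category ordered variable.
imputed_category = to_ordered_scale(latent_draw, [-0.5, 0.5])
```

Repeating the draw with different random states gives several imputed 'complete' datasets, each with the ordered variable on its original scale.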

Goldstein et al. (2009) have completed the theoretical work required to implement multiple imputation for handling missing data in general multilevel structures. A restricted version of this methodology has been implemented and linked to MLwiN via the Realcom-Impute package.




Correlated random classifications

The classic random effects multilevel model assumes independence between units of the same classification (e.g. effects of different schools) and independence between units of different classifications (e.g. school and neighbourhood effects). These assumptions are likely to be questionable when modelling school competition (correlated school effects) and when exploring parental selection mechanisms into schools and neighbourhoods (correlated school and neighbourhood effects). We propose to develop methodology to handle both of these cases.
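To make the two independence assumptions concrete, they can be written out for a simple cross-classified variance components model. The notation below (pupil i, school j, neighbourhood k, and the superscripts labelling the classifications) is an assumption chosen for illustration and does not appear in the original text.

```latex
% Assumed notation: pupil i in school j and neighbourhood k
y_{i(j,k)} = \beta_0 + u^{(2)}_{j} + u^{(3)}_{k} + e_{i(j,k)},
\qquad
u^{(2)}_{j} \sim N(0, \sigma^2_{u(2)}), \quad
u^{(3)}_{k} \sim N(0, \sigma^2_{u(3)}), \quad
e_{i(j,k)} \sim N(0, \sigma^2_{e})

% Independence assumptions of the classic model
\operatorname{cov}\bigl(u^{(2)}_{j}, u^{(2)}_{j'}\bigr) = 0 \ (j \neq j'), \qquad
\operatorname{cov}\bigl(u^{(3)}_{k}, u^{(3)}_{k'}\bigr) = 0 \ (k \neq k'), \qquad
\operatorname{cov}\bigl(u^{(2)}_{j}, u^{(3)}_{k}\bigr) = 0
```

In this notation, school competition corresponds to relaxing the first covariance restriction, and parental selection into schools and neighbourhoods corresponds to relaxing the third.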

We can distinguish two types of potential relationship between different classifications: non-additive and non-independent. To assess whether schools and neighbourhoods interact non-additively in their effects on pupil performance, we form an interaction classification (the non-empty cells in the tabulation of pupils by school and neighbourhood). We then fit a model with random classifications for the main effects of school and neighbourhood and a random interaction classification. The size of the variance component for the interaction classification provides a measure of the extent to which school and neighbourhood contributions to pupil learning are non-additive. Parental selection mechanisms into schools and neighbourhoods, by contrast, may result in non-independence of these classifications.
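Forming the interaction classification is a data preparation step that can be sketched as follows. The example assumes each pupil record is a (school, neighbourhood) pair; the function name and the toy data are hypothetical, and the model fitting itself is not shown.

```python
from collections import Counter


def interaction_classification(pupils):
    """Assign each pupil to a school-by-neighbourhood cell.

    pupils is an iterable of (school, neighbourhood) pairs. Only cells
    that contain at least one pupil receive an identifier, so the result
    covers exactly the non-empty cells of the cross-tabulation.
    Returns (cell_ids, labels): a dict mapping each cell to an integer
    id, and the list of cell ids in pupil order.
    """
    cell_ids = {}
    labels = []
    for school, neighbourhood in pupils:
        key = (school, neighbourhood)
        if key not in cell_ids:
            cell_ids[key] = len(cell_ids)
        labels.append(cell_ids[key])
    return cell_ids, labels


# Toy data: five pupils, two schools, two neighbourhoods.
pupils = [("S1", "N1"), ("S1", "N2"), ("S2", "N1"), ("S1", "N1"), ("S2", "N1")]
cells, labels = interaction_classification(pupils)

# Number of pupils in each non-empty cell.
cell_sizes = Counter(labels)
```

Here the cell ("S2", "N2") is empty and so gets no identifier. The resulting labels could then serve as the unit identifiers for a third random classification alongside school and neighbourhood.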

Browne and Goldstein (2010) have developed an MCMC estimation method for multilevel models with correlated random effects.  Rasbash et al. (2011) allow for correlated actor and partner effects in a social relations model. Interaction classifications will be described in a forthcoming online training module on non-hierarchical structures.

See also Realistic models for school effectiveness

