Papers for downloading

From here you can download a selection of papers published since 1994, and a few earlier ones, arranged in date order within the following topics. Note, however, that as of January 2019 this list is no longer being maintained here. Please go to my new web site, harveygoldstein.co.uk, for an up-to-date version.

  1. Methodological papers on multilevel modelling
  2. Papers on assessment, school effectiveness and performance indicators

    1. Methodological papers on multilevel modelling
      • Multilevel Modelling of Medical data. An introduction to multilevel modelling with an emphasis on medical applications. Also discusses cross classifications and multiple membership models. Goldstein, H., Browne, W. and Rasbash, J. (2004). Multilevel Modelling of Medical data. Tutorials in Biostatistics. R. D'Agostino. Chichester, Wiley. 2: 69-94. (Not available at the moment for downloading).
      • Multilevel modelling of educational data (PDF, 154kB). This is an introductory account of multilevel models with a discussion of the range of types of data that can now be handled with the latest developments in methodology. Goldstein, H. (2003). Multilevel modelling of educational data. In: Methodology and epistemology of multilevel analysis. D. Courgeau. London, Kluwer.
      • Modelling complex variation (PDF, 240kB) This paper is an entry in the forthcoming (2005) Encyclopedia of Behavioural Statistics. It describes how to set up models for the variance as a function of explanatory variables in generalised linear models.
      • Multilevel event history modelling (PDF, 338kB). This paper sets out in detail the procedures for setting up and estimating a very flexible event history model using existing multilevel software. Goldstein, H., Pan, H. and Bynner, J. (2004). A flexible procedure for analyzing longitudinal event histories using a multilevel model. Understanding statistics 3: 85-89.
      • Multilevel multistate event history models (PDF, 547kB) This paper proposes a general discrete-time model for multilevel event history data. The model is developed for the analysis of longitudinal repeated episodes within individuals where there are multiple origin states and multiple transitions from a state (competing risks). Transitions from each origin state are modelled jointly to allow for correlation across states in the unobserved individual characteristics that influence transitions. This paper extends Goldstein et al., (above). Steele, F., Goldstein, H. and Browne, W. (2004). A general multilevel multistate competing risks model for event history data, with an application to a study of contraceptive use dynamics. Statistical Modelling 4: 145-159.
      • Multilevel ordinal category response models (PDF, 770kB). This paper (Statistical Modelling, 2003, 3, 127-153) develops and explores the use of multilevel models for ordered category responses and applies the models to an analysis of examination grades.
      • Binary and continuous response factor models (PDF, 537kB). This shows how traditional models for binary response factor analysis can be extended to fit multiple factors and hierarchical (multilevel) data structures. The general multilevel factor model is developed using MCMC estimation and it is shown how the Normal response model can be extended using a probit link function that has useful interpretations. The models are applied to the study of country differences in a large scale study of Mathematics achievement in schools (PISA). To be published in Psychometrics: a festschrift for Roderick P. McDonald. (2003, Ed. A. Olivares)
      • Modelling dependencies between fixed and random effects (PDF, 418kB). For small group sizes, the multilevel iterative generalised least squares (IGLS) estimator is biased and inconsistent where the random effects are correlated with the fixed predictors.  In this unpublished (2002) paper we review various approaches to ensure consistency.
      • Multilevel modelling of medical data (PDF, 490kB). (2002. Statistics in Medicine 21: 3291-3315.). This paper presents an overview of multilevel modelling with medical examples. It includes extensions to complex data structures and discusses both likelihood and Bayesian estimation.
      • Partitioning variation in generalised linear multilevel models (PDF, 185kB). (Understanding Statistics, 2002, 1,  223-232). This paper describes various methods for obtaining estimates of the relative proportions of variance at different levels of a data hierarchy with examples and some MLwiN macros.
      • Multilevel factor analysis models (PDF, 458kB). This paper sets out a general model for multilevel factor analysis using MCMC estimation. It appeared in Spring 2002 as a chapter in 'Latent variable and latent structure models'. Edited by G. Marcoulides and I. Moustaki, Lawrence Erlbaum Associates, New Jersey.
      • Meta analysis using multilevel models. The general meta analysis model can be formulated as a multilevel model where the different studies being 'pooled' become higher level units. Two papers are published. The first deals with and the second with continuous outcome meta analysis models (PDF, 538kB) where study data reported at different levels (individual or study) can be combined in a single analysis.
      • Multiple membership multiple classification models (PDF, 301kB) (Statistical Modelling, 2001, 1, 103-124). This paper describes extensions to the purely hierarchical multilevel model to handle cross classifications and multiple membership (fuzzy set) structures. It introduces an efficient notation and shows how likelihood and Bayesian (MCMC) techniques can be used for estimation.
      • Multivariate multilevel modelling of A level examination results (PDF, 920kB) (J. Royal Statist. Soc, 2002, A, 165, 137-154). This explores modelling strategies for multivariate (examination) responses with non-randomly missing responses within a multilevel data structure.
      • Multilevel models for repeated binary outcomes (PDF, 79kB). (J. Royal Statist. Soc., A, (2000), 163, 49-62). This looks at ways of modelling repeated measures data with a binary outcome where the occasions are equally spaced, using data from the British Election Study.
      • Discrete response multilevel models (PDF, 40kB). (Quality and Quantity, 34 (3): (2000) pp.323-330.). This looks at ways of modelling repeated measures data with a binary outcome where the occasions are not equally spaced, using data from the British Election Study. It uses a time series approach.
      • Multilevel models for dynamic household structures (PDF, 74kB) (European Journal of Population, (2000), 16, 373-387). A modelling procedure is proposed for fully describing dynamic household structures where households change their composition over time. Multiple membership models are proposed for such data and their application is discussed with an example.
      • A non parametric bootstrap for multilevel models (PDF, 166kB). This paper describes the implementation of a residuals bootstrap for generalised linear multilevel models that has better properties than the fully parametric bootstrap when data are non-Normal. (Multilevel modelling Newsletter, 1999, 11, 2-5.)
      • Bootstrap procedures for multilevel data (PDF, 54kB). This unpublished paper looks at parametric and non-parametric bootstrapping procedures for multilevel data. (For more details about using the iterative bootstrap see paper on bias correction)
      • Bias correction procedures for non-linear multilevel models (PDF, 31kB). This unpublished paper looks in detail at the iterative bootstrap procedure, implemented in MLwiN, with some simulations.
      • Causal inferences from repeated measures data (PDF, 19kB). This unpublished paper looks at ways of making causal type inferences from repeated measures models which are symmetric with respect to time.
      • Improved approximations for binary response multilevel models (PDF, 64kB). (Goldstein and Rasbash, JRSS, A, 159, 505-513). This paper discusses the use of improved approximations for the estimation of generalised linear multilevel models where the response is a proportion. Simulation studies by Rodriguez and Goldman have shown that in extreme situations large biases can occur, most notably in the case when the response is binary, the number of level 1 units per level 2 unit is small and the underlying random parameter values are large. An improved approximation is introduced which largely eliminates the biases in the situation described by Rodriguez and Goldman.
      • Interpreting aggregate level models (PDF, 26kB). Aggregate level analyses which fit models using average values of predictor and response variables do not in general provide appropriate inferences about lower level relationships since they are estimating different components of such underlying models. This unpublished paper shows why this occurs and the circumstances where misleading inferences will be drawn.
      • An introduction to multilevel models in biostatistics (PDF, 57kB). This article is an entry in the 'Encyclopedia of Biostatistics' and provides an introduction to multilevel modelling for biostatisticians.
      • An introduction to repeated measures models (PDF, 40kB). This article is an entry in the 'Encyclopedia of Biostatistics' and describes the application of multilevel models to repeated measures data analysis.
      • Multilevel models for the analysis of social data (PDF, 38kB). This is an entry in the ‘Encyclopaedia of Social research Methods’ and provides an introduction to multilevel models, with examples, for social scientists.
      • Multilevel models with missing data (PDF, 77kB). This paper proposes a procedure for producing consistent and asymptotically efficient moment-based estimators for a multilevel model with randomly missing explanatory or response variables. The procedure is essentially an extension of existing procedures based upon multiple imputation, is computationally efficient and avoids the need to generate multiple data sets to which multilevel models are fitted. It is also shown that the procedure copes effectively with informatively missing data values when the missingness mechanism is incorporated into the estimation procedure. It has close parallels with moment-based estimators for errors in variables models.
      • Multilevel spatial models (PDF, 205kB). (J. Royal Statistical Society, Series C, 1999, 253-268). This paper addresses some of the theoretical and methodological problems of modelling the distribution of diseases, such as cancer, in discrete geographical areas. Theoretically, it is necessary to examine in detail the various processes, both artefactual and causative, which may affect the number of cases occurring within a certain area, and the distribution of relative risks between areas. A methodological framework based on multilevel modelling is developed, with spatial and nonspatial relationships being considered as random effects occurring at different levels within a population data structure. Examples of exploratory and inferential analyses are given, and discussion focuses on the issues raised by complex spatial modelling of geographically distributed health data.
      • Multiple membership and missing identification models (PDF, 59kB). This paper presents a method for handling educational data in which students belong to more than one unit at a given level, but there is missing information on the identification of the units to which students belong. For example, a student might be classified as belonging sequentially to a particular combination of primary and secondary school, but for some students, the identity of either the primary or secondary school may be unknown. Similar situations arise in longitudinal studies in which students change school or class from one year to the next. The method involves setting up a cross-classified model, but replacing (0,1) values for unit membership with weights reflecting probabilities of unit membership in cases where membership information is randomly missing. The method is illustrated with reference to longitudinal data on students’ progress in English.
      • Non-parametric maximum likelihood for 2 level models (PDF, 9kB). This unpublished paper shows how to implement non-parametric maximum likelihood estimation for 2-level models in MLwiN.
      • Weighting in multilevel models (PDF, 17kB).  This unpublished paper shows how to carry out weighted multilevel analyses. The procedures are incorporated into MLwiN version 1.10.
      • Multilevel models for longitudinal growth norms (PDF, 611kB). This paper describes a two stage procedure for estimating conditional and unconditional growth norms. It describes how the resulting norms can be used in clinical and research practice.
      • Multilevel time series models (PDF, 1,350kB). This paper shows how a general multilevel time series model (in continuous time) can be fitted at level 1, with random effects also at higher levels.
      • Notes on the statistical relationship between smoking in pregnancy and perinatal mortality (PDF, 123kB) Considerable debate took place in the 1970s about this relationship. A reanalysis of results from several studies reveals that such a relationship exists and is mediated through birthweight.

  2. Papers on assessment, school effectiveness and performance indicators
    • Effect sizes in linear models (PDF, 34kB) This symposium contribution looks at different ways of defining and measuring the effect size associated with a predictor variable in a linear model and how different effect sizes may be compared. Goldstein, H. (2004). Some observations on the definition and estimation of effect sizes. But what does it mean? The use of effect sizes in educational research. I. Schagen and K. Elliot. Slough, NFER.
    • International comparisons (PDF, 56kB) This is a review of a recent book on international studies of achievement. It reflects on the present state of understanding among educationalists and the insularity of much of the commentary emerging from the United States. Goldstein, H. (2004). International comparative assessment: how far have we really come. (Review essay). Assessment in Education 11: 227-234.
    • Examination standards (PDF, 38kB). This paper looks at the recurring controversies about the standards of pupil performance in public examinations in the UK. Goldstein, H. (2004). Measuring Educational Standards. Significance (September 2004): 103-105.
    • PISA 2000 survey: a critical commentary (PDF, 139kB). This paper raises some methodological concerns about the conduct, analysis and interpretation of results from the Programme for International Student Assessment (PISA) study. It comments on the restricted nature of the data modelling and analysis, and the resulting interpretations.  Implicit in the paper are suggestions for ways in which such studies can be improved.
    • Modelling social segregation in schooling (PDF, 1,106kB) This paper presents an application of a multilevel model for  studying changes in the social composition of schools. It presents a critique of existing 'index' methods and shows how a multilevel approach is efficient and avoids the arbitrariness associated with choosing any particular index.
    • Education for all: the globalisation of learning targets (PDF, 171kB). UNESCO's ambitious project to improve basic education in the developing world revolves around the achievement of learning targets, especially in terms of literacy. This paper argues that such targets will be counterproductive in just the same kinds of ways that 'high stakes' national target setting has been in countries such as the UK and the USA. An early version was published in "Research Intelligence"  in February 2003. The present revised version is to appear in "Comparative Education", (2004).
    • Designing social research (PDF, 190kB).  This is a professorial lecture given at the University of Bristol in 2002. It looks at the basic principles underlying quantitative social research and how these should affect the design and analysis of social data. 
    • Class size effects on achievement in the reception year (PDF, 616kB). (British Educational Research Journal, 2002, 28, 169-185). This shows positive effects of smaller classes on progress during the reception year in English Primary schools. The effect is present for classes up to size 27 or so and is more marked for less able children.
    • League tables and schooling (PDF, 0.1MB). This is a paper presented to Members of Parliament in January 2001. It sets out the issues about the publication of league tables for schools in a non-technical fashion. It explains 'value added' and sets out a framework for the constructive use of properly adjusted performance indicators. Many of the same issues apply to health service indicators and are relevant to current (June 2001) proposals to introduce league tables for police forces.
    • Predicting the future (PDF, 1,655kB). (Gray, J., Goldstein, H. and Thomas, S. (2001). Predicting the future: the role of past performance in determining trends in institutional effectiveness at A level. British Educational Research Journal 27: 391-406.) Using an extensive set of A level examination results over 4 years, this paper demonstrates that apparent stability in institutional exam results from year to year is deceptive. When 'value added' analyses are done there are substantial variations over time. There are no obvious trends in 'effectiveness' over time that can be pinned down for individual institutions, and predictions of results in the fourth year using data from the previous 3 years are poor. This casts doubt on received wisdom about school improvement and especially calls into question current government policies.
    • Using pupil performance data for judging schools and teachers (PDF, 94kB). (Brit. Educational Res. J., 2001, 27, 433-442). This paper reviews ways in which performance data (league tables and targets) are currently used. It presents a critique using research evidence and suggests a more rational approach to the use of such data.
    • School effectiveness research and educational policy (1) (PDF, 49kB). In the latter half of the 1990s many academics and others became highly critical of 'school effectiveness' research and applications, especially its use by government for political purposes. This paper looks at the validity of these critiques and reflects on the future direction for this area of educational research. (Oxford Review of Education, (2000), 26, 353-363.)
    • School effectiveness research and educational policy (2). A further commentary on critiques of school effectiveness research inspired by a recent paper from M. Thrupp.
    • International comparisons of adult literacy (PDF, 314kB). The International Adult Literacy Survey raises a number of important issues which are inherent in all attempts to make comparisons of cognitive and behavioural attributes across countries. This paper discusses both the statistical and interpretational problems. A  detailed analysis of the survey instruments is carried out to demonstrate the cultural specificity involved and the data modelling techniques used in IALS are critiqued and alternative analyses performed. The paper argues for extreme caution in interpreting results in the light of the weaknesses of the survey. (Blum, A., Goldstein, H. and Guerin-Pace, F. (2001). International adult literacy survey (IALS): an analysis of international comparisons of adult literacy. Assessment in Education 8: 225-246.)
    • The use of value added information in judging school performance. This report is based upon a study funded by OFSTED in 1999 and is one result of a continuing collaboration between researchers at the Institute of Education and Hampshire Education Authority. It demonstrates that simple measures, such as free school meal entitlement, are no substitute for carrying out a full value added analysis when comparing school performances.
    • GCSE to A-level exam results value added analysis (PDF, 139kB). This paper is a multilevel analysis of 500,000 student A level results, matched to GCSE results for all A level institutions 1993-1995.
    • Assessing the performance of schools. (PDF, 21kB) This paper presents a non-technical discussion of school league tables and the idea of value-added analysis.
    • Ethical issues for performance indicators (PDF, 25kB). This paper looks at some of the ethical problems in the publication of school rankings and makes suggestions for ethical guidelines.
    • Critique of 1997 government education white paper (PDF, 36kB). A critique of some of the assumptions underlying the 1997 New Labour Government white paper and its use of research evidence.
    • Failing schools in a failing system (PDF, 67kB). This book chapter discusses the notion of school failure and examines how it has been used within recent policy initiatives.
    • Limitations of league tables (PDF, 112kB). A methodological discussion of the use of league table rankings in education and health.
    • The influence of primary and secondary schools on 16-year-old examination results (PDF, 512kB). This paper shows that the Primary school attended, as well as the Secondary school, influences later examination performance. This has important implications for 'value added' analyses, which should be adjusting for prior attainment at more than one time point.
    • School effectiveness methodology (PDF, 123kB). This paper discusses the methodological requirements for valid inferences from school effectiveness research studies. The requirements include long term longitudinal data and proper statistical modelling of hierarchical data structures. The paper outlines the appropriate multilevel statistical models and shows how these can model the complexities of school, class and student level data.
    • Targets and performance indicators (PDF, 16kB). A non technical commentary on target setting for schools in the context of performance indicators.
    • Using value added data for school improvement purposes (PDF, 249kB). This paper (Oxford Review of Education (1999), 25, 469-483) describes a value-added analysis of baseline - KS1 and KS1 - KS2 data in Hampshire primary schools and discusses how results can be fed back to schools for school improvement purposes without the problems associated with public 'league tables'.
    • Methodological review of class size studies (PDF, 145kB). This report, commissioned by UNESCO, examines the methodology of class size studies and carries out a reanalysis of the Tennessee STAR research using multilevel modelling.
    • Models for reality (PDF, 78kB). A professorial lecture given at the Institute of Education July 1, 1998. It looks at the role of statistical modelling for educational data.
    • League tables and their limitations: statistical issues (1996) (PDF, 2,921kB). This paper looks in detail at the statistical issues of  adjustment and presentation of uncertainty intervals for performance indicators using educational and health data. It was read to the Royal Statistical Society and contains an extensive discussion section.
    • International comparisons of student achievement (1995) (PDF, 2,506kB). This report, commissioned by UNESCO, reviews existing comparative studies of student achievement and presents a critique of their methodologies, including psychometric methods, translation issues and sampling designs.
    • Recontextualising mental measurement (1994) (PDF, 610kB). This paper (Educational Measurement Issues and Practice, 13, 16-43) examines how item response (so called IRT) statistical models have come to dominate a large part of educational testing. It argues that these models are over-simple, statistically implausible and typically associated with the trivialisation of educational assessment.
    • A multilevel analysis of examination results (1993) (PDF, 387kB). Data on examination results from inner London schools are analysed in relation to intake achievement, pupil gender and school type. The examination achievement, averaged over subjects, is studied, as is achievement in the separate subjects of Mathematics and English. Multilevel models are fitted, so that the variation between schools can be studied. It is shown that confidence intervals for school 'residuals' or 'effects' are wide, so that few schools can be separated reliably. In particular, no fine rank ordering of schools can legitimately be produced.

Note: some of the documents on this page are in PDF format. To view a PDF you will need Adobe Acrobat Reader.
