R version 3.6.3 (2020-02-29) -- "Holding the Windsock" Copyright (C) 2020 The R Foundation for Statistical Computing Platform: x86_64-w64-mingw32/x64 (64-bit) R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details. Natural language support but running in an English locale R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications. Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for an HTML browser interface to help. Type 'q()' to quit R. > ############################################################################ > # MLwiN User Manual > # > # 18 Modelling Cross-classified Data . . . . . . . . . . . . . . . . . .271 > # > # Rasbash, J., Steele, F., Browne, W. J. and Goldstein, H. (2012). > # A User's Guide to MLwiN, v2.26. Centre for Multilevel Modelling, > # University of Bristol. > ############################################################################ > # R script to replicate all analyses using R2MLwiN > # > # Zhang, Z., Charlton, C., Parker, R, Leckie, G., and Browne, W.J. > # Centre for Multilevel Modelling, 2012 > # http://www.bristol.ac.uk/cmm/software/R2MLwiN/ > ############################################################################ > > library(R2MLwiN) R2MLwiN: A package to run models implemented in MLwiN from R Copyright 2013-2017 Zhengzheng Zhang, Christopher M. J. Charlton, Richard M. A. Parker, William J. Browne and George Leckie Support provided by the Economic and Social Research Council (ESRC) (Grants RES-149-25-1084, RES-576-25-0032 and ES/K007246/1) To cite R2MLwiN in publications use: Zhengzheng Zhang, Richard M. A. Parker, Christopher M. J. Charlton, George Leckie, William J. Browne (2016). R2MLwiN: A Package to Run MLwiN from within R. Journal of Statistical Software, 72(10), 1-43. doi:10.18637/jss.v072.i10 A BibTeX entry for LaTeX users is @Article{, title = {{R2MLwiN}: A Package to Run {MLwiN} from within {R}}, author = {Zhengzheng Zhang and Richard M. A. Parker and Christopher M. J. Charlton and George Leckie and William J. Browne}, journal = {Journal of Statistical Software}, year = {2016}, volume = {72}, number = {10}, pages = {1--43}, doi = {10.18637/jss.v072.i10}, } The MLwiN_path option is currently set to C:/Program Files/MLwiN v3.05/ To change this use: options(MLwiN_path="") > # MLwiN folder > mlwin <- getOption("MLwiN_path") > while (!file.access(mlwin, mode = 1) == 0) { + cat("Please specify the root MLwiN folder or the full path to the MLwiN executable:\n") + mlwin <- scan(what = character(0), sep = "\n") + mlwin <- gsub("\\", "/", mlwin, fixed = TRUE) + } > options(MLwiN_path = mlwin) > > > # 18.1 An introduction to cross-classification . . . . . . . . . . . . . 271 > > # 18.2 How cross-classified models are implemented in MLwiN . . . . . . .273 > > # 18.3 Some computational considerations . . . . . . . . . . . . . . . . 273 > > > > # 18.4 Modelling a two-way classification: An example . . . . . . . . . .275 > > > data(xc, package = "R2MLwiN") > summary(xc) vrq attain pid sex sc Min. : 70.0 Min. : 1.000 61 : 72 Male :1739 Min. : 0.000 1st Qu.: 89.0 1st Qu.: 3.000 122 : 68 Female:1696 1st Qu.: 0.000 Median : 98.0 Median : 5.000 32 : 58 Median : 0.000 Mean : 97.8 Mean : 5.679 24 : 57 Mean : 6.845 3rd Qu.:107.0 3rd Qu.: 9.000 6 : 55 3rd Qu.:20.000 Max. :140.0 Max. :10.000 1 : 54 Max. :31.000 (Other):3071 sid fed choice med cons 14 : 290 Min. :0.0000 Min. :1.000 Min. :0.0000 Min. :1 18 : 257 1st Qu.:0.0000 1st Qu.:1.000 1st Qu.:0.0000 1st Qu.:1 12 : 253 Median :0.0000 Median :1.000 Median :0.0000 Median :1 6 : 250 Mean :0.2754 Mean :1.195 Mean :0.3421 Mean :1 11 : 234 3rd Qu.:1.0000 3rd Qu.:1.000 3rd Qu.:1.0000 3rd Qu.:1 17 : 233 Max. :1.0000 Max. :4.000 Max. :1.0000 Max. :1 (Other):1918 pupil 1 : 148 2 : 141 3 : 136 4 : 132 5 : 128 6 : 122 (Other):2628 > > sid_dummy <- model.matrix(~factor(xc$sid) - 1) > sid_dummy_names <- paste0("s", 1:19) > colnames(sid_dummy) <- sid_dummy_names > xc <- cbind(xc, sid_dummy) > > covmatrix <- NULL > for (i in 1:19) { + for (j in 1:i) { + if (i != j) { + covmatrix <- cbind(covmatrix, c(3, sid_dummy_names[j], sid_dummy_names[i])) + } + } + } > > random.ui <- matrix(0, 21, 18) > random.ui[1, ] <- 1 > random.ui[2:19, ] <- diag(18) * -1 > random.ci <- rep(0, 18) > > (mymode1 <- runMLwiN(attain ~ 1 + (s1 + s2 + s3 + s4 + s5 + + s6 + s7 + s8 + s9 + s10 + s11 + s12 + s13 + s14 + + s15 + s16 + s17 + s18 + s19 | cons) + (1 | pid) + (1 | pupil), estoptions = list(clre = covmatrix, constraints = list(random.ui = random.ui, + random.ci = random.ci)), data = xc)) /nogui option ignored ECHO 0 Echoing is ON BATC 1 Batch mode is ON MAXI 2 STAR iteration 0 iteration 1 Convergence not achieved TOLE 2 MAXI 20 NEXT iteration 2 iteration 3 Convergence achieved ECHO 0 Execution completed -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*- MLwiN (version: 3.05) multilevel model (Normal) N min mean max N_complete min_complete mean_complete cons 1 3435 3435.00000 3435 1 3435 3435.00000 pid 148 1 23.20946 72 148 1 23.20946 max_complete cons 3435 pid 72 Estimation algorithm: IGLS Elapsed time : 0.3s Number of obs: 3435 (from total 3435) The model converged after 4 iterations. Log likelihood: -8574.6 Deviance statistic: 17149.1 --------------------------------------------------------------------------------------------------- The model formula: attain ~ 1 + (s1 + s2 + s3 + s4 + s5 + s6 + s7 + s8 + s9 + s10 + s11 + s12 + s13 + s14 + s15 + s16 + s17 + s18 + s19 | cons) + (1 | pid) + (1 | pupil) Level 3: cons Level 2: pid Level 1: pupil --------------------------------------------------------------------------------------------------- The fixed part estimates: Coef. Std. Err. z Pr(>|z|) [95% Conf. Interval] Intercept 5.50405 0.17488 31.47 1.974e-217 *** 5.16130 5.84680 Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 --------------------------------------------------------------------------------------------------- The random part estimates at the cons level: Coef. Std. Err. var_s1 0.34789 0.16227 var_s2 0.34789 0.16227 var_s3 0.34789 0.16227 var_s4 0.34789 0.16227 var_s5 0.34789 0.16227 var_s6 0.34789 0.16227 var_s7 0.34789 0.16227 var_s8 0.34789 0.16227 var_s9 0.34789 0.16227 var_s10 0.34789 0.16227 var_s11 0.34789 0.16227 var_s12 0.34789 0.16227 var_s13 0.34789 0.16227 var_s14 0.34789 0.16227 var_s15 0.34789 0.16227 var_s16 0.34789 0.16227 var_s17 0.34789 0.16227 var_s18 0.34789 0.16227 var_s19 0.34789 0.16227 --------------------------------------------------------------------------------------------------- The random part estimates at the pid level: Coef. Std. Err. var_Intercept 1.12374 0.19794 --------------------------------------------------------------------------------------------------- The random part estimates at the pupil level: Coef. Std. Err. var_Intercept 8.11163 0.19997 -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*- > > > # 18.5 Other aspects of the SETX command . . . . . . . . . . . . . . . . 277 > > sid_inter <- sid_dummy * xc$vrq > sid_inter_names <- paste0("s",1:19, "Xvrq") > colnames(sid_inter) <- sid_inter_names > xc <- cbind(xc, sid_inter) > > sid_names <- c(sid_dummy_names, sid_inter_names) > > covmatrix <- NULL > for (i in 1:38) { + for (j in 1:i) { + if (i != j) { + covmatrix <- cbind(covmatrix, c(3, sid_names[j], sid_names[i])) + } + } + } > > random.ui <- matrix(0, 40, 36) > random.ui[1, 1:18] <- 1 > random.ui[2:19, 1:18] <- diag(18) * -1 > random.ui[20, 19:36] <- 1 > random.ui[21:38, 19:36] <- diag(18) * -1 > random.ci <- rep(0, 36) > > (mymodel2 <- runMLwiN(attain ~ 1 + (s1 + s2 + s3 + s4 + s5 + + s6 + s7 + s8 + s9 + s10 + s11 + s12 + s13 + s14 + + s15 + s16 + s17 + s18 + s19 + s1Xvrq + s2Xvrq + s3Xvrq + s4Xvrq + s5Xvrq + s6Xvrq + s7Xvrq + s8Xvrq + s9Xvrq + + s10Xvrq + s11Xvrq + s12Xvrq + s13Xvrq + s14Xvrq + s15Xvrq + s16Xvrq + s17Xvrq + s18Xvrq + s19Xvrq | cons) + (1 | pid) + + (1 | pupil), estoptions = list(clre = covmatrix, constraints = list(random.ui = random.ui, random.ci = random.ci)), + data = xc)) /nogui option ignored ECHO 0 Echoing is ON BATC 1 Batch mode is ON MAXI 2 STAR iteration 0 iteration 1 Convergence not achieved TOLE 2 MAXI 20 NEXT iteration 2 iteration 3 iteration 4 iteration 5 iteration 6 Convergence achieved ECHO 0 Execution completed -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*- MLwiN (version: 3.05) multilevel model (Normal) N min mean max N_complete min_complete mean_complete cons 1 3435 3435.00000 3435 1 3435 3435.00000 pid 148 1 23.20946 72 148 1 23.20946 max_complete cons 3435 pid 72 Estimation algorithm: IGLS Elapsed time : 10.34s Number of obs: 3435 (from total 3435) The model converged after 7 iterations. Log likelihood: -7498.4 Deviance statistic: 14996.8 --------------------------------------------------------------------------------------------------- The model formula: attain ~ 1 + (s1 + s2 + s3 + s4 + s5 + s6 + s7 + s8 + s9 + s10 + s11 + s12 + s13 + s14 + s15 + s16 + s17 + s18 + s19 + s1Xvrq + s2Xvrq + s3Xvrq + s4Xvrq + s5Xvrq + s6Xvrq + s7Xvrq + s8Xvrq + s9Xvrq + s10Xvrq + s11Xvrq + s12Xvrq + s13Xvrq + s14Xvrq + s15Xvrq + s16Xvrq + s17Xvrq + s18Xvrq + s19Xvrq | cons) + (1 | pid) + (1 | pupil) Level 3: cons Level 2: pid Level 1: pupil --------------------------------------------------------------------------------------------------- The fixed part estimates: Coef. Std. Err. z Pr(>|z|) [95% Conf. Interval] Intercept -9.80600 0.31434 -31.20 1.221e-213 *** -10.42210 -9.18991 Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 --------------------------------------------------------------------------------------------------- The random part estimates at the cons level: Coef. Std. Err. var_s1 0.37611 0.58525 var_s2 0.37611 0.58525 var_s3 0.37611 0.58525 var_s4 0.37611 0.58525 var_s5 0.37611 0.58525 var_s6 0.37611 0.58525 var_s7 0.37611 0.58525 var_s8 0.37611 0.58525 var_s9 0.37611 0.58525 var_s10 0.37611 0.58525 var_s11 0.37611 0.58525 var_s12 0.37611 0.58525 var_s13 0.37611 0.58525 var_s14 0.37611 0.58525 var_s15 0.37611 0.58525 var_s16 0.37611 0.58525 var_s17 0.37611 0.58525 var_s18 0.37611 0.58525 var_s19 0.37611 0.58525 var_s1Xvrq 0.02483 0.00807 var_s2Xvrq 0.02483 0.00807 var_s3Xvrq 0.02483 0.00807 var_s4Xvrq 0.02483 0.00807 var_s5Xvrq 0.02483 0.00807 var_s6Xvrq 0.02483 0.00807 var_s7Xvrq 0.02483 0.00807 var_s8Xvrq 0.02483 0.00807 var_s9Xvrq 0.02483 0.00807 var_s10Xvrq 0.02483 0.00807 var_s11Xvrq 0.02483 0.00807 var_s12Xvrq 0.02483 0.00807 var_s13Xvrq 0.02483 0.00807 var_s14Xvrq 0.02483 0.00807 var_s15Xvrq 0.02483 0.00807 var_s16Xvrq 0.02483 0.00807 var_s17Xvrq 0.02483 0.00807 var_s18Xvrq 0.02483 0.00807 var_s19Xvrq 0.02483 0.00807 --------------------------------------------------------------------------------------------------- The random part estimates at the pid level: Coef. Std. Err. var_Intercept 0.29415 0.06503 --------------------------------------------------------------------------------------------------- The random part estimates at the pupil level: Coef. Std. Err. var_Intercept 4.23656 0.10460 -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*- > > # Note: The final models in this section of the manual are for demonstration only. > # The models presented in the manual do not converge with the current data. > # We have therefore not given the R2MLwiN commands for these models. > > # 18.6 Reducing storage overhead by grouping . . . . . . . . . . . . . . 279 > > findClust <- function(data, var1, var2) { + grplist <- by(xc, xc$sid, function(x) x) + moreclust <- TRUE + while (moreclust) { + found <- FALSE + for (i in (length(grplist) - 1):1) { + if (length(grplist) == 1) { + moreclust <- FALSE + break + } + for (j in length(grplist):(i + 1)) { + if (length(intersect(grplist[[i]][[var2]], grplist[[j]][[var2]])) != 0) { + grplist[[i]] <- rbind(grplist[[i]], grplist[[j]]) + grplist[[j]] <- NULL + found <- TRUE + } + } + } + if (found) { + break + } else { + moreclust <- FALSE + } + } + + cat(paste0("Number of clusters: ", length(grplist), "\n")) + + ids <- NULL + for (i in 1:length(grplist)) { + ids <- rbind(ids, cbind(unique(grplist[[i]][[var1]]), i)) + } + colnames(ids) <- c(var1, "id") + merge(data[, c(var1, var2)], ids, sort = FALSE)$id + } > > if (!require(doBy)) { + warning("package doBy required to run this example") + } else { + data(xc, package = "R2MLwiN") + + xc$region <- findClust(xc, "sid", "pid") + xc$region <- NULL + + numchildren <- summaryBy(cons ~ sid + pid, FUN = length, data = xc) + colnames(numchildren) <- c("sid", "pid", "numchildren") + xc <- merge(xc, numchildren, sort = FALSE) + + xc <- xc[order(xc$sid, xc$pid), ] + + xc <- xc[xc$numchildren > 2, ] + + xc$region <- findClust(xc, "sid", "pid") + + xc <- xc[order(xc$region, xc$sid), ] + + xc$rsid <- rep(1, nrow(xc)) + + rgrp <- xc$region[1] + sgrp <- xc$sid[1] + id <- 1 + for (i in 2:nrow(xc)) { + if (xc$region[i] != rgrp) { + rgrp <- xc$region[i] + sgrp <- xc$sid[i] + id <- 1 + } + if (xc$sid[i] != sgrp) { + sgrp <- xc$sid[i] + id <- id + 1 + } + xc$rsid[i] <- id + } + + rs_dummy <- model.matrix(~factor(xc$rsid) - 1) + rs_dummy_names <- paste0("rs", 1:8) + colnames(rs_dummy) <- rs_dummy_names + xc <- cbind(xc, rs_dummy) + + covmatrix <- NULL + for (i in 1:8) { + for (j in 1:i) { + if (i != j) { + covmatrix <- cbind(covmatrix, c(3, rs_dummy_names[j], rs_dummy_names[i])) + } + } + } + + random.ui <- matrix(0, 10, 7) + random.ui[1, ] <- 1 + random.ui[2:8, ] <- diag(7) * -1 + random.ci <- rep(0, 7) + + xc <- xc[order(xc$region, xc$pid, xc$pupil), ] + + (mymode1 <- runMLwiN(attain ~ 1 + (rs1 + rs2 + rs3 + rs4 + rs5 + rs6 + rs7 + rs8 | region) + (1 | pid) + (1 | pupil), + estoptions = list(clre = covmatrix, constraints = list(random.ui = random.ui, random.ci = random.ci)), data = xc)) + + } Loading required package: doBy Number of clusters: 1 Number of clusters: 6 /nogui option ignored ECHO 0 Echoing is ON BATC 1 Batch mode is ON MAXI 2 STAR iteration 0 iteration 1 Convergence not achieved TOLE 2 MAXI 20 NEXT iteration 2 iteration 3 iteration 4 Convergence achieved ECHO 0 Execution completed -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*- MLwiN (version: 3.05) multilevel model (Normal) N min mean max N_complete min_complete mean_complete max_complete region 6 78 544.5 1330 6 78 544.5 1330 pid 135 3 24.2 72 135 3 24.2 72 Estimation algorithm: IGLS Elapsed time : 0.13s Number of obs: 3267 (from total 3267) The model converged after 5 iterations. Log likelihood: -8153.7 Deviance statistic: 16307.3 --------------------------------------------------------------------------------------------------- The model formula: attain ~ 1 + (rs1 + rs2 + rs3 + rs4 + rs5 + rs6 + rs7 + rs8 | region) + (1 | pid) + (1 | pupil) Level 3: region Level 2: pid Level 1: pupil --------------------------------------------------------------------------------------------------- The fixed part estimates: Coef. Std. Err. z Pr(>|z|) [95% Conf. Interval] Intercept 5.58195 0.18121 30.80 2.35e-208 *** 5.22678 5.93712 Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 --------------------------------------------------------------------------------------------------- The random part estimates at the region level: Coef. Std. Err. var_rs1 0.38383 0.18873 var_rs2 0.38383 0.18873 var_rs3 0.38383 0.18873 var_rs4 0.38383 0.18873 var_rs5 0.38383 0.18873 var_rs6 0.38383 0.18873 var_rs7 0.38383 0.18873 var_rs8 0.38383 0.18873 --------------------------------------------------------------------------------------------------- The random part estimates at the pid level: Coef. Std. Err. var_Intercept 1.10259 0.20150 --------------------------------------------------------------------------------------------------- The random part estimates at the pupil level: Coef. Std. Err. var_Intercept 8.10443 0.20474 -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*- Warning message: In runMLwiN(attain ~ 1 + (rs1 + rs2 + rs3 + rs4 + rs5 + rs6 + rs7 + : pid has unused factor levels defined. These were dropped from this model run, but we recommend removing them prior to calling runMLwiN. > > # 18.7 Modelling a multi-way cross-classification . . . . . . . . . . . .280 > > # 18.8 MLwiN commands for cross-classifications . . . . . . . . . . . . .281 > > # Chapter learning outcomes . . . . . . . . . . . . . . . . . . . . . . .282 > > ############################################################################ > > proc.time() user system elapsed 4.51 0.70 17.43