
WEEK 7+ ADDITIONAL EXCEL EXERCISE III:  
THE RELATIONSHIP BETWEEN TWO VARIABLES 
 
 
 
 
 
 
 




Introduction

The objective of this final additional set of EXCEL exercises is to acquire some further techniques for exploring the relationship between two variables, in this case using a dataset on party allegiance and voting behaviour in Britain during the 1840s. Cross-tabulations of voting and party allegiance are constructed in order to compute a summary statistic - the X2, pronounced ‘ky square’ and written as ‘chi-square’ - which measures the degree of independence (or dependence) of one variable (in this case, voting) on another (in this case, party allegiance). The X2 test allows conclusions concerning the significance of relationships between variables to be drawn with greater confidence than is possible with the usual casual empiricism practised by historians (as in the bare assertion that ‘Z was important’).

This session differs from those hitherto in that there is a large amount of explanatory material on the X2 test which you will need to work through before embarking upon the exercises. Do not be put off by the use here of more ‘complicated’ statistical techniques, but follow through the material carefully, supplementing it with reference to one of the statistics texts mentioned below. When you have mastered the X2 test you will have good grounds for feeling satisfied with your progress, and anxious no doubt to apply this technique to more recent data on a wide range of topics in contemporary history.

Resources

All of the data for these exercises are in EH156, which you should copy to your master floppy disk, renaming and saving as per the normal rules. This dataset was developed by Aydelotte (1963) and, in its original form, consisted of punchcards listing the voting behaviour in 114 divisions of all 815 men who sat in the House of Commons between the general elections of 1841 and 1847, a total of 92,910 cases. Aydelotte's paper is of importance both because it derives from one of the first machine-readable historical datasets and because it marked an important stage in the development of computerised, historical psephology (see also Aydelotte 1972 and, for more recent work, Schonhardt-Bailey 1994).

This dataset allows you to explore whether party allegiance influenced voting behaviour during a critical phase (Robert Peel’s ministry) in the establishment of an essentially laissez-faire stance by government towards the economic and social role of the state (see Kitson Clark 1962 for a generally reliable political history of this period; and Taylor 1972 for an exploration of the role of economic ideas in the contemporary debate over collectivism versus laissez-faire). Voting by political party (Liberal and Conservative) is summarised in panel A of the dataset for the key debates (such as Corn Law repeal, Poor Law reform, the reintroduction of the income tax and Factories Acts); while in panels B and C the data is ordered in such a way - the Guttman scale - that it records the relationship between votes on several related divisions, thereby highlighting the salience of certain issues and their durability. Thus, for example, those who voted against the Chartist petition in 1842 (85 per cent of MPs) were much more likely to vote against the repeal of the Corn Laws in 1843 (a probability of 0.775). Panel D contains more detailed voting data on the Ten Hours Bill, 1844.

The test statistic we shall derive - the X2 - is described in any basic statistics text. There is a particularly good account of its use by historians in Floud (1979, pp. 132-8); see also Haskins and Jeffrey (1990, pp. 222-6); Jarausch and Hardy (1991, pp. 107-16) and Darcy and Rohrs (1995, pp. 113-20). It is strongly recommended that you read at least one of these before embarking upon the following exercises.

Chi-square test (X2)

A necessary starting point for understanding the definition and derivation of the X2 test is the contingency table which portrays cross-tabulations between two nominally scaled variables. Table 1 details votes for and against clause 8 of the Ten Hours Bill, 1844 (which would actually have specified a twelve hour maximum working day), a motion defeated by 194 to 191 votes. This is an example of a 4 by 2 (4x2) contingency table where there are four possible cases for the independent variable (the political faction: Liberal Party vs. three variants of MPs holding the Conservative whip) and two possible outcomes for the dependent variable (voting: For vs. Against; hereafter + and -).

Table 1  Observed voting behaviour by political party and faction: clause 8, Ten Hours Bill, 22 March 1844

                          +       -   Total
Liberals                 56      94     150
Peelites                 52      25      77
Protectionists           57      62     119
Other Conservatives      26      13      39
Total                   191     194     385

Note: Conservative Party MPs = Peelites, Protectionists and Other Conservatives (N=235). The Peelites are defined as those who voted for the repeal of the Corn Laws in 1846 and the Protectionists those who voted against.

Cursory visual inspection suggests a clear divide between the political parties, and between the factions within the Conservative Party, on this issue, but is the pattern of voting statistically significant? Might there be a random element to voting behaviour? In order to determine whether the disparities observed are greater than what might be expected by random error, the X2 is computed: an inferential statistic which compares the joint frequency distribution observed in the table to an expected joint frequency distribution that would be found if the two variables were not related, i.e. were independent. Thus follows the null hypothesis that the two variables are statistically unrelated, which in this case would be that voting behaviour is invariant to party affiliation.

A first step is to compute the expected voting figures, an exercise conducted in Table 2. This shows the full workings so as to make clearer how the expected figures are derived from the null hypothesis, i.e. that we would expect the members of each party (or faction) to vote for or against the motion in exactly the same proportion as the total votes cast for or against it (e.g. that the probability of a Liberal MP voting for the motion is the same as that for a Peelite MP). Thus, from the observed voting figures we can calculate that 49.61 per cent of MPs (100*191/385) voted for the motion, and the expected number voting for the motion in any one party or faction is its number of MPs multiplied by this proportion (150*191/385 = 74.42 in the case of Liberal MPs). The values are entered into this contingency table to two decimal places.

Table 2  Observed and expected voting behaviour by political party and faction: clause 8, Ten Hours Bill, 22 March 1844

A. Observed                          +                         -     Total
Liberals                         56.00                     94.00    150.00
Peelites                         52.00                     25.00     77.00
Protectionists                   57.00                     62.00    119.00
Other Cons.                      26.00                     13.00     39.00
Total                           191.00                    194.00    385.00

B. Expected                          +                         -     Total
Liberals         (191/385)*150 = 74.42     (194/385)*150 = 75.58    150.00
Peelites          (191/385)*77 = 38.20      (194/385)*77 = 38.80     77.00
Protectionists   (191/385)*119 = 59.04     (194/385)*119 = 59.96    119.00
Other Cons.       (191/385)*39 = 19.35      (194/385)*39 = 19.65     39.00
Total           (191/385)*385 = 191.00    (194/385)*385 = 194.00    385.00
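The workings in panel B can be sketched in a few lines of Python as a cross-check on a spreadsheet. This is only an illustrative sketch: the function name `expected_counts` and the dictionary layout are assumptions made for this example and are not part of the EH156 dataset.

```python
# Observed votes from Table 1, as (for, against) per party or faction.
observed = {
    "Liberals": (56, 94),
    "Peelites": (52, 25),
    "Protectionists": (57, 62),
    "Other Conservatives": (26, 13),
}


def expected_counts(table):
    """Return expected cell counts under the null hypothesis of independence.

    Each expected value is (row total * column total) / N, which is the same
    as applying the overall voting proportions to each group's size.
    """
    n_cols = len(next(iter(table.values())))
    col_totals = [sum(row[j] for row in table.values()) for j in range(n_cols)]
    n = sum(col_totals)
    return {
        group: tuple(sum(row) * col_total / n for col_total in col_totals)
        for group, row in table.items()
    }


expected = expected_counts(observed)
for group, (e_for, e_against) in expected.items():
    print(f"{group:<20} {e_for:7.2f} {e_against:7.2f}")
```

Note that the expected values in each row sum to that row's observed total, which is a quick way of checking a hand calculation.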

This exercise shows clearly that more Liberal MPs than might be expected opposed the clause (94 votes against observed, 75.58 expected), whilst Peelites were much more likely to support the motion (52 votes for observed, 38.20 expected). We are now in a position to explore whether this pattern of voting might be random. Table 3 presents a simplified form of the calculation.

Table 3  Derivation of X2 estimates: Ten Hours Bill, 1844

                                  +         -
Liberals          Observed    56.00     94.00
                  Expected    74.42     75.58
                  Cell X2      4.56      4.49
Peelites          Observed    52.00     25.00
                  Expected    38.20     38.80
                  Cell X2      4.99      4.91
Protectionists    Observed    57.00     62.00
                  Expected    59.04     59.96
                  Cell X2      0.07      0.07
Other Cons.       Observed    26.00     13.00
                  Expected    19.35     19.65
                  Cell X2      2.29      2.25

In the above, the cell X2 is defined as the square of the difference between the observed and expected values, divided by the expected value. Thus the cell X2 for the + vote by Liberal MPs is:

$$\frac{(56.00 - 74.42)^2}{74.42} = 4.56$$

and that for the - vote by Liberal MPs is:

$$\frac{(94.00 - 75.58)^2}{75.58} = 4.49$$

Once the cell X2 have been calculated, the table X2 (the chi-square statistic) is simply the sum of the individual cell X2, in this case 23.63. Having derived the estimate, we then need recourse to a standard statistics textbook which contains a table of the X2 distribution. The value of the X2 that would result in the null hypothesis being rejected at a given level of significance depends on the dimensions of the table: the greater the number of rows and columns in a contingency table, the higher the minimum X2 necessary to achieve statistical significance. (It should be clear from the above example that the X2 increases as the distance between the observed and expected values in each cell increases.) The actual critical value is determined from the level of significance required and the degrees of freedom for the table, the latter defined as:

 (No. of cols. - 1)*(No. of rows - 1)

which in this case, with two columns and four rows, is (2 - 1)*(4 - 1) = 3 degrees of freedom.
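As a cross-check on the spreadsheet, the cell X2, the table X2 and the degrees of freedom for Table 1 can be reproduced with a short Python sketch. The function name `chi_square` and the list-of-rows layout are assumptions for illustration only. Because the expected values are not rounded to two decimal places here, the total may differ from the 23.63 quoted above in the second decimal place.

```python
def chi_square(observed):
    """Return (table X2, cell X2 values, degrees of freedom).

    `observed` is a list of rows, each row a list of observed counts.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    cells = []
    for row, row_total in zip(observed, row_totals):
        cell_row = []
        for obs, col_total in zip(row, col_totals):
            exp = row_total * col_total / n          # expected count
            cell_row.append((obs - exp) ** 2 / exp)  # cell X2
        cells.append(cell_row)

    table_x2 = sum(sum(cell_row) for cell_row in cells)
    dof = (len(col_totals) - 1) * (len(row_totals) - 1)
    return table_x2, cells, dof


# Table 1: Liberals, Peelites, Protectionists, Other Conservatives (+, -).
table_1 = [[56, 94], [52, 25], [57, 62], [26, 13]]
x2, cells, dof = chi_square(table_1)
print(f"table X2 = {x2:.2f}, degrees of freedom = {dof}")
```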

From the X2 tables, here taken from Koutsoyiannis (1977, table 3), we see that the X2 would have to be at least 11.34 in order to reject the null hypothesis of no relationship at the .01 level of significance. A significance level of .01 means that, due to random error, there is a 1 per cent chance of rejecting the null hypothesis even when it is actually true. With a critical value of 11.34 and a computed X2 of 23.63 we can have some confidence in rejecting the null hypothesis and concluding that, for this division in the House of Commons, there was a relationship between voting and party/faction allegiance. However, the X2 test tells us only that it is probable - within the parameters set - that two variables are related; it does not provide a measure of the strength of any relationship. This requires that we compute a measure of association (in this case, Cramér’s V2, introduced later in this exercise).
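For readers without a printed X2 table to hand, the comparison can be sketched in Python. This sketch is not taken from the texts cited here: it relies on the standard closed-form tail probabilities of the chi-square distribution for 1 and 3 degrees of freedom, and the function name `chi_square_tail` is an assumption for illustration. Other degrees of freedom are deliberately not handled.

```python
import math


def chi_square_tail(x2, dof):
    """Probability of an X2 at least this large if the null hypothesis is true.

    Uses closed forms, so only 1 and 3 degrees of freedom are supported.
    """
    if dof == 1:
        return math.erfc(math.sqrt(x2 / 2))
    if dof == 3:
        return (math.erfc(math.sqrt(x2 / 2))
                + math.sqrt(2 * x2 / math.pi) * math.exp(-x2 / 2))
    raise ValueError("this sketch only handles 1 or 3 degrees of freedom")


critical = 11.34  # .01 level, 3 degrees of freedom (value quoted in the text)
computed = 23.63  # table X2 for Table 1 (value quoted in the text)

print(f"tail probability at the critical value: {chi_square_tail(critical, 3):.4f}")
print(f"tail probability at the computed X2:    {chi_square_tail(computed, 3):.6f}")
print("reject" if computed > critical else "do not reject", "the null hypothesis")
```

The tail probability at the critical value should come out very close to .01, which is simply another way of stating what the significance level means.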

More formally, the X2 is defined as:

$$X^2 = \sum_{i=1}^{R} \sum_{j=1}^{C} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

where R is the number of rows, C is the number of columns, i is the row subscript, j the column subscript and $O_{ij}$ and $E_{ij}$ are respectively the observed and the expected values for each cell.

Before proceeding to the exercises we must note that special rules apply for 2 by 2 contingency tables (i.e. those with two rows and two columns) which, of course, provide only 1 degree of freedom. Special rules also apply when N<40, but as a general rule one should not attempt this sort of quantitative research with such a small sample.

For 2 by 2 tables the general procedure outlined above will inflate the result for X2, and a slightly different procedure is applied. We will use Table 4, which details votes for and against a motion opposing the reimposition of the income tax in 1842, a motion defeated by 285 to 190 votes (i.e. the income tax was reimposed, initially at 7d in the £, equivalent to 2.9 per cent). This is an example of a 2 by 2 (2x2) table because there are two possible cases for the independent variable (the political party: Liberal vs. Conservative) and two possible outcomes for the dependent variable (voting: For vs. Against; hereafter + and -).

Table 4  Observed voting behaviour by political party: opposition to Income Tax Bill, 1st Reading, 18 April 1842

                    +       -   Total
Liberals          186       6     192
Conservatives       4     279     283
Total             190     285     475

Cursory visual inspection again suggests a clear divide between the political parties on this issue, and as before the first step is to derive the expected votes on the null hypothesis that voting is invariant to party allegiance. These are reported in Table 5 together with the cell X2 produced by the general method outlined above (which sum to a table X2 of 434.3).

Table 5  Derivation of X2 estimates: opposition to Income Tax Bill, 1st Reading, 18 April 1842

                                  +         -
Liberals         Observed    186.00      6.00
                 Expected     76.80    115.20
                 Cell X2      155.3     103.5
Conservatives    Observed      4.00    279.00
                 Expected    113.20    169.80
                 Cell X2      105.3      70.2

This exercise shows clearly the excess of Liberal votes for the motion (and the deficit of Conservative votes for it): 76.8 votes expected as against 186 observed (113.2 as against 4 for the Conservatives). The table X2 of 434.3 is an overestimate, and instead we use (notation from Floud 1979, p. 137):

$$X^2 = \frac{N\left(|AD - BC| - N/2\right)^2}{(A+B)(C+D)(B+D)(A+C)}$$

where |AD-BC| indicates the absolute value of AD-BC; that is, the sign is ignored and the term is treated as positive even if BC > AD. The notation, using the above example of votes for and against the Income Tax Bill, is as follows:

Table 6  Labelling the cells of a 2 by 2 contingency table

                    +       -   Total
Liberals            A       B     A+B
Conservatives       C       D     C+D
Total             A+C     B+D       N

Using the above formula we derive:

$$X^2 = \frac{475\left(|(186 \times 279) - (6 \times 4)| - 475/2\right)^2}{(186+6)(4+279)(6+279)(186+4)} = \frac{475\,(51{,}870 - 237.5)^2}{2{,}942{,}294{,}400} = 430.38$$

which is again so large in relation to the critical value that we have no hesitation in rejecting the null hypothesis (with 1 degree of freedom the critical values are 6.63 for the .01 significance level and 7.88 for the .005 significance level).
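The corrected 2 by 2 calculation, commonly known as Yates's continuity correction, can be checked with the following Python sketch, which uses the A, B, C, D labelling of Table 6. The function names are assumptions for illustration. The uncorrected version uses the standard shortcut formula for a 2 by 2 table, which should agree with summing the four cell X2 in Table 5.

```python
def chi_square_2x2_corrected(a, b, c, d):
    """Continuity-corrected X2 for a 2 by 2 table labelled as in Table 6."""
    n = a + b + c + d
    numerator = n * (abs(a * d - b * c) - n / 2) ** 2
    denominator = (a + b) * (c + d) * (b + d) * (a + c)
    return numerator / denominator


def chi_square_2x2_uncorrected(a, b, c, d):
    """Uncorrected X2 for a 2 by 2 table (shortcut form of the general method)."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (b + d) * (a + c)
    return numerator / denominator


# Table 4: Liberals (186 for, 6 against), Conservatives (4 for, 279 against).
corrected = chi_square_2x2_corrected(186, 6, 4, 279)
uncorrected = chi_square_2x2_uncorrected(186, 6, 4, 279)
print(f"uncorrected X2 = {uncorrected:.2f}")
print(f"corrected X2   = {corrected:.2f}")
```

With a sample as large and a split as stark as this one, the correction makes little practical difference; it matters more when the cell counts are small.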
 

Exercise 1: who supported the Chartist petition?

The first exercise is to construct a 2 by 2 contingency table of voting for and against the Chartist Petition, 1842. Given the Conservative vote this produces a neat, clear-cut result, and whilst you do not need to calculate the expected values to derive the X2, please do so in this case so as to gain some practice with this element. It also reveals very forcefully the difference in voting patterns.

Using panel A of EH156 as your source construct a table as follows, filling in the gaps marked * in Table 7. Enter the values into this contingency table to two decimal places.

Table 7  Observed and expected voting behaviour by political party: Chartist Petition, 3 May 1842

                                +       -   Total
Liberals         Observed      51      68       *
                 Expected       *       *       *
Conservatives    Observed       0     221       *
                 Expected       *       *       *
                 Total          *       *       *

Now calculate the table X2. With 1 degree of freedom the critical value is 6.63 for the .01 significance level and 7.88 for the .005 significance level. Do you accept or reject the null hypothesis?

We can now progress to a larger contingency table: a 3 by 2 for the Ten Hours Bill, 1847. Calculate the X2. Having demonstrated that we can reject the null hypothesis we now explain more of the properties of the X2 and then go on to define and use Cramér’s V2 measure of association.

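First, the magnitude of the X2 depends on the size of the table, the number of cases (N) and the strength of the relationship between the two variables.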

Cramér’s V2 measure of association divides X2 by the size of the table (L-1) and the number of cases (N) to isolate the strength of the relationship, where L is defined as the lesser of the number of rows or columns. Put formally, Cramér’s V2 measure of association is defined as:

$$V^2 = \frac{X^2}{(L-1)N}$$

and its values fall within the range 0 to 1; the higher the value, the stronger the relationship.

Calculate this measure of association for the data in Table 7. To assist you in interpreting your results, by convention weak relationships are taken to be values from 0 to 0.30, moderate relationships between 0.30 and 0.70 and strong relationships those of 0.70 and above.
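By way of illustration, the measure can be sketched in Python for the two worked examples earlier in this session (not for Table 7, which is left as the exercise). The function names `cramers_v2` and `strength` are assumptions for this example, and the treatment of values falling exactly on 0.30 is also an assumption, since the convention quoted above does not say which band they belong to.

```python
def cramers_v2(x2, n, rows, cols):
    """Cramer's V2: X2 divided by (L - 1) * N, with L the lesser of rows and cols."""
    lesser = min(rows, cols)
    return x2 / ((lesser - 1) * n)


def strength(v2):
    """Label a V2 value using the conventional bands quoted in the text."""
    if v2 >= 0.70:
        return "strong"
    if v2 > 0.30:  # assumption: exactly 0.30 is treated as weak
        return "moderate"
    return "weak"


# X2 values and N as quoted in the text for Tables 1 and 4.
v2_ten_hours = cramers_v2(23.63, 385, rows=4, cols=2)
v2_income_tax = cramers_v2(430.38, 475, rows=2, cols=2)

print(f"Ten Hours Bill, 1844: V2 = {v2_ten_hours:.3f} ({strength(v2_ten_hours)})")
print(f"Income Tax Bill, 1842: V2 = {v2_income_tax:.3f} ({strength(v2_income_tax)})")
```

The contrast is instructive: both tables allow the null hypothesis to be rejected, yet the association in the Ten Hours Bill division is far weaker than in the income tax division.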

Exercise 2: class and political party allegiance

It is a staple of British political science that class is the principal determinant of party allegiance. Use EH158.xls to test the relationship between class (here represented not by the Registrar General’s categories but by ones which have been developed to incorporate economic interests) and voting in the general elections of 1964 (won by Labour from the Conservatives with a small majority) and 1983 (in which the Conservatives increased their majority). Derive all of the statistical tests covered in this session and consider whether class voting declined between 1964 and 1983. You might also like to have a look at the source for EH158 (Heath et al. 1985; see also Crewe et al. 1991). Finally, construct Chart 1 and any other graphs which you consider display the relationship between class and voting behaviour. Print out both graph and calculations.

References

Aydelotte, W.O. (1963) 'Voting patterns in the British House of Commons in the 1840s', Comparative Studies in Society and History, 5 (2), pp. 134-63.
Aydelotte, W.O. (1972) 'The disintegration of the Conservative Party in the 1840s: a study of political attitudes', in W.O. Aydelotte, A.G. Bogue and R.W. Fogel (eds) (1972) The dimensions of quantitative research in history. London: Oxford University Press, pp. 319-46.
Crewe, I., Day, N. and Fox, A. (1991) The British electorate, 1963-1987: a compendium of data from the British election studies. Cambridge: Cambridge University Press.
Darcy, R. and Rohrs, R.C. (1995) A guide to quantitative history. Westport, CT: Praeger.
Floud, R.C. (1979) An introduction to quantitative methods for historians, 2nd edn. London: Methuen.
Haskins, L. and Jeffrey, K. (1990) Understanding quantitative history. Cambridge, MA: MIT Press.
Heath, A., Jowell, R., and Curtice, J. (1985) How Britain votes. Oxford: Pergamon Press.
Jarausch, K.H. and Hardy, K.A. (1991) Quantitative methods for historians: a guide to research, data and statistics. Chapel Hill, NC: University of North Carolina Press.
Kitson Clark, G.S.R. (1962) The making of Victorian England. London: Methuen.
Koutsoyiannis, A. (1977) Theory of econometrics: an introductory exposition of econometric methods, 2nd edn. London: Macmillan.
Schonhardt-Bailey, C. (1994) 'Linking constituency interests to legislative voting behaviour: the role of district economic and electoral composition in the repeal of the Corn Laws', Parliamentary History, 13 (1), pp. 86-118.
Taylor, A.J. (1972) Laissez-faire and state intervention in nineteenth-century Britain. London: Macmillan.
 
 


These pages are maintained and owned by Dr Roger Middleton

(c)R. Middleton 1997. Last modified 30 June 1998.