Sunday, 23 February 2014

A quick look at the other tools


After this weekend there will be two more before election day. Last weekend we covered the maps of electoral trends, which leaves us with the seat run-downs and pendulum to discuss. It might seem like a good idea to dedicate this weekend to one, next weekend to the other and the final weekend to formalising predictions. However, next weekend I will be providing summaries of each Legislative Council party. While this is not useful with regard to predicting electoral outcomes, I personally believe it is one of the most important subjects I cover on this blog, since it helps people make informed above- or below-the-line votes, provides links to each party's website and just generally makes the whole process easier. I know I had a few of my non-regular readers visit my blog exclusively for those summaries.

The Polls and the Pendulum


This week, then, we will cover both the seat rundowns and the pendulum. For the latter we will be using Antony Green's data as opposed to the numbers produced by the South Australian Electoral Districts Boundaries Commission. You can read the important distinctions at the previous link, but basically Antony Green's data is for psephological purposes while the EDBC's is for the redistribution of boundaries and factors in all kinds of demographic shifts. As Mr Green himself explains, “when calculating swing, you should be comparing with the 2010 election, not a mythical estimate of what the 2010 result might have been in 2014.”


Notice that the above pendulum is sorted alphabetically instead of by margin or party? You won't see a pendulum like that from anyone who actually knows what they are doing. Here, however, it is useful because we can directly cross-reference the pendulum seat-by-seat with our other tools. However, for the purists or those more visually inclined, here is the same information on a Liberal-Labor scale.

Note that the Independents in Fisher, Frome and Mount Gambier are listed as Liberals

The Independents have been included with the Liberals on this scale for several reasons. Firstly, their margins are all measured against the Libs, so the Labor party is well down in a 3-horse race. Secondly, the Independents are conservative, suggesting that a lot of their votes would flow to the Libs if they did not run. This means that the Labor vote in these seats is not necessarily low because would-be Labor voters were seduced by the lure of an Independent (although it is a fair bet that a lot of the Labor vote flowed to the Independents rather than the Libs and probably cost the Coalition the seat). Thirdly, in a hung parliament, it is expected the Independents would side with the Liberal Party to form government.

Although the Labor Party has more seats (and hence is currently in power), the Liberals have fewer marginal seats: 3 (and an Independent) compared with Labor's 11. The Liberals only have one fairly safe seat (and an Independent) compared to Labor's 4, and while Labor has 11 safe seats the Liberals have 10, plus 1 Independent and 4 in the very safe category which Labor failed to reach. If the Liberals won all of Labor's marginals, they would hold 29 seats with 3 Independents by their side; if Labor won all of the Liberal marginals they would hold 29 seats with 3 Independents against them.
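For those who like to check the arithmetic, the seat counts quoted above can be tallied in a few lines. This is only a sketch: the category counts are copied from the paragraph, while the dictionary layout and variable names are my own.

```python
# Seat counts by pendulum category, as quoted in the paragraph above.
pendulum = {
    "Labor":       {"marginal": 11, "fairly safe": 4, "safe": 11, "very safe": 0},
    "Liberal":     {"marginal": 3,  "fairly safe": 1, "safe": 10, "very safe": 4},
    "Independent": {"marginal": 1,  "fairly safe": 1, "safe": 1,  "very safe": 0},
}

# Total seats currently held by each group.
totals = {party: sum(categories.values()) for party, categories in pendulum.items()}

# Best case for each major party: it takes every marginal held by the other.
liberal_best = totals["Liberal"] + pendulum["Labor"]["marginal"]
labor_best = totals["Labor"] + pendulum["Liberal"]["marginal"]

print(totals)                    # current seats per group
print(liberal_best, labor_best)  # seats held after sweeping the other side's marginals
```

Both best-case scenarios come out at 29 seats, matching the figures in the text.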

Assuming a uniform swing and support from the Independents, the Liberals need a 0.6% swing in their favour (which, if repeated in the Independent contest in Mount Gambier, would also win them that seat). Even without the support of the greys, Steven Marshall would form a government on the back of a uniform 2.6% swing. Whether or not they will get this swing, of course, remains to be seen.
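The uniform swing assumption is simple enough to sketch in code: every seat whose margin is smaller than the swing changes hands. The margins below are invented placeholders, not the real pendulum figures, so this will not reproduce the 0.6% and 2.6% thresholds above; it only shows the mechanism.

```python
def seats_falling(margins, swing):
    """Return the seats whose margin (in percentage points) is smaller
    than a uniform swing against the sitting party."""
    return sorted(seat for seat, margin in margins.items() if margin < swing)

# Hypothetical Labor-held margins, made up purely for illustration.
example_margins = {"Seat A": 0.5, "Seat B": 1.9, "Seat C": 2.5, "Seat D": 4.8}

print(seats_falling(example_margins, 0.6))
print(seats_falling(example_margins, 2.6))
```

Treating a margin exactly equal to the swing as a hold is my own choice; the post does not say how ties should be handled.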

Seat Statistics


Swings, of course, are not uniform, although the search for a reliable indicator of where they will be stronger or weaker continues to elude me. Instead we are going to have to rely on seats being “Labor seats” and “Liberal seats” to calculate which seats are most open to being targeted. Previous analysis indicates that a seat that has a large swing one election might have a small swing the next, so such definitions can only be a rough guide at best.

Previously for the run downs I interpreted each seat on two factors – how strongly it supported a party (its “strength”) and how reliable that support was (its “volatility”). The former was subjectively assigned as very safe, safe or leaning, while the latter was intuitively divided into stable, variable and volatile. These assignments were based on the incumbency and margin (provided above in the pendulum) and the seat's history outlined last weekend. One additional factor used in the federal election – the state seats that lie within the federal seats – cannot be replicated at the state level.

This election we are going to try something different. For the purposes of making the results repeatable, consistent and measurable, these run downs will rely on more rigid definitions.

The VDTAs are useful for examining recent voting trends. The run downs attempt to identify long-term biases in seats. For this reason, historical inclination will be determined from the number of times a party has won the seat but lost the election – that is, where the seat's preferences are revealed to be skewed to one party or another relative to the state as a whole.

To do this, we first need to know the results of every election since 1938:


The following districts voted against the general trend in the following years:


From this we can calculate the following tables:


For simplicity we have ignored Independents. The contrary count is the number of times the seat has been won by a party that lost the election. Adelaide has voted for the ALP 11 times when the Coalition has gone on to win the election, which initially suggests this is a decent seat for Labor.

The In Step table shows how many times the seat has been won by the party that also won the election. This is useful because it lets us total the number of times a party has won the seat; realising that Playford has voted for the ALP 4 times in Coalition victories is somewhat meaningless until we also realise it has never voted for the Coalition.

The final table is calculated to show the percentage of times the ALP has won Coalition elections and vice versa. In Adelaide, Labor won the seat in 11 of the 12 Coalition-won elections (91.7%) and the Coalition won Adelaide in 3 of the 10 Labor elections (30%) since 1938. ALP data for Little Para cannot be calculated because it has never participated in a Coalition-won election. No figures for Mount Gambier can be calculated since it has always elected an Independent.

Let's calculate the liability of each seat to favour a given party from the difference between these last figures. To continue to use Adelaide as an example, let's say the seat is 91.7 – 30 = 61.7% leaning to the ALP. Like most of the dramatic ALP leans, this is mostly historical, with a large opposition to the Coalition dominance prior to 1965 under the Playford-favouring Gerrymander creatively known as the Playmander. However, let's leave the concerns about recent vs ancient trends to be picked up by the VDTAs and use this (admittedly arbitrarily calculated) figure for a ballpark and see what we can kick around.
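As a rough sketch of the calculation just described, here is the Adelaide example in code. The function names are mine, and returning None when a party has never faced an opposing victory (as with the ALP in Little Para) is an assumption about how to represent "cannot be calculated".

```python
def contrary_rate(seat_wins, opponent_elections):
    """Percentage of the opponent's election victories in which this party
    still won the seat. None if the seat never faced such an election."""
    if opponent_elections == 0:
        return None
    return 100 * seat_wins / opponent_elections

def lean_to_alp(alp_rate, coalition_rate):
    """Positive values lean to the ALP, negative values to the Coalition."""
    if alp_rate is None or coalition_rate is None:
        return None
    return alp_rate - coalition_rate

# Adelaide: ALP won the seat in 11 of 12 Coalition-won elections,
# the Coalition won it in 3 of 10 Labor-won elections.
adelaide_alp = contrary_rate(11, 12)
adelaide_coalition = contrary_rate(3, 10)
adelaide_lean = lean_to_alp(adelaide_alp, adelaide_coalition)

print(round(adelaide_lean, 1))  # 61.7
```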

I have also arbitrarily assigned leans of more than 66.66% the label “Steadfast”, those over 33.33% “Reliable” and the rest “Leaning” (with the two previously mentioned exceptions of Little Para and Mount Gambier). This was not done simply because dividing the range into three equal-sized divisions is appealing. I subjectively assigned the labels to the seats and checked their values later. These values are close approximations of my intuitive divisions of the seats:

Adelaide: Reliable Labor
Ashford: Steadfast Labor
Bragg: Steadfast Liberal
Bright: Reliable Liberal
Chaffey: Reliable Liberal
Cheltenham: Steadfast Labor
Colton: Leaning Labor
Croydon: Steadfast Labor
Davenport: Steadfast Liberal
Dunstan: Reliable Labor
Elder: Reliable Labor
Enfield: Steadfast Labor
Finniss: Steadfast Liberal
Fisher: Steadfast Liberal
Flinders: Steadfast Liberal
Florey: Steadfast Labor
Frome: Leaning Liberal
Giles: Steadfast Labor
Goyder: Steadfast Liberal
Hammond: Steadfast Liberal
Hartley: Leaning Labor
Heysen: Steadfast Liberal
Kaurna: Reliable Labor
Kavel: Steadfast Liberal
Lee: Reliable Labor
Light: Steadfast Liberal
Little Para: Unknown
MacKillop: Steadfast Liberal
Mawson: Leaning Labor
Mitchell: Steadfast Labor
Morialta: Reliable Liberal
Morphett: Steadfast Liberal
Mount Gambier: Unknown
Napier: Steadfast Labor
Newland: Leaning Liberal
Playford: Steadfast Labor
Port Adelaide: Steadfast Labor
Ramsay: Steadfast Labor
Reynell: Reliable Labor
Schubert: Steadfast Liberal
Stuart: Steadfast Liberal
Taylor: Steadfast Labor
Torrens: Reliable Labor
Unley: Leaning Labor
Waite: Steadfast Liberal
West Torrens: Steadfast Labor
Wright: Reliable Labor
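The labelling rule behind the list above can be written as a small function. This is a sketch under a couple of assumptions of my own: a positive lean means Labor and a negative one Liberal, a missing lean is passed in as None, and a lean of exactly zero (which the rule does not cover) is reported as "Even".

```python
def label(lean):
    """Turn a lean (in percentage points, positive = Labor) into a label
    using the 66.66 and 33.33 cut-offs."""
    if lean is None:
        return "Unknown"      # e.g. Little Para, Mount Gambier
    if lean == 0:
        return "Even"         # assumption: this case is not defined in the post
    party = "Labor" if lean > 0 else "Liberal"
    size = abs(lean)
    if size > 66.66:
        strength = "Steadfast"
    elif size > 33.33:
        strength = "Reliable"
    else:
        strength = "Leaning"
    return f"{strength} {party}"

print(label(61.7))   # Adelaide's lean from the worked example
print(label(None))
```

Adelaide's 61.7 falls between the two cut-offs, which is why it appears as Reliable Labor in the list.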

Exactly how this factors into the final results will have to wait until the predictions, but here is a quick look at the data:


On face value it looks as though the ALP has the slight majority in an even battle, but the Coalition only needs to break into the Labor Leaning seats to take a majority. Colton, Hartley, Mawson and Unley are seats to watch. Particularly Unley – a recurring oddball.

By and large this data conforms with the current state of the seats, which suggests a decent predictive power. Ignoring the unknowns (Little Para is Labor while Mount Gambier is Independent) and the Liberal-leaning Independents (Frome and Fisher are still listed as Liberal seats), there are only 3 seats from each party that have a supposed bias contrary to their current incumbent – Adelaide, Dunstan and Unley are currently Liberal while Bright, Light and Newland are currently Labor. These, again, will be worth watching. Again, particularly Unley.

Finally, the historical inclinations calculated here also correlate closely with the VDTA. All of the clear Liberal or Labor seats on the VDTA correspond with Liberal or Labor seats from today's data, with the single exception of the oddball Unley (again!), which is historically safe Liberal on the VDTA but leaning Labor here.

Fisher is slightly closer to grey than the other blue seats, but is still Liberal by both measures. The VDTA's slightly red Morialta is listed as reliable Liberal here, but the colour is so close to the midrange on the map I don't see this as a major disagreement. The other mid-ranged seats on the VDTA are as follows:

Hartley and Mawson – leaning Labor
Newland – leaning Liberal
Bright – Reliable Liberal
Light – Steadfast Liberal

Note that these last three – Bright, Light and Newland – are the three peculiarities listed above as Labor occupied but (by this post's reckoning) Liberal inclined. Either today's analysis has failed to capture a recent development in these seats, or the atypical Labor incumbency has biased the VDTA. All have passed to Labor for the last two elections after at least 4 wins by the Coalition, which could support either interpretation.

Either way, in the wider view, all of our methods are starting to align nicely.

Saturday, 15 February 2014

The Post-Mapstravaganza Clean-Up Party

Mapstravaganza comes to an end today. If you have not been following this weekend's binge on all things psephocartographic, you can find parts 1 and 2 on the other end of those links. It is advised that you read those first, as this post may contain some spoilers.

Summary


So far we have six maps:

  1. Current standings
  2. Average of seat histories since 1938
  3. Average of seat histories since 1993
  4. 10% CTA (10% 2010 results, 9% 2006, 8.1% 2002 etc.)
  5. Carry% VDTA (each layer determined by % of seats retained)
  6. 50% Carry% VDTA (as above, adjusted so top 4 layers have an average 50% opacity)

Maps 1, 2 and 6 are similar to those used in the federal election, although some methodological changes (e.g. 1938 cut off) were necessitated by the peculiarities of South Australian electoral history. The equation for map 6 has been modified from 7 layers to 4.

  1. A map of current seat incumbency. In the 2013 federal election the equivalent of Map 1 was found to be an 85% accurate predictive tool, and the carry% values calculated last post suggest a generally greater than 80% success rate at state level
  2. In the 2013 federal election the equivalent of Map 2 was also found to be 85% accurate
  3. Map 3 is a variation on Map 2 semi-arbitrarily defining 1993 as the start of modern electoral history due to the presence of a significant proportion of contemporary seats
  4. Map 4 is a new approach of unknown accuracy as a predictive tool. Ideologically it is a hybrid of Maps 2 and 6, both of which had well-performing equivalents in 2013
  5. Map 5 is useless and will not be dealt with beyond this point. It is a demonstration of VDTA techniques, but has high levels of opacity resulting in a map largely equivalent to an outdated Map 1 (i.e. 2006 incumbency map). This map is inferior to Map 1 and is therefore ignored.
  6. Map 6 is mathematically similar to a 2013 federal tool with 78% predictive success. Modifications to the underlying values for calculating the opacities have hopefully improved this, but may not have. This includes the shift from 7 to 4 layer direct influence (lower layers may still be influential in large numbers) which brings it into line with the 1993 origin of the modern political theatre suggested in Map 3

Analysis

Normally I would save these tables for a data dump, but discussing them here means we do not need a data dump at all.

Good thing I didn't start blogging before the invention of computers, or I would have had to do this with paint chart names...


This table includes the colours of all five usable maps (along with their hex-codes for the number crunchers out there). Note that although there have been no green seats won in SA, there is still a green value for many of these colours. I'm going to assume that anyone who knows how to read hexadecimal colour codes also knows that white light is composed of every colour combined, and so this green is obviously leaking through from the white background. This is obviously also true for red and blue, so areas with more background colour (e.g. those seats with a shorter history) get less clear cut results - which is fitting because it means longer histories can provide more confidence in predictions.

Exactly how close to the red-blue balance a seat should get before it is considered too close to call is up for debate, but for simplicity and considering this is but one tool, I am calling every seat's trend pattern regardless of how close it is:

Colours exaggerated based on their relative red and blue values
The hope is that this approach will improve the overall accuracy by combining data. For example, if we assume each map is roughly 80% accurate, seats like Hartley or Morialta can be viewed as examples where one map's inaccuracy is corrected by the other four. By using multiple sources of roughly equal reliability the outliers can be isolated and dealt with; you would have to be really really unlucky to have the majority of the sources defective.
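To put a rough number on "really really unlucky", here is a back-of-the-envelope sketch. It assumes the five maps err independently of one another, which is almost certainly too generous given that they are all built from the same election results, so treat the output as a lower bound on the risk rather than a real estimate.

```python
from math import comb

def prob_majority_wrong(n_sources, accuracy):
    """Chance that more than half of n independent sources are wrong,
    when each is right with probability `accuracy`."""
    p_wrong = 1 - accuracy
    needed = n_sources // 2 + 1  # smallest number of wrong sources forming a majority
    return sum(
        comb(n_sources, k) * p_wrong**k * accuracy**(n_sources - k)
        for k in range(needed, n_sources + 1)
    )

# Five maps, each assumed to be 80% accurate.
print(round(prob_majority_wrong(5, 0.8), 4))  # 0.0579
```

Under these assumptions the majority is wrong a little under 6% of the time, compared with 20% for any single map.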

I make no apologies for basing my statistical methods on the works of Philip K. Dick
A couple of notes: None of these methods, with the possible exception of Map 1 (current standings), really allow an Independent's strength to become fully apparent. Mount Gambier is the real outlier here, due entirely to never having been held by a party-affiliated candidate.

Also, these methods look at trends and history. The idea of a 100% accurate predictive model is ridiculous, but doubly so one based solely on past elections. This post and its prequels completely ignore the specifics of the election campaigns about to unfold, and are perhaps best viewed as indicators of likely results in a perfectly balanced election.

This method reliably suggests Labor holds a slight advantage in 25 seats, the Coalition has a head start in 18 and one is reliably called for a conservative independent in Mount Gambier (held by Independents Don Pegler and, before that, Rory McEwan).
This leaves Adelaide, Bright and Light as the three others.

Aggregate of Maps 1, 2, 3, 4 and 6. For simplicity, all maps are given equal weight.


On this balance Labor looks to have an advantage, requiring the Coalition to take two historical seats from Labor as well as gaining the Independent's support. But then there are historical seats and historical seats; taking Mawson would be a lot easier than taking Playford on this data, for example, since the former consistently hovers just on the red side of equal.

In reality, this Labor strength is largely the result of the long Labor incumbency. The trends point to a Labor win because that has been the history of this state for more than a decade. Therefore, while there may be a pro-Labor lean in the community at large, the tides may also turn quickly.

We will need to consult our other tools before making any confident predictions.

TL;DR: There are 5 useful maps from the past two days:
Map 1: Current standing of seats
Map 2: Averaged history since single-member electorates became universal in 1938
Map 3: Averaged history since the bulk of modern seats were contested in 1993
Map 4: 10% opacity layers of all elections since 1938, with new results eventually drowning out the old
Map 6: VDTA adjusted for 50% average opacity across the first 4 layers. Opacity of each layer depends on the reliability of that layer in predicting the following results.
Taking these as being equal in weight (adjustments to old systems and introduction of new methods make comparison of accuracy difficult anyhow), these suggest Labor has a strong history in 25 seats to the Coalition's 18.
These trends only reflect historical voting and may differ from the next election depending on campaign-specific events or a general shift in public opinion.

More Maps. More Mayhem. More M-words.

Deck the streets with corflute posters, the campaign has officially started. Without delay, here is the continuation of our mapstravaganza.

Last week we looked at some averages of seat history. These averages used a series of judgements that will be continued this week, so I will repeat them here:

Firstly, we ignore any elections prior to 1938, since the multi-candidate nature of seats prior to this provides incomparable data.

Secondly, where seats have had multiple incarnations, we consider only the most recent incarnation's results.

Thirdly, where a seat has had its name changed, data from the previously-named seat is also included.

Finally, red indicates SA Labor seats, grey is independent and blue indicates current Coalition parties (Liberal and Nationals) and precursors to these (Liberal and Country League and Liberal Movement). The Liberal Movement is counted as Coalition, even though a small segment broke away to form the New Liberal Movement and eventually joined the Democrats. Affiliated Independents (e.g. Independent Liberals) are treated as independents since they are often running against incumbent representatives of their affiliated parties. The Single Tax League is also counted as independent.

Last week we took two maps of averaged history – one averaged since 1938 and the other averaged for the last 5 elections. Exactly how far back the averages go before they are cut off is often an arbitrary choice. These next maps, however, avoid this issue by weighting more recent elections' results as more valuable predictors of future patterns. This eliminates the problem of arbitrary cut-offs without diluting more recent results. However, it is important to realise that the weighting system is equally subjective with regard to how heavily the weighting should affect the results.

We are still experimenting with this mapping style, first suggested for the Variable Dependent Transparency Arrays (VDTAs). But first, I would like to look at Variable Independent Transparency Arrays (VITAs) or Constant Transparency Arrays (CTAs). These, as the names suggest, use a constant level of transparency. The following map uses a 10% opacity (or 90% transparency) for each election result. This means the 2010 results contribute 10% of the colour of each seat. The 2006 results provide 10% of the remaining 90% (or 9%) of the colour, while the 2002 results provide 10% of the remaining 81% (8.1%). This differs from the averaging system where all elections provide equally weighted results (i.e. all elections contribute ~4% of the total colour) and is easily produced by overlapping semi-transparent maps:

Lighter coloured electorates have a shorter history and thus more white from the background.
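The 10%, 9%, 8.1% pattern described above, and the amount of white background left showing, can be sketched as follows. The function names are my own; the only inputs are the constant opacity and the number of elections a seat has contested.

```python
def layer_weights(n_layers, opacity=0.10):
    """Share of the final colour contributed by each layer, newest first,
    when every layer is drawn at the same opacity."""
    return [opacity * (1 - opacity) ** k for k in range(n_layers)]

def background_share(n_layers, opacity=0.10):
    """Share of the final colour still coming from the white background."""
    return (1 - opacity) ** n_layers

# Three most recent elections (2010, 2006, 2002) at 10% opacity.
print([round(w, 3) for w in layer_weights(3)])  # [0.1, 0.09, 0.081]

# A seat with only two elections of history keeps most of its background.
print(round(background_share(2), 2))  # 0.81
```

The weights and the background share always add up to 1, which is why seats with short histories come out paler.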


The VDTAs are produced in the same way, but the opacity of each layer varies relative to a variable. First, however, let's look at the raw data:


ALP is the SA Labor Party, Co is the combined Coalition parties (and forebears) and IND is independent.

This table shows the voting history of each seat as per the colouring used in our maps (slightly faded to increase legibility). Normally I would leave this table for the data dump, but it is convenient to discuss it here.

Firstly, we can see that the majority of current seats were in place by 1993, hence the 5-election cut-off last post. This is a value we will return to later. Secondly, we can see the calculations conducted below.

For our VDTAs we want the opacity of each election to reflect the predictive strength of its results. For simplicity, we will only consider the seats still on the electoral map, and we calculate the predictive strength of each election by how reflective it was of the following results. For example, of the eight seats from 1938 still in use, five changed hands in 1941. This means three were retained (the 'carry' figure). This gives a 3/8 predictive strength for this election (or a 'carry percent' value of 38%). Note that 2010 has no predictive value as that requires comparison with the 2014 results. As a result this map is a little out-dated in that it does not incorporate the most recent results.
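As a sketch of the carry calculation, the function below compares two consecutive sets of results and counts the seats that stayed with the same party. The seat names and party allocations in the example are placeholders; only the 8 seats, 5 changes and 3 carried come from the 1938 to 1941 example above.

```python
def carry(earlier, later):
    """Return (seats carried, carry percent) for seats present in both elections.
    `earlier` and `later` map seat name -> winning party."""
    shared = [seat for seat in earlier if seat in later]
    if not shared:
        raise ValueError("no seats in common between the two elections")
    carried = sum(1 for seat in shared if earlier[seat] == later[seat])
    return carried, 100 * carried / len(shared)

# Placeholder data: eight seats, five of which change hands.
results_1938 = {f"Seat {i}": "Co" for i in range(1, 9)}
results_1941 = dict(results_1938)
for i in range(4, 9):
    results_1941[f"Seat {i}"] = "ALP"

print(carry(results_1938, results_1941))  # (3, 37.5)
```

The 37.5% here is the figure quoted as 38% in the text after rounding.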

Firstly, here is a map where the layers' opacity is equivalent to the carry percent. As a result, data only dates back to 1973 where a 100% opacity obscures any earlier data.


Unfortunately, since the top layer is at 87% opacity it dominates the map. This map really only shows 5 noticeable categories:
Bright red (e.g. Giles): Labor retained seats
Dark red (e.g. Light): Labor wins from the Coalition
Grey (e.g. Fisher): Independent retained seats
Blue (e.g. Stuart): Coalition retained seats, and
Dark pink (Mitchell): Independent wins from Labor
As a result, the map effectively shows the 2006 results, and if we are going to use only one year's results as a predictive tool we may as well use the 2010 results (provided in the previous post.)

With these high opacities, new layers readily obscure old. With low opacities, however, we quickly reach a position where no year exerts a significant influence on the data, and the wash-out effect is so low as to effectively serve as a seat average.

Data derived from the federal map last year suggest that a contribution of 5% or more of the colour is distinguishable to the human eye, if we take my eye as the type specimen of a human eye and accept that I am, in fact, the definitive example of a human being. Oddly the scientific establishment seems reluctant to accept that I am the standard against which all other people should be measured, but we'll carry on with this self-evident truth until they come to their senses. We must also accept that the arbitrarily scaled federal map is indicative of our current map (and all maps) coloured in this way, although initial evidence suggests it is easier to observe colour differences in larger seats.

Taking 5% colour change as the definition of a seat being meaningfully included, we calculated that roughly 10% average opacity (as used in the CTA above) gives a good approximation of the maximum number of influencing layers (7 layers – almost 8). The arbitrary decision to aim for this maximum inclusion was hastily made, and the opacity of each layer was divided by a figure intended to result in 10% average opacity across the top 7 layers.
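The "7 layers, almost 8" figure can be checked with a short loop. This is a sketch that assumes every layer has the same opacity and that a layer counts as influential while it contributes at least 5% of the final colour; the function name is mine.

```python
def influencing_layers(opacity, threshold=0.05):
    """Count how many layers, newest first, each contribute at least
    `threshold` of the final colour at a constant layer opacity."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    count = 0
    contribution = opacity          # the top layer contributes its own opacity
    while contribution >= threshold:
        count += 1
        contribution *= (1 - opacity)  # each older layer is dimmed by those above
    return count

print(influencing_layers(0.10))  # 7
```

At 10% opacity the eighth layer contributes roughly 4.8%, just under the cut-off, hence "almost 8".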

It is still undetermined exactly how many layers should be included for best results, but this time let's aim for 4 layers of influence. This will put the majority of the influence on the elections from 1993 onwards and, as we pointed out earlier and last post, 1993 was the election at which the vast majority of current seats were first present.

Last year's calculations based on 5% colour input suggest that the average opacity of the layers should be either delicately balanced between 6 and 7% opacity, or broadly equivalent to 50%. The opacity modifier (the constant denominator in the old VDTA equations) required to give an average opacity in the top 4 layers of 50% can be calculated thus:

where c is the percent carry

or, for the less pretentious or mathematically inclined, by dividing the average percent carry of 1993 to 2006 by 50. This gives a denominator of 1.66. Opacity (O) is then calculated as:


to give the following opacity values:


and this VDTA:



Make what you will of that until tomorrow (when I will make of them what I will).
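For anyone wanting to reproduce the opacity modifier described above, here is a minimal sketch. It follows the plain-English version of the calculation (average percent carry of the top four layers divided by 50, then each layer's carry divided by that figure). Only the 87% top-layer value appears in this post; the other three carry values are placeholders chosen so that the average is 83%, which gives the quoted denominator of 1.66.

```python
def opacity_modifier(carries, target_average=50):
    """Denominator that scales a set of carry percents to a target mean opacity."""
    return (sum(carries) / len(carries)) / target_average

def layer_opacities(carries, modifier):
    """Opacity of each layer: its carry percent divided by the modifier."""
    return [c / modifier for c in carries]

# Top four layers, newest first. 87 is quoted in the post; 85, 81 and 79
# are hypothetical values used only to make the example run.
top_four = [87, 85, 81, 79]

d = opacity_modifier(top_four)
opacities = layer_opacities(top_four, d)

print(round(d, 2))                          # 1.66
print([round(o, 1) for o in opacities])
print(round(sum(opacities) / len(opacities), 1))  # 50.0
```

With the real carry figures substituted for the placeholders, the same two functions should return the opacity values used for the final map.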

TL;DR:
  • More maps provided, layering previous election results in semitransparent layers
  • This both represents the history of the seat and favours recent results over old data as a predictive tool by having the newer layers eventually obscure the old
  • Analysis of these and yesterday's maps to follow tomorrow evening

Friday, 14 February 2014

Chromatic Cartography

At 12:01 on Saturday the 15th of February, the Governor of South Australia is expected to issue the writs for the South Australian state election. With the electoral cycle about to start officially, it is time to get into high gear, take our calculators out of the mothballs and do a lot of other symbolic activities. This week we begin to refine the information to formulate our predictions.

And what better place to start than by colouring in some maps? If it were possible to sit back and enjoy whilst simultaneously perching on the edge of your seat, I would advise it now as we enter a weekend of electoral map colouring in a three-part Mapstravaganza!

The last couple of months for me have been a montage of pencil sharpeners and colouring books.
First, here are the current standings:

Red is SA Labor, Blue is the Coalition (LIB and NAT), Grey is Independent
SA Labor: 26, Coalition: 18, Independent: 3

As discovered here, there are a few methods of using prior seat data to give a good starting point for further analysis. In the federal election the "as is" map (i.e. the equivalent of the above) was 85% accurate. Another 85% accurate map was obtained by averaging the seat histories:

Each election contributes roughly 4.2% of the total colour. Seats from 1938* are at 100% opacity, but more recently created seats such as Little Para (which has only contested two elections) are significantly paler. Blue now includes current Coalition parties (Liberal, Nationals) and precursors to these (Liberal and Country League, Liberal Movement**). Affiliated Independents (e.g. Independent Liberals Stan Evans in 1985 and Keith Russack in 1977) are coloured grey, as these often ran against (and took seats from) the recognised candidates for their own parties. Grey also includes the Single Tax League, who won Flinders in 1938.

Where seats have had earlier incarnations (Enfield 1956-1970; Port Adelaide 1857-1970; Stuart 1938-1993 and West Torrens 1857-1902, 1915-1938, 1956-1970), only the most recent version of that seat is included. Where a seat has had its name changed but is not modified in any other way, data from the previously-named seat is also included. Dunstan was named Norwood until 2014 and MacKillop was Victoria until 1993.

This next map only considers electoral data from the last 5 elections (1993, 1997, 2002, 2006 and 2010). Over half of the current electoral districts were present by the 1985 election with the additions of Bright and Ramsay, but a further 10 were added two elections later, making 1993 arbitrarily the beginning of the following map.

Now don't worry -- there are Variable Dependent Transparency Arrays to follow as well, but at the risk of overdoing the number of maps in one post, these will form tomorrow's post, and we will then cover the analysis of all of these maps in the final post on Sunday.

In the mean time, happy new election cycle everybody!

TL;DR:
  • Pretty maps. More to come over the rest of the weekend.
  • Maps of the current standings and averaged history of electorates provided above.
  • VDTAs to follow in part 2 on Saturday.
  • Analysis in part 3 on Sunday.

*All data is used from after (and including) the 1938 election. Although South Australian elections can be followed right back to 1857, 1938 marks the end of multi-candidate electorates and begins the era of comparable data. Several current seats had incarnations dating back to 1857, but only Flinders has existed since then without interruption.
** The Liberal Movement is counted as Coalition, even though a small segment broke away to form the New Liberal Movement and eventually joined the Democrats.

Saturday, 8 February 2014

A poor psephologist concludes blaming his tools


So before we start, I recently got some feedback that this site is a little dry. It's a political science blog, so there is only so much I can do to rectify this; the point, however, is valid. Since I am not a professional, my only real draw is being accessible or entertaining.

Occasionally I can do this through a jocular manner or a well placed image. Unfortunately, there is very little to say – much less joke about – while we're still a month out from the state election. So while I will keep the entertainment value of these posts in mind, there's not a huge amount I can do at this point. Realistically this is going to be most enjoyable for those readers who play along at home.


The box contains countless electoral maps, and a dartboard to pin them to.

For those of you without the time or inclination to formulate separate predictions, this next month is not going to be too riveting. Sorry. However, I will be providing TL;DR: summaries at the bottom of my posts for those who find themselves tuning out.

Griffith By-election Update:


For those who are playing along, we have a prediction for Griffith to keep our eyes on. And although the LNQ candidate Dr Bill Glasson is not admitting defeat, the seat can confidently be called for the ALP, though with a slight swing to the Coalition. So that's a point for me, and hopefully for most of you too.


Review of Pendula:


The pre- and post-election pendula have been long-standing features of Australian psephology, as a direct result of their usefulness. As summarised here, the pre-election pendulum had an 87% success rate as a predictive tool, which is superior to the VDTA I used. It is difficult to assess the success of the seat run-downs, but clearly the pendulum is one of our most powerful tools.


However, this could easily be improved by a more accurate model of predicting swings. Assuming a uniform swing regularly fails to yield sufficiently accurate predictions, but several analyses conducted last year on this blog failed to refine this method:

This post indicates that swing (i.e. volatility) is not noticeably related to the seat's marginality – in other words being closely contested is not an indication of the size of the swing.

This post went on to show that a seat that had a large swing one election may have a small swing in the next. This dispelled the possibility of “large swing” and “small swing” seats, and led me to look at seat volatility as a long-term trend (as in the seat run downs) rather than an innate feature of particular areas.

This also gives us a helpful hint that the factors that drive swing vary from election to election. I have no doubt I will return to this topic in the future. When I do, I will probably look at comparing the economic activities of a seat with the main themes of the election (e.g. if water restrictions were a major issue, did this provoke a greater swing in agricultural seats than industrial ones?) I'll also keep an eye out for articles by other psephologists that might give me some more hints on the topic.

I suspect that this is going to continue to be a complicated issue, and one that may not be resolved for a long time.
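
For those playing along at home, the uniform-swing assumption discussed above can be sketched in a few lines of Python. This is only a rough illustration of the general idea, not the calculation behind any published pendulum: the seat names, margins, swing figure and function name are all made up for the example.

```python
def apply_uniform_swing(margins, swing_to_coalition):
    """Predict seat winners by applying the same swing to every seat.

    margins: dict of seat -> TPP margin in points, positive if the ALP
             holds the seat and negative if the Coalition holds it.
    swing_to_coalition: TPP swing towards the Coalition, in points.
    """
    predictions = {}
    for seat, margin in margins.items():
        new_margin = margin - swing_to_coalition
        if new_margin > 0:
            predictions[seat] = "ALP"
        elif new_margin < 0:
            predictions[seat] = "Coalition"
        else:
            predictions[seat] = "Tossup"
    return predictions


# Hypothetical seats and margins, for illustration only.
example_margins = {"Seat A": 8.5, "Seat B": 2.0, "Seat C": -4.0}
predicted = apply_uniform_swing(example_margins, swing_to_coalition=3.0)
```

Every seat moves by the same amount, which is exactly the simplification that seems to let the method down: a seat-by-seat model would need a different swing for each entry.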

TL;DR: Successful prediction for Griffith: ALP hold
Pendula are the most useful tool used in my 2013 analysis
There is no known reliable way of calculating seat-by-seat swing
Swing is not determined by seat-specific data, but depends on the election

Saturday, 1 February 2014

A poor psephologist continues to blame his tools...

In other news:


Perhaps some of you were expecting some discussion of the whole Don Farrell debacle, but I have nothing new to add except to restate the publicly available facts. This far out from the elections it is too difficult to know the magnitude of the impact on the vote, and even harder to calculate it in a way that so much as resembles what passes for political science around here.

Some of you may also expect some comment on the Griffith by-election next Saturday. The seat run-down for Griffith last election (a mechanism we are reviewing this week) put the seat as a standard Labor leaning division with a safe but variable rating:

Griffith:

Incumbent: Kevin Rudd (ALP)
Incumbent/Party Run: 5 Elections won (1998 - Present)
2010 Margin (TPP): 8.46% against LNQ
Electoral History: 1934–1949: ALP
1949–1954: LIB
1954–1958: ALP
1958–1961: LIB
1961–1966: ALP
1966–1977: LIB
1977–1996: ALP
1996–1998: LIB
1998–present: ALP
Longest Electoral Run: 7 Elections won (1977 - 1996) - ALP
State Divisions: Part of Bulimba, part of Chatsworth, part of Greenslopes, part of Mansfield, part of South Brisbane and part of Yeerongpilly.
Assessment: Variable, safe ALP

With Mr Rudd retiring, a variable seat history and a predominantly Liberal state-seat composition (although this is a largely exaggerated indication of Liberal support resulting from a landslide state election two years ago) this seat is obtainable with difficulty for the Coalition. Factor in the muscle that can be gained from being the dominant federal party, united Liberal and National support under the LNQ banner and the prestige of scalping the seat of a former Labor PM, and we can expect the Coalition to pull out all the stops to try and take this one.

The loss of any advantage of incumbency will be unfortunate for Labor, and LNQ candidate Bill Glasson out-polled Rudd on first preferences last year anyhow. However, I also expect a lot of historically Labor supporters who could not bring themselves to vote ALP last year might return to the fold, and that the 3,000-odd PUP votes are likely to be part of this.

My personal feeling is that PUP did well last year due to a combination of protest votes and people thinking it would be humorous to vote for now MP Clive Palmer rather than actual support of PUP policies, and I suspect that normally ALP voters were a far larger proportion of the protest vote than Coalition supporters. If I am right, the PUP will be a one hit wonder that will collapse next election, especially now that Palmer is actually in parliament and is therefore an apparently viable candidate. Griffith will be an interesting test case to see if PUP support has fallen in the last 4 months.

On TPP, the ALP won Griffith with 53% of the vote. This is vulnerable, but it is important to realise that the Coalition has only won this seat once since 1977. I would argue that, although an LNQ victory is far from impossible, this seat is most likely to stay with the ALP, possibly increasing the margin.

Prediction: ALP hold

With that resolved, I will continue as planned with my review of the tools used at the last federal election, this week looking at:

Seat Run-Downs


The seat run-downs were very subjective, largely because this was the summary I used to factor in all of the subjective or non-quantitative information that might play a role in the result. For example, there is no easy way to factor incumbency into a prediction mathematically, so this was dealt with in the run-downs.

The run-downs offered two results, a measure of safety and a measure of volatility. These measures roughly equate to an estimate of how likely a seat is to go to one party or another, and a rough margin of error on that estimate or value of certainty. Two seats might be rated "safe", for example, but this is far more solid in a stable seat than a volatile one.

Because these measures were based on my personal interpretation of the facts as well as my choice of which facts to include, this should be a very poor predictive tool in terms of determining how marginal seats will fall. It is, however, potentially useful in identifying which seats are marginal in a more complete way than looking at margins on the pendulum.

Of the 149 seats listed here (I apparently forgot Eden-Monaro), 16 were listed as tossups and 32 as critical. 18 more were considered standard or normal, with 83 bastions or very-safe seats.

The AEC lists the 22 seats that changed hands as: Banks, Barton, Bass, Braddon, Capricornia, Corangamite, Deakin, Dobell, Eden-Monaro, Fairfax, Hindmarsh, Indi, La Trobe, Lindsay, Lyne, Lyons, New England, O'Connor, Page, Petrie, Reid and Robertson.

I will ignore Eden-Monaro, whose prediction was accidentally excluded from the run-down summary (and being a NSW seat is not recorded in the raw data), even though I suspect its history as a bellwether seat might have placed it as a tossup.

I will accept Indi (won by an Independent) and Fairfax (won by Clive Palmer for the PUP) as unavoidable scrapes and bumps suffered in the trek through the political quagmire. Because it is not possible to study every candidate in every seat, minor parties and independents can rarely be factored into nation-wide analysis. Both of these seats were considered to be Bastions because it was inconceivable that Labor could take them; I feel that this assessment is still valid.

Despite Lyne and New England changing hands, these were correctly labelled as Bastions for the Nationals because they changed hands to National candidates. In other words, they were considered safe for the Nationals even when the Nationals did not currently hold the seats. This seems like a ballsy call to make, except both were held by retiring Independents and thus had to change hands. Both were no-brainers. Thus, just as Indi and Fairfax cannot fairly be considered points against my methodology, Lyne and New England cannot be considered points in my favour. These are just non-TPP figures confusing a TPP model.

This leaves us with 17 seats changing hands. 4 were Tossups (Bass, Dobell, La Trobe and Robertson), 6 were Critical (Braddon, Capricornia, Lindsay, O'Connor, Page and Petrie), 2 were Standard seats (Corangamite and Hindmarsh) and 5 were considered Bastions (Banks, Barton, Deakin, Lyons and Reid).

This is still a lot of supposed Bastions changing sides. Roughly 30% of seats changing hands were Bastions; this is, however, looking at the results backwards. The question is not what proportion of seats that changed hands were in each category, but what proportion of seats in each category changed hands.

4 of the 16 Tossups (25%) changed hands, as did 6 of the 32 Criticals (19%). Only 2 of the 18 Standards (11%) changed hands, while the 5 Bastions that flipped were a tiny proportion (6%) of the total 83. In these values we can see the increasing resistance to changing hands.
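
For anyone who wants to check the arithmetic, here is a minimal Python sketch of both ways of slicing the numbers, using only the seat counts quoted above. The variable names are mine and purely illustrative.

```python
# Seat counts per category, as quoted in the text above.
category_totals = {"Tossup": 16, "Critical": 32, "Standard": 18, "Bastion": 83}
changed_hands = {"Tossup": 4, "Critical": 6, "Standard": 2, "Bastion": 5}

total_changed = sum(changed_hands.values())

# The "backwards" view: what share of all flipped seats fell in each category?
share_of_flips = {
    category: round(100 * count / total_changed)
    for category, count in changed_hands.items()
}

# The more useful view: what share of each category actually flipped?
flip_rates = {
    category: round(100 * changed_hands[category] / total)
    for category, total in category_totals.items()
}
```

Here `share_of_flips["Bastion"]` comes out at 29, the "roughly 30%" figure, while `flip_rates` gives 25, 19, 11 and 6 for Tossups, Criticals, Standards and Bastions respectively.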

O'Connor is an interesting case. Although there was ample evidence to suggest O'Connor would stay with the Nats and it was considered a Bastion, it did fall. However, it fell from the Nats to the Libs. Normally this would not occur, except in WA and SA where the Coalition parties have refused to agree not to contest seats already held by a Coalition partner. So this seat is still a Coalition seat, if not a Nationals one. However, even the latter claim might have superficially appeared justified in light of the following summary:

O'Connor:

Incumbent: Tony Crook (NAT)
Incumbent/Party Run: 1 Elections won (2010 - Present)
2010 Margin (TPP): 3.56% against LIB
Electoral History: 1980-2010: LIB
2010 - Present: NAT
Longest Electoral Run: 11 Elections won (1980 - 2010) - LIB
State Divisions: Albany, part of Blackwood-Stirling, part of Central Wheatbelt, Eyre, part of Kalgoorlie, part of Pilbara and part of Wagin
Assessment: Variable, leaning to NAT

Note the abundance of dark yellow at the state level, and elsewhere. However O'Connor was newly National (1 Elections won (2010 - Present)) being otherwise Liberal since 1980. This is why the seat is listed as both variable and leaning, which would rule it as Critical according to our conversion matrix:

As I noted here before the election: "I would not be surprised if this was close run between the Coalition parties in this election, but a Coalition victory is pretty well assured. Whether or not this is worth watching depends, I guess, on your interest in inter-Coalition contests." It also explains why the seat is listed here as Critical.

So we have two options. If we treat the Coalition parties as one, O'Connor no longer counts and only 4 Bastions changed hands (5%). Alternatively, if we count the Nat-to-Lib flip as a change of hands, then O'Connor is rated Critical and the corrected percentages should read: 25% of Tossups (4/16), 21% of Criticals (7/33), 11% of Standards (2/18) and 5% of Bastions (4/82) changed hands.

Either way, we can use the figures of 25%, 20%, 10% and 5% as rough approximations of the chance of a seat changing hands in each category (Tossup, Critical, Standard and Bastion respectively). How well these figures hold between federal elections (much less state elections) remains to be seen, especially given the lack of definitive guidelines on how to rate safety and volatility.

Further analysis of the size of swings relative to the predicted volatility of a seat may be conducted at a later date. I may also look at the margins of each seat compared to its safety and overall rating. However, as this method does not predict outcomes of elections (with Critical and Tossup seats effectively undetermined) it will never be possible to determine a value for the accuracy of this method as a predictive tool. It is, however, clearly useful in highlighting the seats to watch.

At least, it would be if it were usable for anything other than federal elections. Although much of the data can also be determined for state seats, one of the factors for getting a feel of each seat's security is the results of more recent elections. While the state seats that compose a federal division might be useful to roughly summarise the opinions and ideologies of groups within the latter, the same does not work in reverse. It is far more accurate to call a federal seat for a party based on the overwhelming lean in its constituent state divisions than to call a state seat because it is a subset of a federal seat. An average of smaller seats can be used as an approximation for a large area roughly representing their combined populations, but one cannot use a wider average to accurately reflect a small subsection.

To use the example of O'Connor above, Albany (a state seat) is located entirely within this federal division. Albany was a Labor seat, but O'Connor was accurately called Coalition because this was outweighed by the other (predominantly Nationals) state seats. In reverse, we cannot use the general Coalition trend of O'Connor as an approximation of Albany.
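
The asymmetry can be shown with a small Python sketch. All of the seat names, voter numbers and TPP figures below are invented for illustration; they are not real figures for O'Connor, Albany or any other seat.

```python
# Hypothetical state seats inside one federal division:
# name -> (number of voters, Coalition TPP percentage).
state_seats = {
    "Labor-leaning seat": (20000, 45.0),
    "Coalition seat 1": (25000, 65.0),
    "Coalition seat 2": (15000, 70.0),
}

total_voters = sum(voters for voters, _ in state_seats.values())

# A voter-weighted average of the parts approximates the whole division.
federal_estimate = (
    sum(voters * tpp for voters, tpp in state_seats.values()) / total_voters
)

# Going the other way fails: the division-wide figure (about 59.6% here)
# would wrongly call the Labor-leaning seat for the Coalition.
wrongly_called = [
    name for name, (_, tpp) in state_seats.items()
    if (tpp > 50) != (federal_estimate > 50)
]
```

In this made-up example the division is comfortably Coalition on the weighted average, yet `wrongly_called` still contains the one Labor-leaning seat, which is the same trap as inferring Albany from O'Connor.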

Sunday, 26 January 2014

A poor psephologist blames his tools...


With the South Australian Labor Party having launched their campaign for the March election I suppose it must be time to get back to blogging. However, this early in the run up there is really not much in the news to talk about, so I get to indulge my own little psephological projects. And that, of course, means it is time for a...

Statistics Party!!!



For the federal House of Representatives election last year I based my rather mediocre predictions on three tools – the Variable-Dependent Transparency Arrays which mapped past voting trends, the Pendulum which summarised the margin of each seat, and Seat Run-Downs for each state which summarised the general historical lean of each seat. Over the next few weeks, unless something more interesting or time-sensitive turns up, I will be analysing each tool's accuracy and usefulness; this will then inform my use of these tools (or lack thereof) in the state election. This week we look at the VDTA:

Tool Summary:


Numbers were crunched, maps were coloured and fun was had by all. The VDTA uses very subjective calculations to broadly summarise the voting trends of recent years by superimposing semi-transparent layers of election results so that recent outcomes eventually blot out older ones. The transparency of each layer depends on a variable – in this case the accuracy of using this election to predict the next.
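
To give a feel for the layering idea, here is a rough Python sketch of stacking semi-transparent colours using ordinary alpha blending. It is not the actual VDTA equation: the opacity values are placeholders, the colours are plain red and blue, and the function name is my own invention.

```python
def composite(layers, background=(255, 255, 255)):
    """Stack semi-transparent layers, oldest first, over a background.

    layers: list of ((r, g, b), alpha) pairs, with alpha between 0
            (fully transparent) and 1 (fully opaque).
    Returns the blended (r, g, b) colour as integers.
    """
    r, g, b = background
    for (layer_r, layer_g, layer_b), alpha in layers:
        r = alpha * layer_r + (1 - alpha) * r
        g = alpha * layer_g + (1 - alpha) * g
        b = alpha * layer_b + (1 - alpha) * b
    return (round(r), round(g), round(b))


RED = (255, 0, 0)   # Labor win
BLUE = (0, 0, 255)  # Coalition win

# Hypothetical seat: two Labor wins followed by a Coalition win,
# each layer drawn at a placeholder 50% opacity.
seat_colour = composite([(RED, 0.5), (RED, 0.5), (BLUE, 0.5)])
```

With these placeholder opacities the seat ends up slightly more blue than red, showing how the most recent result can blot out two older ones. In the real thing the opacity of each layer would come from the variable described above rather than a flat 50%.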

Results Analysis:


These are the results from the map we used:

Data source.

and these are the same results divided into distinct predictions based on their hexadecimal colour code (those with a higher red value are red, those with a higher blue are blue and the white divisions remain the same):

Blue are Coalition, red Labor and white excluded.
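
The colour-to-prediction rule described above can be written as a short Python sketch. The function name and example colour codes are hypothetical, and treating any colour with equal red and blue values (not just pure white) as excluded is my own assumption.

```python
def classify_hex(colour):
    """Turn a map colour such as '#CC3366' into a seat prediction.

    Compares the red and blue components of a six-digit hex code.
    """
    digits = colour.lstrip("#")
    red = int(digits[0:2], 16)
    blue = int(digits[4:6], 16)
    if red > blue:
        return "Labor"
    if blue > red:
        return "Coalition"
    # Assumption: equal red and blue (e.g. white) means no prediction.
    return "Excluded"


# Hypothetical colour codes, for illustration only.
examples = {code: classify_hex(code) for code in ("#CC3366", "#3366CC", "#FFFFFF")}
```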

This map correctly predicted 116 seats, got 32 wrong and called 2 tossups.

Green are correct, red incorrect and black excluded.

This is roughly 78% accurate for all called (i.e. non-tossup) seats. Both tossups had insufficient data to calculate a value for the VDTA. A state-by-state (and territory-by-territory) breakdown of accuracy ratings is as follows:

ACT: 100% (2/2)
NSW: 79% (38/48)
NT: 100% (2/2)
QLD: 69% (20/29)
SA: 82% (9/11)
TAS: 80% (4/5)
VIC: 86% (32/37)
WA: 71% (10/14)
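
For those playing along at home, the percentages above can be reproduced with a short Python sketch using only the counts listed. The variable names are mine.

```python
# (correct predictions, called seats) per state or territory, from the list above.
vdta_results = {
    "ACT": (2, 2), "NSW": (38, 48), "NT": (2, 2), "QLD": (20, 29),
    "SA": (9, 11), "TAS": (4, 5), "VIC": (32, 37), "WA": (10, 14),
}

vdta_accuracy = {
    state: round(100 * correct / called)
    for state, (correct, called) in vdta_results.items()
}

# National figure from the headline counts: 116 correct and 32 wrong.
national_accuracy = round(100 * 116 / (116 + 32))
```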

Superficially, we might expect an accuracy percentage of high 70s to low 80s by applying the same VDTA equation to the SA state election. Ignoring for a moment the likely differences between the two elections, let's remember that this is the first data point on the accuracy of this methodology. Let's use this figure as a ballpark, but not rely on it too heavily until we have a few more elections under our collective belt.

The obvious question, then, is whether or not we are using the optimal equation.


I can confirm that we are almost certainly not. I have no doubt that with a little tweak to the denominator in the equation we can improve the accuracy a little. And then, as I outlined in the methodology, redefining the number of elections factored into the C value could possibly improve the long-term predictive power of the method, at the expense of accumulating more short-term outliers. Then, of course, we could try changing the dependent variable (number of seats changing hands) to make the transparency dependent on margins or swings, on a seat-by-seat or national basis.

All of these could be fruitful avenues of investigation once we have more results to work off of, but it would be premature to tinker around now. I am sure we could get some startlingly accurate correlations between the VDTA and the actual results, but I sincerely doubt these would form a good predictive tool rather than an ad hoc and confected match up with the previous outcome.

However, the VDTA was proposed as an alternative to simply averaging the history of the seats, and when we do a comparison, simply averaging is more accurate. This implies, at this early stage, that recent electoral data is not necessarily more relevant than older data. Further consideration is required, but here are the stats:

Green are correct, red incorrect and black excluded. Data.

ACT: 100% (2/2)
NSW: 94% (45/48)
NT: 100% (2/2)
QLD: 77% (23/30)
SA: 73% (8/11)
TAS: 40% (2/5)
VIC: 92% (34/37)
WA: 80% (12/15)

NATIONAL: 85% (128/150)

The only states where averaging performed worse than the VDTA were SA and Tasmania, which will be our next elections covered. At this point it seems that the VDTA introduces unnecessary noise, but alternatively may be more accurate in the upcoming predictions. I think it may pay to use both and see which works best in these two states and across other elections too.
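
The state-by-state comparison can be checked with a quick Python sketch built from the two sets of counts listed in this post. The variable names are illustrative only.

```python
# (correct predictions, seats counted) per state or territory for each method.
vdta = {
    "ACT": (2, 2), "NSW": (38, 48), "NT": (2, 2), "QLD": (20, 29),
    "SA": (9, 11), "TAS": (4, 5), "VIC": (32, 37), "WA": (10, 14),
}
averaging = {
    "ACT": (2, 2), "NSW": (45, 48), "NT": (2, 2), "QLD": (23, 30),
    "SA": (8, 11), "TAS": (2, 5), "VIC": (34, 37), "WA": (12, 15),
}


def rate(result):
    """Accuracy as a fraction between 0 and 1."""
    correct, counted = result
    return correct / counted


# States where simple averaging did worse than the VDTA.
averaging_worse = sorted(
    state for state in vdta if rate(averaging[state]) < rate(vdta[state])
)
```

Running this leaves `averaging_worse` as `["SA", "TAS"]`, with averaging level or ahead everywhere else.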

Finally, consider the extreme case of a VDTA with 0% transparency, which was the other simplistic map the VDTA was intended to supersede. In practice this would just be using the 2010 results as a blueprint for the 2013 predictions, possibly with intensity factored in to represent length of incumbency as proposed here.

The simplistic way of testing this is simply to look at what percentage of seats changed hands on the results pendulum, and the accuracy is 100% minus this value.

"Prediction" column refers to my 2013 overall prediction, not the prediction of one specific method.

22 seats changed hands, which is roughly 15% of the seats. This gives the method of using the previous election as the predictions for the next 85% accuracy, the same as the seat averages and better than the VDTA. This technique does better with independents and minor parties, who may hold seats consecutively but rarely show up on the VDTA or seat averages.
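
Putting the three headline figures side by side, here is a small Python sketch using only the counts quoted in this post. The dictionary keys are my own labels for the methods.

```python
total_seats = 150
seats_changed = 22

# "No change" baseline: every seat that did not change hands counts as correct.
no_change_correct = total_seats - seats_changed

# (correct predictions, seats counted) for each method.
methods = {
    "VDTA": (116, 148),
    "Seat averages": (128, 150),
    "Previous result": (no_change_correct, total_seats),
}

method_accuracy = {
    name: round(100 * correct / counted)
    for name, (correct, counted) in methods.items()
}
```

This gives 78 for the VDTA and 85 for both the seat averages and the previous-result baseline.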

Conclusion:


While more data is required, initial results suggest the VDTA is not an effective summary of past voting trends for the purposes of extrapolation into the future.