Thursday 7 February 2013

Hindsight is 2010...

Prelude:


A quick warning before I get to the main post:

This week's post is a long one. I got very bogged down in the mechanics of the second map – or “variable-dependent transparency array”. If you don't particularly care how the map works you might as well look at the pretty pictures and jump straight to the results section. I enjoyed writing it, which is the main thing, and I feel it is important to at least describe (if not justify) my methodology. Next week will be more of the same, so those of you not interested in the maths behind the maps will have to be content with the maps themselves. I'll return to something a little less specialised and more straightforward two weeks from now. Hopefully.

Bloggers Rush In Where Statisticians Fear to Tread:


I began last week's post by saying it was too early to make a prediction. That's because I was building up to this week's post, but got distracted (as I promised would happen in the post before last). It is indeed too early to make any informed predictions based on polling, general swing or campaign strategy. Instead, I want to look at past trends to suggest where we might be headed – something I have not previously had the time, inclination or appropriate space to do. It is undeniable that the past strongly informs our predictions. I don't need to look at any recent data to tell you that my seat will vote Liberal, since I have the misfortune of living in a safe seat, making my vote largely redundant in the lower house; specifically, I reside in the electoral district of Mayo, which has been Liberal since it was formed in 1984. In fact, it has only ever been held by two people: Alexander Downer (1984 – 2008) and Jamie Briggs (2008 – Present).

Below are two maps. At least a few of you should recognise the depicted land mass as Australia. A couple of you – I flatter myself that I will have at least a couple of semi-regular readers – may know your electoral boundaries well enough to realise this is a map of the seats in the Australian House of Representatives. Alternatively you might have reached that conclusion given that this is me and we are now in the pre-pre-pre-election season (although if I have any American readers – which I doubt – you will justifiably mock me for considering seven months to be a lengthy federal campaign).

This map shows the current distribution of seats. Red is ALP, blue is the Coalition (the Liberal Party, including the LNP and the CLP, and the Nationals including the CNP), grey is independent (including Kennedy's Bob Katter Jr., although he is now a member of Katter's Australian Party), and green – as you may have guessed – represents the Greens.


This next image is a little more complicated. Simplistically, this suggests how each seat is likely to vote based on historical trends (some data dating back to federation). But then if you wanted it that simple, you should probably be following a different blog.

The colours are as before, with Red for ALP and Blue for the Coalition. Independents and the Greens have not held any seat long enough to have an influence on this data yet. Purple covers the range between the ALP and Coalition parties. Blueish-purples (e.g. Hinkler) are more likely to go to some form of Liberal or National party. Reddish-purples (e.g. Dobell) are Labor-leaning. Paler divisions have a shorter electoral history and thus a greater possibility of error. White districts were only created one election ago and have insufficient data to form any conclusions. They are likely to fall as they did in 2010, but whether they are a clear-cut red or blue, or a marginal puce, is still unknown.

 

Messing with Maps (or Cryptic Cartography):


Over time seats have been redistributed, created and dissolved. I am using the contemporary map for simplicity and comparability, however the data in the older seats may derive from different boundaries.

As I said, the colour schemes are as before, except that the colours now include former incarnations or related branches of the parties previously represented (e.g. UAP is blue and Lang Labor is red). Some might take the simple route of averaging the colours, so a seat that votes red half of the time and blue the remaining times would appear a mid-range purple. That's fine if a seat is constantly switching back and forth, but a little misleading if a seat was consistently one colour from 1901 until 1958 and then switched to the other up until the present. The simple fact is that demographics change over time, generally through older residents passing on and younger generations moving in to eventually become the next group of elderly inhabitants. In the above example it would be highly likely the seat will continue to vote as it has for 55 years.
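To see why plain averaging misleads, here is a small Python sketch. The seat history is invented purely for illustration (20 red results followed by 20 blue ones), and the function name average_red_share is likewise made up.

```python
def average_red_share(results):
    """Fraction of elections in which the seat went 'red'."""
    return results.count("red") / len(results)

# Hypothetical seat: red for its first 20 elections, blue for the 20 since.
history = ["red"] * 20 + ["blue"] * 20

share = average_red_share(history)   # 0.5, i.e. a mid-range purple
recent = history[-20:]               # yet every recent result is blue
```

The average calls this seat a coin toss, even though it has not gone red in its last 20 elections.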

It is all very well to just ignore data before a certain date, of course, and weed out trends from the early 1900s but I consider there to be two problems with this:
  1. Given a sufficiently generous cut off date, a similar error to that above might still arise on a smaller scale, and
  2. Any cut-off date will be arbitrary and an artificial influence on the data.

Instead, I have devised a new form of representing data to display trends through time and their two-dimensional geographic distributions. (In other words I've fiddled with the map in ways statisticians are not going to like.) I call it a 'variable-dependent transparency array' in the hope that someone will find that name too cumbersome and rename it the Thomas Map, ensuring my surname will live on forever (or at least as long as that of Dr. Pie, whose circular chart is still used by students to bluff their way through PowerPoint presentations today).

In any given seat, each successive incumbent's party colours have been accumulated as semi-transparent layers. This means that more recent trends will eventually wash out old data, so that results from 1910 are given significantly less attention than results from 2010. If each layer had 50% opacity, the 2010 election data would contribute 50% of the colour of a seat, 2007 would contribute 25%, 2004 12.5% and so forth. In other words a seat that was stable until the 1980s and then became volatile will appear somewhere around the midrange, minimising the impact of long-forgotten, pre-1980s opinions on our predictions for 2013.
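For anyone who thinks better in code, the 50% example can be sketched in a few lines of Python. This is only an illustration, not how the maps were actually drawn: the function name layer_weights is made up, and it assumes ordinary "paint one semi-transparent layer over another" blending.

```python
def layer_weights(opacities_newest_first):
    """Share of the final colour contributed by each layer.

    Opacities are fractions (0.5 means 50%), listed newest layer first.
    Returns (weights, base_share), where base_share is whatever still
    shows through from the background underneath all the layers.
    """
    weights = []
    remaining = 1.0  # share of the colour not yet claimed by newer layers
    for alpha in opacities_newest_first:
        weights.append(remaining * alpha)
        remaining *= 1.0 - alpha
    return weights, remaining

# Three layers at a blanket 50%: 2010, 2007, 2004.
weights, base_share = layer_weights([0.5, 0.5, 0.5])
# weights    -> [0.5, 0.25, 0.125]
# base_share -> 0.125 (left over for all older layers and the background)
```

Each older layer only gets what the newer layers above it leave uncovered, which is why the contributions halve at every step.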

However, a blanket 50% opacity would be too simplistic. Some highly volatile swings could contribute some considerable outliers in recent years, washing out stable long-term trends. Instead, each election's opacity level is equal to the percentage of seats that did not change hands* in the following election, as a proportion of previously existing seats**, divided by 8.69***. Unfortunately this means the 2010 data cannot be included, since its opacity would be based on the 2013 results. In a year where every seat changed – a situation that has never come close to arising – the previous election's layer would have 0% opacity (i.e. invisible) since this data clearly provided no indication of the election to follow. Conversely, in a year when no seats change, opacity is just over 11% (100%/8.69 ≈ 11.5%).
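The opacity rule above can be written as a short Python function. Treat it as a sketch of the formula as described in this post rather than the exact spreadsheet behind the maps; the names layer_opacity_pct and DIVISOR are made up, and the 150-seat house in the example is just a round number chosen for illustration.

```python
DIVISOR = 8.69  # the scaling constant quoted in the post


def layer_opacity_pct(seats_retained, pre_existing_seats):
    """Opacity (in percent) of an election's layer.

    seats_retained: pre-existing seats that did NOT change colour at the
        following election.
    pre_existing_seats: seats that already existed before that election.
    """
    if pre_existing_seats <= 0:
        raise ValueError("need at least one pre-existing seat")
    retained_pct = 100.0 * seats_retained / pre_existing_seats
    return retained_pct / DIVISOR


every_seat_changed = layer_opacity_pct(0, 150)    # 0.0: layer is invisible
no_seat_changed = layer_opacity_pct(150, 150)     # just over 11%
```

The two extremes match the text: a total turnover makes the earlier layer vanish, while a completely static election caps the layer at 100/8.69 percent.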

I will devote my next post to further discussion of variable-dependent transparency arrays and the reason for this limit of around 11%, but suffice it to say this allows data from previous elections to bleed through more readily and prevents one year where every seat changes hands from completely blotting out previous data. It is worth noting, however, that the results are more comparative than quantitative, since alternate opacity formulae (e.g. opacity = ((number of seats retained/total pre-existing seats) x 100%)/2) would yield differing results. (I did say statisticians wouldn't like it.) Using that alternative equation, for example, would place greater stress on more recent elections by increasing the power of those years to wash out long-standing trends.
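A rough Python comparison of the two formulae shows how much more weight the "divide by 2" version hands to recent elections. The 90% retention rate is an assumed figure for illustration only, not a real result, and the function names are made up.

```python
def opacity_divided(retained_fraction, divisor=8.69):
    """Opacity as a fraction, using the divide-by-8.69 rule."""
    return retained_fraction / divisor


def opacity_halved(retained_fraction):
    """Opacity as a fraction, using the alternative divide-by-2 rule."""
    return retained_fraction / 2


def share_of_newest_layers(alpha, n):
    """Combined share of the final colour taken by the newest n layers,
    assuming every layer has the same opacity alpha."""
    return 1.0 - (1.0 - alpha) ** n


retention = 0.9  # hypothetical: 90% of seats retained at every election

newest3_divided = share_of_newest_layers(opacity_divided(retention), 3)
newest3_halved = share_of_newest_layers(opacity_halved(retention), 3)
# Under the halved rule the three newest layers supply most of the colour;
# under the 8.69 rule they supply well under half of it.
```

Under these assumptions the choice of divisor decides whether the map mostly reflects the last few elections or a much longer history.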

All seats start off white with 100% opacity and remain so until the first election in which that seat was contested. Paler seats, therefore, are those with shorter histories and less available data – and thus potentially less reliable trends.
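As a sketch of why short histories come out pale, here is the layering applied to actual colours in Python. The pure-red and pure-blue RGB values, the flat 10% opacity and the function name composite are all assumptions made for the example; the real maps may use different shades and vary the opacity per election.

```python
WHITE = (255.0, 255.0, 255.0)
RED = (255.0, 0.0, 0.0)
BLUE = (0.0, 0.0, 255.0)


def composite(layers, base=WHITE):
    """Paint semi-transparent layers over a base colour.

    layers: list of (rgb, opacity_fraction) pairs, oldest first.
    Returns the resulting (r, g, b) as floats.
    """
    r, g, b = base
    for (lr, lg, lb), alpha in layers:
        # Standard alpha blend: keep (1 - alpha) of what is underneath.
        r = (1.0 - alpha) * r + alpha * lr
        g = (1.0 - alpha) * g + alpha * lg
        b = (1.0 - alpha) * b + alpha * lb
    return (r, g, b)


new_seat = composite([(RED, 0.1)] * 2)    # two elections of history
old_seat = composite([(RED, 0.1)] * 20)   # twenty elections of history
# new_seat keeps far more of the white base, so it looks paler.
```

With no layers at all the function simply returns white, which corresponds to a seat that has not yet been contested.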

By-elections are ignored, since they only involve one seat and thus yield either 0% or 100% change, drastically affecting the opacity for the previous layer and making some layers completely transparent.

 

Results:


It is important to note that while in terms of area both maps are dominated by blue or blue-heavy shades of purple, this does not necessarily equate to Coalition victory. The Coalition currently has one more seat in the lower house than the ALP, but geographically they represent at least two-thirds of the land mass. This is because the Coalition traditionally does well among rural voters, and rural areas have larger seats due to their lower population density. In fact the Melbourne and Sydney areas are so small that they cannot be adequately represented without using insets, yet each contains more seats than SA, NT and WA combined. Alternatively, the entire state of Victoria (37 seats) can fit comfortably inside the seat of Kennedy. This means a single independent (to be specific, Bob Katter Jr.) represents more land than 23 ALP and 14 Coalition MPs combined.

While the second map may suggest how voting might fall in any given seat, of particular interest are seats like Eden-Monaro, famously a “bellwether seat”. This means it has voted consistently for the party that has won consecutive elections – since 1972 in the case of Eden-Monaro – and thus is a passable representation of Australia as a whole. Robertson has also been a bellwether since 1983 and both Lindsay and Makin have been bellwethers since they were founded in 1984. These last two may give the most accurate representations of the nation as a whole, since they have no pre-bellwether data to skew their results.

All four bellwether seats contain marginally more red than blue and thus represent a slight lean to the ALP. Of course it would be foolhardy to expect past voting based on past political promises to predict this year's election with 100% accuracy, so no such ALP lean can be declared as certain. It is also difficult to predict how many seats may vote for an independent based on this map, since previously voting for a right-wing independent is unlikely to suggest a favourable outcome for a left-wing independent, and vice versa.

What this map does represent that may be of some use in our predictions are the safe seats like my own. If this election's two-party preferred polling ends up anywhere near as close as 2010, it will pay to know who has the greatest number of steadfast seats in their back pocket.

* "Changing hands" is defined here as changing colour, thus one member from a given party replacing a retiring member from the same party is not a change of hands. Nor is a swing from one party to another in coalition with it. Transitions in a party over time – e.g. the transition from the United Australia Party to the Liberal Party – are therefore not counted as change of hands either. Prior to 1910 the Protectionist Party and Anti-Socialists had no recognised allegiance, but both being predecessors of the Liberal Party, seats exchanged between these parties are not considered to have changed hands. In theory, transitions within the grey seats are the exception, since a change from one independent to another does not necessarily imply a continuity of voter opinion and would count as a change. In practice this has not occurred in Australian federal elections. A seat is still considered to have changed hands if the same person is re-elected under a different colour (e.g. Percy Stewart, 1925).
** This means that when the house grew from 74 to 121 seats in 1949, for example, the emerging seats are excluded from both the numerator and denominator when determining the fraction of seats that changed. Basically this is to ensure the creation of additional seats does not interfere with the calculation.
*** This lightens the impact of each layer to ensure more layers are detectable to the naked eye. I will discuss how this number was reached, and the nature of a maximum depth next week.
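
The second footnote's rule for newly created seats can be sketched in Python as follows. The seat names and colours are invented, the function name retained_fraction is made up, and the sketch assumes that only seats appearing in both elections are counted (how abolished seats should be handled is not spelled out above, so skipping them is an assumption).

```python
def retained_fraction(previous, following):
    """Fraction of pre-existing seats that kept the same colour.

    previous, following: dicts mapping seat name -> colour for two
    consecutive elections. Seats that appear only in `following`
    (newly created) are left out of numerator and denominator alike.
    """
    shared = [seat for seat in previous if seat in following]
    if not shared:
        raise ValueError("no seats in common between the two elections")
    retained = sum(1 for seat in shared if previous[seat] == following[seat])
    return retained / len(shared)


# Hypothetical example: "Seat D" is created between the two elections.
before = {"Seat A": "red", "Seat B": "blue", "Seat C": "red"}
after = {"Seat A": "red", "Seat B": "red", "Seat C": "red", "Seat D": "blue"}

fraction = retained_fraction(before, after)  # 2 of 3 pre-existing seats held
```

Seat D has no earlier colour to compare against, so it neither helps nor hurts the retention figure.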


Seat incumbency data from www.aec.gov.au/.

Opacity is based on exchange of seat data from each federal election's specific page on http://en.wikipedia.org/wiki/.

2 comments:

  1. Do you have some vector graphic version of the map? It's a little hard to get a grasp on some of the smaller electorates and I should like to zoom in.

    Replies
    1. Sadly not at this time. The maximum resolution I can currently offer is that you get if you click on the image. These maps were derived from outlines provided in a raster from the AEC and my technical ability in resolving this issue is quite limited.

      I may have the patience to produce a primitive vector graphic by manually tracing the outlines in a .ppt in the near future, assuming I can find a way to upload it here.

      Alternatively I'm toying with the idea of supplementing the maps with 10x15 grids coloured to represent the electoral divisions. This would represent each seat as identical in size and thus eliminate the perception of rural seats holding more sway than urban ones. While this would not resolve the scaling issue it would display the data without need to zoom in at all (and indeed enlargement would be meaningless since there is no detail to see in the grid.)

      My main concern with this is with my posts becoming too graphic-heavy.
