Prelude:
I have a going away party tomorrow night, which is good news for me because there's a party and bad news because a friend is going away. It also means you get this week's post early, which is good news for you because you only had to wait six days since the last post, and bad news because you'll have to wait eight days for the next one.
So, if you missed last week's post, I created a colourful map with a stupid name. I did some other stuff too, but I'm going to spend this post discussing that map. Those of you interested in the process behind creating both that map and those below might want to read that first. Normal people might just want to look at all of the pretty colours. You know who you are...
Also, the nation-wide maps are pretty bulky. You can click on them to open them if you want to identify specific seats, but this could chew up your bandwidth and/or data quota pretty horrendously over the next few months. Sorry about that, but it is necessary in order to represent all 150 seats geographically. Some seats are difficult to see. The best views are afforded by right-clicking on the image and choosing to open it in a new tab or window. This is still not great for tiny seats, but there are only three solutions that I can think of.
The first is to make the maps bigger. This is inconvenient for me, harder to view on screen for you and more demanding for your bandwidth. The second, which was proposed to me, is to use vector graphics. This would be fantastic if I were competent enough to make them and this blog were designed to host them. The third, which I am using, is to provide supplementary posts (below) with all the data provided in a concise, readable form that abandons geographic distribution for maps and provides cold hard numbers for statistics.
And now, to business:
Trends versus Averages:
So the point of my previous map was to show the past voting history of a seat. Perhaps the simplest way I could have done that would be to average the results of past elections thus:
However, as always, there are several problems with this. Firstly, some seats have much longer histories than others. Durack (WA) and Wentworth (NSW) both display pure blue, since both have been won by Coalition parties (or their predecessors) in every election. The difference is that Wentworth was proclaimed in 1900 and has supported the Coalition in every election since federation, while Durack was proclaimed in 2008 and has contested only one election. Wentworth's trend is pretty stable and allows me to boldly declare that, short of retirement or health concerns, Wentworth will be retained by Shadow Minister for Communications and Broadband Malcolm Turnbull. Predicting for Durack, on the other hand, carries all the usual dangers of extrapolation from minimal data points (e.g. illegal polygamous relationships).
To illustrate another problem with this approach, consider the fictional seat of Green, an outer-metropolitan seat that first ran in the 1984 election. It is named after Antony Green and its main industry is pebble counting. Below are four representations of Green from parallel universes:
In the top left universe, Green was consistently an ALP seat until the turn of the millennium, then voted Liberal ever after. Since our predictions at this point are purely concerned with trends, this is probably a moderately Liberal seat short of the Coalition imploding.
In the top right, however, Green voted for the Liberals until the 1996 election, then switched to become a typical Labor seat.
The bottom left version of Green is more volatile, possibly influenced by the constantly changing policies on both sides that impact on the high-risk, high-reward pebble counting industry. It has voted for the Liberal party four times and Labor six times, but neither has held the seat for more than two consecutive elections. This seat is marginal and considered a Tossup.
The bottom right version of Green is slightly less volatile. It is also a bellwether seat, and so presumably contains a demographic that roughly approximates the nation's varying seats in equal proportions. This is also a Tossup but will probably follow the general trend in polling.
The two tossups, the safe Liberal seat and the safe ALP seat at first glance look identical, but how many of you can see the subtle difference?
Don't worry if you can't because, of course, there isn't one. The point I am making in my traditionally long-winded way is that this approach only considers averages, not trends. It gives the opinions of 1910 equal footing with those in 2010, even though there are far more voters from 2010 than 1910 expected to vote this year. (If this turns out not to be the case you can expect a very interesting blog post in September and/or Edwardian-era zombies.)
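For anyone who would rather see the problem in numbers than in colours, here is a minimal Python sketch of the four Greens. The top two sequences follow the descriptions above; the exact order of wins in the bottom left universe is invented (it only has to give four Liberal and six Labor wins with no run longer than two), and the bottom right assumes the bellwether simply backed whoever formed government nationally. All names are illustrative.

```python
# Elections contested by the fictional seat of Green.
YEARS = [1984, 1987, 1990, 1993, 1996, 1998, 2001, 2004, 2007, 2010]

# One letter per election in YEARS: "A" = ALP win, "L" = Liberal win.
UNIVERSES = {
    "top left": "AAAAAALLLL",      # ALP until the millennium, Liberal after
    "top right": "LLLLAAAAAA",     # Liberal until 1996, Labor from then on
    "bottom left": "ALAALAALAL",   # assumed order: volatile, no run over two
    "bottom right": "AAAALLLLAA",  # assumed: follows the national government
}


def liberal_share(results):
    """Fraction of elections won by the Liberals, ignoring when they were won."""
    return results.count("L") / len(results)


for name, results in UNIVERSES.items():
    print(name, liberal_share(results))  # every universe prints 0.4
```

Four different histories, one identical average, which is exactly why the averaged map cannot tell them apart.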
One possible way of mapping trends, as opposed to pure averages, would be to display each seat as it currently stands, but with different intensities of colour for the length that a party has held a seat; strong red or blue could represent seats that have consistently voted ALP or Coalition respectively since 1901, while paler seats have shorter runs, with seats that changed hands in 2010 almost white.
This map has two major drawbacks. Firstly, and perhaps most obviously, only seats dating back to the early 1900s and that have voted consistently since then show up in any real intensity. A great many consistently Labor seats appear marginal because they were Coalition during Howard's 1996 landslide and many reliably Coalition seats appear marginal because of Rudd's 2007 landslide. Both years saw unusually large surges for one side or the other in the public vote and the 2010 election may have since rendered many of these seats safe by most conventional measures. Instead, these seats are lost to a faint haze of red or blue at best.
Secondly, consider a seat that has voted consistently for one party since 1901, except once in 2001. Compare that to a seat that has voted consistently for the same party since 1990 (voting for another party in the preceding decades). Based on trends both are safe for their current party, but the first is probably the safer of the two. Despite this, the latter appears the more intensely coloured because its history is uninterrupted for longer. In more extreme cases a 2010 outlier could make a very strong seat for one incumbent look like a very marginal seat for another. For example, the seat of Lyne looks marginal because Independent MP Rob Oakeshott has only held it since a 2008 by-election, and thus gets the absolute minimum colouring of one federal election (2010). Prior to Oakeshott the seat was consistently Coalition since its proclamation in 1949, and thus should be called a safe blue in the event of Oakeshott declining to run (or possibly even if he does run, since his siding with the ALP in the hung parliament may have lost him considerable right-wing support, although his recent, outspoken, high-profile opposition to mining in his seat may have won back many of his supporters).
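The second drawback can be sketched in a few lines of Python. This assumes colour intensity simply tracks the number of consecutive elections the current holder has won; the two seats and the party labels "X" and "Y" are hypothetical stand-ins for the pair described above, while the list of years is the real run of federal general elections from 1901 to 2010.

```python
ELECTION_YEARS = [
    1901, 1903, 1906, 1910, 1913, 1914, 1917, 1919, 1922, 1925, 1928,
    1929, 1931, 1934, 1937, 1940, 1943, 1946, 1949, 1951, 1954, 1955,
    1958, 1961, 1963, 1966, 1969, 1972, 1974, 1975, 1977, 1980, 1983,
    1984, 1987, 1990, 1993, 1996, 1998, 2001, 2004, 2007, 2010,
]


def current_streak(results):
    """Consecutive wins by the current holder, counting back from the latest election."""
    holder = results[-1]
    streak = 0
    for winner in reversed(results):
        if winner != holder:
            break
        streak += 1
    return streak


# Hypothetical seat 1: party X at every election except a single loss in 2001.
first_seat = ["Y" if year == 2001 else "X" for year in ELECTION_YEARS]
# Hypothetical seat 2: party Y for decades, party X from 1990 onwards.
second_seat = ["X" if year >= 1990 else "Y" for year in ELECTION_YEARS]

print(current_streak(first_seat), first_seat.count("X"))    # 3 42
print(current_streak(second_seat), second_seat.count("X"))  # 8 8
```

The first seat has 42 wins out of 43 for its party but a streak of only three, so under this scheme it comes out paler than a seat with eight wins in total.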
Clearly to examine trends geographically we need a map that includes all of the data since 1901 (unlike the second map here) and yet mathematically favours recent trends over old data (unlike map 1).
This is where the map with a silly name comes in...
Variable-Dependent Transparency Arrays:
This map, as I have noted more than once, displays all incumbents as semi-transparent layers. The opacity of each layer is proportional to the number of seats that changed hands at the following election as a percentage of all seats (not counting those introduced in the following election). Or
O = (c/t*100%) / 8.69
where O is opacity, c is the number of seats that change hands at the next election not including new seats and t is the total number of seats at the next election not including new seats.
If each layer had around 10% opacity, the top layer would contribute 10% of the colour, the second layer 9% and the third 8.1%. This accounts for 27.1% of the colour in the top three layers, giving data from 2001, 2004 and 2007 over a quarter of the total influence. (2010 data cannot be used here until we know the c-value of the 2013 election.) In this way new trends replace old ones without introducing arbitrary cut-off dates into the data.
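The 10%, 9% and 8.1% figures fall out of ordinary layer stacking, where each layer can only colour whatever the layers above it have left uncovered. A minimal Python sketch, assuming the image software composites layers in this standard way (the function name is made up for illustration):

```python
def layer_contributions(opacities):
    """Share of the final colour supplied by each layer, topmost layer first."""
    contributions = []
    uncovered = 1.0  # fraction of the picture not yet claimed by layers above
    for opacity in opacities:
        contributions.append(uncovered * opacity)
        uncovered *= 1.0 - opacity
    return contributions


top_three = layer_contributions([0.10, 0.10, 0.10])
print([round(100 * share, 1) for share in top_three])  # [10.0, 9.0, 8.1]
print(round(100 * sum(top_three), 1))                  # 27.1
```

The function takes a list rather than a single number because on the real map every layer has its own opacity; the uniform 10% is only the simplified case discussed here.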
Eventually, of course, layers far back in the array will contribute no visible influence on the map. My own personal experimentation suggests any influence less than 1.5% over an area of 1250 pixels will not be picked up by human eyesight (or at least by my eyesight, which is roughly the same thing). This figure jumps to around 5% with a 1-pixel border of black between the (feebly) contrasting areas. The varied scales used by the AEC maps which I have adapted make determining an average display size for a seat difficult, but 1250 pixels is roughly the area of the Division of Fadden on these maps at full display size. Fadden is very close to the median of district sizes and despite the differing scales of the insets appears visually to be about the median here too (though I have not confirmed this through measurement; even my patience has limits).
At 10% opacity the top seven layers each contribute over 5% of the total colour scheme, so this map can be said to have seven layers of depth. My image software, however, can deal with accuracies down to 0.1% opacity prior to being messed up by .jpg compression.
It turns out that 10% opacity gives close to the maximum possible depth for such an array. Anything lower than 10% and the lower colours lack the potency to assert their influence. At 8% opacity we are reduced to a six-layer deep image, and obviously below 5% even the top layer fails to contribute sufficient colour to make a distinct impression on its own.
Going the other way, higher opacity soon begins to block out the lower layers.
At around 12 to 13% the eighth layer contributes just over 4.9% of the colour. Allowing for the primitive nature of my experimentation this may possibly result in an eight-layer deep image, but after being saved as a .jpg these will be virtually indistinguishable from an array with 10% average opacity.
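These depth figures are easy to check. The Python sketch below assumes every layer shares the same opacity and treats the 5% bordered-area visibility threshold quoted earlier as a hard cut-off; `depth` and `nth_layer_share` are illustrative names, not anything the image software provides.

```python
def nth_layer_share(opacity, n):
    """Colour share of the n-th layer from the top when all layers share one opacity."""
    return opacity * (1.0 - opacity) ** (n - 1)


def depth(opacity, threshold=0.05):
    """Number of layers, counting from the top, that each supply at least `threshold`."""
    layers = 0
    while nth_layer_share(opacity, layers + 1) >= threshold:
        layers += 1
    return layers


print(depth(0.10))   # 7
print(depth(0.08))   # 6
print(depth(0.04))   # 0, even the top layer is below the threshold
print(round(100 * nth_layer_share(0.125, 8), 2))  # 4.91, just short of 5
```

An eighth layer peaks at an opacity of one in eight (12.5%) and still lands a whisker under 5%, which is why seven layers is the practical ceiling under this threshold.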
This is where the 8.69 comes from in the equation. The average percentage of seats changing hands in the top seven layers is around 13.1. This means c/t*100% will give an average value of 86.9% (~ two layers of depth). By dividing this by 8.69 the average opacity for the top seven layers is 10% and we achieve near-maximum penetration.
Larger seats seem to be more susceptible to influence by lower layers. The largest seats on this map had influences just visible from layer 14, twice the depth predicted for Fadden. Nothing was done to correct this apparent susceptibility of larger rural seats since it is a result of perception and the human eye. The raw values displayed by the map are mathematically accurate, which trumps our lying little eyeballs.
My Methods: the Least of a Thousand Evils?
The invisibility of data from before 1990 (layer seven) in medium-small seats and 1974 (layer 14) in larger seats should not be a cause for concern. If the influence of these elections is invisible and the equation used is reliable, it follows that this data has less than 5% influence on the predicted outcome. This is insignificant compared to the error inherent in using past election results to predict future ones in marginal seats. Modifying the equations to ensure these early elections have a visible impact would clearly over-represent their influence.
I added the caveat that this method was sound so long as the equation was reliable. Perhaps, for example,
O = (c/t*100%) / 1.47
yields more accurate predictions, suggesting that only the previous three elections have any real relevance to future predictions. (The average percentage of seats changing hands over the last three elections is 88.2 and 60% average opacity allows a three-layer deep display for 1250 pixels; 88.2/60 = 1.47).
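Both divisors quoted in this post come from the same recipe: take the average raw percentage across the layers you care about and divide it by the average opacity you want those layers to have. A minimal Python sketch using only the averages given in the text (the function names are invented for illustration):

```python
def scaling_divisor(mean_raw_percent, target_opacity_percent):
    """Divisor that scales the average raw percentage down to the target opacity."""
    return mean_raw_percent / target_opacity_percent


def scaled_opacity(raw_percent, divisor):
    """Opacity (in percent) of a single layer given its raw c/t*100% value."""
    return raw_percent / divisor


print(round(scaling_divisor(86.9, 10), 2))  # 8.69, seven layers averaging 10%
print(round(scaling_divisor(88.2, 60), 2))  # 1.47, three layers averaging 60%
```

Individual layers will still sit above or below the target, since the divisor only pins down the average.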
Perhaps seat stability needs to be measured over multiple elections, so c = average number of seats changing hands for the next two, three or more elections. The problem with this, of course, is that in order to obtain a c value incorporating the following three elections' results, our most recent layer would be 2001 and we would be basing our predictions on trends from the middle of the Howard era. Rudd's ALP landslide victory in 2007 would be based on trends from 1996, during Howard's landslide victory for the Coalition.
Alternatively, stability could be measured not based on seats won or lost the following term, but on the margin by which each seat is held. Marginal seats would contribute little colour to a seat, while safe margins of 10%+ would contribute significantly more. This, however, is a very long project, requiring me to dig up the pendula for each election and apply them individually, seat by seat.
That is not to say I won't do it, merely that I won't be doing it right now. It also assumes I can obtain the data. Wikipedia has data up to (but not including) 1925 and the AEC gives the necessary figures to calculate the margins after 2001, so I'm only missing about three quarters of Australia's voting history. If anyone knows where I can find the relevant pendula, feel free to comment below.
And on that note I will sign off. I did have the aim of discussing the 2010-2013 pendulum next post, but with the announcement of the Pope's resignation (the first Papal abdication in 600 years) I feel a desire to try my hand at the very different arena of conclave voting analysis. While I did previously state my focus would be on Australian and American politics, I have been known to dabble in other nations' electoral processes and even the UNSC vote last year. The vote for the papacy, however, will be completely new territory for me.
But then, who knows what I will actually end up discussing?