With the South Australian Labor Party
having launched their campaign for the March election I suppose it
must be time to get back to blogging. However, this early in the
run-up there is really not much in the news to talk about, so I get to
indulge my own little psephological projects. And that, of course,
means it is time for a...
Statistics Party!!!
For the federal House of
Representatives election last year I based my rather mediocre predictions on three tools – the Variable-Dependent Transparency
Arrays (VDTA) which mapped past voting trends, the Pendulum which summarised
the margin of each seat, and Seat Run-Downs for each state which
summarised the general historical lean of each seat. Over the next
few weeks, unless something more interesting or time-sensitive turns
up, I will be analysing each tool's accuracy and usefulness; this will
then inform my use of these tools (or lack thereof) in the state
election. This week we look at the VDTA:
Tool Summary:
Numbers were crunched, maps were
coloured and fun was had by all. The VDTA uses very subjective
calculations to broadly summarise the voting trends of recent years
by superimposing semi-transparent layers of election results so that
recent outcomes eventually blot out older ones. The transparency of
each layer depends on a variable – in this case the accuracy of
using that election to predict the next.
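As a rough illustration of the layering idea, here is a minimal Python sketch of stacking semi-transparent layers, oldest first. The `composite` function, the RGB tuples and the opacity values are all assumptions made up for this example; it does not reproduce the actual VDTA equation that turns the variable into a transparency.

```python
def composite(layers, background=(255, 255, 255)):
    """Blend (rgb, opacity) layers over a background, oldest layer first.

    opacity is 1 - transparency: 1.0 hides everything beneath the layer,
    0.0 leaves whatever is beneath it untouched.
    """
    channels = [float(c) for c in background]
    for rgb, opacity in layers:
        channels = [opacity * new + (1.0 - opacity) * old
                    for new, old in zip(rgb, channels)]
    return tuple(round(c) for c in channels)


# Hypothetical seat: Labor (red) in the older election, Coalition (blue) in
# the newer one. Both opacities are invented for illustration only.
seat_colour = composite([((255, 0, 0), 1.0), ((0, 0, 255), 0.6)])
print(seat_colour)
```

The more opaque the newer layer, the less of the older result shows through, which is the "blotting out" described above.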
Results Analysis:
These are the results from the map we used:
Data source. |
and these are the same results divided
into distinct predictions based on their hexadecimal colour code
(those with a higher red value are red, those with a higher blue are
blue and the white divisions remain the same):
Blue are Coalition, red Labor and white excluded. |
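The colour-code rule above can be sketched in a few lines of Python. The function name `classify_hex` and the sample colour codes are hypothetical, and the sketch assumes six-digit RRGGBB codes; any division where red and blue are equal (such as white) is treated as excluded.

```python
def classify_hex(colour):
    """Turn a six-digit hex colour (RRGGBB) into a called party."""
    code = colour.lstrip("#")
    red = int(code[0:2], 16)   # first two hex digits are the red channel
    blue = int(code[4:6], 16)  # last two hex digits are the blue channel
    if red > blue:
        return "Labor"
    if blue > red:
        return "Coalition"
    return "excluded"  # e.g. white, where red and blue are equal


# Made-up colour codes, for illustration only.
for sample in ("#FF9999", "#9999FF", "#FFFFFF"):
    print(sample, classify_hex(sample))
```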
This map correctly predicted 116 seats,
got 32 wrong and called 2 tossups.
Green are correct, red incorrect and black excluded. |
This is roughly 78% accurate for all
called (i.e. non-tossup) seats. Both tossups had insufficient data to
calculate a value for the VDTA. A state-by-state (and
territory-by-territory) breakdown of accuracy ratings is as follows:
ACT: 100% (2/2)
NSW: 79% (38/48)
NT: 100% (2/2)
QLD: 69% (20/29)
SA: 82% (9/11)
TAS: 80% (4/5)
VIC: 86% (32/37)
WA: 71% (10/14)
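For anyone wanting to check the arithmetic, the percentages above are just correct calls divided by called seats. This is a minimal sketch; the `accuracy` helper and the dictionary name are my own labels, and the tallies are copied from the figures above.

```python
def accuracy(correct, called):
    """Percentage of called (non-tossup) seats that were predicted correctly."""
    return 100.0 * correct / called


# (correct, called) for each state and territory, as listed above.
vdta_tallies = {
    "ACT": (2, 2), "NSW": (38, 48), "NT": (2, 2), "QLD": (20, 29),
    "SA": (9, 11), "TAS": (4, 5), "VIC": (32, 37), "WA": (10, 14),
}

for state, (correct, called) in vdta_tallies.items():
    print(f"{state}: {accuracy(correct, called):.0f}%")

# Headline figure: 116 correct out of the 116 + 32 seats that were called.
national = accuracy(116, 116 + 32)
print(f"National: {national:.0f}%")
```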
Superficially, we might expect an
accuracy percentage of high 70s to low 80s by applying the same VDTA
equation to the SA state election. Ignoring for a moment the likely
differences between the two elections, let's remember that this is the
first data point on the accuracy of this methodology. Let's use this
figure as a ballpark, but not rely on it too heavily until we have a
few more elections under our collective belt.
The obvious question, then, is whether
or not we are using the optimal equation.
I can confirm that we are almost
certainly not. I have no doubt that with a little tweak to the
denominator in the equation we can improve the accuracy a little. And
then, as I outlined in the methodology,
redefining the number of elections factored into the C value could
possibly improve the long-term predictive power of the method, at the
expense of accumulating more short-term outliers. Then, of course, we
could try changing the dependent variable (number of seats changing
hands) to make the transparency dependent on margins or swings, on a
seat-by-seat or national basis.
All of these could be fruitful avenues
of investigation once we have more results to work from, but it
would be premature to tinker around now. I am sure we could get some
startlingly accurate correlations between the VDTA and the actual
results, but I sincerely doubt these would form a good predictive
tool rather than an ad hoc and confected match-up with the previous
outcome.
However, the VDTA was proposed as an
alternative to simply averaging the history of the seats, and when we
do a comparison, simply averaging is more accurate. This implies, at
this early stage, that recent electoral data is not necessarily more
relevant than older data. Further consideration is required, but here
are the stats:
Green are correct, red incorrect and black excluded. Data. |
ACT: 100% (2/2)
NSW: 94% (45/48)
NT: 100% (2/2)
QLD: 77% (23/30)
SA: 73% (8/11)
TAS: 40% (2/5)
VIC: 92% (34/37)
WA: 80% (12/15)
NATIONAL: 85% (128/150)
The only states where averaging
performed worse than the VDTA were SA and Tasmania, which will be our
next elections covered. At this point it seems that the VDTA
introduces unnecessary noise, although it may yet prove more accurate
in the upcoming predictions. I think it may pay to use both and see
which works best in these two states and across other elections too.
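The state-by-state comparison can be checked with a short Python sketch. The dictionary and function names are hypothetical; the tallies are the (correct, called) pairs from the two breakdowns in this post.

```python
# (correct, called) per state for each method, copied from the lists above.
vdta = {
    "ACT": (2, 2), "NSW": (38, 48), "NT": (2, 2), "QLD": (20, 29),
    "SA": (9, 11), "TAS": (4, 5), "VIC": (32, 37), "WA": (10, 14),
}
averaging = {
    "ACT": (2, 2), "NSW": (45, 48), "NT": (2, 2), "QLD": (23, 30),
    "SA": (8, 11), "TAS": (2, 5), "VIC": (34, 37), "WA": (12, 15),
}


def rate(tally):
    """Fraction of called seats that were predicted correctly."""
    correct, called = tally
    return correct / called


# States where simple averaging did strictly worse than the VDTA.
averaging_worse = [s for s in vdta if rate(averaging[s]) < rate(vdta[s])]
print(averaging_worse)  # ['SA', 'TAS']

# National figure for averaging, summed over the state tallies.
avg_correct = sum(correct for correct, _ in averaging.values())
avg_called = sum(called for _, called in averaging.values())
print(f"{avg_correct}/{avg_called} = {100 * avg_correct / avg_called:.0f}%")
```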
Finally, there is the extreme case of a VDTA
with 0% transparency, which was the other simplistic map the VDTA was
intended to supersede. In practice this would just mean using the 2010
results as a blueprint for the 2013 predictions, possibly with
intensity factored in to represent length of incumbency as proposed
here.
The simplest way of testing this is
to look at what percentage of seats changed hands on the results
pendulum; the accuracy is 100% minus this value.
"Prediction" column reflects my 2013 overall prediction, not the prediction of one specific method. |
22 seats changed
hands, which is roughly 15% of the seats. Using the previous
election's results as the prediction for the next is therefore roughly 85%
accurate, the same as the seat averages and better than the VDTA.
This technique does better with independents and minor parties, who
may hold seats consecutively but rarely show up on the VDTA or seat
averages.
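The "100% minus seats changed" shortcut can be written out as a one-line calculation. This is a sketch only; the function name `baseline_accuracy` is my own label, and the inputs are the 22 seats changing hands out of 150 quoted above.

```python
def baseline_accuracy(seats_changed, total_seats):
    """Accuracy of predicting that every seat repeats its previous result."""
    return 100.0 * (1 - seats_changed / total_seats)


print(f"{baseline_accuracy(22, 150):.0f}%")  # 85%
```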
Conclusion:
While more data is required, initial
results suggest the VDTA is not an effective summary of past voting
trends for the purposes of extrapolation into the future.