Friday, 12 April 2013

Donkey Votes

Having argued my case against compulsory voting last week in the abstract, I figured I should try to back my statements up. They say there are lies, damn lies and statistics; I don't do things by thirds so I'm jumping straight into the stats.

There are several stats I could look at here: the AEC formally records the voter turnout and informal votes seat by seat each election. In the last federal election 955,202 registered voters did not vote in person or by other means, and 729,304 of those who did cast invalid votes. Some of these are undoubtedly accidental, but we could look at these stats over several elections and compare them against approval ratings or other figures that could indicate general voter engagement.

However, people failing to present at a voting booth or not casting a valid vote are not my real concern. I am more interested in people who don't really have a political opinion yet vote because they are dragged along to a polling station every few years, and end up distorting the results away from the intentions of those who actually have opinions on the subject.

The problem is isolating these valid, but convictionless votes from the bulk of valid votes backed up by conviction. Last week I briefly touched on Donkey votes as one specific instance of the many plagues ravaging the compulsory voting system.

A "Donkey Vote" is probably the most simple means of filling in your ballot, normally filling the boxes 1, 2, 3, 4... from top to bottom. Because this is a valid vote it is difficult to determine how prevalent this is...

Also, because of the redistribution system, a small change in a close run fourth-place contest could drastically affect the flow of preferences and determine the outcome.
- Infographinomicon Saturday 6th of April 2013
In some circles filling in a ballot from the bottom up, or even at random, is considered donkey voting, though we will restrict ourselves to the old 1, 2, 3, 4's.

As you will see, there will remain a significant level of ambiguity as to the level of donkey voting by the end of this post, but we will try to give a very rough indication of the prevalence of this practice. Researchers at ANU, with more time and resources than I, suggest as many as one vote in every seventy is a donkey vote. This research was published in 2006 and found no appreciable reduction in the practice since the lower house reached its current approximate size in 1984. It would seem naive to expect a rapid drop in donkey voters in the two federal elections that followed. That would mean somewhere in the order of 177,000 valid donkey votes were included in 2010. Interestingly, their research suggests voters are significantly less likely to donkey vote if this means giving a 1 to a female candidate - another argument, I feel, to do what can be done to reduce the influence of these ballots. I have not been able to determine how the ANU team reached these figures, but they will provide an interesting contrast to the figures I will get in this post.

Now, just to be clear, I am not advocating that we should "correct" the election results if we could determine the number of donkey votes statistically. This would be a dangerous step down a slippery slope. If we allow statistical calculations to adjust the results of elections, we could see a time where a small, representative sample of ballots is counted and the results are extrapolated from there. I proposed my solution to the problem last week in the form of a check box that gives legitimacy to the idea of casting an invalid vote by choice. Apart from saving scrutineers considerable time in detecting at least some of the deliberately invalid votes, this will be an easier way to produce a recognised official (if invalid) vote even quicker than producing a donkey.

In addition to backing up my arguments against compulsory voting, an accurate count of donkey votes could in theory be a useful predictive tool in close-run seats.

Just to add a little more difficulty into the mathematics I, obviously, do not have access to the ballots themselves, only the AEC published summaries. This means I don't know how many ballots were marked 1, 2, 3, 4 in 2010, or any other state or federal election, let alone how many were intentional votes and how many were donkeys.

This wasn't going to be the normal, empirical method of determining which calculations would work best and implementing them; this was a search for an admittedly flawed but simple approximation that would give a rough means of calculating the influence of donkey votes. It was going to be ball-park at best, refined from less-than-ideal data through trial and error. I was looking for a reasonably consistent figure, apparently around the 1/70 mark (1.4%). [Editor: Yes, we have already decided what we want to find before we start looking. This is not scientific, just "fun" with statistics. We will do some post-discovery analysis after a phenomenon is found to test if it does resemble a Donkey Vote.] The best way to find out the true extent of donkey voting would presumably be to conduct a survey, which would be well within the means of the AEC if it were so inclined.

Some of you may just enjoy this retelling of my mathematical Odyssey. For the rest of you, I apologise. You may tolerate this a little longer if you imagine we are actually trying to count the multiple, result-skewing ballots placed by Eeyore in a malicious attempt to rig the Australian elections.

Okay, that was a terrible donkey-voting pun, but I totally nailed it.

How Hard Can It Be?

So, obviously the first thing to try is to determine exactly how much of an advantage being first on the ballot is. Below, I have compared the support for the first party on each ballot to their nation-wide support.

       Please note the distinction between percentage and percentage points
For example, in the Seat of Canberra the Liberal party heads the ticket and received 41,732 primary votes. This is over 37% of Canberra's primary vote, while nationally the Liberal party received just over 30% of the primary vote. In other words the Liberal party performed 6.75 percentage points better in Canberra than across the nation as a whole.
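
To make the percentage versus percentage point distinction concrete, here is a small sketch in Python. The two share values are assumptions chosen to be consistent with the figures quoted above (over 37% in Canberra, just over 30% nationally, a 6.75 pp gap); they are illustrative rather than copied from the AEC tables.

```python
# Illustrative shares (assumed values consistent with the text, not AEC data).
canberra_share = 37.21   # Liberal primary vote in Canberra, percent
national_share = 30.46   # Liberal primary vote nationally, percent


def percentage_point_gap(local, national):
    """Absolute difference between two percentages, in percentage points."""
    return local - national


def relative_gap(local, national):
    """The same difference expressed as a percentage of the national figure."""
    return (local - national) / national * 100


pp = percentage_point_gap(canberra_share, national_share)
rel = relative_gap(canberra_share, national_share)
print(f"{pp:.2f} percentage points")
print(f"{rel:.1f} percent higher than the national share")
```

Under these assumed inputs the gap is 6.75 percentage points, but roughly 22 percent in relative terms, which is why the two units should not be mixed.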

So we could say that the Liberal party gained a 6.75 percentage point advantage in Canberra (although this is probably not the result of being first on the ticket, as we shall see). More relevantly, it would seem that all first-place candidates gained an average 3.05 pp advantage, if we ignore the independents and non-affiliated candidates from Cook (NSW), Lyne (NSW), Robertson (NSW), Leichhardt (QLD), Denison (Tas), Indi (Vic) and Wannon (Vic).

The first thing to note, of course, is that this boost is very unreliable. More than a few candidates performed worse than the national average despite being first on the ticket - some significantly worse.

Further, if you care to look at the data dump post below this one, you will notice that the candidates who do not rank first on the list get advantages of a similar range. In fact, the average candidate performs 4.06 pp better than their party's primary vote, so by this measure the candidates leading the ballots are not advantaged at all. The 6.75 pp 'advantage' for the Liberal Party in Canberra is more-or-less matched by the 6.8 pp 'advantage' for the Greens and the 6.24 pp 'advantage' for the ALP.

There are several reasons for these results. Firstly, only the ALP and Greens contested every seat in 2010. This means every other party is faced with seats where 0% of the primary vote was received, which obviously undercuts the national primary vote. To return to the Canberra example, there are only three candidates:
  1. Giulia JONES (Liberal)
  2. Sue ELLERMAN (Greens)
With fewer candidates, more of the primary vote falls to each. In seats with up to 11 candidates, the primary vote is further diluted. Further, different parties contesting different seats vary the primary vote support. If there were a National candidate added to the Canberra list it would carve pretty heavily into the right-wing Liberal Party support base but have a smaller impact on the ALP and Greens. If the fourth candidate was more left-wing - for example someone from the Sex Party - then the Greens would lose out more than the Libs.

Basically, we need to compare apples with apples, even though we've been served a particularly well tossed fruit salad. The best bet seems to be the seats containing only three candidates, specifically the ALP and Greens (who contested all 150 seats) and the Liberal Party. There are six such seats: Canberra (ACT), Barton (NSW), Bradfield (NSW), Mackellar (NSW), Werriwa (NSW) and Braddon (Tas). This sample is not really large enough to provide reliable data, but it is the best available for our current methods.

The Barton ticket is headed by the ALP, the Liberal Party leads the Canberra and Braddon ballots and the rest are topped by the Greens candidate.

Looking at the columns that represent each party's primary vote as a percentage, thus eliminating the influence of district population size, we can compare the average primary vote for each party in these three-way contests with their 'lead average' - that is, their average primary vote from only the ballots they lead. The ALP appeared to have a 10 pp advantage when it led the ballots, and the Greens are not far behind with 8.01 pp. The Liberal party, however, fell 8.81 percentage points. At best these results are inconclusive, which is hardly surprising considering the constrained sample size.
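
For anyone wanting to repeat the 'lead average' comparison, the calculation can be sketched roughly like this. The seat names and vote shares below are entirely made up for illustration, and `lead_advantage` is a hypothetical helper, not something taken from my spreadsheet.

```python
from statistics import mean

# Hypothetical three-candidate seats: (seat, party heading the ballot,
# primary vote share in percent for each party). All values are invented.
seats = [
    ("Seat A", "ALP",     {"ALP": 52.0, "Liberal": 36.0, "Greens": 12.0}),
    ("Seat B", "Liberal", {"ALP": 40.0, "Liberal": 48.0, "Greens": 12.0}),
    ("Seat C", "Greens",  {"ALP": 36.0, "Liberal": 46.0, "Greens": 18.0}),
    ("Seat D", "Greens",  {"ALP": 34.0, "Liberal": 50.0, "Greens": 16.0}),
]


def lead_advantage(seats, party):
    """Lead average minus overall average for one party, in percentage points.

    Returns None if the party never heads a ballot in the sample.
    """
    all_shares = [shares[party] for _, _, shares in seats]
    lead_shares = [shares[party] for _, leader, shares in seats if leader == party]
    if not lead_shares:
        return None
    return mean(lead_shares) - mean(all_shares)


for party in ("ALP", "Liberal", "Greens"):
    print(party, lead_advantage(seats, party))
```

With only one or two led ballots per party, a single unusual seat dominates the result, which is the small-sample problem in miniature.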

A Swing and A Prayer:

It is a long shot, but it might just pay off. That was my thinking as I started plugging in an entirely new set of raw numbers to see if a regular, reasonably consistent 'advantage' could be determined by looking at changes in the electoral pendulum instead of primary votes.

By looking purely at the swing within a seat I was hoping to eliminate many of the problems with the primary vote. Firstly, we would be comparing a seat's data against itself, not against national averages. This would eliminate problems of smaller parties standing in fewer seats getting a lower national primary vote purely as a result of limited reach. Secondly, with many parties from 2010 having stood the same candidates back in 2007, the influence of the parties on each other in a given seat should be consistent, and thus minimal in the results. Seats with large numbers of candidates in '07 were unlikely to drop to a small field in 2010 which meant the division of votes among parties should be reasonably constant.

At first glance this looks like a passable approximation of a donkey-vote. The average swing towards a party is increased (or the swing away is reduced) when that party leads a ballot. It makes sense for habitual donkey-voters to show up in the swing, since the leader of most ballots will normally change from election to election. The average improvement (= Average Swing (Lead) - Average Swing) in percentage points when leading the ballot is 2.34, which is significantly greater than the 1.4 proposed in the ANU paper. None of this confirms that the apparent trend favouring the ballot leaders is a result of donkey-voting, of course, but it is encouraging.

As we saw before, there are a lot of other factors that can give a candidate an apparent advantage. The real test is to see if this advantage is universal, or if it is only really present for the top name on the ballot. [Editor: Here is the post-discovery analysis as promised.] If this effect were unique to leaders of the ballots then there is an implied correlation between being placed randomly at the top of the list and receiving a larger swing. Now, correlation does not mean causation, but since there is a known mechanism (donkey-votes) that could plausibly explain the link, that would be evidence enough to me, and post hoc ergo propter hoc be darned.

I repeated the process, selecting the candidates that came last on each ballot instead and found that the average last-place swing (2.42) was quite close to the overall average swing (2.78). More experimentation is needed, but as shown above the yellow bars (indicating the average swing for each party when it leads the ballots) are regularly above the average of all ballots and the average of ballots where a party comes last - at some points excessively so.
EDIT: I have since calculated the average swing towards candidates second on the ballot at 1.59 pp (see data dump). This is below the overall average (2.78 pp) and last-place average (2.42), but not by much when compared with the advantage to first place swings averaging 5.11 pp.
It would be remarkable if this variation were solely the result of donkey voting; however, at the moment it appears that there is a boost in the realm of 2.3 percentage points for parties that lead the ballot.
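
As a quick check on the arithmetic, the comparison between ballot positions can be sketched as below. The four averages are the rounded figures quoted in this post; the function name is just an illustrative choice. Because the inputs are rounded, the first-place gap can differ from the 2.34 pp quoted earlier in the last digit.

```python
# Average swings in percentage points, as quoted (rounded) in this post.
average_swing = {
    "overall": 2.78,
    "first": 5.11,
    "second": 1.59,
    "last": 2.42,
}


def excess_over_overall(swings):
    """Each ballot position's average swing minus the overall average swing."""
    baseline = swings["overall"]
    return {
        position: round(value - baseline, 2)
        for position, value in swings.items()
        if position != "overall"
    }


excess = excess_over_overall(average_swing)
for position, gap in excess.items():
    print(f"{position:>6}: {gap:+.2f} pp")
```

Only the first position sits above the overall average; second and last both sit a little below it.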


This topic is far from exhausted. However, the next model is a little more complicated and I will be away over the next week, so the second half of this article will be your automatic upload next Friday. In the meantime, I'd like to provide a quote from one of the researchers on the ANU paper, Ms. Amy King:
"On average, 15 of the 150 federal electorates are won by margin of less than 1.4 percent.
"In one out of ten seats ... it could have changed the result."
If these numbers are accurate then under compulsory voting, I think it is a pretty safe bet that more than one seat has been won as a result of donkey voting. While the ANU's 1.4% donkey-vote rate is some way below our 2.3, it is important to stress that even a minor change (in the order of fractions of a percent) can influence the order that candidates drop out and thus who makes the two-party preferred match-up.
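
To illustrate how a handful of ballots can change the order in which candidates drop out, here is a toy preferential count. It is a simplified sketch, not the AEC's actual counting procedure: every ballot is assumed to rank all candidates, ties for last place are broken alphabetically, and the candidate names and vote counts are invented.

```python
from collections import Counter


def instant_runoff(ballots):
    """Count full-preference ballots by repeatedly excluding the lowest candidate.

    Each ballot is a tuple of candidate names in preference order.
    Returns (winner, exclusion_order). Ties for last place are broken
    alphabetically, purely to keep this sketch deterministic.
    """
    remaining = {name for ballot in ballots for name in ballot}
    excluded = []
    while True:
        # Each ballot counts for its highest-ranked remaining candidate.
        tally = Counter(
            next(name for name in ballot if name in remaining)
            for ballot in ballots
        )
        for name in remaining:
            tally.setdefault(name, 0)
        leader, votes = max(tally.items(), key=lambda item: item[1])
        if votes * 2 > len(ballots) or len(remaining) == 1:
            return leader, excluded
        loser = min(sorted(remaining), key=lambda name: tally[name])
        remaining.discard(loser)
        excluded.append(loser)


# Hypothetical seat; the ballot paper lists Ash, Blake, Casey, Drew in that order.
base = (
    [("Drew", "Ash", "Casey", "Blake")] * 420
    + [("Blake", "Casey", "Ash", "Drew")] * 250
    + [("Casey", "Blake", "Ash", "Drew")] * 170
    + [("Ash", "Casey", "Drew", "Blake")] * 160
)
# Fourteen donkey votes, about 1.4% of the 1,014 ballots.
donkey = [("Ash", "Blake", "Casey", "Drew")] * 14

without_donkeys = instant_runoff(base)
with_donkeys = instant_runoff(base + donkey)
print(without_donkeys)  # ('Casey', ['Ash', 'Blake'])
print(with_donkeys)     # ('Drew', ['Casey', 'Ash'])
```

In this made-up seat the donkey votes lift Ash just above Casey in the first count, so Casey is excluded instead of Ash and the win passes from Casey to Drew, even though every donkey ballot ranks Drew last.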
