Posted in Uncategorized

Best laid plans …

Last Friday I gave another session of Statistics for Journalists, the scheme funded by the Department for Business, Innovation and Skills and coordinated by the Royal Statistical Society, this time in London City University.  It didn’t exactly run to plan.

We were expecting to have only an hour with a bunch of MA students to try to explain the basics of statistics.  In reality, by the time they were settled in, this would have been more like 45 minutes.  I ran this session with two other statisticians (London based).

We were most rudely interrupted by a fire alarm, which meant that we had no opportunity for a proper Q and A session and did not quite get through all of the planned material. That aside, the “Statistics for Journalists” course went quite well.  We were allowed to run slightly over, but still lost about 10 minutes due to the disruption, so had under 40 minutes with the students.

Disappointingly, we didn’t get to cover relative risk – but we did leave the slides with their lecturer so that hopefully the more interested students will use them for some targeted learning.

What would I have changed? Honestly, at this point, not much – although it’s the second time I’ve seen the “traffic camera” exercise being run and students have not really engaged with it. I think that I would need to rethink how to present this exercise if I lead it. It definitely needs more time than has been given to it in previous sessions.

On a more positive note, the feedback from students is good – they are now more aware of some of the common pitfalls that they will need to watch for in their professional careers. They also now know the difference between a percentage and a percentage point! They have also stated that they will be more sceptical about statistics in the future…

Posted in Topical Statistics

A window of opportunity?

The study by UCL researchers published this week http://onlinelibrary.wiley.com/doi/10.1111/ecoj.12181/full was interesting in its approach… looking at a 10 year window (2001 – 2011); specifically “Over the period from 2001 to 2011, European immigrants from the EU-15 countries contributed 64% more in taxes than they received in benefits. Immigrants from the Central and East European ‘accession’ countries (the ‘A10’) contributed 12% more than they received.”

A major issue with this timeframe is that we cannot consider the conditions for the A10 countries to be approximately constant in terms of ease-of-access to the UK jobs market. The scale of the difference in the numbers of immigrants from the A10 is noteworthy: in 2001 it was 20,735, in 2005 this was 228,030 and in 2011 it was 892,984. The window really matters. Furthermore, in some of the analysis within the paper they look at the window 1995 – 2011. This inconsistency in the timeframes chosen does not lead to full confidence in the results. Just how reliant are the findings on the windows chosen? The robustness measures used within the paper do nothing to check this. The period before the A10 countries joined the EU was one in which the UK performed well economically; since then the economic crisis has hit. While the research supports the conclusions that even in the downturn the EU migrants still out-contributed others, whether that contribution was sufficient to be a net positive depends on the window used.

Many longitudinal analyses are weak in this aspect. A good robustness test will check to see how much of a difference a slight change in the time window selected would have on the overall conclusions as it prevents the criticism of cherry-picking of timeframes to suit a pre-determined outcome.
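A minimal sketch of such a window check in Python follows. The yearly figures are made up purely for illustration and are not taken from the UCL paper; the names `taxes_paid`, `benefits_received`, `net_ratio` and `window_sensitivity` are hypothetical. The idea is simply to recompute the headline ratio for every window whose start or end year is shifted by up to a year and see how far the answer moves.

```python
from itertools import product

# Made-up yearly figures in arbitrary units (NOT data from the paper).
years = range(2000, 2013)
taxes_paid = dict(zip(years, [90, 95, 100, 104, 110, 118, 125, 130, 126, 120, 122, 125, 128]))
benefits_received = dict(zip(years, [80, 82, 85, 88, 92, 97, 103, 110, 121, 127, 126, 128, 131]))


def net_ratio(start, end):
    """Taxes paid divided by benefits received over the window start..end inclusive."""
    window = range(start, end + 1)
    paid = sum(taxes_paid[year] for year in window)
    received = sum(benefits_received[year] for year in window)
    return paid / received


def window_sensitivity(start, end, shift=1):
    """Recompute the ratio for every window whose ends move by up to `shift` years."""
    results = {}
    for start_offset, end_offset in product(range(-shift, shift + 1), repeat=2):
        new_start, new_end = start + start_offset, end + end_offset
        # Skip windows that fall outside the years we have figures for.
        if new_start in taxes_paid and new_end in taxes_paid and new_start < new_end:
            results[(new_start, new_end)] = net_ratio(new_start, new_end)
    return results


results = window_sensitivity(2001, 2011)
baseline = results[(2001, 2011)]
spread = max(results.values()) - min(results.values())
print(f"baseline ratio: {baseline:.3f}, spread across shifted windows: {spread:.3f}")
```

If the spread is small relative to the baseline, the conclusion is not very sensitive to the window chosen; a large spread is a warning sign.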

Posted in Uncategorized

RSS Statistical Ambassador Scheme

It’s been officially announced that I’m one of the Royal Statistical Society’s twelve statistical ambassadors

http://www.statslife.org.uk/members-area/member-news/news-from-errol-street/1876-rss-selects-twelve-statistical-ambassadors-for-training-programme

I couldn’t contain my enthusiasm previously, so did let people know before this – however, my natural tendency towards being nosey has been satisfied now that I see who else is on the list.

My training starts in November, so some blog posts about the process and my experiences are sure to follow.

Posted in Topical Statistics

Consequences

Having been landed with a rather large extra bill for the cost of E.U. membership of £1.7 billion [or about £27 per UK resident] due on the 1st December, the consequences of changes in statistical methodology are prominent in the news.

The UK’s Office for National Statistics submits figures used to calculate the Gross National Income to Eurostat. These figures were agreed; they upwardly revised the estimate of the GNI. This upwards revision in the GNI is directly associated with the upwards revision in the bill.  A guide to what general areas were under revision is here.

One interesting element that added to the upwards figure was the addition of illegal activities into the accounts. Included in this aspect was a calculation of £5.3 billion accounted for by prostitution. The breakdown of this figure is explained well by Jolyon here:  [note that this was published well before the current payment demand was made public and it also links to a commentary about how the Irish Central Statistics Office approached the same problem.]… basically, it boils down to estimates being based on a biased rather than representative sample and no-one really stopping to think about what the total figure would mean.

The surprise shown by politicians at this state of affairs makes obvious how little consideration was given to the consequences of changes in statistical methodology. What can seem like minor adjustments can have major consequences.

Posted in politics

Did the Tories kick themselves in the foot by opposing AV?

In a discussion with my parents and brother [three countries, two continents – the wonders of modern communications] over the weekend, we discussed how active Irish voters can be about expressing who they don’t want to be elected. My “home” constituency in Ireland was Dublin South West, which recently had a by-election. This is traditionally a left wing seat, but the transfer patterns were very interesting.

Ireland has a somewhat complicated voting system – Proportional Representation by Single Transferable Vote with multi-seat constituencies. However, in a by-election with only a single seat up for election, this becomes equivalent to the Alternative Vote system that was rejected by the UK electorate in 2011. The first count:

Party                      Candidate             Count 1
Sinn Féin                  Cathal King             7,288
Anti-Austerity Alliance    Paul Murphy             6,540
Independent                Ronan McMahon           2,142
Fine Gael                  Cáit Keane              2,110
Labour Party               Pamela Kearns           2,043
Fianna Fáil                John Lahart             2,077
Independent                Declan Burke              681
People Before Profit       Nicky Coules              530
Green Party                Francis Noel Duffy        447
Independent                Tony Rochford              92
Independent                Colm O’Keeffe              74

Under the First Past the Post system, Cathal King would have been elected with just over 30% of the votes cast [note that this was on a turnout of just under 35%]. Under the Irish system, the fun and games are just starting. O’Keeffe is immediately eliminated as he has the fewest votes. Tony Rochford is also eliminated as the sum of his votes and O’Keeffe’s votes [so if all of O’Keeffe’s second preferences transferred to him] is still less than Duffy’s first count total. However if 92+74 = 166 votes were added to Duffy’s total this would exceed Coules’ total, so Duffy cannot be eliminated on the first round. This proceeds for the next few rounds with the candidate with the fewest votes being eliminated each time.

The Irish political landscape is complex, with some artificial boundaries in place due to historic ties to civil war era politics. Fianna Fáil was traditionally seen as left of centre economically, but right of centre socially; whereas Fine Gael was traditionally seen as right of centre economically, but left of centre socially (in terms of being not as linked to the Catholic Church as Fianna Fáil were). Labour is left of centre both economically and socially; Sinn Féin is on the hard left; as are the Anti-Austerity Alliance and many of the independent candidates. At present, Fine Gael would be seen as the most right-wing of the political parties active in Irish politics, but in most other countries they would be considered a centrist party.

Why is the centre so cluttered in Irish politics? The transfer market… to win seats in Ireland you need to be able to convince people who would not give you their first preference to at least give you a preference vote further down their list.  This discourages extreme views as they are not transfer friendly.

In this case, McMahon (Independent) marketed himself as a pro-business candidate, so was considered moderately right-of-centre.

Party                      Candidate    Count 2    Count 3    Count 4    Count 5
Sinn Féin                  King           7,304      7,340      7,448      7,580
Anti-Austerity Alliance    Murphy         6,579      6,622      6,890      7,079
Independent                McMahon        2,167      2,227      2,265      2,464
Fine Gael                  Keane          2,117      2,194      2,203      2,267
Labour Party               Kearns         2,053      2,155      2,170      2,239
Fianna Fáil                Lahart         2,085      2,138      2,152      2,200
Independent                Burke            711        746        818
People Before Profit       Coules           540        554
Green Party                Duffy            453

Things then begin to get interesting in the context of Irish politics.

Party                      Candidate        Count 5    Count 6    Count 7    Count 8
Sinn Féin                  Cathal King        7,580      7,828      8,017      8,999
Anti-Austerity Alliance    Paul Murphy        7,079      7,436      7,726      9,565
Independent                Ronan McMahon      2,464      3,049      3,416
Fine Gael                  Cáit Keane         2,267      2,575      3,857
Labour Party               Pamela Kearns      2,239      2,492
Fianna Fáil                John Lahart        2,200

Traditionally, it would have been quite rare for Fianna Fáil votes to transfer to Fine Gael. Of Lahart’s 2,200 votes, 1,751 expressed further preferences. If we examine the pattern of transfers:

John Lahart’s (Fianna Fáil) transfers went in the following manner: Anti-Austerity Alliance 357; Sinn Féin 248; Independent 585; Fine Gael 308 and Labour 253. So the majority of Lahart’s transfers (893 of 1,751) went right-of-centre, followed by the party that nationally is perhaps in least direct competition for votes with Fianna Fáil.  National opinion polls indicate that currently Sinn Féin is second to Fine Gael in popularity. However, there is still a post-Troubles stigma attached to the Sinn Féin vote, so they aren’t viewed by many as “just another party”. Therefore, despite some policies being very similar, voters are actively shying away from transferring to Sinn Féin.

In the next round, Kearns (Labour) was eliminated. Just over half of her votes (1,282 of 2,492) were transferred to Keane (Fine Gael). As these two parties are in coalition government, this transfer pattern is not that surprising, although perhaps surprisingly this means that Labour voters were going towards the right rather than the left wing.

Between them Cáit Keane (Fine Gael) and Ronan McMahon (Independent – but right-of-centre) had a total of 7,273 possible votes to transfer; however, only 2,821 votes were actually transferred. This reflected that the remaining candidates were substantially to the left of the eliminated candidates, so those un-transferred votes expressed a preference of “none of the remaining candidates”. The 2,821 votes that did transfer split in the ratio of 65% to Murphy of the Anti-Austerity Alliance, 35% to Cathal King of Sinn Féin.
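The transfer figures quoted above can be recovered as differences between successive counts. The following Python sketch does this for counts 5 to 8, using the totals from the table; the helper name `transfers` is hypothetical.

```python
# Totals copied from the count 5 to count 8 table above.
count_5 = {"King": 7580, "Murphy": 7079, "McMahon": 2464,
           "Keane": 2267, "Kearns": 2239, "Lahart": 2200}
count_6 = {"King": 7828, "Murphy": 7436, "McMahon": 3049,
           "Keane": 2575, "Kearns": 2492}
count_7 = {"King": 8017, "Murphy": 7726, "McMahon": 3416, "Keane": 3857}
count_8 = {"King": 8999, "Murphy": 9565}


def transfers(before, after):
    """Votes gained between two counts by each candidate still in the race."""
    return {name: after[name] - before[name] for name in after}


# Count 6: Lahart's 2,200 votes are redistributed.
lahart_transfers = transfers(count_5, count_6)
lahart_transferred = sum(lahart_transfers.values())
lahart_non_transferable = count_5["Lahart"] - lahart_transferred

# Count 8: the votes held by McMahon and Keane are redistributed.
final_transfers = transfers(count_7, count_8)
final_transferred = sum(final_transfers.values())
final_available = count_7["McMahon"] + count_7["Keane"]
murphy_share = 100 * final_transfers["Murphy"] / final_transferred

print(lahart_transfers, lahart_transferred, lahart_non_transferable)
print(final_transfers, final_transferred, final_available, round(murphy_share))
```

The gap between the votes available and the votes transferred is the non-transferable pile: ballots with no further preference for anyone still in the race.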

Despite receiving more first preference ballots than any other candidate, King lost out in every other count to Murphy – a pattern indicating that anyone else was preferred, even if their political views could be considered even more incompatible.

So, how does this relate back to the Tories and AV? If AV were in place in the UK, the Tories would not be so concerned about the UKIP vote. UKIP candidates would likely need to get over 40% of the first preference votes to be elected. However, in many more constituencies, the Conservatives would be far more “transfer friendly” than UKIP – with people choosing the option of “anyone but…”

Unless the Irish electorate are far more sophisticated than others, the ability to vote against a candidate by expressing preferences for everyone else would have had some very interesting consequences in the political dynamics at the next UK general election. Instead we are left with the boring first past the post system.

Posted in education

Statistical literacy

I was at Highbury College Portsmouth today as part of the Royal Statistical Society’s campaign to increase scientific and statistical literacy amongst journalists.  Today was my first one of these, with the audience made up of journalism students; I joined Martin Blackwell as part of a double act on this.

Interestingly, I found that my major “take home message” was for them to have a healthy amount of scepticism and for them not to be afraid to contact those producing the research for further information / clarification.  Even if they are not specialist science / medicine journalists, they shouldn’t be afraid to question a scientist – if the scientist can’t explain their research to the journalist, then how could a journalist be reasonably expected to be able to communicate it to a much wider audience?

It is important for journalists to remember that scientists, like many others, often have an agenda – this may be to increase awareness and hence funding for their particular  research topic; so thinking about why one question has been looked at instead of another is important.  This is often something driven by the funding source.  If a funding source isn’t obvious and isn’t given on request, then the research is to be viewed even more critically.

When I do this again, solo, I think that I will probably run a more focused session – we ran to just over two hours on this; I would focus less on the maths and more on the “how to transfer your skills” element if I run this as a one hour session.  I may also include some bits on spurious correlations with examples from http://www.tylervigen.com/ and include the comic strip on jelly beans from xkcd to look at data fishing.

The sessions on statistics for non-specialist journalists are hosted by the Royal Statistical Society, in collaboration with the Science Media Centre, and grant-funded by the Department for Business, Innovation and Skills. Additional grant funding has been provided by Research Councils UK.

Posted in Uncategorized

Open but not readily usable data

As part of their final year projects, I get my students to source their own datasets.  I have several reasons for this, but the main one is that you don’t really appreciate how messy data can be until you try to put together a suitable dataset yourself…

Over the last few years it has become easier to source many different types of data, although the Office for National Statistics website search is still a mess.  However, every year I still find data embedded within publicly available documents but not in a very usable form.

One of my pet peeves is data being made available in the form of pdfs rather than in a more useful format that can be easily imported into statistical packages. A current example of this is data about the Ebola outbreak being made available in pdfs.  However, some people [such as @cmrivers] with better scripting ability than I have managed to turn it into something more useful and have popped it onto github.

Whatever about issues surrounding making data available in the first place, if data is to be made public, make it available in a useful format!

Posted in Topical Statistics

Tooth decay in young children

A striking feature of the coverage of the Public Health England report available here has been the lack of proper discussion about fluoridation of water.

Public Health England have themselves published the following water fluoridation health monitoring report (2014) that doesn’t seem to have been picked up on in relation to the dental health of three year olds in England.

Fluoridation of water is standard practice in Ireland, and has been considered to be amongst the top ten achievements in public health: find out more here

Hmmm.

Posted in Topical Statistics

Scottish Referendum results

I’ve been looking at the results in the Scottish independence referendum.  The results will be the basis of a final year project, so further results will follow.

[Figure: How each local authority area contributed to the overall NO majority]
[Figure: No majority (% of valid ballots counted) in each local authority area]

One possible trend that I wanted to look at was how the pattern of inhabitants [based on 2011 Census] of the local authority areas was related to these voting patterns…

[Figure: What percentage of people were born in England (based on 2011 Census).]
[Figure: A question in the 2011 Census asked about identity. This map looks at the proportions in each Local Authority who claimed “Scottish” as their only identity.]