Friday, 20 March 2015

Never tell me the odds!

Just a quickie. Being a bit of a stats geek, I was interested to see what the probabilities of finishing the Bingo Race looked like, and how we should expect tomorrow to pan out. As a quick primer, the plan is that we each have our own ball bag (giggle) with 30 numbered balls in. After each lap (about 2 miles), we randomly select a ball. If it matches one of the 3 numbers on our bib, it's ticked off (by which I mean a tick is physically placed over the number, not that the balls are somehow anthropomorphic and a bit miffed to be chosen). If not, we just run another lap and try again. Then we keep going until we have ticked off all three numbers. So we may be finished in 3 laps or it may be 30. Interesting stuff.

Now to be clear - I really don't care about how many laps I will have to run from a racing point of view. Tomorrow is going to be fun, and in all honesty I would be perfectly happy running for the full 100 km. I'm looking forward to a nice long run with some good friends, and I'll just run until I stop. Run Stupid (TM), and don't think about things as you go. However, it is quite an interesting question to answer - as you go along, what are your chances of the misery finally being over at the end of the current lap?

So being a stats geek, I thought I'd have a quick play. I won't go into the details, but in a nutshell I treated this as a ball-and-urn problem - there are 30 balls in total, 3 of which I want to pick (green) and 27 of which I don't (red). I performed a random "race", where each lap I calculated my probability of pulling out the final green ball this time (using a hypergeometric probability distribution), then randomly chose a ball (using a pseudo-random number generator) and updated the numbers for the next "lap". I repeated this whole process a million times and averaged over all of them to get a good model for any given set of ~~idiots~~ runners.
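For anyone who wants to play along at home, here is a minimal Python sketch of the same idea. It is not the code used for this post: instead of tracking the hypergeometric probability lap by lap, it simply shuffles the bag and notes the lap on which the last wanted ball comes out, which should give the same distribution if balls are not put back after being drawn (an assumption, though "3 laps or 30" suggests as much). The function names, the seed and the default of 100,000 races are all made up for illustration.

```python
import random
from collections import Counter


def simulate_finish_lap(n_balls=30, n_targets=3, rng=random):
    """Simulate one race and return the lap on which the last wanted ball is drawn.

    Assumes balls are drawn without replacement, one per lap.
    """
    balls = list(range(n_balls))
    rng.shuffle(balls)  # the order in which the balls leave the bag
    # Treat balls 0..n_targets-1 as the bib numbers; laps are 1-based.
    return max(balls.index(t) for t in range(n_targets)) + 1


def finish_distribution(n_races=100_000, n_balls=30, n_targets=3, seed=2015):
    """Estimate P(finish on lap k) for k = 1..n_balls from n_races simulated races."""
    rng = random.Random(seed)  # hypothetical fixed seed, just for repeatability
    counts = Counter(
        simulate_finish_lap(n_balls, n_targets, rng) for _ in range(n_races)
    )
    return [counts[lap] / n_races for lap in range(1, n_balls + 1)]


if __name__ == "__main__":
    dist = finish_distribution()
    cumulative = 0.0
    for lap, p in enumerate(dist, start=1):
        cumulative += p
        print(f"lap {lap:2d}: finish now {p:.4f}, finished by now {cumulative:.4f}")
```

Bump `n_races` up to a million if you have the patience; the shape of the curve settles down long before that.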

Simple. Got it? Good.

This figure shows the probability of completing your set of 3 numbers after the current lap. Obviously this is zero for the first couple of laps, and there is a vanishingly small chance of being done on the third. After each lap the odds improve, but really very few of us will be finished in fewer than 15 laps. In fact, if we look at this in a slightly different way and ask what percentage of finishers we should expect to see at each lap, we see that half of the runners will be running over 23 laps.
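If you would rather skip the simulation, the same numbers can be worked out directly. The sketch below assumes (as above) that drawn balls stay out of the bag, in which case being done by lap k just means all 3 of your numbers sit among the first k balls drawn, giving C(k, 3) / C(30, 3). The function names are hypothetical and this is a cross-check, not the method behind the figure.

```python
from math import comb


def prob_done_by(lap, n_balls=30, n_targets=3):
    """P(all target balls have been drawn within the first `lap` draws)."""
    return comb(lap, n_targets) / comb(n_balls, n_targets)


def prob_done_on(lap, n_balls=30, n_targets=3):
    """P(the last target ball is drawn on exactly this lap)."""
    # The final target sits at position `lap`; the others fill earlier slots.
    return comb(lap - 1, n_targets - 1) / comb(n_balls, n_targets)


if __name__ == "__main__":
    print(f"done on lap 3:  {prob_done_on(3):.6f}")   # 1 in 4060
    print(f"done by lap 14: {prob_done_by(14):.4f}")
    print(f"done by lap 23: {prob_done_by(23):.4f}")
    print(f"done by lap 24: {prob_done_by(24):.4f}")
```

Under that assumption the chance of a 3-lap finish is 1 in 4060, about 9% of runners are done in fewer than 15 laps, roughly 44% are done within 23 laps and just under half within 24, so the halfway point does indeed sit beyond lap 23.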

The odds don't look good for a quick finish I'm afraid, but honestly that's what I'm counting on! But to everybody else that's running and was hoping to be at the pub quickly - sorry guys! I'll be interested to see how tomorrow actually pans out, and how closely it correlates with these predictions. Obviously it doesn't account for people stopping for other reasons along the way, but I couldn't be arsed including a DNF coefficient in the model. 

Right. Let's play Bingo!


  1. You are far too clever, firstly to run such a torturous race format, and secondly to do some statistical analysis on it. I feel stupid. No wonder I failed my further maths A-level.

    1. As it turned out, my predictions were completely wrong anyway! So much for statistics. I like to think of myself as being pretty stupid in general, which tends to help with these kinds of races. If you don't think too much and worry, and just have fun, everything tends to work out. "Run stupid" is kind of my mantra.

