Fear of Music: how many songs does the average list put in the top 250?
Given…
We collectively split 52,370 points between 3103 songs. Precisely one 40-pointer missed the top 250.
… how many top 250 singles can you expect to get in from your list?
The key here is to work backwards. We know that exactly four people voted for the number 250 song, for the number 249 song, and for every song up to number 163. Exactly five people voted for numbers 162 through 111, and so on.
Now, if we assign voters to each song at random, we might be able to come up with a distribution and an expectation.
Here are my assumptions [and, in square brackets, why they don't fully hold true]:
There were precisely 104 complete lists, and nobody gave up part-way through. [Real life: 99 complete and nine partial lists]
I don't care about the precise ranking, only about attributing songs back to their voters. ** Consequence: Bonus points, split vote quotient, seeding, how many times it's been on BBC 6 Records - all that gubbins does not matter.
Everyone voted independently, without looking at each other's lists. [Real life: did not happen, and designed not to happen. However, I think the basic idea holds - there was no particular effort to co-ordinate votes, and nothing that looked obviously like ballot stuffing - there wasn't an influx of Take That fans all voting for "Up all night".]
Entries in the top 250 are independent of each other. If you voted for "Crazy in love", you were no more or less likely to vote for "Single ladies". [Real life: may be more accurate than it sounds - voters seemed reluctant but not wholly unwilling to list more than one single by the same performer.]
The points allocated to records will roughly mirror the allocation in #Uncool500 last year. Statisticians say this is a "Zipfian distribution", which I thought was something George cleaned off his fur. ** Broadly, this is holding true. I've used actual data up to the 8-vote break. Based on the shape of votes last year, I've assumed that the winning song will get 18 votes, the top ten get 15 votes, and the top thirty 10 votes. [footnote 1]
So, for song 250, I picked out four voters at random - without duplication in these voters, because you can only vote for a song once.
For song 249, I picked out four voters at random - without duplication in these voters, but I didn't care if any of these voters picked the songs above.
Note how voter 87 has voted for songs 247 and 246 - this is quite fine.
And so on, and so on, and so on, up to 18 voters for the top song.
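The picking step above can be sketched in a few lines of Python. This is only an illustration, not the spreadsheet I actually used: the function name `draw_voters`, the seed, and the numbering of voters from 1 to 104 are all assumptions made for the example.

```python
import random

N_VOTERS = 104  # the simplifying assumption above: 104 complete lists


def draw_voters(n_votes, rng):
    """Pick n_votes distinct voters for one song (nobody votes for it twice)."""
    return sorted(rng.sample(range(1, N_VOTERS + 1), n_votes))


rng = random.Random(2021)  # arbitrary seed, only so a run can be repeated

# Songs 250 down to 246 each had exactly four voters.
draws = {song: draw_voters(4, rng) for song in range(250, 245, -1)}
for song, voters in draws.items():
    print(f"song {song}: voters {voters}")
```

`random.sample` never repeats a voter within one song, but each song is a fresh draw, so the same voter can turn up against several songs.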
And then I simulated 50 editions of #FearOfMu21c. Fifty sets of the top 250, each song allocated back to its voters. And then I tallied up how many hits each voter had in each edition.
The most common single value was 14 hits, the middle value was 16 hits. Two-thirds of the lists got between 12 and 19 hits, and it would be an unusual list to get fewer than 9 or more than 24 hits. [footnote 2]
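For anyone who wants to rerun the experiment without Excel, here is a rough Python sketch of the whole simulation under the assumptions listed earlier. The vote bands come from footnote 1; the function names, the seed, and the way the middle two-thirds is cut are my own choices for the example, and the figures it prints will wobble from run to run rather than match mine exactly.

```python
import random
from collections import Counter
from statistics import median

N_VOTERS = 104   # assumed number of complete lists
N_EDITIONS = 50  # number of simulated editions

# (best position, worst position, votes per song), taken from footnote 1
BREAKDOWN = [
    (1, 2, 18), (3, 4, 17), (5, 6, 16), (7, 9, 15), (10, 12, 14),
    (13, 16, 13), (17, 22, 12), (23, 29, 11), (30, 38, 10), (39, 50, 9),
    (51, 68, 8), (69, 85, 7), (86, 110, 6), (111, 162, 5), (163, 250, 4),
]


def votes_per_song():
    """Expand the bands into one vote count per song in the top 250."""
    return [v for first, last, v in BREAKDOWN for _ in range(first, last + 1)]


def simulate_edition(rng):
    """Hand every song to random distinct voters; return hits per voter."""
    hits = [0] * N_VOTERS
    for n_votes in votes_per_song():
        for voter in rng.sample(range(N_VOTERS), n_votes):
            hits[voter] += 1
    return hits


def simulate(n_editions=N_EDITIONS, seed=None):
    """Pool the per-voter hit counts from several simulated editions."""
    rng = random.Random(seed)
    all_hits = []
    for _ in range(n_editions):
        all_hits.extend(simulate_edition(rng))
    return all_hits


all_hits = simulate(seed=2021)  # arbitrary seed, for repeatability only
ordered = sorted(all_hits)
sixth = len(ordered) // 6       # trim one sixth from each end

print("most common value:", Counter(all_hits).most_common(1)[0][0])
print("middle value:", median(all_hits))
print("middle two-thirds:", ordered[sixth], "to", ordered[-sixth - 1])
print("lowest and highest:", ordered[0], "and", ordered[-1])
```

Python's generator is a different beast from Excel's, so comparing the highest value here with my own run is one way to see whether the oddity in footnote 2 is down to the spreadsheet.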
Based on the assumptions I've made, about two-thirds of us can expect to get between 12 and 19 of our list into the top 250.
[footnote 1]
Here's the full breakdown of how many votes I expected each position to get.
250-163 - 4 votes
162-111 - 5 votes
110-86 - 6 votes
85-69 - 7 votes
68-51 - 8 votes
50-39 - 9 votes
38-30 - 10 votes
29-23 - 11 votes
22-17 - 12 votes
16-13 - 13 votes
12-10 - 14 votes
9-7 - 15 votes
6-5 - 16 votes
4-3 - 17 votes
2-1 - 18 votes
These are approximate figures, and subject to some variation. The most important finding is that about 1614 votes are used to make the top 250 - everything else is discarded. [footnote 3]
[footnote 2] I reckon that Excel's not-quite-random number generator has some sort of echo, as there were a surprising number of lists scoring 30 and more hits. This really shouldn't happen.
[footnote 3] …and 1614 used votes divided by 104 lists gives an average of 15.5, so I could have saved myself a lot of effort.
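The arithmetic behind footnotes 1 and 3 can be checked with a short Python sketch. The band sizes are worked out from the position ranges in footnote 1 (for example, 250 down to 163 is 88 songs); the variable names are just for the example. The bands as listed add up to 1613 votes, in line with the "about 1614" quoted above.

```python
# (number of songs in the band, votes per song), from footnote 1
BANDS = [
    (88, 4), (52, 5), (25, 6), (17, 7), (18, 8), (12, 9), (9, 10), (7, 11),
    (6, 12), (4, 13), (3, 14), (3, 15), (2, 16), (2, 17), (2, 18),
]
N_VOTERS = 104  # assumed number of complete lists

total_songs = sum(n for n, _ in BANDS)
total_votes = sum(n * v for n, v in BANDS)
mean_hits = total_votes / N_VOTERS

print(total_songs)          # 250
print(total_votes)          # 1613
print(round(mean_hits, 1))  # 15.5
```

The shortcut works because every vote that lands in the top 250 is one hit for exactly one list, so the total number of hits equals the total number of used votes, whoever cast them. The simulation is still needed for the spread around that average.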