In my previous (and first) blog post a week ago, I discussed some interesting statistics that I had compiled from my personal set of 1208 1v1 games played in the last 18 months. While my overall win-rate of 50.7% was very close to the 50% that the Blizzard matchmaking contraption aims for, things looked different when I broke the results down per weekday.
On Wednesday, my win-rate was only 39.4%, which is very low. I then went on to discuss the statistical concept of "standard deviation" (denoted by the variable sigma), which measures how much spread there is in a data-set. In the case of my Wednesday data, my win-rate deviated from the expected outcome by 2.5 sigma. This means that there is only about a 1% probability of getting such an outcome by pure chance.
So the conclusion I ended with was that it seemed unlikely that pure chance was behind the sucky Wednesdays and that there had to be a deeper cause. In this post I will take the analysis a bit further and see if my original conclusion holds up under further testing.
The definition of a day
I use SC2Gears to mass-analyze my replays and it does a lovely job at it. Immediately after seeing my Wednesday win-rate in the SC2Gears multi-replay-analysis screen, I started to think about what could cause it. One of the first things that popped into my mind was a very simple idea.
SC2Gears considers a day to start and end at midnight. This is a reasonable thing to do, since it's pretty much what everyone does. However, in the context of this analysis it may not be the right way to go about it. If there is some mental condition that is affecting my play on Wednesday, it does not magically disappear at midnight. Nor does it magically appear at midnight on Tuesday. Yet all games played past midnight count towards the next day. That makes sense when you look at the formal concept of a day, running from midnight to midnight, but not so much when you look at my 24-hour cycle, where the sleeping period is the divider between two days.
Going back to the statistics, I noticed that my latest game was logged somewhere at hour 3 and my earliest at hour 8 (SC2Gears breaks the results down by the hour as well, so this info was easy to find). A natural way to go about it would be to place the day-divider somewhere between 3 and 8. I posted in the SC2Gears thread with a feature request for a custom day-divider but this request was not fulfilled unfortunately.
So I let it rest for a while until the numbers started to gnaw at me again. I wrote a short script to run over my replay database and compile a list of win-rates for each day of the week, but now with a "day" having the right beginning and end point.
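The original script is not shown here, so the following is only a minimal sketch of the idea, not the code I actually ran. It assumes each game is available as a (timestamp, won) pair; the names `DAY_START_HOUR`, `playing_day` and `win_rates_by_day` are made up for illustration, and the divider of 5 is just one choice inside the empty window between hour 3 and hour 8.

```python
from collections import defaultdict
from datetime import datetime, timedelta

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Assumed divider: any hour in the gap between the latest game (hour 3)
# and the earliest game (hour 8) gives the same grouping.
DAY_START_HOUR = 5


def playing_day(played_at, day_start_hour=DAY_START_HOUR):
    """Return the weekday a game belongs to when a day starts at day_start_hour."""
    shifted = played_at - timedelta(hours=day_start_hour)
    return WEEKDAYS[shifted.weekday()]


def win_rates_by_day(games, day_start_hour=DAY_START_HOUR):
    """games: iterable of (datetime, bool won). Returns {weekday: win-rate}."""
    tally = defaultdict(lambda: [0, 0])  # weekday -> [wins, games played]
    for played_at, won in games:
        day = playing_day(played_at, day_start_hour)
        tally[day][0] += int(won)
        tally[day][1] += 1
    return {day: wins / total for day, (wins, total) in tally.items()}


# Hypothetical games, not taken from my replay database.
games = [
    (datetime(2012, 5, 9, 1, 30), True),   # calendar Wednesday, counts as Tuesday
    (datetime(2012, 5, 9, 20, 0), False),  # Wednesday evening
    (datetime(2012, 5, 10, 2, 0), True),   # calendar Thursday, counts as Wednesday
]
print(win_rates_by_day(games))
```

Shifting the timestamp back by the divider before asking for the weekday is all it takes: a game at 01:30 on a calendar Wednesday lands on Tuesday, which is the session it really belongs to.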
The results are as follows:
Monday - 58.2% (4.8%) (1.7)
Tuesday - 44.7% (8.1%) (0.65)
Wednesday - 43.3% (3.7%) (2.1)
Thursday - 49.1% (4.9%) (0.18)
Friday - 52.5% (3.9%) (0.64)
Saturday - 50.6% (2.8%) (0.21)
Sunday - 54.3% (2.9%) (1.5)
The first number on each line is the win-rate for that "day". The second percentage, in parentheses, is the expected standard deviation if we assume that the chance to win a game is 50%. This value becomes smaller as more games are played. The third number is how many standard deviations the observed value lies from the expected value of 50%.
The first observation is that the Wednesday anomaly has been reduced significantly by applying this shift in the data. The odds of a 2.1 sigma deviation happening by chance are 3.6%, against the 1% for the 2.6 sigma deviation we had before. But there's more to it than this.
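For those who want to reproduce the two bracketed numbers, here is a small sketch. It assumes the usual binomial formula for the standard deviation of a win-rate, sqrt(p(1-p)/n), with p = 50%. The function names are mine, and the 100 games in the example are a made-up number, since the per-day game counts are not listed in this post.

```python
import math


def expected_sigma(n_games, p_win=0.5):
    """Standard deviation of the observed win-rate over n_games at win chance p_win."""
    return math.sqrt(p_win * (1.0 - p_win) / n_games)


def deviation_in_sigma(win_rate, n_games, p_win=0.5):
    """How many standard deviations the observed win-rate lies from p_win."""
    return abs(win_rate - p_win) / expected_sigma(n_games, p_win)


# Hypothetical day: 100 games played with a 58% win-rate.
n = 100
print(f"sigma     = {expected_sigma(n):.1%}")
print(f"deviation = {deviation_in_sigma(0.58, n):.2f} sigma")
```

Because n sits under the square root, you need four times as many games to halve the standard deviation.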
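The step from "sigmas" to odds can be sketched as follows. This assumes the normal approximation and counts deviations in both directions (too many wins as well as too few), which is the reading that matches the 3.6% quoted for 2.1 sigma. The function name `two_sided_p` is just a label I picked.

```python
import math


def two_sided_p(n_sigma):
    """Chance that a normally distributed value lies at least n_sigma from its mean."""
    return math.erfc(n_sigma / math.sqrt(2.0))


for z in (2.0, 2.1, 2.6):
    print(f"{z:.1f} sigma -> {two_sided_p(z):.1%}")
```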
The look-elsewhere-effect
Suppose you test whether a coin is fair by flipping it 100 times and counting the number of heads and tails. If the result is not too far (say less than 2 sigma) from the expected value, you call the coin fair. This is a very reasonable way of determining whether a coin is fair or not. The problem is that once you start testing multiple coins, you're bound to run into one that is fair, but does not pass the test. In fact, about one in twenty fair coins will have a 2 sigma (or more) deviation from a 50/50 distribution.
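The "one in twenty" figure can be checked without flipping anything. The sketch below sums the exact binomial probabilities of every outcome that lies 2 sigma or more from 50 heads in 100 flips. The function name is hypothetical, and because coin flips come in whole numbers the exact answer differs slightly from the normal-curve value, so expect something close to one in twenty rather than exactly 5%.

```python
from math import comb, sqrt


def prob_fair_coin_fails(n_flips=100, n_sigma=2.0):
    """Exact chance that a fair coin deviates n_sigma or more from n_flips / 2 heads."""
    mean = n_flips / 2
    sigma = sqrt(n_flips) / 2  # standard deviation of the number of heads
    failing = sum(comb(n_flips, heads)
                  for heads in range(n_flips + 1)
                  if abs(heads - mean) >= n_sigma * sigma)
    return failing / 2 ** n_flips


print(f"{prob_fair_coin_fails():.1%} of fair coins fail the 2 sigma test")
```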
Even though the analysis of a single data-set may be solid, the more data-sets you analyze, the larger the probability of finding an anomalous result that is purely due to chance. This phenomenon is known as the Look-Elsewhere-Effect (among other things). The term gained some popularity after it was used in some reports on the search for the Higgs boson in the LHC particle accelerator.
It applies to this case too. If we assume that there's nothing special about Wednesday, then the anomaly we see there may as well have happened on any other day. This means that the odds of seeing a deviation of this magnitude somewhere in the week are much higher than we've previously assumed. Without bothering you with the equation (it's not difficult), the chance of at least one day having at least a 2.1 sigma deviation is 22.5%.
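For the curious, one equation that reproduces this number is 1 - (1 - p)^7, where p is the chance of a single day showing such a deviation. The sketch below assumes the seven days are independent of each other and uses the 3.6% from above as p; the function name is my own.

```python
def chance_of_at_least_one(p_single, n_tests):
    """Chance that at least one of n_tests independent tests shows the deviation."""
    return 1.0 - (1.0 - p_single) ** n_tests


p_one_day = 0.036  # chance of a 2.1 sigma (or larger) deviation on one given day
print(f"{chance_of_at_least_one(p_one_day, 7):.1%}")
```

The trick is to compute the chance that all seven days behave, (1 - p)^7, and subtract that from 1.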
In conclusion
There we have it: the final number to measure the weirdness of this result. 22.5% is the chance that a deviation this big occurs somewhere in the week due to pure chance. It's not a big probability, but it's not so small that pure chance can be excluded as the cause of it. There may not be something wrong with me on Wednesdays. The only way to know for sure is to gather more data by playing more.
On a related note, the analysis in this blog post and the previous one demonstrates how careful one must be when using statistics. The naive result, using just the raw data from SC2Gears, suggested a very large deviation, which could (with 99% confidence) not be due to chance. However, after more careful evaluation of the data, the outcome changed dramatically.




