How to Make Accurate Soccer Predictions

Is it possible? Well, if I knew for sure I wouldn’t be writing posts like this!

Of course, with mathematical modelling and historic data we can at least identify the ‘most likely’ results.

So what I am going to talk about is:

Using Poisson Distribution to Calculate Probabilities of Certain Score Lines

So using Poisson and historic data I will demonstrate how to calculate the attack strength and defence strength needed to shortlist the most probable score lines (in this case 4).

Please note I am still testing the whole method. By my calculation I need a strike rate of around 38% for this to be profitable. This is based on backing the top 4 most likely scores and assuming odds of 10 or better. Currently the strike rate is around 26%, but I think the sample size is still too small, because I only started using this towards the end of the 2017/18 English Premiership season.
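The break-even point for this kind of bet can be checked with a few lines of Python. This is only a sketch: it assumes one unit staked on each of the 4 scores, that at most one of them can win, and that the odds are decimal (stake included in the payout). The post does not say which odds format is meant, so both readings of "10" are shown, and the function name is made up for illustration.

```python
def break_even_strike_rate(num_selections: int, decimal_odds: float) -> float:
    """Share of matches that must contain a winning pick to break even.

    Assumes one unit is staked on every selection and that a winning
    selection returns `decimal_odds` units (stake included).
    """
    return num_selections / decimal_odds


# Odds of 10 read as decimal odds.
print(f"{break_even_strike_rate(4, 10):.1%}")  # 40.0%
# Odds of 10 read as fractional 10/1, which is 11 in decimal.
print(f"{break_even_strike_rate(4, 11):.1%}")  # 36.4%
```

The figure of around 38% quoted above sits between these two readings.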

Step 1 Gather the Stats

The first step in calculating attack strength is to determine the average number of goals scored per team, per home game and per away game. We will look at only one league for now. From the results we take, for each team, the total number of home games played, the number of goals scored at home and the number of goals conceded at home:

[Image: Poisson Distribution Goals for and Against]

By dividing the goals for by the number of games played, and then dividing the goals against by the number of games played, we now get:

We also sum the games played and goals for and against so that we can calculate a league average for home games like so:
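For anyone who prefers code to a spreadsheet, here is a minimal Python sketch of Step 1. The team names and numbers are made up for illustration and are not the real league table; the function names are also just placeholders.

```python
# Hypothetical home records: team -> (home games, goals for, goals against).
# These rows are invented for illustration, not real league data.
home_records = {
    "Team A": (19, 45, 10),
    "Team B": (19, 24, 25),
    "Team C": (19, 18, 30),
}


def per_game_averages(records):
    """Return team -> (avg goals scored, avg goals conceded) per home game."""
    return {
        team: (goals_for / games, goals_against / games)
        for team, (games, goals_for, goals_against) in records.items()
    }


def league_averages(records):
    """Sum games and goals over all teams, then divide the totals."""
    total_games = sum(games for games, _, _ in records.values())
    total_for = sum(goals_for for _, goals_for, _ in records.values())
    total_against = sum(goals_against for _, _, goals_against in records.values())
    return total_for / total_games, total_against / total_games


team_avgs = per_game_averages(home_records)
league_for, league_against = league_averages(home_records)

for team, (scored, conceded) in team_avgs.items():
    print(f"{team}: {scored:.2f} scored, {conceded:.2f} conceded per home game")
print(f"League: {league_for:.4f} scored, {league_against:.4f} conceded per home game")
```

The same two functions would work for away games by feeding in away records instead.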

Step 2 Calculate the Attacking Strength of the Home Side

Now that we have the stats, we can use them to calculate the attacking strength. In this example we are going to use the EPL game on 13th May 2018 between Liverpool and Brighton.

So to get Liverpool's attacking strength we take their average goals at home of 2.37 and divide by the league average of 1.5315 to get 1.55:
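The same division as a small Python sketch, using the two figures quoted above. The function name is an assumption for illustration only.

```python
def attack_strength(team_avg_scored: float, league_avg_scored: float) -> float:
    """Team's average goals scored relative to the league average."""
    return team_avg_scored / league_avg_scored


# Liverpool at home: 2.37 goals per game against a league average of 1.5315.
liverpool_home_attack = attack_strength(2.37, 1.5315)
print(round(liverpool_home_attack, 2))  # 1.55
```

A value above 1 means the side scores more than the average home team; below 1 means it scores less.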

Step 3 Calculate the Defensive Strength of the Away Side

We can use the same stats and maths to calculate the defensive strength of the away side. So we have:
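A rough Python sketch of this step follows. It assumes defensive strength is the team's average goals conceded divided by the league average goals conceded, mirroring the attack calculation. Brighton's away figure is not given in the text, so the 1.53 below is a hypothetical input chosen only to reproduce the defensive strength of 1 used in Step 4.

```python
def defence_strength(team_avg_conceded: float, league_avg_conceded: float) -> float:
    """Team's average goals conceded relative to the league average."""
    return team_avg_conceded / league_avg_conceded


# Every goal conceded away is a goal scored by a home side, so the league
# average conceded per away game equals the league average scored per
# home game (1.5315).
league_avg_conceded_away = 1.5315

# Hypothetical input: not taken from the post's table.
brighton_avg_conceded_away = 1.53

brighton_away_defence = defence_strength(
    brighton_avg_conceded_away, league_avg_conceded_away
)
print(round(brighton_away_defence, 2))  # 1.0
```

Here a lower number is better: a value below 1 means the side concedes less than the league average.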

Step 4 Calculate Home and Away Team Goal Expectancy

To get the home team goal expectancy we simply multiply the home attacking strength (1.55) by the away defensive strength (1) by the league average of home goals (1.5315).

This gives us a home team goal expectancy of 2.373825.

We do the same for the away team, but now we multiply the away attacking strength (0.46) by the home defensive strength (0.46) by the league average of away goals (not shown in the table above, but this figure is 1.1465). This gives us 0.2426.
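Both multiplications can be reproduced in Python with the figures from this step. This is just a sketch; the function name is made up for illustration.

```python
def goal_expectancy(attack: float, defence: float, league_avg_goals: float) -> float:
    """Expected goals = attack strength x opposing defence strength x league average."""
    return attack * defence * league_avg_goals


# Liverpool (home): attack 1.55, Brighton away defence 1, league home average 1.5315.
home_expectancy = goal_expectancy(1.55, 1.0, 1.5315)
# Brighton (away): attack 0.46, Liverpool home defence 0.46, league away average 1.1465.
away_expectancy = goal_expectancy(0.46, 0.46, 1.1465)

print(round(home_expectancy, 6))  # 2.373825
print(round(away_expectancy, 4))  # 0.2426
```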

Step 5 Chart the Goal Distribution using Poisson

OK, so now that we have our 2 expectancy values we need to chart the outcomes using something called the Poisson distribution. We can do this quite easily in Excel. We just create a grid with a cell for each scoreline from 0-0 to 10-10 (that is 11 x 11 cells, since 0 counts as a score). Excel has a built-in POISSON function which simplifies things for us.

So we just need to call this function for each scoreline:

=POISSON(home team score cell, home goal expectancy, FALSE) * POISSON(away team score cell, away goal expectancy, FALSE) * 100
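If you would rather not use Excel, the grid can be sketched in Python with only the standard library. Like the formula above, this multiplies the two Poisson probabilities, which treats home and away goals as independent. The function names and the max_goals parameter are assumptions for illustration.

```python
from math import exp, factorial


def poisson_pmf(goals: int, mean: float) -> float:
    """Probability of exactly `goals` goals when `mean` goals are expected."""
    return exp(-mean) * mean**goals / factorial(goals)


def score_grid(home_exp: float, away_exp: float, max_goals: int = 10):
    """grid[h][a] is the chance (in %) of the home side scoring h and the away side a."""
    return [
        [
            poisson_pmf(h, home_exp) * poisson_pmf(a, away_exp) * 100
            for a in range(max_goals + 1)
        ]
        for h in range(max_goals + 1)
    ]


# Goal expectancies from Step 4.
grid = score_grid(2.373825, 0.2426)
for h in range(5):
    print(" ".join(f"{grid[h][a]:6.2f}" for a in range(5)))
```

Each printed row is a home score and each column an away score, so the first number on the first row is the chance of 0-0.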

So this is what it looks like:


Show me the Money!

OK so if we take the top 4 likeliest outcomes we have:

[Image: Making Money predicting soccer matches]
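Picking out the likeliest scores can also be sketched in Python. This assumes the same independent Poisson model as the Excel formula and the expectancies from Step 4; the function names are placeholders.

```python
from math import exp, factorial


def poisson_pmf(goals: int, mean: float) -> float:
    """Probability of exactly `goals` goals when `mean` goals are expected."""
    return exp(-mean) * mean**goals / factorial(goals)


def top_scorelines(home_exp: float, away_exp: float, n: int = 4, max_goals: int = 10):
    """Return the n most probable (home, away) scorelines with their probabilities."""
    scores = {
        (h, a): poisson_pmf(h, home_exp) * poisson_pmf(a, away_exp)
        for h in range(max_goals + 1)
        for a in range(max_goals + 1)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:n]


for (home, away), prob in top_scorelines(2.373825, 0.2426):
    print(f"{home}-{away}: {prob:.1%}")
```

With these two expectancies the ranking comes out as 2-0, 1-0, 3-0 and 4-0.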

If we had put money on all four outcomes we would have won! The final score was 4-0. Looking back, the odds for this scoreline were around 14, so a nice profit.

However, as I stated at the start, I am not about to retire on this method. It will be interesting to see how it fares at the start of the 18/19 season, and whether including 17/18 data makes the predictions better or worse.

Flaws in the Plan

So we can see that we can use the Poisson distribution to select a range of probable outcomes, but here are my thoughts:

At the start of the season we don’t have enough data to provide accurate predictions, so we would expect our model to get better as the season goes on.

Coupled with the above, we can’t really factor in historic data from previous seasons, as the line-ups and even the manager could be completely different.

If you want to see how to grab data from and store it in a database you can view that here.