Saturday 25 September 2010

Predicting future web site traffic

    Recently I had the unenviable task of trying to predict the traffic levels a web site was likely to see in the few months following a piece of search engine marketing work. Unenviable because it's a "how long is a piece of string?" question, impossible to answer with the certainty usually demanded by those who ask it. I gave it my best shot and thought it worth recording here how I did it.
    Web site traffic is cyclical. That is to say that the traffic pattern seen on a site over a given period in one year is likely to be mirrored in the same period in the following year. These same cycles can be seen on different sites in the same sector, so if one site selling pies sees a traffic pattern it is likely that another pie site will see the same pattern over the same period.
    The site in question has not been online for long enough to have gathered statistics for this period in a previous year. This meant that for the purposes of this exercise I had to look elsewhere in the same industry to establish the likely traffic patterns for the next few months.
    A competitor graph was created using the compete.com competitor tracking service. Services like this can only provide estimates of a site's traffic levels, but they do seem to get the trend information right. Three sites from the same industry were selected for similar traffic levels. The graph axes were extended a few months into the future, and the traffic patterns for the two competitor sites from the same period in the previous year were pasted onto the end of their traces for this year. The trace for that period from the site that had the worst pattern was pasted onto the end of our site's trace for the months to be predicted. This formed the baseline of our predicted traffic, in other words what we thought might happen if no promotional activity took place.
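    For anyone who would rather do the pasting in code than on a graph, here is a minimal sketch of the baseline step. It assumes that "pasting" a trace means rescaling the reference site's figures so that they join up with our last known figure; the function name and every number in it are hypothetical, not taken from the real graph.

```python
def project_baseline(our_last_value, reference_prev_year):
    """Rescale a reference site's trace so it continues from our last figure.

    reference_prev_year holds the reference site's figures from last year:
    first the month matching our last known month, then the months to predict.
    """
    anchor = reference_prev_year[0]
    if anchor <= 0:
        raise ValueError("reference trace must start with a positive figure")
    # Each future month keeps the same proportion to the anchor month
    # that the reference site saw last year.
    return [our_last_value * value / anchor for value in reference_prev_year[1:]]


# Hypothetical figures: the reference site's trace for last year,
# starting with the month that matches our last complete month.
reference = [20000, 22000, 18000, 16000]
baseline = project_baseline(10000, reference)
print(baseline)
```

    Rescaling keeps the shape of the reference site's pattern while ignoring the difference in size between the two sites, which is all the baseline is meant to capture.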
    A new predicted trace for our site was then created by applying a 10% per month increase to the baseline trace. This formed an upper trace that diverged increasingly from the baseline. The 10% figure was plucked from the air as a realistically achievable upper limit. The result was a roughly triangular shaded area between the 10% and baseline curves, into which our future traffic should fall.
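    The upper trace is simple enough to sketch in a few lines. This version assumes the 10% compounds month on month, which is only one reading of "10% per month"; the function name and the flat baseline figures are made up for illustration.

```python
def upper_trace(baseline, monthly_growth=0.10):
    """Apply a compounding monthly uplift to each point of the baseline.

    The first predicted month gets one month of growth, the second two,
    and so on, so the gap to the baseline widens over time.
    """
    return [
        value * (1 + monthly_growth) ** (month + 1)
        for month, value in enumerate(baseline)
    ]


# Hypothetical flat baseline of 1000 visitors for three months.
upper = upper_trace([1000, 1000, 1000])
print([round(v) for v in upper])
```

    A non-compounding version would multiply by 1 + monthly_growth * (month + 1) instead, giving a slightly narrower wedge.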
    Finally a left-hand y-axis was created to show estimated Google Analytics visitor figures. Visitor figures from services like compete.com usually significantly under-represent the true values, so the Analytics estimates were calculated by taking this year's known Analytics figures as a reference and applying their ratios to the corresponding compete.com figures.
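    The conversion to Analytics figures can be sketched the same way. This assumes a single scaling factor, the average of the monthly Analytics-to-compete.com ratios over the months where both figures are known; the function names and numbers are hypothetical.

```python
def calibration_ratio(analytics_known, compete_known):
    """Average ratio of Analytics visitors to compete.com visitors."""
    if not analytics_known or len(analytics_known) != len(compete_known):
        raise ValueError("need matching, non-empty lists of monthly figures")
    ratios = [a / c for a, c in zip(analytics_known, compete_known)]
    return sum(ratios) / len(ratios)


def to_analytics(compete_values, ratio):
    """Scale compete.com figures up to estimated Analytics figures."""
    return [value * ratio for value in compete_values]


# Hypothetical months where both figures are known.
ratio = calibration_ratio([3000, 4500], [1000, 1500])
estimates = to_analytics([2000, 2500], ratio)
print(ratio, estimates)
```

    If the monthly ratios vary a lot, that spread is one more source of error to add to the pile.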
    So what did this graph tell us? In three months' time our site could be receiving the traffic figure midway between the baseline and 10% traces. Which sounds impressive until you realise that the two traces represent the error on that figure, about 20%. Hardly accurate.
    What it really tells us is this: predicting the future is an inexact science, and the further into the future we gaze, the greater the error with which we see it. In fact the graph may already be flawed at the point of its creation. Compete.com works on complete months, and the Analytics figures so far this month are not as good as those for the last complete month might lead us to hope.
    As an exercise though it was still worth pursuing. It is always worth knowing what the web traffic cycles are in any industry and for all its inaccuracy this method still gives some idea of what we might expect. I just wouldn't stake my career on it, that's all.
