Monday 27 September 2010

RSS feed keyword analysis for the fun of it

    What do you do when the recession hits and you are made redundant?

    When it happened to me last year, I wrote an RSS feed keyword trend analyser in my new-found free time. Over a year and several million keywords and phrases later I can find associated keywords and phrases and plot graphs for almost anything that's been in the UK mainstream news. Like this one, showing the fortunes of three Labour party leaders over the past few weeks.


    You can clearly see Tony Blair's book launch as the blue hump in the middle, and Ed Miliband's election as party leader in green on the right. Meanwhile Gordon Brown bumps along in the obscurity of his Scottish constituency as the red line. Funny that: the colours were allocated at random by my graphing library, yet Blair got the Tory blue.
    As a search engine marketeer's tool it's of limited use unless you really are looking at up-to-the-minute trends for very fast-moving content. But as a toy, or for finding collocated words and phrases for newsworthy themes, it's shaping up pretty well.
    I'll be dipping back into this particular well of words again on here from time to time, both from the tech side and just for the joy of playing with some words.

Saturday 25 September 2010

Predicting future web site traffic

    Recently I had the unenviable task of making an attempt to predict the traffic levels likely to be seen on a web site in the few months following a piece of search engine marketing work. Unenviable because it's a "how long is a piece of string?" question, impossible to answer with the certainty usually demanded by those who ask it. I gave it my best shot and thought it worth recording here how I did it.
    Web site traffic is cyclical. That is to say that the traffic pattern seen on a site over a given period in one year is likely to be mirrored in the same period in the following year. These same cycles can be seen on different sites in the same sector, so if one site selling pies sees a traffic pattern it is likely that another pie site will see the same pattern over the same period.
    The site in question has not been online for long enough to have gathered statistics for this period in a previous year. This meant that for the purposes of this exercise I had to look elsewhere in the same industry to establish the likely traffic patterns for the next few months.
    A competitor graph was created using the compete.com competitor tracking service. Services like this can only provide estimates of any site's traffic levels, but they do seem to get the trend information right. Three sites from the same industry were selected for similar traffic levels. The graph axes were extended for a few months into the future and the traffic patterns for the two competitor sites from the same period in the previous year were pasted onto the end of their traces for this year. The trace for that period from the site that had the worst pattern was pasted onto the end of our site's trace for the months to be predicted. This formed the baseline of our predicted traffic, in other words what we thought might happen if no promotional activity took place.
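    For anyone wanting to do the pasting step in code rather than on a graph, here is a minimal sketch. It assumes that "pasting" means rescaling last year's shape so that it continues from the last known value of this year's trace, which is one interpretation among several. The function name and all the figures are hypothetical.

```python
def extend_with_pattern(current, last_year_pattern):
    """Append last year's shape to the end of this year's trace.

    last_year_pattern[0] is assumed to be last year's value for the same
    month as current[-1]; the remaining values are rescaled so that the
    extension continues from current[-1].
    """
    anchor = last_year_pattern[0]
    last = current[-1]
    return current + [last * value / anchor for value in last_year_pattern[1:]]


# Hypothetical monthly figures, not real data.
current = [800.0, 900.0, 1000.0]
last_year_pattern = [2000.0, 1800.0, 2200.0, 2400.0]

baseline = extend_with_pattern(current, last_year_pattern)
print(baseline)
```

    The last three values of the result are the predicted months, following last year's ups and downs in proportion.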
    A new predicted trace for our site was then created by applying a 10% per month increase on the baseline trace. This formed an upper trace becoming increasingly divergent from the baseline trace. 10% was a figure plucked from the air as a realistically achievable upper limit target. The result was a shaded area between the 10% and baseline curves that was roughly triangular into which our future traffic should fall.
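    The upper trace is simple enough to sketch in a few lines. This assumes the 10% per month compounds (a non-compounding 10%, 20%, 30% would be another reading), and the function name and baseline figures are made up for illustration.

```python
def upper_trace(baseline, monthly_growth=0.10):
    """Compound monthly_growth onto each month of the baseline trace.

    Month 1 gets one month of growth, month 2 gets two, and so on,
    so the gap to the baseline widens over time.
    """
    return [value * (1 + monthly_growth) ** month
            for month, value in enumerate(baseline, start=1)]


# Hypothetical baseline figures for the next three months.
baseline = [1000.0, 1100.0, 900.0]
upper = upper_trace(baseline)

for month, (low, high) in enumerate(zip(baseline, upper), start=1):
    print(f"month {month}: between {low:.0f} and {high:.0f}")
```

    The band between the two lists is the shaded area on the graph.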
    Finally a left-hand y-axis was created to show estimated Google Analytics visitor figures. Visitor figures from services like compete.com usually significantly under-represent the true values, so using the known Analytics figures from this year as a reference, the Analytics estimates were calculated using their ratios to their corresponding compete.com figures.
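    The ratio calculation could look something like the sketch below. It assumes a single average ratio across the months where both figures are known; the function names and numbers are hypothetical.

```python
def analytics_ratio(analytics_known, compete_known):
    """Average ratio of known Analytics figures to compete.com figures
    for the same months."""
    ratios = [a / c for a, c in zip(analytics_known, compete_known)]
    return sum(ratios) / len(ratios)


def to_analytics(compete_values, ratio):
    """Scale compete.com estimates up to estimated Analytics figures."""
    return [value * ratio for value in compete_values]


# Hypothetical figures for months where both sources are available.
compete_known = [400.0, 500.0]
analytics_known = [1000.0, 1250.0]

ratio = analytics_ratio(analytics_known, compete_known)
estimated = to_analytics([600.0, 800.0], ratio)
print(ratio, estimated)
```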
    So what did this graph tell us? In three months' time our site could be receiving the traffic figure midway between the baseline and 10% traces. Which sounds impressive until you realise that the two traces represent the error on that figure, about 20%. Hardly accurate.
    What it really tells us is this: predicting the future is an inexact science, and the further into the future we gaze, the greater the error with which we see it. In fact the graph may already be flawed at the point of its creation. Compete.com works on complete months, and the Analytics figures so far this month are not as good as those for the last complete month might lead us to hope.
    As an exercise, though, it was still worth pursuing. It is always worth knowing what the web traffic cycles are in any industry, and for all its inaccuracy this method still gives some idea of what we might expect. I just wouldn't stake my career on it, that's all.

Wednesday 22 September 2010

My compliments to the cook: SEO vs. SEM

    When I was a small child I attended a primary school in an English village. Summers were long and hot, there were jumpers for goalposts and our school meals were awful. They were the creations of the school cook, a rather nice lady whose culinary output was probably stunted by a poor budget and the dead hand of Ministry of Education dieticians. It was with great surprise, then, that when I moved to secondary school I found the meals were rather good, worth looking forward to in fact, for they were assembled not by a cook, but by a chef. With a white hat and all, very impressive.
    My profession is usually referred to as search engine optimisation, often represented by the initialism SEO. You will rarely see either in my personal lexicon; instead I prefer search engine marketing.
    My reasons for this are twofold: to give a sense of the wider task involved in helping a web site to increase its visibility in the search engines through legitimate means, and to differentiate myself from the work of the blackhats in the gutter of my industry. A few years ago, while contracting as a quality rater for the large search engine you probably use daily, I spent a lot of time following up keyword-stuffed link farms, valueless spam blogs and hidden or misleading rubbish from people who definitely refer to themselves as being in the SEO business, so for me the distinction is an important one. I'm lucky enough now to work in-house at a large publishing business and need never ply my trade further afield, so I see no reason to associate myself with the term SEO.
    Looking at a Google Insights search comparing the two terms I find I'm at least not entirely alone. Search engine marketing is used about half as much as search engine optimisation (or optimization for a US search) but it's still a significant enough term for me to be able to describe myself thus without blank looks. Because as with the school catering staff of 1980s Oxfordshire, I'd rather be a chef than a cook.

Sunday 19 September 2010

All blogs have to start somewhere

Keyword
  1. a word which acts as the key to a cipher or code
  2. a word or concept of great significance
Geek
  1. an unfashionable or socially inept person
  2. [usually with modifier] a knowledgeable and obsessive enthusiast
    This is obviously one of those cases when you wish you hadn't looked a word up in the dictionary. I'm a search engine specialist by trade, so "A knowledgeable and obsessive enthusiast for words of great significance" doesn't sound too bad. I'm not so sure about "An unfashionable or socially inept person" though.
    I enjoy my job, and in doing it over the years I have frequently encountered words, phrases, techniques and bits of code that have made me think at a tangent to what I am being paid to do. These tangents sometimes stick around in my head for a while, and this blog represents a long-overdue outlet for them.