Tuesday 18 December 2012

Why electrical network frequency analysis might be unsafe to trust in court

    Tl;dr: Electrical network frequency analysis involves analysing the frequency of recorded mains hum to verify the time a recording was made, and that it has not been edited. This piece expresses concern that it could be fooled using readily available computer equipment, and makes a suggestion as to how that might be prevented.

    Electrical network frequency analysis has been in the news recently. It offers a solution to the problem facing courts when dealing with audio recordings; that of establishing the time a recording was made and that it has not been edited or tampered with.
    It works by analysis of any mains hum present on a recording. The mains electricity system uses AC, or alternating current, which is to say that its current changes direction many times a second. AC power cables are thus surrounded by an oscillating magnetic field, which induces a tiny AC voltage in any electronic equipment that comes within its range. If the electronic equipment is a tape recorder then that tiny AC voltage will be copied onto any recordings it makes, resulting in a constant detectable background hum.
    In the UK our AC power grid operates at a frequency of 50Hz, which is to say that its current changes direction 50 times a second. All our power lines are connected to the same grid, so when there are minute variations in the frequency of the grid power, in response, for example, to instantaneous surges in demand, those variations will be identical everywhere in the country. Thus if you were to store the frequency of the grid power as it varies over a period of time, you could identify when a recording was made within that period by comparing the variations in frequency of any mains hum it contained with your stored values for mains frequency.
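To make the matching step concrete, here's a rough sketch of how it might work; this is my own toy illustration, not any forensic lab's actual code, and the window length, search band and least-squares matching are all my assumptions:

```python
import numpy as np

def hum_frequency_track(signal, sample_rate, window_s=2.0):
    """Estimate the dominant near-50Hz frequency in each analysis window."""
    n = int(sample_rate * window_s)
    freqs = np.fft.rfftfreq(n, d=1.0 / sample_rate)
    band = (freqs >= 49.0) & (freqs <= 51.0)   # search only around 50Hz
    track = []
    for start in range(0, len(signal) - n + 1, n):
        window = signal[start:start + n] * np.hanning(n)
        spectrum = np.abs(np.fft.rfft(window))
        track.append(freqs[band][np.argmax(spectrum[band])])
    return np.array(track)

def best_match_offset(recording_track, grid_log):
    """Slide the recording's frequency track along the stored grid-frequency
    log and return the offset at which the two traces agree most closely."""
    m = len(recording_track)
    errors = [np.sum((grid_log[i:i + m] - recording_track) ** 2)
              for i in range(len(grid_log) - m + 1)]
    return int(np.argmin(errors))
```

Given a grid-frequency log spanning months or years, the offset with the smallest error pins the recording to a point within that period.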
    It is a very effective technique, because the mains hum provides a readily reproducible timestamp. An infallible weapon in the fight against crime, you might say.

    Unfortunately I have my doubts.

    As an electronic engineer by training, when I read the BBC piece linked above I thought immediately of Fourier transforms. A Fourier transform, for those fortunate enough never to have had to learn them, is a mathematical method for taking a piece of data in the time domain and looking at it in the frequency domain. If this sounds confusing, consider a musical stave. As you move from left to right along it you are moving in the time domain; the notes it contains are each played as you pass them. If however you shift your viewpoint through 90 degrees and look at the stave end-on, you are now looking at it in the frequency domain, and you are seeing each note as it is played, represented in its position on the paper by its pitch. If you encounter a chord, you will see several notes at the same time, each at a different pitch.
    Now if you were to imagine the same trick applied to a complex recording such as human speech you would need to abandon the musical stave and instead imagine a much wider frequency range. And instead of single frequencies generated by musical notes you would see a multitude of different frequencies at different intensities which make up the astonishing variation of the human voice.
    Once you have transferred a recording into the frequency domain like this, you can examine individual frequencies, such as any 50Hz mains hum. The forensic teams use this technique to measure any variations in the hum; it's an extremely useful piece of mathematics.
    However, as well as examining individual frequencies you can also manipulate them. You can remove them entirely if you want to, or put new ones in. Then you can recombine all the frequencies from your Fourier transform back together into the time domain to create a new, altered copy of your recording.
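As a very rough illustration of how little effort that manipulation takes, here is a sketch in Python with NumPy; it is my own toy example rather than a forensic tool, and the band limits and hum amplitude are arbitrary assumptions:

```python
import numpy as np

def replace_hum(signal, sample_rate, new_hum_track, band=(49.0, 51.0),
                hum_amplitude=0.01):
    """Strip any component near 50Hz from a recording, then add a synthetic
    hum whose instantaneous frequency follows new_hum_track (one value per
    sample). A deliberately naive sketch of the manipulation described above."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    spectrum[(freqs >= band[0]) & (freqs <= band[1])] = 0   # remove old hum
    cleaned = np.fft.irfft(spectrum, n=len(signal))
    # integrate the frequency track to get the phase of the new hum
    phase = 2 * np.pi * np.cumsum(new_hum_track) / sample_rate
    return cleaned + hum_amplitude * np.sin(phase)
```

Feed it a frequency track lifted from a recording made at some other time, and the forged hum will match the grid log for that other time.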
    And it is this ability that is at the root of my doubts about electrical network frequency analysis. Since it is possible to remove the mains hum timestamp from a recording in this way and replace it with an entirely different one, it seems to me that relying on this technique to verify when a recording was made, and that it has not been altered, is inherently unsafe.
    While researching this piece I had a good long chat with a friend whose career took him into the world of DSP. From the course of our discussion came an idea as to how the job of detecting manipulation of a hum signature might be achieved.
    As it has been described, the forensic analysis can only look at the frequency of the 50Hz hum: they record it at their lab and compare it with the recording under examination. Yet the local mains supply where the recording is being made contains so much more information than simply the hum frequency; it carries a much wider bandwidth of noise that is unique to the mains environment in that particular location. That noise is generated by the mains equipment electrically close to the recorder: everything from electric motors through fluorescent lights to poorly-shielded electronics. In addition it will contain phase changes, small movements of the waveform in the time domain, caused by any of those pieces of equipment that do not present purely resistive loads, and those phase changes can be readily linked to the noise from the devices that generate them. This information would be much more difficult to remove from a recording than just the 50Hz hum, so it could provide a means to tie a genuine hum signature to a recording.
    Unfortunately, though, the only component of this that will be recorded is the strongest low-frequency component of the noise, the 50Hz hum itself. This is because whatever is recorded has to be induced in the recorder by the magnetic field of the mains installation, hardly a coupling conducive to the transfer of higher frequencies.
    But what if, instead of relying on induction, the recorder mixed a suitably attenuated copy of the complete mains noise spectrum in with the input from its microphone? In that case all the information about nearby mains-connected devices and their effect on the phase of the 50Hz hum would be preserved, making it extremely difficult to insert another hum signature whose phase changes did not match the changes in electrical noise also present on the recording. It is not beyond the bounds of possibility to imagine that "official" recorders in police stations and the like could be modified to record this noise.
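The mixing itself would be trivial; the hard part would be the analogue front end needed to sample the mains safely. As a sketch of the principle (the function names and the -60dB figure are my own assumptions, not any real recorder's design), the recorder would simply sum the two channels so the wideband mains noise rides along with the audio in a fixed phase relationship:

```python
import numpy as np

def mix_mains_reference(mic_signal, mains_probe, attenuation_db=-60.0):
    """Add a heavily attenuated copy of the locally sampled mains waveform
    (hum plus its wideband noise and phase quirks) to the microphone channel,
    so the two are captured in a fixed phase relationship."""
    gain = 10.0 ** (attenuation_db / 20.0)
    return np.asarray(mic_signal) + gain * np.asarray(mains_probe)
```

Anyone later substituting a different hum signature would then also have to forge the accompanying wideband noise and keep its phase consistent, a far taller order.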
    Of course, I may be an electronic engineer, but I spend my days working for a dictionary. The frequency analysis I do for a living these days involves language and word frequencies rather than audio, and any digital signal processing I have a go at is strictly in the hobby domain. I know the removal and reinsertion of a 50Hz hum signature in the way I have described is nothing special and could be performed by someone proficient with DSP software on a rather modest computer far less powerful than most modern cellphones, but I have no knowledge of any specialist techniques that might be used to detect it in a finished recording. My concern is that I am seeing a forensic technique acquire a scientific halo of being somehow a piece of evidence that is beyond reproach, and this prospect worries me when I can see such a flaw. This is not from a desire to damage justice but to strengthen it, for it is not unknown for evidence to be found to have been fabricated.
    So if there is nothing to be concerned about and manipulation of hum signatures in the way I have described could be easily spotted, fine. That's what I want to hear. Don't just say it though, prove it. But if instead this technique turns out to be a valid attack on network frequency analysis, then let it be brought into the public arena so that methods of detecting it can be devised.

Tuesday 20 November 2012

Slashdot is beyond resuscitation.

    I'm a news junkie. Specifically I'm a tech news junkie. Part of my morning routine involves scanning Google Reader to see what's new from a long list of tech news sites.
    One of the consistencies in my news feed for well over a decade has been the venerable tech news blog, Slashdot. News for nerds, stuff that matters, as the tagline put it. My Slashdot ID isn't one of the really low numbers and I've not used it much but it's low enough to date me to the early 2000s. The site has helped shape my outlook on Internet culture and brought me first news of some of the defining tech stories of the last decade.
    But all good things eventually decline. I first saw whispers that Slashdot was past it and Hacker News was the Place to Be a few years ago and though I followed HN I refused to believe that Slashdot was dead. Sure they sometimes took a day to post a story and there were the inevitable dupes, but nobody's perfect, you insensitive clod!
    This week, however, I finally see that Slashdot, the Slashdot that I used to know, is dead. Why? I'm at the epicentre of one of their stories. Well, I should say I was at the epicentre; they're not just a day late but a whole week late on a story that was all over the mainstream media last Monday. Yes, my US colleagues chose 'GIF' (verb) as their Word of the Year, and like your rather out-of-touch elderly relative, Slashdot has caught up with a week-old newspaper on the day-room coffee table and proudly announced the story as today's news.
    News for nerds, stuff that mattered a week ago. Worth following only for old time's sake.

Tuesday 23 October 2012

Bye bye analogue telly

   It is with some sadness that I note today sees the turning off of the final UK terrestrial analogue TV transmitter in Northern Ireland. Not because I miss Ceefax or because I hanker again for the days of only three, four, or five channels, but because analogue TV was what gave me my start in electronics when I was a teenager.
    When my contemporaries were doing more conventional 1980s teen stuff like riding BMX bikes or burning away their money on Pac-Man, I was hunting through skips for discarded TV sets, fixing them, learning how they worked, and using them as sources of components for my other electronic projects. I must have had hundreds of them pass through my hands, mostly the sets from the colour TV boom of the early 1970s. I learned the foibles of the Philips G8, the Decca Bradford and the ITT CVC5, I understood how an analogue PAL decoder worked and I picked up what is now one of the most useless skills around for an engineer, converging a delta-gun colour CRT.
    I remember some of my projects, the UHF transmitters fashioned from tuner cavities and the scary spark generator using TV EHT parts. My DX-TV setup, my home-made satellite receiver, and those weird Lockfit transistors. And the FM bugs made in IF cans, or the stereo valve amplifier using dirt-cheap PCL86 TV frame output valves. I made a lot of awful projects, some useless projects, other scary projects and one or two really good projects from discarded TV parts.
    As you might expect, I never had to pay for a TV until I was 35 and wanted an LCD panel.
    I still have one or two sets left over from that period. A few black and white sets of varying sizes, and a solitary ITT CVC5 colour set, rather battered. I sometimes fire one up with a Humax set-top-box, but there's no practical reason for me to keep them. Too good to throw away though.
    I feel privileged to have grown up as an engineer in the 1980s. Not only did I get the explosion of 8-bit microcomputers, I was also lucky enough that the electronic devices of the day were accessible enough to understand. I pity today's teenagers, for whom electronic devices are highly integrated and surface-mount; they have such a restricted opportunity for experimentation.
    So bye bye analogue telly. I can't say I'll miss you in 2012, but I'm indebted to what you gave me. I doubt I'll see your like again.

Thursday 4 October 2012

What I really want from a mobile phone

    Every week, it seems, there comes a new smartphone launch. Despite the fact that they are increasingly becoming identical black slabs, we're told that this one is different, special somehow because of one of its new features. It has an extra few mm of display width, it's 0.5mm thinner, or it has an extra core in its processor.

    All very nice, but y'know what? I don't give a toss.

    For a technophile like me to say that indicates that, for me at least, the multi-billion-dollar mobile phone industry has failed. Its products are all pretty much indistinguishable, but more importantly they don't do what I want from a smartphone.

   My perfect smartphone must have these features:
  • Nuke-proof hardware. Tough enough to survive my pocket, clever enough to conjure a signal out of almost nothing.
  • A useful and popular operating system. And a realistic chance of OS upgrades over its life, if I buy a phone from you that gets Osborned you simply will not get another chance. I'm looking at you, Motorola, I haven't forgotten my DEXT with its official support for Android 1.5 only.
  • No stupid manufacturer front ends or resource-consuming bloatware.
  • A QWERTY keyboard. Hey phone companies, I've got NEWS for you! We don't all have tiny fingers, and sometimes we use our phones in environments where touch screen keyboards are quite frankly shit. No, let me qualify that. Touch screen keyboards are ALWAYS shit.
  • A kick-arse camera. No, simply having a gazillion megapixels is not enough. It has to be a decent quality camera module in the first place. Nokia cracked this one a decade ago, wake up at the back there!
  • Enough screen area and resolution to be useful for browsing, enough processor power to keep up.
  • Decent hardware expandability. 3.5mm audio, Micro SD, USB, no weird and expensive proprietary connectors. 
That's it. I don't give a toss about device thickness, dot pitch, chipset willy-waving, Angry Birds, tinny built-in speakers, cheap music deals or all the other crap. If I could buy a bullet-proof QWERTY Android smartphone with a camera like the Nokia Pureview 808, I'd pay full price. Five hundred quid, there and then. Until then, you can keep your shiny black slabs, and I'll keep my money.

Saturday 8 September 2012

Baby killer

    My car was involved in a collision with a teenaged cyclist this morning. As far as I am aware she's shaken but OK, with little worse than a nasty graze to show for the incident. Nothing I'm particularly proud of but fortunately in the view of the police officer who interviewed me it was a fairly unavoidable accident caused by another motorist making a sudden risky manoeuvre hiding me and the cyclist from each other's view. I'm a cyclist, pedestrian and motorcyclist as well as a motorist, and turning it over in my head I can't imagine another outcome. The cyclist wasn't doing anything bad crossing the road in the context of what she could see and I was using a road I've used thousands of times in the last twenty years. My reaction times, good brakes and in the view of the police officer non-excessive speed meant she lives to ride another day.
    What did shock me, though, were the actions of other motorists. I'm a man driving a small hatchback. It's a 5-door family model with a small economy engine chosen for diesel MPG rather than BHP, so it's no sports car, but to them I was obviously a reckless young baby killer in a hot hatch. So they proceeded to paint a picture of the incident so ludicrous in its level of malicious falsehood that the policeman said in as many words that he was far more interested in the facts of what had really happened.
    It started when I got out of the car. I suddenly had this crazy woman from another car haranguing me. Sorry love, I've just been involved in an accident, I don't need a silly bitch screaming at me. In fact the girl who's just limping to the side of the road doesn't need it either. Shouty woman was lucky, I'm sure someone other than me might have engaged with her as aggressively as she did and she wouldn't have liked that.
    Meanwhile I went over and made sure the girl was OK. Her mother was there and turned out to be a lovely lady who had seen the whole thing and said in effect "Don't worry love, I saw what happened and you weren't to blame".  Thank you very much for that, I can't express how much that meant to me.
    I could see several other motorists who had stopped, discussing it amongst themselves. This was where it started to become scary. I could hear them going over what had happened, sharing tidbits and embellishing their stories. By the time the police arrived I heard them saying the most outrageous things in their statements, turning an everyday Oxford manoeuvre into something from a particularly boisterous touring car race. I had, it seemed, swerved all over the road at an impossible speed, narrowly avoiding killing them all before mowing down a helpless child. As I said to the policeman when he came to me, I considered those things to be barefaced malicious falsehoods that I would vigorously contest, and I was able easily and quietly to demonstrate both my lane discipline and, with my relatively short stopping distance, evidence of my lack of excessive speed. I consider the fact that the policeman informed me that he would not be recommending any further action as vindication of my actions, and if I hear any more credence being given to the lies I will vigorously defend myself.
    But I can't help being worried at how close I came to getting into trouble based on someone else's malicious falsehood. In effect, those people made up some lies with no consideration of the effect they might have had on their target. As I told the policeman, I am completely certain they wouldn't like someone doing it to them.
    I am well spoken, approaching middle age, and the driver of a spectacularly unexciting car. By telling the truth I was able to foil the lies and describe what had happened to the satisfaction of the policeman who interviewed me. But what if I had been driving a performance car? What if I had been a so-called "chav", a non-English speaker, or perhaps from an ethnic minority? Would I have had the same experience? I hope the answer would be "yes", but I can't help thinking my path would not have been so pleasant this morning. I also can't help thinking that people who are so ready to lie to put someone in my position in a bad light are helping neither justice nor themselves, and that there should be some form of censure for people who do that. Because without it, we're all at risk of being accused of the most outrageous things. Me, you, those lying motorists, everybody. Do you feel comfortable with that? I certainly don't.

Sunday 26 August 2012

Living in a post-PC world

    I've spent the weekend having a clear out. Lots of old tax papers, magazines and assorted detritus, all gone. And a load of treasures from a couple of decades of hoarding PC bits.

    Some things are easy to part with. An ISA multi-IO card, for instance, is an easy throw. I'm never going to need one of them again in my life. Or an 8-bit cheap-and-nasty Soundblaster clone from about 1990. I've never even used it since levering it out of the XT clone it came from, space wasted.

    But then I came to the pile of cables. IDE cables: do I really need ten of them? Floppy drives. CD-ROM drives. Even old hard drives of a gigabyte or two's capacity. These were real treasures a few years ago, but now that I can buy a flash card with tens of gigabytes for a few quid, they simply aren't necessary.

    I realised as I was clearing out my stock of PC bits that what I was seeing was the end of an era. For the last couple of decades my computers have been continuously upgraded, but they've all been desktop PCs. I still have one, an AMD Duron running Lubuntu, but my main PC is now a seven-year-old laptop and I'm increasingly finding my development and everyday computing happening on ARM devices: the Raspberry Pi, and Android phones.

    For me, the PC era seems to be drawing to a close. I can see that the next generation of ARM tablets - either Android or Windows 8, I haven't decided yet - will eventually replace the laptop for portability, and the next generation of Raspberry Pi-style flash-based single board computers will replace it for development and power. My storage has already migrated from the PC - either into NAS or the cloud - so the PC with all its inbuilt peripherals and power consumption is now an increasingly redundant web browsing platform. I'm entering my personal version of the post-PC world.

    So, does anyone want a stack of fully-populated Pentium motherboards or enough 72-pin SIMMs to pave a driveway?

Wednesday 1 August 2012

Oxford Raspberry Jam meetup 3

    So last night a bunch of us made our way down to Electrocomponents HQ again for Oxford Raspberry Jam meetup number 3. As before, a fairly informal show-n-tell format with plenty of scope for discussion, and a lively exchange of ideas.
The serial terminal in action
    It was noticeable that people are starting to get to grips with the Pi's hardware. Our previous meetings had featured only limited hardware demonstrations, but this time people had brought complete projects to show us. A serial display from a POS terminal, a Pi used as a network client for an audio industry control standard, and a very neat little serial terminal using an Atmel processor and destined to become a commercial product that can fit in the top of a Pi case.
    It was also encouraging to see a discussion of the Pi's application in education. How to capture the excitement of a huge bunch of kids when you only have a lunch hour to do it in. I realised at that point that I must have been an unusually geeky teen, having saved up my 30 quid for a second hand ZX81 I needed no such encouragement to get stuck in.
RiscOS blowing raspberries.
    On the software front we had a demonstration of the latest RiscOS build. Looking very slick, but with the intriguing promise of more to come as GPU support is included. I was a willing convert to RiscOS on my Pi because of its speed and ease of use, so I am especially looking forward to what the Pi and RiscOS community can achieve in porting more up-to-date software to the platform.
    I brought along two demonstrations to the meeting. The first was a shameless use of the Pi as an appliance, a DarkELEC image. This is an OpenELEC fork that includes all the clients for UK TV-on-demand services such as BBC iPlayer and 4OD. It's an impressive distribution in that it delivers very good performance from the Pi, and a stunning picture on the Electrocomponents meeting room TV. I must have made copies of the SD card for most people in the room.
    DarkELEC might seem a frivolous use of the Pi to some, but I think such appliance distributions are important. They mean more Pis will be used rather than lie forgotten on experimenters' desks, and they provide a handle to gain the interest of young people in their Pi, making it more than just a geeky toy.
Yes, that's Internet Explorer 3. Best viewed in...
    My other demonstration was at the same time a joke and a serious demonstration of the Pi's capability. I ran Windows 95 in a Bochs virtual x86 machine over Debian on my Pi. And it was just about usable, despite no effort having gone into tuning the Bochs setup. I can think of no practical application for Windows 95 on a Pi, but it is not impossible that someone might have to run a piece of legacy DOS software somewhere and would find Bochs a useful means to do it.
    Anyway, a few tech details. Bochs is in the Debian repository, so a simple apt-get installed it. I installed Windows 95 from the CD that came with a laptop in the '90s to a 100Mb Bochs hard disk image on my desktop PC, and transferred it to the Pi on a USB disk; the Pi has no CD-ROM drive and I didn't fancy trying to extract the ISO file to do the task on the board itself. I used the X-windows Bochs display library, so the Debian desktop was always present in the background of the Windows 95 session. A much faster result could probably have been achieved had I compiled the SVGAlib package and run it without X, but this was more a demonstration for laughs than for practicality. As I said, "You've seen an open OS on your Pi, now here's a wide-open one!".
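For anyone wanting to try something similar, a minimal bochsrc along these lines should be close to what I used; the path, memory size and disk geometry here are illustrative guesses for a roughly 100Mb image rather than my exact file:

```
megs: 32
ata0-master: type=disk, path="win95.img", mode=flat, cylinders=203, heads=16, spt=63
boot: disk
display_library: x
```

Point the path at your own disk image and adjust the geometry to match its size.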
    So that was it. Another Oxford Raspberry Pi meeting. I look forward to seeing you at the next one.

Monday 2 July 2012

What was that about "Don't be evil"?

    I'm sure most readers of this blog will be familiar with the famous Google motto "Don't be evil". Having had the chance to look at Google culture from a viewpoint slightly closer than the average Joe's, it comes across as something taken pretty seriously within Google. When they say that, they really mean it.
    "Being evil" is generally taken as a reference to some of the shady practices found elsewhere in the tech industry. Really good examples of "Being evil" can be found in the history of Opera Software, as the underdog in the browser race they faced over a decade of unfair practices from the developer of the dominant browser.
    But things are different now, aren't they? MSIE is no longer the top dog, and there's a new kid in town. Chrome, from Google, and they have that "Don't be evil" motto, don't they?

    I use all the main browsers; I'm a web developer. I use Opera quite a lot, it's a damn good browser, just like Chrome or Firefox. I'm also a Google user; you might say I've drunk the Google Kool-Aid. So seeing screens like these three from flagship Google services running in the latest version of Opera (12.00) distresses me.


    I develop interactive web sites that work with all major browsers: Chrome, MSIE, Firefox, Safari and Opera. It's standard web developer stuff, not difficult at all. The web is full of HOWTOs, compatibility libraries and guides for the novice developer, so I'd expect the kind of experienced developers Google hires to have no problems writing code that works cross-browser. It's hardly as though Opera makes it difficult anyway; it's one of the most standards-compliant browsers on the market.
    So what I'm seeing is a major browser developer not making the effort to support a smaller competitor's product in their web services when I know that supporting that product is straightforward for a competent web developer. And then using the lack of support to display a message pushing users of the smaller competitor's product to their own offering.


    "Don't be evil" is Google's motto. It pains me slightly to say this, but as an Opera user I don't think they're living by it here. Come on Google, step up to the plate!

Wednesday 20 June 2012

MSIE overtaken

    Back in April I wrote a piece about the rise of Google Chrome and the pending loss to MSIE of the number one browser slot. Based on StatCounter GlobalStats data I predicted that this would happen in June.

    I was wrong. It happened in May. Finally we're in an era in which supporting outdated, insecure, and non-standards-compliant browsers is no longer considered a priority.

    It would be tempting to slam MSIE. Hell, the product deserves it! But I hope losing the top spot reveals Microsoft at their best. When they're top dog they don't behave well, but when they're the underdog, they innovate. It would be great to see a future version of MSIE that really gave Chrome and Firefox a run for their money, a super-fast, secure, up-to-date, non-proprietary, and standards-compliant browser.

    Well, I can hope, can't I?

Tuesday 19 June 2012

Oxon-RaspberryPi meetup 2

   Here's a quick report on my return from the second meeting of the Oxon-RaspberryPi group.

    So, about twenty people gathered in the meeting room at Electrocomponents on Oxford Business Park. Electrocomponents are the parent company of RS, one of the companies selling the Raspberry Pi, so a massive thank you to them for allowing us to use their space, and for providing us with some very interesting insights into some of the details behind their Pi offering.
    It was interesting to see the diversity of attendees, of their interests and level of expertise. We were treated to demonstrations of XBMC media centres running on a Pi, of a wireless serial link between a Pi and an Arduino, of simple GPIO interfacing, and of my favourite of the evening - a Pi running RiscOS. My demo of a Pi running the Natural Language Toolkit seemed paltry by comparison.
A Pi running RiscOS
    Most importantly we discussed the idea of a group project, something diverse enough to allow all group members to contribute yet unique enough to allow us to contribute something to the Pi community. Several good ideas have been proposed, no doubt a front-runner will have emerged by the time we next meet.
    So all in all a positive experience. One of my fellow attendees remarked to me as we left that it was likely some of the major tech businesses of the coming decade would receive their start from the Raspberry Pi. I thought of the Silicon Valley garages of a few decades ago and couldn't help but agree. Where did we lose our way, back in the 1990s?

Thursday 24 May 2012

Life with Pi

    My Raspberry Pi single board computer arrived just over a week ago, after a long wait for my ordered unit to be manufactured and shipped. I wrote back in early March about my thoughts on the product launch, and I outlined my plans for the device a couple of weeks ago. Now that I've had it in my possession for a week, here are my thoughts about the board itself.
    On first unboxing the unit my reaction echoed that of other reviewers: this is a small device. Of course we all knew it would be credit card sized, but having the board in front of you really brings that home. We've become so conditioned to computers requiring significant space for peripherals and heat management that one without that need is something of a shock.
    Gathering together the required peripherals was an easy task. I already had network and audio cables, as well as a micro-USB phone charger and my venerable Logitech wireless keyboard and mouse combo. I had to buy an HDMI to DVI cable because my TV is in reality an Acer monitor hooked up to a PVR and doesn't have HDMI. Thankfully the days of crazy pricing for DVI or HDMI cables are behind us.
    Similarly, downloading the Debian Squeeze reference distribution from the Raspberry Pi website and installing it on an SD card was very straightforward. Some SD cards have been reported as having problems with the Pi; I can confirm that my Lexar 8Gb SDHC card has no such issues.
    So with all peripherals and software in place and connected, I turned on the telly and plugged in the phone charger. Without fuss, the LEDs on the Pi lit up, and the Linux boot screen appeared on the TV. Success!
    Success, that is, until it hung during the boot process. But a very helpful message explained that this can happen at first boot and simply rebooting the device would fix matters. A further reboot and login process, and I had a bash prompt. My £25 Linux PC was a reality.
    Given a working computer with a Linux command line, the world is your oyster. I am reviewing the hardware for this piece rather than the software, because I feel every Pi owner will have their own plans for the device and simply describing a Linux distribution would be of little interest. So the software is described in this piece only in scant detail, and only to give an idea of the speed of a Raspberry Pi compared to a more familiar computer.
    So I typed startx at my prompt, and was rewarded with the LXDE desktop. As a simple first task, I loaded the BBC Weather site and then GMail in the bundled Midori browser. Hardly heavy stuff, but it gives a good idea of the speed involved.
    I have heard the Pi described as having performance similar to a Pentium II with a very fast graphics adaptor. By coincidence one of my desktop PCs is a Pentium II 266 running Lubuntu, so given that the Pi's Debian does not yet have driver support for graphics acceleration, I would say that the performance of both machines is very similar indeed. Browsing is a usable experience, but a slow one. A typical web page will take over ten seconds to render, but in-page Javascript features are usable in real time once the page has loaded. Services like GMail, for instance, feel a bit slow, but not so slow as to be impossible to use. Having used the Pentium II as my main development platform for several years before moving to my laptop, I do not expect a Pi with similar performance to struggle with the kind of scripts I am likely to use it for. On that note, I installed the Python Natural Language Toolkit package, which will be a significant plank of the project I plan to use my device for.
    If I have any software gripes, they are minor and will, I am sure, be fixed in future distributions. The image is for a 2GB SD card, and though instructions are readily available online for extending the partition onto a larger card, they are not for the faint-hearted and would benefit from being made easier as part of the Pi distribution. GParted is bundled and might perform this task, but trying to use it revealed a further gripe: the administrator password doesn't seem to be available. Of course I could use sudo from the command line to achieve the same end by other means, but to someone with little knowledge of Linux this would be a show-stopper.
    I would like my Pi to run full-time next to my router, processing keyword data. To this end my interest in the Pi centres on its low power consumption and heat dissipation; I do not want a traditional PC with all its heat issues running full-time in my home. I put the Pi to the test on this front by constantly reloading the BBC website in Midori for several minutes, causing the CPU graph to show as maxed out. If the Pi had a traditional Intel processor then merely running LXDE would cause it to be too hot to touch; as it was, I could put my finger on the Pi's memory/processor combo and feel that it was merely discernibly warmer than its surroundings.
    The Pi's small size means that there is plenty of space behind my telly for it next to the router. But with a bulky HDMI cable plugged into it there was something of a feeling of the tail wagging the dog, as the weight of the cable threatened to pull the board with it. The board layout, with cables on all sides, also made positioning it slightly difficult. But as the Raspberry Pi team has explained, that is an acceptable compromise for reasons of size and cost. My solution to the mechanical issues of a small board held only by a selection of cables was to make a simple case from a surplus business card box, a speedy procedure involving a set of nail scissors. This case does not seem to result in the Pi becoming too warm, and will probably end up being fixed to the router using sticky Velcro pads.
    In conclusion, after a chance to play with my Pi I am still extremely impressed by it. Sure, the software distribution has a few very minor gripes, but this is very much still a product in development by a volunteer organisation. Some care is needed in mounting or enclosing a Pi to protect it from damage, but that is no more than would be expected for any printed circuit board. Otherwise it is an extremely powerful computer for its exceptionally low price and with a low power consumption, so I look forward to it spawning the same diversity of creative computing that its 8-bit forebears did. If you haven't ordered one, do so now!

Monday 14 May 2012

What I'm going to do with my Raspberry Pi

    That magic email from Farnell came on Saturday, my Raspberry Pi is in the post!
    So, what am I going to do with it?
    Like loads of other geeks I expect I'll plug it into my telly, connect it to my router and use it as a web terminal and media centre with geek bragging rights. There it'll sit for however many years it takes until I get a new telly or a Raspberry Pi 2, unseen and uncomplaining. But rather wasted, don't you think?

    Somewhere, the Flying Spaghetti Monster has just killed a kitten.

    So what do I really want it to do? I will have a very small and moderately powerful computer - insanely powerful by the standards of a few years ago - that uses negligible electrical power and can be left on all the time. I'm still going to plug it into my router and telly, but to make it earn its keep I'm going to have it run my keyword analysis tool.
    Events have moved on a little since my blog post describing the tool, but the principle is still the same. I take new posts every day from a big list of RSS feeds and process them for keyword phrases which I store in a database. I can then extract frequencies and collocates over time, which gives me a picture of the interrelationship between the language and terms in the news over any given period. It not only fulfils my original aim of having a tool that would generate keywords and phrases for previously unseen search terms, but also allows any newsworthy subject to be examined in a way that is not possible by any other means.
    The original tool runs in PHP on my Windows laptop. Its MySQL database is pushed well beyond its limit, and I have been working on a version that uses a large directory tree of precomputed JSON files instead. It's an approach I've since also used in my work, relying on the principle that disk space is cheap and quick while complex joins on monster MySQL databases are expensive and slow.
    I could of course compile PHP for my Pi. It's probably already available precompiled anyway. But the Pi's a Python platform (Try saying that after five pints of real cider!) and that offers me a unique opportunity. My PHP code does the job, but it relies on my own language processing libraries which I built myself as a search engine specialist. I'm not a computational linguist so I'd be the first to say that they aren't as good as they could be.
    Python has the incredibly useful Natural Language Toolkit libraries which allow me to do so much more with my source texts, and so much more quickly than my PHP code. So my first effort with my Pi will be to port my keyword tool to Python, using the NLTK instead of my own library. The Pi will still sit behind my telly and be used for the occasional bit of web surfing, but for the rest of the time it'll be crunching keywords and giving me lots of lovely language data that I can work with in real time rather than with enough time to make a cup of tea every time I make a MySQL query.
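    The port itself is still to come, but the core of the pipeline can be sketched in a few lines. This is a toy stand-in using only the standard library (the real version would lean on the NLTK's tokenisers and collocation finders); the function name, the forward-window approach and the crude punctuation stripping are my own illustration, not the tool's actual code:

```python
from collections import Counter

def keyword_stats(text, window=2):
    """Count word frequencies and collocate pairs in a text.

    A toy stand-in for an NLTK-based pipeline: frequencies come from
    a simple Counter, and collocates are counted as co-occurrences
    within a fixed forward window of words.
    """
    # Crude tokenisation: lowercase and strip surrounding punctuation.
    words = [w.lower().strip(".,;:!?\"'") for w in text.split()]
    words = [w for w in words if w]
    freqs = Counter(words)
    collocates = Counter()
    for i, w in enumerate(words):
        # Pair each word with the next `window` words following it.
        for other in words[i + 1 : i + 1 + window]:
            collocates[(w, other)] += 1
    return freqs, collocates

freqs, colls = keyword_stats("the cat sat on the mat, the cat slept")
```

    Run daily over each new batch of feed items, with the counts stored per day, tallies like these are all that is needed to chart frequencies and collocates over time.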
    In a way I'm not taking advantage of everything the Pi can do. Almost any internet-connected computer could do this job; I'm only using a Pi because it's cheap and low-power, and I've lusted after one ever since I read their early press releases. Other people will use the Pi's hardware capabilities to do much more eye-catching things. But my Pi, quietly crunching words all day and night behind my telly, will still be earning its keep. It will allow me to learn new things, and since its data is likely to end up in some of my work it may even, in its own small way, make a contribution to the wider understanding of language.
    So that's what I'll be doing with my Pi, what'll you be doing with yours?

Sunday 6 May 2012

How We Deal with the Homeless Problem in Oxfordshire

    If you were given a heap of money and told you had to spend it to deal with some homeless people living in Oxford, how might you proceed? Go on, you can spend it however you'd like.
    Perhaps you'd invest it in some kind of housing, maybe a subsidised rent scheme or something. I'm not a housing expert, but you'd kinda expect the solution to have something to do with housing, wouldn't you?
    In the past few weeks I've seen the authorities spend a considerable amount of money and effort on some homeless people here in Oxford. What they did with their money, and how incredibly useless and damaging the result of it all has been, is such a disaster that I feel it has to be disseminated more widely.
    If you wander round Oxford's more overgrown corners, you'll encounter a homeless community that lives full-time outdoors, somewhat unseen, in tents and rudimentary shelters. Being a natural explorer and chasing random wildlife I've encountered people living in this way all around the edges of our city, in City of Oxford, West Oxfordshire and Cherwell council districts. These aren't the people you'll see sleeping rough in doorways, and they aren't the people who visit the homeless shelters. They are spread about such that they aren't more noticeable in any one place (with the notable exceptions of the floating community on the Thames backwaters and that encampment a few years ago where Osney Mead meets the railway), so we don't talk of favelas, barrios or shanty towns. I have no idea how many of them there are, but since I keep encountering them there must be quite a few people living in this way.
    A few months ago I got to know a pair of tent-dwelling homeless people. They are friends of a friend of mine, and their tent was pitched in some bushes on land owned by an Oxford college.
    I had better make this very clear, they did not choose to live in a tent. They would do anything to find accommodation, but time and time again the system has failed them. They do not seem to fit any of the categories required to advance their case for housing, so they are among the city's long-term homeless.
    I'd also better make something else very clear indeed: They aren't dodgy. The police all know them, and despite their situation, they are not "known to the police". I've seen their interactions with police officers, and there was none of the attitude you get from police when they think they are dealing with a criminal. Neither of them has a criminal record and neither of them has ever been arrested for anything. They were at pains to minimise their impact on the field because they knew to do otherwise would only invite trouble.
    Earlier this year, eviction proceedings were started against them. The owners of the field wanted them out. Which is fair enough, it's their field. So lawyers and courts and enforcement officers and God knows what other legal machinery was brought to bear against them. Without a fuss, at the last possible moment, they moved out. No sense in falling foul of the law when you are as vulnerable as they are.
    What does a homeless person kicked out of one pitch do? They look for another one. Our city has no shortage of forgotten corners, so their tent was pitched once more somewhere else before too long.
    Unfortunately, though, we then had the wettest April for many years. Their new pitch was flooded, ruining their tent and all their belongings. Having been camping on someone else's land with semi-decent shelter, they were now camping on someone else's land with very inadequate shelter.
    So to summarise the last few paragraphs: a lot of money was spent by both the landowner and the Authorities to evict from a field a pair who really didn't want to be there anyway, and because of that eviction they're now in a much worse state, yet still camping on someone else's land. So nothing much has changed overall, except a lot of money has gone to a load of lawyers, and the two homeless people in question have had their lives made a whole load worse. No doubt in a few months a fresh set of lawyers will be expending billable time to ensure they are moved on again.
    Homeless people like these two do not want to cause problems. They simply want somewhere to live, to get out of the cold and damp. If they represent any kind of problem, and any cash is going to be chucked around, it does not take any kind of genius to conclude that the problem would go away if the cash were spent on housing them rather than on legal manoeuvres which, as I hope I've demonstrated above, are pretty pointless. That this doesn't happen does not reflect well upon those authorities responsible for the provision of services to homeless people, and is a sad indictment of our society.
    All I can do is write about it here, I'm not some philanthropist or power-broker so I can't put anything right. So my tech blog has been subsumed by a bit of minor political ranting for a while. I think that the above represents such a cock-up that people of most political persuasions should find it as annoying as I do, if you agree with me please share it. Maybe then something positive can happen.

Tuesday 1 May 2012

A simple question for SEO practitioners everywhere

    My friends ask me for advice when they are looking at search engine marketing services for their websites. They've usually got some pushy SEO salesman giving them the hard-sell. "We can put you on the top of Google searches!".
    I tell them to ask this simple question of SEO sales people who make that particular promise.

    "What's your name?".

    If the answer ain't "Larry Page" or "Sergey Brin", my exceedingly distant ex-employers from my Google Rater days, they're lying.

Monday 23 April 2012

Preparing for MSIE Overtaking Day

    It's been a trope of the web developer's existence for the last decade: Microsoft Internet Explorer won the browser wars in the 1990s, all other browsers are irrelevant. The customer has this firmly lodged in their head from the days when MSIE had over 90% of the market, and demands support for IE in all its forms over support for any other browser. We may just about have won the war over abandoning IE6 support, but we are still required to code in all the workarounds demanded by its just-as-creaky younger siblings.
    In the mid-2000s it might have been fair to adapt the famous phrase about IBM to "nobody ever got fired for supporting MSIE". But as a quick look at StatCounter's Global Stats will tell you, over the last couple of years Google Chrome has come from nowhere and is rapidly converging on MSIE's market share. Extrapolate the graph forward a few months, and it becomes obvious that sometime this summer, probably in June, Chrome will overtake MSIE as the world's most popular browser.

    That's right, June 2012 will see MSIE Overtaking Day.

    You might think it would be unwise to break out the champagne though. After all, the top spot is just passing from one big company to another, won't it just be a case of "Here's the new boss, same as the old boss"? In that we're fortunate: unlike MSIE with its proprietary approach to rendering HTML, Chrome is a Webkit browser, underpinned by open source and web standards. So if we code to those standards we can expect it to work without too many tweaks on all browsers that support them.

    No more browser-specific stylesheets, no more special Javascript hacks, no more compatibility libraries.

    But the title of this piece is "Preparing for MSIE Overtaking Day". We're already there, as developers we're used to coding web standards, all our sites already work in Chrome, Firefox, Safari and Opera. It is the non-technical people who need preparing, all those marketing people at the customer, the legal people and the developer project managers who are stuck in that 2000-era trope. I was shocked not too long ago to encounter a site whose contract specified individual browser versions. Not even "version X and above", so when a legitimate bug was reported in a recent browser the response came back that it wasn't supported because the browser wasn't several years old. That kind of thinking is simply not acceptable.
    So we have to think away from the browser in the post-MSIE world of frequently released standards-compliant browsers. We have to sell web standards such as HTML5, rather than support for particular browsers, to the non-technical people we encounter as web developers, and we have to hammer home that message using the clearly visible statistics.
    Otherwise we'll still be coding for MSIE7 in 2017, just like some of us had to support MSIE6 in 2010. And that just ain't funny, not at all.

Thursday 5 April 2012

Accessibility: it's an engineering problem

    A little over a week ago, the British paralympian athlete Tanni Grey-Thompson gave an interview in which she decried the state of accessibility for people with disabilities in the UK, and described the experience of having to crawl off a train because the rail employees who had been booked to provide her with a ramp had failed to materialise.
    It has been interesting to watch the commentary unfold over the intervening days. On one hand we have seen many other stories from people with disabilities of being denied the use of services, of being stranded or of being forced to great inconvenience to gain access to things which should be as easy for them as for anyone else. Meanwhile there has been a chorus of discontent from the kind of people who read the Daily Mail, who see the disabled as a fantastically privileged minority who have vast amounts of hard-working able-bodied people's money squandered on them and should accept their lot and stop whinging. To paraphrase one of their battle cries: "If being disabled is so good, why don't you go and live the dream!".
    It's slightly uncomfortable to realise that while the experiences of the people with disabilities are unfortunately too real, there is also a germ of truth in the root of the bitterness from Mail readers. As a country, we have spent a vast amount of money over the last couple of decades on improving accessibility, so where have we failed?
    An exercise I would counsel anyone able-bodied to try is to accompany a wheelchair user across a British city. A simple walk becomes a lengthy traverse; it is sobering to realise how many wheelchair obstacles you pass on foot without realising it.
    Meanwhile the world is festooned with dubious paraphernalia intended to improve accessibility. Cash machines have been moved closer to the ground without a thought being given as to whether they need redesigning for use by someone with limited mobility or reach. Small business premises sport unused ramps, electric lifts, and disabled toilets used only as dusty storage rooms.
    One is left with the feeling that a vast exercise in box-ticking has been completed. Everywhere has 'done' accessibility because they have the ramps, signs, and lifts to prove it, yet real-world accessibility for people with disabilities remains as elusive as ever.
    I look at this and see not an accessibility problem but an engineering problem. Of course Tanni Grey-Thompson was let down by the rail company, but was the real failure not in the equipment she was provided with? It seems inconceivable that in an age in which we can create robots that travel to Mars and operate autonomously for years at a time, we cannot design a lightweight personal mobility aid capable of letting a wheelchair user traverse the kind of step you might encounter between a train and a platform, or anywhere else. The Mars robots definitely are rocket science; the mobility aid definitely isn't! Or it shouldn't be, anyway.
    Of course, we engineers have a major failing: we see everything as an engineering problem, and we live in a black-and-white world. Engineering can't always fix social or political problems. But engineering is nothing if it isn't the art of making machines to solve physical problems in the real world, and rather evidently the engineering available to people with disabilities isn't fit for purpose. By my observation the basic design of a wheelchair hasn't changed in many decades; is this really a technology that's reached its zenith?
    Perhaps if our Government had spent less time and money patting itself on the back for a successful but ultimately useless box-ticking exercise and instead invested in research and NHS funding for mobility aids that allowed people with disabilities to render some of their accessibility problems irrelevant, there would be no need for any of this. People with disabilities would be able to go where they wanted and get on with their lives just like anyone else.
   But why on earth would any politician want to do that? It wouldn't win any votes from Mail readers, would it!

Thursday 8 March 2012

How the Pi could have been (some of) ours

  (edit: My Pi arrived on 2012-05-14. If you would like to read about my plans for it you can do so here, and my review of it can be found here.)

    So, the Pi will not be ours. At least until April, according to my email from Farnell.
    The Pi? The Raspberry Pi, that is, powerful yet inexpensive single board computer and object of desire. Released to a storm of interest that created a Slashdot-like denial of service to the websites of its two suppliers, its launch left a lot of hopeful would-be buyers disappointed and venting their anger online as the first production run sold out in seconds.
    Now the dust has settled, time for a look at the launch from a customer's perspective.
    I don't think many people involved will disagree with me when I say that the launch of the Raspberry Pi could have gone better. That's not having a go at the Raspberry Pi team, it's simply stating the obvious given hindsight. The Raspberry Pi foundation are a small charitable endeavour and what they have done is amazing, creating their product from nothing and with minimal resources. They are not a huge multinational company with a sales and marketing operation to match so it is unfair to expect them to be able to emulate one. The fact that we'll be able to buy our Pi at all is an incredible achievement, even if we all have to wait a couple of months.
    But it's worth examining the launch from a customer perspective, to quantify what seemed to fail and arrive at some possible solutions. This isn't a "How I would have launched the Raspberry Pi differently from those losers!" piece but a "Gosh, how can I learn from that and what would I do if that happened to my next product launch?" piece.
    So, in the words of an F1 commentator of yore interviewing Johnny Herbert: what went wrong? Here are the answers to that question from my perspective:
  • The launch was massively oversubscribed. A hundred thousand geeks were chasing ten thousand boards. Most of these potential customers were always going to be disappointed.
  • The launch was at a very odd time of day. A hundred thousand geeks had to get out of bed for 6am. Thus not only were the customers disappointed, they were tired and disappointed.
  • The two suppliers - Farnell and RS - completely dropped the ball. In the age of turnkey cloud computing if you know a hundred thousand people are going to come to you all at once for a single page it is not beyond an organisation of their size to direct them to a web presence that can handle that level of traffic. They failed massively, and they will have paid dearly for that failure in lost business while their sites were out of action.
  • The email notification failed. The Raspberry Pi people apologised for this; it seemed their email server wasn't up to the volume required. It's a little unfair to put this here, because Twitter and the Raspberry Pi blog seemed to do just as good a job, but it was part of the picture that was missing.
     It's easy to get irate about this catalogue of unfulfilled expectations. It is however worth reminding any reader tempted to cry universal failure that the Raspberry Pi people succeeded in doing exactly what they set out to do, which was to launch their product and sell their first batch of boards. They were very clear about the size of that first batch beforehand, and they were very clear about why they only had that number.
     So, they succeeded in their primary aim, but received an online slamming from disappointed would-be customers. Where did they not succeed, and how might other product launch teams learn from their launch?
  • They didn't manage the expectations of their customers. Sure, we all knew it would be busy, but everyone went in thinking that they might have a chance of snagging a Pi. Perhaps a lottery the week before launch, to allocate the right to purchase the boards that would be available, might have contained those expectations.
  • The time picked for the launch was in my view unwise. Was it to synchronise with US time zones perhaps, or was it at the behest of RS and Farnell? Either way, it had the effect of intensifying the disappointment of the customers whose expectations had been dashed: not only had they failed to score a Pi, but they'd had no reward for getting out of bed early. Yes, that sounds petty, but customers are fickle, and if you've got them out of bed at 6am, sending them away with nothing is the surest way to turn a credit-card purchase into a whiny social media post.
  • The email was not farmed out to a server capable of handling the volume of traffic. Yet again this is slightly unfair: other media did the same job, and their budget is better spent on making more boards than on commercial email providers. However, when looking at how a commercial product launch could learn from the Pi launch, this is a valid point to consider.
    There is one way in which the episode can be rated as a complete failure though: the server outages from both RS and Farnell. I'm sure the Raspberry Pi team did their best to communicate the likely traffic levels to these two suppliers, so I can only assume that they did not listen. Or perhaps they did not believe that a small organisation with ten thousand boards to sell could generate that level of interest. Either way the server outages demonstrated the deficiencies of their infrastructure only too well. A right-to-buy lottery would have mitigated the traffic surge, but even without that the suppliers could have set up cloud-hosted Raspberry Pi sales microsites able to handle the traffic. I'm guessing that the general sales lost to the website outages will have focused their minds, and that next time will be different.
    The Raspberry Pi will be an astounding success. Deservedly so, it is an amazing product. And a few whiny geeks on its first day won't change that in the slightest. I guess this piece is looking at the Pi as a case study for more mundane product launches, ones that don't benefit from the goodwill or groundbreaking nature of the Pi. In that light, the twin lessons of managing customer expectations and ensuring the readiness of external suppliers have to be learnt and implemented. Without them, a product lacking the Pi's star qualities risks sinking without trace.

Note: Comments are moderated for this piece. Be civil if you do comment, disjointed fanboy rants will be derided. I'm a Raspberry Pi fanboy who's just as excited about the project as you are.

Wednesday 15 February 2012

An AJAX and jQuery driven web feature, the OxfordWords Text Analyser

    If you are a follower of the OxfordWords Blog, you may have seen the launch of the OxfordWords Text Analyser, coinciding with the 200th anniversary of Charles Dickens's birth. Here follows a technical description of the feature, what it does and how it works.
    The challenge was to show the logophile visitors to OxfordWords some of the computational linguistic techniques used in the preparation of the Oxford English Corpus, and thus give some insight into the preparation of a modern dictionary.
    Finding ourselves unable to share the corpus itself with the public, we settled on the idea of delivering a much smaller text with analytical techniques applied to it similar to those used by our corpus analysis software. Since we already publish a huge range of classic texts in the Oxford World's Classics range, it made the most sense to use those as our sources and provide collocate and frequency analysis, as well as example sentences, for each text. This presented a data problem: incorporating all this in a single web page would mean adding several megabytes of data to it, resulting in an unsustainable page load time.
    The obvious solution was to create a lightweight page containing just the Javascript display code, and deliver all the data on an as-needed basis via AJAX calls. There was a time when writing this to work reliably on all browsers would have been the bane of a programmer's life, but fortunately we now have the jQuery library to abstract such nasty jobs from the programmer, so taking that route was a no-brainer decision.
    AJAX having been decided upon, the next decision related to the server-side component. I wrote a piece about this last summer, about how experience of database driven back-ends for language analysis had led me to precomputing data as json files rather than querying a database. Disk space is fast and cheap, server processing power isn't. So my next task was to write a set of command-line PHP scripts that generated a tree of JSON files for each word in the source text, containing collocates, frequencies and example sentences.
    While these tasks are essentially simple ones, they are quite computationally intensive. A typical Dickens novel, for instance, contains as many as 20,000 unique words, and every one of those words needs to be searched for across the whole text, with the frequencies of its collocates computed at every location in which it appears. The whole process takes about six hours, and produces a roughly 20MB tree of thousands of tiny JSON files which are then uploaded to the web server.
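    For illustration, here is roughly what those generation scripts do, sketched in Python rather than the PHP actually used. The directory layout, window size and field names here are my own assumptions for the sketch, not the feature's real ones, and the real scripts also store example sentences:

```python
import json
from collections import Counter
from pathlib import Path

def build_json_tree(text, out_dir, window=4, top_n=10):
    """Precompute one JSON file per unique word in `text`.

    Each file records the word's frequency and its most common
    collocates within +/- `window` words, bucketed into
    subdirectories by first letter so no single directory grows
    too large.
    """
    # Crude tokenisation for the sketch: lowercase, strip punctuation.
    words = [w.lower().strip(".,;:!?\"'()") for w in text.split()]
    words = [w for w in words if w]
    freqs = Counter(words)
    for word in freqs:
        collocates = Counter()
        for i, w in enumerate(words):
            if w != word:
                continue
            # Count every word within the window either side.
            collocates.update(words[max(0, i - window):i])
            collocates.update(words[i + 1:i + 1 + window])
        path = Path(out_dir) / word[0] / f"{word}.json"
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(json.dumps({
            "word": word,
            "frequency": freqs[word],
            "collocates": collocates.most_common(top_n),
        }))
```

    With a tree like this on the web server, the browser needs only one small HTTP request per word looked up, which is what makes the AJAX front end feel responsive.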
    So that's the surprisingly low-tech backend for the feature, how about the front end?
    The core functionality of jQuery makes coding an application like this very simple. The three main parts of the feature live in hidden DIVs which are shuffled using the jQuery show() and hide() functions. The JSON data is pulled in using jQuery's getJSON() function. Collocates are shown in a word cloud courtesy of the excellent jQCloud plugin, example sentences are simply loaded into an unordered list, and frequency graphs are created using Google's Image Chart API.
    These plugins and code made a working feature. But to make a single-page jQuery application like this one feel like a proper application, there was one further component required. Users expect to be able to use the back button, and to be able to return to a particular part of the application by URL alone. Miss Havisham's dress needs to be readily conjured up by linking directly to the word "bridal".
     We thus used the jQuery-BBQ plugin to provide URL and history functionality through the use of in-page anchors. Because the page can't be reloaded, the plugin appends the word to the end of the URL after a # symbol as it triggers any changes to what is displayed.
    In summary, the use of precomputed JSON files made the server-side component of this feature very simple at the expense of using more filesystem space, and the use of jQuery and its plugins made client-side development a lot faster and more reliable across browsers. Sometimes libraries like jQuery are used for the sake of it when simple Javascript would have sufficed, but in this case I believe its use made for a far better application.