Twittering the Shipping Forecast
Although the main use of Twitter is for real people to say what they’re up to “right now”, it was quickly apparent to your average hacker that it’s just as useful for any service or device that changes state on a regular basis. Hence Tom and Tom making Tower Bridge Twitter, so that the bridge announces every time it raises and lowers.
When I was talking with Russell Davies a few weeks ago, he mentioned the Shipping Forecast - which immediately struck me as something that would be fun (although not necessarily useful) to get Twittering.
The Shipping Forecast is one of those uniquely British institutions - four times a day, the Met Office produces a forecast for the seas around the British Isles which is then broadcast on BBC Radio 4. I was brought up on the coast of the Irish Sea by Radio 4-listening parents, so it was part of my life from an early age - and because it’s issued in a standard format, it’s almost poetic. Entire generations of Brits have grown up with phrases like “Southwest, backing southeast for a time, 5 to 7, occasionally gale 8” becoming earworms. Hearing or reading it still brings back memories of a Roberts radio first thing in the morning, after Farming Today but before the Today programme.
Getting it to Twitter, though, presents one or two challenges. The first is getting hold of the source data - the Met Office being one of a number of British institutions that are forced by the grasping UK Government to be profit-generating, they want about £600 a year for the privilege of accessing a clean XML feed of the data. And although the forecast is published online by both the BBC and the Met Office, the quality of the HTML leaves a lot to be desired, at least from the point of view of scraping it.
In the case of the BBC, that’s because it’s presented primarily to be human-readable - in the case of the Met Office, it’s because they’re one of many brain-dead British public bodies that are Microsoft monocultures, and know or care nothing about standards and being good online citizens. Nor for that matter would they know good online design if they tripped over it. But that’s a rant for another day.
Back to Twittering. The work is done by a Ruby script that gets kicked off by a cron job four times a day, and uses the marvelous Hpricot gem to grab the appropriate page from the Met Office. Then the extraneous junk HTML is thrown away to leave just the table cell containing the forecast itself, which gets chopped up into a number of array elements by splitting it at the emboldened headings.
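To give a rough idea of that splitting step, here is a minimal sketch. It is not the actual script: it uses only Ruby's standard library rather than Hpricot, and the sample HTML, the `<b>` tags and the `split_forecast` name are all assumptions made up for illustration, not the Met Office's real markup.

```ruby
# Hypothetical sample of the forecast table cell; the real Met Office
# markup is not reproduced here.
SAMPLE_CELL = <<~HTML
  <td><b>Viking</b> Southwest 5 to 7, occasionally gale 8. Rain. Moderate or good.
  <b>Cromarty</b> Northwest 4 or 5. Showers. Good.</td>
HTML

# Split the cell at each emboldened heading, returning [area, forecast] pairs.
def split_forecast(cell_html)
  # Throw away the surrounding table-cell tags
  body = cell_html.gsub(%r{</?td[^>]*>}i, "")
  # With a capture group, String#split keeps the captured heading text
  parts = body.split(%r{<b>(.*?)</b>}i)
  parts.shift # drop whatever preceded the first heading
  parts.each_slice(2).map do |area, text|
    [area.strip, text.to_s.gsub(/\s+/, " ").strip]
  end
end

split_forecast(SAMPLE_CELL).each do |area, text|
  puts "#{area}: #{text}"
end
```

Each pair can then be squeezed down and tweeted on its own.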
At this point the second problem arises - because the Forecast is intended to be read aloud, it’s fairly verbose. Fitting it into 140 characters is something of a problem. To get round this, the individual array elements get the snot parsed out of them, chopping down the character count by searching-and-replacing the content with abbreviations. It doesn’t make for pleasant-looking tweets, although if you’re familiar with the overall syntax and cadences of the verbal Forecast it’s actually surprisingly readable. Once squished down, each element is then tweeted out with the Twitter4R gem.
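The squishing might look something like the sketch below. The abbreviation table, the `squish` name and the hard cut at 140 characters are assumptions for illustration, not the real script's key or behaviour, and it uses one case-insensitive regular expression rather than a string of separate search-and-replace passes.

```ruby
# Hypothetical abbreviation table; the real script's key is not shown here.
ABBREVIATIONS = {
  "southwest"    => "SW",
  "southeast"    => "SE",
  "northwest"    => "NW",
  "northeast"    => "NE",
  "occasionally" => "occ",
  "moderate"     => "mod"
}.freeze

TWEET_LIMIT = 140

# Build one whole-word, case-insensitive pattern from the table,
# trying longer phrases first so they win over shorter ones.
ABBREVIATION_PATTERN = begin
  keys = ABBREVIATIONS.keys.sort_by { |key| -key.length }
  /\b(?:#{Regexp.union(keys).source})\b/i
end

# Replace every known phrase with its abbreviation, then cut to the limit.
def squish(text)
  short = text.gsub(ABBREVIATION_PATTERN) { |match| ABBREVIATIONS[match.downcase] }
  short[0, TWEET_LIMIT]
end

puts squish("Southwest, backing southeast for a time, 5 to 7, occasionally gale 8")
# prints "SW, backing SE for a time, 5 to 7, occ gale 8"
```

Cutting at the limit is the bluntest possible fallback; a longer abbreviation table should mean it rarely kicks in.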
Originally I was going to set up a Twitter account for each forecast area, but that was a pain to set up (there are lots of them) and awkward because area names like “Viking” had already been taken. So in the end I’ve compromised by pushing all the area forecasts out to the one Shippingcast account.
There’s room for improvement - cleaning up the abbreviations with regular expressions would be one, and replacing obscure abbreviations with Unicode symbols another. (There’s a key for the abbreviations I’ve used here.) And it’s clearly not a particularly useful thing to be Twittering in the first place.
However, it’s made me realise just how useful it could be for public bodies to make their data available in structured, machine-readable form - and not to charge ridiculous amounts of money for it. The chances of the Met Office coming up with the idea of Twittering the Shipping Forecast internally are next to nil - so this kind of “innovation” takes place externally. (I’m using innovation loosely here, but there are far more interesting and useful things that can be done with public data - anything that MySociety have done, for example.) Making the information available would be trivial, if only there were the will to do it - and the potential benefits could be huge.
4 Responses to “Twittering the Shipping Forecast”
Love it.
Living in Cromarty, I thought it was a “must-follow”, so I did.
It’s definitely compact to read though.
If you ever did decide to do a single shipping area - I know @Cromarty would be delighted!
If only it was free m8. I’m working on a weather forecast site at the moment and googled shipping “forecast xml” to be presented with the proverbial price, and that’s how I came across your page here. I get a lot of data from the American NOAA site, which is free. But in the UK and most of Europe it’s not free, and in some cases like this one it’s far from free. But hey, in America you have to pay for medical treatment and in the UK you don’t, meaning if you get into trouble at sea in America and need medical assistance it costs you money, but if you get into trouble at sea in the UK and need medical assistance it’s free, while the information about how not to get into trouble costs money and so is not as freely available.
I would love to scrape the content and have been thinking about it but that will get my site shut down.
If anyone finds any free broadcasts on this subject I would love to hear about it.
As for the geeky part one of my info sources has there txt file readily and writable (probably whores to microsoft to lol!!)
Great stuff - thanks for doing this. Slight problem with parsing the textual part at the beginning? Gale warnings in ???? today.
Fantastic - feeding to my PDA, just what I needed!
I discovered your post when researching just this problem but using XQuery.
Having looked at the text format of the forecast before and struggled to parse it, I discovered that the Met page is generated in JavaScript from a generated JavaScript file. Parsing this proved much easier and gives individual area forecasts directly.
My work in progress is at
http://en.wikibooks.org/wiki/XQuery/UK_shipping_forecast
and is only an SMS on-demand service this morning, with scheduled scrapes, caching and Twitter feed to come. As a cruising yachtsman, this might actually be useful to me!
I’d be interested to compare the Ruby code with the XQuery code (when I’ve got up to your level of functionality)
Chris