Taking Responsibility for Data

One of my early aims for the local election website was to list each candidate, along with party and website, directly on the page. It seemed right – if I were promoting ease of access – that this should be a basic requirement.

Pretty quickly it became clear that this was an ambitious task – most of the candidate lists for the 2000+ wards are in PDF format, some in Word and a tiny minority in HTML. The PDFs are not easily machine-readable (aside – I’d love to know how they fare for disability discrimination tests) so any hope of automating the process went out of the window.

I simply didn’t have the time either. It took two days just to get the links together. A conservative estimate of one minute per document would have me working another week just to get the local elections up.
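The arithmetic behind that estimate can be sketched as follows. The figures are assumptions for illustration: one document per ward, exactly 2000 documents, and a seven-hour working day.

```python
# Rough effort estimate for copying candidate details by hand.
# All figures are assumptions for illustration, not measured values.
DOCUMENTS = 2000           # "2000+ wards", taken as one document per ward
MINUTES_PER_DOCUMENT = 1   # the conservative estimate from the text
HOURS_PER_WORKING_DAY = 7  # assumed length of a working day


def working_days(documents, minutes_each, hours_per_day):
    """Return the number of working days needed to process every document."""
    total_hours = documents * minutes_each / 60
    return total_hours / hours_per_day


if __name__ == "__main__":
    days = working_days(DOCUMENTS, MINUTES_PER_DOCUMENT, HOURS_PER_WORKING_DAY)
    print(f"{days:.1f} working days")
```

Under those assumptions the job comes out at just under five working days, before any checking or corrections.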

The bigger problem is liability. While I have rightly disclaimed that there are probably errors, and that users should double-check the council website, trying to present concise summary information based on copy & paste would risk introducing mistakes. Disclaimers can probably help protect me against anybody getting nasty, but I want neither the hassle nor the exposure.

With the links as-is, there is a failsafe of sorts. Each linked PDF has the ward and constituency listed at the top of the page. If the viewer sees the wrong name, they know something is amiss.

If I were to put a candidate in the wrong political party, or list the party in the wrong election, the user would have little reason to doubt the results. There is no failsafe.

Had I had more time I might have tried to put something together that enabled a community effort. There are plenty of people out there who support this kind of stuff and would likely donate some time and effort. This additional manpower – mixed with proper QA – could help reduce the risk.

However, I still think a decent dataset from central government and/or local governments would be the ideal solution. They are likely going to take considerable measures to ensure the data is correct, and the candidates will undoubtedly be referring to their lists for completeness and instantly feeding back corrections.

One of the strong benefits of opening data is that the economics can be changed. Effort need not be duplicated; we largely eliminate human error. Time spent recreating these lists is time wasted.

UK Elections in May

Introducing the Elections 2014 site – find local candidates by simply entering your postcode.

Come May 22 this year, many residents in the UK will be going to the polling stations to elect their representative in the European Parliament. Many local councils will also be electing some or all of their councillors, and some areas will also be choosing their mayor or parish/town councillor.

If you are registered to vote, you should receive a polling card through the post. These tell you where and when to go to cast your vote, and which votes you can cast. The polling cards don’t tell you who the candidates are – these are usually finalised just a few weeks before the election.

Some may have already decided who gets their vote. Others will be unsure. I have a party preference based on policies which I broadly agree with, but am by no means a staunch supporter. Voters might also vote nationally for one party, and locally for another – or choose to put their support behind independent campaigners.

Frankly it all becomes a little bewildering, and there doesn’t appear to be a single definitive source of election material. We’ve had flyers from two parties so far and nobody has knocked on the door. Websites such as ElectionLeaflets.org attempt to collect flyers from all the campaigns, but they’re by no means comprehensive.

The local council are supposed to provide notice of forthcoming elections on their website, both announcing the election and stating the candidates. In the age of the Internet, this is undoubtedly a good thing – but goodness me, they’re tough to find. Almost all are in PDF, some in Word and more than a few are embarrassingly hard to uncover.

The humble voter – wanting to make an informed decision – is left bewildered. I would dearly love to be proven wrong, but as far as I can tell nobody has created a single source of election candidates. There is nowhere we can go to review the choices and make an informed decision prior to voting.

Presenting my super-duper Election 2014 site. Pop in your postcode, et voila – a list of candidates for local, mayoral and European elections handily presented in (almost) one place.
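A minimal sketch of how a postcode lookup of this kind could work, assuming a table that maps postcodes to wards and a hand-collected table of links for each ward. The table names, the ward code and the example.org URLs are all hypothetical and are not taken from the site itself.

```python
import re

# Hypothetical sample data: a real site would load a postcode directory
# and a hand-collected table of links to each council's notices.
POSTCODE_TO_WARD = {"AB12CD": "WARD-0001"}
WARD_NOTICES = {
    "WARD-0001": [
        ("Local election", "https://example.org/notices/ward-1.pdf"),
        ("European election", "https://example.org/notices/region-1.pdf"),
    ],
}


def normalise_postcode(raw):
    """Upper-case the postcode and strip spaces and punctuation."""
    return re.sub(r"[^A-Z0-9]", "", raw.upper())


def find_notices(raw_postcode):
    """Return (election, url) pairs for a postcode, or an empty list if unknown."""
    ward = POSTCODE_TO_WARD.get(normalise_postcode(raw_postcode))
    return WARD_NOTICES.get(ward, [])


if __name__ == "__main__":
    for election, url in find_notices("ab1 2cd"):
        print(election, url)
```

Normalising the postcode first means "ab1 2cd" and "AB12CD" find the same ward, and an unknown postcode simply returns no links rather than an error.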

The caveat is unfortunately something I’m unlikely to be able to fix in a hurry. I would have loved to put all options on a single page but it took nearly two days just to get the links together. Getting all candidates in one big database is going to take weeks of effort, and the elections are a little over a fortnight away.

There are a few gotchas. Unfortunately some boundary changes were applied earlier this year and the Office for National Statistics has not yet provided an update (and they tell me it’ll be released ‘around the end of May’ – great). Parish/town council elections are also unlisted but I’ve tried to note them where possible – a high proportion are uncontested anyway.

All in, I think this is a decent step towards my goal. Hopefully other organisations might be interested in picking up the baton in time for the next elections (if they don’t already have plans). At least one possibility is to open the database up to public edits Wikipedia-style, but I think we’ve missed the boat for May 22.

Please, let me know what you think. If you find it useful, interesting or worthwhile please spread the word.

 

Chrome’s Windows 8 Mode

[Screenshot: Chrome in Windows 8 Mode, 9 April 2014]

If you switch Chrome to “Windows 8 Mode” it creates its own little environment, complete with draggable windows, a task bar and a clock. It looks like a complete little operating system. (I’ve never used Chrome OS – but the screenshots look familiar).

This seems utterly daft to me.

Windows 8 Mode serves a very specific purpose. It’s full- or split-screen apps with no sense of windows. Think of them as panels. This approach – whether we enjoy it or not – is supposed to be consistent.

Chrome comes in, adds its own layer, and confuses the heck out of anybody who happens to click the wrong button. Am I still in Windows? Where have all my programs gone? Why is something different here? Anyone who has helped friends and colleagues with basic computer needs will know that the simplest change – the tiniest disruption – can cause users to lose their bearings.

For what it’s worth I don’t necessarily appreciate Windows 8 Mode either – I find the whole thing a half-way compromise between tablet and desktop that fails both sides. A dichotomy of inconsistent metaphors and actions. It’s a mess, but the last thing we surely want is another company (Google) throwing things even further off kilter.

As a proposition, I quite like the idea of Chrome OS, but as a separate choice only. Chrome in Windows 8 Mode appears to fail to appreciate the good things that Windows 8 Mode brings (yes, there are good bits) and wilfully catapults its users into a confusing, inconsistent environment. It reeks of the 90s trend of building apps with their own confusing controls and windows just because we can – although I suspect Google genuinely has long-term plans for what it’s doing here.

Heartbleed

A vulnerability has been found in the encryption library OpenSSL, used by a huge proportion of web and Internet services. This bug allows malicious users to access bits of memory on the server and potentially read enough information to render the encryption useless.

Worse, having obtained the right data, they could compromise the security of past and future communications allowing eavesdropping, impersonation and stealing of data.

The vulnerability, known as Heartbleed, was found by researchers at Google and Codenomicon. While publicly announced only yesterday (7 Apr), it seems the bug has been present since December 2011, and was part of a release in March 2012.

The various affected Linux distributions have been speedily updated and I updated our servers this morning. We must now wait and see how quickly the fixes will be applied to other servers and systems.

The effect of this bug is serious: it undermines the security protocols used throughout the Internet, and an attack is apparently undetectable in ordinary logs. This means that high-profile websites might be well-advised to renew their security certificates, so that any ‘exposed’ details cannot be used in a future attack.
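As a first check on whether a machine still needs the update, the OpenSSL version string can be compared against the affected upstream releases, 1.0.1 through 1.0.1f (1.0.1g carries the fix). The sketch below is only a rough guide: the function name is made up for illustration, and Linux distributions often patch the bug without changing the version string, so a match does not prove a server is vulnerable.

```python
import re
import ssl


def is_heartbleed_version(version):
    """Return True for upstream OpenSSL versions 1.0.1 through 1.0.1f."""
    match = re.fullmatch(r"1\.0\.1([a-z]?)", version)
    if match is None:
        return False
    letter = match.group(1)
    # No letter means the original 1.0.1 release; letters a-f are also affected.
    return letter == "" or letter <= "f"


if __name__ == "__main__":
    # ssl.OPENSSL_VERSION looks like "OpenSSL 1.0.1f 6 Jan 2014";
    # it reports the library Python itself is linked against.
    version = ssl.OPENSSL_VERSION.split()[1]
    status = "in" if is_heartbleed_version(version) else "outside"
    print(f"OpenSSL {version} is {status} the affected upstream range")
```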

No Marketing, No Service

The ostensibly ‘free’ wifi you often get at pubs, hotels and other locations usually requires some form of sign-up to access the service. These are usually fairly dumb captive portals that ask for name, email address and permission to send you marketing. No great deal – if you don’t like it there are plenty of fictional names that work equally well.

However, one particular hotspot from a well-known brand stuck out the other day. To use their service you must provide your mobile number – “it’s just to confirm your identity,” they say, but the terms and conditions state something else entirely: by signing in you are automatically opted into marketing. If you opt out, you lose your right to use the service.

In other words, receive our junk or no wifi for you.

This is – on the face of things – not that unreasonable. You put up with some marketing, and you receive wifi. Except, it’s not quite free, is it? Personal details; attention. Each has an implicit value: just look at advertising, where space on TV, radio, print and online is usually charged based on attention potential – i.e. how many people might see it.

So, is it really ‘free’? My feeling is not, but it’s hard to draw the line. A captive portal with a simple ‘Go Online’ and a banner ad is equally ‘non-free’ by this equation. Perhaps it’s the combination of giving up a mobile number and receiving marketing? But what’s the value…?

For me, the value of getting the wifi was less than the value of the data and rights I would have to exchange for it. From person to person, this is going to vary. Some might think nothing of it – in fact, given this is the company’s business model, I’d wager most do just that. In any case, the value of the ‘stuff’ we possess (data, privacy, attention) is unlikely to be always zero, and there is a limit to the amount people will give in return for a service or item.

Whatever – I didn’t sign up. Thankfully my mobile had signal and whatever I needed wasn’t that important anyway.

NLS Releases 37,000 Old Maps Online

I like collecting old maps. I like the stories they tell: how towns and cities have grown from tiny hamlets and fields.

Getting a selection of old maps online and geo-referenced is a bit of cartographic porn. Imagine my delight upon reading that The National Library of Scotland has released some 37,000 maps of Great Britain and selected other areas online and is rapidly working to geo-reference them (show them in the right place on an overlay).

[Image: NLS map]

Pure pleasure.

(Found via oobrien.com)

Tracking Devices via Raspberry Pi (Part Two)

In the last post I described how it could be possible to build a simple tracking device using a Raspberry Pi acting as a wireless network access point. In this post, I try it out for real. This isn’t a detailed technical run-down, just the basics.

[Photo: the Raspberry Pi setup]

It seems that to get this working you’ll need just a Raspberry Pi (Model B has extra USB + LAN so is easier to configure) and a USB wireless dongle. I started off with the Nano-USB TP-Link WN725N but this isn’t supported by the default drivers. I managed to get it working thanks to this thread on the Pi forums, which describes compiling the driver from the source on Realtek’s website.

However, it seems that the Realtek drivers don’t have the necessary hooks for responding to probe requests. My guess is this is usually handled at a lower level and hostapd merely makes use of callbacks to customise the handling of these requests. I didn’t really fancy rewriting large chunks of code to support it, so I opted for Plan B, which is an Edimax 7711USn (the big white thing in the picture).

The Edimax does seem to work with the default nl80211 drivers and sure enough, when I fired up debug mode, the probe requests came through.

By default, the probe requests only appear with some really high debug level (makes sense – they’re usually not that useful) and running in this mode would surely generate a whole load of debug messages I wouldn’t need. The second snag is that there is no timestamp with these messages.

So, the next step was to recompile hostapd with some customisations. At this stage, I merely wanted a proof of concept, so I moved the probe request notifications up to INFO level and added a timestamp using some existing functions. I then ran the program for a little while and, et voila, the requests started coming in. The screenshot below shows the requests being made (MAC address partially obscured) and the timestamp (a second counter).

[Screenshot: probe requests logged by hostapd]

These requests will be coming from all sorts of wifi-enabled devices: my phone, my laptop, phones of passers-by. The key thing is that I did nothing to these devices to enable this – they automatically scan for wifi networks and this device is merely visualising it.

In a little under forty minutes there were 1720 probe requests from sixteen separate devices. I’d call that a proven concept.
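Counting requests and devices from a log like this is straightforward. A minimal sketch, assuming each line has the form "seconds probe-req MAC"; that format is made up for illustration and is not the actual output of the customised hostapd.

```python
import re
from collections import Counter

# Assumed line format (hypothetical): "<seconds> probe-req <mac>"
LINE = re.compile(
    r"^(\d+)\s+probe-req\s+((?:[0-9a-f]{2}:){5}[0-9a-f]{2})$",
    re.IGNORECASE,
)


def summarise(lines):
    """Return (total requests, Counter of requests per MAC address)."""
    per_device = Counter()
    for line in lines:
        match = LINE.match(line.strip())
        if match:  # ignore any other log messages
            per_device[match.group(2).lower()] += 1
    return sum(per_device.values()), per_device


# Made-up sample lines, including one unrelated message.
sample = [
    "12 probe-req 00:11:22:33:44:55",
    "15 probe-req 00:11:22:33:44:55",
    "19 probe-req AA:BB:CC:DD:EE:FF",
    "some other hostapd message",
]
total, per_device = summarise(sample)
print(total, "requests from", len(per_device), "devices")
```

MAC addresses are lower-cased before counting so the same device is not counted twice because of differences in letter case.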

All in, the Raspberry Pi setup cost around thirty pounds. This is for a single device. My original plan was to see how this could be used for tracing paths through a space, such as a city or stadium. For this to work, one would need many more devices, both to triangulate the devices and to provide blanket wifi coverage.

The theory is: with enough of these devices in the right place it should be possible with reasonable accuracy to work out where people are, and where they go. At thirty quid* a pop, this needn’t be a hugely expensive task.
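A minimal sketch of the "where they go" part, assuming each device reports sightings as (timestamp, sensor name, MAC address) tuples. It ignores signal strength and triangulation entirely and only records which sensor heard a device, in time order; the sensor names and MAC addresses are made up.

```python
from collections import defaultdict


def build_paths(sightings):
    """Turn (timestamp, sensor, mac) sightings into an ordered sensor path per device."""
    by_device = defaultdict(list)
    # Tuples sort by their first element, so this orders sightings by time.
    for timestamp, sensor, mac in sorted(sightings):
        path = by_device[mac]
        # Skip repeats so a device lingering near one sensor is listed once.
        if not path or path[-1] != sensor:
            path.append(sensor)
    return dict(by_device)


if __name__ == "__main__":
    sightings = [
        (30, "gate", "aa:aa:aa:aa:aa:aa"),
        (10, "car park", "aa:aa:aa:aa:aa:aa"),
        (20, "car park", "aa:aa:aa:aa:aa:aa"),
        (15, "gate", "bb:bb:bb:bb:bb:bb"),
    ]
    for mac, path in build_paths(sightings).items():
        print(mac, " > ".join(path))
```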

As it turns out, this is precisely the sort of work going on in both the commercial and development worlds:

CreepyDOL is a networked tracking tool – sniffs network traffic and tracks users. As the name suggests, the powers ‘in the wrong hands’ are creepy indeed.

Wifi tracking of the kind I describe has been in use for a couple of years now, and it’s already used to track customers in supermarkets.

Sure enough, just as I was working on my project, The Register published an article about how Asda and EE in the UK are using wifi to achieve precisely this.

 

There are quite obvious privacy implications to all of this. MAC addresses are unique and, even if we can’t directly identify someone from them, it may be possible to get their identity from all the clues available (such as where they shop, where they dwell, and by sniffing wifi traffic). Privacy is a huge concern and any technique such as this – even if used with the best intentions and no direct logging – must still be handled with care.

Last year, it was widely reported that wifi trackers were embedded in some of London’s bins. Officials ordered the company to stop amid privacy concerns, and – despite the company’s protestations that all is anonymous – it has clearly hit a nerve with privacy advocates. There may well be no attempt whatsoever to track individuals, but a poorly executed plan will be met with harsh criticism irrespective of the actual risks.

I have a couple more articles planned to follow up these points – any comments are most welcome.

* There are, of course, all kinds of additional costs: You probably want a case, decent power, and public liability insurance. The Raspberry Pi is also not likely to be a great choice in the long run, for reasons which I hope to pick up later.

Using Raspberry Pi Access Point to Track Devices

I’ve been looking recently at ways to measure population movements across a large outdoor area. There are various ways of doing this: we watch people, we ask them, or we infer. Watching is a popular option at stadia, city centres and large events. Automated footfall cameras can track movements and figure out how many people walk past a certain point in any period. Put enough cameras up and you can get a pretty good picture.

However, this merely measures numbers of people. There is no connectivity. 200 people walk past point A, 400 past point B. Does that mean 600 in total at two locations, or did some of those in point A also walk past point B?

This got me thinking about technical solutions (of course) and fairly quickly I got to wireless networks. Back in 2008, I wrote about a scheme in Portsmouth’s Gunwharf Quays to track mobile devices. At the time I think the company was using bluetooth but since then we’ve had an astounding growth in smartphone usage. Wifi is pretty much standard and – I would perhaps wager – more likely to be turned on.

Keen to see how this might be implemented, I did some research into how mobile phones discover wifi networks and found an interesting behaviour. Every few seconds or minutes, your phone will send out a “who’s there” signal to probe local wifi networks. Any active access points will respond accordingly with their network name: “Hi, I’m BtHub_12345”. This surprised me a little, as I was expecting devices simply to listen for access points broadcasting themselves every few seconds. Access points do send out regular beacons, but devices also ask for themselves, and it seems this is to conserve battery: probing gets an answer faster than waiting and listening.

Anyway, having established this, it seemed likely the device would be giving its MAC address away. To clarify, the MAC address is the hardware-level identifier for network devices. It’s a bit like a serial number, so when two devices talk to each other they address each other by their MAC address so they know who’s who. MAC addresses are supposed to be unique, so no two devices should ever have the same address.
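With a unique identifier per device, the earlier question about points A and B becomes simple set arithmetic. A minimal sketch, using short made-up identifiers in place of real MAC addresses:

```python
def footfall_overlap(seen_at_a, seen_at_b):
    """Compare the device identifiers seen at two points."""
    a, b = set(seen_at_a), set(seen_at_b)
    return {
        "only_a": len(a - b),  # seen at A but never at B
        "only_b": len(b - a),  # seen at B but never at A
        "both": len(a & b),    # walked past both points
        "total": len(a | b),   # distinct devices overall
    }


if __name__ == "__main__":
    point_a = ["m1", "m2", "m3"]
    point_b = ["m2", "m3", "m4", "m5"]
    print(footfall_overlap(point_a, point_b))
```

Here seven sightings turn out to be five distinct devices, two of which passed both points, which is exactly the distinction a pair of footfall counters cannot make.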

Curious to see how easy this would be, I have started to build a Raspberry Pi device that is capable of this. It seems possible – hostapd allows the Pi to act as a wifi network access point. This forms the basis for tools such as Karma (which achieves a lot more than I need) and clearly has the capability to show probe requests.

There are also some privacy issues that need to be tackled. What are the legal aspects of tracking MAC addresses? It seems that, provided the addresses are not tied to individuals, we do not need disclosure. Indeed, my aim is to aggregate data from the start. Once a path has been created, any trackable information should likely be discarded. I’ll touch on these too.
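One way the "aggregate, then discard" idea could look in practice is sketched below. It is an assumption about how this might be done, not a description of a finished design: MAC addresses are replaced by salted hashes on arrival, paths are reduced to counts of moves between sensors, and the salt is thrown away afterwards. Hashing alone should not be treated as a guarantee of anonymity.

```python
import hashlib
import os
from collections import Counter


def pseudonymise(mac, salt):
    """Replace a MAC address with a salted SHA-256 digest."""
    return hashlib.sha256(salt + mac.lower().encode("ascii")).hexdigest()


def count_transitions(paths):
    """Reduce per-device paths to counts of sensor-to-sensor moves.

    The result contains no device identifiers at all.
    """
    moves = Counter()
    for path in paths.values():
        # zip pairs each sensor with the next one along the path.
        moves.update(zip(path, path[1:]))
    return moves


if __name__ == "__main__":
    salt = os.urandom(16)  # random per session, discarded afterwards
    paths = {
        pseudonymise("AA:BB:CC:DD:EE:FF", salt): ["car park", "gate", "stand"],
        pseudonymise("00:11:22:33:44:55", salt): ["car park", "gate"],
    }
    print(count_transitions(paths))
```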

More updates to come…

 

Daft Technology 2

Google’s Hangouts system seems to have a weird bug (seen on a Nexus 4, a Nexus 7 and an HTC device). When a correspondent is writing a message, the chat screen scrolls up a little. The latest message disappears.

It’s incredibly annoying, since the reader has to scroll back down to see new messages. There is no obvious cause. Others have reported it on the bugs list, forums and elsewhere online, but it’s been running like this for at least a year – sometimes it works and other times it doesn’t.

What could cause such a fundamental UI behaviour to fail? I want to embrace these systems, but they’re making it pretty difficult to adopt.

If you create a meeting in your calendar on Android, the Google Now agent will ‘helpfully’ notify you when it’s time to leave. Great, but it’s not smart enough to know that you’ve already left.

A number of times I’ve left for a meeting (with, say, a half hour drive) and suddenly the phone notification sounds, informing me that ‘to be on time, I should leave now’.

Such a simple thing to detect, surely – the phone knows that I’m on the move – but it seems this logic was too easily overlooked.
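A minimal sketch of the kind of check that seems to be missing, assuming the phone knows its current speed. The function name, parameters and the 10 km/h threshold are all made up for illustration, and a real check would also want to know whether the movement is towards the meeting.

```python
def should_notify_leave_now(minutes_until_meeting, travel_minutes,
                            speed_kmh, moving_threshold_kmh=10):
    """Return True only when it is time to leave and the user seems stationary."""
    time_to_leave = minutes_until_meeting <= travel_minutes
    already_moving = speed_kmh >= moving_threshold_kmh
    return time_to_leave and not already_moving


if __name__ == "__main__":
    # Half-hour drive, half an hour to go, sitting still: notify.
    print(should_notify_leave_now(30, 30, 0))
    # Same timing, but already driving at 50 km/h: stay quiet.
    print(should_notify_leave_now(30, 30, 50))
```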

Daft Technology

Went to Boots today, to print some photos. We stuck ten JPEGs on a USB drive – nothing fancy.

Touched screen to begin, put disk in. First thing it asks ‘what media are you using?’ – umm, the USB drive I just plugged in would be a reasonable guess.

Okay, so we press USB. ‘Please wait…’ and wait we did, for a very long time. Nothing. Eventually it returns to the starting screen, seemingly giving up.

Try again, same thing. Move to a second machine which has become free – not much better… it seems to be making progress but still incredibly slow.

While my wife is working the machines, I’m looking for a member of staff. Together, we realise the Boots machines will be cheaper, so we move across again.

This time, it works a little better. The machine reads the USB reasonably fast, and shows us the photos. Now the staff member does something quite extraordinary. She goes through each photo, and explains “the machine sometimes tries to ‘be clever’ and crops the photos – but it makes lots of mistakes.”

Sure enough, it’s quite evident the machine has indeed chopped off a few photos in inappropriate places. People are cut in half. Methodically, the employee reverts each of these ‘smart’ crops, undoing the work of the computer. We ask how this laborious task works if one has hundreds of photos. ‘It takes a while.’

Finally, the photos are ready to print. We go shopping, come back half an hour later, and they’re done. A fairly straightforward process made needlessly complicated – requiring the time and knowledge of a staff member. Sort of defeats the point of self-service really.