SPSS .sav to CSV

In attempting to download some data from the UK Data Service, I ended up having to use a file created by the statistics package SPSS. I don't have SPSS, nor have I ever used it, and these .sav files don't seem to open up sensibly in a text editor.

Fortunately, R has come to my rescue! What follows are the steps I took in Debian Linux to get a comma-separated values (CSV) file out, which can be opened by Excel, text editors and lots of other programs.

To install R, you do:

sudo apt-get install r-base

Then you open up R by typing "R" in the terminal. Once you're in, install and load the "foreign" package for handling SPSS files:

install.packages('foreign')
library(foreign)

Then you can read in the data set as a data frame:

dataset1 = read.spss('/path/to/file.sav', to.data.frame=TRUE)

Then all that remains is to output the CSV file. I saved mine in /home/beth:

write.csv(dataset1, file='/home/beth/dataset1.csv')

...

Wuthering Bytes

I spent this weekend at Wuthering Bytes in Hebden Bridge, Yorkshire - three days of talks and workshops on hardware, software, and tech in general.

A joystick attached to a Raspberry Pi.
Some of the projects from the workshop.

Among the cool talks I listened to were steam-powered techno, a presentation from the mother of the ARM processor, the Oxford Flood Network, and some literal rocket science, with fire and ear protectors and everything.

(You can see what we were all tweeting by checking out the #wutheringbytes hashtag.)

The wonderful Gareth & Naomi gave a presentation on their work with the Incredible Aqua Garden, an aquaponics system for growing food in Todmorden. They've built a monitoring system from scratch using a combination of Raspberry Pis and Arduinos, and even invented their own wireless sensors. They use Node-RED for wiring together hardware and APIs. We took a little detour to visit the garden, which is inside a local school, and found a greenhouse full of basil and watercress, getting nutrients from the fish that live beneath them.

The Sunday was given over to workshops. At my table, the Incredible Aqua Garden people (Gareth, Naomi and Paulo) were teaching attendees how to interface Raspberry Pis and sensors. I spent the day with some JavaScript libraries, and (with help from @hoegrammer and @errietta) made some graphs of real-time data from the sensors attached to the Raspberry Pi inside Incredible Aqua Garden, which broadcasts data across the internet using WebSockets.

Graphs of light level and pH from the Incredible Aqua Garden
The graphs aren't very exciting because the environment inside the aqua garden is very well controlled.

The code is available on GitHub (incidentally, this was my first experiment with GitHub Pages, and I benefitted very much from this tutorial), and it uses Rickshaw, as well as WebSockets. We were given two different electronics kits to play with as going-home gifts, as well as the world's cutest GitHub sticker, which now has pride of place on my little laptop.

...

The bedroom window was a very seedy and disreputable hard-felt hat.

Jeremy Brett as Sherlock Holmes

A Markov chain is a type of statistical model that's used to describe things that happen sequentially. You begin in one state, and from there you have a certain probability of moving to each of the possible next states. Models like this are useful for things like finding conserved DNA sequences.

This can also be quite a fun thing to play with - it's great for taking in text and trying to make sensible-sounding sentences out of it. The idea is that, instead of understanding what the sentences actually mean, you just look at which words usually come after the word you started with and pick one of them to go next.
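Here's a minimal sketch of that "which word usually comes next" idea in JavaScript. It isn't the code behind the site: the function names `buildSuccessorCounts` and `pickNext` and the sample sentence are made up for illustration, and it only looks at one word of context.

```javascript
// Count which words follow each word in a text (a first-order model).
function buildSuccessorCounts(text) {
  const words = text.split(/\s+/).filter(Boolean);
  const counts = new Map();
  for (let i = 0; i < words.length - 1; i++) {
    const current = words[i];
    const next = words[i + 1];
    if (!counts.has(current)) counts.set(current, new Map());
    const followers = counts.get(current);
    followers.set(next, (followers.get(next) || 0) + 1);
  }
  return counts;
}

// Pick a follower at random, weighted by how often it was seen.
// `random` is injectable so the choice can be made repeatable.
function pickNext(counts, word, random = Math.random) {
  const followers = counts.get(word);
  if (!followers) return null; // this word was never followed by anything
  let total = 0;
  for (const n of followers.values()) total += n;
  let threshold = random() * total;
  for (const [next, n] of followers) {
    threshold -= n;
    if (threshold < 0) return next;
  }
  return null; // not reached while random() stays in [0, 1)
}

// Hypothetical sample text, just to show the shape of the data.
const sample = 'the cat sat on the mat and the cat slept';
const counts = buildSuccessorCounts(sample);
console.log(counts.get('the'));       // followers of "the" with their counts
console.log(pickNext(counts, 'the')); // one follower, chosen by frequency
```

Words that follow "the" more often in the text are proportionally more likely to be picked, which is all the "model" really is.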

So I decided to model the English language as a Markov chain, using the text from The Adventures of Sherlock Holmes (from Project Gutenberg) as training data, and produced about as much coherence as you'd expect from such a method. If you go to http://bethmcmillan.com/geek/markov/, you can generate your very own pseudo-sentence.

I also made a Twitter bot that tweets these nonsense Holmesian sentences.

In brief, I installed Node.js, which lets you run JavaScript without a browser, and added the "twit", "jsdom" and "jquery" modules. I followed this tutorial for making a Twitter bot. The bot tweets every 5 minutes (I might change this if it turns out to be too much). After stripping the newlines, quotation marks and double spaces from the text, it picks a random word to begin with. Then, it takes this random word and the one that follows it, and finds all of the other places in the text where this pair of words can be found. Next, at random, it picks one of these locations and takes the next word in the sentence. Finally, the process repeats with the two newest words until there's a tweet-length phrase.
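The steps above can be sketched roughly like this. It's an illustration of the word-pair approach, not the bot's actual code: the names `tokenize` and `generate`, the 140-character limit and the sample text are all assumptions, it leaves out the Twitter posting entirely, and it expects a text of at least two words.

```javascript
// Clean the text as described: drop newlines, quotation marks and repeated spaces.
function tokenize(text) {
  return text
    .replace(/[\r\n]+/g, ' ')
    .replace(/["“”]/g, '')
    .replace(/ {2,}/g, ' ')
    .trim()
    .split(' ');
}

// Build a phrase by repeatedly looking up the two newest words in the text.
// maxLength = 140 is an assumed tweet length; `random` is injectable for testing.
function generate(words, maxLength = 140, random = Math.random) {
  // Start from a random word that has another word after it.
  const start = Math.floor(random() * (words.length - 1));
  const output = [words[start], words[start + 1]];
  let length = output.join(' ').length;

  while (true) {
    const [first, second] = output.slice(-2);
    // Collect the word that follows every occurrence of this pair.
    const candidates = [];
    for (let i = 0; i < words.length - 2; i++) {
      if (words[i] === first && words[i + 1] === second) {
        candidates.push(words[i + 2]);
      }
    }
    if (candidates.length === 0) break; // pair only occurs at the very end
    const next = candidates[Math.floor(random() * candidates.length)];
    if (length + 1 + next.length > maxLength) break; // would overflow the limit
    output.push(next);
    length += 1 + next.length;
  }
  return output.join(' ');
}

// Hypothetical stand-in for the Project Gutenberg text.
const sourceText = '"The cat\nsat on  the mat."\nThe cat ran.';
const words = tokenize(sourceText);
console.log(generate(words));
```

Rescanning the whole text for each new word is slow on a full book; building a lookup table from word pairs to their followers once, up front, would be the obvious speed-up if it mattered.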

All my code's available under the fold, for anyone who's interested. Feel free to follow @markov_holmes for entertaining gibberish!


...