Your job’s so easy a robot could do it!

Luckily that applies to my job very rarely. But there are a few aggregation tasks that a computer can do better. Like writing “The Election in Numbers.” So I wrote a Python script to do TEIN for me. It’s progressed in fits and starts, but now it’s good enough that I can just copy the HTML it spits out, add tags and assorted other metadata, and post away. The last two components to come together were the party IDs for Senate and gubernatorial candidates (accomplished by manually entering info into a dictionary, because I couldn’t think of a better way) and the screenshot at top (which is a YouTube URL hack).
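
If you’re curious, neither piece is fancy. The party IDs are just a hand-keyed lookup table, and the “URL hack” leans on the fact that YouTube serves a still frame for any public video at a predictable address. A rough sketch of both (the candidate names and video ID here are placeholders, not the real data):

```python
# Hand-keyed party lookup -- names below are made-up examples, not the actual table.
PARTY_IDS = {
    "Jane Doe": "D",
    "John Roe": "R",
}

def youtube_thumbnail(video_id):
    """YouTube exposes a still frame for any public video at a predictable URL."""
    return "https://img.youtube.com/vi/%s/0.jpg" % video_id

print(PARTY_IDS.get("Jane Doe", "?"))          # -> "D"
print(youtube_thumbnail("VIDEO_ID_GOES_HERE"))  # placeholder video ID
```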

Anyway, the script runs off of ScraperWiki here, where you can also see the code. There are points where it could be more sophisticated. For example, I’m sure there’s a better way to read JSON than what I use for scraping Nate Silver’s predictions, and I fiddled with raw HTML on YouTube user pages rather than spending the time to learn the site’s REST API. I also considered and rejected using HuffPo’s Pollster API in place of the RealClearPolitics scrape for the same reason. But it does the job and writing it was more fun (and faster) than manually copying poll results every day, so I’ll take the win.
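
For what it’s worth, the JSON step follows the same basic pattern as everything else in the script: fetch a page, parse it, pull out the numbers you care about. Here’s a minimal sketch of that idea, with a placeholder URL and made-up field names rather than the actual feed I scrape:

```python
# A rough sketch of the "read a JSON forecast" step. The URL and field names
# are placeholders to show the shape of the code, not the real feed.
import json
import urllib.request

FORECAST_URL = "https://example.com/forecast.json"  # placeholder

def fetch_forecast(url=FORECAST_URL):
    """Download a JSON feed and map each race to its win probability."""
    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)
    # Assume the feed is a list of races, each with a name and a win probability.
    return {race["race"]: race["win_prob"] for race in data}

if __name__ == "__main__":
    for race, prob in fetch_forecast().items():
        print("%s: %.0f%%" % (race, prob * 100))
```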
