June 23, 2006
Week Three - Friday
It's been a good couple of days for coding. I got on something of a tear and managed to knock out almost all of the remaining code. The MySQL stuff still needs to be done, but the database is installed on a computer named Thanatos, and I haven't been able to access it remotely. Rumi's needed to use it for the past couple of days, so no MySQL stuff this week.
The scripts, however, are looking pretty good. There are two separate scripts: the initialization script (which gets the set of photos we'll be tracking) and the update script (which updates the number of views and favorites for each photo every couple of hours). Both scripts produce a log file, and I made sure to implement exception handling so that the scripts could keep running even if something went wrong. In the case of socket errors, I threw in a retry routine that tries the request again after a five-second sleep.
I discovered earlier in the week that the number of views and favorites for each photo can't be obtained via the APIs, so I had to go another route. I'd fortuitously stumbled across a guide to web scraping in Ruby, and revisiting it led me to the scraping tool Rubyful Soup. A port of the Python tool Beautiful Soup, it proved to be just what I needed. Several test runs helped me work out the bugs, and now all that's left is to write the code to put the data in the database.
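For the curious, the retry idea mentioned above can be sketched roughly like this. It's a minimal illustration rather than the actual script: the helper name `with_retry`, its parameters (`max_attempts`, `delay`, `sleeper`), and the exact set of rescued exception classes are all assumptions made for the example.

```ruby
require "socket"

# Hypothetical helper (not the real script): runs the given block and,
# if a socket-level error is raised, waits `delay` seconds and tries again.
# Gives up and re-raises once `max_attempts` attempts have failed.
# `sleeper` is injectable so the wait can be stubbed out when testing.
def with_retry(max_attempts: 3, delay: 5, sleeper: ->(seconds) { sleep(seconds) })
  attempts = 0
  begin
    attempts += 1
    yield
  rescue SocketError, SystemCallError, IOError
    # SystemCallError covers the Errno::* family (ECONNRESET, ETIMEDOUT, ...).
    raise if attempts >= max_attempts
    sleeper.call(delay)
    retry # re-runs the begin block; `attempts` keeps its value
  end
end

# Example usage (assumed; `fetch_photo_page` is a made-up method name):
#   html = with_retry { fetch_photo_page(photo_id) }
```

Capping the number of attempts is a choice made for this sketch; it keeps one unreachable page from stalling a whole update run, and the final failure can still be caught and logged by whatever exception handling wraps the call.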