This past week, when not dealing with a busy stretch at work (including leading two walking tours on Sunday), I spent a bit of time playing around with the tools we learned about last week and looking a bit ahead.

After Sasha and Jeri’s excellent tutorials, I was eager to dive into webscraping tools. I spent most of my time poking around with Wget, learning its different options for bulk downloading. I thought of various historical resources I may want to use, and experimented with how to get that data. I learned, in particular, the importance of the -np (no parent) option, the hard way! I also successfully downloaded just the HTML and image files from a site; I was particularly interested in the HTML files, as they contain many primary sources divided among different pages. Next thing to learn: how to fuse these texts in a bulk manner…
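One possible way to fuse the downloaded pages, sketched in Python using only the standard library. This is a rough sketch rather than a tested workflow: it assumes the pages Wget saved end in .html and are encoded as UTF-8, and the names TextExtractor, html_to_text, and merge_html_folder are my own inventions for illustration.

```python
from html.parser import HTMLParser
from pathlib import Path


class TextExtractor(HTMLParser):
    """Collects the visible text of a page, skipping <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-blank text that is outside script/style blocks.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html):
    """Return the text content of one HTML string, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def merge_html_folder(folder, output_file):
    """Write the text of every .html file under `folder` into one text file.

    Returns the number of pages merged.
    """
    pages = sorted(Path(folder).rglob("*.html"))
    with open(output_file, "w", encoding="utf-8") as out:
        for page in pages:
            html = page.read_text(encoding="utf-8", errors="replace")
            out.write(f"===== {page} =====\n")  # marks where each page begins
            out.write(html_to_text(html) + "\n\n")
    return len(pages)
```

Calling something like merge_html_folder("downloaded-site", "all-sources.txt") (both names are placeholders) should produce a single text file with a header line marking where each page starts. Pages are merged in alphabetical order by path, which may or may not match the order the sources ought to be read in.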

I also tried to use it to back up my own website, with less success. Less success, in this case, means that it downloaded the “index.html” page that my WordPress installation produces… And nothing more. Am I missing something–perhaps I (or Dreamhost) have set my security settings too well? Then again, perhaps I shouldn’t complain about that. Is there an issue with downloading, say, a CMS? Should I be using something else (e.g., Python) in this case?

Python, and more generally writing scripts more complicated than one Wget command at a time, is another area in which I need to experiment more. For the end goals I could think of this week, Wget seemed to work. Mostly.

At some point I would also like to write my own Zotero translator. I spent some time trying to think of the kind of item for which I’d like to write one, without settling on anything (ironic, since I can think of plenty of times when the lack of a translator frustrated me), and so I haven’t written one yet. Likely, I will follow Sasha’s lead and write one for my own data pages about the claims that I’m researching. The first step in that process will, of course, be building said pages…

I also thoroughly enjoyed Julie Meloni’s tutorial on using APIs (in three parts: 1 2 3). It was written for someone at my level: an advanced (debatable) beginner and non-programmer.

That inspired me to get my own Google Maps API key and do some very, very basic playing. By very basic, I mean taking Google’s basic map page and plugging in my own coordinates and API key. Here is the result, focused in on my former place of employment, the Alamo.
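For anyone curious what that plugging-in amounts to, here is a small Python sketch that fills the same blanks in a bare-bones map page. It is only an approximation of Google’s sample page, which may look different from the version I started from; the function name build_map_page is made up, YOUR_API_KEY is a placeholder, and the Alamo coordinates are approximate.

```python
from string import Template

# A bare-bones page modeled loosely on Google's basic map example.
# The $-prefixed names are the blanks to fill in.
PAGE = Template("""<!DOCTYPE html>
<html>
  <head>
    <style>#map { height: 400px; }</style>
  </head>
  <body>
    <div id="map"></div>
    <script>
      function initMap() {
        new google.maps.Map(document.getElementById("map"), {
          center: { lat: $lat, lng: $lng },
          zoom: $zoom
        });
      }
    </script>
    <script async src="https://maps.googleapis.com/maps/api/js?key=$api_key&callback=initMap"></script>
  </body>
</html>
""")


def build_map_page(lat, lng, api_key, zoom=15):
    """Return the HTML for a page showing a map centered on (lat, lng)."""
    return PAGE.substitute(lat=lat, lng=lng, zoom=zoom, api_key=api_key)


# Approximate coordinates for the Alamo; the key is a placeholder.
alamo_page = build_map_page(29.4260, -98.4861, "YOUR_API_KEY")
```

Saving alamo_page to an .html file and opening it in a browser should show the map, provided the placeholder is swapped for a real key.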

As happened last week, learning these new tools helped me think more about what product I want coming out of this class. In developing my free-standing database site, I’d like to include maps–both for the aggregate data and for each individual claim. For the aggregate data, I’m not sure yet what mapping application I would use–I look forward to the next two weeks, when we learn about what’s available and which tool suits which job. For each individual claim, I would like to include maps for my four geographic fields: home port of ship (when applicable), place of incident, destination port, and destination of goods. Google Maps might be best for that, although the jury is still out.

A side note, along the lines of my individual data pages: I’d like to learn how to hide portions; e.g., the empty fields. If that is even possible. Hint, hint.
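One way this might work, if the data pages end up being generated by a script: skip the empty fields when the page is built, so they never appear at all. The Python below is a sketch under that assumption; render_claim is a made-up name, and the sample values are invented placeholders, not real claim data.

```python
from html import escape


def render_claim(fields):
    """Return an HTML definition list containing only the fields with a value."""
    rows = []
    for label, value in fields.items():
        text = "" if value is None else str(value).strip()
        if not text:
            continue  # empty field: leave it out entirely, so no blank row shows
        rows.append(f"  <dt>{escape(label)}</dt><dd>{escape(text)}</dd>")
    return "<dl>\n" + "\n".join(rows) + "\n</dl>"


# Placeholder record using the four geographic fields; values are invented.
sample_claim = {
    "Home port of ship": "",           # not applicable, so it is left out
    "Place of incident": "Example Bay",
    "Destination port": "Port Sample",
    "Destination of goods": None,      # unknown, so it is left out
}
sample_html = render_claim(sample_claim)
```

Here sample_html would contain rows only for the place of incident and the destination port. If the pages are written by hand rather than generated, a different approach (CSS or JavaScript, say) would presumably be needed.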

So, that’s where things stand. I look forward to learning more about mapping and APIs from Laura.
