Digital History – David Patrick McKenzie


Building Two Databases for My Dissertation…?

Wow, it’s been a long time since I’ve blogged outside of a class. As much as I admire the people who blog through their dissertations and have meant to do so, well… Good intentions and all that. I hope to blog my progress in the future, though, both as a way for me to explore and refine ideas, and, perhaps more importantly, to share the process for those who are currently undertaking or might undertake this type of work in the future.

For this first post, I’m delving into an area with which I’m struggling at the moment: Structuring and using data.


Data in my Dissertation

When the History Department at George Mason University accepted my prospectus in December 2016, my topic was the experiences of U.S. and Mexican travelers and migrants between each other’s countries from the start of Mexico’s Wars of Independence until the two nations went to war in 1846. I planned the work to be mostly qualitative, using a series of case studies of individual experiences to illuminate broader trends.

Although I’ve long had an interest in digital methods, I didn’t want to do digital work for the sake of doing so. I saw some potential uses for digital methods, but I didn’t have concrete plans.

That said, one of the pieces with which I’ve long struggled in pondering my topic is how to integrate quantitative work with qualitative analysis, to explore what a dataset could yield that individual case studies could not.

As I began to research, I found a data source that could help, thanks to Harold Dana Sims’s The Expulsion of Mexico’s Spaniards, 1821-1836. In 1820, the United States began to require inbound vessels from foreign ports to submit passenger manifests. These manifests are available on microfilm at the U.S. National Archives nearby. Then, even better, I found that these manifests were also available via Ancestry.com and FamilySearch. For someone working full-time Monday through Friday, this was golden, especially following the tragic (but, given the federal budget situation, sadly understandable and unsurprising) decision of the National Archives to end Saturday hours in summer 2017.

A black-and-white ship manifest of the Schooner Sally Ann, which sailed to New Orleans from Rio Grande, Mexico, in October 1826. The manifest contains the names of five passengers.

Passenger manifest of the Sally Ann.

Each manifest contains:

Ship information:

  • Ship Name
  • Port of Departure
  • Port of Arrival
  • Date of Arrival

Passenger information:

  • First and last names
  • Age
  • Sex
  • Occupation
  • Country to Which They Belong
  • Country in Which They Intend to Become Inhabitants
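To make that structure concrete, here is a minimal sketch of how one manifest’s data might be modeled in code. This is purely illustrative: the field names are my own shorthand for the manifest columns, and the dates are kept as text because the manifests record them in varied formats.

```python
from dataclasses import dataclass, field

@dataclass
class Passenger:
    name: str
    age: int
    sex: str
    occupation: str
    country_belongs_to: str   # "Country to Which They Belong"
    country_intends: str      # "Country in Which They Intend to Become Inhabitants"

@dataclass
class Voyage:
    ship_name: str
    port_of_departure: str
    port_of_arrival: str
    date_of_arrival: str      # kept as text; manifest dates vary in format
    passengers: list = field(default_factory=list)

# The Sally Ann manifest pictured above, minus its five passengers:
sally_ann = Voyage("Sally Ann", "Rio Grande, Mexico", "New Orleans", "October 1826")
```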

Why Ship Manifests?

What could a dataset based on these ship manifests add to this dissertation? For one thing, they could yield numbers of ships and passengers going between the United States and Mexico between 1820 and 1846.

Tracking those raw numbers could help me identify ebbs and flows in travel and trade, which I could then investigate to determine why. I could also see a more detailed picture of how traffic between individual ports ebbed and flowed over time, and, again, investigate why. I also quickly began to recognize some distinctive names in the records, yielding clues as to who might have business interests in which places and could make for a prime case study.

For example, a merchant or hatter (depending on which manifest) named John Baptiste Passement showed up frequently in voyages between New Orleans and various Mexican ports, most frequently Campeche, in the early 1820s. Conducting a further search of his name on Ancestry.com yielded a will listing creditors in Mexican cities.

I could also, on a more advanced level, even find social networks among travelers. Who traveled together multiple times? How were these people connected?
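One rough sketch of how such a network could be extracted (the trip data below is invented for illustration): count how often each pair of names shares a voyage, and each repeated pair becomes an edge in a social network.

```python
from collections import Counter
from itertools import combinations

# Trips as (voyage_id, person_name); names invented for illustration.
trips = [
    (1, "A. Smith"), (1, "B. Jones"),
    (2, "A. Smith"), (2, "B. Jones"), (2, "C. Brown"),
]

# Group passengers by voyage, then count every co-traveling pair.
by_voyage = {}
for voyage_id, person in trips:
    by_voyage.setdefault(voyage_id, []).append(person)

pair_counts = Counter()
for passengers in by_voyage.values():
    for pair in combinations(sorted(passengers), 2):
        pair_counts[pair] += 1

# A pair that shares two voyages is a stronger candidate connection
# than a pair that shares only one.
```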

I could also see if demographic profiles changed over time.

What Format?

Since there seemed to be many possibilities for what I could do with this data, I created an Excel spreadsheet as a first step. Pretty quickly, the spreadsheet began to grow unwieldy. For one thing, I was entering a lot of repetitive information: I put a new person on each line, but then had to repeat all of the information about the ship voyage for each passenger. I suspected I needed something more sophisticated. But after setting up a custom MySQL database in my Clio 3 class in 2012, I wasn’t ready to do that again. If the data is a mid-sized nail, an Excel spreadsheet is too small of a hammer, but a custom MySQL database is a sledgehammer. I needed something in between.
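The underlying problem is that a flat spreadsheet denormalizes the data: voyage-level facts get copied onto every passenger row. A minimal sketch of pulling the flat rows apart into one voyage record plus slim per-person rows (the rows and column names here are invented for illustration):

```python
# Flat spreadsheet rows: each passenger repeats the voyage columns.
rows = [
    {"ship": "Sally Ann", "from": "Rio Grande", "to": "New Orleans",
     "arrived": "1826-10", "passenger": "Passenger One"},
    {"ship": "Sally Ann", "from": "Rio Grande", "to": "New Orleans",
     "arrived": "1826-10", "passenger": "Passenger Two"},
]

# Normalize: one voyage record, keyed by its identifying columns,
# plus slim "trip" rows that point back to the voyage.
voyages, trips = {}, []
for row in rows:
    key = (row["ship"], row["from"], row["to"], row["arrived"])
    voyage_id = voyages.setdefault(key, len(voyages) + 1)
    trips.append({"voyage_id": voyage_id, "passenger": row["passenger"]})

# Result: one voyage record, two trip rows -- no repeated voyage data.
```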

A screenshot of a spreadsheet of data about ships coming into U.S. ports from Mexico in the early 1820s, showing a large number of repeated cells.

What my spreadsheet began to look like. Note the repeated cells.

Thankfully, as I was wrestling with this question, I attended THAT Camp DC 2017. At the spur of the moment, I suggested a session on dealing with historical data—hoping that my experiences could help others and that I could get some advice on what to do with this.

Thankfully, someone there—I don’t remember whom, as the person I suspect doesn’t think it was him—suggested Heurist. Heurist, as I learned, is a database platform created at the University of Sydney specifically for humanities research. It seemed that this would do the trick for me.

Indeed, it has.

Setting Up My Heurist Database

Amazingly, within 24 hours of my signing up for Heurist, I received an email from the project’s lead, Dr. Ian Johnson. I told him about what I was trying to do and shared my spreadsheet. He and I then exchanged emails and held a Skype call in which I sat on my balcony in Arlington, Virginia, he was in Paris, and we were pinging a server in Sydney. After some work figuring out how to structure the data, he came up with a scheme based on how the ship manifests are structured and what I might do with the data in the dissertation:

Chart showing structure of David McKenzie's Heurist database. The database includes five tables: Trip, Person, Voyage (of Ship), Ship, and Place.

The structure of my Heurist database.
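For readers who think in relational terms, the scheme corresponds roughly to five linked tables. Here is an illustrative sketch in SQLite; the table and column names are my own shorthand, not Heurist’s internal storage, which works differently.

```python
import sqlite3

# Five record types: each Person takes a Trip on a Voyage of a Ship,
# between Places. Names and columns are illustrative only.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE place  (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE ship   (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, occupation TEXT);
CREATE TABLE voyage (id INTEGER PRIMARY KEY,
                     ship_id INTEGER REFERENCES ship,
                     from_id INTEGER REFERENCES place,
                     to_id   INTEGER REFERENCES place,
                     arrived TEXT);
CREATE TABLE trip   (person_id INTEGER REFERENCES person,
                     voyage_id INTEGER REFERENCES voyage);
""")
```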

After I used OpenRefine to clean up my spreadsheet, he imported the data for me as a way to test out the importing features.

I cannot say “thank you” enough to him for all that he did.

And Now the Fun Part…

Since getting the database set up in the late spring, I’ve been going in spurts inputting data. Sadly, although I inquired on Twitter whether Ancestry or FamilySearch have these manifests available for bulk download, it seems I’ll be inputting manually (which, given the state of the OCR of names in particular, might not be a bad thing). I input two types of voyages:

  • For those inbound from Mexican ports, I input data on the voyage itself, as well as on all passengers.
  • For those inbound from other ports but with Mexicans on board, I input data about the voyage and then only input data about the Mexicans on board.

I realize that this method will not allow me to answer what percentage of inbound voyages to a particular port came from Mexico, but I’ve decided that the additional data entry would not be worth any questions it could answer.

Screenshot of the Heurist database, showing David McKenzie's "Add Trip (of Person)" interface.

The interface of a Trip record.

My workflow starts with selecting “Add Trip (of person).”

The Trip is the basic unit of my database—each Person takes a Trip on a Voyage of a Ship.

If I’m starting on a new manifest, I create a new Voyage record (often involving creating a new Ship record, as well).

I then check the person’s name to see if the name already exists in the database. Often, this is a guessing game as to whether a person is the same or not. Are the birthdates listed similar? Does the person who might be the same have a record of traveling between the same ports? Is the name unique enough to lessen the possibility that the records are referring to different people?

After making that judgment, I then proceed, if it’s a new person, to copy over the information that I can glean from the manifest about that person. Finally, after the Person record is created, I input the rest of the data about that person on the trip.
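Those matching questions can be written down as a crude heuristic. This is only a sketch of my mental checklist, not code I actually run; the field names and the age tolerance are invented for illustration.

```python
def plausibly_same(existing, candidate, age_tolerance=2):
    """Crude record-linkage check: same name, consistent implied
    birth year, and at least one port in common."""
    if existing["name"] != candidate["name"]:
        return False
    # Manifest ages imply birth years; allow slack for rounding and error.
    birth_gap = abs((existing["year"] - existing["age"])
                    - (candidate["year"] - candidate["age"]))
    if birth_gap > age_tolerance:
        return False
    # A shared port strengthens the case that the records match.
    return bool(set(existing["ports"]) & set(candidate["ports"]))
```

In practice the judgment stays manual, but even a sketch like this makes explicit which evidence (name, implied birth year, shared ports) is doing the work.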

I found a large number of Voyages—roughly 400—between Mexican ports and New Orleans just from 1820 to 1826. I started to question whether creating a comprehensive dataset would be worth the effort.

When I discussed this database project with faculty members and classmates at George Mason University’s Early Americas Workshop, opinion in the room was divided on that question.

I’m still debating, although I’m leaning toward plowing ahead in creating a comprehensive dataset to really be able to see and show change over time.

I’d love advice on this question!

Right now, with working full-time, I’ve taken the advice of others and given up on doing brain-intensive work on weeknights; instead, I’ll either read secondary literature or input data (often with a basketball game on in the background), while reserving primary source research and, eventually, writing for the weekends. It’s still taking quite a bit of time, though!

To take a break from the intensive data entry of ships coming into New Orleans, I’ve begun inputting ships coming into New York. So far, I’ve found many fewer (which, admittedly, creates the opposite problem—sorting through those manifests can be mind-numbing, although it does get me thinking about the differences in traffic).

What’s Next for the Manifests?

In the time since I started this database, my research focus has narrowed. My advisor, Joan C. Bristol, and I agreed that the original topic was too broad and ambitious. We agreed that I would instead focus on traffic going one way: U.S. migration into the interior of Mexico. This is because I found I was developing an argument about U.S. commercial expansion into the interior of Mexico as related to, but distinct from, the migrations into border regions like Texas that eventually resulted in U.S. territorial expansion. Preliminarily, I suggest that this secondary migration laid the groundwork for the formation of an informal, as opposed to territorial, U.S. empire in Latin America (more on this in another post).

This has led me to question the value of ship manifest data for that topic. I still think being able to quantify shipping and movement will be valuable, as will being able to pinpoint comings and goings of U.S. and Mexican nationals. I could still see connections between U.S. and Mexican ports, and find more people who were U.S. nationals but resident in Mexico.

What are your thoughts on how this data could be valuable?

And Possibly Another Database…

Meanwhile, examining how U.S. migrants to the interior of Mexico laid the groundwork for a future informal, commercial empire brought me back to the database that I already began constructing over five years ago: Tracking U.S.-Americans who filed claims against the Mexican government.

When I set up the database, I mainly looked at what the extensive files of 1839 (15 boxes) and 1849 (30 boxes) claims commissions could tell us about the claims themselves: Who the claimants were, summaries of the cases, and amounts of claims. But I’ve since realized, thanks to rethinking the topic, that I might have been asking the wrong questions, and thus extracting the wrong data.

While many of these files have to do with incidents involving merchants who simply traveled to Mexico but did not stay, there are also many about U.S.-Americans who settled in Mexico’s interior. Many of these files contain information such as when they settled in Mexico, where, their occupations, and demographics.

As part of the narrowed focus, I’m realizing that this data could prove valuable in painting a portrait of the U.S.-American diaspora in Mexico’s interior during this era, and how that group of people changed over time. What patterns exist? Where did the U.S.-Americans who settled in Mexico come from? Where did they settle? When? With whom did they interact? This could allow for a good number of visualizations that can paint a broader picture, beyond qualitative exploration of individual experiences.

Looking for Advice

I would love advice on how to build and use this dataset.

Should I use the records of claimants as my main source, keeping an intentionally limited but self-selected data set?

Or should I cast a wider net, knowing that it would be nearly impossible to create a comprehensive set?

And furthermore, should I scrap the claims database that I already started to create, or simply change some of the categories?

Should I create a new Heurist database and import the previous custom database, or, knowing that some of the same people are likely to be passengers on vessels (indeed, I’ve found my central case study, John Baldwin, on at least two ship manifests), add them to the ship manifest database?

Lots of questions for going forward…

Cartography: Maps & Conquests

This week’s readings focused on mapping in the Americas, particularly in the time of Contact and European colonization. For me, this was rather appropriate, as my Latin America and the World minor field readings course addressed the topic of Conquest last week. The cartography readings nicely complemented that, and particularly showed a difference between English and Spanish colonization and conquest.

Barbara Mundy’s fascinating deconstruction of the 1524 Nuremberg map of the Mexica capital Tenochtitlan showed something about the Spanish mode of colonization. She argues that the map, European in appearance in the styles in which it renders Tenochtitlan and other Lake Texcoco cities, actually reflects Mesoamerican influences in how it depicts the geography of the region. She speculates that the model for this engraving was a now-lost map Hernán Cortés sent to Emperor Charles V (Carlos I of Spain).

This comports with the understanding I’ve gained of the Spanish “Conquest” era from my recent readings and classes. The Conquest was not top-down and complete; as James Lockhart shows in his book Nahuas After the Conquest, Mesoamerican cultural structures (such as separate, Nahuatl-speaking courts and communities) remained for a long time after the Spanish decapitation of the Mexica Empire. Spaniards depended on indigenous geographic knowledge. Because of the nature of the Spanish conquest of the Mexica Empire–as part of an alliance with Tlaxcalans and Tarascans–indigenous names and ways of thinking, including geographically, melded with European ones.

Contrast this with English colonization of, well, New England. As J.B. Harley points out in Chapter 6 of The New Nature of Maps, English colonists symbolically erased Native American names, even while incorporating Native American knowledge into their maps. This is not to say that these contrasts are absolute; after all, present-day Mexico (and more) was the Viceroyalty of New Spain for 300 years, and plenty of indigenous-named places took on Spanish names. But these two articles, read together, are further demonstrations of the general idea that Spanish colonizers sought people they could exploit, while English colonizers largely sought to displace native populations. So perhaps it is no surprise that both indigenous knowledge and indigenous styles of cartography went into Spanish maps, while only indigenous knowledge went into English maps. Thus, we have another way of using maps as a primary source to illuminate the past.

Speaking of Latin America and Maps…

I remember mentioning how Peace Corps does world map projects on the first day of class. As it turns out, they have a whole website devoted to these projects now. At least from what I saw in El Salvador, maps tended to be on the sides of schools. A great way to show people the world–although sometimes people would realize just how small El Salvador actually is…

Addendum

This week I commented on Amanda’s blog. Forgot to post that last week I commented on mwill4’s blog then (sorry I don’t have your name down yet!).

Cartography: Map as Argument

For all two of my readers not from class (hi Mom!), I’ve just started a History and Cartography class for the semester. This week’s readings (week 1) focus on the idea of maps as primary sources containing arguments and agendas.

Admittedly, in spite of years of advanced history education, this is not a way I had consciously conceptualized maps before. This is a bit ironic since I used to stand by a map (of recent vintage) of the Republic of Texas when I worked at the Alamo and point out its errors to visitors. Most glaringly, it only showed how the Republic of Texas conceived of its own borders (extending to the Rio Grande, including not just present-day Texas but half of New Mexico, much of Colorado, and parts of Oklahoma, Kansas, and even Wyoming) and not the territory it ever actually controlled (about a third of that). Visitors were usually surprised. Using that map opened up the story of the border dispute that helped lead to the U.S.-Mexican War. But in spite of that experience, and realizing that map most certainly had an agenda and a story that it hid, I didn’t take the next step these readings helped me take.

I had thought of maps, and used them, as primary sources, but more as references. For example, this summer I used 1830s maps to gauge locations of roads and towns in my own creation of a map of the journey of Antonio López de Santa Anna to Washington in 1837. Could I indeed find Columbia, Arkansas? Where did my travelers cross the Brazos River? Maps from that period told me this information, and that was the extent to which I used them.

But this week’s readings helped me think more about how maps shape our perceptions of reality, and carry messages–whether intended or not–from the cartographer. What did the cartographer choose to emphasize? For example, J.B. Harley notes (page 39) that U.S. Geological Survey cartographers typically indicated woodland density, because these maps were initially made for military use.

The readings prompted me to go back and look at the maps I used as primary sources for creating my Google Map. Let’s take this 1837 map of the United States, which I primarily used for road locations. This map’s main emphasis, from a quick glance, is the political organization of the United States, rather than topography. As I did my Google Map, I only thought about my travelers going into the mountains as I looked at the Google Map in topographic view; I didn’t get that idea from the 1837 map. The 1837 map does, however, contain roads, rivers, and towns. It also includes statistics about U.S. states and cities, as well as traveling distances. So evidently the cartographer, publisher, etc., thought the customer would perhaps be most interested in trade, in getting around the country.

Admittedly, I hadn’t looked at the map much beyond the, well, map portion; in fact, I cropped out the rest when I adapted and used it for my Clio 2 project. The presence of engravings of six cities, plus George Washington and the Marquis de Lafayette, along the side convey a patriotic message (not to the extent of the eagle map of 1833, but close). Each city engraving gives an idea of bustling commerce.

So looking at this map as a primary source beyond a reference, it becomes clear that the cartographer, publisher, etc., are trying to convey an image of a prosperous, commercial country in 1837–a country with its autonomous states, but linked together by roads and rivers that carry commerce. A country showing reverence to its founding father, and his French aide. Why these two were chosen is a question for another day…

Map: Santa Anna Goes to Washington

For a long time, I’ve wanted to make a comprehensive map of the journey of Antonio López de Santa Anna, Juan Almonte, Barnard Bee, George Washington Hockley, and William Patton from Texas to Washington in 1837. The project for my summer New Media Minor Field Readings class builds on previous attempts:

  • For my Clio 2 website in spring 2012, I drew their route on an 1837 map of the United States. I planned to add interactivity to the map, but ran out of time for the project. Also, the map was not geolocated at all–it was just a static map.
  • For Clio 3, I used the journey to learn basic mapping software that is available online. I used Laura O’Hara’s tutorial (not available online at the moment). First I created a basic spreadsheet of the places cited in Almonte’s diary, then fed that into BatchGeo, an engine that spits out latitude/longitude and a KML file. I then fed that KML file into Google Maps. I also used MapWarper to place my 1837 map as the background. While I wasn’t happy with some of the automatic locations, I was able to move them to roughly the right places. I embedded the map on this webpage, but ran into trouble when I tried to embed it in my Omeka site. That map represented a good start, but I was looking for a more precise map, one that could depict the journey and not just points.
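For the curious, the KML that this kind of workflow produces is simple enough to sketch by hand. Here is a hedged approximation of generating placemarks from a list of locations (real BatchGeo output includes more metadata); note that KML puts longitude before latitude:

```python
def to_kml(places):
    """Build a minimal KML document from (name, lat, lon) tuples --
    roughly the shape of file a geocoder hands to Google Maps."""
    placemarks = "".join(
        f"<Placemark><name>{name}</name>"
        f"<Point><coordinates>{lon},{lat}</coordinates></Point></Placemark>"
        for name, lat, lon in places
    )
    return ('<?xml version="1.0" encoding="UTF-8"?>'
            '<kml xmlns="http://www.opengis.net/kml/2.2">'
            f"<Document>{placemarks}</Document></kml>")
```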

Our next assignment for this summer’s course is to prepare a Google Map of a journey. Since we had five weeks between classes, I decided to try again, but this time, make the map more precise, more interactive, and more indicative of a journey. Here is where I am:

Like Jordan, I used Google Maps Engine Lite (Beta). I could have gone through the steps that I did for the Clio 3 version, with a full KML file and such, but wanted to see how this software worked for me. It seems to have done the job.

To get the locations where the travelers stopped, I relied a great deal on a Southwestern Historical Quarterly article by the late, lamented Dr. Margaret Swett Henson. Henson annotated Almonte’s diary as an appendix to her article about the treatment of Mexican prisoners of war held in Texas following its 1836 war of independence. The diary itself contains only basic information, such as where the travelers stopped each night. Henson tracked down many of these locations.

I used Dr. Henson’s descriptions, cross-referencing other geographic sources, to plot precise locations on my map. I found the rough locations that I wanted to use in Google Maps, then right-clicked (a trick I learned after the first few places!) to get the exact latitude and longitude. I corresponded those with the dates in a Google Docs spreadsheet.

I found areas where I disagreed with Henson’s analysis; these were instances of more information being available at hand to me in 2013 than was easily available for Henson. For example, Henson suggested that Almonte’s reference to passing “Columbia” meant a steamboat, because she found no river town of that name in Louisiana, Arkansas, or Mississippi. Indeed, that is true today. However, starting with a Google Search for “Columbia Arkansas 1836,” I found reference to a steamboat exploding near a town called Columbia, Arkansas, in 1836. I then checked an 1850 Library of Congress map of Arkansas and located the town of Columbia. That town no longer exists today. While I found these within a few minutes in 2013, I don’t want to know how far and hard Henson would have had to look for this information–or whether it would have been worth doing so for a footnote in an appendix of an article on another subject…

I also felt that including Almonte’s diary entries, Henson’s analysis, and my notes where appropriate would enhance the user’s experience with the map, so I put each into the spreadsheet. Finally, I ran the map engine with the spreadsheet, and voila, I got clickable dots for each day!

But because I am obsessive, that was not enough for me. Some of my surmising of imprecise locations came from combining Almonte’s distance calculations, Henson’s knowledge of what routes they took, and my comparisons to 1837 maps. As such, I wanted to be able to show not just the dots, but the route the travelers took. Thankfully, I discovered that Google Maps Engine also offers the option of drawing on the map. Some parts were easy, such as the Mississippi and Ohio rivers or the National Road. Other parts were more difficult, such as figuring out the routes the travelers took through East Texas, so I had to guess or just draw straight lines.
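One way to cross-check a diary’s reported distances against plotted coordinates is the haversine great-circle formula, sketched below. This is an illustrative aid, not part of my actual workflow, and straight-line distance will always understate road or river mileage:

```python
from math import radians, sin, cos, asin, sqrt

def miles_between(lat1, lon1, lat2, lon2):
    """Great-circle distance in statute miles (haversine formula) --
    a lower bound for the actual distance traveled by road or river."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 3958.8 * asin(sqrt(a))  # 3958.8 = Earth radius in miles
```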

So in the end, I produced a map that shows the places that Almonte mentioned, as well as the route they took. There’s still more that I could do with this (and indeed may). For some parts I felt a great deal of certainty–e.g., where I could find an exact location for the hotel where they stayed–while in many other parts I chose a central spot within a town. I may go back and depict the certain and uncertain locations in different colors. Also, I may add more links to external information. And finally, I need to figure out how to handle multiple dots in the same spot.

In the end, this exercise helped me think about this journey in different ways. I started researching this journey off and on in 2009 (research that helped lead me back to graduate school). Looking more closely at the map gave me ideas of new sources, such as local newspapers from towns along the way. I also gained a greater understanding of the topography, especially through using Google Street View.

In the future, I may try to embed this map in my updated Omeka site (once I get around to customizing the theme for Omeka 2.0). I may, though, instead try Neatline, as it seems that will help me depict the space in time all the better. Have others used Neatline for these purposes? What have been your experiences?

Skyping into Class

Today I used Skype for only the second time.

Instead of venturing to Fairfax for my History and New Media minor field readings class this afternoon, I Skyped in. This was partially out of convenience–it was nice to be in my apartment instead of on I-66 at 5:45–but mostly as an experiment. Because history education is a large part of the readings course, and videoconferencing is increasingly used in the classroom, Dr. Kelly Schrum, our professor, rightfully urged us to try it, if we could, at least once. I’m glad I did.

These are my thoughts, based purely on experience and not on educational theory. I’d be curious to hear what others think, both from their experience and from a pedagogical perspective.

Due to technical issues, I wound up experiencing videoconferencing into class in two ways: full video and, on my classmates’ end, audio-only.

Full Video

During the first half, my video connection worked. My classmates told me I was a giant head on the screen (I made a sign saying “Obey” to celebrate the occasion). With the way Dr. Schrum set up the webcam, meanwhile, I felt like I was sitting right there in the classroom.

This felt more natural than I was expecting, at least on my end. I could follow along the discussion extremely well. Two weird parts stood out for me:

First, since the webcam was in one place and my head was displayed in another, whenever my classmates looked toward me, they looked toward the screen, although I was “gazing out” from the webcam. I had the same thing happen on my end–because the webcam on my MacBook Pro is on the top of the screen, I’m sure it seemed as if I was looking down. I also found myself more conscious of my expressive face–something that I normally don’t notice in person in the classroom, but did notice more knowing my face was blown up on the screen. I also found that my hand gestures were more deliberate–for example, making sure that air-quotes were visible.

Second, and perhaps more important from a pedagogical perspective, I did feel more hard-pressed in participating in the discussion. In person, it’s easy to see that I’m wanting to say something. I felt myself needing to push a little bit more–even though it was at least easier with the video on, since others could see I was looking to say something.

When it came to the group activity that Nate and Lindsey created, I realized it would be easier for me to do it solo than have Dr. Schrum move the webcam over to one group, not to mention to have me on the speakers. So that part was not as conducive to Skype.

Audio Only, On One End

When the class took a five-minute break, I turned off the video on my end while I got out of my chair. When I came back and turned the video back on, I noticed that I was the last one back. After a couple of minutes, it looked like everyone was waiting for me; it was then that I realized that I wasn’t showing up to the class! I could see everyone but they couldn’t see me! In spite of Dr. Schrum’s and my best efforts, we couldn’t get me back on the screen.

So for the rest of the class, I was an audio-only participant, which provided a different experience. It was less disconcerting on my end because I could still see everyone, so still felt like I was in the room. I found myself less conscious of my facial expressions than I had been. But I also felt like it was harder for me to participate. I had to insert myself into the conversation more than I had with video and, as classmates can tell you, much more than in person.

I also noticed a slight technical glitch, which I found in the first part but particularly in the second: When I spoke, I couldn’t hear my classmates. It might have been that I was using a headset. But I did find that a bit distracting. There was a lag time of 1-2 seconds after I said something before the audio from the class came back. This was especially evident when I presented my thoughts from Megan’s great activity on audiences and presenting history online. So with those factors my participation felt less natural–but still more so than if I had been on audio-only, as happened due to technical glitches when both Megan and Nate Skyped in (I hope they will comment on their experiences below. Hint hint.).

Overall Thoughts

In the end I still prefer being in class. There is something to be said for being there in person. However, all things considered, Skyping in did not take away from my experience as a student nearly to the extent I thought that it would. I was able to participate almost as fully as I normally do, especially when I was on video. In part because of where Dr. Schrum put the camera, and I think in part because this is my third class in the same room, I felt the same level of comfort and like I was part of the class. I followed along the conversation well and, except for the aforementioned glitches, felt like I could take part. This was even true, albeit less so, for the second half.

So while I think Skype or other videoconferencing technology is not a substitute for being there in person, it did come close. As the technology and, hopefully, bandwidth improve, I can see videoconferencing being a vital tool for education to overcome factors that prevent people from being there in person, whether minor factors like laziness about I-66 traffic or major ones like teaching students in another country.

I would be curious, however, to see how it would work in a class of more than seven people. Would it be as effective? How much adaptation would the teacher and students have to make? That I’d like to try next.

Clio 3 Long (enough, I hope) Tutorial: Creating a MySQL Database Listing in Your WordPress Site

Note: I have cross-posted this on the class blog. Should any changes be made, they will appear on that site in the future.

Lesson Goals & Reasons

Goals

In this lesson, you will learn how to create a listing of data from a MySQL database, and have it display in WordPress.

You will need:

  • A WordPress site (often used for a blog), hosted on your own server. In other words, not a blog/site hosted on WordPress.com.
  • A MySQL database, separate from your WordPress site.
  • A text editor, such as Komodo Edit or TextWrangler. These are available for free.
  • An FTP client, such as CyberDuck, for connecting to your server.

You will do the majority of the editing in the separate text editor; you may also use WordPress’s native interface.

Why?

Many historians have our own blogs–perhaps even as part of our own websites–and separate databases that we use for research. This tutorial will show you how to display the data from your MySQL database on your WordPress site, so that you don’t need a separate site for that purpose. In the long run, this will save you time: you won’t have to develop separate CSS for your database listing (and other related pages). Your listing will simply follow the style of your WordPress site.

This tutorial is for those who have an existing MySQL database and would like not to have to create a custom website for displaying, modifying, and/or querying the data. In other words, you can make your database pages follow the style of your WordPress template and live in your already-existing website.

As you may know, WordPress offers a plethora of plugins. There are two (here and here) that offer the option of integrating a MySQL database into your WordPress site. Both, however, require you to copy your already-existing database into WordPress’s own MySQL database. Doing this increases the possibility of error–for instance, if you make an alteration to your data and click the wrong place, you may break something in WordPress. So we want to keep our MySQL database separate from our WordPress database; in this tutorial, we are simply using WordPress to display data from your MySQL database.

What Will You Do?

This tutorial has four parts:

  • Understanding WordPress: Providing a basic explanation of relevant parts of WordPress and how it works.
  • Creating the Connection: How to connect your separate database into WordPress.
  • Creating a New Template: How to create a page to hold your listing.
  • Creating Your Listing: How to list the contents of your database, right in your WordPress page.

Understanding WordPress

Basics

WordPress is a content management system. Many historians, in particular, use it to host a blog; indeed, that’s its original function. But WordPress is also a great system to use for an entire website.

WordPress divides into two basic “units”: posts and pages. To make a technical explanation short, posts are your typical blog posts. Pages, meanwhile, act like the static pages you find on any website. In this tutorial, you will create custom pages to display the data from your MySQL database.

WordPress, like many other content management systems, is coded in PHP. In fact, it is connected to its own MySQL database; anything you put into it–a page, a post, a picture–is stored in a specified spot within WordPress’s database. Each page or post fishes out that content and displays it.

Themes

A theme is what controls the appearance of your WordPress site. Each theme (of which there are, now, literally thousands) contains several PHP files for different parts of WordPress. There are several page templates within each WordPress theme. In this tutorial, we will duplicate and edit a page template.

An important note: If you haven’t done any modifications before to your WordPress template, you should create a child theme. Here is how to do that and why it’s important. Any pages that we create here will be saved into the child theme.

We will be working in the folders for your parent and child themes: duplicating files from your parent theme folder into your child theme folder, and making changes there. To find your theme folders, open your FTP software and navigate to the folder where you installed WordPress. The file structure of WordPress is the same for each installation: the themes can be found under wp-content, then themes. The folder holding your parent theme houses the different page files for that theme.
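To orient yourself, the relevant slice of the directory tree looks roughly like this (the theme names here are hypothetical; yours will differ):

```
wp-content/
└── themes/
    ├── your-parent-theme/
    │   ├── functions.php
    │   ├── page.php
    │   └── ...
    └── your-parent-theme-child/
        ├── style.css
        └── functions.php   ← created later in this tutorial
```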

Creating the Connection

In this tutorial, we will just be making a database listing. However, you may, in the long run, want to use your WordPress site for multiple functions related to your MySQL database. For example, you may want a data entry form. Or you may want a page where you, or your users, can query the database. As such, in this section we are going to set up a function that will allow you to connect to your separate MySQL database with just one line of code, in any page.

Create a child theme functions.php file

Each WordPress theme contains a file called functions.php. This file is essential to WordPress. A function is a set of code in PHP containing a series of commands. Generally, you want to create functions for a sequence of commands you may use frequently. WordPress stores all of its functions–called up throughout your site–in functions.php.

We will set up a separate functions.php file for our child theme. This will augment, but not alter, the functions.php file that came installed with your theme. The reason: less room to break the entire site by accidentally altering or deleting an essential function.

Create your function

Create a new file called “functions.php,” and save it in your child theme folder. Open that file.

Now we will create a function to connect to our database. Begin with the basic opening and closing command for PHP:

 <?php

?>

All of our commands will be within those brackets. Next, create the basis for the function. Let’s call it open_mysql_db:

 <?php 
function open_mysql_db() {

}
?>

As you can see, PHP calls on us to use parentheses after the name of the function. All of our commands to execute the function, meanwhile, go within the curly braces.

Next, within the function, create a PHP variable–many of the tutorials I’ve seen use $mysqli for this purpose, so I’ve used it here. With this variable, you tell your file to connect to your MySQL database, located separately on your server. You input the address for your database, your login name, your password, and the specific database you are using. Make sure to put those things in single quotation marks, like below:

<?php 
function open_mysql_db() {
$mysqli = new mysqli('address of database', 'username', 'password', 'database name'); 
}
?>

Finally, we want to know if the connection does not work. This code will tell us:

<?php
  function open_mysql_db() {
    $mysqli = new mysqli('address of database', 'username', 'password', 'database name'); 

    // check connection; mysqli reports a failed connection in connect_error
    if ($mysqli->connect_error)
    throw new Exception('Could not connect to database');
    else
    return $mysqli;
  }
?>

What does this do? Essentially, it checks whether the connection succeeded: if not, it stops with an error message; if so, it hands back the connection (the $mysqli variable) for your page to use.

Now that we have our function defined, save and close your functions.php file. This function will now be available for any part of your site to use!

Creating a New Template

Now we will create the page to house your database listing. We will do this in a text editor (such as TextWrangler), but we can also do it in WordPress’s editor on your site. For this tutorial, I’m using the text editor, because it uses colors to help us understand the code we are inputting, and even indicates where we need to close brackets!

Create a new template

To create the page to display your data, you can start from any of the pre-existing page templates. Each template lives in the directory for your parent theme. In the parent theme directory, find the template you wish to use. Copy (do not move!) it into your child theme directory. Rename the duplicate file; for this tutorial, let’s call it “database_list.php.” Open that file in your text editor.

This next portion will vary depending on the theme. In some themes, at the very top of the page you will see a line saying “Template Name:”. If it is there, enter on this line the name you want to give the template. This is important, as this is how you will choose this particular template for your page. For what we’re doing in this tutorial, let’s call our template “Database Listing.”

If the line is not there, insert the following at the top of the file–WordPress will recognize it in any theme:

<?php /**  
* Template Name: Database Listing  
*/ ?>

Connect to your MySQL database

Every WordPress page template has similar elements, the most important being “the_content.” This displays the content that you create for a particular page in WordPress’s scheme. Look for this line:

<?php the_content(); ?>

You will place everything you need for your database listing beneath this line, and above everything else–especially this line:

 <?php endwhile; ?> 

The reason? The key to how WordPress does this is a concept called “The Loop.” Basically, this is what tells WordPress to continue fishing content out of the database to display; without The Loop, you could only display one blog post at a time. You can learn more about The Loop here and here. For our purposes, the most important part to know: any code that we add to our pages must be above that particular line, which is what makes The Loop stop running. If your code is below that line, WordPress will not know to use it, and thus will not display it.
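For reference, the skeleton of The Loop in a typical page template looks something like this–a simplified sketch (your theme’s version will have more markup, and this fragment only runs inside WordPress):

```php
<?php if (have_posts()) : while (have_posts()) : the_post(); ?>

  <?php the_content(); ?>

  <?php // your database-listing code goes here, still inside The Loop ?>

<?php endwhile; endif; ?>
```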

Now, we tell the template to run the function that we just created. To do that, we assign the function’s result to a variable. In this case, we’ll use “$conn” for “connection”–you can use whatever variable name you want, as long as you remember it and use the same one throughout! Insert this line:

<?php $conn = open_mysql_db(); ?>

This runs the function once; from then on, $conn holds your open database connection, ready for any queries on the page.

Creating Your Listing

Now we are ready to create our database listing. We will create a table using basic HTML commands–except that we are telling it to display the contents of our database. For purposes of this tutorial, we’ll use a database with four fields (called “field1,” “field2,” “field3,” and “field4”–creative, I know) in a table called “table.” (One caveat: table happens to be a reserved word in MySQL, so in real queries wrap that name in backticks–or, better, substitute your actual table name.)

Query the database

We begin, as always, with our opening and closing PHP brackets:

<?php

?>

Now connect to the database and select the fields that we want to display in our listing. We start with an if-else statement. Essentially, we are telling the template to display the table if it can connect and get results from the query. Otherwise, it will tell you that there is an error:

<?php
// get the records from the database
if ()
  {

  }
// show an error if there is an issue with the database query
else
  {
    echo "Error: " . $conn->error;
  }
?>

Next, we assign the result of the query to a variable–“$result” in this case–and run the query using a regular MySQL SELECT statement:

<?php
// get the records from the database
if ($result = $conn->query("SELECT field1,field2,field3,field4 FROM `table`;"))
  {

  }
// show an error if there is an issue with the database query
else
  {
    echo "Error: " . $conn->error;
  }
?>

Now, within the curly braces after we query the database, we include another if-else statement. If the query returns records, we display them; otherwise, we display a message saying there are no results. So first, set up the if-else statement:

<?php
// get the records from the database
if ($result = $conn->query("SELECT field1,field2,field3,field4 FROM `table`;"))
  {
    // display records if there are records to display
    if ($result->num_rows > 0)
      {

      }
    // if there are no records in the database, display an alert message
    else
      {
        echo "No results to display!";
      }
  }
// show an error if there is an issue with the database query
else
  {
    echo "Error: " . $conn->error;
  }
?>

Now create the table that will hold the records (the border and cellpadding attributes give it some minimal styling):

<?php
// get the records from the database
if ($result = $conn->query("SELECT field1,field2,field3,field4 FROM `table`;"))
  {
    // display records if there are records to display
    if ($result->num_rows > 0)
      {
        // display records in a table
        echo "<table border='1' cellpadding='10'>";

        echo "</table>";

      }
    // if there are no records in the database, display an alert message
    else
      {
        echo "No results to display!";
      }
  }
// show an error if there is an issue with the database query
else
  {
    echo "Error: " . $conn->error;
  }
?>

Next, set up the headers for your table:

<?php
// get the records from the database
if ($result = $conn->query("SELECT field1,field2,field3,field4 FROM `table`;"))
  {
    // display records if there are records to display
    if ($result->num_rows > 0)
      {
        // display records in a table
        echo "<table border='1' cellpadding='10'>";

          // set table headers
          echo "<tr><th>Field 1:</th><th>Field 2:</th><th>Field 3:</th><th>Field 4:</th></tr>";

        echo "</table>";

      }
    // if there are no records in the database, display an alert message
    else
      {
        echo "No results to display!";
      }
  }
// show an error if there is an issue with the database query
else
  {
    echo "Error: " . $conn->error;
  }
?>

Next, we pull the data out of each record (row) that the query returns. We set up a while statement: while each row of data is being fetched, it will be put into the table. Begin with the while statement:

<?php
// get the records from the database
if ($result = $conn->query("SELECT field1,field2,field3,field4 FROM `table`;"))
  {
    // display records if there are records to display
    if ($result->num_rows > 0)
      {
        // display records in a table
        echo "<table border='1' cellpadding='10'>";

          // set table headers
          echo "<tr><th>Field 1:</th><th>Field 2:</th><th>Field 3:</th><th>Field 4:</th></tr>";
          while ($row = $result->fetch_object())
            {

            }
        echo "</table>";

      }
    // if there are no records in the database, display an alert message
    else
      {
        echo "No results to display!";
      }
  }
// show an error if there is an issue with the database query
else
  {
    echo "Error: " . $conn->error;
  }
?>

Finally, we fill in our table, which will display each record of our database. We use HTML to create the rows of the table. Within each row, we write the variable “$row”, then “->”, then the name of each field. This tells the file to cycle through all of the records (i.e., each row) of the database until there are no more. Here is the code:

<?php
// get the records from the database
if ($result = $conn->query("SELECT field1,field2,field3,field4 FROM `table`;"))
  {
    // display records if there are records to display
    if ($result->num_rows > 0)
      {
        // display records in a table
        echo "<table border='1' cellpadding='10'>";

          // set table headers
          echo "<tr><th>Field 1:</th><th>Field 2:</th><th>Field 3:</th><th>Field 4:</th></tr>";
            while ($row = $result->fetch_object())
              {
                // set up a row for each record
                echo "<tr>";
                  echo "<td>" . $row->field1 . "</td>";
                  echo "<td>" . $row->field2 . "</td>";
                  echo "<td>" . $row->field3 . "</td>";
                  echo "<td>" . $row->field4 . "</td>";
                echo "</tr>";
              }
        echo "</table>";

      }
    // if there are no records in the database, display an alert message
    else
      {
        echo "No results to display!";
      }
  }
// show an error if there is an issue with the database query
else
  {
    echo "Error: " . $conn->error;
  }
?>

Now you have your template, which will include your listing. But we’re not done yet; all we’ve done is create a template for your eventual page.
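As an aside, if you want to sanity-check the row-building logic without a live database, you can mimic what $result->fetch_object() returns with plain PHP objects. This is a self-contained sketch with hypothetical sample data; the render_rows helper is my own invention, not part of WordPress or mysqli:

```php
<?php
// Build table rows from an array of row objects, mirroring the
// while ($row = $result->fetch_object()) loop in the template.
function render_rows(array $rows): string {
  $html = '';
  foreach ($rows as $row) {
    $html .= "<tr>";
    $html .= "<td>" . $row->field1 . "</td>";
    $html .= "<td>" . $row->field2 . "</td>";
    $html .= "</tr>";
  }
  return $html;
}

// Stand-in data, shaped like records fetched from the database.
$sample = [
  (object) ['field1' => 'a1', 'field2' => 'b1'],
  (object) ['field1' => 'a2', 'field2' => 'b2'],
];
echo "<table>" . render_rows($sample) . "</table>";
// prints: <table><tr><td>a1</td><td>b1</td></tr><tr><td>a2</td><td>b2</td></tr></table>
```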

Create a new page for the data

Next we will create the actual page for displaying this data.

On the left side of your WordPress dashboard, go to Pages>Add New. Name the page whatever you would like, e.g., “[Database Name]: Listing.” Next, write any explanatory text you would like in the content box. This will go above your listing. Here is what the interface looks like:

Now, we want to connect this page to its template. On the right side, you see a box for “Page Attributes.” Click on the dropdown menu for Template, and select the template you just created. This will call up the code that you just wrote!

Now, publish the page, and have a look at the finished product. You will see a database listing that now conforms to your WordPress theme, with no need for extra styling!

Here is what it looked like in mine:

Playing with new tools

This past week, I’ve spent a bit of time (at least when not dealing with a busy week at work, including leading two walking tours on Sunday) playing around with tools that we learned last week, and looking a bit ahead.

After Sasha and Jeri’s excellent tutorials, I was eager to dive into webscraping tools. I poked around most with Wget, learning different options for bulk downloading. I thought of various historical resources I may want to use, and experimented with how to get that data. I learned, in particular, the importance of the -np (no parent) option–the hard way! I also successfully downloaded just the HTML and image files from a site; I was particularly interested in the HTML files, as they contain many primary sources divided among different pages. Next thing to learn: how to fuse these texts in bulk…

I also tried to use it to back up my own website, with less success. Less success, in this case, means that it downloaded the “index.html” page that my WordPress installation produces… and nothing more. Am I missing something–perhaps I (or Dreamhost) have set my security settings too well? Then again, perhaps I shouldn’t complain about that. Is there an issue with downloading, say, a CMS? Should I be using something else (e.g., Python) in this case?

Python, and generally writing more complicated scripts than one command at a time with Wget, is another area in which I need to experiment more. For the end goals of which I could think this week, Wget seemed to work. Mostly.

At some point I would also like to write my own Zotero translator. I spent some time trying to think of the kind of item for which I’d like to write a translator (ironic, since I can think of plenty of times when the lack of one frustrated me), but haven’t settled on one at the moment. Likely, I will follow Sasha’s lead and write one for my own data pages about the claims that I’m researching. The first step in that process will, of course, be building said pages…

I also thoroughly enjoyed Julie Meloni’s tutorial on using APIs (in three parts: 1 2 3). It was written for someone at my level: an advanced (debatable) beginner and non-programmer.

That inspired me to get my own Google Maps API and do some very, very basic playing. By very basic, I mean taking Google’s basic map page and plugging in my own coordinates and API. Here is the result, focused in on my former place of employment, the Alamo.

As happened last week, learning these new tools helped me think more about what product I want coming out of this class. In developing my free-standing database site, I’d like to include maps–both for the aggregate data, as well as for each individual claim. For the aggregate data, I’m not sure yet what mapping application I would use–I look forward to the next two weeks, when we learn about what’s available and what works for what. For each individual claim, I would like to include maps for my four geographic fields: home port of ship (when applicable), place of incident, destination port, and destination of goods. Google Maps might be best for that, although the jury is still out.

A side note, along the lines of my individual data pages: I’d like to learn how to hide portions; e.g., the empty fields. If that is even possible. Hint, hint.

So, that’s where things stand. I look forward to learning more about mapping and APIs from Laura.

Presenting WordPress

This week for Clio 3, I’m presenting on WordPress–the platform on which I’m writing right now. As we’ll discuss in class, though, it’s so much more. WordPress is, indeed, a full content management system.

To give my classmates a preview of what I’ll be doing:

  • First, a Prezi (which you are free to browse) giving a bit of explanation of WordPress and its structure. The infographic that I tweeted shows things nicely; I’m going into a bit more detail.
  • Next I’m talking specifically about pages in WordPress, giving an explanation of them, and a brief overview of creating a custom page in your template. To do that, I’m going through the PHP of an individual page (including slight modifications I made), and using one of my own examples. I’ll show a bit of how we might do some more sophisticated things, and question if we want to do all of that to make one specific page… It may be useful for having more such pages, though.
  • I’ll also talk about important aspects of WordPress themes–including some rules that I’ve broken and things that I need to correct.
  • Finally, I’m talking about linking your own database into WordPress. As I discovered, there are some sophisticated ways to do this, even involving two plugins. In the end, though, I went for what might be a simpler way, one that did not involve copying my extant database. Why? I didn’t want there to be an extra copy, and I knew how to link to the copy that I already have on my server. Plus, I like having the self-contained database. This involved creating a special type of page and putting in the PHP that I did for my basic listing of claims (original). Here’s what the same page looks like placed into WordPress. I’ll show you how I did that in my presentation (here’s the code, with my login info redacted).

Since Sasha will also be presenting (on Omeka), the presentation will be short, and thus general. What I learned in preparing this presentation: you can do a lot with WordPress, and since it’s so widely used, people have done a lot with it. So my goal is to give everyone an idea of its basic structure, and show a couple of small things you can do with it. I hope this will then help everyone to play with it on their own.

If there is anything else someone wants to know, please don’t hesitate to comment. I’m working through the day but will at least try to touch upon it.

And thus, I will have done my two presentations for the class. Since I was silly enough to schedule my presentations two weeks in a row, with a West Coast trip between them, I got behind in a couple of other things; thankfully, I am off work again on Monday and have no travel planned until Thanksgiving. I’ve finally normalized most of my database, and added everything into the joiner tables. It’s now ready to take data entry… once I get a more sophisticated form up and running. For now, I have this, which populates one of my tables. My next goal: using Sasha’s excellent tutorial, create a complete data entry form.

In the next couple of weeks I will also be creating tutorials based on my presentations for Programming Historian, and contributing resources (thank you Erin for setting this up!) to the class site.

So that is where I am. This time, the presentation will be a lot shorter (I promise!), and I will not be running on 2.5 hours of sleep, and a full day of work (including a presentation to a board committee), before it!

See everyone in class.

It’s 3 a.m. … Do you know where your CSV columns are?

Tomorrow, or technically today, I’m presenting in Clio 3 on Data Manipulation.

As Professor Gibbs and I defined it on Monday, my presentation on this potentially broad topic is twofold:

  • Using SQL commands in PHPMyAdmin to merge and split fields; e.g., merge or split names;
  • Using PHP to switch a CSV file’s date format into an acceptable one for input into a MySQL database.
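For a taste of the first bullet, the commands look something like this–a hedged sketch in which the table and column names are all hypothetical, using MySQL’s CONCAT and SUBSTRING_INDEX functions:

```sql
-- Merge two name fields into one:
UPDATE people SET full_name = CONCAT(first_name, ' ', last_name);

-- Split a "Last, First" field on the comma:
UPDATE people
SET last_name  = SUBSTRING_INDEX(full_name, ',', 1),
    first_name = TRIM(SUBSTRING_INDEX(full_name, ',', -1));
```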

The first is one with which I feel rather comfortable, and ready to present.

The second, on the other hand… I spent a few hours last night dealing with that (and a bad allergy attack), and I’ve spent all evening tonight on it. After a lot of trial and error, I have much of it working. I can get the file open, and even write back into it. The problem is the middle–switching the order of the dates.

Here is what I have:

The middle parts are the problem.

I am most thankful for this blog post by Evan Cordulack, an American Studies graduate student at William & Mary; after looking at many sites that each gave me part of what I needed, his post helped me crystallize most of it.

I tried a few different things: getting slightly familiar with PHP functions (via these two posts that gave functions for changing the order of numbers), and using Sasha’s code for her form. The latest version (as posted below, next to the original) reflects Sasha’s code (thanks for going over it with Megan and me on Monday! Hey, look, alliteration!).

I get the feeling that part of my issue is trying to change the data in just one column. Here’s what arouses my suspicions: I get a variation of the jumbled data each time I try.

So… I’ve reached a point where I’m not sure what else to do. There’s something that I’m clearly missing here. Since I’m having too hard of a time figuring out what I’m doing wrong in that middle part, I’m writing this post. Any suggestions are most appreciated.
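In the meantime, here is a sketch of one approach that might work–this is not my actual code from the screenshots, and the function name, column index, and file paths are all hypothetical:

```php
<?php
// Read a CSV, flip one column from MM/DD/YYYY into MySQL's YYYY-MM-DD
// format, and write the result to a new file.
function convert_date_column(string $csv_in, string $csv_out, int $col): void {
  $in  = fopen($csv_in, 'r');
  $out = fopen($csv_out, 'w');
  while (($row = fgetcsv($in)) !== false) {
    $parts = explode('/', $row[$col]);       // e.g. "12/31/1835" -> ["12", "31", "1835"]
    if (count($parts) === 3) {
      // reassemble as YYYY-MM-DD, zero-padding month and day
      $row[$col] = sprintf('%04d-%02d-%02d', $parts[2], $parts[0], $parts[1]);
    }
    fputcsv($out, $row);                     // rows without a full date pass through untouched
  }
  fclose($in);
  fclose($out);
}
```

Because only the one column is touched, anything that comes out jumbled should point to the date field itself rather than to the rest of the row.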

Dr. Gibbs–if I am able to get off work early (a big if), will you be around? Otherwise, may I make figuring this out part of my presentation? 🙂

This is what the CSV file originally looked like. I stripped out everything else except for the case number.

Eeep. Other times, it’s changing my initial numbers.

And now, it is done…

At least for now. At least for the sake of Dr. Petrik’s gradebook. You can see my final assignment, “Santa Anna Goes to Washington.”

There is still more that I would like to do. In spite of Geoff and Sheri’s helpful advice, I never got around to learning how to make an image map, so my map is not clickable as I had originally planned. I simply ran out of time with the content. Nonetheless, it is here for all to see.

Overall, I’m happy that I worked with Omeka, as it will help me to build upon this site in the future. Part of me wished that I had worked with regular HTML and CSS for the sake of the class assignment, as I would have needed to do less tinkering, but in the end, I was happy with the flexibility to add more pages and objects. As I continue on my overall project, I will continue to add objects and information. I’m also excited to learn more about using PHP in Clio 3 this fall.

But for now, I am going to sleep.

To all of my classmates and Dr. Petrik, thank you for a great semester. I have learned a lot, and have particularly enjoyed getting to know a dynamic, intelligent, and nice bunch of fellow historians and art historians. Thanks to everyone for your help this semester. I will look forward to continuing to learn from, and with, all of you.

Older posts