Revealing History with Chronicling America — K-12 Winner of the Chronicling America Data Challenge
An Introductory Note from the Teacher
The 2016 Digital APUSH work was inspired by a crowdsourced history project begun by the United States Holocaust Memorial Museum in 2016, in which "citizen historians" search sources such as newspapers.com for nationwide coverage of the Holocaust. Like these volunteer researchers, students in the 2016 AP U.S. History class searched for specific news items within a large collection of digitized historical newspapers. Using the Library of Congress' Chronicling America database, APUSH students looked for patterns of coverage, or lack of coverage. One goal of this work, like that of the Holocaust Museum's research, was to uncover geographic differences among newspapers and to speculate as to why they exist. While the ongoing Holocaust project requires its researchers to read and evaluate stories, the APUSH work relied on word frequency analysis, a kind of distant reading made possible by the advanced search features of Chronicling America's user interface. This AP U.S. History work dovetailed with a Library of Congress data challenge promoting exploration of Chronicling America through the Library of Congress' API (application programming interface). In the spirit of the Challenge, some searches were also carried out using the database's documented page search parameters. Additionally, one student used HTTrack website copying software to download OCR data files and indexing files directly to his computer.
An Investigation of Plessy v. Ferguson
Classwork, which began in May following the AP U.S. History exam, first explored regional differences in coverage. To do so, students organized web data for the database's 2063 searchable newspapers into a spreadsheet (above) and then assigned regional designations. Because the number of newspapers from each state varies, students also calculated the percentage of overall newspapers within each region, along with a percentage of total issues published (see spreadsheet, By Region, column 7). This information about the database's holdings helped students determine the relative importance of historical events. One event, investigated by the entire class, was the 1896 Supreme Court case of Plessy v. Ferguson. As seen in the spreadsheet and data visualization (right; created by Lynnsey, Maryann, and Meg), interest was high in Louisiana, as might be expected. Interest also extended beyond the South, but not to the Northeast. Curiously, although at the time of the search Chronicling America contained 32 Northeastern newspapers that were operating in 1896, none covered the case during that year.
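The regional percentages described above come down to a simple proportion: each region's count divided by the overall total. A minimal Python sketch follows. The helper name `regional_shares`, the region names, and the counts are placeholders for illustration, not the class's spreadsheet data.

```python
def regional_shares(counts_by_region):
    """Return each region's percentage of the overall total."""
    total = sum(counts_by_region.values())
    if total == 0:
        raise ValueError("no newspapers counted")
    return {region: 100 * n / total for region, n in counts_by_region.items()}


# Placeholder counts for illustration only (not the real holdings).
example_counts = {"Northeast": 50, "South": 100, "Midwest": 30, "West": 20}

shares = regional_shares(example_counts)
for region, pct in shares.items():
    print(f"{region}: {pct:.1f}%")
```

The same function could be applied to issue counts instead of newspaper counts to get the second percentage mentioned above.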
The other findings and data visualizations on this page, which look at issues irrespective of region, are the work of small groups of student historians within the class who investigated topics of personal interest.
Investigating Secession — by Andrew, Connor, and Miles
We performed an advanced search to determine how often the keywords “secede” and “secession” were used during a 30-year period extending from before to after the Civil War. This investigation was designed to determine the points at which the South seemed most likely to break from the Union and form a confederacy. The data, compiled using Google Sheets, was graphed with Plotly as a line graph showing the changes in frequency in both the North and the South. Frequency peaked during the Compromise of 1850 and during Lincoln’s election, but the word usage continued through the Reconstruction period. The interactive Plotly graph can be seen by clicking on the image of the graph.
Investigating Uncle Tom's Cabin — by Ben, Jessica, Katie, and Nate
We searched Chronicling America for the frequency of the phrase “Uncle Tom’s Cabin” from the time of the book’s publication to the start of the Civil War, or 1851 to 1861. We wanted to see the effect that the publication of Harriet Beecher Stowe’s novel might have had on the secession of the Southern states. The searches, which were done by year, specifically counted how many times the book title appeared on the front page of a newspaper. After counting how many times it appeared each year, we graphed the data using Google Sheets. The frequency peaked in 1854 with 150 front-page stories about the book. Since the novel is commonly viewed as one of the major causes of the war, it was surprising to see that the novel apparently lost some of its impact in the years immediately prior to the Civil War. For a second data visualization, we counted the number of front-page newspaper articles by state. Several states did not have any newspapers from that time period in the database (Alaska, Delaware, Rhode Island, Wisconsin, Wyoming, New Hampshire, and New Jersey). These states are shown with no color, as are the states that had newspapers in the database but no articles.
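A year-by-year, front-page phrase search like the one described above could also be expressed as a set of search URLs. The sketch below only builds the URLs and does not send any requests. The parameter names (`phrasetext`, `date1`, `date2`, `dateFilterType`, `sequence`, `format`) are assumptions based on Chronicling America's documented page search parameters and should be checked against the current documentation before use. The function name `front_page_phrase_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint for Chronicling America page searches.
BASE_URL = "https://chroniclingamerica.loc.gov/search/pages/results/"


def front_page_phrase_url(phrase, year):
    """Build a search URL for an exact phrase on page 1 within one year."""
    params = {
        "phrasetext": phrase,           # exact-phrase match (assumed name)
        "date1": year,                  # start of the year range
        "date2": year,                  # end of the year range
        "dateFilterType": "yearRange",  # treat date1/date2 as years
        "sequence": 1,                  # page 1 of each issue
        "format": "json",               # ask for JSON instead of HTML
    }
    return BASE_URL + "?" + urlencode(params)


# One URL per year from 1851 through 1861, matching the period studied.
urls = {
    year: front_page_phrase_url("Uncle Tom's Cabin", year)
    for year in range(1851, 1862)
}
print(urls[1854])
```

Recording the total hit count returned for each URL would give the per-year series that the group graphed, though a search hit is a matching page rather than a verified story about the book.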
Investigating Labor Unions — by Sawyer and Virgile
Using the Library of Congress’ Chronicling America newspaper database, we surveyed the newspaper coverage of labor unions across America. We searched by state, recording the number of hits each search produced along with each state’s primary zip code. Because the raw counts were heavily biased toward states with more newspapers in total, we produced more representative results by finding the percentage of the total newspapers per state that covered labor unions. We did so by dividing each state's results by the total number of newspapers within that state, and then multiplying by 100. We then entered this data into a choropleth graph on Plotly, using the percentages as the value variables and the state zip codes as the location variables. The choropleth graph showed that coverage of unions was most prevalent in California, Minnesota, Colorado, and Connecticut. For a second visualization, we used OpenHeatMap.com to display a choropleth graph of newspaper coverage of unions by the percentage of total pages printed that mentioned labor unions.
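The normalization described above, and the shape of the data a Plotly choropleth expects, can be sketched as follows. All numbers are placeholders, not the group's results, and `coverage_percent` is a hypothetical helper. One assumption worth flagging: Plotly's `USA-states` location mode expects two-letter state abbreviations, so the sketch uses those as the location values.

```python
import json


def coverage_percent(hits, total_newspapers):
    """Percentage of a state's newspapers that produced a hit."""
    return 100 * hits / total_newspapers


# Placeholder data: state abbreviation -> (newspapers with hits, total newspapers)
sample = {"CA": (12, 40), "MN": (9, 30), "NY": (5, 50)}

percentages = {
    state: coverage_percent(hits, total)
    for state, (hits, total) in sample.items()
}

# A figure description in Plotly's JSON form: one choropleth trace.
figure_spec = {
    "data": [{
        "type": "choropleth",
        "locationmode": "USA-states",      # two-letter state abbreviations
        "locations": list(percentages),    # location variables
        "z": list(percentages.values()),   # value variables
        "colorbar": {"title": {"text": "% of newspapers"}},
    }],
    "layout": {"geo": {"scope": "usa"}},
}
print(json.dumps(figure_spec, indent=2))
```

If the `plotly` package is installed, passing this dictionary to `plotly.graph_objects.Figure(figure_spec)` and calling `.show()` should render the map. Note how New York's placeholder count drops to the lowest percentage once its larger number of newspapers is taken into account.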
Investigating the KKK — by Lynnsey, Maryann, and Meg
Racial disagreement and violence increased with the founding of the KKK in 1865. Over time, the Klan's numbers increased, as did newspaper coverage of the organization. In this year's APUSH class, we used the Library of Congress' Chronicling America database to search for the occurrence of articles about the KKK during the period 1866-1876. The data was gathered only from the newspaper articles available in the database and does not fully represent nationwide coverage from the time period. Regardless, the data is a good representation of the growth of the KKK throughout the period. We used Plotly, an online data visualization tool, to make the graph. Plotly was also used to generate the time period maps which appear within the Google Slides (below). The graph can be viewed at the Plotly website by clicking on it.
Sunapee High is located in rural Sunapee, New Hampshire. The school is very small, with a total student population typically around 135. This year's APUSH class consisted of fourteen juniors and one senior, some of whom are pictured left being goofy:
Benjamin van Paassen