Data and Display Decisions for the Hull-House Maps: Part 4
Part 1 of this series introduced the Hull-House Maps and Papers, a landmark of statistical graphics published in 1895. But statistics and graphics do not just magically appear: they are the result of careful choices about what data to collect and how to summarize and display them. This post explores how the decisions behind the Hull House maps reflected the influences and priorities of their time and also produced a landmark of statistical reasoning.
Which Data to Collect and How to Collect Them?
Congress had directed Commissioner of Labor Carroll Wright to record “occupations, earnings, sanitary surroundings, and other essential facts necessary to show the condition of residents.” But it was up to Wright and his team to decide which “essential facts” to gather and how to do the gathering.
Hull House resident Agnes Sinclair Holbrook described the data collection process:
The entire time of four government schedule men from the 6th of April till the 15th of July, 1893, was devoted to examining each house, tenement, and room in the district, and filling out tenement and family schedules, copies of which are printed at the beginning of this chapter. These schedules were returned daily to Mrs. Kelley; and before they were forwarded to the Commissioner of Labor in Washington, a copy was made by one of the Hull-house residents, of the nationality of each individual, his wages when employed, and the number of weeks he was idle during the year beginning April 1, 1892 (Holbrook, 1895).
The family schedule shown in Figure 1, which collected information about occupations, earnings, hours of work, sickness, and disabilities, closely followed the 1890 census form.* Using the same questions allowed the 1893 statistics from the Hull House neighborhood to be compared with Chicago-wide statistics from the 1890 census; it also saved Wright’s team the work of designing a new questionnaire.
The tenement schedule asked about sanitary conditions and other topics not on the 1890 census form. After tabulating the data, Wright (1894) reported that fewer than 3 percent of families in the Hull House neighborhood lived in a house with a bathroom; about one quarter lived in a house with a water closet or privy. As in Kelley’s 1892 sweatshop investigation, information on room dimensions was used to calculate the cubic feet of air available for each sleeping room occupant. Ventilation and cleanliness were rated subjectively as excellent, good, fair, or bad; about 70 percent of Chicago’s tenements were deemed to fall in the latter two categories.
How was the data collection done? Surveys in 2020 are carried out via multiple modes: face-to-face, on the internet, by telephone, by e-mail, or through postal mail. Some are conducted by an interviewer; in others, respondents provide answers directly on paper or electronic screen. In the 1890s, data were almost always collected by interviewers who visited the residence, and that was done for the 1890 census as well as this survey.** There was no other way to obtain information from the approximately one quarter (as determined by Wright, 1894) of Hull House neighborhood residents who were illiterate.
Which Data to Display
For the map-making, Kelley and Holbrook were limited to the information gathered in the federal tenement and family schedules in Figure 1. But they still had more than 60 questions to choose from. Why did they choose to construct only two sets of maps: the first on nationalities and the second on wages?
The 1890 census officially established Chicago as America’s “second city” — second only to New York in population. Chicago’s population had more than doubled in one decade, increasing from about 500,000 residents in 1880 to nearly 1.1 million in 1890. Much of that increase came from European immigrants: the 1890 census recorded about 450,000 foreign-born residents, compared with 200,000 in 1880. Nearly 80 percent of Chicagoans in 1890 had been born in a foreign country or were children of immigrants.
The near west side, where Hull House was situated, was home to many of these immigrants. It was, quite literally, a melting pot for people hailing from every corner of Europe. But how much melting was there, and what were the economic conditions of different ethnic groups? The nationalities map showed the degree of intermingling, and together with the wage map showed the relative poverty of different areas.
Jane Addams and Ellen Gates Starr had established the Hull House social settlement with the hope of finding solutions to some of the problems arising from inequality and poverty. It was natural that Kelley and Holbrook would want to study data on wages, and there was precedent for creating wage maps.
In 1889, Charles Booth had just published the first volumes of his monumental study Life and Labour of the People in London. Booth, a wealthy English businessman, was concerned about the social problems in London as well as the inadequacy of the statistical data available to document them. Booth wanted to know how many people were poor, and was skeptical of a statistic published in 1885 by the Social Democratic Federation (a forerunner of the British Socialist Party) that said 25 percent of the population of London lived in extreme poverty. He decided to launch his own investigation.
The data for Booth’s poverty maps (see Figure 2 for a map of East London) came from a convenience sample. In 1871 the London School Board had passed by-laws for compulsory school attendance and payment of school fees. School Board Visitors were hired to enforce attendance. They performed “house-to-house visitation” and attempted to collect details about every family with children of school age. The Visitors were the primary data collectors; Booth’s research team*** then interviewed the Visitors and tabulated the data obtained from them.
The Hull House Wage Maps
Holbrook (1895) modeled her wage maps (the full set is displayed in Part 1 of this series; Figure 3 above contains the westernmost map) after Booth’s, noting that the “great interest and significance attached to Mr. Charles Booth’s maps of London have served as warm encouragement.” But, from a statistical perspective, Holbrook’s maps improved greatly on Booth’s. Let’s look at some of the reasons why.
Data quality. Using conveniently available data saved Booth money and effort. But the price for those savings was paid in data quality. Although School Board Visitors had extensive experience in their districts, they were not professional data collectors, and the data quality undoubtedly varied from Visitor to Visitor. The data did not come directly from the residents, but were an interpretation of the conditions as seen by the Visitors. The Booth data were limited to families with school-age children — and not even all of those, since some parents eluded registration — who were not necessarily representative of the blocks studied.
By contrast, Holbrook had access to data collected by professional “government schedule men” supervised by Florence Kelley, herself an experienced and meticulous data collector. Although Holbrook (1895) acknowledged that “it is inevitable that errors should have crept in” an investigation as complex as this, she wrote that “the facts set forth are as trustworthy as personal inquiry and intelligent effort could make them. Not only was each house, tenement, and room visited and inspected, but in many cases the reports obtained from one person were corroborated by many others.”
Color coding. Holbrook kept Booth’s general scheme of color-coding, but made it precise and objective. In the legend to Booth’s map (Figure 2), the color black represented the “lowest class, vicious, semi-criminal.” Dark blue indicated “very poor” with casual earnings, pink was “fairly comfortable, good ordinary earnings,” and red areas were “middle-class” or “well-to-do.” On other Booth maps (not seen in Figure 2) the color gold, appropriately enough, represented the wealthy areas.
Now look at Holbrook’s legend in Figure 3. Her colors represent the total earnings per week of a family. Black denotes $5 and less, blue is $5 to $10, red is $10 to $15, green is $15 to $20, and gold is over $20. The color brown represents unknown wages: Holbrook does not ignore the missing data (as so many data collectors do, even today) but displays the families with missing data in the map, so that the readers can make their own imputations or judgments.
Holbrook (1895) also clearly described how she defined a “family” and how she calculated the average wage for each family: “first the number of unemployed weeks in each individual case was subtracted from the number of weeks in the year, the difference multiplied by the weekly wage when employed, and the result divided by fifty-two; then the amounts received by the various members of each family, thus determined, were added together, giving the average weekly income of the family throughout the year.”
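Holbrook's recipe translates directly into a few lines of arithmetic. The Python sketch below is only an illustration of the calculation she describes; the function names and the example family are hypothetical and do not come from the schedules.

```python
WEEKS_PER_YEAR = 52

def average_weekly_wage(weekly_wage_when_employed, weeks_idle):
    """One person's average weekly earnings over the year.

    Follows Holbrook's description: subtract the idle weeks from 52,
    multiply by the weekly wage when employed, and divide by 52.
    """
    weeks_worked = WEEKS_PER_YEAR - weeks_idle
    return weeks_worked * weekly_wage_when_employed / WEEKS_PER_YEAR

def family_weekly_income(members):
    """Add up the members' averages.

    `members` is a list of (weekly_wage_when_employed, weeks_idle) pairs.
    """
    return sum(average_weekly_wage(wage, idle) for wage, idle in members)

# Hypothetical family (invented numbers): one earner paid $9 a week but
# idle for 13 weeks, and a second earner paid $4 a week all year.
family = [(9.00, 13), (4.00, 0)]
print(family_weekly_income(family))  # 10.75
```

The first earner averages $6.75 a week over the year, so this hypothetical family's $10.75 would fall in the $10 to $15 category.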
Granularity. In the Booth maps, each city block consisted of one color. Holbrook’s maps had much finer geographic detail, showing the distribution of average wages in each house (although for a much smaller area of the city). She colored each house proportionately to the wage categories represented therein. A house with two families having average wage $7 and one family with average wage $12 would have two-thirds of the house’s area colored blue and the remaining third colored red. This allowed the reader to see clustering patterns of poverty within blocks and houses.****
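The proportionate coloring rule can be sketched in Python using the wage categories from Holbrook's legend. This is an illustration, not her procedure: the function names are hypothetical, and because the legend's categories share endpoints ($5, $10, and so on), assigning an income of exactly $10 to the lower category is an assumption made here.

```python
from collections import Counter
from fractions import Fraction

def wage_color(weekly_income):
    """Map a family's average weekly income (in dollars) to a legend color.

    None stands for unknown wages. Putting incomes that fall exactly on a
    boundary into the lower category is an assumption of this sketch.
    """
    if weekly_income is None:
        return "brown"
    if weekly_income <= 5:
        return "black"
    if weekly_income <= 10:
        return "blue"
    if weekly_income <= 15:
        return "red"
    if weekly_income <= 20:
        return "green"
    return "gold"

def house_color_shares(family_incomes):
    """Fraction of a house's outline given to each color, one equal share per family."""
    counts = Counter(wage_color(income) for income in family_incomes)
    total = len(family_incomes)
    return {color: Fraction(n, total) for color, n in counts.items()}

# The example from the text: two families at $7 and one family at $12.
print(house_color_shares([7, 7, 12]))
# {'blue': Fraction(2, 3), 'red': Fraction(1, 3)}
```

Each family gets an equal share of the house regardless of how many rooms it occupies, which is the property Holbrook herself pointed out to her readers.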
Interpretation. Holbrook’s maps require little explanation other than the description of proportionate coloring. She emphasized that her maps preserved the geographical relationships, but that they did not display the total number of families in a house. A family that occupied the entire ground floor would “receive equal recognition” with a family that was crowded into one room.
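A small numerical sketch shows why the share of map area in a color need not match the share of families in that wage group. The three houses below are invented for illustration, and the sketch assumes, purely for simplicity, that every house outline has the same area on the map.

```python
# Hypothetical houses: each list gives the legend colors of the families inside.
houses = [
    ["black"] * 6,      # a crowded house with six lowest-wage families
    ["blue", "blue"],   # two families in the $5 to $10 group
    ["gold"],           # a single family earning over $20
]

def family_share(houses, color):
    """Fraction of all families in the neighborhood that fall in `color`."""
    families = [c for house in houses for c in house]
    return families.count(color) / len(families)

def map_area_share(houses, color):
    """Fraction of colored map area in `color`.

    Assumes equal-sized house outlines, each divided equally among the
    families living there (the proportionate coloring rule).
    """
    per_house = [house.count(color) / len(house) for house in houses]
    return sum(per_house) / len(houses)

print(round(family_share(houses, "black"), 2))    # 0.67
print(round(map_area_share(houses, "black"), 2))  # 0.33
```

In this made-up neighborhood, two-thirds of the families are in the lowest wage group, but black covers only one-third of the colored area, because the crowded house counts no more than the house with a single family.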
The Hull House Nationality Maps
Figure 4 shows the first nationality map, corresponding to the same area as the wage map in Figure 3.
The nationality maps had an important difference from the wage maps: they displayed data for individual persons, not families. One would use different classifications if constructing a nationality map today (for example, Holbrook does not include Irish or Black persons in the “English-speaking” category, and she classifies non-Black children as “English-speaking” if over age ten or attending public school) but, as with the wage map, Holbrook was transparent about how she assigned nationality categories to persons in the neighborhood.
Holbrook’s (1895) notes on the maps indicated she had thought a great deal about how to display the data. She had also wanted to be able to display population density in the nationality and wage maps, but did not see how to include that information along with the house-by-house statistics and decided to prioritize the location data. One must read her notes to learn of the extreme overcrowding in the area, such as the “sixty men [who] sleep every night in one basement room at No. 133 Ewing Street.”
She referred the reader to Wright’s (1894) 600+ page volume of statistical tables for information on statistics that she could not fit into the maps. And indeed, Wright’s publication tabulated the number of families and individuals by tenements to a house and rooms to a tenement. But the tables are difficult to interpret and their information cannot be related to the information in other tables in the publication. The Hull House maps displayed patterns and information about clustering of nationalities, concentration of poverty, and the relationship between the two, that could not be gleaned from any of Wright’s tables:
But the partial presentation here offered is in more graphic and minute form; and the view of each house and lot in the charts, suggesting just how members of various nationalities are grouped and disposed, and just what rates of wages are received in the different streets and sections, may have its real as well as its picturesque value. A comparison of the two sets of outlines may also be of interest, showing in a general way which immigrants receive the highest, and which the lowest rates, and furnishing points for and against the restriction of immigration. (Holbrook, 1895)
Holbrook hoped that the maps would be of value “not only to the people of Chicago who desire correct and accurate information concerning the foreign and populous parts of the town, but to the constantly increasing body of sociological students more widely scattered.” Looking at the maps 125 years later, I think her hope was realized.
Next: Statistical Legacy of the Hull House Maps and Papers
Copyright (c) 2020 Sharon L. Lohr
Footnotes
*As discussed in Part 2, in the 1890s it would have been automatically assumed that any data used for official statistics had to be collected as a census; this was before the theory of probability sampling had been developed.
Unlike the 2020 census, in which families are asked to provide their own responses to questions, the 1890 census was collected by enumerators who went door to door and entered the information on the census schedules (Gauthier, 2002). It typically took years of laborious hand calculations to tabulate the results from a census. The 1890 census was the first to be tabulated by machine. Data from the very long census questionnaire were entered on punched cards and run through Herman Hollerith’s tabulating machine.
**Wright strongly advocated using fixed schedules for data collection, as opposed to letting the interviewers ask open-ended questions. He wrote: “The information under any investigation is usually collected on properly prepared schedules of inquiry in the hands of expert special agents, by which means only the information which pertains to an investigation is secured. Rambling and nebulous observations which would be likely to result from an investigation carried on by inquiries not properly scheduled are thus avoided” (Wright, 1904).
This preference led him to exclude questions about residents’ backgrounds (other than birthplace and literacy), their opinions about their living or working conditions, or their aspirations or ideas for how their situation might be improved: “Those looking to theoretical conditions or psychological elements must be avoided to a certain extent, for two reasons: Because they will not result in any satisfactory information, and because to carry them out is altogether too expensive; as, for instance, inquiries looking to the causes why people are found in the slum districts of cities, what brought them there, the experience in life which leads to such a residence, and all such questions are too vague for the application of the Statistical method, and although a great number of opinions and varied views might be obtained, the result would be far from satisfactory and would not compensate for the expense involved” (Wright, 1894, pp. 12-13).
***It helped that the wealthy Booth could afford to hire a research staff to interview the School Board Visitors and tabulate the data. Many of the researchers on Booth’s staff were associated with Toynbee Hall, the London settlement house that had served as the model for Hull House. Bales (1991) noted, however, that the School Board Visitors, who had collected the data, were not paid for the time during which they were interviewed by the research staff.
****The maps did not, however, allow the reader to infer the relative proportions of families in the entire neighborhood in different wage groups because houses contained different numbers of families. If, as might be supposed, lower-wage families lived in more crowded conditions, the houses colored black would be expected to contain more families on average than those colored gold, and the overall percentage of area colored black on the maps would be expected to be less than the overall percentage of families in the lowest wage group.
References
References are found at the end of the last part of this series.