Does Asphalt Art Improve Traffic Safety?

The safety statistics presented at a recent Tempe, AZ neighborhood meeting sounded impressive. The speaker was proposing “asphalt art” — painting designs on the intersection and crosswalks as in Figure 1 — for a neighborhood intersection, and stated that studies had shown that installing such art led to:

  • A 50% decrease in crashes involving pedestrians and cyclists

  • A 37% decrease in crashes involving fatalities or injuries

  • A 17% decrease in the total crash rate.

Figure 1. Asphalt art at the intersection of McAllister Avenue and La Jolla Drive in Tempe, AZ.

To me, though, the statistics sounded a bit too impressive. Would painting a crosswalk really cause such dramatic decreases in crash rates? When I hear the words “studies show,” I want to know: What studies? Who conducted them, how were the data collected, and was the analysis done properly?* After an internet search, I found the source of these numbers: the “Asphalt Art Safety Study” (Schwartz, 2022), done as part of the “Asphalt Art Project” by Bloomberg Philanthropies.

Let’s look at what they did, and then see how the results might apply (or not) to Tempe. For statistics teachers reading this post, the statistical methods I will be using are typically taught in a Stat 101 class so this data set might form the basis for a class activity. You can download my spreadsheet, which contains the data copied from the report, here.

The Asphalt Art Safety Study appears to be a carefully done study by a reputable organization. Importantly, they provide a full description of the methodology used to collect the data and calculate the statistics.

The researchers located and reviewed 150 sites that had installed asphalt art, and stated (Schwartz, 2022, p. 13): “Of those, 17 sites were selected that met all of the below criteria while offering a diverse array of project types, geographic locations, and neighborhood contexts.” The criteria included having known installation dates, being at a stop- or signal-controlled intersection or at a mid-block crossing that is open to vehicle traffic, and having detailed public-use crash data available for the site for at least 12 months before and 12 months after the installation.

This was thus a purposively chosen sample; it is not necessarily representative of asphalt art projects that have been installed across the country. The report does not give information about sites that met the criteria but were not included. All of the sites were in Florida, Georgia, Massachusetts, New Jersey, and New York.

Once the researchers selected the sites, they obtained data on different types of crashes from each state’s open data system (all data sources are given in the Appendices of the report). The data included:

  • Location of the intersection containing the asphalt art

  • Type of art (nine sites had crosswalk art alone, six had roadway art alone, and two had both types)

  • Year in which the art was installed (between 2016 and 2019)

  • Information on site setting (urban core, neighborhood commercial, neighborhood residential, suburban) and facility type (intersection with traffic signal, intersection with stop sign, mid-block with crosswalk)

  • Number of months of crash data before the art installation

  • Number of months of crash data after the art installation

  • Number of crashes of all types in the time period before installation

  • Number of crashes of all types in the time period after installation

  • Number of crashes involving pedestrians or cyclists (called “vulnerable users”) in the time period before installation

  • Number of crashes involving pedestrians or cyclists after installation

  • Number of crashes leading to fatality or injury before installation

  • Number of crashes leading to fatality or injury after installation

Because the sites were followed for different numbers of months, Schwartz (2022) calculated the average number of crashes per year for each time period, site, and type of crash. For example, the site in St. Petersburg, FL had 18 crashes in the 52 months before installation and 13 crashes in the 39 months after installation, so the St. Petersburg crash rate was calculated as 18*12/52 = 4.15 crashes per year in the Before period and 13*12/39 = 4 crashes per year in the After period.
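The annualization step is simple enough to check by hand, but here is a minimal sketch of it in Python. The function name crashes_per_year is my own label, not something from the report; the inputs are the St. Petersburg counts quoted above.

```python
def crashes_per_year(crashes, months):
    """Convert a crash count observed over `months` months to a yearly rate."""
    if months <= 0:
        raise ValueError("months must be positive")
    return crashes * 12 / months


# St. Petersburg, FL: 18 crashes in 52 months before, 13 crashes in 39 months after.
before_rate = crashes_per_year(18, 52)
after_rate = crashes_per_year(13, 39)
print(round(before_rate, 2), round(after_rate, 2))  # 4.15 4.0
```

Putting both periods on a per-year scale is what makes sites with different observation windows comparable.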

Analyzing the Data

The first step in any data analysis is to graph the data. Figure 2 shows the average crash rates for the Before and After periods for the 17 sites. As you can see, only the site at the top (Atlanta) has more than a handful of crashes in either period. It saw an increase in the crash rate after the installation; most of the other sites either stayed the same or declined slightly.

Figure 2. Average number of crashes per year at each site, before and after the asphalt art installation. The line at the top is for the site in Atlanta.

But was there a real change in crash rates after the art installations? After all, the number of crashes at any location fluctuates from year to year. Could these changes simply be due to random variation?

We can use statistical methods from Stat 101 to look at that. Specifically, a paired t test looks at whether the average (Before minus After) difference across the sites is consistent with the hypothesis that there was no change in the average crash rate after the art installation. The mean difference is 0.6 crashes per year. We get a t statistic of 1.1, corresponding to a two-sided p-value of 0.3. In other words, if the true underlying means are the same, we would expect a difference this large to occur just by chance about 30 percent of the time. If you prefer confidence intervals, the 95% confidence interval for the mean (Before minus After) difference is [–0.58, 1.77]. This interval is fairly wide, and includes 0, again confirming that the observed pre-post difference is entirely consistent with random fluctuations.**
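For readers who want to see the mechanics, the paired t calculation can be sketched in a few lines of Python. This is an illustration only: the five pairs of numbers below are made up and are NOT the report's data, and the function name paired_t_summary and its t_crit argument are my own choices. The critical value is passed in by hand (2.776 for 5 sites, 2.120 for 17 sites at the 95% level) so that the sketch needs nothing beyond the standard library.

```python
import math
import statistics


def paired_t_summary(before, after, t_crit):
    """Mean (Before minus After) difference, t statistic, and confidence interval.

    t_crit is the critical value of the t distribution with n - 1 degrees of
    freedom (for a 95% interval: 2.776 when n = 5, 2.120 when n = 17).
    """
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(n)  # stdev uses the n - 1 divisor
    t_stat = mean_diff / std_err
    ci = (mean_diff - t_crit * std_err, mean_diff + t_crit * std_err)
    return mean_diff, t_stat, ci


# Hypothetical crashes-per-year values for five made-up sites (NOT the report's data).
before = [4, 2, 3, 5, 1]
after = [3, 2, 4, 3, 1]

mean_diff, t_stat, ci = paired_t_summary(before, after, t_crit=2.776)
print(round(mean_diff, 2), round(t_stat, 2), [round(x, 2) for x in ci])
```

To reproduce the numbers in this post, one would substitute the 17 Before and After rates from the spreadsheet and use t_crit=2.120. A library routine such as scipy.stats.ttest_rel would also return the p-value directly.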

Thus, even though the mean crash rate was 17% lower after the art installations than before, this reduction is consistent with random variation seen in the data. There is no reason to believe that the asphalt art lowered the crash rates.

The reductions for the crashes involving vulnerable users and those involving fatality or injury also fail to be statistically significant, with p-values of 0.36 and 0.14, respectively. There are small numbers of such crashes at most sites, so one needs to look at the statistical significance or confidence interval to assess percentage decreases. After all, a crash rate is reduced by 50 percent if there are two crashes one year and one crash the next, but that difference could well be due to year-to-year variation.

Could This Be Regression to the Mean?

The before-after differences in this study fail to meet the threshold for statistical significance. Even if they were significant, however, would that show that installing asphalt art causes a reduction in crashes? You, dear reader, already know the answer to that question, since correlation does not imply causation.

There is an additional potential problem with generalizing from the set of sites that have decided to install asphalt art to other sites, and it is related to the statistical concept of regression to the mean. Suppose that we want to study a new blood pressure medication, and we enroll subjects who have untreated blood pressure exceeding 140/90. The people are then given the medication. Also suppose that in reality, the medication has no effect. Even though the medication is ineffective, we would expect the average blood pressure of the study participants to be lower when measured post-treatment. This phenomenon occurs because blood pressure varies from day to day. Some of the people who were included in the study had an unusually high value (compared to their typical blood pressure) on the day of the first measurement and we might expect their blood pressure to drop for the second measurement. But the study dataset does not include people with blood pressure below 140/90 on the first day who might, because of the random variation, have blood pressure above 140/90 on the second measurement; as a consequence, we expect the mean for the group to decrease.*** One avoids the regression-to-the-mean problem in clinical studies by having a randomized control group chosen with the same criteria. We would expect the blood pressure in both groups to drop on the second measurement, but we can compare the reduction in the group taking the medication with that in the group taking a placebo or standard treatment.
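A small simulation can make the blood pressure example concrete. The sketch below is built entirely on assumptions of my own: it looks only at systolic pressure, gives each simulated person a typical value drawn from a normal distribution with mean 130 and standard deviation 10, and adds independent day-to-day noise with standard deviation 10 to each reading. The "medication" does nothing at all.

```python
import random
import statistics


def simulate_regression_to_mean(n_people=100_000, threshold=140.0, seed=1):
    """Mean first and second readings for people selected on a high first reading."""
    rng = random.Random(seed)
    first_selected = []
    second_selected = []
    for _ in range(n_people):
        typical = rng.gauss(130.0, 10.0)         # person's long-run level (assumed)
        first = typical + rng.gauss(0.0, 10.0)   # reading used for enrollment
        second = typical + rng.gauss(0.0, 10.0)  # later reading; no treatment effect
        if first > threshold:                    # enroll only the high first readings
            first_selected.append(first)
            second_selected.append(second)
    return statistics.mean(first_selected), statistics.mean(second_selected)


first_mean, second_mean = simulate_regression_to_mean()
print(round(first_mean, 1), round(second_mean, 1))
```

Under these assumptions the enrolled group's average falls noticeably on the second reading even though nothing was done to anyone, which is exactly the pattern a control group is designed to expose.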

How does regression to the mean relate to the asphalt art data? I do not know why the cities in the Schwartz (2022) dataset chose to install art projects at these particular sites, but often a city takes action at an intersection after there is a problem. The city might target the intersections with the highest crash rates for action, or might install an art project after a severe accident occurs there. If that is the case, then the sites selected for asphalt art projects tend to be an extreme group, much like the people chosen for the study because of their high blood pressure readings, and would be expected to have a reduction in crashes post-intervention simply because of regression to the mean.

In fact, it was a serious crash that prompted consideration of the Tempe intersection at College Avenue and Alameda Drive for an asphalt art project. On February 14, 2023, there was a tragic crash at the intersection in which a bicyclist was killed after running a stop sign. Crashes involving fatalities are all too common in the aggregate (the National Highway Traffic Safety Administration estimated that about 43,000 people died in motor vehicle crashes in 2021) but are rare at any particular minor intersection. Thus, if asphalt art were to be installed at College and Alameda, before-and-after data would very likely exhibit a drop in fatal and injury crashes, simply because it was the occurrence of a rare event at that intersection — the fatality crash — that motivated the choice of that site for an intervention.

What Does the Federal Highway Administration Say?

The Federal Highway Administration (FHWA) defines the standards for traffic control devices and roadway markings. In the past, the FHWA has ruled against crosswalk art (see Lewis, 2016), arguing that bright colors and patterns “would clearly degrade the contrast between the white transverse crosswalk lines and the roadway pavement” and would thus make intersections less safe.

The 11th edition of the Manual on Uniform Traffic Control Devices (MUTCD) was published in December 2023. Part 3 of this manual deals with pavement markings. The purposes of crosswalks are to provide guidance to pedestrians and to help alert drivers to the possible presence of pedestrians. All crosswalk markings are to be white. Section 3H.03, new in the 11th edition, now allows for aesthetic pavement treatments, but emphasizes that these are not to interfere with traffic control devices, create confusion for pedestrians with vision disabilities, or in any way “inhibit users from crossing the street in a safe and efficient manner.” The MUTCD examples of aesthetic pavement treatments feature unobtrusive designs such as a red brick pattern within the white lines defining a crosswalk. The manual states that the aesthetic pavement treatments “should not contain pictographs, illustrations, or symbols.”

The Supplemental Summary of Dispositions for Final Rule Changes states FHWA’s longstanding position on aesthetic pavement treatments (p. 164): “these treatments, which are intended to draw the attention of the road user, can distract from the task of operating a vehicle or crossing the roadway as a pedestrian, and that many of the goals of an agency installing these treatments can be accomplished through other means that do not alter or compromise the uniform appearance of traffic control devices.” The Federal Register entry dealing with the change stated that commenters had disparate views on the merits of aesthetic pavement treatments, but were in general agreement on the “lack of research or safety data, positive or negative, to support the proposed provisions on aesthetic surface treatments; how individuals with vision disabilities are impacted by different surface treatments with varying colors or patterns; and concerns with machine vision and driving automation systems’ ability to detect and process nonuniform aesthetic treatments” (p. 87684).

The report by Schwartz (2022) cannot answer these research questions.**** The report states that all crashes, crashes involving pedestrians and cyclists, and crashes involving fatality or injury decreased in the period following installation of asphalt art, but none of those decreases are statistically significant. The lack of a comparison group — sites that are similar to those studied, but without the intervention — limits the conclusions that can be drawn. Even if the before-after difference were statistically significant, such a reduction could quite plausibly be attributed to regression to the mean. A randomized study would be needed to evaluate effects of pavement art on crashes, and more research is needed on how it might affect persons with visual impairments and self-driving cars. 

Details on high severity crashes from 2012 through mid-2023 in Tempe can be downloaded from the Tempe Data Catalog. The intersection of College and Alameda appears 12 other times in the Tempe dataset, with four crashes involving minor or possible injuries, and eight involving no injury. By contrast, there are three intersections less than a mile from College and Alameda that each have more than 700 crash records in the dataset. More than 500 intersections have more crash records in the Tempe dataset than College and Alameda. If the goal is to reduce crashes, it would likely be more effective to focus resources on intersections with large numbers of crashes, and to use safety measures that are backed up by data.

Copyright (c) 2024 Sharon L. Lohr

Footnotes

*See my list of eight questions to ask to judge the quality of a statistic in chapter 7 of Measuring Crime: Behind the Statistics.

**If we are concerned about the assumption needed for the t test that the differences are normally distributed, we can do a randomization test (Box et al., 1978) in which we compare the observed average difference to that from a randomization reference distribution obtained by randomly assigning the two observations from each site to the “Before” and “After” groups. That test gives a p-value of 0.32 — almost the same as the p-value from the t test. One could also assign different weights to sites based on number of months of observation, or adjust results for vehicle traffic at the sites (if one had that information), but it is unlikely that any of these variations would result in different conclusions for this example. Sometimes the simplest statistical tools are all you need to answer a question.
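The randomization test described above can be sketched as follows. Swapping the “Before” and “After” labels within a site just flips the sign of that site's difference, so this version enumerates every possible pattern of sign flips rather than sampling them at random; that is my own simplification and is feasible here because 17 sites give only 131,072 patterns. The function name and the five-site data are hypothetical and are NOT the report's data.

```python
from itertools import product


def randomization_p_value(before, after):
    """Two-sided p-value from all within-site swaps of the Before/After labels."""
    diffs = [b - a for b, a in zip(before, after)]
    observed = abs(sum(diffs))
    at_least_as_extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        if flipped >= observed - 1e-12:  # small tolerance for floating-point rates
            at_least_as_extreme += 1
    return at_least_as_extreme / total


# Hypothetical crashes-per-year values for five made-up sites (NOT the report's data).
before = [4, 2, 3, 5, 1]
after = [3, 2, 4, 3, 1]
print(randomization_p_value(before, after))  # 0.75
```

Comparing sums rather than means gives the same answer, because every relabeling has the same number of sites.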

Note that Schwartz (2022) properly included all sites in the analysis, even though it appears that Atlanta has a different pattern than the others. He noted that Atlanta had redevelopment of the area around the intersection at about the same time the art was installed, resulting in a tripling of bicycle traffic and an 18% increase in vehicle traffic. It would, however, be a statistical no-no to discard Atlanta from the data on that basis, because it is possible that other sites also had traffic changes that were unknown to the researchers. If one wanted to exclude sites with simultaneous interventions of other types, that should have been one of the inclusion criteria specified before data were collected.

The differing periods of data collection mean that other external factors may have influenced some sites. For example, traffic patterns everywhere changed during the early months of the COVID pandemic, which overlapped with the “After” period for some of the sites.

***Regression to the mean for before-after studies occurs in a steady-state situation when individuals with unusually large or small values are selected for the study sample, and the before and after measurements are imperfectly correlated. The closer the correlation is to zero, the larger the regression-to-the-mean effect. For example, suppose 300 people flip a coin, and those flipping heads are chosen for the study sample. We then ask the study sample to flip a coin again. We would expect to see about half heads and half tails on the second flip. On the second flip, the mean of the study sample has regressed all the way to the population mean because the two coin flips are uncorrelated.

Kahneman (2011) gives an excellent description of regression to the mean, with examples including the “Sports Illustrated Jinx” (athletes featured on the cover tend to have worse performance the following year).

Note that a number of statistical methods have been proposed that attempt to account for regression-to-the-mean effects in before-after studies, but many of these require additional data from sites with characteristics similar to those in the study.

****The sites in the Schwartz (2022) study, in addition to being more than 1,800 miles from Tempe, also have different types of art than proposed for the Tempe intersection. The pictures in Appendix C show relatively sedate designs, mostly in crosswalks, not the kinds of bright patterns and figures seen in Figure 1.

References

Box, G.E.P., Hunter, J.S., and Hunter, W.G. (1978). Statistics for Experimenters. New York: Wiley.

Kahneman, D. (2011). Thinking, Fast and Slow. New York: Farrar, Straus and Giroux.

Lewis, D. (2016). The Federal Highway Administration Says Stop to Crosswalk Art. Smithsonian Magazine. https://www.smithsonianmag.com/smart-news/federal-highway-administration-says-no-crosswalk-art-180958149/

Schwartz, S. (2022). Asphalt Art Safety Study. New York: Bloomberg Philanthropies.

Sharon Lohr