Concerned about Coronavirus? Where to Find the Data
Something unusual happened at a meeting I attended last week. During the breaks, every woman I saw in the restroom soaped up and washed her hands for 20 seconds. The Centers for Disease Control and Prevention (CDC) have advised this practice for years, but this was the first time I ever saw absolutely everyone do it. (Every woman, that is — I had no direct observation of what the men did.)
The reason for the improved hand hygiene practices, of course, was the publicity about coronavirus. Having good data about this, or any disease, is essential for public health. There is a lot of misinformation out there, and several readers have asked me where the data come from and how reliable they are. Here are some sites you can check to find the data:
The World Health Organization (WHO) posts daily updates for every member country in the Coronavirus disease (COVID-2019) situation reports. The report for February 27, for example, gives the cumulative number of worldwide confirmed cases (82,294, of which 78,630 are in China) and deaths (2,804, of which 2,747 are in China), the number of new cases and deaths in the past day, and other statistics. The statistics are reported by member countries; the WHO provides standards and guidance for surveillance and reporting statistics.
The CDC publishes statistics on confirmed cases in the United States. The page also has links to more information about the virus, guidance for travelers and healthcare professionals, and other resources.
The European Centre for Disease Prevention and Control (ECDC) presents statistics on cases in the European Union, updated daily.
Johns Hopkins University's desktop and mobile apps display maps of confirmed cases. Their data come largely from the WHO, CDC, ECDC, and National Health Commission of the People’s Republic of China, but are presented in more digestible form with greater geographical detail. Click on a country to see the number and geographic distribution of cases and deaths.
Although these are the best data available, all estimates about COVID-19 currently carry substantial uncertainty. This is particularly true for estimates of the number of cases and of the fatality rate.
There are two types of mortality statistics. The population mortality rate for a cause of death is calculated as (number of persons who have died from that cause) divided by (number of persons in the population). The CDC FastStats about influenza lists the 2017 population mortality rate as 2 fatalities per 100,000 population. This statistic was calculated by dividing 6,515 (the number of death certificates in 2017 listing influenza as cause of death) by 326 million (the U.S. population in 2017).
The denominator for a population mortality rate is pretty accurate — most countries have excellent statistics on number of residents. Uncertainty about a population mortality rate comes from the numerator, the number of deaths from that cause. In the United States, the number of death certificates listing influenza as cause of death underestimates influenza-associated deaths, because many deaths caused by flu-related complications (or conditions such as congestive heart failure that are aggravated by flu) do not list influenza as a cause of death on the certificate. The CDC thus uses a mathematical model to estimate the number of influenza-associated deaths in the United States. The number of deaths associated with influenza in any particular year is usually reported as a range, to reflect the uncertainty in the estimate from the model. For the 2017-2018 flu season, the CDC estimated there were between 46,000 and 95,000 influenza-associated deaths — a much higher number than the statistic of 6,515 deaths (from death certificate information) on the FastStats page.
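To make the arithmetic concrete, here is a minimal Python sketch of the population mortality rate calculation, using only the figures quoted above. The function and variable names are my own, chosen for illustration. Applying the 2017 population of 326 million to the 2017-2018 flu season range is a simplifying assumption on my part, not a figure the CDC reports in this form.

```python
def population_mortality_rate(deaths, population, per=100_000):
    """Deaths from a cause divided by population size, scaled to a rate per `per` persons."""
    if population <= 0:
        raise ValueError("population must be positive")
    return deaths / population * per


US_POPULATION_2017 = 326_000_000  # denominator quoted in the text

# Numerator from death certificates listing influenza as the cause of death.
rate_from_certificates = population_mortality_rate(6_515, US_POPULATION_2017)

# Numerators from the CDC's modeled range of influenza-associated deaths.
# Assumption: the same 2017 population is reused as the denominator.
rate_model_low = population_mortality_rate(46_000, US_POPULATION_2017)
rate_model_high = population_mortality_rate(95_000, US_POPULATION_2017)

print(f"Death certificates: {rate_from_certificates:.1f} per 100,000")
print(f"CDC model range: {rate_model_low:.1f} to {rate_model_high:.1f} per 100,000")
```

The denominator is identical in all three calculations. Only the numerator changes, and that alone moves the rate from about 2 per 100,000 to somewhere between roughly 14 and 29 per 100,000.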
The statistic being reported for coronavirus is the case fatality rate, which is (number of persons who have died from the disease) divided by (number of confirmed cases of the disease). For coronavirus, the numerator is uncertain because deaths due to coronavirus may be attributed to another cause, and vice versa. But there is even greater uncertainty about the denominator. An infected person with mild symptoms who does not seek medical treatment will not be captured in the denominator. Confirming that a suspected case is actually COVID-19 can take days, and in many areas both test kits and laboratories that can test for the virus are in short supply. As of February 28, only 459 tests had been performed in the United States, so the reported number of cases excludes anyone who might have been exposed to the virus but has not been tested.
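The definition can be applied directly to the cumulative counts in the WHO situation report for February 27 quoted earlier. The sketch below is only an illustration of the formula: the function name is my own, and the resulting crude ratios are not estimates published by the WHO. They inherit every problem with the numerator and denominator described above.

```python
def case_fatality_rate(deaths, confirmed_cases):
    """Deaths divided by confirmed cases, expressed as a percentage."""
    if confirmed_cases <= 0:
        raise ValueError("confirmed_cases must be positive")
    return 100 * deaths / confirmed_cases


# Cumulative counts from the WHO situation report for February 27.
world_cases, world_deaths = 82_294, 2_804
china_cases, china_deaths = 78_630, 2_747

cfr_world = case_fatality_rate(world_deaths, world_cases)
cfr_china = case_fatality_rate(china_deaths, china_cases)
# Everything outside China is obtained by subtraction.
cfr_elsewhere = case_fatality_rate(world_deaths - china_deaths,
                                   world_cases - china_cases)

print(f"Worldwide:     {cfr_world:.1f}%")
print(f"China:         {cfr_china:.1f}%")
print(f"Outside China: {cfr_elsewhere:.1f}%")
```

The same formula gives noticeably different answers depending on which confirmed cases go into the denominator, which is one reason a single quoted rate should be read with caution.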
The case fatality rate currently being quoted for COVID-19 comes from a recent study of 72,000 medical records in China. The researchers reported an overall case fatality rate of 2.3%, but emphasized that this is a preliminary estimate from the data they had available. Factors associated with a higher case fatality rate, according to this report, are: being over age 80 (14.8%), having cardiovascular disease (10.5%), and having diabetes (7.3%). Persons with no underlying conditions had a case fatality rate of less than 1%. But all of these statistics are based on early, incomplete data, and no one knows how accurately they describe the wider population. In the study, 81% of the patients diagnosed with COVID-19 were classified as having mild cases. If many infected persons with mild or no symptoms are undiagnosed, and are thus missing from the denominator of the case fatality rate, then these early estimates of the case fatality rate will likely prove to be too high. At this stage, however, there is not enough information to tell.
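A small what-if calculation shows how much the missing mild cases could matter. The sketch below starts from the reported 2.3% and asks what the fatality rate among all infected persons would be if only some fraction of infections had been diagnosed. The fractions are hypothetical values I picked for illustration, not estimates from the study, and the calculation assumes that every death was counted and that no undiagnosed person died.

```python
REPORTED_CFR = 0.023  # overall case fatality rate reported in the study


def fatality_rate_among_all_infected(cfr, fraction_diagnosed):
    """Rescale a case fatality rate to all infections.

    If confirmed = fraction_diagnosed * infected, and all deaths occur among
    confirmed cases, then deaths / infected = cfr * fraction_diagnosed.
    """
    if not 0 < fraction_diagnosed <= 1:
        raise ValueError("fraction_diagnosed must be in (0, 1]")
    return cfr * fraction_diagnosed


# Hypothetical shares of infections that were actually diagnosed.
for fraction in (1.0, 0.5, 0.25):
    rate = fatality_rate_among_all_infected(REPORTED_CFR, fraction)
    print(f"{fraction:.0%} of infections diagnosed -> fatality rate {rate:.2%}")
```

Under these assumptions, halving the share of infections that are diagnosed halves the fatality rate. Which fraction is closest to the truth is exactly what the early data cannot tell us.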
For more advice on interpreting statistics about coronavirus, and on the myths and misinformation about the disease, I highly recommend the “This Week in Virology” podcast from the American Society for Microbiology. The February 9 episode discussed issues about the coronavirus statistics starting at minute 7, and myths and conspiracy theories being disseminated about the virus starting at minute 35.
So, please get your information from the trustworthy sources listed above. Don't click on e-mail links purporting to give information about the virus, don't trust news from social media, and don't fall for scams. The Better Business Bureau recently wrote about a scam in which fake online stores “sell” face masks because some reputable stores are sold out. But of course all the fake stores do is take your money. So do the scammers who advertise vaccines (none yet exists).
And please support the agencies and researchers who are working so hard to obtain and report accurate statistics and to project the trajectory of this illness. It’s easy to take statistical work for granted because it’s often in the background of daily life. But data collection and analysis will be the key to managing COVID-19.
Copyright (c) 2020 Sharon L. Lohr