How to Ignore More Than a Century’s Worth of Scientific Literature and Make Headline-Grabbing Claims on One of the Deadliest and Most Exhaustively Studied Pandemics in History Based on Flawed Data and Questionable Analysis
The 1918 influenza pandemic, also erroneously referred to as the “Spanish flu,” affected up to one billion people, half the world’s population at the time, and killed an estimated 20–30 million people in the Western world and God-only-knows-how-many people in other countries. Previous and subsequent influenza pandemics usually hit infants and the elderly the hardest. The 1918 influenza pandemic was a weird one because of the atypically high mortality among young adults. The age of peak mortality during the fall wave of the 1918 influenza pandemic in Canada and the USA was 28 years. The death of young adults lowered the average life expectancy in the United States by more than 10 years. Numerous studies have confirmed these findings.
A typical picture illustrating the haunting age peculiarity of this pandemic is shown below.
Even in poor, medically underserved communities, it was the young, healthy adults who perished, leaving many very young orphans behind.
In the 1990s, scientists started looking for samples of the 1918 virus. They reasoned that their best chance was to find it inside an influenza victim buried in permafrost. Brevig Mission in Alaska had 89 inhabitants in 1918, of whom 87 died of influenza. It was a good bet that some sequenceable genetic material could be recovered from the graves of these flu victims. Indeed, genomic RNA of the 1918 virus was recovered from the frozen lung tissue of an Alaskan influenza victim who was buried in permafrost in November of 1918.
In 2005, scientists used reverse genetics to generate an influenza virus bearing all eight genome segments of the 1918 virus to study its properties. The deadly 1918 virus has thus been resurrected!
The reconstructed virus turned out to be as deadly as the original one, with 100% mortality in mice. Another conclusion of this “resurrection” study was that the virus kills through overreaction of the body’s immune system, which explained the weird age distribution of deaths due to the “Spanish flu.” The strong adaptive immune systems of young adults ravaged their bodies, whereas the underdeveloped immune systems of young children and the weakened immune systems of old people resulted in fewer deaths. It was inflammation that killed people, not the viral infection directly or secondary infections. These studies seemed like the end of the 1918 influenza story.
Imagine my surprise, then, when at the end of 2023, I started noticing that the 1918 influenza pandemic was in the news again. “Killer 1918 flu didn't pick on the healthy, after all,” declared Science (the same Science that “killed” Junk DNA a decade earlier). “History Says the 1918 Flu Killed the Young and Healthy. These Bones Say Otherwise,” quipped WIRED. And U.S. News & World Report headlined the findings as “Contrary to Popular Belief, 1918 Flu Did Not Target the Healthy Young.”
The origin of these headlines was a 2023 paper in Proceedings of the National Academy of Sciences entitled “Frailty and Survival in the 1918 Influenza Pandemic” by Amanda Wissler, currently an Assistant Professor of Anthropology at McMaster University in Hamilton, Ontario, and Sharon DeWitte, Professor of Anthropology at the University of Colorado in Boulder. The paper was based largely on Wissler’s PhD dissertation at Arizona State University under the supervision of Professor Jane Buikstra.
One would have thought that a study that purports to discard more than a hundred years of observations, refute dozens of analyses, refute the conclusions in about 18,000 peer-reviewed articles, and generate headlines in serious news media would require a great amount of unimpeachable data and an analysis that is—if not infallible—close to infallible. If you thought that, you will be sorely disappointed.
Let us start with the Materials and Methods section. The study is based on a bone sample from the Hamann-Todd Osteological Collection, which contains skeletons of people who died in Cleveland, Ohio between 1910 and 1939. The skeletons belong to people whose bodies were not claimed within 36 hours of death (i.e., the poor, the homeless, the socially isolated, the incarcerated, those institutionalized as “mentally defectives,” and those whose relatives couldn’t possibly collect their loved ones’ bodies within a day and a half). Black and indigenous people most probably made up the bulk of this collection. Wissler and DeWitte euphemistically described their sample as consisting of “individuals of low socioeconomic status,” who mostly “died in almshouses or public hospitals.” Is this sample representative of the population in Cleveland between 1910 and 1939? I very much doubt it. And don’t even get me started on the ethical crimes committed to assemble this collection. Of course, the bodies in this collection were legally obtained under Ohio legislation that permitted the bodies of people not claimed within 36 hours of death to be “donated” for scientific study, but was the collection ethically assembled? I think not! The ethics of this collection reminds me of a joke my dad used to tell. “A religious Jewish woman goes to the Rabbi in a panic. ‘Rabbi, the chicken I cooked for Shabbat dinner fell into a soiled baby’s diaper. Is it still kosher?’ she asks. The Rabbi replies, ‘It’s kosher, of course, but would I eat it?’” But I digress.
Wissler and DeWitte’s sample consisted of 369 individuals: 310 males (84%) and 59 females (16%). This fact immediately tells the reader that something is off; the sample is not representative of the general population.
Curiously, the Materials and Methods section also contains the following statement, “to maximize the sample size, both the 1918 flu and the control groups include individuals who died from influenza and pneumonia as well as other diseases such as tuberculosis and myocarditis.” (No data are provided as to how many individuals were included for the purpose of padding the sample size.) Moreover, the medical history of the individuals in the study is mostly unknown. Thus, it is impossible to know whether “an individual suffered another disease during life unless it was listed as the cause of death or left diagnostic evidence on their skeleton.”
Finally, we have the problem of missing data. In the previous paragraph, we have seen one method of padding the data. Here comes another one. The last sentence of the Analytical Methods section states that missing skeletal data “were imputed using the ‘pmm’ function of the mice R package following previous recommendations.” Now, I need to tell the reader that I get hives and homicidal thoughts whenever I read the terms “imputed” or “imputation,” which nowadays essentially mean conjuring data out of thin air. Interestingly, “imputed” was originally a theological term meaning to falsely ascribe guilt to a person. The way data analysts use the word “imputed” started with economists in 1893 when a step in a multistep process was assigned a value by inference from the value of the process to which it contributes. More recently, the term “impute” was used by the Internal Revenue Service to assign interest to an investment when the interest rate is not known. The fact that imputation is used in scientific research to artificially increase the sample size is an inexcusable obscenity.
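For readers who have never met it, predictive mean matching (the “pmm” in the quoted sentence) can be sketched roughly as follows. This is a toy version under stated assumptions, not the mice implementation: it uses a single predictor and one ordinary least-squares fit, and it omits the random draw of regression parameters that the real method performs. The names fit_line and pmm_impute, the donor count k, and the example numbers are all hypothetical. What the sketch does show is the core idea: every missing value is filled in with a value borrowed from another individual whose measurement was actually observed.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def pmm_impute(xs, ys, k=3, rng=None):
    """Fill None entries of ys by simplified predictive mean matching.

    Assumption: one predictor and a single regression fit; the real
    method also perturbs the regression parameters before matching.
    """
    rng = rng or random.Random(0)
    observed = [(x, y) for x, y in zip(xs, ys) if y is not None]
    intercept, slope = fit_line([x for x, _ in observed],
                                [y for _, y in observed])

    def predict(x):
        return intercept + slope * x

    completed = []
    for x, y in zip(xs, ys):
        if y is not None:
            completed.append(y)  # observed values are left untouched
            continue
        target = predict(x)
        # The k observed cases whose predicted values lie closest to the target
        donors = sorted(observed,
                        key=lambda d: abs(predict(d[0]) - target))[:k]
        # Borrow the *observed* value of one randomly chosen donor
        completed.append(rng.choice(donors)[1])
    return completed


# Hypothetical toy data: None marks a missing measurement.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.1, 2.0, None, 4.2, None, 6.1]
print(pmm_impute(xs, ys, k=2))
```

Note that the number of rows does not change; what changes is that rows with gaps are treated afterwards as if they had been fully measured.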
In the abstract of the article, it is stated that “frail or unhealthy individuals were more likely to die during the pandemic than those who were not frail.” Now, “frailty” is a nebulous and inexact term often defined as “an aging-related syndrome of physiological decline, characterized by marked vulnerability to adverse health outcomes” (https://www.uptodate.com/contents/frailty). To identify frailty in skeletons, the authors used a skeletal proxy: lesions on the shinbones (periostosis). Whether this proxy has anything to do with anything is unknown.
The choice of lesions on the shinbones reminds me of the streetlight effect or the drunkard's search principle, both examples of an observational bias whereby one limits the variables used in a study to those that are easy to obtain, regardless of whether the choice is actually relevant to the study question.
Let us now discuss the article’s statistical analyses, results, and conclusions. A good piece of advice that I once got from one of my mentors was “Before you subject your data to complicated analyses and reach extravagant conclusions, do yourself a favor and look at the data carefully and perform some simple analyses.” In this case, before subjecting the data to a complicated (and frankly confusing) Kaplan–Meier survival analysis, Cox proportional hazards analysis, and Schoenfeld test, one should look at the data and do some long division or at most some simple 2×2 contingency analyses.
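A 2×2 contingency analysis of this kind needs nothing more than four counts. Below is a minimal sketch of a two-sided Fisher's exact test in plain Python. The function name is mine, the two-sided rule used (sum the probabilities of all tables no more likely than the observed one) is the common convention, and the example counts are placeholders that are NOT taken from Table 1 of Wissler and DeWitte.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows could be, e.g., control vs. flu; columns lesion vs. no lesion.
    Returns the sum of the hypergeometric probabilities of every table
    (with the same margins) that is no more probable than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    tolerance = 1 + 1e-9  # guards against floating-point ties
    total = sum(prob(x) for x in range(lowest, highest + 1)
                if prob(x) <= p_observed * tolerance)
    return min(1.0, total)


# Placeholder counts for illustration only (not the paper's data).
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"P = {p_value:.4f}")  # → P = 0.4857
```

To repeat the comparison with real data, one would substitute the four counts read off the published table for the placeholder values.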
The following data was extracted from Table 1 of Wissler and DeWitte (2023).
The first thing I noticed was that 67% of the skeletons in the sample belonged to “frail” people. Does this look like a representative sample of the population of Cleveland, Ohio in 1918? The second thing I noticed was that 69% of the control group had lesions on their shinbones (either active or healed) versus 59% in the group that succumbed to flu. Thus, by the definition used by Wissler and DeWitte, the group of people that succumbed to influenza was 14–17% less “frail” than the control group, depending on which group’s percentage serves as the baseline. Of course, in my simple analysis I can instead compare the skeletons exhibiting active or mixed lesions with the rest of the skeletons (healed and no lesions). In this case, 25% of the control group turn out to be frail, versus 20% of those that succumbed to flu. This result seems to support the thesis of Wissler and DeWitte, until you realize that the difference is not statistically significant (Fisher’s exact test, P = 0.3074).
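The “long division” behind the 14–17% figure can be spelled out in a few lines. The sketch below uses only the two percentages quoted above (69% and 59%); the function name is a hypothetical helper, and the two results differ only in which group is taken as the baseline.

```python
def relative_differences(p_control, p_flu):
    """Return the gap between two proportions relative to each baseline."""
    gap = p_control - p_flu
    return gap / p_control, gap / p_flu


# Share of skeletons with shinbone lesions, as quoted in the text.
rel_to_control, rel_to_flu = relative_differences(0.69, 0.59)

print(f"relative to control group: {rel_to_control:.1%}")  # → 14.5%
print(f"relative to flu group:     {rel_to_flu:.1%}")      # → 16.9%
```

The absolute gap is ten percentage points either way; the relative figure depends on the denominator chosen.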
After reading this paper very carefully, I am left with one open question. It remains unclear to me how this paper managed to get published in PNAS and become a news sensation. Is this another example of the power of the press release?

















