Wait - why is Jason talking about COVID-19? And why are these written like Facebook posts? There’s a longer explanation here but the short version is that my day job for the past 15 years has been developing models of human health effects and medical response for chemical injuries and biological illnesses, including pandemics. I’ve been making these posts on Facebook and I was asked to put them in a more shareable format. I’m linking to the posts on the explanation page. These are the original, unedited posts. I’ll continue until I run out of things to say.
___________________________
Originally posted here on March 12th.
Well, since people seemed to get some worthwhile information out of my last post I figured I’d put some more out there about the mathematics of epidemics/pandemics. You see a lot of estimates flying around about how many people will be sick, etc., but it’s a good idea to at least have a sense of where those numbers come from. I’ll start with a phrase that I hate, because people misuse it, but it’s probably worth starting with: All models are wrong, some are useful.
So what does that mean? It doesn’t mean we should ignore models, but it means we should use some models as a framework for discussing possibilities, figuring out gaps, and making plans. I wanted to start with this, because when people throw out numbers (large or small), they generally come from a model. And that model could be useful! But if you don’t know the assumptions that went into that model or the particular analysis, the result is meaningless.
In my last post I mentioned the basic foundation of epidemic models - Susceptible-Exposed-Infectious-Removed. Susceptible people are the people who can become exposed to the disease. The moment COVID-19 came to America, our susceptible population was the population that was currently in America. Exposed people are people who have been in contact with an infectious individual and could, themselves, become infectious. Some models treat this cohort differently or separate it out into finer granularity or don’t include it at all. Infectious people are people who are transmitting the disease. Some models break this into infectious/asymptomatic and infectious/symptomatic. If there is a significant infectious/asymptomatic cohort (people who are infectious but NOT exhibiting any symptoms) a disease can spread like wildfire. Removed people are people who are cleared of the disease - they either recovered or they died. You can put vaccinated people in this cohort, too, although sometimes it’s good to treat vaccinated people separately, especially when vaccination efficacy is not 100% (as in, vaccines don’t always work).
Great, so, how do people model disease spread? The simplest equation is one that looks like this: Newly Exposed People = (Susceptible People/Total Population) * Infectious People * Some Rate. In English, the number of newly exposed people is equal to the portion of the total population that is susceptible times the number of infectious people times a rate that I’ll talk about in a minute. So at the beginning of this outbreak, if we were to model America as one big population (we shouldn’t, I’ll get to why), the newly exposed population on day two would be 249,999,999/250,000,000 * 1 * Rate which is equal to...essentially 1 * Rate. That number will grow each day, but how quickly it grows depends on that Rate.
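To make that arithmetic concrete, here is a minimal Python sketch of the equation. The function name and the rate of 0.5 per day are made up for illustration; they are placeholders, not estimates for COVID-19.

```python
def newly_exposed(susceptible, total_population, infectious, rate):
    """Newly Exposed = (Susceptible / Total Population) * Infectious * Rate."""
    return (susceptible / total_population) * infectious * rate


# Day-two example from the text: one infectious person in a population of 250 million.
total = 250_000_000
rate = 0.5  # assumed placeholder value, chosen only to show the arithmetic

day_two = newly_exposed(total - 1, total, 1, rate)
print(day_two)  # essentially 1 * rate
```

Because almost everyone is still susceptible at the start, the fraction in front is nearly 1 and the rate does all the work. As the susceptible pool shrinks, that fraction shrinks with it and slows new exposures.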
What is that rate? It’s basically a measure of how many people one person can infect. Sort of...and depending on the model. What matters is, the higher the number, the more people that will be infected by one person. You may hear the term R0 (pronounced R-naught) in the coming days and you’ll hear people say, “If the R0 is greater than one, you’ll have a pandemic,” or something. That’s related to the rate in this equation. It goes without saying, if each person infects more than one other person on average, the epidemic will continue to grow.
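How the rate relates to R0 depends on the model. One common simplification, which is an assumption here and not something spelled out in this post, is that R0 is roughly the daily rate multiplied by the number of days a person stays infectious. A small sketch under that assumption, with made-up numbers:

```python
def basic_reproduction_number(rate_per_day, infectious_days):
    """Assumed simplification: R0 ~ daily transmission rate * days spent infectious."""
    return rate_per_day * infectious_days


def outbreak_grows(r0):
    """Each case replaces itself with more than one new case only when R0 > 1."""
    return r0 > 1


# Illustrative values only; neither pair is an estimate for a real disease.
fast = basic_reproduction_number(rate_per_day=0.5, infectious_days=5)
slow = basic_reproduction_number(rate_per_day=0.1, infectious_days=5)
print(fast, outbreak_grows(fast))
print(slow, outbreak_grows(slow))
```

The point of the sketch is that the same rate can land on either side of one depending on how long people stay infectious, which is part of why isolating sick people early matters.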
The other thing to consider is that population size. There are different types of models, but I’ll focus on three. There are homogeneous mixing models - in this type of model, you assume that your entire population of interest is interacting with each other at the same rate, there are no demographic differences, no lifestyle differences, etc. You can imagine why modeling a country as a homogeneously mixed population is a bad idea, as it assumes someone in NYC has the same probability of infecting someone on a farm in Idaho as they have of infecting their partner. You can assume Seattle is a homogeneously mixed population and you’d be getting CLOSER to a more representative model, but you’re still off.
You can break down Seattle into different subpopulations - divided by region or demographics - and develop a heterogeneous mixing model. This type of model will assume people within those subpopulations are equally likely to infect each other and then there is a different probability of infecting someone in a different subpopulation. So you have a senior home subpopulation and a military base subpopulation - you say everyone in each subpopulation is the same, and there is some additional, often lower probability that someone from a subpopulation can infect someone from a different subpopulation.
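Here is a rough sketch of what that looks like with two subpopulations. It extends the simple exposure equation by giving every pair of groups its own rate. The group sizes, the case counts, and the rates in the `mixing` table are all invented for illustration.

```python
def new_exposures_by_group(susceptible, infectious, population, mixing):
    """mixing[i][j] is the assumed rate at which one infectious person in
    group j exposes people in group i."""
    groups = range(len(population))
    result = []
    for i in groups:
        # Total infection pressure on group i from every group, itself included.
        pressure = sum(mixing[i][j] * infectious[j] for j in groups)
        result.append((susceptible[i] / population[i]) * pressure)
    return result


# Group 0: senior home, group 1: military base (hypothetical sizes).
population = [200, 5000]
infectious = [0, 10]          # all current cases are on the base
susceptible = [200, 4990]
mixing = [
    [0.5, 0.01],              # within the senior home, base -> senior home
    [0.01, 0.5],              # senior home -> base, within the base
]

exposed = new_exposures_by_group(susceptible, infectious, population, mixing)
print(exposed)
```

With the lower cross-group rate, the senior home sees a small trickle of exposures even though it has no cases of its own yet, while the base carries most of the spread.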
Then there’s an agent-based model. That’s the SimCity of models - social customs, demographics, lifestyles, travel routes, etc. are applied to individual people who walk around, interact, and get each other sick.
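A real agent-based model carries a lot of detail per person. The toy sketch below only shows the core loop: individual agents, random daily contacts, and a chance of transmission on each contact. Every parameter value is an assumption picked for illustration, and the Exposed stage is skipped to keep it short.

```python
import random


def run_agent_model(n_agents=200, contacts_per_day=4, p_transmit=0.1,
                    infectious_days=5, days=60, seed=0):
    """Toy agent-based outbreak. Returns daily (S, I, R) counts."""
    rng = random.Random(seed)
    state = ["S"] * n_agents      # "S", "I", or "R" for each agent
    days_left = [0] * n_agents    # remaining infectious days for each agent
    state[0], days_left[0] = "I", infectious_days  # one initial case

    history = []
    for _day in range(days):
        infectious_today = [a for a in range(n_agents) if state[a] == "I"]
        newly_infected = set()
        for a in infectious_today:
            for _contact in range(contacts_per_day):
                other = rng.randrange(n_agents)  # random contact, no social structure
                if state[other] == "S" and rng.random() < p_transmit:
                    newly_infected.add(other)
        # Agents who were infectious today move one day closer to removal.
        for a in infectious_today:
            days_left[a] -= 1
            if days_left[a] == 0:
                state[a] = "R"
        for a in newly_infected:
            state[a], days_left[a] = "I", infectious_days
        history.append((state.count("S"), state.count("I"), state.count("R")))
    return history


history = run_agent_model()
print(history[-1])  # (susceptible, infectious, removed) on the last day
```

To get from this to the SimCity version, you would replace the random contact with households, workplaces, schools, and travel, and give each agent its own behavior.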
Each of these models has a purpose. The homogeneous mixing model can be good for quick estimates. You can also use it for quick estimates of certain controls. I’ve personally built homogeneous mixing models with vaccinations, post-exposure prophylaxis, quarantine, and isolation cohorts. Those are four of the big ones! Are the answers right? No, of course not - no models are right. Do they give some insight into where we should be putting resources? Absolutely...and the answer is almost always vaccines and isolation.
Heterogeneous models are good at looking at things like school closures, travel restrictions, etc. You want to see what school closures will do? Well, close down that school subpopulation and mix them into the smaller at-home subpopulation. Quarantining in mid-sized groups works well early in an epidemic - the further along an epidemic goes WITHOUT quarantining, the more likely it is that quarantining increases the rate of disease spread. That may sound ridiculous, but if you start home quarantining at a point where 1 in every 4 people in your population is sick, now you’re going to confine one sick person with three healthy people, on average. If I need to spell it out for you, that’s a family of four. That is why it’s important to start staying home as often as you can NOW, not in four weeks.
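The family-of-four arithmetic can be checked in a few lines. This sketch assumes sick people are spread independently across households, which is a simplification since real cases cluster. The 1-in-100 comparison is a made-up "early" value.

```python
def expected_sick_per_household(prevalence, household_size):
    """Average number of sick people confined together in one household."""
    return prevalence * household_size


def chance_household_has_a_case(prevalence, household_size):
    """Probability at least one member is sick, assuming independence."""
    return 1 - (1 - prevalence) ** household_size


# Late start: 1 in 4 people sick, households of four.
late_sick = expected_sick_per_household(0.25, 4)
late_chance = chance_household_has_a_case(0.25, 4)

# Early start (assumed for comparison): 1 in 100 people sick.
early_chance = chance_household_has_a_case(0.01, 4)

print(late_sick, late_chance, early_chance)
```

Under these assumptions, waiting until 1 in 4 are sick means about two out of three households are locked in with a case, versus roughly one in twenty-five when staying home starts at 1 in 100.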
Agent-based models are good for the more complex population controls like social distancing, compliance - the types of things that are really individual based.
All three of these models will tell you the same exact thing in one case and one case only: they will tell you that if you do nothing, and the R0 is greater than one, basically every single person in your susceptible population will get sick.
So when you see a number, try to tease out the assumptions that went into that number. And if you can’t - well, just add it to all of the other numbers you will hear.
I do encourage the greater modeling community to discuss their assumptions, and for folks doing the messaging to communicate to the public the assumptions and what those numbers mean to the population. It’s important these days to have that.
Anyway, that’s what I can offer today. Wash your hands, don’t congregate in large groups if you don’t have to, monitor for symptoms, quarantine yourself if you have suspected contacts with sick people and isolate yourself if you're showing respiratory symptoms.
Love you all, stay safe.
______________________________________
These are my opinions and thoughts and analyses - I am not representing any government agency or my company. More disclaimers on the main page.