New Trade Routes

Drawing digital pathways on the new trade maps.

Trade drives the way people interact.  People, products, money, and ideas follow the trade routes and impact everything in their path.  Keeping pace with the way trade routes are changing is essential to success or even survival.  New Trade Routes is working to better understand the changes so we can help our clients, investees, and grantees improve their chances of success.

 


We Need a Zillow of Public Health

The US needs an automated data store for all things healthcare.  It needs a Zillow of Public Health. The Gates Foundation is uniquely positioned to bring this about.  Many people will say this is impossible. Here are some of the objections they will lob at the idea:

Objection 1:  It is the CDC’s job / the CDC is doing it already / the CDC will do it now

  • The CDC has not done it in the past

  • The CDC is not doing it now

  • The CDC exists in a world where doing it will be very difficult

  • Homeland Security has had nearly 20 years to do something similar on the security front and has not succeeded (consider its inability to track children crossing the border)

Objection 2:  There are many regulations that make it impossible for a private foundation to collect and manage health data

Objection 3:  It is too hard, and will never be ready in time to make a difference

  • Yes it is hard and a few years ago it might have been impossible, but new tools and techniques will enable us to see the initial results in just weeks

  • It is the 3,142 counties and over 8,000 hospitals, each with different governance, systems, and capabilities, that drive this need for an automated central data store

  • Even after the initial wave has passed, efficiently collected and trusted healthcare data will be essential for bringing the economy back and managing future waves without resorting to closing down large portions of the country. 

Objection 4:  The health insurance lobby will kill it (many, many reasons they would find it threatening)

  • They could be persuaded that equal access across the health insurance industry to high quality health data would be better than their current situation

  • They might prefer to support a neutral 3rd party with the reputation of benevolence instead of being subject to data poorly managed by the government, or worse yet, intentionally mismanaged by the government.

Benefits of This Approach

A Zillow user can use a simple map to access information about any house in the United States.  The last actual sale price is listed along with other specific facts about that house. As a result, not only could Zillow tell you exactly how many houses sold last month, but it could give you a complete list of them.  Zillow does this by aggregating the property records from every single county in the country. For the purpose of this discussion, let’s call this approach of capturing every single record the complete data approach.  

The healthcare industry has never adopted the complete data approach.  In healthcare, data is sampled, statistical methods are applied, and results are calculated.  This is why the CDC’s best guess as to the number of flu deaths per year ranges from 12,000 to 61,000.  The CDC is not tracking each case; it is sampling some data and analyzing it.  If there were a complete data system for healthcare, a Zillow for healthcare, you could know exactly how many death certificates issued by county coroners listed flu/pneumonia as the cause of death.  In addition, you would know the age, gender, race, and location of each case. The necessity of this is obvious in the current pandemic environment. Contact tracing at scale is not possible without a complete data approach.

Great Efforts Being Made by the States

State governments maintain systems for tracking disease in their jurisdictions.  Washington State is no exception and is using its system to track the coronavirus.  You can see the results here.  This is not an attempt to malign the good people working for the State of Washington.  I have no doubt they are working around the clock to assemble the data. To their credit, they may be adopting a complete data approach, tracking each case of COVID 19 in our state.  Still, there are several signs that the systems they are using are not up to the task. The note that the last 4-7 days may be under-reported illuminates the struggle the state is facing to get the data into the system.  The note that 15% of the cases are not associated with a county is also a red flag. It may be true that new cases are 40% less likely to start on weekends, but that does not seem likely, which suggests the weekend dip is a reporting artifact. Ultimately, it does not seem likely that the state will be able to sustain the current manual data management process.

Next Up at the Gates Foundation

First let me say, I have no knowledge of what is going on inside the Gates Foundation.  I know they are an organization that uses data to drive decision making, and they attract the best talent in data science.  I would love to know if the Gates Foundation has the appetite to become the trusted center of a complete data approach in US Healthcare.  Because we need a Zillow of Public Health.

Other Recent and Relevant Articles

IHME Covid 19 Updates

Gates Foundation Annual Letter 2020

HBR Vaccine Distribution Article (4/2/2020)

Bill Gates’ Washington Post OpEd (3/31/2020)

Modern Data Management and the COVID 19 Pandemic

For a timely example of modern data management, take a look at the Johns Hopkins visualization of the COVID 19 pandemic. For an example of NOT modern data management, take a look at the CDC reporting on the flu. The contrast is stunning. The COVID 19 visualization updates every 15 minutes and shows its source data and methods. The CDC flu page has not been updated since week 8, February 22.

Modern Data Management
(updated every 15 mins, exposes source data and methods)

Johns Hopkins Visualization of COVID 19 Data - lists its sources, updates every 15 minutes


With 100,000 cases and 3,400 deaths, we can easily calculate a death rate of 3.4%. We know that many governments are not reporting the data accurately, particularly when it comes to cases. So the true death rate may be significantly lower, because the number of cases could be much higher.
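The arithmetic above can be sketched in a few lines of Python. Only the 100,000 cases and 3,400 deaths come from the text; the function name and the under-reporting multipliers are illustrative assumptions, chosen to show how quickly the rate falls if the real case count is higher than reported.

```python
def death_rate(deaths: int, cases: float) -> float:
    """Return deaths as a fraction of cases."""
    if cases <= 0:
        raise ValueError("cases must be positive")
    return deaths / cases


# Figures quoted in the text.
REPORTED_CASES = 100_000
REPORTED_DEATHS = 3_400

reported_rate = death_rate(REPORTED_DEATHS, REPORTED_CASES)
print(f"Reported death rate: {reported_rate:.1%}")

# Hypothetical multipliers (assumptions, not measured values): what the
# rate would be if true cases were 2x, 5x, or 10x the reported count.
for multiplier in (2, 5, 10):
    adjusted = death_rate(REPORTED_DEATHS, REPORTED_CASES * multiplier)
    print(f"If true cases were {multiplier}x reported: {adjusted:.2%}")
```

The deaths figure is held fixed here, which is itself an assumption; if deaths are also under-counted, the adjusted rates would be higher than this sketch shows.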

It would be interesting to compare the COVID 19 model to the CDC flu model, but so far I have not found a good side-by-side comparison where the data is presented in similar formats.  It appears that the CDC takes some time to compile its estimates of the impact of the flu. In this report, the 2017-2018 and 2018-2019 seasons are still listed as preliminary and subject to future revision. The elevated attention on COVID 19 and the hour-by-hour, real-time reporting are a significantly different method from compiling estimates based on death certificates. It is not hard to imagine a scenario where a COVID 19 death goes unreported.

NOT Modern Data Management

(Update interval unknown - weeks or more; source data and methods unknown)

CDC Flu Report Weekly: 3-week lag time.


The CDC reporting presents deaths as a percent of all deaths in the US and cases as n per 100,000, making it theoretically possible to calculate the death rate. In the text, the CDC estimates there have been at least 32 million cases, 310,000 hospitalizations, and 18,000 deaths from the flu this season. That works out to a death rate of roughly 0.00056, a little over one twentieth of one percent.
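As a rough check on the flu figure, this short Python sketch divides the CDC estimates quoted above. The variable names are my own, and because the inputs are "at least" estimates, the results should be read as approximations rather than measured rates.

```python
# CDC estimates for this flu season, as quoted in the text
# (each is a lower bound, not an exact count).
FLU_CASES = 32_000_000
FLU_HOSPITALIZATIONS = 310_000
FLU_DEATHS = 18_000

# Deaths and hospitalizations as a fraction of estimated cases.
flu_death_rate = FLU_DEATHS / FLU_CASES
flu_hospitalization_rate = FLU_HOSPITALIZATIONS / FLU_CASES

print(f"Flu death rate: {flu_death_rate:.6f}")
print(f"Flu hospitalization rate: {flu_hospitalization_rate:.6f}")
```

Keeping the three estimates as named constants makes it easy to rerun the calculation when the CDC revises its numbers.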

There is no question that the COVID 19 dashboard also has shortcomings. Most notably, it can only present the data we have available, and as has been widely reported in the news, the US government has barely been testing anyone. So the number of cases in the US is not accurate, and everyone knows it.

For those interested in pathogen tracing, check out Nextstrain.

Nextstrain, an open-source project tracking pathogen genome data, does a better job of tracking how the virus travels, but does not do as well in presenting the number of cases, their current status, and fatalities.

Nextstrain tracking of the evolution of the virus


I am sure this is interesting to epidemiologists. I am not sure what it is telling us, though.

For those who think we should give our governments or the CDC a pass because data work is hard, check out this website put up by a high school student in Mercer Island, WA (by Seattle).