[DATA] Case fatality rate formula is misleading #346

Open
Olivyf opened this issue May 23, 2020 · 4 comments

@Olivyf

Olivyf commented May 23, 2020

Which Dataset

Case Fatality Rate is currently calculated as
(Total Deaths) / (Total Positive Cases)

Error Description

The faster infections accelerate, the lower the calculated fatality rate, because it takes weeks before a new case resolves as either (1) a fatality or (2) a recovery.
Hence the fatality rate is understated, because deaths are compared against all cases, including those whose outcome is still undetermined (open cases).
Thus the public is led to believe that the chance of fatality once infected is lower than the true chance of "fatality vs recovery".
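
The lag effect described above can be illustrated with a toy calculation. This is only a sketch under simplifying assumptions that are not in the dataset: daily new cases grow by a constant factor, every case resolves exactly `lag_days` after detection, and a fixed fraction `true_fatality` of resolved cases are deaths. The function name and all parameter values are hypothetical.

```python
def naive_cfr_under_growth(true_fatality=0.02, growth=1.1, lag_days=14, days=60):
    """Return (Total Deaths) / (Total Positive Cases) on the last day of a toy epidemic.

    Assumptions (illustrative only): constant daily growth factor, and a
    fixed delay of `lag_days` between detection and outcome.
    """
    # New cases detected on each day, starting from 100 on day 0.
    new_cases = [100 * growth ** d for d in range(days)]
    total_cases = sum(new_cases)

    # Only cases detected at least `lag_days` ago have an outcome yet.
    resolved_cases = sum(new_cases[: days - lag_days])
    total_deaths = true_fatality * resolved_cases

    return total_deaths / total_cases


if __name__ == "__main__":
    for g in (1.0, 1.1, 1.2):
        print(f"growth={g}: naive CFR = {naive_cfr_under_growth(growth=g):.4f}")
```

In this toy model the naive rate stays below `true_fatality` for as long as open cases exist, and the gap widens as the growth factor increases.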

Suggested fixes

True Fatality Rate must be expressed as a percentage of closed cases.
i.e. (Total Deaths) / (Total Deaths + Total Recovered)

Crosscheck: Fatality Rate + Recovery Rate = 100%
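
For reference, the two formulas and the crosscheck can be written out as a minimal Python sketch. The function names are hypothetical and are not taken from the repository; returning `None` for an empty denominator is an assumption about how undefined rates might be handled.

```python
def case_fatality_rate(total_deaths, total_positive_cases):
    """Current formula: (Total Deaths) / (Total Positive Cases)."""
    if total_positive_cases == 0:
        return None  # rate is undefined with no cases
    return total_deaths / total_positive_cases


def closed_case_fatality_rate(total_deaths, total_recovered):
    """Suggested formula: (Total Deaths) / (Total Deaths + Total Recovered)."""
    closed_cases = total_deaths + total_recovered
    if closed_cases == 0:
        return None  # rate is undefined with no closed cases
    return total_deaths / closed_cases


def closed_case_recovery_rate(total_deaths, total_recovered):
    """Counterpart used for the crosscheck: recoveries as a share of closed cases."""
    closed_cases = total_deaths + total_recovered
    if closed_cases == 0:
        return None
    return total_recovered / closed_cases
```

With made-up counts of 10 deaths, 90 recoveries and 200 positive cases, the closed-case fatality and recovery rates sum to 100%, while the current formula divides the same 10 deaths by all 200 cases.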

@Olivyf Olivyf added the data label May 23, 2020
@vukosim
Member

vukosim commented May 23, 2020

Can you provide any literature to assist in this change @Olivyf?

Currently, the rate is used so that you can compare across countries, even with its shortcomings. If we make the change, we will have to list the result as something else, not as the case fatality rate. We can rename the current one to Observed Case Fatality Rate if that makes it clearer.

See

@shaze
Contributor

shaze commented May 24, 2020

For my 2c, it may be worth showing both. Looking at the daily data, it seems to me that the recoveries data is the noisiest, and without knowing the local reporting processes we should not over-interpret these rates at a daily level while the epidemic is going on. The formula @Olivyf suggests may be an overestimate, because deaths are likely to be reported faster than recoveries. Showing both would also address the concern about public perception: rather than reporting one number, showing two numbers that form a range may better convey the uncertainty in the data.
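
A minimal sketch of reporting the two numbers together as a range, assuming cumulative counts are available for deaths, positive cases and recoveries. The function name is hypothetical, and treating the two rates as lower and upper bounds is only the heuristic discussed above, not a statistical interval.

```python
def fatality_rate_range(total_deaths, total_positive_cases, total_recovered):
    """Return (low, high) from the two formulas, or None if either is undefined.

    One value is deaths over all positive cases (likely an underestimate
    while many cases are open); the other is deaths over closed cases
    (likely an overestimate if recoveries are reported late).
    """
    closed_cases = total_deaths + total_recovered
    if total_positive_cases == 0 or closed_cases == 0:
        return None

    observed = total_deaths / total_positive_cases
    resolved = total_deaths / closed_cases

    # Sort so the result is ordered even if the reported counts are inconsistent.
    low, high = sorted((observed, resolved))
    return low, high


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    result = fatality_rate_range(10, 200, 90)
    if result is not None:
        low, high = result
        print(f"Fatality rate between {low:.1%} and {high:.1%}")
```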

@vukosim
Member

vukosim commented May 24, 2020

The reporting of recoveries is slower and erratic, which is why I am concerned. Thanks for your thoughts @shaze

@shaze
Contributor

shaze commented May 25, 2020

I think keeping consistency with what other people do is paramount, and we shouldn't relabel something that other people have used. But I think there is value in having both. For the new one I would suggest "Reported Resolved Case Fatality Rate". A bit clunky, I know. We could then explain the difference in the README.

3 participants