IPW weights bug. Misuse of np.average rather than mean of weighted outcomes #67

marcoBmota8 · 2024-05-16T13:52:24Z

Hi,

I believe there is a mistake in how IPW calculates outcomes.
The code uses np.average to compute the expected outcome across treatment groups. This sums the weighted outcomes and normalizes by the sum of the weights, which is incorrect for IPW:
sum(weights*y)/sum(weights)

Instead, it should be the mean of the weighted outcomes:
sum(weights*y)/len(y)

This is an easy but important fix.
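To make the difference between the two normalizations concrete, here is a minimal sketch on made-up numbers. The arrays `y`, `a`, and `p` are hypothetical and are not taken from the package. It assumes `len(y)` in the second formula refers to the full sample size, with units outside the treatment group contributing zero to the sum.

```python
import numpy as np

# Hypothetical toy data: outcomes, treatment indicator, propensity P(a=1 | x).
y = np.array([1.0, 3.0, 2.0, 5.0, 4.0, 6.0])
a = np.array([1, 0, 1, 1, 0, 0], dtype=bool)
p = np.array([0.5, 0.25, 0.8, 0.4, 0.5, 0.2])

# Inverse-probability weights for the treated units only.
w = 1.0 / p[a]

# Normalize by the sum of the weights (what np.average does).
hajek = np.average(y[a], weights=w)   # sum(w * y) / sum(w)

# Normalize by the full sample size instead.
ht = np.sum(w * y[a]) / len(y)        # sum(w * y) / n

print(f"sum-of-weights normalization: {hajek:.4f}")
print(f"sample-size normalization:    {ht:.4f}")
```

The two numbers differ whenever the weights of the treated units do not sum exactly to the sample size, which is the usual case in finite samples.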


marcoBmota8 · 2024-05-16T15:25:12Z

Additional information on this issue:

The current IPW implementation is not wrong per se. It is another (in most cases better) way to implement IPW, the Hájek estimator. Compared with my proposal (the Horvitz-Thompson estimator), which is unbiased but typically has high variance, it reduces variance at the cost of some bias (no free lunch).

Maybe the package would be better if it implemented both and let the user decide.
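The bias/variance trade-off described above can be checked with a small Monte Carlo sketch. Everything here is an assumption made for illustration: the data-generating process, the sample size, and the use of the true propensities rather than estimated ones. It estimates the mean potential outcome under treatment with both normalizations over repeated random treatment assignments.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_reps = 200, 2000

# Fixed hypothetical population: one covariate, true propensities, and
# potential outcomes under treatment with a large constant offset.
x = rng.normal(size=n)
p = np.clip(1.0 / (1.0 + np.exp(-x)), 0.1, 0.9)
y1 = 10.0 + 2.0 * x + rng.normal(size=n)
truth = y1.mean()

ht = np.empty(n_reps)
hajek = np.empty(n_reps)
for r in range(n_reps):
    a = rng.random(n) < p            # random treatment assignment
    w = 1.0 / p[a]                   # weights of the treated units
    weighted_sum = np.sum(w * y1[a])
    ht[r] = weighted_sum / n         # Horvitz-Thompson: divide by n
    hajek[r] = weighted_sum / np.sum(w)  # Hajek: divide by sum of weights

print(f"truth          : {truth:.3f}")
print(f"HT    mean/var : {ht.mean():.3f} / {ht.var():.4f}")
print(f"Hajek mean/var : {hajek.mean():.3f} / {hajek.var():.4f}")
```

In this setup the Horvitz-Thompson estimates center on the truth but spread widely, while the Hájek estimates are much more tightly concentrated. The size of the gap depends on the chosen offset and propensities, so this is an illustration rather than a benchmark.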

ehudkr · 2024-05-17T06:55:13Z

Hi Marco,
Thank you so much for bringing this to my attention. This is indeed an important nuance that should be made more explicit.
I agree both options can be implemented; it should be simple enough (let me know if you have the time and will to give it a try 🙃).

I'll keep this issue open until solved.

Thanks again.

ehudkr · 2024-05-17T07:13:10Z

Note to self: consider an elastic-net-like solution that allows any combination of the Hájek and Horvitz–Thompson estimators, à la equation (1) of Khan and Ugander (2023).
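One way such a combination could look is a single mixing parameter in the denominator. The sketch below is an assumption about the general form and should be checked against the paper before relying on it. The function name `normalized_ipw_mean` and the parameter `lam` are hypothetical and not part of the package.

```python
import numpy as np

def normalized_ipw_mean(y, a, p, lam):
    """Weighted mean with a denominator mixing n and the sum of weights.

    lam = 0 divides by the sample size (Horvitz-Thompson);
    lam = 1 divides by the sum of the weights (Hajek).
    """
    y = np.asarray(y, dtype=float)
    a = np.asarray(a, dtype=bool)
    p = np.asarray(p, dtype=float)
    w = a / p                        # zero weight for untreated units
    n = len(y)
    return np.sum(w * y) / ((1.0 - lam) * n + lam * np.sum(w))

# Hypothetical toy data for a quick check of the two endpoints.
y = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]
a = [1, 0, 1, 1, 0, 0]
p = [0.5, 0.25, 0.8, 0.4, 0.5, 0.2]

for lam in (0.0, 0.5, 1.0):
    print(lam, normalized_ipw_mean(y, a, p, lam))
```

Exposing `lam` as a user-facing option would cover both estimators discussed in this thread as special cases, at the cost of one more parameter to choose.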

marcoBmota8 · 2024-05-17T13:34:39Z

Sounds good. I will fork and make a pull request when I get a chance.

That ENet approach also looks interesting. It might be worth running a benchmark study on synthetic data ;).

Best,

marcoBmota8 · 2024-06-15T02:07:49Z

Just made the pull request with my fork changes for the HT & Hajek IPW implementations.
I tried to follow the existing code structure and make as few changes as possible but feel free to lmk if there is something that can be improved.
