Doubly Robust Prediction and Evaluation Methods Improve Uplift Modeling for Observational Data


Uplift modeling aims to optimize treatment allocation by predicting the net effect of a treatment on each individual, the individual treatment effect (ITE), and is expected to enable causality-based personalization in medicine, marketing, and other fields. Because the true ITE is unobservable, the approach needs specialized methods to train and evaluate ITE prediction models. Conventional uplift modeling requires data gathered through randomized controlled trials (RCTs); for non-RCT data, the transformed outcome (TO) is commonly used as an unbiased estimator of the ITE. However, RCTs are often impossible to conduct for ethical and economic reasons, and in observational data the unbiasedness of TO rests on the unrealistic assumption that the propensity score of each individual is given. In this paper, we show theoretically and quantitatively that TO becomes an unreliable proxy for the ITE when the propensity score estimator is biased or has a large degree of heterogeneity. We then propose a novel proxy outcome, Switch Doubly Robust (SDR), which turns the effect of the propensity score estimator on the outcome prediction models on and off. We prove theoretically that SDR achieves a better bias-variance trade-off than TO as a proxy for the ITE, and we develop novel prediction (SDRM) and evaluation (SDR-MSE) methods based on it. Furthermore, we show experimentally that our methods outperform existing approaches on synthetic datasets. We also applied them to the Right Heart Catheterization dataset and discovered that 20% of patients are actually curable, even though conventional causal inference methods showed only that the average treatment effect is negative. We anticipate that our methods will become standard practice in uplift modeling for observational data and lead to optimized personalization in various fields.
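As a rough illustration of the proxy outcomes discussed above, the following sketch computes the standard transformed outcome and a standard doubly robust proxy with NumPy. It is not the paper's implementation: the function names, the synthetic data-generating process in `demo`, and the constant effect of 1.0 are assumptions made for illustration, and the switching rule that defines SDR is not reproduced here.

```python
import numpy as np


def transformed_outcome(y, t, e):
    """Standard transformed outcome: Y*T/e(X) - Y*(1-T)/(1-e(X)).

    Its conditional expectation equals the ITE when e is the true
    propensity score and treatment assignment is unconfounded given X.
    """
    y, t, e = (np.asarray(a, dtype=float) for a in (y, t, e))
    return y * t / e - y * (1.0 - t) / (1.0 - e)


def doubly_robust_outcome(y, t, e, mu1, mu0):
    """Standard doubly robust proxy outcome.

    mu1 and mu0 are predictions of the outcome under treatment and
    control; the inverse-propensity terms correct their residuals.
    """
    y, t, e, mu1, mu0 = (
        np.asarray(a, dtype=float) for a in (y, t, e, mu1, mu0)
    )
    return (mu1 - mu0
            + t * (y - mu1) / e
            - (1.0 - t) * (y - mu0) / (1.0 - e))


def demo(n=100_000, seed=0):
    """Hypothetical synthetic check: the true effect is 1.0 for everyone."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    # Assumed propensity model, bounded inside [0.2, 0.8] to keep weights stable.
    e = 0.2 + 0.6 / (1.0 + np.exp(-x))
    t = rng.binomial(1, e)
    y = x + 1.0 * t + rng.normal(size=n)
    # With the true propensity, the mean TO should be close to the effect 1.0.
    return transformed_outcome(y, t, e).mean()


if __name__ == "__main__":
    print("mean transformed outcome:", demo())
```

With the true propensity score, both proxies are unbiased for the ITE. Replacing `e` with a poorly estimated or extreme propensity score inflates the bias and variance of the transformed outcome, which is the failure mode the abstract describes.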

In Proceedings of the 2019 SIAM International Conference on Data Mining (SDM) (Acceptance rate=22.7%)
Yuta Saito
Second-year CS Ph.D. Student