Off-Policy Learning

Hyperparameter Optimization Can Even be Harmful in Off-Policy Learning and How to Deal with It

There has been growing interest in off-policy evaluation in application areas such as recommender systems and personalized medicine. We …

Yuta Saito, Masahiro Nomura