Yuta Saito
2024
Effective Off-Policy Evaluation and Learning in Contextual Combinatorial Bandits
We explore off-policy evaluation and learning (OPE/L) in contextual combinatorial bandits (CCB), where a policy selects a subset in the …
Tatsuhiro Shimizu, Koichi Tanaka, Ren Kishimoto, Haruka Kiyohara, Masahiro Nomura, Yuta Saito
Hyperparameter Optimization Can Even be Harmful in Off-Policy Learning and How to Deal with It
There has been a growing interest in off-policy evaluation in the literature such as recommender systems and personalized medicine. We …
Yuta Saito, Masahiro Nomura
Proceedings
Long-term Off-Policy Evaluation and Learning
Short- and long-term outcomes of an algorithm often differ, with damaging downstream effects. A known example is a click-bait …
Yuta Saito, Himan Abdollahpouri, Jesse Anderton, Ben Carterette, Mounia Lalmas
Code
arXiv
Proceedings
Off-Policy Evaluation of Slate Bandit Policies via Optimizing Abstraction
We study off-policy evaluation (OPE) in slate contextual bandits where a policy selects multi-dimensional actions known as slates. This …
Haruka Kiyohara, Masahiro Nomura, Yuta Saito
Code
arXiv
Proceedings
Scalable and Provably Fair Exposure Control for Large-Scale Recommender Systems
Typical recommendation and ranking methods aim to optimize the satisfaction of users, but they are often oblivious to their impact on …
Riku Togashi, Kenshi Abe, Yuta Saito
Video
arXiv
Proceedings
Towards Assessing and Benchmarking Risk-Return Tradeoff of Off-Policy Evaluation
Off-Policy Evaluation (OPE) aims to assess the effectiveness of counterfactual policies using only offline logged data and is often …
Haruka Kiyohara, Ren Kishimoto, Kosuke Kawakami, Ken Kobayashi, Kazuhide Nakata, Yuta Saito
Code
arXiv