Yuta Saito
2025
Off-Policy Evaluation and Learning for the Future under Non-Stationarity
Ranking interfaces are everywhere in online platforms. There is thus an ever-growing interest in their Off-Policy Evaluation (OPE), …
Tatsuhiro Shimizu, Kazuki Kawamura, Takanori Muroi, Yusuke Narita, Kei Tateno, Takuma Udagawa, Yuta Saito
OpenReview
A General Framework for Off-Policy Learning with Partially-Observed Reward
Off-policy learning (OPL) in contextual bandits aims to learn a decision-making policy that maximizes the target rewards by using only …
Rikiya Takehi, Masahiro Asami, Kosuke Kawakami, Yuta Saito
OpenReview
Cross-Domain Off-Policy Evaluation and Learning for Contextual Bandits
Off-Policy Evaluation and Learning (OPE/L) in contextual bandits is rapidly gaining popularity in real systems because new policies can …
Yuta Natsubori, Masataka Ushiku, Yuta Saito
OpenReview
POTEC: Off-Policy Contextual Bandits for Large Action Spaces via Policy Decomposition
We study off-policy learning (OPL) of contextual bandit policies in large discrete action spaces where existing methods – most of …
Yuta Saito, Jihan Yao, Thorsten Joachims
OpenReview