Yuta Saito
Off-Policy Evaluation
Long-term Off-Policy Evaluation and Learning
Short- and long-term outcomes of an algorithm often differ, with damaging downstream effects. A known example is a click-bait …
Yuta Saito, Himan Abdollahpouri, Jesse Anderton, Ben Carterette, Mounia Lalmas
Cite
Off-Policy Evaluation of Slate Bandit Policies via Optimizing Abstraction
We study off-policy evaluation (OPE) in slate contextual bandits where a policy selects multi-dimensional actions known as slates. This …
Haruka Kiyohara, Masahiro Nomura, Yuta Saito
Cite | arXiv
Towards Assessing and Benchmarking Risk-Return Tradeoff of Off-Policy Evaluation
Off-Policy Evaluation (OPE) aims to assess the effectiveness of counterfactual policies using only offline logged data and is often …
Haruka Kiyohara, Ren Kishimoto, Kosuke Kawakami, Ken Kobayashi, Kazuhide Nakata, Yuta Saito
Cite | Code | arXiv
Off-Policy Evaluation of Ranking Policies under Diverse User Behavior
Ranking interfaces are everywhere in online platforms. There is thus an ever growing interest in their Off-Policy Evaluation (OPE), …
Haruka Kiyohara, Tatsuya Matsuhiro, Yusuke Narita, Nobuyuki Shimizu, Yasuo Yamamoto, Yuta Saito
Cite | Code | Poster | Slides | arXiv | Proceedings
Off-Policy Evaluation for Large Action Spaces via Conjunct Effect Modeling
We study off-policy evaluation (OPE) of contextual bandit policies for large discrete action spaces where conventional …
Yuta Saito, Qingyang Ren, Thorsten Joachims
Cite | Code | Poster | Slides | arXiv | Proceedings
Policy-Adaptive Estimator Selection for Off-Policy Evaluation
Off-policy evaluation (OPE) aims to accurately evaluate the performance of counterfactual policies using only offline logged data. …
Takuma Udagawa, Haruka Kiyohara, Yusuke Narita, Yuta Saito, Kei Tateno
Cite | Code | Slides | arXiv
Foundations and Trends of Off-Policy Evaluation (オフ方策評価の基礎と動向)
Yuta Saito
Journal
Counterfactual Evaluation and Learning for Interactive Systems
Counterfactual estimators enable the use of existing log data to estimate how some new target recommendation policy would have …
Yuta Saito, Thorsten Joachims
Cite | Code | Proceedings | Website
Off-Policy Evaluation for Large Action Spaces via Embeddings
Off-policy evaluation (OPE) in contextual bandits has seen rapid adoption in real-world systems, since it enables offline evaluation of …
Yuta Saito, Thorsten Joachims
Cite | Code | Video | Slides | arXiv | Proceedings
Doubly Robust Off-Policy Evaluation for Ranking Policies under the Cascade Behavior Model
In real-world recommender systems and search engines, optimizing ranking decisions to present a ranked list of relevant items is …
Haruka Kiyohara, Yuta Saito, Tatsuya Matsuhiro, Yusuke Narita, Nobuyuki Shimizu, Yasuo Yamamoto
Cite | Code | arXiv | Proceedings