
Center for Advanced Intelligence Project, Sequential Decision Making Team

Team Director: Shinji Ito (Ph.D.)


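Research Overview

The Sequential Decision Making Team develops algorithms and theory for making rational decisions sequentially under predictive uncertainty and changing environments. As information technology advances and large volumes of data are generated in real time, there is growing demand for techniques that turn such data into rational decisions. To meet this need, the team aims to understand effective decision-making algorithms for changing environments and to build and extend the theoretical framework that supports them, pursuing research related to online learning, bandit problems, and reinforcement learning.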

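Main Research Field

Informatics

Related Research Fields

Engineering
Mathematical and Physical Sciences
Principles of Informatics
Mathematical Informatics
Intelligent Informatics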

Keywords

Sequential decision making
Online learning
Bandit problems
Reinforcement learning
Learning theory

Selected Publications

1. S. Ito and K. Takemura:
"An Exploration-by-Optimization Approach to Best of Both Worlds in Linear Bandits"
Advances in Neural Information Processing Systems 36 (NeurIPS), to appear (2023).
2. S. Ito, D. Hatano, H. Sumita, K. Takemura, T. Fukunaga, N. Kakimura, and K.-I. Kawarabayashi:
"Bandit Task Assignment with Unknown Processing Time"
Advances in Neural Information Processing Systems 36 (NeurIPS), to appear (2023).
3. T. Tsuchiya, S. Ito, and J. Honda:
"Stability-penalty-adaptive follow-the-regularized-leader: Sparsity, game-dependency, and best-of-both-worlds"
Advances in Neural Information Processing Systems 36 (NeurIPS), to appear (2023).
4. S. Ito and K. Takemura:
"Best-of-Three-Worlds Linear Bandit Algorithm with Variance-Adaptive Regret Bounds"
Proceedings of the 36th Conference on Learning Theory (COLT), pp. 2653-2677 (2023).
5. T. Tsuchiya, S. Ito, and J. Honda:
"Further Adaptive Best-of-Both-Worlds Algorithm for Combinatorial Semi-Bandits"
Proceedings of the 26th International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 8117-8144 (2023).
6. J. Honda, S. Ito, and T. Tsuchiya:
"Follow-the-Perturbed-Leader Achieves Best-of-Both-Worlds for Bandit Problems"
Proceedings of the 34th International Conference on Algorithmic Learning Theory (ALT), pp. 726-754 (2023).
7. T. Tsuchiya, S. Ito, and J. Honda:
"Best-of-Both-Worlds Algorithms for Partial Monitoring"
Proceedings of the 34th International Conference on Algorithmic Learning Theory (ALT), pp. 1484-1515 (2023).
8. S. Ito, T. Tsuchiya, and J. Honda:
"Nearly Optimal Best-of-Both-Worlds Algorithms for Online Learning with Feedback Graphs"
Advances in Neural Information and Processing Systems 35 (NeurIPS), pp. 28631-28643 (2022).
9. S. Ito:
"Revisiting Online Submodular Minimization: Gap-Dependent Regret Bounds, Best of Both Worlds and Adversarial Robustness"
Proceedings of the 39th International Conference on Machine Learning (ICML), pp. 9678-9694 (2022).
10. S. Ito, T. Tsuchiya, and J. Honda:
"Adversarially Robust Multi-Armed Bandit Algorithm with Variance-Dependent Regret Bounds"
Proceedings of the 35th Conference on Learning Theory (COLT), pp. 1421-1422 (2022).

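Member List

Principal Investigator

Shinji Ito: Team Director

Members

本多淳也: Visiting Scientist
土屋平: Visiting Scientist
筒井和詩: Visiting Scientist
相馬輔: Visiting Scientist
大城泰平: Visiting Scientist
坂上晋作: Visiting Scientist
BAO Han: Visiting Scientist
玉腰勇司: Research Part-time Worker II
渋川裕生: Research Part-time Worker II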

Contact

Nihonbashi 1-chome Mitsui Building, 15th floor, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan
Email: shinji.ito.hh@riken.jp
