Ryuto Koike

Email: ryuto.koike [at] nlp.c.titech.ac.jp

Google Scholar | GitHub | Twitter | LinkedIn | CV
About me

Hello! I am a final-year PhD student (est. March 2026) at the Institute of Science Tokyo, advised by Prof. Naoaki Okazaki. My research aims to develop principles and methods for building safe, secure, and reliable AI systems and for mitigating their negative societal impact. My current work focuses on membership inference attacks, AI-generated text detection, jailbreak defense, and reliable LLM-as-a-judge evaluation. My work has been published at top-tier venues such as ACL, EMNLP, and AAAI, and I have led multiple collaborative projects with Prof. Chris Callison-Burch at UPenn and Prof. Preslav Nakov at MBZUAI. Outside of academia, I served as a research advisor on multilingual text generation for a startup in Japan.

Selected Works (*: equal contribution; †: undergraduate/master's mentee)

Machine Text Detectors are Membership Inference Attacks
Ryuto Koike*, Liam Dugan*, Masahiro Kaneko, Chris Callison-Burch, Naoaki Okazaki
Preprint 2025
TL;DR - We theoretically prove that membership inference attacks (MIA) and machine-generated text detection share the same optimal metric, and empirically demonstrate strong cross-task transferability (ρ > 0.6) across diverse domains and generators. Notably, a machine text detector outperforms a state-of-the-art MIA on MIA benchmarks. To support cross-task development and fair evaluation, we introduce MINT, a unified evaluation suite implementing 15 recent methods from both tasks.
🏆 Outstanding Young Researcher’s Paper (21/517 ≈ 4.1%) in ANLP

ExaGPT: Example-Based Machine-Generated Text Detection for Human Interpretability
Ryuto Koike, Masahiro Kaneko, Ayana Niwa, Preslav Nakov, Naoaki Okazaki
Preprint 2025
TL;DR - We propose ExaGPT, an interpretable AI text detector that classifies a text by checking whether it shares more similar spans with human-written or machine-generated texts in a datastore, and presents those spans as evidence so that users can assess how reliable the decision is. ExaGPT achieves both high interpretability and strong detection performance, outperforming prior interpretable detectors by up to +37.0% accuracy at 1% FPR.

OUTFOX: LLM-Generated Essay Detection Through In-Context Learning with Adversarially Generated Examples
Ryuto Koike, Masahiro Kaneko, Naoaki Okazaki
AAAI 2024
TL;DR - We propose OUTFOX, a framework that improves the robustness of AI text detectors by allowing both the detector and the attacker to adversarially learn from each other's output through in-context learning, achieving a +41.3% F1 improvement against strong adaptive attacks. This paper is among the first to effectively use AI to detect AI.
🏆 Double Sponsorship Awards (1/140 ≈ 0.7%) in YANS
📸 Featured in Nikkei, NAACL Tutorial, Originality.ai Blog
📈 >150 citations in Google Scholar

How You Prompt Matters! Even Task-Oriented Constraints in Instructions Affect LLM-Generated Text Detection
Ryuto Koike, Masahiro Kaneko, Naoaki Okazaki
EMNLP Findings 2024
TL;DR - We reveal the vulnerabilities of AI text detectors to prompt diversity in text generation. Specifically, even task-oriented constraints (constraints that would naturally be included in an instruction and are unrelated to detection evasion) degrade the performance of existing strong detectors. We highlight the importance of ensuring prompt diversity to build robust benchmarks grounded in real-world scenarios.

Likelihood-based Mitigation of Evaluation Bias in Large Language Models
Masanari Ohi†, Masahiro Kaneko, Ryuto Koike, Mengsay Loem, Naoaki Okazaki
ACL Findings 2024
TL;DR - We identify a self-preference bias in LLM-as-a-judge, i.e., LLMs overrate texts with higher likelihoods while underrating those with lower likelihoods. We further propose a simple yet effective mitigation method via in-context learning, achieving better alignment with human evaluations.
🏆 Outstanding Young Researcher’s Paper (18/427 ≈ 4.2%) in ANLP
🏆 Best Paper Award in Journal of Natural Language Processing
All Publications

Journal

  • Ryuto Koike, Masahiro Kaneko, Naoaki Okazaki. On the Robustness of LLM-Generated Text Detection Against Instruction Diversity. Journal of Natural Language Processing, 33(1), to appear, March 2026.
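  • Masanari Ohi, Masahiro Kaneko, Ryuto Koike, Mengsay Loem, Naoaki Okazaki. Likelihood-based Mitigation of Evaluation Bias in Large Language Models (in Japanese). Journal of Natural Language Processing, 32(2):480–496, July 2025. Best Paper Award 🎉
  • Ryuto Koike, Masafumi Hagiwara. A Support System for Generating YouTube Video Titles Using Style Transfer (in Japanese). Transactions of Japan Society of Kansei Engineering, 21(1):49–55, 2022.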

International Conferences

  • Ryuto Koike, Masahiro Kaneko, Naoaki Okazaki. OUTFOX: LLM-generated Essay Detection through In-context Learning with Adversarially Generated Examples. AAAI, pp. 21258–21266, Vancouver, Canada, February 2024.
  • Ryuto Koike, Masahiro Kaneko, Naoaki Okazaki. How You Prompt Matters! Even Task-Oriented Constraints in Instructions Affect LLM-Generated Text Detection. Findings of EMNLP, pp. 14384–14395, Miami, Florida, USA, November 2024.
  • Masanari Ohi, Masahiro Kaneko, Ryuto Koike, Mengsay Loem, Naoaki Okazaki. Likelihood-based Mitigation of Evaluation Bias in Large Language Models. Findings of ACL, pp. 3237–3245, Bangkok, Thailand, August 2024.
  • Yuxia Wang, Artem Shelmanov, Jonibek Mansurov, Akim Tsvigun, Vladislav Mikhailov, Rui Xing, Zhuohan Xie, Jiahui Geng, Giovanni Puccetti, Ekaterina Artemova, Jinyan Su, Minh Ngoc Ta, Mervat Abassy, Kareem Elozeiri, Saad El Dine Ahmed, Maiya Goloburda, Tarek Mahmoud, Raj Vardhan Tomar, Alexander Aziz, Nurkhan Laiyk, Osama Mohammed Afzal, Ryuto Koike, Masahiro Kaneko, Alham Fikri Aji, Nizar Habash, Iryna Gurevych, Preslav Nakov. GenAI Content Detection Task 1: English and Multilingual Machine-generated Text Detection: AI vs. Human. Proceedings of the 1st Workshop on GenAI Content Detection (GenAIDetect), pp. 244–261, COLING, Abu Dhabi, UAE.

Preprints

  • Masanari Oi, Koki Maeda, Ryuto Koike, Daisuke Oba, Nakamasa Inoue, Naoaki Okazaki. From Correspondence to Actions: Human-Like Multi-Image Spatial Reasoning in Multi-modal Large Language Models. arXiv. 2026.
  • Ryuto Koike*, Liam Dugan*, Masahiro Kaneko, Chris Callison-Burch, Naoaki Okazaki. Machine Text Detectors are Membership Inference Attacks. arXiv. 2025.
  • Ryuto Koike, Masahiro Kaneko, Ayana Niwa, Preslav Nakov, Naoaki Okazaki. ExaGPT: Example-Based Machine-Generated Text Detection for Human Interpretability. arXiv. 2025.
  • Yuxia Wang, Rui Xing, Jonibek Mansurov, Giovanni Puccetti, Zhuohan Xie, Minh Ngoc Ta, Jiahui Geng, Jinyan Su, Mervat Abassy, Saad El Dine Ahmed, Kareem Elozeiri, Nurkhan Laiyk, Maiya Goloburda, Tarek Mahmoud, Raj Vardhan Tomar, Alexander Aziz, Ryuto Koike, Masahiro Kaneko, Artem Shelmanov, Ekaterina Artemova, Vladislav Mikhailov, Akim Tsvigun, Alham Fikri Aji, Nizar Habash, Iryna Gurevych, Preslav Nakov. Is Human-like Text Liked by Human? Multilingual Human Detection and Preference Against AI. arXiv. 2025.

Domestic Conferences and Symposiums

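  • Ryuto Koike*, Liam Dugan*, Masahiro Kaneko, Chris Callison-Burch, Naoaki Okazaki. Machine-Generated Text Detection and Membership Inference Are Mutually Transferable (in Japanese). The 32nd Annual Meeting of the Association for Natural Language Processing (NLP2026), March 2026. Outstanding Young Researcher's Paper 🎉
  • Tatsuya Ichinose, Youmi Ma, Masanari Ohi, Ryuto Koike, Naoaki Okazaki. Synthesizing Instruction-Tuning Data with Contrastive Decoding (in Japanese). The 32nd Annual Meeting of the Association for Natural Language Processing (NLP2026), March 2026.
  • Hinari Shimada, Daisuke Oba, Ryuto Koike, Masahiro Kaneko, Naoaki Okazaki. A Defense Method Against Multi-Turn Jailbreak Attacks by Reducing the Harmfulness of Current and Future Responses (in Japanese). The 32nd Annual Meeting of the Association for Natural Language Processing (NLP2026), March 2026.
  • Koshiro Saito, Ryuto Koike, Masahiro Kaneko, Naoaki Okazaki. Detectability as Machine-Generated Text and Text Quality Are Compatible (in Japanese). The 32nd Annual Meeting of the Association for Natural Language Processing (NLP2026), March 2026. Outstanding Young Researcher's Paper 🎉 Sponsorship Awards (CyberAgent, ELYZA) 🎉
  • Hinari Shimada, Daisuke Oba, Ryuto Koike, Masahiro Kaneko, Naoaki Okazaki. Proposal of a Defense Algorithm Against Multi-Turn Jailbreak Attacks (in Japanese). The 20th YANS Symposium (YANS2025), S3-P49, September 2025. Sponsorship Award (Polaris.AI) 🎉
  • Koshiro Saito, Ryuto Koike, Masahiro Kaneko, Naoaki Okazaki. PUPPET: A Training Framework for Making Text More Detectable as LLM-Generated While Maintaining Task Performance (in Japanese). The 20th YANS Symposium (YANS2025), S2-P13, September 2025.
  • Tatsuya Ichinose, Youmi Ma, Masanari Ohi, Ryuto Koike, Naoaki Okazaki. Synthesizing Instruction-Tuning Data with Contrastive Decoding (in Japanese). The 20th YANS Symposium (YANS2025), S1-P34, September 2025. Sponsorship Award (CyberAgent) 🎉
  • Ryuto Koike, Masahiro Kaneko, Ayana Niwa, Preslav Nakov, Naoaki Okazaki. ExaGPT: Example-Based Machine-Generated Text Detection Toward Improved Interpretability (in Japanese). The 39th Annual Conference of the Japanese Society for Artificial Intelligence (JSAI2025), May 2025.
  • Koshiro Saito, Ryuto Koike, Masahiro Kaneko, Naoaki Okazaki. PUPPET: A Training Framework for Making Text More Detectable as LLM-Generated While Maintaining Task Performance (in Japanese). The 31st Annual Meeting of the Association for Natural Language Processing (NLP2025), P7-5, pp. 2791–2796, March 2025.
  • Koshiro Saito, Ryuto Koike, Masahiro Kaneko, Naoaki Okazaki. Improving LLM Detector Performance with Reinforcement Learning While Maintaining Language Understanding Ability (in Japanese). The 19th YANS Symposium (YANS2024), S1-P23, September 2024. Encouragement Award 🎉 Sponsorship Award (CyberAgent) 🎉
  • Masanari Ohi, Masahiro Kaneko, Ryuto Koike, Mengsay Loem, Naoaki Okazaki. Likelihood-based Mitigation of Evaluation Bias in Large Language Models (in Japanese). The 30th Annual Meeting of the Association for Natural Language Processing (NLP2024), A11-4, pp. 3021–3026, March 2024. Outstanding Young Researcher's Paper 🎉
  • Ryuto Koike, Masahiro Kaneko, Naoaki Okazaki. Robustness of LLM-Generated Text Detection Against Texts Generated from Instructions with Different Constraints (in Japanese). The 30th Annual Meeting of the Association for Natural Language Processing (NLP2024), A4-4, pp. 943–948, March 2024.
  • Ryuto Koike, Masahiro Kaneko, Naoaki Okazaki. Detecting LLM-Generated Essays via In-Context Learning with Adversarial Examples (in Japanese). The 18th Symposium of the Young Researcher Association for NLP Studies (YANS), S3-P13, August 2023. Sponsorship Awards (PKSHA Technology, HAKUHODO Technologies) 🎉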
Experiences
Institute of Science Tokyo, Tokyo, Japan
Doctoral Researcher (2023.04 - Present)
Advisor: Prof. Naoaki Okazaki
University of Pennsylvania, Philadelphia, PA, USA
Visiting Researcher (2024.10 - 2025.10)
Advisor: Prof. Chris Callison-Burch
Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), Abu Dhabi, UAE
Research Collaborator (2024.04 - 2025.04)
Advisor: Prof. Preslav Nakov
Exawizards, Inc., Tokyo, Japan
Machine Learning Engineer Intern (2022.02 - 2022.03)
CyberAgent, Inc., Tokyo, Japan
Research Intern (2021.09 - 2022.01)
Software Engineer Intern (2021.07 - 2021.08)
Education
Institute of Science Tokyo (formerly Tokyo Institute of Technology), Tokyo, Japan
Ph.D. in Computer Science (2023.04 - est. 2026.04)
Keio University, Tokyo, Japan
M.S. in Information and Computer Science (2021.04 - 2023.03)
B.S. in Information and Computer Science (2017.04 - 2021.03)

©︎ Ryuto Koike / Design: Jon Barron.