Ryuto Koike

Email: ryuto.koike [at] nlp.comp.isct.ac.jp

About me

Hello! I am a postdoctoral researcher at the Institute of Science Tokyo, under Prof. Naoaki Okazaki. My research aims to make AI systems safer and more trustworthy by evaluating and mitigating risks such as misuse, privacy leakage, and unsafe behavior in AI agents. My work has been published in top-tier conferences such as ACL, EMNLP, and AAAI, and I have led multiple collaborative projects with Prof. Chris Callison-Burch at UPenn and Prof. Preslav Nakov at MBZUAI. Outside of academia, I served as a research advisor on multilingual text generation for a startup in Japan.

News

Jul. 2026: A paper was accepted to COLM 2026 @San Francisco 🇺🇸
May 2026: A paper was accepted to ICML 2026 MemFM Workshop @Seoul 🇰🇷
May 2026: A paper was accepted to ICML 2026 @Seoul 🇰🇷
Apr. 2026: Two papers with MBZUAI were accepted to ACL 2026 @San Diego 🇺🇸
Feb. 2026: Gave an invited talk (slides) @Google Research
Jan. 2025: Organized the GenAI Content Detection workshop at COLING 2025 @Abu Dhabi 🇦🇪
Oct. 2024: Started working in Chris Callison-Burch's group as a Visiting Researcher @University of Pennsylvania 🇺🇸
Sep. 2024: A paper was accepted to EMNLP 2024 @Miami 🇺🇸
May 2024: A paper was accepted to ACL 2024 @Bangkok 🇹🇭
Dec. 2023: A paper was accepted to AAAI 2024 @Vancouver 🇨🇦
Oct. 2023: Our paper OUTFOX was featured in Nikkei
Apr. 2023: Started my PhD journey at Okazaki Lab, Institute of Science Tokyo (formerly Tokyo Tech) 🇯🇵

Selected Works (*: equal contribution. †: undergraduate/master's mentee.) [ Google Scholar ]

	Machine Text Detectors are Membership Inference Attacks Ryuto Koike, Liam Dugan, Masahiro Kaneko, Chris Callison-Burch, Naoaki Okazaki ICML 2026 MemFM Workshop paper / code TL;DR - We theoretically prove that membership inference attacks (MIA) and machine-generated text detection share the same optimal metric, and empirically demonstrate strong cross-task transferability (ρ ≈ 0.7) across diverse domains and generators. Notably, a machine text detector outperforms a state-of-the-art MIA on MIA benchmarks. To support cross-task development and fair evaluation, we introduce MINT, a unified evaluation suite implementing 15 recent methods from both tasks. 🏆 Outstanding Young Researcher’s Paper (21/517 ≈ 4.1%) in ANLP 🗣️ Talk invited at Google Research
	ExaGPT: Example-Based Machine-Generated Text Detection for Human Interpretability Ryuto Koike, Masahiro Kaneko, Ayana Niwa, Preslav Nakov, Naoaki Okazaki ACL 2026 Findings paper / code TL;DR - We propose ExaGPT, an interpretable AI text detector that classifies a text by checking whether it shares more similar spans with human-written or machine-generated texts from a datastore, and presents those spans as evidence so that users can assess how reliable the decision is. ExaGPT achieves both high interpretability and strong detection performance, outperforming prior interpretable detectors by up to +37.0% accuracy at 1% FPR.
	OUTFOX: LLM-Generated Essay Detection Through In-Context Learning with Adversarially Generated Examples Ryuto Koike, Masahiro Kaneko, Naoaki Okazaki AAAI 2024 paper / data + code / technical appendix TL;DR - We propose OUTFOX, a framework that improves the robustness of AI text detectors by allowing both the detector and the attacker to adversarially learn from each other's output through in-context learning, achieving a +41.3% F1 improvement against strong adaptive attacks. This paper is among the first to effectively use AI to detect AI. 🏆 Double Sponsorship Awards (1/140 ≈ 0.7%) in YANS 📸 Featured in Nikkei, NAACL Tutorial, Originality.ai Blog 📈 >170 citations in Google Scholar
	How You Prompt Matters! Even Task-Oriented Constraints in Instructions Affect LLM-Generated Text Detection Ryuto Koike, Masahiro Kaneko, Naoaki Okazaki EMNLP 2024 Findings paper / data TL;DR - We reveal the vulnerabilities of AI text detectors to prompt diversity in text generation. Specifically, even task-oriented constraints -- constraints that would naturally be included in an instruction and are not related to detection-evasion -- degrade the detection performance of existing powerful detectors. We highlight the importance of ensuring prompt diversity to build robust benchmarks grounded in real-world scenarios.
	Likelihood-based Mitigation of Evaluation Bias in Large Language Models Masanari Ohi†, Masahiro Kaneko, Ryuto Koike, Mengsay Loem, Naoaki Okazaki ACL 2024 Findings paper / code TL;DR - We identify a self-preference bias in LLM-as-a-judge, i.e., LLMs overrate texts with higher likelihoods while underrating those with lower likelihoods. We further propose a simple yet effective mitigation method via in-context learning, achieving better alignment with human evaluations. 🏆 Outstanding Young Researcher’s Paper (18/427 ≈ 4.2%) in ANLP 🏆 Best Paper Award in Journal of Natural Language Processing

All Publications

Journal

Ryuto Koike, Masahiro Kaneko, Naoaki Okazaki. On the Robustness of LLM-Generated Text Detection Against Instruction Diversity. Journal of Natural Language Processing. 2026. [paper]
Masanari Ohi, Masahiro Kaneko, Ryuto Koike, Mengsay Loem, Naoaki Okazaki. Likelihood-based Mitigation of Evaluation Bias in Large Language Models. Journal of Natural Language Processing. 2025. [paper] Best Paper Award 🎉
Ryuto Koike, Masafumi Hagiwara. A support system for generating attractive YouTube titles using text style transfer. Transactions of Japan Society of Kansei Engineering. 2022. [paper]

Preprints

Koshiro Saito†, Ryuto Koike, Masahiro Kaneko, Naoaki Okazaki. LLM Output Detectability and Task Performance Can be Jointly Optimized. arXiv. 2026. [paper]

International Conference

Tatsuya Ichinose†, Youmi Ma, Masanari Oi, Ryuto Koike, Naoaki Okazaki. Synthesizing Instruction-Tuning Datasets with Contrastive Decoding. COLM 2026. [paper]
Ryuto Koike*, Liam Dugan*, Masahiro Kaneko, Chris Callison-Burch, Naoaki Okazaki. Machine Text Detectors are Membership Inference Attacks. ICML 2026 Workshop on the Impact of Memorization on Trustworthy Foundation Models (MemFM). [paper]
Masanari Oi, Koki Maeda, Ryuto Koike, Daisuke Oba, Nakamasa Inoue, Naoaki Okazaki. From Correspondence to Actions: Human-Like Multi-Image Spatial Reasoning in Multi-modal Large Language Models. ICML 2026. [paper]
Ryuto Koike, Masahiro Kaneko, Ayana Niwa, Preslav Nakov, Naoaki Okazaki. ExaGPT: Example-Based Machine-Generated Text Detection for Human Interpretability. ACL 2026 Findings. [paper]
Yuxia Wang, Rui Xing, Jonibek Mansurov, Giovanni Puccetti, Zhuohan Xie, Minh Ngoc Ta, Jiahui Geng, Jinyan Su, Mervat Abassy, Saad El Dine Ahmed, Kareem Elozeiri, Nurkhan Laiyk, Maiya Goloburda, Tarek Mahmoud, Raj Vardhan Tomar, Alexander Aziz, Ryuto Koike, Masahiro Kaneko, Artem Shelmanov, Ekaterina Artemova, Vladislav Mikhailov, Akim Tsvigun, Alham Fikri Aji, Nizar Habash, Iryna Gurevych, Preslav Nakov. Is Human-like Text Liked by Human? Multilingual Human Detection and Preference Against AI. ACL 2026 (Oral) [paper]
Ryuto Koike, Masahiro Kaneko, Naoaki Okazaki. OUTFOX: LLM-Generated Essay Detection Through In-Context Learning with Adversarially Generated Examples. AAAI 2024. [paper]
Ryuto Koike, Masahiro Kaneko, Naoaki Okazaki. How You Prompt Matters! Even Task-Oriented Constraints in Instructions Affect LLM-Generated Text Detection. EMNLP 2024 Findings. [paper]
Masanari Ohi†, Masahiro Kaneko, Ryuto Koike, Mengsay Loem, Naoaki Okazaki. Likelihood-based Mitigation of Evaluation Bias in Large Language Models. ACL 2024 Findings. [paper]
Yuxia Wang, Artem Shelmanov, Jonibek Mansurov, Akim Tsvigun, Vladislav Mikhailov, Rui Xing, Zhuohan Xie, Jiahui Geng, Giovanni Puccetti, Ekaterina Artemova, Jinyan Su, Minh Ngoc Ta, Mervat Abassy, Kareem Elozeiri, Saad El Dine Ahmed, Maiya Goloburda, Tarek Mahmoud, Raj Vardhan Tomar, Alexander Aziz, Nurkhan Laiyk, Osama Mohammed Afzal, Ryuto Koike, Masahiro Kaneko, Alham Fikri Aji, Nizar Habash, Iryna Gurevych, Preslav Nakov. GenAI Content Detection Task 1: English and Multilingual Machine-generated Text Detection: AI vs. Human. COLING 2025 Workshop on GenAI Content Detection (GenAIDetect). [paper]

Domestic Conference and Symposium (in Japanese)

小池隆斗*, Liam Dugan*, 金子正弘, Chris Callison-Burch, 岡崎直観. 機械生成文検出とメンバーシップ推論は相互に転移可能. 言語処理学会第32回年次大会 (NLP2026), 2026年3月. 若手奨励賞 🎉
一瀬達矢, Youmi Ma, 大井聖也, 小池隆斗, 岡崎直観. 対照的デコーディングを用いた指示学習データの合成. 言語処理学会第32回年次大会 (NLP2026), 2026年3月.
島田比奈理, 大葉大輔, 小池隆斗, 金子正弘, 岡崎直観. 現在と将来の応答の有害性を低減させることによるマルチターン脱獄攻撃の防御手法. 言語処理学会第32回年次大会 (NLP2026), 2026年3月.
齋藤幸史郎, 小池隆斗, 金子正弘, 岡崎直観. 機械文としての検出されやすさと文章の品質は両立する. 言語処理学会第32回年次大会 (NLP2026), 2026年3月. 若手奨励賞 🎉 スポンサー賞（CyberAgent, ELYZA）🎉
島田比奈理, 大葉大輔, 小池隆斗, 金子正弘, 岡崎直観. マルチターンJailbreak攻撃に対する防御アルゴリズムの提案. 第20回YANSシンポジウム (YANS2025), S3-P49, 2025年9月. スポンサー賞（Polaris.AI）🎉
齋藤幸史郎, 小池隆斗, 金子正弘, 岡崎直観. PUPPET：タスク性能を維持しながらLLMとして検出されやすくする学習フレームワーク. 第20回YANSシンポジウム (YANS2025), S2-P13, 2025年9月.
一瀬達矢, Youmi Ma, 大井聖也, 小池隆斗, 岡崎直観. 対照的デコーディングを用いた指示学習データの合成. 第20回YANSシンポジウム (YANS2025), S1-P34, 2025年9月. スポンサー賞（CyberAgent）🎉
小池隆斗, 金子正弘, 丹羽彩奈, Preslav Nakov, 岡崎直観. ExaGPT: 解釈性向上に向けた事例ベース機械文検出. 人工知能学会第39回年次大会 (JSAI2025), 2025年5月.
齋藤幸史郎, 小池隆斗, 金子正弘, 岡崎直観. PUPPET：タスク性能を維持しながらLLMとして検出されやすくする学習フレームワーク. 言語処理学会第31回年次大会 (NLP2025), P7-5, pp. 2791–2796, 2025年3月.
齋藤幸史郎, 小池隆斗, 金子正弘, 岡崎直観. 強化学習を用いた、言語理解能力を維持したLLM検出器の性能向上. 第19回YANSシンポジウム (YANS2024), S1-P23, 2024年9月. 奨励賞 🎉 スポンサー賞（CyberAgent）🎉
大井聖也, 金子正弘, 小池隆斗, Mengsay Loem, 岡崎直観. 大規模言語モデルにおける評価バイアスの尤度に基づく緩和. 言語処理学会第30回年次大会 (NLP2024), A11-4, pp. 3021–3026, 2024年3月. 若手奨励賞 🎉
小池隆斗, 金子正弘, 岡崎直観. 制約が異なる指示で生成された文章に対するLLM生成検出の頑健性. 言語処理学会第30回年次大会 (NLP2024), A4-4, pp. 943–948, 2024年3月.
小池隆斗, 金子正弘, 岡崎直観. 敵対的事例を用いたIn-context learningによるLLM生成エッセイの検出. 第18回NLP若手の会シンポジウム, S3-P13, 2023年8月. スポンサー賞（PKSHA Technology, HAKUHODO Technologies）🎉

Experiences
	Institute of Science Tokyo, Tokyo, Japan Doctoral Researcher (2023.04 - 2026.03) Advisor: Prof. Naoaki Okazaki
	University of Pennsylvania, Philadelphia, PA, USA Visiting Researcher (2024.10 - 2025.10) Advisor: Prof. Chris Callison-Burch
	Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), Abu Dhabi, UAE Research Collaborator (2024.04 - 2025.04) Advisor: Prof. Preslav Nakov
	Exawizards, Inc., Tokyo, Japan Machine Learning Engineer Intern (2022.02 - 2022.03)
	CyberAgent, Inc., Tokyo, Japan Research Intern (2021.09 - 2022.01) Software Engineer Intern (2021.07 - 2021.08)

Education
	Institute of Science Tokyo (formerly Tokyo Institute of Technology), Tokyo, Japan Ph.D. in Computer Science (2023.04 - 2026.03)
	Keio University, Tokyo, Japan M.S. in Information and Computer Science (2021.04 - 2023.03) B.S. in Information and Computer Science (2017.04 - 2021.03)

Grants

Off-Campus Study Plus in Tokyo Tech SPRING Scholarship
Tokyo Institute of Technology, 2024.
Research Funds: 900,000 JPY / approx. 6,000 USD
Tobitate! (Leap for Tomorrow) Study Abroad Scholarship (Acceptance Rate: 16.7%)
The Ministry of Education, Culture, Sports, Science and Technology (MEXT), 2024.
Scholarship: 1,920,000 JPY / approx. 12,100 USD per year, Preparation Funds: 350,000 JPY / approx. 2,200 USD
Tokyo Tech SPRING Scholarship
Tokyo Institute of Technology, Apr. 2024 - Mar. 2026.
Scholarship: 2,160,000 JPY / approx. 14,400 USD per year, Research Funds: 300,000 JPY / approx. 2,000 USD per year,
Full Tuition Exemption.
Tokyo Tech Advanced Human Resource Development Fellowship for Doctoral Students
Tokyo Institute of Technology, Apr. 2023 - Mar. 2024.
Scholarship: 1,800,000 JPY / approx. 12,000 USD per year, Research Funds: 300,000 JPY / approx. 2,000 USD per year,
Full Tuition Exemption.

Talks

Invited talk. Machine Text Detectors are Membership Inference Attacks. Google Privacy ML Seminar. Virtual. February 4, 2026. [slides]

Honors

PhD Graduate Representative
Department of Computer Science, School of Computing, Institute of Science Tokyo.
Outstanding Young Researcher’s Paper Award
The 32nd Annual Meeting of The Association for Natural Language Processing (NLP 2026).
Sponsorship Awards from CyberAgent, Inc. and ELYZA, Inc.
The 32nd Annual Meeting of The Association for Natural Language Processing (NLP 2026).
Best Paper Award
Journal of Natural Language Processing. 2025.
Sponsorship Award from CyberAgent, Inc.
The 20th Symposium of Young Researcher Association for NLP Studies (YANS 2025).
Sponsorship Award from Polaris.AI
The 20th Symposium of Young Researcher Association for NLP Studies (YANS 2025).
Encouragement Award and Sponsorship Award from CyberAgent, Inc.
The 19th Symposium of Young Researcher Association for NLP Studies (YANS 2024).
Sponsorship Awards from PKSHA Technology and HAKUHODO Technologies
The 18th Symposium of Young Researcher Association for NLP Studies (YANS 2023).
Silver Medal
Kaggle, Mechanisms of Action (MoA) Prediction, 2020.

Academic Service

Reviewer/Program Committee: ACL (2023-2026), EMNLP (2024-2025), ICLR (2026), ICML (2026), NeurIPS (2025), AAAI (2026), COLM (2026), COLING (2025), AACL (2025)
Journal Reviewer: Journal of Natural Language Processing
Workshop/Shared Task Organizer: GenAI Content Detection (GenAIDetect), COLING 2025