Hiroo Takizawa (SOKENDAI/NII LLMC) received the Encouragement Award (奨励賞) at the 20th Young Researchers Symposium on Natural Language Processing (YANS2025) for his presentation “Evaluating Context-Following Capabilities of LLMs in Japanese.”
Author: aizawa
(2025.9.23) Our website has moved to a new server.
Our website has moved to a new server. Some content is still under construction.
(2025.5.1) Seminar Talks by Dr. Frieder Simon
Dr. Frieder Simon gave a seminar talk titled “The AIMO (AI Math Olympiad) and LLMs’ mathematical abilities.” Thank you for the interesting talk.
(2025.9.3) The NTCIR-19 kickoff event was held
The NTCIR-19 kickoff event was held. We introduced a new pilot task.
(2025.9.3) The NTCIR-19 pilot task SciClaimEval has begun
For more details on SciClaimEval, click here.
(2025.9.1) MCQFormatBench dataset released
Multiple-Choice Questions (MCQs) are frequently used to evaluate Large Language Models (LLMs). We have released MCQFormatBench, an evaluation dataset designed to assess a model’s robustness to the MCQ format. We categorized the answer process for MCQs into four types and designed eight tasks. By converting 600 questions sampled from the MMLU dataset for each task,…