Outcomes
What our cohorts have produced.
Publications, fellowships, and other achievements by our scholars and mentors.
AISAR 2025
Publications
- Inference-Time Toxicity Mitigation in Protein Language Models via Logit-Diff Amplification
  Manuel Fernández Burda, Santiago Aranguri, Ivan Arcuschin, Enzo Ferrante. Generative and Experimental Perspectives for Biomolecular Design Workshop at ICLR 2026.
- Benchmarking AI Control Protocols for Safety in Medical Question-Answering Tasks
  Guido Freire, Agustín Martínez-Suñé, Viviana Cotik. Principled Design for Trustworthy AI Workshop at ICLR 2026.
- White-Box Monitoring for Personality Mirroring in Conversational AI
  Eitan Sprejer, Agustin E. Martinez-Sune, Bruno Bianchi. Catch, Adapt, and Operate: Monitoring ML Models Under Drift Workshop at ICLR 2026.
- What Large Language Models Know About Plant Molecular Biology
  Manuel Fernández Burda et al. LatinX in AI (LXAI) Research Workshop at NeurIPS 2025. Acknowledges AISAR support; also submitted to Nature Plants.
- Is Gemini 3 Scheming in the Wild?
  Alejandro Wainstock, Agustín Martínez-Suñé, Ivan Arcuschin, Victor Braberman. LessWrong.
Scholar outcomes
- Two scholars were admitted to MARS (Cambridge).
- A scholar was admitted to LASR Labs.
- A scholar was admitted to ARENA.
- Scholars took part in ML4Good Brazil, one as a participant and one as a facilitator.
- Scholars took part in BlueDot's AI Strategy course, one as a participant and one as a facilitator.
- A scholar gave a talk at a MILA (Quebec) reading group.
- A scholar received a career-transition grant from Coefficient Giving.
- Manuel Fernández Burda, one of our scholars, was co-first author of Kaleidoscope: In-language Exams for Massively Multilingual Vision Evaluation (ICLR 2026), separate from his AISAR project.
- A scholar's AISAR project became their undergraduate thesis.
Mentor outcomes
- Four of the six mentors would not have worked on AI Safety in 2025 without AISAR, and all six plan to continue.
- Two CONICET doctoral fellowships were secured for external students on AI Safety topics: one on confidence quantification for safer AI models, and one on formal foundations for reliable LLM-based software.