Outcomes

What our cohorts have produced.

Publications, fellowships, and other achievements by our scholars and mentors.

Inference-Time Toxicity Mitigation in Protein Language Models via Logit-Diff Amplification
Manuel Fernández Burda, Santiago Aranguri, Ivan Arcuschin, Enzo Ferrante. Generative and Experimental Perspectives for Biomolecular Design Workshop at ICLR 2026.
Benchmarking AI Control Protocols for Safety in Medical Question-Answering Tasks
Guido Freire, Agustín Martínez-Suñé, Viviana Cotik. Principled Design for Trustworthy AI Workshop at ICLR 2026.
White-Box Monitoring for Personality Mirroring in Conversational AI
Eitan Sprejer, Agustín E. Martínez-Suñé, Bruno Bianchi. Catch, Adapt, and Operate: Monitoring ML Models Under Drift Workshop at ICLR 2026.
What Large Language Models Know About Plant Molecular Biology
Manuel Fernández Burda et al. LatinX in AI (LXAI) Research Workshop at NeurIPS 2025. Acknowledges AISAR support; also submitted to Nature Plants.
Is Gemini 3 Scheming in the Wild?
Alejandro Wainstock, Agustín Martínez-Suñé, Ivan Arcuschin, Victor Braberman. LessWrong.

Two scholars were admitted to MARS (Cambridge).
A scholar was admitted to LASR Labs.
A scholar was admitted to ARENA.
Two scholars took part in ML4Good Brazil, one as a participant and one as a facilitator.
Two scholars took part in BlueDot's AI Strategy course, one as a participant and one as a facilitator.
A scholar gave a talk at a MILA (Quebec) reading group.
A scholar received a career-transition grant from Coefficient Giving.
Manuel Fernández Burda, one of our scholars, was co-first author of "Kaleidoscope: In-language Exams for Massively Multilingual Vision Evaluation" (ICLR 2026), work separate from his AISAR project.
A scholar's AISAR project became their undergraduate thesis.

Four of the six mentors would not have worked on AI Safety in 2025 without AISAR, and all six plan to continue working on it.
Two CONICET doctoral fellowships were secured for external students on AI Safety topics: one on confidence quantification for safer AI models, and one on formal foundations for reliable LLM-based software.