Seminars

AISAR 2025

  • Kei Nishimura-Gasparian
    Early Signs of Steganographic Capabilities in Frontier LLMs
  • Paul Bogdan & Uzay Macar
    Thought Anchors: Which LLM Reasoning Steps Matter?
  • Adrià Garriga-Alonso
    Reverse-engineering a neural network that plans: a mesa-optimizer model organism
  • Oscar Balcells Obeso
    Real-Time Detection of Hallucinated Entities in Long-Form Generation
  • Mikhail Terekhov
    Control Tax: The Price of Keeping AI in Check
  • Joar Skalse
    The Theoretical Foundations of Reward Learning
  • Fernando Rosas
    AI in a vat: Fundamental limits of efficient world modelling
  • Nora Ammann
    Safeguarded AI: a scalable workflow for safety-by-construction
  • Matthew Farrugia-Roberts
    Mitigating Goal Misgeneralization via Minimax Regret