Keynote Speakers

Building the Hybrid Human-AI Developer: From Code Completion to Agents

Abstract: AI has rapidly transformed the developer landscape, moving from experimental code completion features to indispensable tools integrated into modern IDEs. The impact is undeniable, with significant portions of code now being generated with AI assistance, accelerating development workflows worldwide.

The progress doesn’t stop at code completion. The next frontier is the development of sophisticated AI agents within the IDE, capable of more complex tasks and deeper collaboration. These agents are intelligent systems designed to understand context, perform multi-step actions, and interact collaboratively with the developer. Building these hybrid human-AI systems presents unique challenges: designing intuitive UX, building auxiliary models, and steering the agent towards effective collaboration with the user.

In this talk, Federico will discuss the evolution of AI copilots within IDEs, from the initial release of GitHub Copilot to the emerging paradigm of AI agents in Cursor. We will explore the critical aspects of building successful hybrid human-AI development environments, sharing insights and lessons learned on tackling the technical challenges involved in creating the next generation of intelligent developer tools.

Location: The talk will be held on Zoom.

Bio: Federico Cassano is a research scientist at Cursor, where he works on CodeLLM training methodology and infrastructure. His research interests broadly include code generation, distributed training, and reinforcement learning.

Towards Autonomous Language Model Systems

Abstract: Language models (LMs) are increasingly used to assist users in day-to-day tasks such as programming (GitHub Copilot) or search (Google's AI Overviews). But can we build language model systems that autonomously complete entire tasks end-to-end? In this talk I'll discuss our efforts to build autonomous LM systems, focusing on the software engineering domain. I'll present SWE-bench, our benchmark for measuring AI systems' ability to fix real issues in popular software libraries. I'll then discuss SWE-agent, our system for solving SWE-bench tasks. SWE-bench and SWE-agent are used by many leading AI organizations in academia and industry, including OpenAI, Anthropic, Meta, and Google, and SWE-bench has been downloaded over 2 million times. These projects show that academics on tight budgets can have substantial impact in steering the research community towards building autonomous systems that complete challenging tasks.

Location: The talk will be held on Zoom.

Bio: I am a postdoc at Princeton University, where I mainly work with Karthik Narasimhan's lab. I previously completed my PhD at the University of Washington in Seattle, where I was advised by Noah Smith. During my PhD I spent two years at Facebook AI Research on Luke Zettlemoyer's team.

πŸ“ All names are sorted alphabetically by last name.