Publications

Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.

1 - 15 of 10501 publications
    Deep Multi-modal Species Occupancy Modeling
    Timm Haucke
    Yunyi Shen
    Levente Klein
    David Rolnick
    Lauren Gillespie
    Sara Beery
    bioRxiv (2025)
    Occupancy models are tools for modeling the relationship between habitat and species occurrence while accounting for the fact that species may still be present even if not detected. The types of environmental variables typically used for characterizing habitats in such ecological models, such as precipitation or tree cover, are frequently of low spatial resolution, with a single value for a spatial pixel size of, e.g., 1 km². This spatial scale fails to capture the nuances of micro-habitat conditions that can strongly influence species presence, and additionally, as many of these are derived from satellite data, there are aspects of the environment they cannot capture, such as the structure of vegetation below the forest canopy. We propose to combine high-resolution satellite and ground-level imagery to produce multi-modal environmental features that better capture micro-habitat conditions, and incorporate these multi-modal features into hierarchical Bayesian species occupancy models. We leverage pre-trained deep learning models to flexibly capture relevant information directly from raw imagery, in contrast to traditional approaches which rely on derived and/or hand-crafted sets of ecosystem covariates. We implement deep multi-modal species occupancy modeling using a new open-source Python package for ecological modeling, designed for bridging machine learning and statistical ecology. We test our method under a strict evaluation protocol on 16 mammal species across thousands of camera traps in Snapshot USA surveys, and find that multi-modal features substantially enhance predictive power compared to traditional environmental variables alone. Our results not only highlight the predictive value and complementarity of in-situ samples, but also make the case for more closely integrating deep learning models and traditional statistical ecological models.
    Perceptual Audio Coding: A 40-Year Historical Perspective
    Juergen Herre
    Schuyler Quackenbush
    Minje Kim
    2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2025)
    In the history of audio and acoustic signal processing, perceptual audio coding has certainly excelled as a bright success story by its ubiquitous deployment in virtually all digital media devices, such as computers, tablets, mobile phones, set-top-boxes, and digital radios. From a technology perspective, perceptual audio coding has undergone tremendous development from the first very basic perceptually driven coders (including the popular mp3 format) to today’s full-blown integrated coding/rendering systems. This paper provides a historical overview of this research journey by pinpointing the pivotal development steps in the evolution of perceptual audio coding. Finally, it provides thoughts about future directions in this area.
    Users of routing services like Apple Maps, Google Maps, and Waze frequently wonder why a given route is proposed. This question particularly arises when dynamic conditions like traffic and road closures cause unusual routes to be proposed. While many such dynamic conditions may exist in a road network at any time, only a small fraction of those conditions are typically relevant to a given user's route. In this work, we give a simple algorithm that identifies a small set of traffic-laden road segments that answer the following question: Which traffic conditions cause a particular shortest traffic-aware route to differ from the shortest traffic-free route? We theoretically and experimentally show that our algorithm generates small and interpretable answers to this question.
    Perceptual Evaluation of a Mix Presentation for Immersive Audio with IAMF
    Carlos Tejeda-Ocampo
    Toni Hirvonen
    Ema Souza-Blanes
    Mahmoud Namazi
    AES 158th Convention of the Audio Engineering Society (2025)
    Immersive audio mix presentations involve transmitting and rendering several audio elements simultaneously. This enables next-generation applications, such as personalized playback. Using immersive loudspeaker and headphone MUSHRA tests, we investigate bitrate vs. quality for a typical mix presentation use case of a foreground stereo element, plus a background Ambisonics scene. For coding, we use Immersive Audio Model and Formats, a recently proposed system for Next-Generation Audio. Excellent quality is achieved at 384 kbit/s even with a reasonable amount of personalization. We also propose a framework for content-aware analysis that can significantly reduce the bitrate when using underlying legacy audio coding instances.
    Measuring software development can help drive impactful change. However, it’s a complex task, and getting started can be daunting as it involves understanding what you should measure, and determining what you can measure. This article provides a guide to selecting a framework that aligns with organizational measurement strategy.
    Matryoshka Model Learning for Improved Elastic Student Models
    Chetan Verma
    Cho-Jui Hsieh
    Ngot Bui
    Yang Zhang
    Wen Chen
    Xin Liu
    Inderjit Dhillon
    2025
    Industry-grade ML models are carefully designed to meet rapidly evolving serving constraints, which requires significant resources for model development. In this paper, we propose MatTA, a framework for training multiple accurate Student models using a novel Teacher-TA-Student recipe. TA models are larger versions of the Student models with higher capacity, and thus allow Student models to better relate to the Teacher model and also bring in more domain-specific expertise. Furthermore, multiple accurate Student models can be extracted from the TA model. Therefore, despite only one training run, our methodology provides multiple servable options to trade off accuracy for lower serving cost. We demonstrate the proposed method, MatTA, on proprietary datasets and models. Its practical efficacy is underscored by live A/B tests within a production ML system, demonstrating a 20% improvement on a key metric. We also demonstrate our method on GPT-2 Medium, a public model, and achieve relative improvements of over 24% on SAT Math and over 10% on the LAMBADA benchmark.
    This tutorial examines the progress and scaling limitations of IM-DD based optical technologies and explores how coherent technology optimized for datacenter use cases, including a newly proposed polarization-folding, time-diversity approach and a novel single-sideband coherent detection technology, can address some of these challenges.
    Judging an action’s safety requires knowledge of the context in which the action takes place. To human agents who act in various contexts, this may seem obvious: performing an action such as email deletion may or may not be appropriate depending on the email’s content, the goal (e.g., to erase sensitive emails or to clean up trash), and the type of email address (e.g., work or personal). Unlike people, computational systems have often had only limited agency in limited contexts. Thus, manually crafted policies and user confirmation (e.g., smartphone app permissions or network access control lists), while imperfect, have sufficed to restrict harmful actions. However, with the upcoming deployment of generalist agents that support a multitude of tasks (e.g., an automated personal assistant), we argue that we must rethink security designs to adapt to the scale of contexts and capabilities of these systems. As a first step, this paper explores contextual security in the domain of agents and proposes contextual agent security (Conseca), a framework to generate just-in-time, contextual, and human-verifiable security policies.
    Deletion Robust Non-Monotone Submodular Maximization over Matroids
    Paul Duetting
    Federico Fusco
    Ashkan Norouzi Fard
    Journal of Machine Learning Research, 26 (2025), pp. 1-28
    Maximizing a submodular function is a fundamental task in machine learning, and in this paper we study the deletion-robust version of the problem under the classic matroid constraint. Here the goal is to extract a small summary of the dataset that contains a high-value independent set even after an adversary has deleted some elements. We present constant-factor approximation algorithms whose space complexity depends on the rank $k$ of the matroid and the number $d$ of deleted elements. In the centralized setting we present a $(4.597+O(\varepsilon))$-approximation algorithm with summary size $O(\frac{k+d}{\varepsilon^2}\log \frac{k}{\varepsilon})$ that is improved to a $(3.582+O(\varepsilon))$-approximation with $O(k + \frac{d}{\varepsilon^2}\log \frac{k}{\varepsilon})$ summary size when the objective is monotone. In the streaming setting we provide a $(9.435 + O(\varepsilon))$-approximation algorithm with summary size and memory $O(k + \frac{d}{\varepsilon^2}\log \frac{k}{\varepsilon})$; the approximation factor is then improved to $(5.582+O(\varepsilon))$ in the monotone case.
    DroidCCT: Cryptographic Compliance Test via Trillion-Scale Measurement
    Rémi Audebert
    Pedro Barbosa
    Borbala Benko
    Alex (Mac) Mihai
    László Siroki
    Catherine Vlasov
    Annual Computer Security Applications Conference (ACSAC) (2025) (to appear)
    Differentiable Approximations for Distance Queries
    David M. Mount
    Proceedings of the 2025 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA)
    The widespread use of gradient-based optimization has motivated the adaptation of various classical algorithms into differentiable solvers compatible with learning pipelines. In this paper, we investigate the enhancement of traditional geometric query problems such that the result consists of both the geometric function as well as its gradient. Specifically, we study the fundamental problem of distance queries against a set of points P in R^d, which also underlies various similarity measures for learning algorithms. The main result of this paper is a multiplicative (1+epsilon)-approximation of the Euclidean distance to P which is differentiable at all points in R^d \ P with asymptotically optimal bounds on the norms of its gradient and Hessian, from a data structure with storage and query time matching state-of-the-art results for approximate nearest-neighbor searching. The approximation is realized as a regularized distance through a partition-of-unity framework, which efficiently blends multiple local approximations, over a suitably defined covering of space, into a smooth global approximation. In order to obtain the local distance approximations in a manner that facilitates blending, we develop a new approximate Voronoi diagram based on a simple point-location data structure, simplifying away both the lifting transformation and ray shooting.
    Reasoning-SQL: Reinforcement Learning with Partial Rewards for Reasoning-Enhanced Text-to-SQL
    Mohammadreza Pourreza
    Shayan Talaei
    Hailong Li
    Azalia Mirhoseini
    Amin Saberi
    Conference on Language Modeling (COLM) (2025) (to appear)
    Text-to-SQL is a challenging task involving multiple reasoning-intensive subtasks, including natural language understanding, database schema comprehension, and precise SQL query formulation. Existing approaches often rely on handcrafted reasoning paths with inductive biases that can limit their overall effectiveness. Motivated by the recent success of reasoning-enhanced models such as DeepSeek R1 and OpenAI o1, which effectively leverage reward-driven self-exploration to enhance reasoning capabilities and generalization, we propose a novel set of partial rewards tailored specifically for the Text-to-SQL task. Our reward set includes schema-linking, AI feedback, n-gram similarity, and syntax check, explicitly designed to address the reward sparsity issue prevalent in reinforcement learning (RL). Leveraging group relative policy optimization (GRPO), our approach explicitly encourages large language models (LLMs) to develop intrinsic reasoning skills necessary for accurate SQL query generation. With models of different sizes, we demonstrate that RL-only training with our proposed rewards consistently achieves higher accuracy and superior generalization compared to supervised fine-tuning (SFT). Remarkably, our RL-trained 14B-parameter model significantly outperforms larger proprietary models, e.g., o3-mini by 4% and Gemini-1.5-Pro-002 by 3% on the BIRD benchmark. These results highlight the efficacy of our proposed RL-training framework with partial rewards for enhancing both accuracy and reasoning capabilities in Text-to-SQL tasks.
    This IEEE Spectrum article reflects on advocacy for U.S. technological leadership during my Congressional visit through IEEE-USA. I led an expert group of distinguished IEEE members, and together we urged lawmakers to support critical initiatives. Key priorities included sustained funding for federal research institutions like NIST, NASA, and the NSF, reauthorizing the SBIR/STTR programs vital for small business innovation, and passing the CREATE AI Act to democratize AI resources by establishing the National AI Research Resource (NAIRR). We also emphasized strengthening the STEM talent pipeline through the CHIPS and Science Act and expanding high-skilled immigrant visas. We highlighted rapid AI advancements, such as autonomous vehicles and the surge in FDA-approved AI-based medical devices, as underscoring the need for these strategic investments and policy actions. The article conveys a sense of urgency, calling for concrete congressional action to ensure the U.S. maintains its technological edge, while also sharing my personal experiences.
    Visualizing Dynamics of Charges and Strings in (2+1)D Lattice Gauge Theories
    Tyler Cochran
    Bernhard Jobst
    Yuri Lensky
    Gaurav Gyawali
    Norhan Eassa
    Melissa Will
    Aaron Szasz
    Dmitry Abanin
    Rajeev Acharya
    Laleh Beni
    Trond Andersen
    Markus Ansmann
    Frank Arute
    Kunal Arya
    Abe Asfaw
    Juan Atalaya
    Brian Ballard
    Alexandre Bourassa
    Michael Broughton
    David Browne
    Brett Buchea
    Bob Buckley
    Tim Burger
    Nicholas Bushnell
    Anthony Cabrera
    Juan Campero
    Hung-Shen Chang
    Jimmy Chen
    Benjamin Chiaro
    Jahan Claes
    Agnetta Cleland
    Josh Cogan
    Roberto Collins
    Paul Conner
    William Courtney
    Alex Crook
    Ben Curtin
    Sayan Das
    Laura De Lorenzo
    Agustin Di Paolo
    Paul Donohoe
    Ilya Drozdov
    Andrew Dunsworth
    Alec Eickbusch
    Aviv Elbag
    Mahmoud Elzouka
    Vinicius Ferreira
    Ebrahim Forati
    Austin Fowler
    Brooks Foxen
    Suhas Ganjam
    Robert Gasca
    Élie Genois
    William Giang
    Dar Gilboa
    Raja Gosula
    Alejo Grajales Dau
    Dietrich Graumann
    Alex Greene
    Steve Habegger
    Monica Hansen
    Sean Harrington
    Paula Heu
    Oscar Higgott
    Jeremy Hilton
    Robert Huang
    Ashley Huff
    Bill Huggins
    Cody Jones
    Chaitali Joshi
    Pavol Juhas
    Hui Kang
    Amir Karamlou
    Kostyantyn Kechedzhi
    Trupti Khaire
    Bryce Kobrin
    Alexander Korotkov
    Fedor Kostritsa
    John Mark Kreikebaum
    Vlad Kurilovich
    Dave Landhuis
    Tiano Lange-Dei
    Brandon Langley
    Kim Ming Lau
    Justin Ledford
    Kenny Lee
    Loick Le Guevel
    Wing Li
    Alexander Lill
    Will Livingston
    Daniel Lundahl
    Aaron Lunt
    Sid Madhuk
    Ashley Maloney
    Salvatore Mandra
    Leigh Martin
    Orion Martin
    Cameron Maxfield
    Seneca Meeks
    Anthony Megrant
    Reza Molavi
    Sebastian Molina
    Shirin Montazeri
    Ramis Movassagh
    Charles Neill
    Michael Newman
    Murray Ich Nguyen
    Chia Ni
    Kris Ottosson
    Alex Pizzuto
    Rebecca Potter
    Orion Pritchard
    Ganesh Ramachandran
    Matt Reagor
    David Rhodes
    Gabrielle Roberts
    Kannan Sankaragomathi
    Henry Schurkus
    Mike Shearn
    Aaron Shorter
    Noah Shutty
    Vladimir Shvarts
    Vlad Sivak
    Spencer Small
    Clarke Smith
    Sofia Springer
    George Sterling
    Jordan Suchard
    Alex Sztein
    Doug Thor
    Mert Torunbalci
    Abeer Vaishnav
    Justin Vargas
    Sergey Vdovichev
    Guifre Vidal
    Steven Waltman
    Shannon Wang
    Brayden Ware
    Kristi Wong
    Cheng Xing
    Jamie Yao
    Ping Yeh
    Bicheng Ying
    Juhwan Yoo
    Grayson Young
    Yaxing Zhang
    Ningfeng Zhu
    Yu Chen
    Vadim Smelyanskiy
    Adam Gammon-Smith
    Frank Pollmann
    Michael Knap
    Nature, 642 (2025), 315–320
    Lattice gauge theories (LGTs) can be used to understand a wide range of phenomena, from elementary particle scattering in high-energy physics to effective descriptions of many-body interactions in materials. Studying dynamical properties of emergent phases can be challenging, as it requires solving many-body problems that are generally beyond perturbative limits. Here we investigate the dynamics of local excitations in a LGT using a two-dimensional lattice of superconducting qubits. We first construct a simple variational circuit that prepares low-energy states that have a large overlap with the ground state; then we create charge excitations with local gates and simulate their quantum dynamics by means of a discretized time evolution. As the electric field coupling constant is increased, our measurements show signatures of transitioning from deconfined to confined dynamics. For confined excitations, the electric field induces a tension in the string connecting them. Our method allows us to experimentally image string dynamics in a (2+1)D LGT, from which we uncover two distinct regimes inside the confining phase: for weak confinement, the string fluctuates strongly in the transverse direction, whereas for strong confinement, transverse fluctuations are effectively frozen. We also demonstrate a resonance condition at which dynamical string breaking is facilitated. Our LGT implementation on a quantum processor presents a new set of techniques for investigating emergent excitations and string dynamics.
    Collaborative Diffusion Model for Recommender System
    Gyuseok Lee
    Yaochen Zhu
    Hwanjo Yu
    Yao Zhou
    Jundong Li
    2025
    Diffusion-based recommender systems (DR) have gained increasing attention for their advanced generative and denoising capabilities. However, existing DR face two central limitations: (i) a trade-off between enhancing generative capacity via noise injection and limiting the loss of personalized information, and (ii) the underutilization of rich item-side information. To address these challenges, we present a Collaborative Diffusion model for Recommender System (CDiff4Rec). Specifically, CDiff4Rec generates pseudo-users from item features and leverages collaborative signals from both real and pseudo personalized neighbors identified through behavioral similarity, thereby effectively reconstructing nuanced user preferences. Experimental results on three public datasets show that CDiff4Rec outperforms competitors by effectively mitigating the loss of personalized information through the integration of item content and collaborative signals.