News & Events


  •  TALK    [MERL Seminar Series 2024] Samuel Clarke presents talk titled Audio for Object and Spatial Awareness
    Date & Time: Wednesday, October 30, 2024; 1:00 PM
    Speaker: Samuel Clarke, Stanford University
    MERL Host: Gordon Wichern
    Research Areas: Artificial Intelligence, Machine Learning, Robotics, Speech & Audio
    Abstract
    • Acoustic perception is invaluable to humans and robots in understanding objects and events in their environments. These sounds are dependent on properties of the source, the environment, and the receiver. Many humans possess remarkable intuition both to infer key properties of each of these three aspects from a sound and to form expectations of how these different aspects would affect the sound they hear. In order to equip robots and AI agents with similar if not stronger capabilities, our research has taken a two-fold path. First, we collect high-fidelity datasets in both controlled and uncontrolled environments which capture real sounds of objects and rooms. Second, we introduce differentiable physics-based models that can estimate acoustic properties of objects and rooms from minimal amounts of real audio data, then can predict new sounds from these objects and rooms under novel, “unseen” conditions.
  •  
  •  AWARD    University of Padua and MERL team wins the AI Olympics with RealAIGym competition at IROS24
    Date: October 17, 2024
    Awarded to: Niccolò Turcato, Alberto Dalla Libera, Giulio Giacomuzzo, Ruggero Carli, Diego Romeres
    MERL Contact: Diego Romeres
    Research Areas: Artificial Intelligence, Dynamical Systems, Machine Learning, Robotics
    Brief
    • The team formed by the control group at the University of Padua and MERL's Optimization and Robotics team ranked 1st among the 4 finalist teams at the 2nd AI Olympics with RealAIGym competition at IROS 24, which focused on control of under-actuated robots. The team consisted of Niccolò Turcato, Alberto Dalla Libera, Giulio Giacomuzzo, Ruggero Carli and Diego Romeres. The competition was organized by the German Research Center for Artificial Intelligence (DFKI), the Technical University of Darmstadt, and Chalmers University of Technology.

      The competition and award ceremony were hosted by the IEEE International Conference on Intelligent Robots and Systems (IROS) on October 17, 2024 in Abu Dhabi, UAE. Diego Romeres presented the team's method, based on a model-based reinforcement learning algorithm called MC-PILCO.
  •  
  •  TALK    [MERL Seminar Series 2024] Tom Griffiths presents talk titled Tools from cognitive science to understand the behavior of large language models
    Date & Time: Wednesday, September 18, 2024; 1:00 PM
    Speaker: Tom Griffiths, Princeton University
    Research Areas: Artificial Intelligence, Data Analytics, Machine Learning, Human-Computer Interaction
    Abstract
    • Large language models have been found to have surprising capabilities, even what have been called “sparks of artificial general intelligence.” However, understanding these models involves some significant challenges: their internal structure is extremely complicated, their training data is often opaque, and getting access to the underlying mechanisms is becoming increasingly difficult. As a consequence, researchers often have to resort to studying these systems based on their behavior. This situation is, of course, one that cognitive scientists are very familiar with — human brains are complicated systems trained on opaque data and typically difficult to study mechanistically. In this talk I will summarize some of the tools of cognitive science that are useful for understanding the behavior of large language models. Specifically, I will talk about how thinking about different levels of analysis (and Bayesian inference) can help us understand some behaviors that don’t seem particularly intelligent, how tasks like similarity judgment can be used to probe internal representations, how axiom violations can reveal interesting mechanisms, and how associations can reveal biases in systems that have been trained to be unbiased.
  •  
  •  AWARD    MERL team wins the Listener Acoustic Personalisation (LAP) 2024 Challenge
    Date: August 29, 2024
    Awarded to: Yoshiki Masuyama, Gordon Wichern, Francois G. Germain, Christopher Ick, and Jonathan Le Roux
    MERL Contacts: François Germain; Jonathan Le Roux; Gordon Wichern; Yoshiki Masuyama
    Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
    Brief
    • MERL's Speech & Audio team ranked 1st out of 7 teams in Task 2 of the 1st SONICOM Listener Acoustic Personalisation (LAP) Challenge, which focused on "Spatial upsampling for obtaining a high-spatial-resolution HRTF from a very low number of directions". The team was led by Yoshiki Masuyama, and also included Gordon Wichern, Francois Germain, MERL intern Christopher Ick, and Jonathan Le Roux.

      The LAP Challenge workshop and award ceremony were hosted by the 32nd European Signal Processing Conference (EUSIPCO 24) on August 29, 2024 in Lyon, France. Yoshiki Masuyama presented the team's method, "Retrieval-Augmented Neural Field for HRTF Upsampling and Personalization", and received the award from Prof. Michele Geronazzo (University of Padova, IT, and Imperial College London, UK), Chair of the Challenge's Organizing Committee.

      The LAP challenge aims to explore challenges in the field of personalized spatial audio, with the first edition focusing on the spatial upsampling and interpolation of head-related transfer functions (HRTFs). HRTFs with dense spatial grids are required for immersive audio experiences, but their recording is time-consuming. Although HRTF spatial upsampling has recently shown remarkable progress with approaches involving neural fields, HRTF estimation accuracy remains limited when upsampling from only a few measured directions, e.g., 3 or 5 measurements. The MERL team tackled this problem by proposing a retrieval-augmented neural field (RANF). RANF retrieves a subject whose HRTFs are close to those of the target subject at the measured directions from a library of subjects. The HRTF of the retrieved subject at the target direction is fed into the neural field in addition to the desired sound source direction. The team also developed a neural network architecture that can handle an arbitrary number of retrieved subjects, inspired by a multi-channel processing technique called transform-average-concatenate.
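      As a rough illustration of the retrieval step described above (a minimal sketch, not the team's implementation), the Python below picks the library subject whose HRTFs are closest to the target's at the measured directions, then pairs that subject's HRTF at a new direction with the direction itself as input for a neural field. The function names, the mean-squared-error distance, and the toy two-element "HRTF" vectors are all assumptions made for illustration.

```python
def hrtf_distance(a, b):
    """Mean squared difference between two equal-length HRTF feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def retrieve_subjects(target_measured, library, k=1):
    """Return the IDs of the k library subjects closest to the target.

    target_measured: {direction: feature vector} for the few measured directions.
    library: {subject_id: {direction: feature vector}} on a denser grid.
    The distance measure here is an assumption, not taken from the paper.
    """
    scored = []
    for subject_id, hrtfs in library.items():
        # Compare only at the directions actually measured for the target.
        dists = [hrtf_distance(vec, hrtfs[d]) for d, vec in target_measured.items()]
        scored.append((sum(dists) / len(dists), subject_id))
    scored.sort()
    return [subject_id for _, subject_id in scored[:k]]


def build_field_input(direction, retrieved_ids, library):
    """Pair the query direction with each retrieved subject's HRTF there."""
    return {
        "direction": direction,
        "retrieved_hrtfs": [library[s][direction] for s in retrieved_ids],
    }


# Toy data (hypothetical): directions are (azimuth, elevation) in degrees.
library = {
    "subject_a": {(0, 0): [1.0, 0.5], (90, 0): [0.8, 0.4], (45, 0): [0.9, 0.45]},
    "subject_b": {(0, 0): [0.2, 0.1], (90, 0): [0.3, 0.2], (45, 0): [0.25, 0.15]},
}
target_measured = {(0, 0): [0.95, 0.5], (90, 0): [0.75, 0.45]}

retrieved = retrieve_subjects(target_measured, library, k=1)
field_input = build_field_input((45, 0), retrieved, library)
print(retrieved, field_input)
```

      Passing a larger `k` returns several retrieved subjects, which is the situation the variable-size architecture mentioned above is meant to handle; that network is not sketched here.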
  •  
  •  NEWS    MERL researchers present 9 papers at ACC 2024
    Date: July 10, 2024 - July 12, 2024
    Where: Toronto, Canada
    MERL Contacts: Ankush Chakrabarty; Vedang M. Deshpande; Stefano Di Cairano; Christopher R. Laughman; Arvind Raghunathan; Abraham P. Vinod; Yebin Wang; Avishai Weiss
    Research Areas: Artificial Intelligence, Control, Dynamical Systems, Machine Learning, Multi-Physical Modeling, Optimization, Robotics
    Brief
    • MERL researchers presented 9 papers at the recently concluded American Control Conference (ACC) 2024 in Toronto, Canada. The papers covered a wide range of topics including data-driven spatial monitoring using heterogeneous robots, aircraft approach management near airports, computational fluid dynamics-based motion planning for drones facing winds, trajectory planning for coordinated monitoring using a team of drones and a ground carrier vehicle, ensemble Kalman smoothing-based model predictive control for motion planning for autonomous vehicles, system identification for lithium-ion batteries, physics-constrained deep Kalman filters for vapor compression systems, switched reference governors for constrained systems, and distributed road-map monitoring using onboard sensors.

      As a sponsor of the conference, MERL maintained a booth for open discussions with researchers and students, and hosted a special session to discuss highlights of MERL research and work philosophy.

      In addition, Abraham Vinod served as a panelist at the Student Networking Event at the conference. The student networking event provides an opportunity for all interested students to network with professionals working in industry, academia, and national laboratories during a structured event, and encourages their continued participation as the future leaders in the field.
  •  
  •  NEWS    Jianlin Guo delivered a keynote in IEEE ICC 2024 Workshop
    Date: June 13, 2024
    Where: IEEE International Conference on Communications (ICC)
    MERL Contacts: Jianlin Guo; Philip V. Orlik; Kieran Parsons; Pu (Perry) Wang
    Research Areas: Communications, Machine Learning, Signal Processing
    Brief
    • Jianlin Guo delivered a keynote titled "Private IoT Networks" at the IEEE International Conference on Communications (ICC) 2024 Workshop "Industrial Private 5G-and-Beyond Wireless Networks", held in Denver, Colorado from June 9-13. ICC is one of the IEEE Communications Society’s two flagship conferences.

      Abstract: With the advent of private 5G-and-Beyond communication technologies, private IoT networks have been emerging. In private IoT networks, network owners have full control over network resource management. However, to fully realize private IoT networks, the upper-layer technologies need to be developed as well. This keynote presents machine-learning-based anomaly detection in manufacturing systems, innovative multipath TCP technologies over heterogeneous wireless IoT networks, novel channel resource scheduling in private 5G networks, and efficient wireless coexistence of heterogeneous wireless systems.
  •  
  •  NEWS    MERL at the International Conference on Robotics and Automation (ICRA) 2024
    Date: May 13, 2024 - May 17, 2024
    Where: Yokohama, Japan
    MERL Contacts: Anoop Cherian; Radu Corcodel; Stefano Di Cairano; Chiori Hori; Siddarth Jain; Devesh K. Jha; Jonathan Le Roux; Diego Romeres; William S. Yerazunis
    Research Areas: Artificial Intelligence, Machine Learning, Optimization, Robotics, Speech & Audio
    Brief
    • MERL made significant contributions to both the organization and the technical program of the International Conference on Robotics and Automation (ICRA) 2024, which was held in Yokohama, Japan from May 13th to May 17th.

      MERL was a Bronze sponsor of the conference, and exhibited a live robotic demonstration, which attracted a large audience. The demonstration showcased an Autonomous Robotic Assembly technology executed on MELCO's Assista robot arm and was the collaborative effort of the Optimization and Robotics Team together with the Advanced Technology department at Mitsubishi Electric.

      MERL researchers from the Optimization and Robotics, Speech & Audio, and Control for Autonomy teams also presented 8 papers and 2 invited talks covering topics on robotic assembly, applications of LLMs to robotics, human-robot interaction, safe and robust path planning for autonomous drones, transfer learning, perception, and tactile sensing.
  •  
  •  TALK    [MERL Seminar Series 2024] Chuchu Fan presents talk titled Neural Certificates and LLMs in Large-Scale Autonomy Design
    Date & Time: Wednesday, May 29, 2024; 12:00 PM
    Speaker: Chuchu Fan, MIT
    MERL Host: Abraham P. Vinod
    Research Areas: Artificial Intelligence, Control, Machine Learning
    Abstract
    • Learning-enabled control systems have demonstrated impressive empirical performance on challenging control problems in robotics. However, this performance often arrives with the trade-off of diminished transparency and the absence of guarantees regarding the safety and stability of the learned controllers. In recent years, new techniques have emerged to provide these guarantees by learning certificates alongside control policies; these certificates provide concise, data-driven proofs that guarantee the safety and stability of the learned control system. These methods not only allow the user to verify the safety of a learned controller but also provide supervision during training, allowing safety and stability requirements to influence the training process itself. In this talk, we present two exciting updates on neural certificates. In the first work, we explore the use of graph neural networks to learn collision-avoidance certificates that can generalize to unseen and very crowded environments. The second work presents a novel reinforcement learning approach that can produce certificate functions with the policies while addressing the instability issues in the optimization process. Finally, if time permits, I will also talk about my group's recent work using LLMs and domain-specific task and motion planners to allow natural language as input for robot planning.
  •  
  •  NEWS    Toshiaki Koike-Akino to give a seminar talk at EPFL on quantum AI
    Date: May 22, 2024
    MERL Contact: Toshiaki Koike-Akino
    Research Areas: Artificial Intelligence, Machine Learning
    Brief
    • Toshiaki Koike-Akino is invited to present a seminar talk at EPFL, Switzerland. The talk, entitled "Post-Deep Learning: Emerging Quantum AI Technology", will discuss the recent trends, challenges, and applications of quantum machine learning (QML) technologies. The seminar is organized by Prof. Volkan Cevher and Prof. Giovanni De Micheli. The event invites students, researchers, scholars and professors through EPFL departments including School of Engineering, Communication Science, Life Science, Machine Learning and AI Center.
  •  
  •  NEWS    MERL Papers and Workshops at CVPR 2024
    Date: June 17, 2024 - June 21, 2024
    Where: Seattle, WA
    MERL Contacts: Petros T. Boufounos; Moitreya Chatterjee; Anoop Cherian; Michael J. Jones; Toshiaki Koike-Akino; Jonathan Le Roux; Suhas Lohit; Tim K. Marks; Pedro Miraldo; Jing Liu; Kuan-Chuan Peng; Pu (Perry) Wang; Ye Wang; Matthew Brand
    Research Areas: Artificial Intelligence, Computational Sensing, Computer Vision, Machine Learning, Speech & Audio
    Brief
    • MERL researchers are presenting 5 conference papers and 3 workshop papers, and are co-organizing two workshops at the CVPR 2024 conference, which will be held in Seattle, June 17-21. CVPR is one of the most prestigious and competitive international conferences in computer vision. Details of MERL contributions are provided below.

      CVPR Conference Papers:

      1. "TI2V-Zero: Zero-Shot Image Conditioning for Text-to-Video Diffusion Models" by H. Ni, B. Egger, S. Lohit, A. Cherian, Y. Wang, T. Koike-Akino, S. X. Huang, and T. K. Marks

      This work enables a pretrained text-to-video (T2V) diffusion model to be additionally conditioned on an input image (first video frame), yielding a text+image to video (TI2V) model. Other than using the pretrained T2V model, our method requires no ("zero") training or fine-tuning. The paper uses a "repeat-and-slide" method and diffusion resampling to synthesize videos from a given starting image and text describing the video content.

      Paper: https://www.merl.com/publications/TR2024-059
      Project page: https://merl.com/research/highlights/TI2V-Zero

      2. "Long-Tailed Anomaly Detection with Learnable Class Names" by C.-H. Ho, K.-C. Peng, and N. Vasconcelos

      This work aims to identify defects across various classes without relying on hard-coded class names. We introduce the concept of long-tailed anomaly detection, addressing challenges like class imbalance and dataset variability. Our proposed method combines reconstruction and semantic modules, learning pseudo-class names and utilizing a variational autoencoder for feature synthesis to improve performance in long-tailed datasets, outperforming existing methods in experiments.

      Paper: https://www.merl.com/publications/TR2024-040

      3. "Gear-NeRF: Free-Viewpoint Rendering and Tracking with Motion-aware Spatio-Temporal Sampling" by X. Liu, Y-W. Tai, C-T. Tang, P. Miraldo, S. Lohit, and M. Chatterjee

      This work presents a new strategy for rendering dynamic scenes from novel viewpoints. Our approach is based on stratifying the scene into regions based on the extent of motion of the region, which is automatically determined. Regions with higher motion are permitted a denser spatio-temporal sampling strategy for more faithful rendering of the scene. Additionally, to the best of our knowledge, ours is the first work to enable tracking of objects in the scene from novel views - based on the preferences of a user, provided by a click.

      Paper: https://www.merl.com/publications/TR2024-042

      4. "SIRA: Scalable Inter-frame Relation and Association for Radar Perception" by R. Yataka, P. Wang, P. T. Boufounos, and R. Takahashi

      Overcoming the limitations on radar feature extraction such as low spatial resolution, multipath reflection, and motion blurs, this paper proposes SIRA (Scalable Inter-frame Relation and Association) for scalable radar perception with two designs: 1) extended temporal relation, generalizing the existing temporal relation layer from two frames to multiple inter-frames with temporally regrouped window attention for scalability; and 2) motion consistency track with a pseudo-tracklet generated from observational data for better object association.

      Paper: https://www.merl.com/publications/TR2024-041

      5. "RILA: Reflective and Imaginative Language Agent for Zero-Shot Semantic Audio-Visual Navigation" by Z. Yang, J. Liu, P. Chen, A. Cherian, T. K. Marks, J. Le Roux, and C. Gan

      We leverage Large Language Models (LLM) for zero-shot semantic audio visual navigation. Specifically, by employing multi-modal models to process sensory data, we instruct an LLM-based planner to actively explore the environment by adaptively evaluating and dismissing inaccurate perceptual descriptions.

      Paper: https://www.merl.com/publications/TR2024-043

      CVPR Workshop Papers:

      1. "CoLa-SDF: Controllable Latent StyleSDF for Disentangled 3D Face Generation" by R. Dey, B. Egger, V. Boddeti, Y. Wang, and T. K. Marks

      This paper proposes a new method for generating 3D faces and rendering them to images by combining the controllability of nonlinear 3DMMs with the high fidelity of implicit 3D GANs. Inspired by StyleSDF, our model uses a similar architecture but enforces the latent space to match the interpretable and physical parameters of the nonlinear 3D morphable model MOST-GAN.

      Paper: https://www.merl.com/publications/TR2024-045

      2. “Tracklet-based Explainable Video Anomaly Localization” by A. Singh, M. J. Jones, and E. Learned-Miller

      This paper describes a new method for localizing anomalous activity in video of a scene given sample videos of normal activity from the same scene. The method is based on detecting and tracking objects in the scene and estimating high-level attributes of the objects such as their location, size, short-term trajectory and object class. These high-level attributes can then be used to detect unusual activity as well as to provide a human-understandable explanation for what is unusual about the activity.

      Paper: https://www.merl.com/publications/TR2024-057

      3. "SuperLoRA: Parameter-Efficient Unified Adaptation for Large Vision Models" by X. Chen, J. Liu, Y. Wang, P. Wang, M. Brand, G. Wang, and T. Koike-Akino

      This paper proposes a generalized framework called SuperLoRA that unifies and extends different variants of low-rank adaptation (LoRA). Introducing new options with grouping, folding, shuffling, projection, and tensor decomposition, SuperLoRA offers high flexibility and demonstrates superior performance, with up to a 10-fold gain in parameter efficiency for transfer learning tasks.

      Paper: https://www.merl.com/publications/TR2024-062

      MERL co-organized workshops:

      1. "Multimodal Algorithmic Reasoning Workshop" by A. Cherian, K-C. Peng, S. Lohit, M. Chatterjee, H. Zhou, K. Smith, T. K. Marks, J. Mathissen, and J. Tenenbaum

      Workshop link: https://marworkshop.github.io/cvpr24/index.html

      2. "The 5th Workshop on Fair, Data-Efficient, and Trusted Computer Vision" by K-C. Peng, et al.

      Workshop link: https://fadetrcv.github.io/2024/
  •  
  •  NEWS    Diego Romeres gave an invited talk at the University of Padua's seminar series on "AI in Action"
    Date: April 9, 2024
    MERL Contact: Diego Romeres
    Research Areas: Artificial Intelligence, Dynamical Systems, Machine Learning, Optimization, Robotics
    Brief
    • Diego Romeres, Principal Research Scientist and Team Leader in the Optimization and Robotics Team, was invited to speak as a guest lecturer in the seminar series on "AI in Action" in the Department of Management and Engineering, at the University of Padua.

      The talk, entitled "Machine Learning for Robotics and Automation", described MERL's recent research on machine learning and model-based reinforcement learning applied to robotics and automation.
  •  
  •  NEWS    Saviz Mowlavi gave an invited talk at North Carolina State University
    Date: April 12, 2024
    MERL Contact: Saviz Mowlavi
    Research Areas: Control, Dynamical Systems, Machine Learning, Optimization
    Brief
    • Saviz Mowlavi was invited to present remotely at the Computational and Applied Mathematics seminar series in the Department of Mathematics at North Carolina State University.

      The talk, entitled "Model-based and data-driven prediction and control of spatio-temporal systems", described the use of temporal smoothness to regularize the training of fast surrogate models for PDEs, user-friendly methods for PDE-constrained optimization, and efficient strategies for learning feedback controllers for PDEs.
  •  
  •  TALK    [MERL Seminar Series 2024] Na Li presents talk titled Close the Loop: From Data to Actions in Complex Systems
    Date & Time: Wednesday, April 10, 2024; 12:00 PM
    Speaker: Na Li, Harvard University
    MERL Host: Yebin Wang
    Research Areas: Control, Dynamical Systems, Machine Learning
    Abstract
    • The explosive growth of machine learning and data-driven methodologies has revolutionized numerous fields. Yet, translating these successes to the domain of dynamical, physical systems remains a significant challenge, hindered by the complex and often unpredictable nature of such environments. Closing the loop from data to actions in these systems faces many difficulties, stemming from the need for sample efficiency and computational feasibility amidst intricate dynamics, along with many other requirements such as verifiability, robustness, and safety. In this talk, we bridge this gap by introducing innovative approaches that harness representation-based methods, domain knowledge, and the physical structures of systems. We present a comprehensive framework that integrates these components to develop reinforcement learning and control strategies that are not only tailored for the complexities of physical systems but also achieve efficiency, safety, and robustness with provable performance.
  •  
  •  NEWS    Devesh Jha appointed as an Area Chair for NeurIPS 2024
    Date: December 9, 2024 - December 15, 2024
    Where: NeurIPS 2024
    MERL Contact: Devesh K. Jha
    Research Areas: Artificial Intelligence, Machine Learning
    Brief
    • Devesh Jha, a Principal Research Scientist in the Optimization & Intelligent Robotics team, has been appointed as an area chair for the Conference on Neural Information Processing Systems (NeurIPS) 2024. NeurIPS is the premier Machine Learning (ML) and Artificial Intelligence (AI) conference that includes invited talks, demonstrations, symposia, and oral and poster presentations of refereed papers.
  •  
  •  NEWS    Ankush Chakrabarty gave a lecture at UT-Austin's Seminar Series on Occupant-Centric Grid-Interactive Buildings
    Date: March 20, 2024
    Where: Austin, TX
    MERL Contact: Ankush Chakrabarty
    Research Areas: Artificial Intelligence, Control, Data Analytics, Dynamical Systems, Machine Learning, Multi-Physical Modeling, Optimization
    Brief
    • Ankush Chakrabarty, Principal Research Scientist in the Multiphysical Systems Team, was invited to speak as a guest lecturer in the seminar series on "Occupant-Centric Grid Interactive Buildings" in the Department of Civil, Architectural and Environmental Engineering (CAEE) at the University of Texas at Austin.

      The talk, entitled "Deep Generative Networks and Fine-Tuning for Net-Zero Energy Buildings", described lessons learned from MERL's recent research on generative models for building simulation and control, along with meta-learning for on-the-fly fine-tuning to adapt and optimize energy expenditure.
  •  
  •  TALK    [MERL Seminar Series 2024] Sanmi Koyejo presents talk titled Are Emergent Abilities of Large Language Models a Mirage?
    Date & Time: Wednesday, March 20, 2024; 1:00 PM
    Speaker: Sanmi Koyejo, Stanford University
    MERL Host: Jing Liu
    Research Areas: Artificial Intelligence, Machine Learning
    Abstract
    • Recent work claims that large language models display emergent abilities, abilities not present in smaller-scale models that are present in larger-scale models. What makes emergent abilities intriguing is two-fold: their sharpness, transitioning seemingly instantaneously from not present to present, and their unpredictability, appearing at seemingly unforeseeable model scales. Here, we present an alternative explanation for emergent abilities: that for a particular task and model family, when analyzing fixed model outputs, emergent abilities appear due to the researcher's choice of metric rather than due to fundamental changes in model behavior with scale. Specifically, nonlinear or discontinuous metrics produce apparent emergent abilities, whereas linear or continuous metrics produce smooth, continuous predictable changes in model performance. We present our alternative explanation in a simple mathematical model. Via the presented analyses, we provide evidence that alleged emergent abilities evaporate with different metrics or with better statistics, and may not be a fundamental property of scaling AI models.
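      The metric effect described in the abstract can be seen in a toy calculation (an illustrative sketch, not the speaker's mathematical model). Assume, hypothetically, that a model gets each token right independently with probability `p`, and that `p` rises smoothly with scale. A linear metric such as per-token accuracy then changes smoothly, while a nonlinear all-or-nothing metric such as exact match over a 20-token answer equals `p ** 20` and stays near zero until `p` is close to 1. The independence assumption, the answer length, and the accuracy values below are all made up for illustration.

```python
def exact_match(per_token_accuracy, length):
    """Probability that all `length` tokens are correct, assuming independence."""
    return per_token_accuracy ** length


# Hypothetical per-token accuracies for a family of increasingly large models.
per_token_accuracies = [0.50, 0.60, 0.70, 0.80, 0.90, 0.95, 0.99]
ANSWER_LENGTH = 20  # assumed number of tokens that must all be correct

results = [(p, exact_match(p, ANSWER_LENGTH)) for p in per_token_accuracies]

print("per-token accuracy | exact match")
for p, em in results:
    # The left column rises steadily; the right column jumps only at the end.
    print(f"{p:18.2f} | {em:.6f}")
```

      The underlying per-token behavior is the same in both columns; only the choice of metric makes the improvement look abrupt.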
  •  
  •  EVENT    MERL Contributes to ICASSP 2024
    Date: Sunday, April 14, 2024 - Friday, April 19, 2024
    Location: Seoul, South Korea
    MERL Contacts: Petros T. Boufounos; François Germain; Chiori Hori; Sameer Khurana; Toshiaki Koike-Akino; Jonathan Le Roux; Hassan Mansour; Kieran Parsons; Joshua Rapp; Anthony Vetro; Pu (Perry) Wang; Gordon Wichern
    Research Areas: Artificial Intelligence, Computational Sensing, Machine Learning, Robotics, Signal Processing, Speech & Audio
    Brief
    • MERL has made numerous contributions to both the organization and technical program of ICASSP 2024, which is being held in Seoul, Korea from April 14-19, 2024.

      Sponsorship and Awards

      MERL is proud to be a Bronze Patron of the conference and will participate in the student job fair on Thursday, April 18. Please join this session to learn more about employment opportunities at MERL, including openings for research scientists, post-docs, and interns.

      MERL is pleased to be the sponsor of two IEEE Awards that will be presented at the conference. We congratulate Prof. Stéphane G. Mallat, the recipient of the 2024 IEEE Fourier Award for Signal Processing, and Prof. Keiichi Tokuda, the recipient of the 2024 IEEE James L. Flanagan Speech and Audio Processing Award.

      Jonathan Le Roux, MERL Speech and Audio Senior Team Leader, will also be recognized during the Awards Ceremony for his recent elevation to IEEE Fellow.

      Technical Program

      MERL will present 13 papers in the main conference on a wide range of topics including automated audio captioning, speech separation, audio generative models, speech and sound synthesis, spatial audio reproduction, multimodal indoor monitoring, radar imaging, depth estimation, physics-informed machine learning, and integrated sensing and communications (ISAC). Three workshop papers have also been accepted for presentation on audio-visual speaker diarization, music source separation, and music generative models.

      Perry Wang is the co-organizer of the Workshop on Signal Processing and Machine Learning Advances in Automotive Radars (SPLAR), held on Sunday, April 14. It features keynote talks from leaders in both academia and industry, peer-reviewed workshop papers, and lightning talks from ICASSP regular tracks on signal processing and machine learning for automotive radar and, more generally, radar perception.

      Gordon Wichern will present an invited keynote talk on analyzing and interpreting audio deep learning models at the Workshop on Explainable Machine Learning for Speech and Audio (XAI-SA), held on Monday, April 15. He will also appear in a panel discussion on interpretable audio AI at the workshop.

      Perry Wang also co-organizes a two-part special session on Next-Generation Wi-Fi Sensing (SS-L9 and SS-L13) which will be held on Thursday afternoon, April 18. The special session includes papers on PHY-layer oriented signal processing and data-driven deep learning advances, and supports upcoming 802.11bf WLAN Sensing Standardization activities.

      Petros Boufounos is participating as a mentor in ICASSP’s Micro-Mentoring Experience Program (MiME).

      About ICASSP

      ICASSP is the flagship conference of the IEEE Signal Processing Society, and the world's largest and most comprehensive technical conference focused on the research advances and latest technological development in signal and information processing. The event attracts more than 3000 participants.
  •  
  •  TALK    [MERL Seminar Series 2024] Stefanos Nikolaidis presents talk titled Enhancing the Efficiency and Robustness of Human-Robot Interactions
    Date & Time: Friday, March 8, 2024; 1:00 PM
    Speaker: Stefanos Nikolaidis, University of Southern California
    MERL Host: Siddarth Jain
    Research Areas: Machine Learning, Robotics, Human-Computer Interaction
    Abstract
    • While robots have been successfully deployed on factory floors and in warehouses, there has been limited progress in having them perform physical tasks with people at home and in the workplace. I aim to bridge the gap between their current performance in human environments and what robots are capable of doing, by making human-robot interactions efficient and robust.

      In the first part of my talk, I discuss enhancing the efficiency of human-robot interactions by enabling robot manipulators to infer the preference of a human teammate and proactively assist them in a collaborative task. I show how we can leverage similarities between different users and tasks to learn compact representations of user preferences and use these representations as priors for efficient inference.

      In the second part, I talk about enhancing the robustness of human-robot interactions by algorithmically generating diverse and realistic scenarios in simulation that reveal system failures. I propose formulating the problem of algorithmic scenario generation as a quality diversity problem and show how standard quality diversity algorithms can discover surprising and unexpected failure cases. I then discuss the development of a new class of quality diversity algorithms that significantly improve the search of the scenario space and the integration of these algorithms with generative models, which enables the generation of complex and realistic scenarios.

      Finally, I conclude the talk with applications in mining operations, collaborative manufacturing and assistive care.
  •  
  •  AWARD    Jonathan Le Roux elevated to IEEE Fellow
    Date: January 1, 2024
    Awarded to: Jonathan Le Roux
    MERL Contact: Jonathan Le Roux
    Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
    Brief
    • MERL Distinguished Scientist and Speech & Audio Senior Team Leader Jonathan Le Roux has been elevated to IEEE Fellow, effective January 2024, "for contributions to multi-source speech and audio processing."

      Mitsubishi Electric celebrated Dr. Le Roux's elevation and that of another researcher from the company, Dr. Shumpei Kameyama, with a worldwide news release on February 15.

      Dr. Jonathan Le Roux has made fundamental contributions to the field of multi-speaker speech processing, especially to the areas of speech separation and multi-speaker end-to-end automatic speech recognition (ASR). His contributions constituted a major advance in realizing a practically usable solution to the cocktail party problem, enabling machines to replicate humans’ ability to concentrate on a specific sound source, such as a certain speaker within a complex acoustic scene—a long-standing challenge in the speech signal processing community. Additionally, he has made key contributions to the measures used for training and evaluating audio source separation methods, developing several new objective functions to improve the training of deep neural networks for speech enhancement, and analyzing the impact of metrics used to evaluate the signal reconstruction quality. Dr. Le Roux’s technical contributions have been crucial in promoting the widespread adoption of multi-speaker separation and end-to-end ASR technologies across various applications, including smart speakers, teleconferencing systems, hearables, and mobile devices.

      IEEE Fellow is the highest grade of membership of the IEEE. It honors members with an outstanding record of technical achievements, contributing importantly to the advancement or application of engineering, science and technology, and bringing significant value to society. Each year, following a rigorous evaluation procedure, the IEEE Fellow Committee recommends a select group of recipients for elevation to IEEE Fellow. Less than 0.1% of voting members are selected annually for this member grade elevation.
  •  
  •  TALK    [MERL Seminar Series 2024] Melanie Mitchell presents talk titled "The Debate Over 'Understanding' in AI's Large Language Models"
    Date & Time: Tuesday, February 13, 2024; 1:00 PM
    Speaker: Melanie Mitchell, Santa Fe Institute
    MERL Host: Suhas Lohit
    Research Areas: Artificial Intelligence, Computer Vision, Machine Learning, Human-Computer Interaction
    Abstract
    • I will survey a current, heated debate in the AI research community on whether large pre-trained language models can be said to "understand" language -- and the physical and social situations language encodes -- in any important sense. I will describe arguments that have been made for and against such understanding, and, more generally, will discuss what methods can be used to fairly evaluate understanding and intelligence in AI systems. I will conclude with key questions for the broader sciences of intelligence that have arisen in light of these discussions.
  •  
  •  TALK    [MERL Seminar Series 2024] Greta Tuckute presents talk titled Computational models of human auditory and language processing
    Date & Time: Wednesday, January 31, 2024; 12:00 PM
    Speaker: Greta Tuckute, MIT
    MERL Host: Sameer Khurana
    Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
    Abstract
    • Advances in machine learning have led to powerful models for audio and language, proficient in tasks like speech recognition and fluent language generation. Beyond their immense utility in engineering applications, these models offer valuable tools for cognitive science and neuroscience. In this talk, I will demonstrate how these artificial neural network models can be used to understand how the human brain processes language. The first part of the talk will cover how audio neural networks serve as computational accounts for brain activity in the auditory cortex. The second part will focus on the use of large language models, such as those in the GPT family, to non-invasively control brain activity in the human language system.
  •  
  •  AWARD    Honorable Mention Award at NeurIPS 23 Instruction Workshop
    Date: December 15, 2023
    Awarded to: Lingfeng Sun, Devesh K. Jha, Chiori Hori, Siddarth Jain, Radu Corcodel, Xinghao Zhu, Masayoshi Tomizuka and Diego Romeres
    MERL Contacts: Radu Corcodel; Chiori Hori; Siddarth Jain; Devesh K. Jha; Diego Romeres
    Research Areas: Artificial Intelligence, Machine Learning, Robotics
    Brief
    • MERL researchers received an "Honorable Mention award" at the Workshop on Instruction Tuning and Instruction Following at the NeurIPS 2023 conference in New Orleans. The workshop was on the topic of instruction tuning and instruction following for Large Language Models (LLMs). MERL researchers presented their work on interactive planning using LLMs for partially observable robotic tasks during the oral presentation session at the workshop.
  •  
  •  AWARD    MERL team wins the Audio-Visual Speech Enhancement (AVSE) 2023 Challenge
    Date: December 16, 2023
    Awarded to: Zexu Pan, Gordon Wichern, Yoshiki Masuyama, François Germain, Sameer Khurana, Chiori Hori, and Jonathan Le Roux
    MERL Contacts: François Germain; Chiori Hori; Sameer Khurana; Jonathan Le Roux; Gordon Wichern; Yoshiki Masuyama
    Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
    Brief
    • MERL's Speech & Audio team ranked 1st out of 12 teams in the 2nd COG-MHEAR Audio-Visual Speech Enhancement Challenge (AVSE). The team was led by Zexu Pan, and also included Gordon Wichern, Yoshiki Masuyama, François Germain, Sameer Khurana, Chiori Hori, and Jonathan Le Roux.

      The AVSE challenge aims to design better speech enhancement systems by harnessing the visual aspects of speech (such as lip movements and gestures) in a manner similar to the brain’s multi-modal integration strategies. MERL’s system was a scenario-aware audio-visual TF-GridNet that incorporates the face recording of a target speaker as a conditioning factor and also recognizes whether the predominant interference signal is speech or background noise. In addition to outperforming all competing systems in terms of objective metrics by a wide margin, MERL’s model achieved the best overall word intelligibility score in a listening test: 84.54%, compared to 57.56% for the baseline and 80.41% for the next best team. The Fisher’s least significant difference (LSD) was 2.14%, indicating that our model offered statistically significant speech intelligibility improvements compared to all other systems.
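
      As a rough check of the significance statement, the gaps between the reported intelligibility scores can be compared against the reported LSD. The snippet below is only an illustrative sketch: the variable and function names are assumptions made here, and it uses nothing beyond the four figures quoted in this brief, not the challenge's actual evaluation data or procedure.

```python
# Word intelligibility scores (%) as quoted in the brief.
scores = {"MERL": 84.54, "next best team": 80.41, "baseline": 57.56}
LSD = 2.14  # Fisher's least significant difference (%), as quoted


def exceeds_lsd(a: float, b: float, lsd: float = LSD) -> bool:
    """Return True when two mean scores differ by more than the LSD."""
    return abs(a - b) > lsd


# Gap between MERL's score and each other quoted score, in percentage points.
gaps = {
    name: round(scores["MERL"] - score, 2)
    for name, score in scores.items()
    if name != "MERL"
}

for name, gap in gaps.items():
    verdict = exceeds_lsd(scores["MERL"], scores[name])
    print(f"MERL vs {name}: gap {gap:.2f} points, exceeds LSD: {verdict}")
```

      Both gaps (4.13 points over the next best team and 26.98 points over the baseline) are larger than the 2.14% LSD, which is consistent with the statement above.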
  •  
  •  TALK    [MERL Seminar Series 2023] Dr. Kristina Monakhova presents talk titled Robust and Physics-informed machine learning for low light imaging
    Date & Time: Tuesday, November 28, 2023; 12:00 PM
    Speaker: Kristina Monakhova, MIT and Cornell
    MERL Host: Joshua Rapp
    Research Areas: Computational Sensing, Computer Vision, Machine Learning, Signal Processing
    Abstract
    • Imaging in low light settings is extremely challenging due to low photon counts, both in photography and in microscopy. In photography, imaging under low light, high gain settings often results in highly structured, non-Gaussian sensor noise that’s hard to characterize or denoise. In this talk, we address this by developing a GAN-tuned physics-based noise model to more accurately represent camera noise at the lowest light and highest gain settings. Using this noise model, we train a video denoiser using synthetic data and demonstrate photorealistic videography at starlight (submillilux levels of illumination) for the first time.

      For multiphoton microscopy, which is a form of scanning microscopy, there’s a trade-off between field of view, phototoxicity, acquisition time, and image quality, often resulting in noisy measurements. While deep learning-based methods have shown compelling denoising performance, can we trust these methods enough for critical scientific and medical applications? In the second part of this talk, I’ll introduce a learned, distribution-free uncertainty quantification technique that can both denoise and predict pixel-wise uncertainty to gauge how much we can trust our denoiser’s performance. Furthermore, we propose to leverage this learned, pixel-wise uncertainty to drive an adaptive acquisition technique that rescans only the most uncertain regions of a sample. With our sample and algorithm-informed adaptive acquisition, we demonstrate a 120X improvement in total scanning time and total light dose for multiphoton microscopy, while successfully recovering fine structures within the sample.
  •  
  •  NEWS    Ankush Chakrabarty served as Co-Chair of ACM BALANCES 2023
    Date: November 14, 2023
    Where: Istanbul, Turkey
    MERL Contact: Ankush Chakrabarty
    Research Areas: Control, Data Analytics, Machine Learning, Multi-Physical Modeling, Optimization
    Brief
    • Ankush Chakrabarty, Principal Research Scientist in the Multiphysical Systems team at MERL, served as Co-Chair at the 3rd ACM International Workshop on Big Data and Machine Learning for Smart Buildings and Cities (BALANCES'23). The workshop spotlighted two IEA EBC Annexes: Annex 81 (Data-Driven Smart Buildings) and Annex 82 (Energy Flexible Buildings Towards Resilient Low Carbon Energy Systems).
  •