Machine Learning
Data-driven approaches to design intelligent algorithms.
MERL has a long history of research in machine learning, including the development of various boosting algorithms and contributions to the theory and practice of highly scalable collaborative filtering. Our recent work has focused on deep learning and reinforcement learning, with applications in a wide range of domains, including automotive, robotics, factory automation, transportation, and building and home systems.
-
Researchers
Toshiaki
Koike-Akino
Ye
Wang
Jonathan
Le Roux
Ankush
Chakrabarty
Anoop
Cherian
Gordon
Wichern
Tim K.
Marks
Philip V.
Orlik
Michael J.
Jones
Stefano
Di Cairano
Kieran
Parsons
Daniel N.
Nikovski
Christopher R.
Laughman
Devesh K.
Jha
Pu
(Perry)
Wang
Diego
Romeres
Chiori
Hori
Bingnan
Wang
Yebin
Wang
Suhas
Lohit
Hassan
Mansour
Matthew
Brand
Petros T.
Boufounos
Arvind
Raghunathan
Moitreya
Chatterjee
Abraham P.
Vinod
Jing
Liu
Jianlin
Guo
Siddarth
Jain
Kuan-Chuan
Peng
Scott A.
Bortoff
Vedang M.
Deshpande
Hongtao
Qiao
William S.
Yerazunis
Radu
Corcodel
François
Germain
Chungwei
Lin
Pedro
Miraldo
Saviz
Mowlavi
Dehong
Liu
Hongbo
Sun
Wataru
Tsujita
Sameer
Khurana
James
Queeney
Ryo
Aihara
Yanting
Ma
Joshua
Rapp
Anthony
Vetro
Jinyun
Zhang
Jose
Amaya
Purnanand
Elango
Abraham
Goldsmith
Alexander
Schperberg
Avishai
Weiss
Janek
Ebbers
-
Awards
-
AWARD University of Padua and MERL team wins the AI Olympics with RealAIGym competition at IROS24 Date: October 17, 2024
Awarded to: Niccolò Turcato, Alberto Dalla Libera, Giulio Giacomuzzo, Ruggero Carli, Diego Romeres
MERL Contact: Diego Romeres
Research Areas: Artificial Intelligence, Dynamical Systems, Machine Learning, Robotics
Brief: The team, made up of the control group at the University of Padua and MERL's Optimization and Robotic team, ranked 1st among the 4 finalist teams that reached the 2nd AI Olympics with RealAIGym competition at IROS 24, which focused on control of under-actuated robots. The team consisted of Niccolò Turcato, Alberto Dalla Libera, Giulio Giacomuzzo, Ruggero Carli and Diego Romeres. The competition was organized by the German Research Center for Artificial Intelligence (DFKI), Technical University of Darmstadt and Chalmers University of Technology.
The competition and award ceremony were hosted by the IEEE International Conference on Intelligent Robots and Systems (IROS) on October 17, 2024, in Abu Dhabi, UAE. Diego Romeres presented the team's method, based on a model-based reinforcement learning algorithm called MC-PILCO.
-
AWARD MERL team wins the Listener Acoustic Personalisation (LAP) 2024 Challenge Date: August 29, 2024
Awarded to: Yoshiki Masuyama, Gordon Wichern, Francois G. Germain, Christopher Ick, and Jonathan Le Roux
MERL Contacts: François Germain; Jonathan Le Roux; Gordon Wichern; Yoshiki Masuyama
Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
Brief: MERL's Speech & Audio team ranked 1st out of 7 teams in Task 2 of the 1st SONICOM Listener Acoustic Personalisation (LAP) Challenge, which focused on "Spatial upsampling for obtaining a high-spatial-resolution HRTF from a very low number of directions". The team was led by Yoshiki Masuyama, and also included Gordon Wichern, Francois Germain, MERL intern Christopher Ick, and Jonathan Le Roux.
The LAP Challenge workshop and award ceremony was hosted by the 32nd European Signal Processing Conference (EUSIPCO 24) on August 29, 2024 in Lyon, France. Yoshiki Masuyama presented the team's method, "Retrieval-Augmented Neural Field for HRTF Upsampling and Personalization", and received the award from Prof. Michele Geronazzo (University of Padova, IT, and Imperial College London, UK), Chair of the Challenge's Organizing Committee.
The LAP challenge aims to explore challenges in the field of personalized spatial audio, with the first edition focusing on the spatial upsampling and interpolation of head-related transfer functions (HRTFs). HRTFs with dense spatial grids are required for immersive audio experiences, but their recording is time-consuming. Although HRTF spatial upsampling has recently shown remarkable progress with approaches involving neural fields, HRTF estimation accuracy remains limited when upsampling from only a few measured directions, e.g., 3 or 5 measurements. The MERL team tackled this problem by proposing a retrieval-augmented neural field (RANF). RANF retrieves a subject whose HRTFs are close to those of the target subject at the measured directions from a library of subjects. The HRTF of the retrieved subject at the target direction is fed into the neural field in addition to the desired sound source direction. The team also developed a neural network architecture that can handle an arbitrary number of retrieved subjects, inspired by a multi-channel processing technique called transform-average-concatenate.
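The retrieval idea described above can be illustrated with a small sketch. This is not the MERL team's implementation: the data layout (a dictionary from direction to per-frequency magnitudes in dB), the root-mean-square distance, and the function names `spectral_distance`, `retrieve_subjects`, and `average_and_concatenate` are all assumptions chosen for illustration, and the learned transforms of an actual transform-average-concatenate network are omitted.

```python
import math


def spectral_distance(hrtf_a, hrtf_b):
    """RMS difference (dB) between two HRTF magnitude sets.

    Each argument maps a direction, e.g. (azimuth, elevation), to a list of
    magnitudes in dB, one per frequency bin. Only shared directions are used.
    The RMS metric is an assumption of this sketch.
    """
    shared = sorted(set(hrtf_a) & set(hrtf_b))
    if not shared:
        raise ValueError("no shared directions to compare")
    sq_sum, count = 0.0, 0
    for direction in shared:
        for a, b in zip(hrtf_a[direction], hrtf_b[direction]):
            sq_sum += (a - b) ** 2
            count += 1
    return math.sqrt(sq_sum / count)


def retrieve_subjects(target_measured, library, k=1):
    """Return ids of the k library subjects closest to the target,
    comparing only at the target's measured directions."""
    scored = []
    for subject_id, hrtfs in library.items():
        measured_only = {d: hrtfs[d] for d in target_measured if d in hrtfs}
        distance = spectral_distance(target_measured, measured_only)
        scored.append((distance, subject_id))
    scored.sort()
    return [subject_id for _, subject_id in scored[:k]]


def average_and_concatenate(features):
    """Permutation-invariant mixing across retrieved subjects: append the
    across-subject mean to every subject's feature vector. This mimics only
    the average and concatenate steps; learned transforms are left out."""
    n = len(features)
    mean = [sum(column) / n for column in zip(*features)]
    return [list(f) + mean for f in features]
```

In such a sketch, the HRTF of each retrieved subject at the unmeasured target direction (for example `library[subject_id][direction]`) would then be passed to the neural field together with the direction itself. Because the mixing step averages over subjects, it accepts any number of retrieved subjects and does not depend on their order.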
-
AWARD Jonathan Le Roux elevated to IEEE Fellow Date: January 1, 2024
Awarded to: Jonathan Le Roux
MERL Contact: Jonathan Le Roux
Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
Brief: MERL Distinguished Scientist and Speech & Audio Senior Team Leader Jonathan Le Roux has been elevated to IEEE Fellow, effective January 2024, "for contributions to multi-source speech and audio processing."
Mitsubishi Electric celebrated Dr. Le Roux's elevation and that of another researcher from the company, Dr. Shumpei Kameyama, with a worldwide news release on February 15.
Dr. Jonathan Le Roux has made fundamental contributions to the field of multi-speaker speech processing, especially to the areas of speech separation and multi-speaker end-to-end automatic speech recognition (ASR). His contributions constituted a major advance in realizing a practically usable solution to the cocktail party problem, enabling machines to replicate humans’ ability to concentrate on a specific sound source, such as a certain speaker within a complex acoustic scene—a long-standing challenge in the speech signal processing community. Additionally, he has made key contributions to the measures used for training and evaluating audio source separation methods, developing several new objective functions to improve the training of deep neural networks for speech enhancement, and analyzing the impact of metrics used to evaluate the signal reconstruction quality. Dr. Le Roux’s technical contributions have been crucial in promoting the widespread adoption of multi-speaker separation and end-to-end ASR technologies across various applications, including smart speakers, teleconferencing systems, hearables, and mobile devices.
IEEE Fellow is the highest grade of membership of the IEEE. It honors members with an outstanding record of technical achievements, contributing importantly to the advancement or application of engineering, science and technology, and bringing significant value to society. Each year, following a rigorous evaluation procedure, the IEEE Fellow Committee recommends a select group of recipients for elevation to IEEE Fellow. Less than 0.1% of voting members are selected annually for this member grade elevation.
-
News & Events
-
TALK [MERL Seminar Series 2024] Samuel Clarke presents talk titled Audio for Object and Spatial Awareness Date & Time: Wednesday, October 30, 2024; 1:00 PM
Speaker: Samuel Clarke, Stanford University
MERL Host: Gordon Wichern
Research Areas: Artificial Intelligence, Machine Learning, Robotics, Speech & Audio
Abstract: Acoustic perception is invaluable to humans and robots in understanding objects and events in their environments. These sounds are dependent on properties of the source, the environment, and the receiver. Many humans possess remarkable intuition both to infer key properties of each of these three aspects from a sound and to form expectations of how these different aspects would affect the sound they hear. In order to equip robots and AI agents with similar if not stronger capabilities, our research has taken a two-fold path. First, we collect high-fidelity datasets in both controlled and uncontrolled environments which capture real sounds of objects and rooms. Second, we introduce differentiable physics-based models that can estimate acoustic properties of objects and rooms from minimal amounts of real audio data, then can predict new sounds from these objects and rooms under novel, “unseen” conditions.
-
TALK [MERL Seminar Series 2024] Tom Griffiths presents talk titled Tools from cognitive science to understand the behavior of large language models Date & Time: Wednesday, September 18, 2024; 1:00 PM
Speaker: Tom Griffiths, Princeton University
Research Areas: Artificial Intelligence, Data Analytics, Machine Learning, Human-Computer Interaction
Abstract: Large language models have been found to have surprising capabilities, even what have been called “sparks of artificial general intelligence.” However, understanding these models involves some significant challenges: their internal structure is extremely complicated, their training data is often opaque, and getting access to the underlying mechanisms is becoming increasingly difficult. As a consequence, researchers often have to resort to studying these systems based on their behavior. This situation is, of course, one that cognitive scientists are very familiar with: human brains are complicated systems trained on opaque data and typically difficult to study mechanistically. In this talk I will summarize some of the tools of cognitive science that are useful for understanding the behavior of large language models. Specifically, I will talk about how thinking about different levels of analysis (and Bayesian inference) can help us understand some behaviors that don’t seem particularly intelligent, how tasks like similarity judgment can be used to probe internal representations, how axiom violations can reveal interesting mechanisms, and how associations can reveal biases in systems that have been trained to be unbiased.
-
Research Highlights
-
PS-NeuS: A Probability-guided Sampler for Neural Implicit Surface Rendering -
Quantum AI Technology -
TI2V-Zero: Zero-Shot Image Conditioning for Text-to-Video Diffusion Models -
Gear-NeRF: Free-Viewpoint Rendering and Tracking with Motion-Aware Spatio-Temporal Sampling -
Steered Diffusion -
Sustainable AI -
Edge-Assisted Internet of Vehicles for Smart Mobility -
Robust Machine Learning -
mmWave Beam-SNR Fingerprinting (mmBSF) -
Video Anomaly Detection -
Biosignal Processing for Human-Machine Interaction -
MERL Shopping Dataset
-
-
Internships
-
CI0083: Internship - Human-Machine Interface with Biosignal Processing
MERL is excited to announce an internship opening for a talented researcher to join our team. We are looking for an individual to contribute to cutting-edge research in human-machine interfaces (HMI) using multi-modal bio-sensors. This is an exciting opportunity to make a real impact in the field of human-machine interaction and biosignal processing, with the aim of publishing at leading research venues.
Ideal Candidate:
- Experienced PhD student or post-graduate researcher
- Strong background in brain-machine interface (BMI)
- Proficient in deep learning and mixed reality (XR)
- Skilled in robot manipulation, bionics, and bio sensing
- Digital modeling of human and environment
- Hands-on experience in Unity3d, ROS, OpenBCI, and XR headsets
If you are passionate about advancing technology in these areas, we encourage you to apply and be part of our innovative research team!
-
CA0114: Internship - Trajectory planning for drones with controllable sensors
MERL is seeking an outstanding intern to collaborate with the Control for Autonomy team in the development of trajectory generation for mobile robots, e.g., drones, equipped with controllable sensors, for information acquisition tasks. The project objective is to optimize drone trajectories and the control of onboard sensors (e.g., field of view, pointing angle) to maximize the amount of information acquired about specified monitored targets while reducing the mission duration. The ideal candidate is expected to be working towards a PhD with a strong emphasis on trajectory generation and control, optimization-based control and planning algorithms, and constrained control. Strong programming skills in at least one of Matlab, Python, Julia, or C/C++ are required. Experience with experimental drone platforms such as Crazyflie and related software frameworks such as ROS is desired. The expected start date is in late spring or early summer 2025, for a duration of 3-6 months.
Required Specific Experience
- Currently enrolled in a PhD program in Aerospace, Electrical, Mechanical Engineering, Computer Science, Applied Math or a related field
- 2+ years of research in at least some of: optimization-based trajectory generation, convex and non-convex optimization, sensor modeling, information-aware planning
- Strong programming skills in at least one among Matlab, Python, Julia, or C/C++
- Validation of drone planning and control in simulations. Experience with drone experiments is a plus.
-
CI0054: Internship - Anomaly Detection for Operations Technology Security
MERL is seeking a highly motivated and qualified intern to work on anomaly detection for operational technology security. The ideal candidate would have significant research experience in anomaly detection, machine learning, and cybersecurity for operational technology. A mature understanding of modern machine learning methods, proficiency with Python and PyTorch, and a relevant research publication history are expected. Candidates at or beyond the middle of their Ph.D. program are encouraged to apply. The expected duration is 3 months, with flexible start dates (ideally in December or early January).
Required Specific Experience
- Proficiency with the PyTorch framework.
- Research publications in machine learning and anomaly detection.
-
Openings
-
CA0093: Research Scientist - Control for Autonomous Systems
-
EA0042: Research Scientist - Control & Learning
-
Recent Publications
- "Decentralized, Safe, Multi-agent Motion Planning for Drones Under Uncertainty via Filtered Reinforcement Learning", IEEE Transactions on Control Systems Technology, DOI: 10.1109/TCST.2024.3433229, Vol. 32, No. 6, pp. 2492-2499, January 2025. TR2024-136
- @article{Vinod2025jan,
- author = {Vinod, Abraham P. and Safaoui, Sleiman and Summers, Tyler and Yoshikawa, Nobuyuki and Di Cairano, Stefano},
- title = {Decentralized, Safe, Multi-agent Motion Planning for Drones Under Uncertainty via Filtered Reinforcement Learning},
- journal = {IEEE Transactions on Control Systems Technology},
- year = 2025,
- volume = 32,
- number = 6,
- pages = {2492--2499},
- month = jan,
- doi = {10.1109/TCST.2024.3433229},
- url = {https://www.merl.com/publications/TR2024-136}
- }
- "Slaying the HyDRA: Parameter-Efficient Hyper Networks with Low-Displacement Rank Adaptation", Advances in Neural Information Processing Systems (NeurIPS), December 2024. TR2024-157
- @inproceedings{Chen2024dec,
- author = {Chen, Xiangyu and Wang, Ye and Brand, Matthew and Wang, Pu and Liu, Jing and Koike-Akino, Toshiaki},
- title = {Slaying the HyDRA: Parameter-Efficient Hyper Networks with Low-Displacement Rank Adaptation},
- booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
- year = 2024,
- month = dec,
- url = {https://www.merl.com/publications/TR2024-157}
- }
- "SuperLoRA: Parameter-Efficient Unified Adaptation of Large Foundation Models", British Machine Vision Conference (BMVC), November 2024. TR2024-156
- @inproceedings{Chen2024nov,
- author = {Chen, Xiangyu and Liu, Jing and Wang, Ye and Wang, Pu and Brand, Matthew and Wang, Guanghui and Koike-Akino, Toshiaki},
- title = {SuperLoRA: Parameter-Efficient Unified Adaptation of Large Foundation Models},
- booktitle = {British Machine Vision Conference (BMVC)},
- year = 2024,
- month = nov,
- url = {https://www.merl.com/publications/TR2024-156}
- }
- "RETR: Multi-View Radar Detection Transformer for Indoor Perception", Advances in Neural Information Processing Systems (NeurIPS), November 2024. TR2024-159
- @inproceedings{Yataka2024nov3,
- author = {Yataka, Ryoma and Cardace, Adriano and Wang, Pu and Boufounos, Petros T. and Takahashi, Ryuhei},
- title = {RETR: Multi-View Radar Detection Transformer for Indoor Perception},
- booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
- year = 2024,
- month = nov,
- url = {https://www.merl.com/publications/TR2024-159}
- }
- "Single-pixel imaging of spatio-temporal flows using differentiable latent dynamics", IEEE Transactions on Computational Imaging, October 2024. TR2024-151
- @article{Sholokhov2024oct,
- author = {Sholokhov, Aleksei and Nabi, Saleh and Rapp, Joshua and Brunton, Steven and Kutz, Nathan and Boufounos, Petros T. and Mansour, Hassan},
- title = {Single-pixel imaging of spatio-temporal flows using differentiable latent dynamics},
- journal = {IEEE Transactions on Computational Imaging},
- year = 2024,
- month = oct,
- url = {https://www.merl.com/publications/TR2024-151}
- }
- "AI-assisted Field Plate Design of GaN HEMT Device", Advanced Theory and Simulation, October 2024. TR2024-152
- @article{Xiang2024oct,
- author = {Xiang, Xiaofeng and Palash, Rafid and Yagyu, Eiji and Dunham, Scott and Teo, Koon Hoo and Chowdhury, Nadim},
- title = {AI-assisted Field Plate Design of GaN HEMT Device},
- journal = {Advanced Theory and Simulation},
- year = 2024,
- month = oct,
- url = {https://www.merl.com/publications/TR2024-152}
- }
- "Learning control of underactuated double pendulum with Model-Based Reinforcement Learning", Competition: AI Olympics With RealAIGym, October 2024. TR2024-142
- @inproceedings{DallaLibera2024oct,
- author = {Dalla Libera, Alberto and Turcato, Niccolò and Giacomuzzo, Giulio and Carli, Ruggero and Romeres, Diego},
- title = {Learning control of underactuated double pendulum with Model-Based Reinforcement Learning},
- booktitle = {Competition: AI Olympics With RealAIGym},
- year = 2024,
- month = oct,
- url = {https://www.merl.com/publications/TR2024-142}
- }
- "Analyzing Inference Privacy Risks Through Gradients In Machine Learning", ACM Conference on Computer and Communications Security (CCS), October 2024. TR2024-141
- @inproceedings{Li2024oct,
- author = {Li, Zhuohang and Lowy, Andrew and Liu, Jing and Koike-Akino, Toshiaki and Parsons, Kieran and Malin, Bradley and Wang, Ye},
- title = {Analyzing Inference Privacy Risks Through Gradients In Machine Learning},
- booktitle = {ACM Conference on Computer and Communications Security (CCS)},
- year = 2024,
- month = oct,
- url = {https://www.merl.com/publications/TR2024-141}
- }
-
Videos
-
Software & Data Downloads
-
DeepBornFNO -
eeg-subject-transfer -
ComplexVAD Dataset -
Millimeter-wave Multi-View Radar Dataset -
Gear Extensions of Neural Radiance Fields -
Long-Tailed Anomaly Detection (LTAD) Dataset -
Target-Speaker SEParation -
Pixel-Grounded Prototypical Part Networks -
Steered Diffusion -
BAyesian Network for adaptive SAmple Consensus -
Simple Multimodal Algorithmic Reasoning Task Dataset -
Partial Group Convolutional Neural Networks -
SOurce-free Cross-modal KnowledgE Transfer -
Audio-Visual-Language Embodied Navigation in 3D Environments -
Nonparametric Score Estimators -
3D MOrphable STyleGAN -
Instance Segmentation GAN -
Audio Visual Scene-Graph Segmentor -
Generalized One-class Discriminative Subspaces -
Hierarchical Musical Instrument Separation -
Generating Visual Dynamics from Sound and Context -
Adversarially-Contrastive Optimal Transport -
Online Feature Extractor Network -
MotionNet -
FoldingNet++ -
Quasi-Newton Trust Region Policy Optimization -
Landmarks’ Location, Uncertainty, and Visibility Likelihood -
Robust Iterative Data Estimation -
Gradient-based Nikaido-Isoda -
Circular Maze Environment -
Discriminative Subspace Pooling -
Kernel Correlation Network -
Fast Resampling on Point Clouds via Graphs -
FoldingNet -
Deep Category-Aware Semantic Edge Detection -
MERL Shopping Dataset
-