Efficient Machine Learning for Intelligent Machines
Abstract:
Machine learning has long powered intelligent machines such as mobile devices and robots. Yet as generative models deliver transformative results in artificial intelligence, they increasingly outstrip available computational resources. My research addresses the gap between the computational cost of machine learning models and the computational capacity of intelligent machines. Beyond closing this gap, my research harnesses efficiency to enable applications that were previously impossible, such as AI-driven design and AI video generation on edge devices.
In this talk, I will present how I use the strategy of system-level recomposition and algorithm-level decomposition for accelerated machine learning. First, from a systems perspective, I will show how I recompose "loose" visual generative systems into "compact" ones for real-time streaming applications. Our generative system achieves speedups of up to 59× over leading Hugging Face pipelines. I will then illustrate how these efficient approaches enable new applications in edge devices, art design, graphics, and robotics.
Next, from an algorithmic angle, I will introduce how I decompose models and data to make them smaller. This decomposition strategy provides a new solution for LLM compression, offering a hardware-flexible alternative to current quantization and pruning methods. Beyond accelerating LLMs, I will demonstrate its efficiency on vision-language models (VLMs) and on vision-language-action systems for robotics. Finally, I will show how decomposing data enables efficient perception for autonomous driving.
I will conclude by outlining my vision for efficient machine learning and the closed-loop relationship between machine learning and intelligent machines.
BIO: Chenfeng Xu is a Ph.D. candidate at UC Berkeley, advised by Kurt Keutzer and Masayoshi Tomizuka. His research focuses on efficient machine learning and intelligent machines, accelerating AI applications in embodied AI and generative AI on edge devices. His work has been recognized with notable-paper awards and oral presentations at top conferences such as ICLR, CoRL, and ICRA. Chenfeng was awarded the Qualcomm Innovation Fellowship, and his research has been supported by the Google Generative AI Grant, Meta, NVIDIA, Toyota Research Institute, Qualcomm, and Stellantis. His work has been widely adopted in industry, powering platforms such as Meta’s efficient deep learning toolkit for mobile devices, Baidu Apollo’s autonomous driving systems, and LivePeer’s video infrastructure.