Abstract: The talk is motivated by two questions at the interplay between algorithm design and machine learning: (1) How can we leverage the predictive power of machine learning in algorithm design? and (2) How can algorithms alleviate the computational demands of modern machine learning?
Towards the first question, I will demonstrate the power of data-driven and learning-augmented algorithm design. I will argue that data should be a central component in the algorithm design process itself. Indeed in many instances, inputs are similar across different algorithm executions. Thus, we can hope to extract information from past inputs or exploit other learned information to improve future performance. Towards this end, I will zoom in on a fruitful template for incorporating learning into algorithm design and highlight a success story in designing space efficient data structures for processing large data streams. I hope to convey that learning-augmented algorithm design should be a tool in every algorithmist's toolkit.
Then I will discuss algorithms for scalable ML computations to address the second question. I will focus on my works in understanding global similarity relationships in large high-dimensional datasets, encoded in a similarity matrix. By exploiting geometric structure of specific similarity functions, such as distance or kernel functions, we can understand the capabilities -- and fundamental limitations -- of computing on similarity matrices. Overall, my main message is that sublinear algorithms design principles are instrumental in designing scalable algorithms for big data.
I will conclude with some exciting directions in pushing the boundaries of learning-augmented algorithms, as well as new algorithmic challenges in scalable computations for faster ML.
Bio: Sandeep is a final year PhD student at MIT, advised by Piotr Indyk. His interests are broadly in fast algorithm design. Recently, he has been working in the intersection of machine learning and classical algorithms by designing provable algorithms in various ML settings, such as efficient algorithms for processing large datasets, as well as using ML to inspire algorithm design.