
Machine Learning Thesis Defense

Speaker
BISWAJIT PARIA
Ph.D. Candidate
Machine Learning Department
Carnegie Mellon University

When
-

Where
In Person and Remote - ET

Description
Black-box optimization (BBO) problems occur frequently in many engineering and scientific disciplines, where one has access only to zeroth-order evaluations of a function (the black box) that must be optimized over a specified domain. In many situations the function is expensive to evaluate, so the number of evaluations is limited by a budget. A popular class of algorithms, known as Bayesian optimization, models the black-box function via surrogates and proceeds by evaluating the points most likely to lead to the optimum. Multi-objective optimization (MOO) is another topic in optimization, where the goal is to simultaneously optimize multiple objectives defined over a common domain. Typically these objectives do not achieve their optima at the same inputs, so rather than searching for a single best solution, a set of Pareto optimal solutions is desired. In this thesis, we study a variety of problems related to BBO and MOO, with applications to machine learning.

The first half of this thesis concerns BBO for expensive functions. First, we propose a simple and flexible approach to multi-objective black-box optimization (MOBO) based on the idea of random scalarizations. We introduce a notion of multi-objective regret and show that our strategy achieves zero regret as the budget grows. Next, we propose a cost-aware Bayesian optimization framework that takes into account the cost of each evaluation. This approach is useful in settings where the evaluation cost varies across the input domain and low-cost evaluations can provide a large amount of information about the maximum. Lastly, we study the effectiveness of neural networks for expensive BBO. We show that a simple greedy approach can achieve performance close to that of Gaussian process Bayesian optimization, and, using recently studied connections between Gaussian processes and the training dynamics of very wide neural networks, we prove upper bounds on the regret of our proposed algorithm.

The second half of this thesis applies MOO to two differentiable problems. Our first application is learning sparse embeddings for fast retrieval with neural networks, where the objectives are retrieval accuracy and retrieval speed. We introduce a novel sparsity regularizer and demonstrate an annealing strategy that yields a better Pareto frontier than competing methods. For our second application, we consider hierarchical time series forecasting, where multiple related time series are organized as a hierarchy. We propose an approach that accounts for the hierarchical structure while remaining scalable to large hierarchies, and show that it improves accuracy across most hierarchical levels.

Thesis Committee:
Barnabás Póczos (Co-chair)
Jeff Schneider (Co-chair)
Zachary Lipton
Abhimanyu Das (Google Research)

In Person and Zoom Participation. See announcement.
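
As an illustration of the random-scalarization idea in the first contribution, here is a minimal Python sketch, not the thesis's algorithm: the toy surrogates, candidate pool, and weighted-sum scalarization are all assumptions made here for brevity. Each acquisition step draws a random weight vector on the probability simplex, scalarizes the objective surrogates, and selects the candidate maximizing the scalarized score; repeating with fresh weights spreads evaluations across the Pareto front.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy surrogates standing in for posterior means of two objectives
    # that peak at different inputs, so no single point optimizes both.
    def surrogate(X):
        f1 = -np.sum((X - 0.25) ** 2, axis=1)
        f2 = -np.sum((X - 0.75) ** 2, axis=1)
        return np.stack([f1, f2], axis=1)   # shape (n, 2)

    # One acquisition step of random-scalarization MOBO over a candidate pool.
    candidates = rng.uniform(0.0, 1.0, size=(1000, 2))
    w = rng.dirichlet(np.ones(2))           # random weights on the simplex
    scores = surrogate(candidates) @ w      # weighted-sum scalarization
    x_next = candidates[np.argmax(scores)]  # point to evaluate next
    print("weights:", w, "next point:", x_next)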
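Cost-awareness can likewise be illustrated with a standard baseline, expected improvement per unit cost, which divides the acquisition value by an evaluation-cost estimate so that cheap but informative points are preferred. This is a generic heuristic, not the specific framework proposed in the thesis; the toy posterior and costs below are invented for the example.

    import numpy as np
    from scipy.stats import norm

    def expected_improvement(mu, sigma, best):
        # Standard EI for maximization under a Gaussian posterior (mu, sigma),
        # given the incumbent best observed value.
        sigma = np.maximum(sigma, 1e-9)
        z = (mu - best) / sigma
        return (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)

    def ei_per_unit_cost(mu, sigma, best, cost):
        # Divide the acquisition value by the (positive) evaluation cost,
        # so cheap-but-informative candidates win.
        return expected_improvement(mu, sigma, best) / np.maximum(cost, 1e-9)

    # Toy posterior over five candidates with varying evaluation costs.
    mu    = np.array([0.9, 1.1, 1.0, 0.8, 1.2])
    sigma = np.array([0.1, 0.3, 0.2, 0.4, 0.1])
    cost  = np.array([1.0, 5.0, 2.0, 0.5, 10.0])
    print(np.argmax(ei_per_unit_cost(mu, sigma, best=1.0, cost=cost)))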
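The greedy neural-network strategy in the third contribution can be sketched as a loop that fits a network to all observations so far, then evaluates the candidate whose prediction is highest. The sketch below uses scikit-learn's MLPRegressor and a toy hidden objective as stand-ins; it does not reproduce the wide-network (neural tangent kernel) analysis behind the thesis's regret bounds.

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(0)
    black_box = lambda X: -np.sum((X - 0.3) ** 2, axis=1)  # hidden toy objective

    # Initialize with a few random evaluations.
    X = rng.uniform(0.0, 1.0, size=(5, 2))
    y = black_box(X)

    for _ in range(20):  # evaluation budget
        net = MLPRegressor(hidden_layer_sizes=(64, 64),
                           max_iter=2000, random_state=0).fit(X, y)
        candidates = rng.uniform(0.0, 1.0, size=(500, 2))
        # Greedy step: evaluate the candidate with the highest prediction.
        x_next = candidates[np.argmax(net.predict(candidates))]
        X = np.vstack([X, x_next])
        y = np.append(y, black_box(x_next[None]))

    print("best input:", X[np.argmax(y)], "best value:", y.max())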
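Finally, the annealing strategy for the sparse-embedding application can be indicated in outline: ramp the weight on a sparsity penalty from zero toward its final value during training, so the accuracy objective dominates early and sparsity is imposed gradually. The L1 penalty, schedule, and toy reconstruction task below are placeholders; the thesis's regularizer is its own construction.

    import torch

    torch.manual_seed(0)
    emb = torch.nn.Embedding(1000, 64)       # toy item embeddings
    targets = torch.randn(1000, 64)          # stand-in retrieval targets
    opt = torch.optim.Adam(emb.parameters(), lr=1e-2)

    def sparsity_weight(step, total, max_w=1e-2):
        # Anneal the penalty weight from 0 up to max_w over the first
        # half of training, then hold it fixed.
        return max_w * min(1.0, step / (0.5 * total))

    total = 500
    for step in range(total):
        ids = torch.randint(0, 1000, (128,))
        vecs = emb(ids)
        task_loss = ((vecs - targets[ids]) ** 2).mean()  # placeholder accuracy term
        l1 = vecs.abs().mean()                           # generic L1 stand-in
        loss = task_loss + sparsity_weight(step, total) * l1
        opt.zero_grad()
        loss.backward()
        opt.step()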