Title: Optimizing Communication in Distributed Data Analytics
Speaker: Qinyi Luo, Ph.D. student, University of Southern California, USA
Time: 9:00 - 10:00 AM
In the era of big data, machine learning and graph analytics have emerged as two important applications for extracting insights from data. Both must process large volumes of data, e.g., large training sets in machine learning and graphs with billions of edges, which gives rise to distributed machine learning and distributed graph analytics. In distributed processing, communication between machines is a key factor in performance.
In this talk, I will describe two distributed frameworks with optimized communication that we recently developed. First, I will present HOP, a heterogeneity-aware decentralized machine learning framework that eliminates the communication bottleneck of the traditional parameter server and better tolerates system heterogeneity. Based on a key observation about the iteration gap, we develop queue-based synchronization mechanisms that efficiently support backup workers, bounded staleness, and a novel technique called iteration jump. Second, I will present SympleGraph, a novel distributed graph processing framework that breaks the black box of user-defined functions (UDFs), eliminating redundant computation and communication. The key insight is to close the semantic gap between the expressed algorithm and its distributed execution.
Qinyi Luo (http://alchem.usc.edu/~qinyi/) is a Ph.D. student at the University of Southern California. Her research interests are distributed machine learning and data analytics. She received her B.Eng. degree from the Department of Electronic Engineering, Tsinghua University. She has published several papers at top conferences such as ASPLOS, MICRO, and Interspeech. She has received a number of honors and awards, including Outstanding Graduate of Beijing, the National Scholarship of China, and the WeTech Qualcomm Global Scholars Award. She was nominated for the Microsoft Research Ada Lovelace Fellowship.