Aaron Harlap

Contact:
aaron dot harlap at gmail dot com



About me

I'm currently building machine learning systems at Determined AI.

Before that, I completed my PhD at Carnegie Mellon University, where I was advised by Greg Ganger and Phil Gibbons.

My work focuses on systems for machine learning. In particular, I am interested in designing ML systems
that can be deployed efficiently in shared computing environments, such as AWS EC2 and Microsoft Azure.

Before coming to CMU, I completed my undergraduate studies at Northeastern University.

[Resume]

Thesis

Improving ML applications in shared computing environments [pdf] [slides]

Publications

PipeDream: Generalized Pipeline Parallelism for DNN Training [link] [pdf]
Deepak Narayanan, Aaron Harlap, Amar Phanishayee, Vivek Seshadri, Nikhil R. Devanur, Gregory R. Ganger, Phillip B. Gibbons, Matei Zaharia
ACM Symposium on Operating Systems Principles, 2019 (SOSP)

Tributary: spot-dancing for elastic services with latency SLOs [link] [pdf]
Aaron Harlap, Andrew Chung, Alexey Tumanov, Gregory R. Ganger, Phillip B. Gibbons
USENIX Annual Technical Conference, 2018 (USENIX ATC'18)

PipeDream: Pipeline Parallelism for DNN Training [link] [pdf]
Aaron Harlap, Deepak Narayanan, Amar Phanishayee, Vivek Seshadri, Gregory R. Ganger, Phillip B. Gibbons
SysML, 2018 (SysML'18)

Proteus: agile ML elasticity through tiered reliability in dynamic resource markets [link] [pdf]
Aaron Harlap, Alexey Tumanov, Andrew Chung, Gregory R. Ganger, Phillip B. Gibbons
ACM European Conference on Computer Systems, 2017 (EuroSys'17)

Gaia: Geo-Distributed Machine Learning Approaching LAN Speeds [link] [pdf]
Kevin Hsieh, Aaron Harlap, Nandita Vijaykumar, Dimitris Konomis, Gregory R. Ganger, Phillip B. Gibbons, Onur Mutlu
USENIX Symposium on Networked Systems Design and Implementation, 2017 (NSDI'17)

Addressing the straggler problem for iterative convergent parallel ML [link] [pdf]
Aaron Harlap, Henggang Cui, Wei Dai, Jinliang Wei, Gregory R. Ganger, Phillip B. Gibbons, Garth A. Gibson, and Eric P. Xing
ACM Symposium on Cloud Computing, 2016 (SoCC'16)

Talks

Accelerating Deep Learning Development with Distributed Training [slides]
Boston AI Meetup, Boston MA, February 2020

Tributary: spot-dancing for elastic services with latency SLOs [slides]
Aaron Harlap, Andrew Chung, Alexey Tumanov, Gregory R. Ganger, Phillip B. Gibbons
USENIX Annual Technical Conference, 2018 (USENIX ATC'18), Boston MA, July 2018

PipeDream: Pipeline Parallelism for DNN Training [slides]
Aaron Harlap, Deepak Narayanan, Amar Phanishayee, Vivek Seshadri, Greg Ganger, Phil Gibbons
Parallel Data Lab Retreat, Bedford Springs PA, October 2017

Proteus: agile ML elasticity through tiered reliability in dynamic resource markets [slides]
Aaron Harlap, Alexey Tumanov, Andrew Chung, Gregory R. Ganger, Phillip B. Gibbons
ACM European Conference on Computer Systems, 2017 (EuroSys'17), Belgrade Serbia, April 2017

Agile Elasticity in ML: How to do it and how to take advantage of it [slides]
Aaron Harlap, Alexey Tumanov, Gregory R. Ganger and Phillip B. Gibbons
Parallel Data Lab Retreat, Bedford Springs PA, October 2016

Addressing the straggler problem for iterative convergent parallel ML [slides]
Aaron Harlap, Henggang Cui, Wei Dai, Jinliang Wei, Gregory R. Ganger, Phillip B. Gibbons, Garth A. Gibson, and Eric P. Xing
ACM Symposium on Cloud Computing, 2016 (SoCC'16), Santa Clara CA, October 2016

Addressing the straggler problem for iterative convergent parallel ML [slides]
Aaron Harlap, Henggang Cui, Wei Dai, Jinliang Wei, Gregory R. Ganger, Phillip B. Gibbons, Garth A. Gibson, and Eric P. Xing
Parallel Data Lab Retreat, Bedford Springs PA, October 2015

Teaching

Teaching assistant for Storage Systems (15746/18746), Fall 2017. [link]

Teaching assistant for Storage Systems (15746/18746), Fall 2016. [link]