Doug Burger, Microsoft

Session Chair: Babak Falsafi, EPFL

Talk Title: A Return to (Limited) Explicit Dataflow Execution

Abstract: Specialization, accelerators, and machine learning are all the rage. But most of the world’s computing today still uses conventional RISC or CISC CPUs, which expend significant energy to achieve high single-thread performance. Von Neumann ISAs have been so successful because they provide a clean conceptual target to software while running the complete gamut of algorithms reasonably well. We badly need clean new abstractions that utilize fine-grain parallelism and run energy efficiently. Prior work (such as the UT-Austin TRIPS EDGE ISA and others) showed how to form blocks of computation containing limited-scope dataflow graphs, which can be thought of as small structures (DAGs) mapped to silicon. We will describe some post-TRIPS work that addresses the limitations of the original ISA, and how those extensions can provide energy-efficient execution for single threads compared to a conventional out-of-order superscalar design. We will also describe a specific microarchitecture based on these extensions, and show early results. Finally, we will describe how this notion of “structured computation” can be extended to form accelerators dynamically with minor changes to the CPU, or extended to synthesize efficient accelerators that are specialized to a specific workload.

Bio: Doug Burger is a Technical Fellow working in Microsoft's Azure Hardware division. At Microsoft, he and his team drove the Catapult and Brainwave projects, which accelerate Microsoft's cloud and deep learning services, respectively. Before moving to Microsoft in 2008, he spent ten years on the faculty of the University of Texas at Austin. At UT-Austin, he co-founded (with Steve Keckler) the TRIPS project, which invented EDGE architectures and NUCA caches, fabricating them into working silicon. He received his PhD in Computer Sciences from the University of Wisconsin in 1998.

Doug will be joined by two guest speakers during the keynote:

Aaron Smith, Microsoft

Bio: Aaron Smith is a Principal Researcher at Microsoft and a Reader (Associate Professor) in Informatics at the University of Edinburgh. He received his PhD in Computer Science from the University of Texas at Austin for pioneering work in Explicit Data Graph Execution (EDGE) architectures and leads EDGE architecture research at Microsoft. Aaron’s research has been recognized with an ASPLOS best paper award, MICRO top picks papers, and a Communications of the ACM Research Highlight. He served as General Chair of CGO 2015 and Program Chair of CGO 2017 and has over 40 patents pending.

Greg Wright, Qualcomm Research

Bio: Greg Wright is Senior Director, Engineering, and head of Qualcomm Research Raleigh, North Carolina, where he leads the processor research team that investigates new circuit, microarchitecture, and instruction set architecture techniques. Greg joined Qualcomm in 2010, after ten years at Sun Microsystems Laboratories, California. His research interests include high-performance power-efficient processors, instruction set architecture, virtual machines, and compilers. He holds degrees in mathematics from the University of Cambridge, UK, and a PhD in computer science from the University of Manchester, UK.

Kim Hazelwood, Facebook

Session Chair: Murali Annavaram, University of Southern California

Talk Title: Applied Machine Learning at Facebook Scale: Separating Opportunity from Hype

Abstract: The recent proliferation of applications of Deep Learning has had implications for nearly every subfield of computer science. While several of the enabling technologies have been around for decades, a perfect storm of algorithmic advances, high-quality data, and powerful computational platforms has accelerated the field and its practical applications in a seemingly unparalleled manner. In the past 2-3 years, an entirely new field has appeared at the intersection of machine learning and systems and has quickly gained traction, providing evidence of the broad set of challenges and opportunities in this space.

At Facebook, nearly every visible product is powered by machine learning algorithms at its core, from News Feed ranking to language translation to anomaly detection. Scaling these products to billions of global users has uncovered many fascinating scaling challenges at every layer in the systems stack, with bottlenecks surfacing in compute, storage, and network. Meanwhile, at this scale, many research ideas hit their practical limits, and the ultimate solutions often run counter to common assumptions. Being at the forefront of today’s Deep Learning Era enables a unique view into some early signals about which research directions are particularly intriguing, and which are potentially misguided and/or overinvested.

Bio: Kim Hazelwood is a Senior Engineering Manager leading the AI Infrastructure Foundation efforts at Facebook, which focus on the hardware and software platform design and efficiency for Facebook's many applied machine learning-based products and services. Prior to Facebook, Kim held positions including a tenured Associate Professor at the University of Virginia, Software Engineer at Google, and Director of Systems Research at Yahoo Labs. She received a PhD in Computer Science from Harvard University in 2004, and is the recipient of an NSF CAREER Award, the Anita Borg Early Career Award, the MIT Technology Review Top 35 Innovators under 35 Award, and the ACM SIGPLAN 10-Year Test of Time Award. She currently serves on the Board of Directors of CRA, and has authored over 50 conference papers and one book.

Kunle Olukotun, Stanford University

Session Chair: Timothy M. Pinkston, University of Southern California

Talk Title: Designing Computer Systems for Software 2.0

Abstract: Employing Machine Learning to generate models from data is replacing traditional software development in many applications. This fundamental shift in how we develop software is known as Software 2.0. However, the continued success of Software 2.0 relies on the availability of powerful, efficient and flexible computer systems. This talk will introduce a design paradigm that exploits the characteristics of Software 2.0 to create computer systems that are optimized for both programmability and performance. The key to the design paradigm is a full-stack approach that integrates algorithms, domain-specific languages, advanced compilation technology and new hardware architectures.

Bio: Kunle Olukotun is the Cadence Design Professor of Electrical Engineering and Computer Science at Stanford University. Olukotun is well known as a pioneer in multicore processor design and the leader of the Stanford Hydra chip multiprocessor (CMP) research project. Olukotun founded Afara Websystems to develop high-throughput, low-power multicore processors for server systems. The Afara multicore processor, called Niagara, was acquired by Sun Microsystems. Niagara-derived processors now power all Oracle SPARC-based servers. Olukotun currently directs the Stanford Pervasive Parallelism Lab (PPL), which seeks to proliferate the use of heterogeneous parallelism in all application areas using Domain Specific Languages (DSLs). Olukotun is a member of the Data Analytics for What’s Next (DAWN) Lab, which is developing infrastructure for usable machine learning. Olukotun is an ACM Fellow and IEEE Fellow for contributions to multiprocessors on a chip and multi-threaded processor design and is the recipient of the 2018 IEEE Harry H. Goode Memorial Award. Olukotun received his Ph.D. in Computer Engineering from The University of Michigan.

Presentation from the keynote is accessible here.