Conference Program

Sunday June 3rd, 2018

6:00pm Reception (Hollywood Ballroom Terrace and Pool Deck, 7th floor)

Monday June 4th, 2018

7:30am Breakfast (Wilshire Grand Ballroom foyer/lobby area)
8:15am Welcoming Remarks (Wilshire Grand Ballroom II)
8:30am Keynote 1: Applied Machine Learning at Facebook Scale: Separating Opportunity from Hype, Kim Hazelwood, Facebook (Wilshire Grand Ballroom II)
9:30am Session 1A: Clouds & Datacenters (Wilshire Grand Ballroom I) Session 1B: Accelerators for Emerging Apps (Wilshire Grand Ballroom III)
10:30am Break (Wilshire Grand Ballroom foyer/lobby area)
11:00am Session 2A: Prefetching (Wilshire Grand Ballroom I) Session 2B: Languages & Models (Wilshire Grand Ballroom III)
12:00pm Lunch & Panel (Wilshire Grand Ballroom II)
2:00pm Session 3A: Virtual Memory (Wilshire Grand Ballroom I) Session 3B: Coherence & Memory Ordering (Wilshire Grand Ballroom III)
3:20pm Break (Wilshire Grand Ballroom foyer/lobby area)
3:50pm Session 4A: Emerging Paradigms (Wilshire Grand Ballroom I) Session 4B: Persistence (Wilshire Grand Ballroom III)
5:00pm Turing Lecture (Wilshire Grand Ballroom II)
6:00pm Turing Lecture Reception (Wilshire Grand Ballroom foyer/lobby area)
7:30pm SIGARCH/TCCA Business Meeting (Wilshire Grand Ballroom III)

Tuesday June 5th, 2018

7:30am Breakfast (Wilshire Grand Ballroom foyer/lobby area)
8:30am Keynote 2: Designing Computer Systems for Software 2.0, Kunle Olukotun, Stanford University (Wilshire Grand Ballroom II)
9:30am Session 5A: Emerging Memory 1 (Wilshire Grand Ballroom I) Session 5B: Storage (Wilshire Grand Ballroom III)
10:30am Break (Wilshire Grand Ballroom foyer/lobby area)
11:00am Session 6A: Emerging Memory 2 (Wilshire Grand Ballroom I) Session 6B: Controllers & Control Systems (Wilshire Grand Ballroom III)
12:00pm Awards Lunch (Wilshire Grand Ballroom II)
2:00pm Session 7A: Mobile Platforms (Wilshire Grand Ballroom I) Session 7B: Security (Wilshire Grand Ballroom III)
3:45pm Excursion

Wednesday June 6th, 2018

7:30am Breakfast (Wilshire Grand Ballroom foyer/lobby area)
8:30am Keynote 3: A Return to (Limited) Explicit Dataflow, Doug Burger, Microsoft (Wilshire Grand Ballroom II)
9:30am Session 8A: Machine Learning Systems 1 (Wilshire Grand Ballroom I) Session 8B: Interconnection Networks (Wilshire Grand Ballroom III)
10:50am Break (Wilshire Grand Ballroom foyer/lobby area)
11:20am Session 9A: Machine Learning Systems 2 (Wilshire Grand Ballroom I) Session 9B: GPUs (Wilshire Grand Ballroom III)
12:40pm Closing Remarks (Wilshire Grand Ballroom II)

Detailed Program

Session 1A: Clouds & Datacenters

Chair: Hadi Esmaeilzadeh

A Configurable Cloud-Scale DNN Processor for Real-Time AI - Jeremy Fowers (Microsoft), Kalin Ovtcharov (Microsoft), Michael Papamichael (Microsoft), Todd Massengill (Microsoft), Ming Liu (Microsoft), Daniel Lo (Microsoft), Shlomi Alkalay (Microsoft), Michael Haselman (Microsoft), Logan Adams (Microsoft), Mahdi Ghandi (Microsoft), Stephen Heil (Microsoft), Prerak Patel (Microsoft), Adam Sapek (Microsoft), Gabriel Weisz (Microsoft), Lisa Woods (Microsoft), Sitaram Lanka (Microsoft), Steven K. Reinhardt (Microsoft), Adrian M. Caulfield (Microsoft), Eric Chung (Microsoft), Doug Burger (Microsoft)
Virtual Melting Temperature: Managing Server Load to Minimize Cooling Overhead with Phase Change Materials - Matt Skach (Michigan), Manish Arora (AMD/UCSD), Dean Tullsen (UCSD), Jason Mars (Michigan), Lingjia Tang (Michigan)
FireSim: FPGA-Accelerated Cycle-Exact Scale-Out System Simulation in the Public Cloud - Sagar Karandikar (Berkeley), Howard Mao (Berkeley), Donggyu Kim (Berkeley), David Biancolin (Berkeley), Alon Amid (Berkeley), Dayeol Lee (Berkeley), Nathan Pemberton (Berkeley), Emmanuel Amaro (Berkeley), Colin Schmidt (Berkeley), Aditya Chopra (Berkeley), Qijing Huang (Berkeley), Kyle Kovacs (Berkeley), Borivoje Nikolic (Berkeley), Randy Katz (Berkeley), Jonathan Bachrach (Berkeley), Krste Asanovic (Berkeley)

Session 1B: Accelerators for Emerging Apps

Chair: Dionisios Pnevmatikatos

PROMISE: An End-to-End Design of a Programmable Mixed-Signal Accelerator for Machine Learning Algorithms - Prakalp Srivastava (UIUC), Mingu Kang (IBM), Sujan Kumar Gonugondla (UIUC), Sungmin Lim (UIUC), Jungwook Choi (IBM), Nam Sung Kim (UIUC), Vikram Adve (UIUC), Naresh Shanbhag (UIUC)
Computation Reuse in DNNs by Exploiting Input Similarity - Marc Riera (UPC), Jose-Maria Arnau (UPC), Antonio Gonzalez (UPC)
GenAx: A Genome Sequencing Accelerator - Daichi Fujiki (Michigan), Arun Subramaniyan (Michigan), Tianjun Zhang (Michigan), Yu Zheng (Michigan), Reetuparna Das (Michigan), David Blaauw (Michigan), Satish Narayanasamy (Michigan)

Session 2A: Prefetching

Chair: Carole-Jean Wu

Division of Labor: A More Effective Approach to Prefetching - Sushant Kondguli (Rochester), Michael Huang (Rochester)
Criticality Aware Tiered Cache Hierarchy: A Fundamental Relook at Multi-Level Cache Hierarchies - Anant Nori (Intel), Jayesh Gaur (Intel), Siddharth Rai (IIT Kanpur), Sreenivas Subramoney (Intel), Hong Wang (Intel)
Rethinking Belady's Algorithm to Accommodate Prefetching - Akanksha Jain (UT Austin), Calvin Lin (UT Austin)

Session 2B: Languages & Models

Chair: Adrian Sampson

Constructing a Weak Memory Model - Sizhuo Zhang (MIT), Muralidaran Vijayaraghavan (MIT), Andrew Wright (MIT), Mehdi Alipour (Uppsala), Arvind (MIT)
A Hardware Accelerator for Tracing Garbage Collection - Martin Maas (Berkeley), Krste Asanovic (Berkeley), John Kubiatowicz (Berkeley)
Charm: A Language for Closed-form High-level Architecture Modeling - Weilong Cui (UCSB), Yongshan Ding (Chicago), Deeksha Dangwal (UCSB), Adam Holmes (Chicago), Joseph McMahan (UCSB), Ali JavadiAbhari (Chicago), Georgios Tzimpragos (UCSB), Frederic T. Chong (Chicago), Timothy Sherwood (UCSB)

Session 3A: Virtual Memory

Chair: Yungang Bao

Get Out of the Valley: Power-Efficient Address Mapping for GPUs - Yuxi Liu (Ghent/Peking University), Xia Zhao (Ghent), Magnus Jahre (Norwegian University of Science and Technology), Zhenlin Wang (Michigan Technological University), Xiaolin Wang (Peking University), Yingwei Luo (Peking University), Lieven Eeckhout (Ghent)
Scheduling Page Table Walks for Irregular GPU Applications - Seunghee Shin (NCSU), Guilherme Cox (Rutgers), Mark Oskin (AMD/Washington), Gabriel H. Loh (AMD), Yan Solihin (NCSU), Abhishek Bhattacharjee (Rutgers), Arkaprava Basu (IISc)
SEESAW: Using Superpages to Improve VIPT Caches - Mayank Parasar (Georgia Tech), Abhishek Bhattacharjee (Rutgers), Tushar Krishna (Georgia Tech)
A Case for Richer Cross-layer Abstractions: Bridging the Semantic Gap to Enhance Memory Optimization - Nandita Vijaykumar (CMU), Abhilasha Jain (CMU), Diptesh Majumdar (CMU), Kevin Hsieh (CMU), Gennady Pekhimenko (Toronto), Eiman Ebrahimi (NVIDIA), Nastaran Hajinazar (Simon Fraser), Phillip B. Gibbons (CMU), Onur Mutlu (ETH)

Session 3B: Coherence & Memory Ordering

Chair: Daniel Sanchez

Non-Speculative Store Coalescing in Total Store Order - Alberto Ros (University of Murcia), Stefanos Kaxiras (Uppsala)
Dynamic Memory Dependence Predication - Zhaoxiang Jin (Michigan Technological University), Soner Onder (Michigan Technological University)
ProtoGen: Automatically Generating Directory Cache Coherence Protocols from Atomic Specifications - Nicolai Oswald (Edinburgh), Vijay Nagarajan (Edinburgh), Daniel Sorin (Duke)
Spandex: A Generalized Interface for Flexible Heterogeneous Coherence - Johnathan Alsop (UIUC), Matthew D. Sinclair (UIUC/AMD/Wisconsin), Sarita V. Adve (UIUC)

Session 4A: Emerging Paradigms

Chair: Abhishek Bhattacharjee

Flexon: A Cost-Effective Digital Neuron for Flexible Spiking Neural Network Simulation - Dayeol Lee (Berkeley), Gwangmu Lee (Seoul National), Dongup Kwon (Seoul National), Sunghwa Lee (Seoul National), Youngsok Kim (Seoul National), Jangwoo Kim (Seoul National)
Space-Time Algebra: A Model for Neocortical Computation - James E. Smith (Wisconsin)
Architecting a Stochastic Computing Unit with Molecular Optical Devices - Xiangyu Zhang (Duke), Ramin Bashizade (Duke), Craig LaBoda (Duke), Chris Dwyer (Parabon Labs), Alvin R. Lebeck (Duke)

Session 4B: Persistence

Chair: Hsien-Hsin Lee

Density Tradeoffs of Non-Volatile Memory as a Replacement for SRAM based Last Level Cache - Ishwar Bhati (Intel), Kunal Korgaonkar (UCSD), Jayesh Gaur (Intel), Huichu Liu (Intel), Sasikanth Manipatruni (Intel), Steven Swanson (UCSD), Tanay Karnik (Intel), Sreenivas Subramoney (Intel), Ian A. Young (Intel), Hong Wang (Intel)
ACCORD: Enabling Associativity for Gigascale DRAM Caches by Coordinating Way-Install and Way-Prediction - Vinson Young (Georgia Tech), Chiachen Chou (Google), Aamer Jaleel (NVIDIA), Moinuddin Qureshi (Georgia Tech)
RANA: Towards Efficient Neural Acceleration with Refresh-Optimized Embedded DRAM - Fengbin Tu (Tsinghua), Weiwei Wu (Tsinghua), Shouyi Yin (Tsinghua), Leibo Liu (Tsinghua), Shaojun Wei (Tsinghua)

Session 5A: Emerging Memory 1

Chair: Yoav Etsion

Scaling Datacenter Accelerators With Compute-Reuse Architectures - Adi Fuchs (Princeton), David Wentzlaff (Princeton)
Enabling Scientific Computing on Memristive Accelerators - Ben Feinberg (Rochester), Uday Kumar Reddy Vengalam (Rochester), Nathan Whitehair (Rochester), Shibo Wang (Rochester), Engin Ipek (Rochester)
Neural Cache: Bit-Serial In-Cache Acceleration of Deep Neural Networks - Charles Eckert (Michigan), Xiaowei Wang (Michigan), Jingcheng Wang (Michigan), Arun Subramaniyan (Michigan), Ravi Iyer (Intel), Dennis Sylvester (Michigan), David Blaauw (Michigan), Reetuparna Das (Michigan)

Session 5B: Storage

Chair: Christina Delimitrou

FLIN: Enabling Fairness and Enhancing Performance in Modern NVMe Solid State Drives - Arash Tavakkol (ETH), Mohammad Sadrosadati (ETH/Sharif), Saugata Ghose (CMU), Jeremie Kim (ETH), Yixin Luo (CMU), Yaohua Wang (ETH/NUDT), Nika Mansouri Ghiasi (ETH), Lois Orosa (ETH/Campinas), Juan Gomez Luna (ETH), Onur Mutlu (ETH/CMU)
GraFBoost: Using Accelerated Flash Storage for External Graph Analytics - Sang-Woo Jun (MIT), Andy Wright (MIT), Sizhuo Zhang (MIT), Shuotao Xu (MIT), Arvind (MIT)
The Case for Dual, Byte- and Block-Addressable SSD - Duck-Ho Bae (Samsung), Insoon Jo (Samsung), Adel Choi (Samsung), Jooyoung Hwang (Samsung), Sangyeun Cho (Samsung), Daniel DG Lee (Samsung), Jaeheon Jeong (Samsung)

Session 6A: Emerging Memory 2

Chair: Thomas Wenisch

Lazy Persistency: A High-Performing and Write-Efficient Software Persistency Technique - Mohammad Alshboul (NCSU), James Tuck (NCSU), Yan Solihin (NCSU)
DHTM: Durable Hardware Transactional Memory - Arpit Joshi (Edinburgh), Vijay Nagarajan (Edinburgh), Marcelo Cintra (Intel), Stratis Viglas (Google)
Hardware Supported Permission Checks On Persistent Objects For Performance and Programmability - Tiancong Wang (NCSU), Sakthikumaran Sambasivam (NCSU), James Tuck (NCSU)

Session 6B: Controllers & Control Systems

Chair: Reetuparna Das

RoboX: An End-to-End Solution to Accelerate Autonomous Control in Robotics - Jacob Sacks (Georgia Tech), Behnam Khaleghi (UCSD), Divya Mahajan (Georgia Tech), R. Connor Lawson (Georgia Tech), Hadi Esmaeilzadeh (UCSD)
DCS-ctrl: A Fast and Flexible Device-Control Mechanism for Device-Centric Server Architecture - Dongup Kwon (Seoul National), Jaehyung Ahn (POSTECH), Dongju Chae (POSTECH), Mohammadamin Ajdari (POSTECH), Jaewon Lee (Seoul National), Suheon Bae (Seoul National), Youngsok Kim (Seoul National), Jangwoo Kim (Seoul National)
Multilayer Resource Controllers to Maximize Efficiency - Raghavendra Pradyumna Pothukuchi (UIUC), Sweta Yamini Pothukuchi (UIUC), Petros Voulgaris (UIUC), Josep Torrellas (UIUC)

Session 7A: Mobile Platforms

Chair: Natalie Enright Jerger

Exploring Predictive Replacement Policies for Instruction Cache and Branch Target Buffer - Samira Mirbagher (Texas A&M), Elba Garza (Texas A&M), Sangam Jindal (Texas A&M), Daniel A. Jimenez (Texas A&M)
EVA^2: Exploiting Temporal Redundancy in Live Computer Vision - Mark Buckler (Cornell), Philip Bedoukian (Cornell), Suren Jayasuriya (ASU), Adrian Sampson (Cornell)
Euphrates: Algorithm-SoC Co-Design for Low-Power Mobile Continuous Vision - Yuhao Zhu (Rochester), Ananda Samajdar (Georgia Tech), Matt Mattina (ARM), Paul Whatmough (ARM)
Guaranteeing Local Differential Privacy on Ultra-low-power Systems - Wooseok Choi (UIUC), Jose Rodrigo Sanchez Vicarte (UIUC), Matthew Tomei (UIUC), Pavan Kumar Hanumolu (UIUC), Rakesh Kumar (UIUC)
Stitch: Fusible Heterogeneous Accelerators Enmeshed with Many-Core Architecture for Wearables - Cheng Tan (NUS), Manupa Karunaratne (NUS), Tulika Mitra (NUS), Li-Shiuan Peh (NUS)

Session 7B: Security

Chair: Gilles Pokam

Nonblocking Memory Refresh - Kate Nguyen (Virginia Tech), Kehan Lyu (Virginia Tech), Xianze Meng (Virginia Tech), Vilas Sridharan (AMD), Xun Jian (Virginia Tech)
Practical Memory Safety with REST - Kanad Sinha (Columbia), Simha Sethumadhavan (Columbia)
Mitigating Wordline Crosstalk using Adaptive Trees of Counters - Mohammad Seyedzadeh (Pittsburgh), Alex Jones (Pittsburgh), Rami Melhem (Pittsburgh)
Mobilizing the Micro-Ops: Exploiting Context Sensitive Decoding for Security and Energy Efficiency - Mohammadkazem Taram (UCSD), Ashish Venkat (UCSD), Dean Tullsen (UCSD)
Hiding Intermittent Information Leakage with Architectural Support for Blinking - Alric Althoff (UCSD), Joseph McMahan (UCSB), Luis Vega (Washington), Scott Davidson (Washington), Timothy Sherwood (UCSB), Michael Taylor (Washington), Ryan Kastner (UCSD)

Session 8A: Machine Learning Systems 1

Chair: Antonio Gonzalez

GANAX: A Unified SIMD-MIMD Acceleration for Generative Adversarial Network - Amir Yazdanbakhsh (Georgia Tech), Hajar Falahati (IPM), Philip J. Wolfe (Georgia Tech), Kambiz Samadi (Qualcomm), Hadi Esmaeilzadeh (UCSD), Nam Sung Kim (UIUC)
SnaPEA: Predictive Early Activation for Reducing Computation in Deep Convolutional Neural Networks - Vahideh Akhlaghi (UCSD), Amir Yazdanbakhsh (Georgia Tech), Kambiz Samadi (Qualcomm), Hadi Esmaeilzadeh (UCSD), Rajesh K. Gupta (UCSD)
UCNN: Exploiting Computational Reuse in Deep Neural Networks via Weight Repetition - Kartik Hegde (UIUC), Jiyong Yu (UIUC), Rohit Agrawal (UIUC), Mengjia Yan (UIUC), Michael Pellauer (NVIDIA), Christopher Fletcher (UIUC)
An Energy-Efficient Neural Network Accelerator based on Outlier-Aware Low Precision Computation - Eunhyeok Park (Seoul National), Dongyoung Kim (Seoul National), Sungjoo Yoo (Seoul National)

Session 8B: Interconnection Networks

Chair: Mikko Lipasti

Synchronized Progress in Interconnection Networks (SPIN): A New Theory for Deadlock Freedom - Aniruddh Ramrakhyani (Georgia Tech), Paul Gratz (Texas A&M), Tushar Krishna (Georgia Tech)
TCEP: Traffic Consolidation for Energy-Proportional High-Radix Networks - Gwangsun Kim (ARM), Hayoung Choi (KAIST), John Kim (KAIST)
Modular Routing Design for Chiplet-based Systems - Jieming Yin (AMD), Zhifeng Lin (AMD/USC), Onur Kayiran (AMD), Matthew Poremba (AMD), Muhammad Shoaib Bin Altaf (AMD), Natalie Enright Jerger (AMD/Toronto), Gabriel H. Loh (AMD)
FastTrack: Leveraging Heterogeneous FPGA Wires to Design Low-cost High-performance Soft NoCs - Nachiket Kapre (Waterloo), Tushar Krishna (Georgia Tech)

Session 9A: Machine Learning Systems 2

Chair: Jangwoo Kim

Prediction based Execution on Deep Neural Networks - Mingcong Song (Florida), Jiechen Zhao, Yang Hu, Jiaqi Zhang, Tao Li (Florida)
Bit Fusion: Bit-Level Dynamically Composable Architecture for Accelerating Deep Neural Networks - Hardik Sharma (Georgia Tech), Jongse Park (Georgia Tech), Naveen Suda (ARM), Liangzhen Lai (ARM), Benson Chau (Georgia Tech), Joon Kyung Kim (Georgia Tech), Vikas Chandra (ARM), Hadi Esmaeilzadeh (UCSD)
Gist: Efficient Data Encoding for Deep Neural Network Training - Animesh Jain (Michigan), Amar Phanishayee (Microsoft), Jason Mars (Michigan), Lingjia Tang (Michigan), Gennady Pekhimenko (Toronto)
The Dark Side of DNN Pruning - Reza Yazdani (UPC), Marc Riera (UPC), Jose-Maria Arnau (UPC), Antonio Gonzalez (UPC)

Session 9B: GPUs

Chair: Tor Aamodt

HetCore: TFET-CMOS Hetero-Device Architecture for CPUs and GPUs - Bhargava Gopireddy (UIUC), Dimitrios Skarlatos (UIUC), Wenjuan Zhu (UIUC), Josep Torrellas (UIUC)
RegMutex: Inter-Warp GPU Register Time-Sharing - Farzad Khorasani (Georgia Tech), Hodjat Asghari Esfeden (UCR), Amin Farmahini-Farahani (AMD), Nuwan Jayasena (AMD), Vivek Sarkar (Georgia Tech)
The Locality Descriptor: A Holistic Abstraction to Exploit Data Locality in GPUs - Nandita Vijaykumar (CMU), Eiman Ebrahimi (NVIDIA), Kevin Hsieh (CMU), Phillip B. Gibbons (CMU), Onur Mutlu (ETH)
Generic System Calls for GPUs - Jan Vesely (Rutgers), Arkaprava Basu (IISc), Abhishek Bhattacharjee (Rutgers), Gabriel H. Loh (AMD), Mark Oskin (AMD), Steven K. Reinhardt (Microsoft)