
Computer Engineering Seminar

Proportional Error Tolerance for Efficient and Effective Performance and Reliability

Mattan Erez, Assistant Professor, University of Texas at Austin

Under increasing constraints of bandwidth, power, and energy, resource proportionality can significantly boost overall performance and reduce cost. In this talk I will focus on my group's vision and initial work towards achieving error protection with overheads that are proportional to actual application needs. Specifically, I will discuss dynamic and flexible cooperative error protection and variable-granularity memory access. Flexible cooperative protection recognizes that different computations and data require different degrees of protection, and hence different amounts of resources for acceptable execution outcomes. For example, our technique of virtualizing memory protection enables such dynamic reliability tradeoffs and relaxes design constraints to significantly improve error-protection efficiency without sacrificing protection level. Variable-granularity memory access is an example of the benefits of flexible protection and relaxes the trend of ever-increasing access granularities. Adaptive granularity improves utilization of scarce memory bandwidth and operating power severalfold for applications with fine-grained gather/scatter access patterns. I will also introduce our approach to incorporating reliability tradeoffs within the programming model using our containment domains framework. Containment domains encapsulate application-specific protection strategies and are a key aspect of our research on extreme-scale systems.
Mattan Erez is an Assistant Professor in the Department of Electrical and Computer Engineering at the University of Texas at Austin. His research aims to improve the scalability and efficiency of performance-demanding compute platforms at all scales. Mattan's research spans the entire system from low-level microarchitecture to programming models. His most recent work has been on improving resource proportionality with a focus on reliability. Mattan received a B.Sc. in Electrical Engineering and a B.A. in Physics from the Technion, Israel Institute of Technology in 1999. He subsequently received his M.S. and Ph.D. in Electrical Engineering from Stanford University in 2002 and 2007, respectively. His experience includes working as a computer architect in the Israeli Processor Architecture Research team at Intel Corporation and as a member of the Merrimac streaming supercomputer and Sequoia programming system projects at Stanford.
