FastPath 2014 - Speakers
"Architectural Musings: Rethinking Mobile Computer Systems Architecture & Evaluation"
Abstract: Modern computer systems are complex combinations of hardware and software. These systems have evolved over many years, and their basic structure is derived from a fundamental body of work in computer science and computer architecture that was developed many years ago. This body of work focused on optimizing the utilization of scarce resources, and working around key system bottlenecks in order to extract the most value possible from computer systems. In modern mobile computer systems, the set of scarce resources and the set of system bottlenecks have changed quite dramatically from those older systems. This talk explores some of the changes in scarce resources and system bottlenecks, and how those changes might affect both the way that we evaluate such systems and the systems' design going forward.
Bio: Christopher Vick is a Principal Engineer and research project lead in Qualcomm Research Silicon Valley. He obtained an MSc in Computer Science from Rice University in 1994, a JD from Columbia University and a BA from Rice University in 1984, and was named an ACM Distinguished Engineer in 2006. His research interests include computer architecture, hardware/software co-design, system level software & virtualization technologies, and runtime optimization & code generation. Prior to Qualcomm, Chris was at Sun Microsystems, Inc., where he was one of the original authors of the HotSpot™ Java Virtual Machine Server Compiler, and led research efforts in Sun Labs on topics ranging from systems software and virtualization for a supercomputer to microprocessor and memory architectures and virtualization. Prior to Sun, Chris worked on compilers, tools, and microprocessor architecture at Texas Instruments, Inc.
"The Role of High Performance Storage in Workload Optimized Systems"
Abstract: It is well known that storage bandwidth has not been able to keep up with the tremendous increase in CPU speed, memory throughput, and CPU IO throughput. That has been changing of late, as technologies such as Flash have made dramatic increases in throughput and IOPS, and have done so in smaller densities and with less power dissipated than HDDs. This talk will explore the state of the art of high performance storage, including direct-attached as well as SAN-attached storage, and how such storage is a powerful accelerator for many types of workload optimized systems, such as analytics.
Bio: Andy Walls has worked for IBM his entire 32-year career. He was appointed a Distinguished Engineer in 2006 and has been located in San Jose, California since 1987. Andy has worked in storage since that time and is an industry-recognized expert in storage systems architecture and NAND Flash storage for the enterprise. Andy led the acquisition of Texas Memory Systems in 2012. He is currently the CTO and Chief Architect for Flash Systems, which includes the TMS team. Andy is working on the next generation of high performance systems. He is known as an innovator and has over 70 patents to his name. He graduated from UC Santa Barbara with a BSEE in 1981.
"Building Workload Optimized Solutions for Business Analytics"
Abstract: The end of Dennard scaling has put a stop to the free performance gains in software from merely upgrading hardware, which the industry took for granted for so long. By narrowing the scope of applications and workloads, one hopes to continue the growth in the form of workload optimized solutions. While there is agreement that the development of workload optimized systems is a necessity, it is usually not clear which technology to bet on. GPUs and FPGAs seem to be the most actively discussed candidates nowadays. Success stories at IBM like BLU Acceleration for DB2, however, suggest that special-purpose software for general-purpose hardware may be an alternative that should not be overlooked.
In this talk, I will present a study on hard- and software acceleration for query processing in business analytics. We will discuss predicate evaluation on encoded data, analyze and compare implementations on GPUs and FPGAs with optimized SIMD code for CPUs. I will also talk about how we can take advantage of the high degree of parallelism, latency hiding, and the fast device memory to implement hash joins on GPUs.
Bio: Rene Mueller is a Research Staff Member at IBM Research -- Almaden in San Jose. His research interests are hard-/software co-design and acceleration of data management systems. At IBM, he has worked on BLU Acceleration for DB2. Rene spends a lot of time figuring out new ways to use FPGAs and GPUs to run queries faster. He obtained his PhD and MSc in Computer Science from ETH Zurich, where he worked on wireless sensor networks and data stream processing using FPGAs. He built a component library and a compositional compiler that translates continuous SQL queries into synthesizable HDL code.