Area of Expertise: Distributed Systems
- Performance Management
- Distributed Databases
- Eventual Consistency
- Gossip Protocols
- RPC IDLs, stub generators, protocols, and runtimes
- Application Servers
- Cloud Native Computing
Languages, Platforms, APIs
Kubernetes, Go, Prometheus, Java, J2EE, SQL, C, Linux/Unix, OpenStack, CORBA, Lisp, XML
Notable Systems Accomplishments
- Key contributor to the design and development of a prototype of a new platform for systems of engagement; the prototype inspired a major product development investment and is now in customer trials as extensions to IBM Operational Decision Manager. I contributed to the overall architecture, with particular focus on making the event processing core robust and scalable. I also contributed to the implementation of that core, to a framework for studying its performance, and to performance studies.
- Led the design and development of a peer-to-peer-based group communication layer inside WebSphere Application Server (called BBSON), which greatly improved the product's success in scale-out deployments by major customer accounts. I identified non-scalable usage in the clients and advised the product team on how to fix it. I realized that the remaining usage was still outside the design envelope of the older group communication subsystem then in use, and that we could easily build a serviceable alternative implementation of the relevant higher-level services using a semi-structured overlay network subsystem that was already in the product. I led the design and implementation of that alternative.
- Led design and development of performance management technology, delivered in WebSphere Application Server. This product can function as a sophisticated distributed system running multiple application modules and servicing multiple classes of traffic with SLOs defined by administrators. The performance is managed by four feedback loops that turn three classes of control knobs (placement, load balancing, and throttling) and update performance model parameters, to cohesively optimize a global objective. I led the overall design and implemented the part that formulates the central optimization problem and converts its result into the forms that are needed downstream.
- Key contributor to design and development of the Bayou distributed database, which pioneered the concept of eventual consistency and was an early use of gossip. I was a core contributor to the overall design and implemented the gossip protocol and the main server code.
- Key contributor to ILU, early open-source software for multi-language RPC and one-way messaging. I was one of two leads on the overall design and implementation.
Professional Service
- Co-chair of industrial track of Middleware 2012.
- Program committee member of several conferences, including WWW, SOCC, Middleware, IC2E, LADIS, ACM TRIOS.
IBM Research, 2000-Present
Kubernetes-Based Control Planes, 2014-Present
The Kubernetes control plane exemplifies a good way to build a distributed system whose priority is to eventually do the right thing (as opposed to doing something quickly). The Kubernetes API machinery is generic and can be used to build such distributed systems. I have been working on using the Kubernetes API machinery to build control planes for traditional IaaS systems. See https://github.com/MikeSpreitzer/kube-examples/tree/add-kos/staging/kos for an example I hope to contribute soon.
Placement, OpenStack, 2013
(OpenStack is an industry consortium producing open-source software that implements a mostly infrastructure-level cloud)
- Led redesign and implementation of IBM Research virtual resource placement technology for robustness and scalability.
- Leading IBM contributions to OpenStack architecture regarding joint placement decision making and software configuration.
Storage and Computation, 2010-2012
- Prototyped upcoming product for systems of engagement.
- Prototyped platform that unifies Google's Pregel and MapReduce.
Health and Performance Management, 2003-2009
- Led progression from early limited research idea to comprehensive functionality in product.
- Designed and developed a service offering of IBM Research's Gryphon pub/sub software.
Xerox PARC, 1989-2000 – selected examples
- Key contributor to the widely cited Bayou project.
- Key contributor to the open-source ILU software for RPC and one-way messaging.
- Developed message delivery system using ambient devices as part of ubiquitous computing project under Mark Weiser.
- Developed cryptography offload server.
- Developed Scheme source for data types as part of inlining project in SchemeXerox.
Education
- Ph.D. in Computer Science, Stanford, 1989
- BS in Engineering & Applied Science, Caltech, 1980
Awards
- Corporate awards for work on performance management technology in WebSphere.
- Corporate award for group communication technology work in WebSphere.
- 2015 ACM SIGOPS Hall of Fame award for 1995 SOSP paper on Bayou.
- 2016 ACM SIGMETRICS Test of Time award for 2005 paper on modeling internet services.
Selected Publications
Managing update conflicts in Bayou, a weakly connected replicated storage system
DB Terry, MM Theimer, K Petersen, AJ Demers, MJ Spreitzer, CH Hauser
ACM SIGOPS Operating Systems Review 29 (5), 172-182, 1995
Flexible update propagation for weakly consistent replication
K Petersen, MJ Spreitzer, DB Terry, MM Theimer, AJ Demers
ACM SIGOPS Operating Systems Review 31 (5), 288-301, 1997
An analytical model for multi-tier internet services and its applications
B Urgaonkar, G Pacifici, P Shenoy, M Spreitzer, A Tantawi
ACM SIGMETRICS Performance Evaluation Review 33 (1), 291-302, 2005
For a more complete listing see my Google Scholar Profile at http://scholar.google.com/citations?user=SsbOWx4AAAAJ