PC Server Activities - overview
This project focuses on improving the memory and I/O architecture of PC Servers. Memory Expansion Technology (MXT), now a major project in its own right, had its roots in this project area. The current focus areas are cluster architectures for PC Servers, large multiprocessor and NUMA multiprocessor architectures, and large partitionable systems that combine aspects of clustering and large multiprocessor design. Much of the current work also attempts to leverage Linux to exploit advanced hardware architecture features. The Linux effort is highlighted at IBM developerWorks.
1. System Design and Performance: Intel Clusters
The goal of this project is to continue to develop and exploit rack-mounted clusters. The current focus is on the design of rack-optimized dense servers with unique cost, performance, and manageability advantages over our competition. MXT is being leveraged in the design of dense 1U server packages optimized for this market, with a 40 to 70 percent cost/performance advantage. We are also leveraging this advantage through Linux-based web appliances and near-appliance configurations.
2. Scalable Server Architecture
Work in this area consists of three projects. The first project explores high-throughput coherence controllers and efficient hardware support for user-level message passing in scalable commercial servers. This research is being done jointly with our product lab, and its results have been used in defining the next-generation NUMAQ servers. The architectural concepts developed under this project have also been used to define the Canopus multiprocessor server design in Poughkeepsie. The second project explores high-speed interconnection networks for scalable shared-memory servers. Research in this area has resulted in the design of the Federation switch, which is targeted for use in server offerings from NUMAQ and from the IBM Personal Systems division. The third project was responsible for developing the MemorIES on-line cache emulation tool for evaluating cache memory systems under realistic application environments. This tool is now being extended to connect to a NUMAQ system.
3. Advanced Memory Systems
The goal of this project is to exploit in-memory databases using hard (fault-tolerant and non-volatile) main memories, and to develop the supporting hardware/software architectures. Raw processor performance is increasing at 60 percent per year, while memory and disk subsystem latencies are improving at barely 7 percent per year. As a result, memory and disk subsystem performance is increasingly becoming the throughput bottleneck for large systems. Our goal is to improve overall commercial system performance by (1) replacing part of the disk system with fast, hard memory and (2) architecting memory subsystems for optimum performance in servers using large amounts of memory. In these environments, direct sharing of memory resources across multiple OS images is desirable. It is also critical that the caching hierarchy support the appearance of low-latency, high-bandwidth memory.
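To make the 60 percent versus 7 percent comparison concrete, the short Python sketch below compounds both rates and reports how quickly the gap between processor and memory/disk performance widens. It assumes that both rates stay constant and compound annually, which the text does not state; the function names relative_gap and gap_doubling_time are illustrative only.

```python
import math

# Annual improvement rates quoted in the text above.
PROCESSOR_RATE = 0.60  # processor performance: +60 percent per year
MEMORY_RATE = 0.07     # memory/disk latency: +7 percent per year


def relative_gap(years: float) -> float:
    """Factor by which processor performance outgrows memory/disk after `years`.

    Assumption: both rates are constant and compound once per year.
    """
    return ((1 + PROCESSOR_RATE) / (1 + MEMORY_RATE)) ** years


def gap_doubling_time() -> float:
    """Years needed for the gap to double under the same assumption."""
    return math.log(2) / math.log((1 + PROCESSOR_RATE) / (1 + MEMORY_RATE))


if __name__ == "__main__":
    for n in (1, 2, 5, 10):
        print(f"after {n:2d} years the gap has grown {relative_gap(n):.1f}x")
    print(f"the gap doubles about every {gap_doubling_time():.1f} years")
```

Under these assumptions the gap widens by roughly 50 percent each year, so it doubles in under two years. That compounding is why the memory and disk subsystems, rather than the processors, come to limit throughput.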
Related IBM Products
Netfinity SP Switch.
Publications on Shared Memory Multiprocessors
"High Throughput Coherence Control and Hardware Messaging in Everest," A. K. Nanda, A-T. Nguyen, M. Michael, and D. Joseph, to appear in the IBM Journal of Research and Development, June 2001.
"MemorIES: A Programmable, Real Time, Hardware Emulation Tool for Multiprocessor Server Design," A.K. Nanda, K. Mak, K. Sugavanam, R. Sahoo, V. Soundararajan, and T. Basil Smith, Ninth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS-IX, Nov. 2000.
"Using Switch Directories to Speed up Cache-to-Cache Transfers in CC-NUMA Multiprocessors," R. Iyer, L. N. Bhuyan, and A.K. Nanda, Proceedings of the 12th International Parallel and Distributed Processing Symposium (IPDPS2000), May 2000.
"High Throughput Coherence Controllers," A. K. Nanda, A-T. Nguyen, M. Michael, and D. Joseph, Proceedings of the 6th International Symposium on High Performance Computer Architecture, HPCA-6, Jan. 2000.
"Coherence Controller Architectures for Scalable Shared Memory Multiprocessors," M. Michael, A.K. Nanda and B.H. Lim, IEEE Transactions on Computers, Feb. 1999.
"Design and Performance of Directory Caches for Scalable Shared Memory Multiprocessors," M. Michael and A.K. Nanda, Proceedings of the 5th International Symposium on High Performance Computer Architecture, HPCA-5, Jan. 1999.
"Measurement, Analysis and Performance Improvement of Apache Web Server," Y. Hu, Q. Yang and A.K. Nanda, Proceedings of the 18th IPCCC, Feb. 1999.
"The Design of COMPASS: An Execution Driven Simulator for Commercial Applications Running on Shared Memory Multiprocessors," A.K. Nanda, Y. Hu, M. Ohara, M. Giampapa, C. Benveniste, and M. Michael, Proceedings of International Parallel Processing Symposium, April 1998.
"Coherence Controller Architectures for SMP-Based CC-NUMA Multiprocessors," M. Michael, A.K. Nanda, B-H. Lim, and M. Scott, Proceedings of the 24th International Symposium on Computer Architecture, June 1997.
Publications on Interconnection Networks
"Adaptive Routing on the new Switch Chip for IBM SP Systems," B. Abali, C. B. Stunkel, J. Herring, M. Banikazemi, D. K. Panda, C. Aykanat, and Y. Aydogan, to appear in Journal of Parallel and Distributed Computing.
"Implementing Multidestination Worms in Switch-based Parallel Systems: Architectural Alternatives and their Impact," R. Sivaram, C. B. Stunkel, and D. K. Panda, IEEE Transactions on Parallel and Distributed Systems, vol. 11, no. 8, pp. 794-812, Aug. 2000.
"Efficient Broadcast and Multicast on Multistage Interconnection Networks using Multiport Encoding," R. Sivaram, D. K. Panda, and C. B. Stunkel, IEEE Transactions on Parallel and Distributed Systems, vol. 9, no. 10, pp. 1004-1028, Oct. 1998.
"IBM RS/6000 SP Interconnection Network Topologies for Large Systems," H. Sethu, C. B. Stunkel, and R. F. Stucke, in Proc. International Conference on Parallel Processing (Minneapolis, MN), pp. 620-627, Aug. 1998.
"HIPIQS: A High-performance Switch Architecture using Input Queuing," R. Sivaram, C. B. Stunkel, and D. K. Panda, Proc. 12th International Parallel Processing Symposium (Orlando, FL), pp. 134-143, March-April 1998.
"Implementing Multidestination Worms in Switch-Based Parallel Systems: Architectural Alternatives and their Impact," C. B. Stunkel, R. Sivaram, and D. K. Panda, in Proc. 24th Annual International Symposium on Computer Architecture (Denver, CO), pp. 50-61, June 1997.
Technical Reports
"MemorIES: A Real Time, On-Line Emulation Tool for Evaluating Large Caches and SMP Cache Protocols," A. K. Nanda, K. Mak, and K. Sugavanam, IBM T. J. Watson Research Center Technical Report, July 1999.
"Memory Reference Characteristics of TPCD/DB2 on Shared Memory Multiprocessors," M. Ohara and A. K. Nanda, IBM T. J. Watson Research Center Technical Report, Nov. 1997.