Yasushi Negishi (根岸 康)

contact information

Advisory Researcher
IBM Research - Tokyo
  +81-3-3808-5295

Profile

Yasushi Negishi is a Research Staff Member at IBM Research - Tokyo. He belongs to the Deep Computing & Analytics group in Systems & Software. He joined IBM Research - Tokyo in 1989 after obtaining his M.S. degree in information science from the Tokyo Institute of Technology. He has more than 25 years of research experience. In 1989-1990, he researched system software, such as two-level threading systems, on TOP-1, the world's second-earliest symmetric multiprocessing machine. In 1990-1996, he improved the performance of the NFS server by about 15% by using functions of Ethernet adapters to avoid copying the target data. In 1995-1996, he developed a communication protocol and system for video-on-demand software based on UDP/IP that achieved several times better communication stability than TCP/IP. In 1995-1999, he developed a communication protocol and system for PDAs (personal digital assistants) based on a synchronization mechanism. His protocol and system were used in a product providing the Lotus Notes database on PDAs. In 2000-2004, he worked on the design and development of a high-performance processor for gaming, named CELL, in collaboration with engineers from Sony and Toshiba. From 2008, he optimized HPC applications, such as FFT and CFD, on Blue Gene and other POWER processor machines. He also developed programming tools for optimization. Much of his work has been presented at major refereed conferences and in journals including INFOCOM, SC, Euro-Par, IPDPS, and TPDS. He is an ACM Senior Member, an IPSJ Senior Member, and a member of IEEE. He has obtained more than thirty patents in relation to his work.

Journals

1. "A Systematic Approach towards Automated Performance Analysis and Tuning," Guojing Cong, I-Hsin Chung, Huifang Wen, David Klepacki, Hiroki Murata, Yasushi Negishi, Takao Moriyama, Transactions on Parallel and Distributed Systems, IEEE, 2012.

2. "Automating Optimization Process of HPC Applications," (in Japanese) Hiroki Murata, Yasushi Negishi, Takao Moriyama, Guojing Cong, I-Hsin Chung, Huifang Wen, and David Klepacki, IPSJ Journal (ACS) Vol. 34, 2011.

3. "Low Power, Massively Parallel, Energy Efficient Supercomputers," The Blue Gene Team, Green Computing: Large-Scale Energy Efficiency, the Green 500 Organization, 2010

4. "Overview of the IBM Blue Gene/P project," IBM Blue Gene team, IBM Journal of Research and Development, Vol. 52 No 1/2, January/March, 2008.

5. "CPU Resource Reservation System for CPU Using Simultaneous Multi Thread," (in Japanese) Hiroshi Inoue, Takao Moriyama, Yasushi Negishi, Moriyoshi Ohara, IPSJ Journal (ACS), Vol. 45, No. SIG03, Mar. 2004

6. "Performance Analysis and Improvement of Network Attached Storage Built from Open Software," (in Japanese) Kazuya Tago, Yasushi Negishi, Kenichi Okuyama, Hiroki Murata, Takuya Matsunaga, IPSJ Journal, Vol. 44, No. 2, Feb. 2003

7. "Tuplink: A Meta-middleware System for Micro-clients," Yasushi Negishi, Kiyokuni Kawachiya, Hiroki Murata, Kazuya Tago, IPSJ Journal, Vol. 41, No. 10, Oct. 2000.

8. "A Proposal for an Operating System Designed for Cluster Servers," Kazuya Tago, Yasushi Negishi, Mikako Hoshiba, Journal of Information Processing, 15 May 1992.

International Conferences

1. "High Resolution Medical Image Segmentation Using Data-Swapping Method," Haruki Imai, Samuel Matzek, Tung D. Le, Yasushi Negishi, Kiyokuni Kawachiya, International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI 2019)

2. "Automatic GPU memory management for large neural models in TensorFlow," Tung D. Le, Haruki Imai, Yasushi Negishi, Kiyokuni Kawachiya, Proceedings of the 2019 ACM SIGPLAN International Symposium on Memory Management (ISMM 2019)

3. "Profiling based out-of-core hybrid method for large neural networks," Yuki Ito, Haruki Imai, Tung Le Duc, Yasushi Negishi, Kiyokuni Kawachiya, Ryo Matsumiya, Toshio Endo, Proceedings of the 24th Symposium on Principles and Practice of Parallel Programming (PPoPP’19) (Poster)

4. "Large Model Support for Deep Learning in Caffe and Chainer," Minsik Cho, Tung D. Le, Ulrich A Finkler, Haruki Imai, Yasushi Negishi, Taro Sekiyama, Saritha Vinod, Vladimir Zolotov, Kiyokuni Kawachiya, David S. Kung, Hillery C. Hunter, SysML conference, 2018

5. "TFLMS: Large Model Support in TensorFlow by Graph Rewriting," Tung D. Le, Haruki Imai, Yasushi Negishi, Kiyokuni Kawachiya, https://arxiv.org/abs/1807.02037, arXiv, 2018

6. "Involving CPUs into Multi-GPU deep learning," Tung D. Le, Taro Sekiyama, Yasushi Negishi, Haruki Imai, Kiyokuni Kawachiya, 2018 ACM/SPEC International Conference on Performance Engineering, pp. 56--67

7. "Accelerating Multi-GPU Deep Learning by Collecting and Accumulating Gradients on CPUs," Tung D. Le, Taro Sekiyama, Yasushi Negishi, Haruki Imai, Kiyokuni Kawachiya, SIG Technical Reports 2017-HPC-159(8), pp. 1--8

8. "An Event-Processing System Alerting Analytically to Networked Vehicles," Haruki Imai, Kumiko Maeda, Tatsuhiro Chiba, Yasushi Negishi, Akira Koseki, Tohru Aihara, Hideaki Komatsu, IEEE Conference on Intelligent Transportation Systems, October, 2013.

9. "Smarter Mobility Integrated System: A Real Time Processing Framework for Sensor Data Aggregation, Analysis and Query," Tatsuhiro Chiba, Haruki Imai, Kumiko Maeda, Akira Koseki, Yasushi Negishi, Hideaki Komatsu, ITS World Congress, October, 2013.

10. "A Static Analysis Tool using a three-step approach for Data Races in HPC Programs," Yasushi Negishi, Hiroki Murata, Guojing Cong, Hui-Fang Wen, I-Hsin Chung, PADTAD'2012, July 17, 2012.

11. "Tool-assisted Optimization of Shared-memory Accesses in UPC Applications," Guojing Cong, Hui-Fang Wen, Hiroki Murata, Yasushi Negishi, HPCC-ICESS 2012, June 25-27, 2012.

12. "An Efficient Framework for Multi-dimensional Tuning of High Performance Computing Applications," Guojing Cong, Huifang Wen, I-hsin Chung, David Klepacki, Hiroki Murata, Yasushi Negishi, International Parallel and Distributed Processing Symposium, May 21-25, 2012.

13. "Tool-assisted performance measurement and tuning of PGAS applications," Guojing Cong, I-Hsin Chung, Huifang Wen, Hiroki Murata, Yasushi Negishi, PGAS2011, Oct. 10-14, 2011.

14. "Overlapping Methods of All-to-All Communication and FFT Algorithms for Torus-Connected Massively Parallel Supercomputers," Jun Doi, Yasushi Negishi, Supercomputing (SC10), Nov. 13-19, 2010

15. "Application tuning through bottleneck-driven refactoring," Guojing Cong, I-Hsin Chung, Huifang Wen, David Klepacki, Hiroki Murata, Yasushi Negishi, Takao Moriyama, 2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and PhD Forum (IPDPSW), April 19-23, 2010

16. "A Holistic Approach towards Automated Performance Analysis and Tuning," Guojing Cong, I-Hsin Chung, Hui-Fang Wen, David J. Klepacki, Hiroki Murata, Yasushi Negishi, Takao Moriyama, Euro-Par 2009 Parallel Processing, 15th International Euro-Par Conference, Aug. 25-28, 2009

17. "A Proposal of Operation History Management System for Source-to-Source Optimization of HPC Programs," Y. Negishi, H. Murata, T. Moriyama, PADTAD'2009, July 19-20, 2009

18. "Automatic Parameter Search for Scientific Programs Using Performance Prediction," H. Imai, Y. Negishi, The Second International Workshop on Automatic Performance Tuning, Sep. 2007.

19. "Efficient error correction code configurations for quasi-nonvolatile data retention of DRAMs," Yasunao Katayama, Sumio Morioka, Yasushi Negishi, IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems (DFT'2000), Oct. 25-27, 2000.

20. "Tuplink: A Communication System for PDAs and Micro-Devices," Y. Negishi, H. Murata, K. Okuyama, K. Kamata, K. Tago, 17th ACM Symposium on Operating Systems Principles (Poster Session), 1999.

21. "A Portable Communication System for Video-on-Demand Applications using the Existing Infrastructure," Yasushi Negishi, Kiyokuni Kawachiya, Kazuya Tago, IEEE INFOCOM '96, March 24-28, 1996.

Domestic Conferences and Others

1. "NVLink2.0のCPU・GPU間コヒーレントメモリアクセス機能(ATS)の調査及び深層学習への適用に関する考察," (in Japanese), Yasushi Negishi, Tung Le Duc, Haruki Imai, Jun Doi, Kiyokuni Kawachiya, 第169回HPC研究発表会, SIG HPC of IPSJ, May 3, 2019.

2. "GPUメモリ管理の実行時最適化による大規模深層学習の高速化," (in Japanese), Yuki Itoh, Haruki Imai, Tung Le Duc, Yasushi Negishi, Kiyokuni Kawachiya, Ryo Matsumiya, Nobuo Endo, 第165回HPC研究発表会, SIG HPC of IPSJ, July 23, 2018.

3. "CUDA Unified Memoryのディープラーニングへの適用についての考察," (in Japanese), Yasushi Negishi, Haruki Imai, Jun Doi, Kiyokuni Kawachiya, 第164回HPC研究発表会, SIG HPC of IPSJ, April 30, 2018.

4. "大規模ニューラルネットワークモデルのOut-of-Core学習の性能評価," (in Japanese), Haruki Imai, Tung Le Duc, Yasushi Negishi, Taro Sekiyama, Kiyokuni Kawachiya, 第162回HPC研究発表会, SIG HPC of IPSJ, December 18-19, 2017.

5. "Unified Memory を用いた大規模ディープラーニングモデルの性能に関する考察," (in Japanese), Yasushi Negishi, Haruki Imai, Jun Doi, Kiyokuni Kawachiya, 日本ソフトウェア科学会第34回大会, September 19-21, 2017.

6. "Linux on z SystemsにおけるMongoDBの性能評価," (in Japanese), Yasushi Negishi, Moriyoshi Ohara, Kiyokuni Kawachiya, 日本ソフトウェア科学会第32回大会, September 9-11, 2015.

7. "HPCS Toolkitのソースコード書換えによるファイルI/O性能最適化," (in Japanese), Yasushi Negishi, Hiroki Murata, Guojing Cong, I-Hsin Chung, Hui-Fang Wen, SIG HPC of IPSJ, May 26-27, 2014.

8. "自動ソースコード変換によるPGASプログラムの最適化," (in Japanese) Y. Negishi, Hiroki Murata, Takao Moriyama, Guojing Cong, I-Hsin Chung, Hui-Fang Wen, David Klepacki, 2011年ハイパフォーマンスコンピューティングと計算科学シンポジウム, SIG HPCS of IPSJ, January 2011.

9. "携帯端末向けの通信機構とそのアプリケーション," (in Japanese) Y. Negishi, Hiroki Murata, Ken'ichi Okuyama, Kunio Kamata, Kiyokuni Kawachiya, Kazuya Tago, The JSSST SIGOOC 1998 Workshop on Systems for Programming and Applications, March 25-27, 1998

10. "ファイルシステムインタフェースを利用したデータアクセスシステムの試作," (in Japanese) Yoshiaki Mima, Kazuya Kosaka, Y. Negishi, The 52nd Annual meeting of IPSJ, March 7, 1996

11. "OS/2 Warpのリアルタイム機能の拡張," (in Japanese) Y. Negishi, Masahiro Tachizawa, Ryoji Honda, Shigeki Ishikawa, The 52nd Annual meeting of IPSJ, March 8, 1996

12. "VOD向けのネットワークAPIの提案," (in Japanese) Kiyokuni Kawachiya, Y. Negishi, Kazuya Tago, The 52nd Annual meeting of IPSJ, March 8, 1996

13. "The Relationship between a Communication Interface and its Overhead," Y. Negishi, The 48th Annual meeting of IPSJ, March 23, 1994

14. "LAN通信機構の問題点とその改善," (in Japanese) K. Tago, Y. Negishi, Computer System Symposium, October 21, 1993

15. "高速通信のためのアーキテクチャサポート," (in Japanese) K. Tago, Y. Negishi, SIG Architecture of IPSJ, October 21, 1993

16. "高速な通信媒体のための通信機構," (in Japanese) Y. Negishi, K. Tago, M. Hoshiba, Computer System Symposium, 27 Oct 1992.

17. Paper Introduction of "Scheduling and IPC Mechanisms for Continuous Media," Y. Negishi, Journal of Information Processing Society of Japan VOL.33 NO.6, 15 Jun 1992.

18. "An Operating System Design for Client-Server Configuration," K. Tago, Y. Negishi, M. Hoshiba, Computer System Symposium, March 3, 1991

19. "User-Level Scheduler and Preemption Mechanism in SSCORE, an Operating System Kernel for Parallel Processing," T. Moriyama, S. Uzuhara, T. Matsumoto, Y. Negishi, The 7th Annual convention of JSSST, 6th October, 1990.

20. "A Thread implementation strategy for Tightly-Coupled Multiprocessor Systems," Y. Negishi, The 41st Annual convention of IPSJ, 6th September, 1990.

21. "Classification of Parallel Programs Based on Grain Size Information and its Application to Multiprocessor Resource Management Scheme," T. Matsumoto, S. Uzuhara, T. Moriyama, Y. Negishi, The 7th Annual convention of JSSST, 6th October, 1990.

22. "A Multiprocessor Resource Management Scheme which Considers Program Grain Size," T. Moriyama, S. Uzuhara, T. Matsumoto, Y. Negishi, Summer Workshop on Parallel Processing '90, IPSJ, 18th July, 1990.

Blogs

1. "Deep Learning on OpenPOWER: Building Chainer on OpenPOWER Linux Systems," Yasushi Negishi, IBM developerWorks ( https://www.ibm.com/developerworks/community/blogs/fe313521-2e95-46f2-817d-44a4f27eba32/entry/Deep_Learning_on_OpenPOWER_Building_Chainer_on_OpenPOWER_Linux_Systems )
2. "Deep Learning on OpenPOWER: Install IBM-optimized Chainer v4 Easily with pip Command on OpenPOWER Linux Systems," Yasushi Negishi, Linux on Power Developer Portal ( https://developer.ibm.com/linuxonpower/2018/07/31/deep-learning-openpower-install-ibm-optimized-chainer-v4-easily-pip-command-operpower-linux-systems/ )

Lectures

1. "R&D Activities of Global companies," Tokyo Institute of Technology, 2013-2019.
http://www.ocw.titech.ac.jp/index.php?module=General&action=T0300&GakubuCD=7&KamokuCD=110600&KougiCD=201707420&Nendo=2017&LeftTab=graduate&vid=03&lang=EN

2. "Software Architecture for Enterprise Systems," Hosei University, 2015-2019.
https://syllabus.hosei.ac.jp/web/preview.php?no_id=1623000&nendo=2016&gakubu_id=%E6%83%85%E5%A0%B1%E7%A7%91%E5%AD%A6%E7%A0%94%E7%A9%B6%E7%A7%91&gakubueng=EP&radd=500

Guest Editor

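1. 情報処理 (IPSJ Magazine), November 2018 issue, special feature "ディープラーニング活用事例と使いこなしの勘所" (Deep Learning Use Cases and Key Points for Making the Most of It)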

Patents

1. Patent Name: ESTIMATING PERFORMANCE OF GPU APPLICATION FOR DIFFERENT GPU-LINK PERFORMANCE RATIO
Date: 21 Oct 2019
Number: 10453167
Issuing country: United States

2. Patent Name: REAL-TIME RESOURCE USAGE REDUCTION IN ARTIFICIAL NEURAL NETWORKS
Date: 22 Apr 2019
Number: 10268951
Issuing country: United States

3. Patent Name: HIGH-PERFORMANCE WIRELESS CROSSBAR SWITCH
Date: 05 Mar 2019
Number: 308604
Issuing country: India

4. Patent Name: A NETWORK SCHEDULING METHOD OF ALL-TO-ALL COMMUNICATION FOR PIPELINED PARALLEL FAST FOURIER TRANSFORM PROCESSING
Date: 20 Feb 2019
Number: 112010003810
Issuing country: Germany

5. Patent Name: SURFACE-BASED OBJECT IDENTIFICATION
Date: 31 Dec 2018
Number: 10169874
Issuing country: United States

6. Patent Name: DATA DE-DUPLICATION SYSTEM USING GENOME FORMATS CONVERSION
Date: 24 Dec 2018
Number: 10162934
Issuing country: United States

7. Patent Name: HIGH-PERFORMANCE WIRELESS CROSSBAR SWITCH
Date: 12 Mar 2018
Number: 2780258
Issuing country: Canada

8. Patent Name: APPLYING MULTIPLE REWRITING WITHOUT COLLISION FOR SEMI-AUTOMATIC PROGRAM REWRITING SYSTEM
Date: 09 Oct 2017
Number: 9785422
Issuing country: United States

9. Patent Name: PPH - A NETWORK SCHEDULING METHOD OF ALL-TO-ALL COMMUNICATION FOR PIPELINED PARALLEL FAST FOURIER TRANSFORM PROCESSING
Date: 14 Sep 2016
Number: 2487684
Issuing country: United Kingdom

10. Patent Name: SCHEDULING COMPUTATION PROCESSES INCLUDING ALL-TO-ALL COMMUNICATIONS (A2A) FOR PIPELINED PARALLEL PROCESSING AMONG PLURALITY OF PROCESSOR NODES CONSTITUTING NETWORK OF N-DIMENSIONAL SPACE
Date: 02 Feb 2016
Number: 9251118
Issuing country: United States

11. Patent Name: ASYNCHRONOUS CHECKPOINT ACQUISITION AND RECOVERY FROM THE CHECKPOINT IN PARALLEL COMPUTER CALCULATION IN ITERATION METHOD
Date: 12 June 2015
Number: 5759203
Issuing country: Japan

12. Patent Name: HIGH-PERFORMANCE WIRELESS CROSSBAR SWITCH
Date: 29 Apr 2015
Number: ZL201080053262.4
Issuing country: China

13. Patent Name: A NETWORK SCHEDULING METHOD OF ALL-TO-ALL COMMUNICATION FOR PIPELINED PARALLEL FAST FOURIER TRANSFORM PROCESSING
Date: 15 Apr 2015
Number: ZL201080050810.8
Issuing country: China

14. Patent Name: HIGH-PERFORMANCE WIRELESS CROSSBAR SWITCH
Date: 11 Apr 2015
Number: I481215
Issuing country: Taiwan

15. Patent Name: HIGH-PERFORMANCE WIRELESS CROSSBAR SWITCH
Date: 18 Dec 2014
Number: 112010004607
Issuing country: Germany

16. Patent Name: METHODOLOGY FOR FAST DETECTION OF FALSE SHARING IN THREADED SCIENTIFIC CODES
Date: 24 Nov 2014
Number: 8898648
Issuing country: United States

17. Patent Name: A NETWORK SCHEDULING METHOD OF ALL-TO-ALL COMMUNICATION FOR PIPELINED PARALLEL FAST FOURIER TRANSFORM PROCESSING
Date: 05 Dec 2013
Number: 5425993
Issuing country: Japan

18. Patent Name: HIGH-PERFORMANCE WIRELESS CROSSBAR SWITCH
Date: 15 Nov 2013
Number: 1332095
Issuing country: Korea

19. Patent Name: FAST - HIGH-PERFORMANCE WIRELESS CROSSBAR SWITCH
Date: 10 Sep 2013
Number: 2488502
Issuing country: United Kingdom

20. Patent Name: A NETWORK SCHEDULING METHOD OF ALL-TO-ALL COMMUNICATION FOR PIPELINED PARALLEL FAST FOURIER TRANSFORM PROCESSING
Date: 13 Dec 2012
Number: 5153945
Issuing country: Japan

21. Patent Name: HIGH-PERFORMANCE WIRELESS CROSSBAR SWITCH
Date: 20 July 2012
Number: 5044046
Issuing country: Japan

22. Patent Name: STORAGE PERFORMANCE IMPROVEMENT SYSTEM USING FRAME ADDRESS AND DELETE COMMAND
Date: 13 Feb 2009
Number: 4257834
Issuing country: Japan

23. Patent Name: NETWORK SYSTEM, SERVER, DATA PROCESSING METHOD AND PROGRAM
Date: 15 July 2008
Number: 7401129
Issuing country: United States

24. Patent Name: A NAS SERVER FRONT END SYSTEM USING NVRAM
Date: 06 June 2008
Number: 4131514
Issuing country: Japan

25. Patent Name: CPU RESOURCE RESERVATION SYSTEM FOR CPU USING SIMULTANEOUS MULTI THREADING
Date: 25 May 2007
Number: 3962370
Issuing country: Japan

26. Patent Name: A REPLICA MANAGEMENT SYSTEM
Date: 23 July 2004
Number: 3578385
Issuing country: Japan

27. Patent Name: A REPLICA MANAGEMENT SYSTEM
Date: 16 July 2003
Number: ZL99120897.8
Issuing country: China

28. Patent Name: COMPUTER, DATA SHARING SYSTEM, AND METHOD FOR MAINTAINING REPLICA CONSISTENCY
Date: 27 May 2003
Number: 6571278
Issuing country: United States

29. Patent Name: A REPLICA MANAGEMENT SYSTEM
Date: 05 Mar 2003
Number: NI-165248
Issuing country: Taiwan

30. Patent Name: A REPLICA MANAGEMENT SYSTEM
Date: 02 Mar 2003
Number: 131710
Issuing country: Israel

31. Patent Name: A REPLICA MANAGEMENT SYSTEM
Date: 24 Oct 2002
Number: 0359960
Issuing country: Korea

32. Patent Name: A HIGH SPEED DATA TRANSFER MECHANISM
Date: 20 June 1997
Number: 2664838
Issuing country: Japan

33. Patent Name: APPARATUS AND METHOD FOR PACKET COMMUNICATIONS
Date: 25 July 1995
Number: 5436892
Issuing country: United States