Gerald Tesauro

contact information

Principal Research Staff Member
Thomas J. Watson Research Center, Yorktown Heights, NY USA
+1-914-945-3000

Professional Associations

Association for the Advancement of Artificial Intelligence (AAAI)  |  IEEE Member


2017

Optimal Sequential Drilling for Hydrocarbon Field Development Planning
Ruben Rodriguez Torrado, Jesus Rios, and Gerald Tesauro
Proceedings of AAAI, pp. 4734-4739, 2017

Multiresolution Recurrent Neural Networks: An Application to Dialogue Response Generation
Iulian Vlad Serban, Tim Klinger, Gerald Tesauro, Kartik Talamadupula, Bowen Zhou, Yoshua Bengio and Aaron Courville
Proceedings of AAAI, pp. 3288-3294, 2017


2016

Selecting Near-Optimal Learners via Incremental Data Allocation
Ashish Sabharwal, Horst Samulowitz, and Gerald Tesauro
Proceedings of AAAI-2016


2015

Budgeted Prediction with Expert Advice
Kareem Amin, Satyen Kale, Gerald Tesauro, and Deepak S. Turaga
Proceedings of AAAI-2015, pp. 2490--2496


2013

Analysis of Watson's strategies for playing Jeopardy!
G. Tesauro, D. C. Gondek, J. Lenchner, J. Fan and J. M. Prager
Journal of Artificial Intelligence Research 47, 205--251, AI Access Foundation, 2013


2012

Simulation, learning, and optimization techniques in Watson's game strategies
G. Tesauro, DC Gondek, J. Lenchner, J. Fan, JM Prager
IBM Journal of Research and Development 56(3.4), 16--1, IBM, 2012

Applying a framework for healthcare incentives simulation
J. P. Bigus, C.-H. Chen-Ritzo, K. Hermiz, G. Tesauro, R. Sorrentino
WSC'12 Proceedings of the Winter Simulation Conference, pp. 80:1-80:12, 2012

Bayesian inference in Monte-Carlo tree search
Gerald Tesauro, VT Rajan, Richard Segal
arXiv preprint arXiv:1203.3519, 2012


2010

Bayesian Inference in Monte-Carlo Tree Search
Gerald Tesauro, V. T. Rajan, and Richard Segal
Proceedings of UAI, AUAI Press, 2010


2009

Monte-Carlo simulation balancing
David Silver, Gerald Tesauro
Proceedings of the 26th Annual International Conference on Machine Learning, pp. 945--952, ACM, 2009


2008

Autonomic multi-agent management of power and performance in data centers
R. Das, J.O. Kephart, C. Lefurgy, G. Tesauro, D.W. Levine, H. Chan
Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems: industrial track, pp. 107--114, International Foundation for Autonomous Agents and Multiagent Systems, 2008


2007

Estimating end-to-end performance by collaborative prediction with active sampling
Irina Rish, Gerald Tesauro
10th IFIP/IEEE International Symposium on Integrated Network Management (IM'07), pp. 294--303, 2007

Metric learning for kernel regression
K Q Weinberger, G Tesauro
Eleventh International Conference on Artificial Intelligence and Statistics, pp. 608--615, 2007

On the use of hybrid reinforcement learning for autonomic resource allocation
G Tesauro, N K Jong, R Das, M N Bennani
Cluster Computing 10(3), 287--299, Springer, 2007

Coordinating multiple autonomic managers to achieve specified power-performance tradeoffs
Jeffrey O. Kephart, Hoi Chan, Rajarshi Das, David W. Levine, Gerald Tesauro, Freeman L. Rawson III, Charles Lefurgy
Fourth International Conference on Autonomic Computing, pp. 24-33, IEEE Computer Society, 2007

Reinforcement learning in autonomic computing: A manifesto and case studies
G Tesauro
IEEE Internet Computing, 22--30, IEEE Computer Society, 2007

Managing Power Consumption and Performance of Computing Systems Using Reinforcement Learning
Gerald J Tesauro, Rajarshi Das, Hoi Y Chan, Jeffrey O. Kephart, David W. Levine, Freeman L Rawson III, Charles Lefurgy
Advances in Neural Information Processing Systems 20, MIT Press, 2007


2006

Online performance management using hybrid reinforcement learning
G Tesauro, R Das, N K Jong
First Workshop on Tackling Computer Systems Problems with Machine Learning Techniques (SysML’06), Citeseer, 2006

Improvement of systems management policies using hybrid reinforcement learning
G Tesauro, N Jong, R Das, M Bennani
Machine Learning: ECML 2006, 783--791, Springer

A hybrid reinforcement learning approach to autonomic resource allocation
G Tesauro, N K Jong, R Das, M N Bennani
2006 IEEE International Conference on Autonomic Computing, pp. 65--73


2005

Online resource allocation using decompositional reinforcement learning
G Tesauro, others
Proceedings of the National Conference on Artificial Intelligence, p. 886, 2005

New approaches to optimization and utility elicitation in autonomic computing
Relu Patrascu, Craig Boutilier, Rajarshi Das, Jeffrey O. Kephart, Gerald Tesauro, and William E. Walsh
Proceedings of the Twentieth National Conference on Artificial Intelligence (AAAI-05), pp. 140-145, 2005

Utility-function-driven resource allocation in autonomic systems
Gerald Tesauro, Rajarshi Das, William E Walsh, Jeffrey O Kephart
Proceedings of the Second International Conference on Autonomic Computing, pp. 342-343, IEEE Computer Society, 2005


2004

Extending Q-learning to general adaptive multi-agent systems
G Tesauro
Advances in Neural Information Processing Systems 16, Citeseer, 2004

A multi-agent systems approach to autonomic computing
Gerald Tesauro, David M Chess, William E Walsh, Rajarshi Das, Alla Segal, Ian Whalley, Jeffrey O Kephart, and Steve R White
Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems-Volume 1, pp. 464--471, IEEE Computer Society, 2004

Utility functions in autonomic systems
William E Walsh, Gerald Tesauro, Jeffrey O Kephart and Rajarshi Das
Proceedings of the First International Conference on Autonomic Computing, pp. 70--77, IEEE Computer Society, 2004


2003

A strategic decision model for multi-attribute bilateral negotiation with alternating
Cuihong Li, Gerald Tesauro
Proceedings of the 4th ACM conference on Electronic commerce, pp. 208--209, ACM, 2003

Cooperative negotiation in autonomic systems using incremental utility elicitation
Craig Boutilier, Rajarshi Das, Jeffrey O Kephart, Gerald Tesauro, William E Walsh
Nineteenth Conference on Uncertainty in Artificial Intelligence, pp. 89--97, Morgan Kaufmann, 2003


2002


Analyzing complex strategic interactions in multi-agent systems
William E Walsh, Rajarshi Das, Gerald Tesauro, Jeffrey O Kephart
Proceedings of the Fourth Workshop on Game-Theoretic and Decision-Theoretic Agents, pp. 109--118, 2002

Strategic sequential bidding in auctions using dynamic programming
Gerald Tesauro, Jonathan L Bredin
Proceedings of the first international joint conference on Autonomous agents and multiagent systems: part 2, pp. 591--598, ACM, 2002

Pricing in agent economies using multi-agent Q-learning
Gerald Tesauro, Jeffrey O Kephart
Autonomous Agents and Multi-Agent Systems 5(3), 289--304, Springer, 2002

Programming backgammon using self-teaching neural nets
G Tesauro
Artificial Intelligence 134(1-2), 181--199, Elsevier, 2002


2001

Comparison training of chess evaluation functions
Gerald Tesauro
Machines that Learn to Play Games, Nova Science Publishers, Inc., 2001


Agent-human interactions in the continuous double auction
Rajarshi Das, James E Hanson, Jeffrey O Kephart, Gerald Tesauro
Proceedings of the 17th International Joint Conference on Artificial Intelligence, pp. 1169--1176, Morgan Kaufmann Publishers Inc., 2001

High-performance bidding agents for the continuous double auction
Gerald Tesauro, Rajarshi Das
Proceedings of the 3rd ACM conference on Electronic Commerce, pp. 206--209, ACM, 2001


2000

Foresight-based pricing algorithms in agent economies
Gerald J Tesauro, Jeffrey O Kephart
Decision Support Systems 28(1-2), 49--60, Elsevier, 2000

Pseudo-convergent Q-learning by competitive pricebots
Jeffrey O Kephart, Gerald J Tesauro
Proceedings of the 17th International Conference on Machine Learning, pp. 463--470, Morgan Kaufmann Publishers Inc., 2000

Multi-agent Q-learning and regression trees for automated pricing decisions
M Sridharan, G Tesauro
Seventeenth International Conference on Machine Learning (ICML), pp. 447--448, 2000



1999

Strategic pricebot dynamics
A.R. Greenwald, J.O. Kephart, G.J. Tesauro
Proceedings of the 1st ACM conference on Electronic commerce, pp. 58--67, ACM, 1999


1998


Foresight-based pricing algorithms in an economy of software agents
G.J. Tesauro, J.O. Kephart
Proceedings of the first international conference on Information and computation economies, pp. 37--44, ACM, 1998


1997

On-line policy improvement using Monte-Carlo search
G Tesauro, G R Galperin
Advances in Neural Information Processing Systems, 1068--1074, Morgan Kaufmann Publishers, 1997


1996

Neural networks for computer virus recognition
G.J. Tesauro, J.O. Kephart, G.B. Sorkin
IEEE Expert 11(4), 5--6, IEEE, 1996


1995

Biologically inspired defenses against computer viruses
Jeffrey O Kephart, Gregory B Sorkin, William C Arnold, David M Chess, Gerald J Tesauro, Steve R White
IJCAI (1), pp. 985--996, 1995

Advances in neural information processing systems 7
G Tesauro, D S Touretzky
The MIT Press, 1995

Temporal difference learning and TD-Gammon
G Tesauro
Communications of the ACM 38(3), 58--68, 1995


1994

Advances in neural information processing systems 6
J D Cowan, G Tesauro, J Alspector
Morgan Kaufmann, 1994



1992

Temporal difference learning of backgammon strategy
Gerald Tesauro
Proceedings of the ninth international workshop on Machine learning, pp. 451--457, Morgan Kaufmann Publishers Inc., 1992

How tight are the Vapnik-Chervonenkis bounds?
D Cohn, G Tesauro
Neural Computation 4(2), 249--269, MIT Press, 1992

Practical issues in temporal difference learning
G Tesauro
Machine Learning 8(3), 257--277, Springer, 1992


1991

Can neural networks do better than the Vapnik-Chervonenkis bounds
G Tesauro, D Cohn
Advances in Neural Information Processing Systems 3, 911--917, 1991

Visualizing processes in neural networks
J Wejchert, G Tesauro
IBM Journal of Research and Development 35(1.2), 244--253, IBM, 1991


1990

Advances in neural information processing systems 2
Subutai Ahmad, Gerald Tesauro, Yu He
Advances in neural information ..., Morgan Kaufmann Publishers Inc., 1990


Can neural networks do better than the Vapnik-Chervonenkis bounds?
David Cohn, Gerald Tesauro
Proceedings of the 1990 conference on Advances in neural information processing systems 3, pp. 911--917, Morgan Kaufmann Publishers Inc.



1989

Neurogammon Wins Computer Olympiad
Gerald Tesauro
Neural Computation 1(3), 321-323, MIT Press, 1989

Olfactory processing and associative memory: cellular and modeling studies
A Gelperin, DW Tank, G Tesauro
Neural Models of Plasticity, 133--159, Academic Press, New York, 1989

Connectionist learning of expert preferences by comparison training
Gerald Tesauro
Advances in Neural Information Processing Systems 1, pp. 99--106, Morgan Kaufmann Publishers Inc., 1989

The Hebb rule for synaptic plasticity: Implementations and applications
TJ Sejnowski, GJ Tesauro
Neural Models of Plasticity, 94--103, Academic Press, New York, 1989

A parallel network that learns to play backgammon
G Tesauro, T J Sejnowski
Artificial Intelligence 39(3), 357--390, Elsevier, 1989

Asymptotic convergence of backpropagation
G Tesauro, Y He, S Ahmad
Neural Computation 1(3), 382--391, MIT Press, 1989

Neurogammon: A neural network backgammon learning program
G Tesauro
Heuristic Programming in Artificial Intelligence: The First Computer Olympiad, Chichester, England 1(3), 321--323, MIT Press, 1989


1988

Scaling and Generalization in Neural Networks: A Case Study
S. Ahmad and G. Tesauro
Advances in Neural Information Processing Systems, pp. 160--168, Morgan Kaufmann, 1988

A plausible neural circuit for classical conditioning without synaptic plasticity
G Tesauro
Proceedings of the National Academy of Sciences 85(8), 2830, National Academy of Sciences, 1988

Scaling relationships in back-propagation learning
G Tesauro, B Janssens
Complex Systems 2(1), 39--44, Complex Systems Publications, Inc., 1988


1987

A 'Neural' Network that Learns to Play Backgammon
G. Tesauro and T. J. Sejnowski
Neural Information Processing Systems, pp. 794--803, American Institute of Physics, 1987


1986

Simple neural models of classical conditioning
G Tesauro
Biological Cybernetics 55(2), 187--200, Springer, 1986


Year Unknown

Connectionist learning of expert backgammon evaluations
G Tesauro
Machine Learning, 200--206

Scaling and Generalization in Neural Networks: A Case Study
G Tesauro, S Ahmad
Neural Information Processing Systems, pp. 160--169