# Dan He

## contact information

Computational Genomics

Thomas J. Watson Research Center, Yorktown Heights, NY USA

+19149452315

## links

### Professional Associations

ACM | IEEE Member | International Society for Computational Biology

**2016**

Novel applications of multitask learning and multiple output regression to multiple genetic trait prediction

He, Dan and Kuhn, David and Parida, Laxmi

*Bioinformatics* *32*(*12*), i37--i43, Oxford Univ Press, 2016

Mint: Mutual information based transductive feature selection for genetic trait prediction

He, Dan and Rish, Irina and Haws, David and Parida, Laxmi

*IEEE/ACM Transactions on Computational Biology and Bioinformatics* *13*(*3*), 578--583, IEEE, 2016

**2015**

Does encoding matter? A novel view on the quantitative genetic trait prediction problem

He, Dan and Parida, Laxmi

*Bioinformatics and Biomedicine (BIBM), 2015 IEEE International Conference on*, *pp. 123--126*

SAME: a sampling-based multi-locus epistasis algorithm for quantitative genetic trait prediction

He, Dan and Parida, Laxmi

*Proceedings of the 6th ACM Conference on Bioinformatics, Computational Biology and Health Informatics*, *pp. 286--295*, 2015

Performance evaluation of different encoding strategies for quantitative genetic trait prediction

Ogundijo, Oyetunji E and He, Dan and Parida, Laxmi

*Computational Advances in Bio and Medical Sciences (ICCABS), 2015 IEEE 5th International Conference on*, *pp. 1--6*

Mined: An efficient mutual information based epistasis detection method to improve quantitative genetic trait prediction

He, Dan and Wang, Zhanyong and Parida, Laxmi

*International Symposium on Bioinformatics Research and Applications*, *pp. 108--124*, 2015

Variable-Selection Emerges on Top in Empirical Comparison of Whole-Genome Complex-Trait Prediction Methods

Haws, David C and Rish, Irina and Teyssedre, Simon and He, Dan and Lozano, Aurelie C and Kambadur, Prabhanjan and Karaman, Zivan and Parida, Laxmi

*PLoS ONE* *10*(*10*), e0138903, Public Library of Science, 2015

Data-driven encoding for quantitative genetic trait prediction

He, Dan and Wang, Zhanyong and Parida, Laxmi

*BMC Bioinformatics* *16*(*Suppl 1*), S10, BioMed Central Ltd, 2015

**2014**

IPED2: Inheritance path based pedigree reconstruction algorithm for complicated pedigrees

He, Dan and Wang, Zhanyong and Parida, Laxmi and Eskin, Eleazar

*Proceedings of the 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics*, *pp. 202--210*, 2014

**2013**

IPEDX: An exact algorithm for pedigree reconstruction using genotype data

He, Dan and Eskin, Eleazar

*Bioinformatics and Biomedicine (BIBM), 2013 IEEE International Conference on*, *pp. 517--520*

Optimized retrieval algorithms for personalized content aggregation

He, Dan and Parker, Douglass S

*Information Reuse and Integration (IRI), 2013 IEEE 14th International Conference on*, *pp. 270--277*

Leveraging multi-SNP reads from sequencing data for haplotype inference

Yang, Wen-Yun and Hormozdiari, Farhad and Wang, Zhanyong and He, Dan and Pasaniuc, Bogdan and Eskin, Eleazar

*Bioinformatics*, btt386, Oxford Univ Press, 2013

Leveraging reads that span multiple single nucleotide polymorphisms for haplotype inference from sequencing data

Yang, Wen-Yun and Hormozdiari, Farhad and Wang, Zhanyong and He, Dan and Pasaniuc, Bogdan and Eskin, Eleazar

*Bioinformatics* *29*(*18*), 2245--2252, Oxford Univ Press, 2013

IBD-Groupon: an efficient method for detecting group-wise identity-by-descent regions simultaneously in multiple individuals based on pairwise IBD relationships

He, Dan

*Bioinformatics* *29*(*13*), i162--i170, Oxford Univ Press, 2013

IPED: inheritance path-based pedigree reconstruction algorithm using genotype data

He, Dan and Wang, Zhanyong and Han, Buhm and Parida, Laxmi and Eskin, Eleazar

*Journal of Computational Biology* *20*(*10*), 780--791, Mary Ann Liebert, Inc., 2013

**2012**

Hap-seqX: Expedite Algorithm for Haplotype Phasing with Imputation using Sequence Data

D. He, E. Eskin

*Gene*, Elsevier, 2012

Modeling semantic influence for biomedical research topics using MeSH hierarchy

D. He

*Bioinformatics and Biomedicine (BIBM), 2012 IEEE International Conference on*, *pp. 1--6*

Hap-seq: An Optimal Algorithm for Haplotype Phasing with Imputation Using Sequencing Data

D. He, B. Han, E. Eskin

*Research in Computational Molecular Biology*, *pp. 64--78*, 2012

**2011**

Mining research topic-related influence between academia and industry

D. He

*Machine Learning and Knowledge Discovery in Databases*, 17--31, Springer, 2011

CLAP: Collaborative pattern mining for distributed information systems

X. Zhu, B. Li, X. Wu, D. He, C. Zhang

*Decision Support Systems*, Elsevier, 2011

How Does Research Evolve? Pattern Mining for Research Meme Cycles

D. He, X. Zhu, D.S. Parker

*Data Mining (ICDM), 2011 IEEE 11th International Conference on*, *pp. 1068--1073*

Genotyping common and rare variation using overlapping pool sequencing

D. He, N. Zaitlen, B. Pasaniuc, E. Eskin, E. Halperin

*BMC Bioinformatics* *12*(*Suppl 6*), S2, BioMed Central Ltd, 2011

Mining approximate repeating patterns from sequence data with gap constraints

D. He, X. Zhu, X. Wu

*Computational Intelligence* *27*(*3*), 336--362, Wiley Online Library, 2011

Using HLA binding prediction algorithms for epitope mapping in HIV vaccine clinical trials

D. He, P. Kunwar, E. Eskin, H. Horton, P. Gilbert, T. Hertz

*Proceedings of the 2nd ACM Conference on Bioinformatics, Computational Biology and Biomedicine*, *pp. 594--601*, 2011

Efficient algorithms for tandem copy number variation reconstruction in repeat-rich regions

D. He, F. Hormozdiari, N. Furlotte, E. Eskin

*Bioinformatics* *27*(*11*), 1513--1520, Oxford Univ Press, 2011

Learning the funding momentum of research projects

D. He, D. Parker

*Advances in Knowledge Discovery and Data Mining*, 532--543, Springer, 2011

An optimal weighted aggregated association test for identification of rare variants involved in common diseases

J.H. Sul, B. Han, D. He, E. Eskin

*Genetics* *188*(*1*), 181--188, Genetics Soc America, 2011

Topical semantics of twitter links

M.J. Welch, U. Schonfeld, D. He, J. Cho

*Proceedings of the fourth ACM international conference on Web search and data mining*, *pp. 327--336*, 2011

**2010**

Rule synthesizing from multiple related databases

D. He, X. Wu, X. Zhu

*Advances in Knowledge Discovery and Data Mining*, 201--213, Springer, 2010

Effective algorithms for fusion gene detection

D. He, E. Eskin

*Algorithms in Bioinformatics*, 312--324, Springer, 2010

Detection and reconstruction of tandemly organized de novo copy number variations

D. He, N. Furlotte, E. Eskin

*BMC Bioinformatics* *11*(*Suppl 11*), S12, BioMed Central Ltd, 2010

Optimal algorithms for haplotype assembly from whole-genome sequence data

D. He, A. Choi, K. Pipatsrisawat, A. Darwiche, E. Eskin

*Bioinformatics* *26*(*12*), i183--i190, Oxford Univ Press, 2010

Topic dynamics: an alternative model of bursts in streams of topics

D. He, D.S. Parker

*Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining*, *pp. 443--452*, 2010

**2009**

Approximate repeating pattern mining with gap requirements

D. He, X. Zhu, X. Wu

*Tools with Artificial Intelligence, 2009. ICTAI'09. 21st International Conference on*, *pp. 17--24*

Error detection and uncertainty modeling for imprecise data

D. He, X. Zhu, X. Wu

*Tools with Artificial Intelligence, 2009. ICTAI'09. 21st International Conference on*, *pp. 792--795*

**2008**

Cleansing noisy data streams

X. Zhu, P. Zhang, X. Wu, D. He, C. Zhang, Y. Shi

*Data Mining, 2008. ICDM'08. Eighth IEEE International Conference on*, *pp. 1139--1144*

**2007**

Iterative Refinement of Repeat Sequence Specification Using Constrained Pattern Matching

D. He, A.N. Arslan, Y. He, X. Wu

*Bioinformatics and Bioengineering, 2007. BIBE 2007. Proceedings of the 7th IEEE International Conference on*, *pp. 1199--1203*

A novel greedy algorithm for the minimum common string partition problem

D. He

*Bioinformatics Research and Applications*, 441--452, Springer, 2007

SAIL-APPROX: An efficient on-line algorithm for approximate pattern matching with wildcards and length constraints

D. He, X. Wu, X. Zhu

*Bioinformatics and Biomedicine, 2007. BIBM 2007. IEEE International Conference on*, *pp. 151--158*

**2006**

Ontology-Based Feature Weighting for Biomedical Literature Classification

D. He, X. Wu

*Information Reuse and Integration, 2006 IEEE International Conference on*, *pp. 280--285*

Using suffix tree to discover complex repetitive patterns in DNA sequences

D. He

*Engineering in Medicine and Biology Society, 2006. EMBS'06. 28th Annual International Conference of the IEEE*, *pp. 3474--3477*

A fast algorithm for the Constrained Multiple Sequence Alignment problem

D. He, A.N. Arslan, A.C.H. Ling

*Acta Cybernetica* *17*(*4*), 701--717, Acta Cybernetica, 2006

**2005**

A parallel algorithm for the Constrained Multiple Sequence Alignment problem

D. He, A.N. Arslan

*Bioinformatics and Bioengineering, 2005. BIBE 2005. Fifth IEEE Symposium on*, *pp. 258--262*

A space-efficient algorithm for the constrained pairwise sequence alignment problem

D. He, A.N. Arslan

*Genome Informatics* *16*(*2*), 237--246, 2005

**Year Unknown**

Space-efficient Algorithms for the Constrained Multiple Sequence Alignment Problem

D. He, A.N. Arslan

D He..., 0

FastPCMSA: An Improved Parallel Algorithm for the Constrained Multiple Sequence Alignment Problem

D. He, A.N. Arslan

Citeseer

A* Algorithms for the Constrained Multiple Sequence Alignment Problem

D. He, A.N. Arslan

Citeseer