I am broadly interested in the mechanisms of molecular evolution and in how they can be better inferred from large, densely sampled phylogenomic and population-genomic datasets. My research aims to exploit computational and comparative genomics both to characterize the mechanistic basis by which genes and non-coding elements evolve new functions, and to reconstruct how gene and genome functions have changed throughout the vertebrates.

To enable such work, new methods of statistical phylogenetic inference are urgently needed that are efficient enough to analyze hundreds to thousands of genomes rapidly and simultaneously. My work has therefore largely focused on developing such a computational framework, so that the potential of the large phylogenomic datasets of the future can be realized (see, e.g., Genome10k). This has included:

  1. Improving the realism, sensitivity, and power of comparative genomics approaches (de Koning et al., 2011; de Koning, 2007; also see Current Projects);
  2. Enabling rapid, high-throughput inferences from very large comparative datasets using novel data augmentation strategies and Markov chain Monte Carlo (reducing the time to analyze large datasets from months to minutes; de Koning et al., submitted; de Koning et al., 2010; de Koning et al., 2009; Krishnan et al., 2004; also see Current Projects); and
  3. Overcoming difficulties associated with making use of large genomic datasets, including: the impact of model violations (Castoe / de Koning et al., 2009; de Koning and Stewart, submitted), model over-parameterization (also see Current Projects), and experimental design challenges (de Koning, 2007; also see Current Projects).
Some of my current projects are described in more detail here.

Other Recent Work

I’ve also been actively collaborating on developing novel ways to exploit massively parallel, next-generation DNA sequencing technologies (Pollock, de Koning et al., 2011; Castoe et al., 2012; Castoe et al., 2009), and have been working on a variety of genomics projects. These include the sequencing and analysis of the first snake genome (the Burmese python; Castoe, de Koning et al., 2011) and other snakes, detailed analysis of the transposable element and repetitive sequence landscape in humans (de Koning et al., 2011) and snakes (Castoe et al., 2011), analysis of evolutionary variation in mutation asymmetries in reptile mitochondrial genomes (Castoe et al., 2009), and the initial analysis of three crocodilian genomes (St. John et al., 2012). See the Snake Genomics Consortium website and the International Crocodilian Genomes Working Group site for more about our work on reptilian genomes.

My full list of publications can be found here.