Novel algorithms and hardware designs for ultra-fast next-gen sequence analysis



Our proposed research aims to accelerate next-generation sequence analysis 1000-fold or more by combining our expertise in genomic sequence analysis, algorithm development, and computer architecture and engineering. Our plan for processing unprecedented amounts of sequence data has three major components. First, we will develop and improve sophisticated software algorithms and tools that handle the large numbers of sequence reads generated by all major next-generation sequencing (NGS) platforms without sacrificing sensitivity, while correcting for the sequencing biases associated with each platform. Our algorithms will also map reads in the duplicated regions of the genome and report the underlying sequence variation, a capability that no other read mapping tool currently offers and that is especially important for characterizing segmental duplications and structural variation. Second, we will boost the performance and efficiency of our algorithms 100- to 1000-fold by running the inherently parallel computations of the sequence search problem on the massively parallel hardware engines available today, graphics processing units (GPUs). Finally, we will design specialized hardware architectures that increase the speed of sequence analysis by further orders of magnitude while reducing its energy consumption 100-fold or more.