Novel algorithms and hardware designs for ultra-fast next-gen sequence analysis

- United States of America National Institutes of Health (R01 HG006004), 2011-2015
 
- PI: Onur Mutlu
- Co-PI: Can Alkan
- Subaward amount (4 years): $462,847
 
- The goal of this project is to develop specialized hardware architectures that accelerate the mapping of reads generated by high-throughput sequencing platforms.
People
        
          
- Principal Investigators: Assistant Prof. Can Alkan (Bilkent U.) and Assistant Prof. Onur Mutlu (Carnegie Mellon U.)
- Students
  - CMU: Hongyi Xin, Donghyuk Lee, Samihan Yedkar, Damla Şenol Çalı
  - UCLA: Farhad Hormozdiari
  - Bilkent: Mustafa Korkmaz, Azita Nouri, Mohammed Alser, Tuğba Doğan
 
Abstract
        
Our proposed research aims to accelerate next-generation sequence analysis 1000-fold or more by combining our knowledge in genomic sequence analysis, algorithm development, and computer architecture/engineering. Our plan to address the problems of processing unprecedented amounts of sequence data has three major components. First, we will develop and improve sophisticated software algorithms and tools to handle large amounts of sequence reads generated by all major NGS platforms without sacrificing sensitivity, while correcting for the sequencing biases associated with each of the NGS platforms. Our algorithms will also be able to map reads in the duplicated regions of the genome and report the underlying sequence variation, a feature that is especially important for characterizing segmental duplications and structural variation and that no other read mapping tool currently offers. Second, we will boost the performance and efficiency of our algorithms (100- to 1000-fold) by accelerating the inherently parallel computations of the sequence search problem on the massively parallel hardware engines available today, namely graphics processing units (GPUs). Finally, we will design specialized hardware architectures to speed up sequence analysis by further orders of magnitude while reducing the energy it consumes by 100-fold or more.
        
        
Dissemination
        
- SCALCE: boosting Sequence Compression Algorithms using Locally Consistent Encoding. Faraz Hach, Ibrahim Numanagić, Can Alkan, S. Cenk Sahinalp. Bioinformatics, Dec 1;28(23):3051-57, 2012.
- Accelerating read mapping with FastHASH. Hongyi Xin, Donghyuk Lee, Farhad Hormozdiari, Samihan Yedkar, Onur Mutlu, Can Alkan. BMC Genomics, 14(Suppl 1):S13, 2013.
- mrsFAST-Ultra: a compact, SNP-aware mapper for high performance sequencing applications. Faraz Hach*, Iman Sarrafi*, Farhad Hormozdiari, Can Alkan, Evan E. Eichler, S. Cenk Sahinalp. Nucl Acids Res, Jul;42(Web Server issue):W494-500, 2014.
- Fast and accurate mapping of Complete Genomics reads. Donghyuk Lee*, Farhad Hormozdiari*, Hongyi Xin, Faraz Hach, Onur Mutlu, Can Alkan. Methods, [epub October 22], doi: 10.1016/j.ymeth.2014.10.012, 2014.
- Shifted Hamming Distance: a fast and accurate SIMD-friendly filter to accelerate alignment verification in read mapping. Hongyi Xin, John Greth, John Emmons, Gennady Pekhimenko, Carl Kingsford, Can Alkan*, Onur Mutlu*. Bioinformatics, [published online, Jan 10], 2015.
- Optimal Seed Solver: Optimizing Seed Selection in Read Mapping. Hongyi Xin, Sunny Nahar, Richard Zhu, John Emmons, Gennady Pekhimenko, Carl Kingsford, Can Alkan*, Onur Mutlu*. Bioinformatics, Jun 1;32(11):1632-42, 2016.
- MAGNET: understanding and improving the accuracy of genome pre-alignment filtering. Mohammed Alser, Onur Mutlu*, Can Alkan*. IPSI Transactions on Internet Research, 13(2), 2017.
- GateKeeper: a new hardware architecture for accelerating pre-alignment in DNA short read mapping. Mohammed Alser, Hasan Hassan, Hongyi Xin, Oguz Ergin, Onur Mutlu*, Can Alkan*. Bioinformatics, Nov 1;33(21):3355-63, 2017.
- GRIM-Filter: fast seed location filtering in DNA read mapping using processing-in-memory technologies. Jeremie S. Kim, Damla Senol Cali, Hongyi Xin, Donghyuk Lee, Saugata Ghose, Mohammed Alser, Hasan Hassan, Oguz Ergin, Can Alkan*, Onur Mutlu*. BMC Genomics, 19(Suppl 2):89, 2018.
- Nanopore sequencing technology and tools for genome assembly: computational analysis of the current state, bottlenecks and future directions. Damla Senol Cali, Jeremie S. Kim, Saugata Ghose, Can Alkan*, Onur Mutlu*. Briefings in Bioinformatics, [epub Apr 2; doi: 10.1093/bib/bby017], 2018.