This list of sequence alignment software is a compilation of software tools and web portals used in pairwise sequence alignment and multiple sequence alignment. See structural alignment software for structural alignment of proteins.
Multiple sequence alignment (MSA) is generally the alignment of three or more biological sequences (protein or nucleic acid) of similar length. From the output, homology can be inferred and the evolutionary relationships between the sequences studied. By contrast, pairwise sequence alignment tools are used to identify regions of similarity that may indicate functional, structural and/or evolutionary relationships between two biological sequences.

MEGA is an integrated tool for conducting automatic and manual alignment, inferring phylogenetic trees, mining web-based databases, estimating rates of molecular evolution, inferring ancestral sequences, and testing evolutionary hypotheses.

Accurate multiple sequence alignment is central to bioinformatics and molecular evolutionary analyses. Although sophisticated sequence alignment programs are available, manual adjustments are often required to improve alignment quality. Unfortunately, few programs offer a simple and intuitive way to.
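To make the idea of a pairwise alignment concrete, here is a minimal sketch of global alignment by dynamic programming (the Needleman-Wunsch approach that several tools listed below implement). The function name and the scoring values (match +1, mismatch -1, linear gap -1) are illustrative assumptions, not the defaults of any listed program.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align strings a and b with a linear gap cost.

    Returns (score, aligned_a, aligned_b). Scoring values are illustrative.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1]
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    s, x, y = needleman_wunsch("GATTACA", "GATACA")
    print(s)
    print(x)
    print(y)
```

The table costs O(nm) time and memory, which is why many of the tools in this list rely on heuristics, SIMD, or GPUs for long sequences and large databases.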
Database search only
Name | Description | Sequence type* | Link | Authors | Year |
---|---|---|---|---|---|
BLAST | Local search with fast k-tuple heuristic (Basic Local Alignment Search Tool) | Both | NCBI, EMBL-EBI, DDBJ, DDBJ (psi-blast), GenomeNet, PIR (protein only) | Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ[1] | 1990 |
CS-BLAST | Sequence-context specific BLAST, more sensitive than BLAST, FASTA, and SSEARCH. Position-specific iterative version CSI-BLAST more sensitive than PSI-BLAST | Protein | CS-BLAST server, download (dead link) | Angermueller C, Biegert A, Soeding J[2] | 2013 |
CUDASW++ | GPU-accelerated Smith-Waterman algorithm for multiple shared-host GPUs | Protein | | Liu Y, Maskell DL and Schmidt B | 2009/2010 |
DIAMOND | BLASTX and BLASTP aligner based on double indexing | Protein | | Buchfink B, Xie C and Huson DH[3] | 2015 |
FASTA | Local search with fast k-tuple heuristic, faster but less sensitive than BLAST | Both | EMBL-EBI, DDBJ, GenomeNet, PIR (protein only) | | |
GGSEARCH, GLSEARCH | Global:Global (GG), Global:Local (GL) alignment with statistics | Protein | FASTA server | | |
Genoogle | Uses indexing and parallel processing techniques for searching DNA and protein sequences. Developed in Java and open source. | Both | | Albrecht F | 2015 |
HMMER | Local and global search with profile Hidden Markov models, more sensitive than PSI-BLAST | Both | download | Durbin R, Eddy SR, Krogh A, Mitchison G[4] | 1998 |
HH-suite | Pairwise comparison of profile Hidden Markov models; very sensitive | Protein | | Söding J[5][6] | 2005/2012 |
IDF | Inverse Document Frequency | Both | download | | |
Infernal | Profile SCFG search | RNA | download | Eddy S | |
KLAST | High-performance general purpose sequence similarity search tool | Both | | | 2009/2014 |
LAMBDA | High-performance local aligner compatible with BLAST, but much faster; supports SAM/BAM | Protein | | Hannes Hauswedell, Jochen Singer, Knut Reinert[7] | 2014 |
MMseqs2 | Software suite to search and cluster huge sequence sets. Similar sensitivity to BLAST and PSI-BLAST but orders of magnitude faster | Protein | homepage | Steinegger M, Mirdita M, Galiez C, Söding J[8] | 2017 |
USEARCH | Ultra-fast sequence analysis tool | Both | homepage | Edgar RC. Search and clustering orders of magnitude faster than BLAST. Bioinformatics 26(19):2460-2461. doi: 10.1093/bioinformatics/btq461 | 2010 |
OSWALD | OpenCL Smith-Waterman on Altera's FPGA for Large Protein Databases | Protein | homepage | Rucci E, García C, Botella G, De Giusti A, Naiouf M, Prieto-Matías M[9] | 2016 |
parasail | Fast Smith-Waterman search using SIMD parallelization | Both | homepage | Daily J | 2015 |
PSI-BLAST | Position-specific iterative BLAST, local search with position-specific scoring matrices, much more sensitive than BLAST | Protein | NCBI PSI-BLAST | Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ[10] | 1997 |
PSI-Search | Combining the Smith-Waterman search algorithm with the PSI-BLAST profile construction strategy to find distantly related protein sequences, and preventing homologous over-extension errors. | Protein | EMBL-EBI PSI-Search | Li W, McWilliam H, Goujon M, Cowley A, Lopez R, Pearson WR[11] | 2012 |
ScalaBLAST | Highly parallel Scalable BLAST | Both | ScalaBLAST | Oehmen et al.[12] | 2011 |
Sequilab | Linking and profiling sequence alignment data from NCBI-BLAST results with major sequence analysis servers/services | Nucleotide, peptide | server | | 2010 |
SAM | Local and global search with profile Hidden Markov models, more sensitive than PSI-BLAST | Both | SAM | Karplus K, Krogh A[13] | 1999 |
SSEARCH | Smith-Waterman search, slower but more sensitive than FASTA | Both | | | |
SWAPHI | First parallelized algorithm employing the emerging Intel Xeon Phis to accelerate Smith-Waterman protein database search | Protein | homepage | Liu Y and Schmidt B | 2014 |
SWAPHI-LS | First parallel Smith-Waterman algorithm exploiting Intel Xeon Phi clusters to accelerate the alignment of long DNA sequences | DNA | homepage | Liu Y, Tran TT, Lauenroth F, Schmidt B | 2014 |
SWIMM | Smith-Waterman implementation for Intel Multicore and Manycore architectures | Protein | homepage | Rucci E, García C, Botella G, De Giusti A, Naiouf M and Prieto-Matías M[14] | 2015 |
SWIPE | Fast Smith-Waterman search using SIMD parallelization | Both | homepage | Rognes T | 2011 |
*Sequence type: protein or nucleotide
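Several entries above describe a "fast k-tuple heuristic". As a rough illustration of that idea only, the sketch below indexes every k-letter word in a small database and looks up the words of a query to find exact seed matches that a real tool would then extend and score. This is a simplified sketch and not the actual algorithm of BLAST or FASTA; the function names and the toy database are hypothetical.

```python
from collections import defaultdict


def build_kmer_index(sequences, k):
    """Map each k-letter word to the (sequence name, position) pairs where it occurs."""
    index = defaultdict(list)
    for name, seq in sequences.items():
        for pos in range(len(seq) - k + 1):
            index[seq[pos:pos + k]].append((name, pos))
    return index


def find_seeds(query, index, k):
    """Return {sequence name: [(query position, database position), ...]} for exact k-mer hits."""
    hits = defaultdict(list)
    for qpos in range(len(query) - k + 1):
        word = query[qpos:qpos + k]
        # .get avoids adding empty entries to the defaultdict index
        for name, dpos in index.get(word, ()):
            hits[name].append((qpos, dpos))
    return dict(hits)


if __name__ == "__main__":
    # Toy database with hypothetical sequence names
    database = {"s1": "ACGTACGT", "s2": "TTTTGGGG"}
    index = build_kmer_index(database, k=4)
    print(find_seeds("GTAC", index, k=4))
```

Because only sequences sharing at least one word with the query are examined further, this kind of filtering trades some sensitivity for speed compared with full Smith-Waterman search tools such as SSEARCH.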
Pairwise alignment
Name | Description | Sequence type* | Alignment type** | Link | Author | Year |
---|---|---|---|---|---|---|
ACANA | Fast heuristic anchor based pairwise alignment | Both | Both | download | Huang, Umbach, Li | 2005 |
AlignMe | Alignments for membrane protein sequences | Protein | Both | download, server | M. Stamm, K. Khafizov, R. Staritzbichler, L.R. Forrest | 2013 |
ALLALIGN | For DNA, RNA and proteins with summed length n, generates all local alignments in O(n) time using approximate suffix tree matching or mapped density dynamic alignment | Both | Local | allalign | E. Wachtel | 2017 |
Bioconductor Biostrings::pairwiseAlignment | Dynamic programming | Both | Both + Ends-free | site | P. Aboyoun | 2008 |
BioPerl dpAlign | Dynamic programming | Both | Both + Ends-free | site | Y. M. Chan | 2003 |
BLASTZ, LASTZ | Seeded pattern-matching | Nucleotide | Local | download, download | Schwartz et al.[15][16] | 2004,2009 |
CUDAlign | DNA sequence alignment of unrestricted size in single or multiple GPUs | Nucleotide | Local, SemiGlobal, Global | download | E. Sandes[17][18][19] | 2011-2015 |
DNADot | Web-based dot-plot tool | Nucleotide | Global | server | R. Bowen | 1998 |
DNASTAR Lasergene Molecular Biology Suite | Software to align DNA, RNA, protein, or DNA + protein sequences via pairwise and multiple sequence alignment algorithms including MUSCLE, Mauve, MAFFT, Clustal Omega, Jotun Hein, Wilbur-Lipman, Martinez Needleman-Wunsch, Lipman-Pearson and Dotplot analysis. | Both | Both | DNASTAR site | DNASTAR | 1993-2016 |
DOTLET | Java-based dot-plot tool | Both | Global | applet | M. Pagni and T. Junier | 1998 |
FEAST | Posterior based local extension with descriptive evolution model | Nucleotide | Local | site | A. K. Hudek and D. G. Brown | 2010 |
Genome Compiler | Aligns chromatogram files (.ab1, .scf) against a template sequence to locate and correct errors | Nucleotide | Local | Free online & download | Genome Compiler Corporation | 2014 |
G-PAS | GPU-based dynamic programming with backtracking | Both | Local, SemiGlobal, Global | site+download | W. Frohmberg, M. Kierzynka et al. | 2011 |
GapMis | Does pairwise sequence alignment with one gap | Both | SemiGlobal | site | K. Frousios, T. Flouri, C. S. Iliopoulos, K. Park, S. P. Pissis, G. Tischler | 2012 |
GGSEARCH, GLSEARCH | Global:Global (GG), Global:Local (GL) alignment with statistics | Protein | Global in query | FASTA server | W. Pearson | 2007 |
JAligner | Java open-source implementation of Smith-Waterman | Both | Local | JWS | A. Moustafa | 2005 |
K*Sync | Protein sequence to structure alignment that includes secondary structure, structural conservation, structure-derived sequence profiles, and consensus alignment scores | Protein | Both | Robetta server | D. Chivian & D. Baker[20] | 2003 |
LALIGN | Multiple, non-overlapping, local similarity (same algorithm as SIM) | Both | Local non-overlapping | | W. Pearson | 1991 (algorithm) |
NW-align | Standard Needleman-Wunsch dynamic programming algorithm | Protein | Global | server and download | Y Zhang | 2012 |
mAlign | Modelling alignment; models the information content of the sequences | Nucleotide | Both | doc, code (dead link) | D. Powell, L. Allison and T. I. Dix | 2004 |
matcher | Waterman-Eggert local alignment (based on LALIGN) | Both | Local | Pasteur | I. Longden (modified from W. Pearson) | 1999 |
MCALIGN2 | explicit models of indel evolution | DNA | Global | server | J. Wang et al. | 2006 |
MUMmer | suffix tree based | Nucleotide | Global | download | S. Kurtz et al. | 2004 |
needle | Needleman-Wunsch dynamic programming | Both | SemiGlobal | | A. Bleasby | 1999 |
Ngila | logarithmic and affine gap costs and explicit models of indel evolution | Both | Global | download | R. Cartwright | 2007 |
NW | Needleman-Wunsch dynamic programming | Both | Global | download | A.C.R. Martin | 1990-2015 |
parasail | C/C++/Python/Java SIMD dynamic programming library for SSE, AVX2 | Both | Global, Ends-free, Local | site | J. Daily | 2015 |
Path | Smith-Waterman on protein back-translation graph (detects frameshifts at protein level) | Protein | Local | | M. Gîrdea et al.[21] | 2009 |
PatternHunter | Seeded pattern-matching | Nucleotide | Local | download | B. Ma et al.[22][23] | 2002–2004 |
ProbA (also propA) | Stochastic partition function sampling via dynamic programming | Both | Global | download | U. Mückstein | 2002 |
PyMOL | 'align' command aligns sequence & applies it to structure | Protein | Global (by selection) | site | W. L. DeLano | 2007 |
REPuter | suffix tree based | Nucleotide | Local | download | S. Kurtz et al. | 2001 |
SABERTOOTH | Alignment using predicted Connectivity Profiles | Protein | Global | download on request (dead link) | F. Teichert, J. Minning, U. Bastolla, and M. Porto | 2009 |
Satsuma | Parallel whole-genome synteny alignments | DNA | Local | download | M.G. Grabherr et al. | 2010 |
SEQALN | Various dynamic programming | Both | Local or global | server | M.S. Waterman and P. Hardy | 1996 |
SIM, GAP, NAP, LAP | Local similarity with varying gap treatments | Both | Local or global | server | X. Huang and W. Miller | 1990-6 |
SIM | Local similarity | Both | Local | servers | X. Huang and W. Miller | 1991 |
SPA: Super pairwise alignment | Fast pairwise global alignment | Nucleotide | Global | available upon request | Shen, Yang, Yao, Hwang | 2002 |
SSEARCH | Local (Smith-Waterman) alignment with statistics | Protein | Local | | W. Pearson | 1981 (Algorithm) |
Sequences Studio | Java applet demonstrating various algorithms from[24] | Generic sequence | Local and global | | A. Meskauskas | 1997 (reference book) |
SWIFT suit | Fast Local Alignment Searching | DNA | Local | site | K. Rasmussen,[25] W. Gerlach | 2005,2008 |
stretcher | Memory-optimized Needleman-Wunsch dynamic programming | Both | Global | Pasteur | I. Longden (modified from G. Myers and W. Miller) | 1999 |
tranalign | Aligns nucleic acid sequences given a protein alignment | Nucleotide | NA | Pasteur | G. Williams (modified from B. Pearson) | 2002 |
UGENE | Open-source Smith-Waterman for SSE/CUDA, suffix array based repeats finder & dotplot | Both | Both | UGENE site | UniPro | 2010 |
water | Smith-Waterman dynamic programming | Both | Local | | A. Bleasby | 1999 |
wordmatch | k-tuple pairwise match | Both | NA | Pasteur | I. Longden | 1998 |
YASS | Seeded pattern-matching | Nucleotide | Local | | L. Noe and G. Kucherov[26] | 2004 |
*Sequence type: protein or nucleotide **Alignment type: local or global
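The local/global distinction in the "Alignment type" column can be illustrated with a short sketch of Smith-Waterman local scoring. The only structural difference from a global dynamic program is that every cell is floored at zero, so an alignment may start and end anywhere in either sequence. The function name and scoring values (match +2, mismatch -1, linear gap -2) are illustrative assumptions, not the defaults of any tool in the table.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between strings a and b.

    Uses a linear gap cost and keeps only two rows of the table in memory.
    """
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # The 0 option lets a new local alignment start at this cell
            cur[j] = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            best = max(best, cur[j])
        prev = cur
    return best


if __name__ == "__main__":
    # The shared core "ACGT" is found despite unrelated flanking sequence
    print(smith_waterman_score("TTTTACGTTTTT", "GGGACGTGGG"))
```

A global aligner would be forced to pay for the unrelated flanks in this example, whereas the local score reflects only the conserved core.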
Multiple sequence alignment
Name | Description | Sequence type* | Alignment type** | Link | Author | Year | License |
---|---|---|---|---|---|---|---|
ABA | A-Bruijn alignment | Protein | Global | download | B.Raphael et al. | 2004 | Proprietary, freeware for education, research, nonprofit |
ALE | manual alignment ; some software assistance | Nucleotides | Local | download | J. Blandy and K. Fogel | 1994 (latest version 2007) | Free, GPL2 |
ALLALIGN | For DNA, RNA and proteins with summed length n, generates all local alignments in O(n) time using approximate suffix tree matching or mapped density dynamic alignment | Both | Local | allalign | E. Wachtel | 2017 | Free |
AMAP | Sequence annealing | Both | Global | server | A. Schwartz and L. Pachter | 2006 | |
anon. | Fast, optimal alignment of three sequences using linear gap costs | Nucleotides | Global | paper, software (dead link) | D. Powell, L. Allison and T. I. Dix | 2000 | |
BAli-Phy | Tree+multi-alignment; probabilistic-Bayesian; joint estimation | Both + Codons | Global | WWW+download | BD Redelings and MA Suchard | 2005 (latest version 2018) | Free, GPL |
Base-By-Base | Java-based multiple sequence alignment editor with integrated analysis tools | Both | Local or global | download | R. Brodie et al. | 2004 | Proprietary, freeware, must register |
CHAOS, DIALIGN | Iterative alignment | Both | Local (preferred) | server | M. Brudno and B. Morgenstern | 2003 | |
ClustalW | Progressive alignment | Both | Local or global | | Thompson et al. | 1994 | Free, LGPL |
CodonCode Aligner | Multi-alignment; ClustalW & Phrap support | Nucleotides | Local or global | download | P. Richterich et al. | 2003 (latest version 2009) | |
Compass | COmparison of Multiple Protein sequence Alignments with assessment of Statistical Significance | Protein | Global | download and server | R.I. Sadreyev, et al. | 2009 | |
DECIPHER | Progressive-iterative alignment | Both | Global | download | Erik S. Wright | 2014 | Free, GPL |
DIALIGN-TX and DIALIGN-T | Segment-based method | Both | Local (preferred) or Global | download and server | A.R.Subramanian | 2005 (latest version 2008) | |
DNA Alignment | Segment-based method for intraspecific alignments | Both | Local (preferred) or Global | server | A.Roehl | 2005 (latest version 2008) | |
DNA Baser Sequence Assembler | Multi-alignment; Full automatic sequence alignment; Automatic ambiguity correction; Internal base caller; Command line seq alignment | Nucleotides | Local or global | www.DnaBaser.com | Heracle BioSoft SRL | 2006 (latest version 2018) | Commercial (some modules are freeware) |
DNADynamo | linked DNA to Protein multiple alignment with MUSCLE, Clustal and Smith-Waterman | Both | Local or global | download | DNADynamo | 2004 (newest version 2017) | |
DNASTAR Lasergene Molecular Biology Suite | Software to align DNA, RNA, protein, or DNA + protein sequences via pairwise and multiple sequence alignment algorithms including MUSCLE, Mauve, MAFFT, Clustal Omega, Jotun Hein, Wilbur-Lipman, Martinez Needleman-Wunsch, Lipman-Pearson and Dotplot analysis. | Both | Local or global | DNASTAR site | DNASTAR | 1993-2016 | |
EDNA | Energy Based Multiple Sequence Alignment for DNA Binding Sites | Nucleotides | Local or global | sourceforge.net/projects/msa-edna/ | Salama, RA. et al. | 2013 | |
FAMSA | Progressive alignment for extremely large protein families (hundreds of thousands of members) | Protein | Global | download | Deorowicz et al. | 2016 | |
FSA | Sequence annealing | Both | Global | download and server | R. K. Bradley et al. | 2008 | |
Geneious | Progressive-Iterative alignment; ClustalW plugin | Both | Local or global | download | A.J. Drummond et al. | 2005 (latest version 2017) | |
Kalign | Progressive alignment | Both | Global | | T. Lassmann | 2005 | |
MAFFT | Progressive-iterative alignment | Both | Local or global | | K. Katoh et al. | 2005 | Free, BSD |
MARNA | Multi-alignment of RNAs | RNA | Local | | S. Siebert et al. | 2005 | |
MAVID | Progressive alignment | Both | Global | server | N. Bray and L. Pachter | 2004 | |
MSA | Dynamic programming | Both | Local or global | download | D.J. Lipman et al. | 1989 (modified 1995) | |
MSAProbs | Dynamic programming | Protein | Global | download | Y. Liu, B. Schmidt, D. Maskell | 2010 | |
MULTALIN | Dynamic programming-clustering | Both | Local or global | | F. Corpet | 1988 | |
Multi-LAGAN | Progressive dynamic programming alignment | Both | Global | server | M. Brudno et al. | 2003 | |
MUSCLE | Progressive-iterative alignment | Both | Local or global | server | R. Edgar | 2004 | |
Opal | Progressive-iterative alignment | Both | Local or global | download | T. Wheeler and J. Kececioglu | 2007 (latest stable 2013, latest beta 2016) | |
Pecan | Probabilistic-consistency | DNA | Global | download | B. Paten et al. | 2008 | |
Phylo | A human computing framework for comparative genomics to solve multiple alignment | Nucleotides | Local or global | site | McGill Bioinformatics | 2010 | |
PMFastR | Progressive structure aware alignment | RNA | Global | site | D. DeBlasio, J Braund, S Zhang | 2009 | |
Praline | Progressive-iterative-consistency-homology-extended alignment with preprofiling and secondary structure prediction | Protein | Global | server | J. Heringa | 1999 (latest version 2009) | |
PicXAA | Nonprogressive, maximum expected accuracy alignment | Both | Global | download and server | S.M.E. Sahraeian and B.J. Yoon | 2010 | |
POA | Partial order/hidden Markov model | Protein | Local or global | download | C. Lee | 2002 | |
Probalign | Probabilistic/consistency with partition function probabilities | Protein | Global | server | Roshan and Livesay | 2006 | Free, public domain |
ProbCons | Probabilistic/consistency | Protein | Local or global | server | C. Do et al. | 2005 | Free, public domain |
PROMALS3D | Progressive alignment/hidden Markov model/Secondary structure/3D structure | Protein | Global | server | J. Pei et al. | 2008 | |
PRRN/PRRP | Iterative alignment (especially refinement) | Protein | Local or global | | Y. Totoki (based on O. Gotoh) | 1991 and later | |
PSAlign | Alignment preserving non-heuristic | Both | Local or global | download | S.H. Sze, Y. Lu, Q. Yang. | 2006 | |
RevTrans | Combines DNA and Protein alignment, by back translating the protein alignment to DNA. | DNA/Protein (special) | Local or global | server | Wernersson and Pedersen | 2003 (newest version 2005) | |
SAGA | Sequence alignment by genetic algorithm | Protein | Local or global | download | C. Notredame et al. | 1996 (new version 1998) | |
SAM | Hidden Markov model | Protein | Local or global | server | A. Krogh et al. | 1994 (most recent version 2002) | |
Se-Al | Manual alignment | Both | Local | download | A. Rambaut | 2002 | |
StatAlign | Bayesian co-estimation of alignment and phylogeny (MCMC) | Both | Global | download | A. Novak et al. | 2008 | |
Stemloc | Multiple alignment and secondary structure prediction | RNA | Local or global | download | I. Holmes | 2005 | Free, GPL 3 (part of DART) |
T-Coffee | More sensitive progressive alignment | Both | Local or global | | C. Notredame et al. | 2000 (newest version 2008) | Free, GPL 2 |
UGENE | Supports multiple alignment with MUSCLE, KAlign, Clustal and MAFFT plugins | Both | Local or global | download | UGENE team | 2010 (newest version 2012) | Free, GPL 2 |
VectorFriends | VectorFriends Aligner, MUSCLE plugin, and ClustalW plugin | Both | Local or global | download | BioFriends team | 2013 | Proprietary, freeware for academic use |
GLProbs | Adaptive pair-Hidden Markov Model based approach | Protein | Global | download | Y. Ye et al. | 2013 | |
*Sequence type: protein or nucleotide. **Alignment type: local or global
Genomics analysis
Name | Description | Sequence type* | Link |
---|---|---|---|
ACT (Artemis Comparison Tool) | Synteny and comparative genomics | Nucleotide | server |
AVID | Pairwise global alignment with whole genomes | Nucleotide | server |
BLAT | Alignment of cDNA sequences to a genome. | Nucleotide | [27] |
DECIPHER | Alignment of rearranged genomes using 6 frame translation | Nucleotide | download |
FLAK | Fuzzy whole genome alignment and analysis | Nucleotide | server |
GMAP | Alignment of cDNA sequences to a genome. Identifies splice site junctions with high accuracy. | Nucleotide | http://research-pub.gene.com/gmap |
Splign | Alignment of cDNA sequences to a genome. Identifies splice site junctions with high accuracy. Able to recognize and separate gene duplications. | Nucleotide | https://www.ncbi.nlm.nih.gov/sutils/splign |
Mauve | Multiple alignment of rearranged genomes | Nucleotide | download |
MGA | Multiple Genome Aligner | Nucleotide | download |
Mulan | Local multiple alignments of genome-length sequences | Nucleotide | server |
Multiz | Multiple alignment of genomes | Nucleotide | download |
PLAST-ncRNA | Search for ncRNAs in genomes by partition function local alignment | Nucleotide | server |
Sequerome | Profiling sequence alignment data with major servers/services | Nucleotide, peptide | server |
Sequilab | Profiling sequence alignment data from NCBI-BLAST results with major servers-services | Nucleotide, peptide | server |
Shuffle-LAGAN | Pairwise glocal alignment of completed genome regions | Nucleotide | server |
SIBsim4, Sim4 | A program designed to align an expressed DNA sequence with a genomic sequence, allowing for introns | Nucleotide | download |
SLAM | Gene finding, alignment, annotation (human-mouse homology identification) | Nucleotide | server |
*Sequence type: protein or nucleotide
Motif finding
Name | Description | Sequence type* | Link |
---|---|---|---|
PMS | Motif search and discovery | Both | |
FMM | Motif search and discovery (can also take positive and negative sequences as input for enriched motif search) | Nucleotide | server |
BLOCKS | Ungapped motif identification from BLOCKS database | Both | server |
eMOTIF | Extraction and identification of shorter motifs | Both | servers |
Gibbs motif sampler | Stochastic motif extraction by statistical likelihood | Both | |
HMMTOP | Prediction of transmembrane helices and topology of proteins | Protein | homepage & download |
I-sites | Local structure motif library | Protein | server |
JCoils | Prediction of Coiled coil and Leucine Zipper | Protein | homepage & download |
MEME/MAST | Motif discovery and search | Both | server |
CUDA-MEME | GPU accelerated MEME (v4.4.0) algorithm for GPU clusters | Both | homepage |
MERCI | Discriminative motif discovery and search | Both | homepage & download |
PHI-Blast | Motif search and alignment tool | Both | Pasteur |
Phyloscan | Motif search tool | Nucleotide | server |
PRATT | Pattern generation for use with ScanProsite | Protein | server |
ScanProsite | Motif database search tool | Protein | server |
TEIRESIAS | Motif extraction and database search | Both | server |
BASALT | Multiple motif and regular expression search | Both | homepage |
*Sequence type: protein or nucleotide
Benchmarking
Name | Link | Authors |
---|---|---|
PFAM 30.0 (2016) | | |
SMART (2015) | website | Letunic, Copley, Schmidt, Ciccarelli, Doerks, Schultz, Ponting, Bork |
BAliBASE 3 (2015) | website | Thompson, Plewniak, Poch |
Oxbench (2011) | download | Raghava, Searle, Audley, Barber, Barton |
Benchmark collection (2009) | website | Edgar |
HOMSTRAD (2005) | website | Mizuguchi |
PREFAB 4.0 (2005) | website | Edgar |
SABmark (2004) | download | Van Walle, Lasters, Wyns |
Alignment viewers, editors
Please see List of alignment visualization software.
Short-read sequence alignment
Name | Description | paired-end option | Use FASTQ quality | Gapped | Multi-threaded | License | Link | Reference | Year |
---|---|---|---|---|---|---|---|---|---|
Arioc | Computes Smith-Waterman gapped alignments and mapping qualities on one or more GPUs. Supports BS-seq alignments. Processes 100,000 to 500,000 reads per second (varies with data, hardware, and configured sensitivity). | Yes | No | Yes | Yes | Free, BSD | github | [28] | 2015 |
BarraCUDA | A GPGPU accelerated Burrows-Wheeler transform (FM-index) short read alignment program based on BWA, supports alignment of indels with gap openings and extensions. | Yes | No | Yes | Yes, POSIX Threads and CUDA | Free, GPL | link | ||
BBMap | Uses short k-mers to rapidly index the genome; no size or scaffold count limit. Higher sensitivity and specificity than Burrows-Wheeler aligners, with similar or greater speed. Performs affine-transform-optimized global alignment, which is slower but more accurate than Smith-Waterman. Handles Illumina, 454, PacBio, Sanger, and Ion Torrent data. Splice-aware; capable of processing long indels and RNA-seq. Pure Java; runs on any platform. Used by the Joint Genome Institute. | Yes | Yes | Yes | Yes | Free, BSD | link | | 2010 |
BFAST | Explicit time and accuracy tradeoff with a prior accuracy estimation, supported by indexing the reference sequences. Optimally compresses indexes. Can handle billions of short reads. Can handle insertions, deletions, SNPs, and color errors (can map ABI SOLiD color space reads). Performs a full Smith-Waterman alignment. | | | | Yes, POSIX Threads | Free, GPL | link (dead link) | [29] | 2009 |
BigBWA | Runs the Burrows-Wheeler Aligner-BWA on a Hadoop cluster. It supports the algorithms BWA-MEM, BWA-ALN, and BWA-SW, working with paired and single reads. It implies an important reduction in the computational time when running in a Hadoop cluster, adding scalability and fault-tolerance. | Yes | Low quality bases trimming | Yes | Yes | Free, GPL 3 | link | [30] | 2015 |
BLASTN | BLAST's nucleotide alignment program, slow and not accurate for short reads, and uses a sequence database (EST, Sanger sequence) rather than a reference genome. | | | | | | link | | |
BLAT | Made by Jim Kent. Can handle one mismatch in initial alignment step. | | | | Yes, client-server | Proprietary, freeware for academic and noncommercial use | link | [31] | 2002 |
Bowtie | Uses a Burrows-Wheeler transform to create a permanent, reusable index of the genome; 1.3 GB memory footprint for human genome. Aligns more than 25 million Illumina reads in 1 CPU hour. Supports Maq-like and SOAP-like alignment policies | Yes | Yes | No | Yes, POSIX Threads | Free, Artistic | link | [32] | 2009 |
BWA | Uses a Burrows-Wheeler transform to create an index of the genome. Slightly slower than Bowtie, but allows indels in alignment. | Yes | Low quality bases trimming | Yes | Yes | Free, GPL | link | [33] | 2009 |
BWA-PSSM | A probabilistic short read aligner based on the use of position specific scoring matrices (PSSM). The aligner is adaptable in the sense that it can take into account the quality scores of the reads and models of data specific biases, such as those observed in Ancient DNA, PAR-CLIP data or genomes with biased nucleotide compositions.[34] | Yes | Yes | Yes | Yes | Free, GPL | link | [34] | 2014 |
CASHX | Quantify and manage large quantities of short-read sequence data. CASHX pipeline contains a set of tools that can be used together, or separately as modules. This algorithm is very accurate for perfect hits to a reference genome. | No | Proprietary, freeware for academic and noncommercial use | link | |||||
Cloudburst | Short-read mapping using Hadoop MapReduce | | | | Yes, Hadoop MapReduce | Free, Artistic | link | | |
CUDA-EC | Short-read alignment error correction using GPUs. | | | | Yes, GPU enabled | | link | | |
CUSHAW | A CUDA compatible short read aligner to large genomes based on Burrows-Wheeler transform | Yes | Yes | No | Yes (GPU enabled) | Free, GPL | link | [35] | 2012 |
CUSHAW2 | Gapped short-read and long-read alignment based on maximal exact match seeds. This aligner supports both base-space (e.g. from Illumina, 454, Ion Torrent and PacBio sequencers) and ABI SOLiD color-space read alignments. | Yes | No | Yes | Yes | Free, GPL | link | | 2014 |
CUSHAW2-GPU | GPU-accelerated CUSHAW2 short-read aligner. | Yes | No | Yes | Yes | Free, GPL | link | ||
CUSHAW3 | Sensitive and accurate base-space and color-space short-read alignment with hybrid seeding | Yes | No | Yes | Yes | Free, GPL | link | [36] | 2012 |
drFAST | Read mapping alignment software that implements cache obliviousness to minimize main/cache memory transfers like mrFAST and mrsFAST, however designed for the SOLiD sequencing platform (color space reads). It also returns all possible map locations for improved structural variation discovery. | Yes | Yes, for structural variation | Yes | No | Free, BSD | link | ||
ELAND | Implemented by Illumina. Includes ungapped alignment with a finite read length. | ||||||||
ERNE | Extended Randomized Numerical alignEr for accurate alignment of NGS reads. It can map bisulfite-treated reads. | Yes | Low quality bases trimming | Yes | Multithreading and MPI-enabled | Free, GPL 3 | link | ||
GASSST | Finds global alignments of short DNA sequences against large DNA banks | | | | Multithreading | CeCILL version 2 License | link | [37] | 2011 |
GEM | High-quality alignment engine (exhaustive mapping with substitutions and indels). More accurate and several times faster than BWA or Bowtie 1/2. Many standalone biological applications (mapper, split mapper, mappability, and other) provided. | Yes | Yes | Yes | Yes | Dual, freeware for noncommercial use; GEM source is currently unavailable | link | [38] | 2012 |
Genalice MAP | Ultra fast and comprehensive NGS read aligner with high precision and small storage footprint. | Yes | Low quality bases trimming | Yes | Yes | Proprietary, commercial | link | ||
Geneious Assembler | Fast, accurate overlap assembler with the ability to handle any combination of sequencing technology, read length, any pairing orientations, with any spacer size for the pairing, with or without a reference genome. | Yes | Proprietary, commercial | link | |||||
GensearchNGS | Complete framework with user-friendly GUI to analyse NGS data. It integrates a proprietary high quality alignment algorithm and plug-in ability to integrate various public aligner into a framework allowing to import short reads, align them, detect variants, and generate reports. It is made for resequencing projects, namely in a diagnostic setting. | Yes | No | Yes | Yes | Proprietary, commercial | link | ||
GMAP and GSNAP | Robust, fast short-read alignment. GMAP: longer reads, with multiple indels and splices (see entry above under Genomics analysis); GSNAP: shorter reads, with one indel or up to two splices per read. Useful for digital gene expression, SNP and indel genotyping. Developed by Thomas Wu at Genentech. Used by the National Center for Genome Resources (NCGR) in Alpheus. | Yes | Yes | Yes | Yes | Proprietary, freeware for academic and noncommercial use | link | ||
GNUMAP | Accurately performs gapped alignment of sequence data obtained from next-generation sequencing machines (specifically Solexa-Illumina) back to a genome of any size. Includes adaptor trimming, SNP calling and bisulfite sequence analysis. | | Yes, also supports Illumina *_int.txt and *_prb.txt files with all 4 quality scores for each base | | Multithreading and MPI-enabled | | link | [39] | 2009 |
HIVE-hexagon | Uses a hash table and bloom matrix to create and filter potential positions on the genome. For higher efficiency uses cross-similarity between short reads and avoids realigning non unique redundant sequences. It is faster than Bowtie and BWA and allows indels and divergent sensitive alignments on viruses, bacteria, and more conservative eukaryotic alignments. | Yes | Yes | Yes | Yes | Proprietary, freeware for academic and noncommercial users registered to HIVE deployment instance | link | [40] | 2014 |
IMOS | Improved Meta-aligner and Minimap2 On Spark. A long read distributed aligner on Apache Spark platform with linear scalability w.r.t. single node execution. | Yes | Yes | Yes | Free | github | |||
Isaac | Fully uses all the computing power available on one server node; thus, it scales well over a broad range of hardware architectures, and alignment performance improves with hardware abilities | Yes | Yes | Yes | Yes | Free, GPL | github | ||
LAST | Uses adaptive seeds and copes more efficiently with repeat-rich sequences (e.g. genomes). For example, it can align reads to genomes without repeat-masking and without becoming overwhelmed by repetitive hits. | Yes | Yes | Yes | No | Free, GPL | link | [41] | 2011 |
MAQ | Ungapped alignment that takes into account quality scores for each base. | Free, GPL | link | ||||||
mrFAST, mrsFAST | Gapped (mrFAST) and ungapped (mrsFAST) alignment software that implements cache obliviousness to minimize main/cache memory transfers. They are designed for the Illumina sequencing platform and they can return all possible map locations for improved structural variation discovery. | Yes | Yes, for structural variation | Yes | No | Free, BSD | |||
MOM | MOM or maximum oligonucleotide mapping is a query matching tool that captures a maximal length match within the short read. | Yes | link | ||||||
MOSAIK | Fast gapped aligner and reference-guided assembler. Aligns reads using a banded Smith-Waterman algorithm seeded by results from a k-mer hashing scheme. Supports reads ranging in size from very short to very long. | Yes | link | ||||||
MPscan | Fast aligner based on a filtration strategy (no indexing, use q-grams and Backward Nondeterministic DAWG Matching) | link | [42] | 2009 | |||||
Novoalign & NovoalignCS | Gapped alignment of single end and paired end Illumina GA I & II, ABI Colour space & ION Torrent reads. High sensitivity and specificity, using base qualities at all steps in the alignment. Includes adapter trimming, base quality calibration, Bi-Seq alignment, and options for reporting multiple alignments per read. Use of ambiguous IUPAC codes in reference for common SNPs can improve SNP recall and remove allelic bias. | Yes | Yes | Yes | Multi-threading and MPI versions available with paid license | Proprietary, freeware single threaded version for academic and noncommercial use | Novocraft | ||
NextGENe | Developed for use by biologists performing analysis of next generation sequencing data from Roche Genome Sequencer FLX, Illumina GA/HiSeq, Life Technologies Applied BioSystems’ SOLiD System, PacBio and Ion Torrent platforms. | Yes | Yes | Yes | Yes | Proprietary, commercial | Softgenetics | ||
NextGenMap | Flexible and fast read mapping program (twice as fast as BWA), achieves a mapping sensitivity comparable to Stampy. Internally uses a memory efficient index structure (hash table) to store positions of all 13-mers present in the reference genome. Mapping regions where pairwise alignments are required are dynamically determined for each read. Uses fast SIMD instructions (SSE) to accelerate alignment calculations on CPU. If available, alignments are computed on GPU (using OpenCL/CUDA) further reducing runtime 20-50%. | Yes | No | Yes | Yes, POSIX Threads, OpenCL/CUDA, SSE | Free | Official GitHub Page | [43] | 2013 |
Omixon Variant Toolkit | Includes highly sensitive and highly accurate tools for detecting SNPs and indels. It offers a solution to map NGS short reads with a moderate distance (up to 30% sequence divergence) from reference genomes. It poses no restrictions on the size of the reference, which, combined with its high sensitivity, makes the Variant Toolkit well-suited for targeted sequencing projects and diagnostics. | Yes | Yes | Yes | Yes | Proprietary, commercial | www.omixon.com | ||
PALMapper | Efficiently computes both spliced and unspliced alignments at high accuracy. Relying on a machine learning strategy combined with a fast mapping based on a banded Smith-Waterman-like algorithm, it aligns around 7 million reads per hour on one CPU. It refines the originally proposed QPALMA approach. | Yes | Free, GPL | link | |||||
Partek Flow | For use by biologists and bioinformaticians. It supports ungapped, gapped and splice-junction alignment from single and paired-end reads from Illumina, Life Technologies SOLiD, Roche 454 and Ion Torrent raw data (with or without quality information). It integrates powerful quality control on FASTQ/Qual level and on aligned data. Additional functionality includes trimming and filtering of raw reads, SNP and InDel detection, mRNA and microRNA quantification and fusion gene detection. | Yes | Yes | Yes | Multiprocessor-core, client-server installation possible | Proprietary, commercial, free trial version | [1] | ||
PASS | Indexes the genome, then extends seeds using pre-computed alignments of words. Works with base space, color space (SOLID), and can align genomic and spliced RNA-seq reads. | Yes | Yes | Yes | Yes | Proprietary, freeware for academic and noncommercial use | PASS_HOME | ||
PerM | Indexes the genome with periodic seeds to quickly find alignments with full sensitivity up to four mismatches. It can map Illumina and SOLiD reads. Unlike most mapping programs, speed increases for longer read lengths. | Yes | Free, GPL | link | [44] | ||||
PRIMEX | Indexes the genome with a k-mer lookup table with full sensitivity up to an adjustable number of mismatches. It is best for mapping 15-60 bp sequences to a genome. | No | No | Yes | No, multiple processes per search | link | [2] | 2003 | |
QPalma | Can use quality scores, intron lengths, and computational splice site predictions to perform an unbiased alignment. Can be trained to the specifics of an RNA-seq experiment and genome. Useful for splice site/intron discovery and for gene model building. (See PALMapper for a faster version.) | Yes, client-server | Free, GPL 2 | link | |||||
RazerS | No read length limit. Hamming or edit distance mapping with configurable error rates. Configurable and predictable sensitivity (runtime/sensitivity tradeoff). Supports paired-end read mapping. | Free, LGPL | link | ||||||
REAL, cREAL | REAL is an efficient, accurate, and sensitive tool for aligning short reads obtained from next-generation sequencing. The programme can handle an enormous amount of single-end reads generated by the next-generation Illumina/Solexa Genome Analyzer. cREAL is a simple extension of REAL for aligning short reads obtained from next-generation sequencing to a genome with circular structure. | Yes | Yes | Free, GPL | link | ||||
RMAP | Can map reads with or without error probability information (quality scores) and supports paired-end reads or bisulfite-treated read mapping. There are no limitations on read length or number of mismatches. | Yes | Yes | Yes | Free, GPL 3 | link | |||
rNA | A randomized Numerical Aligner for Accurate alignment of NGS reads | Yes | Low quality bases trimming | Yes | Multithreading and MPI-enabled | Free, GPL 3 | link | ||
RTG Investigator | Extremely fast, tolerant to high indel and substitution counts. Includes full read alignment. Product includes comprehensive pipelines for variant detection and metagenomic analysis with any combination of Illumina, Complete Genomics and Roche 454 data. | Yes | Yes, for variant calling | Yes | Yes | Proprietary, freeware for individual investigator use | link | ||
Segemehl | Can handle insertions, deletions, mismatches; uses enhanced suffix arrays | Yes | No | Yes | Yes | Proprietary, freeware for noncommercial use | link | [45] | 2009 |
SeqMap | Up to 5 mixed substitutions and insertions-deletions; various tuning options and input-output formats | Proprietary, freeware for academic and noncommercial use | link | ||||||
Shrec | Short read error correction with a suffix tree data structure | Yes, Java | link | ||||||
SHRiMP | Indexes the reference genome as of version 2. Uses masks to generate possible keys. Can map ABI SOLiD color space reads. | Yes | Yes | Yes | Yes, OpenMP | Free, BSD derivative | link | [46][47] | 2009-2011 |
SLIDER | Slider is an application for the Illumina Sequence Analyzer output that uses the 'probability' files instead of the sequence files as an input for alignment to a reference sequence or a set of reference sequences. | Yes | Yes | No | No | link | [48][49] | 2009-2010 | |
SOAP, SOAP2, SOAP3, SOAP3-dp | SOAP: robust with a small (1-3) number of gaps and mismatches. Speed improvement over BLAT, uses a 12 letter hash table. SOAP2: uses a bidirectional BWT to index the reference and is much faster than the first version. SOAP3: GPU-accelerated version that could find all 4-mismatch alignments in tens of seconds per one million reads. SOAP3-dp, also GPU accelerated, supports an arbitrary number of mismatches and gaps according to affine gap penalty scores. | Yes | No | Yes, SOAP3-dp | Yes, POSIX Threads; SOAP3, SOAP3-dp need GPU with CUDA support | Free, GPL | link | [50][51] | |
SOCS | For ABI SOLiD technologies. Significant increase in time to map reads with mismatches (or color errors). Uses an iterative version of the Rabin-Karp string search algorithm. | Yes | Free, GPL | link | |||||
SparkBWA | Integrates the Burrows-Wheeler Aligner—BWA on an Apache Spark framework running atop Hadoop. Version 0.2 of October 2016, supports the algorithms BWA-MEM, BWA-backtrack, and BWA-ALN. All of them work with single-reads and paired-end reads. | Yes | Low quality bases trimming | Yes | Yes | Free, GPL 3 | link | [52] | 2016 |
SSAHA, SSAHA2 | Fast for a small number of variants | Proprietary, freeware for academic and noncommercial use | link | ||||||
Stampy | For Illumina reads. High specificity, and sensitive for reads with indels, structural variants, or many SNPs. Slow, but speed increased dramatically by using BWA for first alignment pass. | Yes | Yes | Yes | No | Proprietary, freeware for academic and noncommercial use | link | [53] | 2010 |
SToRM | For Illumina or ABI SOLiD reads, with SAM native output. Highly sensitive for reads with many errors, indels (full from 0 to 15, extended support otherwise). Uses spaced seeds (single hit) and a very fast SSE-SSE2-AVX2-AVX-512 banded alignment filter. For fixed-length reads only, authors recommend SHRiMP2 otherwise. | No | Yes | Yes | Yes, OpenMP | Free | link | [54] | 2010 |
Subread, Subjunc | Superfast and accurate read aligners. Subread can be used to map both gDNA-seq and RNA-seq reads. Subjunc detects exon-exon junctions and maps RNA-seq reads. They employ a novel mapping paradigm named seed-and-vote. | Yes | Yes | Yes | Yes | Free, GPL 3 | |||
Taipan | De-novo assembler for Illumina reads | Proprietary, freeware for academic and noncommercial use | link | ||||||
UGENE | Visual interface both for Bowtie and BWA, and an embedded aligner | Yes | Yes | Yes | Yes | Free, GPL | link | ||
VelociMapper | FPGA-accelerated reference sequence alignment mapping tool from TimeLogic. Faster than Burrows-Wheeler transform-based algorithms like BWA and Bowtie. Supports up to 7 mismatches and/or indels with no performance penalty. Produces sensitive Smith-Waterman gapped alignments. | Yes | Yes | Yes | Yes | Proprietary, commercial | TimeLogic | ||
XpressAlign | FPGA based sliding window short read aligner which exploits the embarrassingly parallel property of short read alignment. Performance scales linearly with number of transistors on a chip (i.e. performance guaranteed to double with each iteration of Moore's Law without modification to algorithm). Low power consumption is useful for datacentre equipment. Predictable runtime. Better price/performance than software sliding window aligners on current hardware, but not better than software BWT-based aligners currently. Can manage large numbers (>2) of mismatches. Will find all hit positions for all seeds. Single-FPGA experimental version, needs work to develop it into a multi-FPGA production version. | Proprietary, freeware for academic and noncommercial use | link | ||||||
ZOOM | 100% sensitivity for reads between 15-240 bp with practical mismatches. Very fast. Supports insertions and deletions. Works with Illumina & SOLiD instruments, not 454. | Yes (GUI), no (CLI) | Proprietary, commercial | link | [55] |
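
Several entries in the table above (for example MOSAIK, NextGenMap, and PRIMEX) describe seeding alignments from a k-mer hash or lookup table built over the reference. The following is a minimal sketch of that general idea only; it is not the implementation of any listed tool. The function names, the toy reference string, and the choice of k = 4 are illustrative assumptions, and a real aligner would follow this seeding step with a gapped extension such as banded Smith-Waterman.

```python
from collections import defaultdict


def build_kmer_index(reference, k):
    """Map every k-mer in the reference to the positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def candidate_positions(read, index, k):
    """Count exact k-mer hits that agree on where the read would start."""
    votes = defaultdict(int)
    for offset in range(len(read) - k + 1):
        # .get() avoids creating empty entries for k-mers absent from the index
        for pos in index.get(read[offset:offset + k], []):
            start = pos - offset  # implied start of the read on the reference
            if start >= 0:
                votes[start] += 1
    return dict(votes)


# Toy data (hypothetical): the read is an exact substring of the reference.
reference = "ACGTACGTTAGCCGATAGGCTA"
index = build_kmer_index(reference, k=4)
hits = candidate_positions("TAGCCGAT", index, k=4)
best = max(hits, key=hits.get)  # start position supported by the most seeds
```

Positions with many agreeing seeds are the candidates a mapper would pass on to a full alignment step; positions with few votes can be discarded cheaply.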
See also
References
- ^Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (October 1990). 'Basic local alignment search tool'. Journal of Molecular Biology. 215 (3): 403–10. doi:10.1016/S0022-2836(05)80360-2. PMID2231712.
- ^Angermüller, C.; Biegert, A.; Söding, J. (Dec 2012). 'Discriminative modelling of context-specific amino acid substitution probabilities'. Bioinformatics. 28 (24): 3240–7. doi:10.1093/bioinformatics/bts622. PMID23080114.
- ^Buchfink, Xie and Huson (2015). 'Fast and sensitive protein alignment using DIAMOND'. Nature Methods. 12 (1): 59–60. doi:10.1038/nmeth.3176. PMID25402007.
- ^Durbin, Richard; Eddy, Sean R.; Krogh, Anders; Mitchison, Graeme, eds. (1998). Biological sequence analysis: probabilistic models of proteins and nucleic acids. Cambridge, UK: Cambridge University Press. ISBN978-0-521-62971-3.[page needed]
- ^Söding J (April 2005). 'Protein homology detection by HMM-HMM comparison'. Bioinformatics. 21 (7): 951–60. doi:10.1093/bioinformatics/bti125. PMID15531603.
- ^Remmert, Michael; Biegert, Andreas; Hauser, Andreas; Söding, Johannes (2011-12-25). 'HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment'. Nature Methods. 9 (2): 173–175. doi:10.1038/nmeth.1818. hdl:11858/00-001M-0000-0015-8D56-A. ISSN1548-7105. PMID22198341.
- ^Hauswedell H, Singer J, Reinert K (2014-09-01). 'Lambda: the local aligner for massive biological data'. Bioinformatics. 30 (17): 349–355. doi:10.1093/bioinformatics/btu439. PMC4147892. PMID25161219.
- ^Steinegger, Martin; Soeding, Johannes (2017-10-16). 'MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets'. Nature Biotechnology. 35 (11): 1026–1028. doi:10.1038/nbt.3988. hdl:11858/00-001M-0000-002E-1967-3. PMID29035372.
- ^Rucci, Enzo; Garcia, Carlos; Botella, Guillermo; Giusti, Armando E. De; Naiouf, Marcelo; Prieto-Matias, Manuel (2016-06-30). 'OSWALD: OpenCL Smith–Waterman on Altera's FPGA for Large Protein Databases'. International Journal of High Performance Computing Applications. 32 (3): 337–350. doi:10.1177/1094342016654215. ISSN1094-3420.
- ^Altschul SF, Madden TL, Schäffer AA, et al. (September 1997). 'Gapped BLAST and PSI-BLAST: a new generation of protein database search programs'. Nucleic Acids Research. 25 (17): 3389–402. doi:10.1093/nar/25.17.3389. PMC146917. PMID9254694.
- ^Li W, McWilliam H, Goujon M, et al. (June 2012). 'PSI-Search: iterative HOE-reduced profile SSEARCH searching'. Bioinformatics. 28 (12): 1650–1651. doi:10.1093/bioinformatics/bts240. PMC3371869. PMID22539666.
- ^Oehmen, C.; Nieplocha, J. (August 2006). 'ScalaBLAST: A scalable implementation of BLAST for high-performance ...'
- ^Hughey, R.; Karplus, K.; Krogh, A. (2003). SAM: sequence alignment and modeling software system. Technical report UCSC-CRL-99-11 (Report). University of California, Santa Cruz, CA.
- ^Rucci, Enzo; García, Carlos; Botella, Guillermo; De Giusti, Armando; Naiouf, Marcelo; Prieto-Matías, Manuel (2015-12-25). 'An energy-aware performance analysis of SWIMM: Smith–Waterman implementation on Intel's Multicore and Manycore architectures'. Concurrency and Computation: Practice and Experience. 27 (18): 5517–5537. doi:10.1002/cpe.3598. ISSN1532-0634.
- ^Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W (2003). 'Human-mouse alignments with BLASTZ'. Genome Research. 13 (1): 103–107. doi:10.1101/gr.809403. PMC430961. PMID12529312.
- ^Harris R S (2007). Improved pairwise alignment of genomic DNA (Thesis).
- ^Sandes, Edans F. de O.; de Melo, Alba Cristina M.A. (May 2013). 'Retrieving Smith-Waterman Alignments with Optimizations for Megabase Biological Sequences Using GPU'. IEEE Transactions on Parallel and Distributed Systems. 24 (5): 1009–1021. doi:10.1109/TPDS.2012.194.
- ^Sandes, Edans F. de O.; Miranda, G.; De Melo, A.C.M.A.; Martorell, X.; Ayguade, E. (May 2014). CUDAlign 3.0: Parallel Biological Sequence Comparison in Large GPU Clusters. Cluster, Cloud and Grid Computing (CCGrid), 2014 14th IEEE/ACM International Symposium on. p. 160. doi:10.1109/CCGrid.2014.18.
- ^Sandes, Edans F. de O.; Miranda, G.; De Melo, A.C.M.A.; Martorell, X.; Ayguade, E. (August 2014). Fine-grain Parallel Megabase Sequence Comparison with Multiple Heterogeneous GPUs. Proceedings of the 19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. pp. 383–384. doi:10.1145/2555243.2555280.
- ^Chivian, D; Baker, D (2006). 'Homology modeling using parametric alignment ensemble generation with consensus and energy-based model selection'. Nucleic Acids Research. 34 (17): e112. doi:10.1093/nar/gkl480. PMC1635247. PMID16971460.
- ^Girdea, M; Noe, L; Kucherov, G (January 2010). 'Back-translation for discovering distant protein homologies in the presence of frameshift mutations'. Algorithms for Molecular Biology. 5 (6): 6. doi:10.1186/1748-7188-5-6. PMC2821327. PMID20047662.
- ^Ma, B.; Tromp, J.; Li, M. (2002). 'PatternHunter: faster and more sensitive homology search'. Bioinformatics. 18 (3): 440–445. doi:10.1093/bioinformatics/18.3.440. PMID11934743.
- ^Li, M.; Ma, B.; Kisman, D.; Tromp, J. (2004). 'Patternhunter II: highly sensitive and fast homology search'. Journal of Bioinformatics and Computational Biology. 2 (3): 417–439. CiteSeerX10.1.1.1.2393. doi:10.1142/S0219720004000661. PMID15359419.
- ^Gusfield, Dan (1997). Algorithms on strings, trees and sequences. Cambridge university press. ISBN978-0-521-58519-4.
- ^Rasmussen K, Stoye J, Myers EW (2006). 'Efficient q-Gram Filters for Finding All epsilon-Matches over a Given Length'. Journal of Computational Biology. 13 (2): 296–308. CiteSeerX10.1.1.465.2084. doi:10.1089/cmb.2006.13.296. PMID16597241.
- ^Noe L, Kucherov G (2005). 'YASS: enhancing the sensitivity of DNA similarity search'. Nucleic Acids Research. 33 (suppl_2): W540–W543. doi:10.1093/nar/gki478. PMC1160238. PMID15980530.
- ^'Index of /admin/exe'.
- ^Wilton, Richard; Budavari, Tamas; Langmead, Ben; Wheelan, Sarah J.; Salzberg, Steven L.; Szalay, Alexander S. (2015). 'Arioc: high-throughput read alignment with GPU-accelerated exploration of the seed-and-extend search space'. PeerJ. 3: e808. doi:10.7717/peerj.808. PMC4358639. PMID25780763.
- ^Homer, Nils; Merriman, Barry; Nelson, Stanley F. (2009). 'BFAST: An Alignment Tool for Large Scale Genome Resequencing'. PLOS ONE. 4 (11): e7767. doi:10.1371/journal.pone.0007767. PMC2770639. PMID19907642.
- ^Abuín, J.M.; Pichel, J.C.; Pena, T.F.; Amigo, J. (2015). 'BigBWA: approaching the Burrows–Wheeler aligner to Big Data technologies'. Bioinformatics. 31 (24): 4003–5. doi:10.1093/bioinformatics/btv506. PMID26323715.
- ^Kent, W. J. (2002). 'BLAT---The BLAST-Like Alignment Tool'. Genome Research. 12 (4): 656–664. doi:10.1101/gr.229202. ISSN1088-9051. PMC187518. PMID11932250.
- ^Langmead, Ben; Trapnell, Cole; Pop, Mihai; Salzberg, Steven L (2009). 'Ultrafast and memory-efficient alignment of short DNA sequences to the human genome'. Genome Biology. 10 (3): R25. doi:10.1186/gb-2009-10-3-r25. ISSN1465-6906. PMC2690996. PMID19261174.
- ^Li, H.; Durbin, R. (2009). 'Fast and accurate short read alignment with Burrows-Wheeler transform'. Bioinformatics. 25 (14): 1754–1760. doi:10.1093/bioinformatics/btp324. ISSN1367-4803. PMC2705234. PMID19451168.
- ^Kerpedjiev, Peter; Frellsen, Jes; Lindgreen, Stinus; Krogh, Anders (2014). 'Adaptable probabilistic mapping of short reads using position specific scoring matrices'. BMC Bioinformatics. 15 (1): 100. doi:10.1186/1471-2105-15-100. ISSN1471-2105. PMC4021105. PMID24717095.
- ^Liu, Y.; Schmidt, B.; Maskell, D. L. (2012). 'CUSHAW: a CUDA compatible short read aligner to large genomes based on the Burrows-Wheeler transform'. Bioinformatics. 28 (14): 1830–1837. doi:10.1093/bioinformatics/bts276. ISSN1367-4803. PMID22576173.
- ^Liu, Y.; Schmidt, B. (2012). 'Long read alignment based on maximal exact match seeds'. Bioinformatics. 28 (18): i318–i324. doi:10.1093/bioinformatics/bts414. ISSN1367-4803. PMC3436841. PMID22962447.
- ^Rizk, Guillaume; Lavenier, Dominique (2010). 'GASSST: global alignment short sequence search tool'. Bioinformatics. 26 (20): 2534–2540. doi:10.1093/bioinformatics/btq485. PMC2951093. PMID20739310.
- ^Marco-Sola, Santiago; Sammeth, Michael; Guigó, Roderic; Ribeca, Paolo (2012). 'The GEM mapper: fast, accurate and versatile alignment by filtration'. Nature Methods. 9 (12): 1185–1188. doi:10.1038/nmeth.2221. ISSN1548-7091. PMID23103880.
- ^Clement, N. L.; Snell, Q.; Clement, M. J.; Hollenhorst, P. C.; Purwar, J.; Graves, B. J.; Cairns, B. R.; Johnson, W. E. (2009). 'The GNUMAP algorithm: unbiased probabilistic mapping of oligonucleotides from next-generation sequencing'. Bioinformatics. 26 (1): 38–45. doi:10.1093/bioinformatics/btp614. ISSN1367-4803. PMC6276904. PMID19861355.
- ^Santana-Quintero, Luis; Dingerdissen, Hayley; Thierry-Mieg, Jean; Mazumder, Raja; Simonyan, Vahan (2014). 'HIVE-Hexagon: High-Performance, Parallelized Sequence Alignment for Next-Generation Sequencing Data Analysis'. PLOS ONE. 9 (6): e99033. doi:10.1371/journal.pone.0099033. PMC4053384. PMID24918764.
- ^Kielbasa, S.M.; Wan, R.; Sato, K.; Horton, P.; Frith, M.C. (2011). 'Adaptive seeds tame genomic sequence comparison'. Genome Research. 21 (3): 487–493. doi:10.1101/gr.113985.110. PMC3044862. PMID21209072.
- ^Rivals, Eric; Salmela, Leena; Kiiskinen, Petteri; Kalsi, Petri; Tarhio, Jorma (2009). mpscan: Fast Localisation of Multiple Reads in Genomes. Algorithms in Bioinformatics. Lecture Notes in Computer Science. 5724. pp. 246–260. CiteSeerX10.1.1.156.928. doi:10.1007/978-3-642-04241-6_21. ISBN978-3-642-04240-9.
- ^Sedlazeck, Fritz J.; Rescheneder, Philipp; von Haeseler, Arndt (2013). 'NextGenMap: fast and accurate read mapping in highly polymorphic genomes'. Bioinformatics. 29 (21): 2790–2791. doi:10.1093/bioinformatics/btt468. PMID23975764.
- ^Chen, Yangho; Souaiaia, Tade; Chen, Ting (2009). 'PerM: efficient mapping of short sequencing reads with periodic full sensitive spaced seeds'. Bioinformatics. 25 (19): 2514–2521. doi:10.1093/bioinformatics/btp486. PMC2752623. PMID19675096.
- ^Searls, David B.; Hoffmann, Steve; Otto, Christian; Kurtz, Stefan; Sharma, Cynthia M.; Khaitovich, Philipp; Vogel, Jörg; Stadler, Peter F.; Hackermüller, Jörg (2009). 'Fast Mapping of Short Sequences with Mismatches, Insertions and Deletions Using Index Structures'. PLoS Computational Biology. 5 (9): e1000502. doi:10.1371/journal.pcbi.1000502. ISSN1553-7358. PMC2730575. PMID19750212.
- ^Rumble, Stephen M.; Lacroute, Phil; Dalca, Adrian V.; Fiume, Marc; Sidow, Arend; Brudno, Michael (2009). 'SHRiMP: Accurate Mapping of Short Color-space Reads'. PLOS Computational Biology. 5 (5): e1000386. doi:10.1371/journal.pcbi.1000386. PMC2678294. PMID19461883.
- ^David, Matei; Dzamba, Misko; Lister, Dan; Ilie, Lucian; Brudno, Michael (2011). 'SHRiMP2: Sensitive yet Practical Short Read Mapping'. Bioinformatics. 27 (7): 1011–1012. doi:10.1093/bioinformatics/btr046. PMID21278192.
- ^Malhis, Nawar; Butterfield, Yaron S. N.; Ester, Martin; Jones, Steven J. M. (2009). 'Slider – Maximum use of probability information for alignment of short sequence reads and SNP detection'. Bioinformatics. 25 (1): 6–13. doi:10.1093/bioinformatics/btn565. PMC2638935. PMID18974170.
- ^Malhis, Nawar; Jones, Steven J. M. (2010). 'High Quality SNP Calling Using Illumina Data at Shallow Coverage'. Bioinformatics. 26 (8): 1029–1035. doi:10.1093/bioinformatics/btq092. PMID20190250.
- ^Li, R.; Li, Y.; Kristiansen, K.; Wang, J. (2008). 'SOAP: short oligonucleotide alignment program'. Bioinformatics. 24 (5): 713–714. doi:10.1093/bioinformatics/btn025. ISSN1367-4803. PMID18227114.
- ^Li, R.; Yu, C.; Li, Y.; Lam, T.-W.; Yiu, S.-M.; Kristiansen, K.; Wang, J. (2009). 'SOAP2: an improved ultrafast tool for short read alignment'. Bioinformatics. 25 (15): 1966–1967. doi:10.1093/bioinformatics/btp336. ISSN1367-4803. PMID19497933.
- ^Abuín, José M.; Pichel, Juan C.; Pena, Tomás F.; Amigo, Jorge (2016-05-16). 'SparkBWA: Speeding Up the Alignment of High-Throughput DNA Sequencing Data'. PLOS ONE. 11 (5): e0155461. doi:10.1371/journal.pone.0155461. ISSN1932-6203. PMC4868289. PMID27182962.
- ^Lunter, G.; Goodson, M. (2010). 'Stampy: A statistical algorithm for sensitive and fast mapping of Illumina sequence reads'. Genome Research. 21 (6): 936–939. doi:10.1101/gr.111120.110. ISSN1088-9051. PMC3106326. PMID20980556.
- ^Noe, L.; Girdea, M.; Kucherov, G. (2010). 'Designing efficient spaced seeds for SOLiD read mapping'. Advances in Bioinformatics. 2010: 708501. doi:10.1155/2010/708501. PMC2945724. PMID20936175.
- ^Lin, H.; Zhang, Z.; Zhang, M.Q.; Ma, B.; Li, M. (2008). 'ZOOM! Zillions of oligos mapped'. Bioinformatics. 24 (21): 2431–2437. doi:10.1093/bioinformatics/btn416. PMC2732274. PMID18684737.
Good design is within everyone’s reach. Even though you may never have attended an art class, you can still create layouts that are compelling and easy to read. If you can memorize four easy principles, you’ve got what it takes to create an interesting and pleasing poster, brochure, party invite, business card, or any other composition. Last time we talked about proximity and the importance of using space to group related items together within a layout. In this second part of the Design Basics series, we’ll look at another design rule: alignment.
Alignment—Line it up
Alignment gives readers a hard edge for their eyes to follow when scanning or reading a piece. This edge forms an invisible line that connects items on a page. Robin Williams (author of The Non-Designer's Design Book, Peachpit Press) wisely notes that the stronger the alignment, the stronger, cleaner, and more dramatic your layout will be.
The basic alignments are left, center, and right; but which one do you use when? For large blocks of text, use left alignment because it’s the easiest to read (think newspapers, books, and magazines). Right alignment is more difficult to read so use it on smaller chunks of text. Centered alignment conveys a feeling of formality and elegance, so reserve it for graduation announcements and wedding invitations.
One alignment pitfall to avoid is wrapping text around an irregularly shaped object; very few designers can pull this off. Because the text edges become jagged and erratic, the piece becomes difficult to read.
Here’s how to find alignment tools in some popular programs:
Microsoft Word: In the Formatting toolbar (View -> Toolbars -> Formatting), Formatting palette (View -> Formatting Palette), and in the Paragraph dialog box (choose Format -> Paragraph).
TextEdit: In the toolbar at the top of an open document.
Tex-Edit Plus: In the Tools palette (Tools -> Show Tools) and in the Format menu (choose Format -> Justification).
Apple Pages and Keynote: In the Text Inspector (choose View -> Show Inspector and click the big T).
However, if you’re aligning text (or other objects) in a program that supports layers, such as Photoshop, Illustrator, or InDesign, there’s a bit more you need to know.
Aligning text on a single layer
To change the alignment on one Text layer in Photoshop, Illustrator, and InDesign, press T to grab the Type tool and make sure you’re on the layer you want to change. Double-click the layer thumbnail to select all the text, or click and drag to select a portion of text (you can have different alignments on a single text layer, provided the lines of text are separated by a return). Once the text is selected, click an alignment button in the Options bar (the same buttons live in the Paragraph panel).
Aligning text or objects on multiple layers
In Photoshop and InDesign, activate the Move tool and Shift- or Command-click to the right of each layer’s thumbnail (near the name) to select the layers you want to change. In the Options bar, you’ll see a slew of alignment tools that appear only when the Move tool is active and more than one layer is selected. Click the appropriate button and the selected layers will pop into place. In Illustrator, choose Window -> Align to summon the same set of tools.
As you can see from the above examples, proper alignment makes a huge difference in your layout. See you here next time for design secret number three: Repetition.
Lesa Snider, founder of GraphicReporter.com, is the chief evangelist of iStockphoto.com, author of Photoshop CS4: The Missing Manual (Pogue Press/O’Reilly, 2009), and several video training titles from both KelbyTraining.com and Lynda.com.