Detailed papers on LAST
=======================

LAST has many ingredients, some of which are described in these
papers. If you find an ingredient useful, please cite the
corresponding paper. Citation is important because it provides
feedback on which research work was useful, and helps to justify the

1. `Adaptive seeds tame genomic sequence comparison`__. Kiełbasa SM,
   Wan R, Sato K, Horton P, Frith MC. Genome Res. 2011 21(3):487-93.

   __ http://genome.cshlp.org/content/21/3/487.long

   This describes the main algorithms used by LAST.

2. `Incorporating sequence quality data into alignment improves DNA
   read mapping`__. Frith MC, Wan R, Horton P. Nucleic Acids

   __ http://nar.oxfordjournals.org/content/38/7/e100.long

   How LAST uses sequence quality data.

3. `Parameters for Accurate Genome Alignment`__. Frith MC, Hamada M,
   Horton P. BMC Bioinformatics. 2010 11:80.

   __ http://www.biomedcentral.com/1471-2105/11/80

   Choice of score parameters, ambiguity of alignment columns, and
   gamma-centroid alignment.

4. `A new repeat-masking method enables specific detection of
   homologous sequences`__. Frith MC. Nucleic Acids Res. 2011

   __ http://nar.oxfordjournals.org/content/39/4/e23.long

   This describes the tantan algorithm for finding simple /
   low-complexity / tandem repeats, which reliably prevents
   non-homologous alignments, unlike other repeat finders.

5. `Gentle masking of low-complexity sequences improves homology
   search`__. Frith MC. PLoS One. 2011 6(12):e28819.

   __ http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0028819

   This describes what LAST does with repeats after they have been

6. `Probabilistic alignments with quality scores: an application to
   short-read mapping toward accurate SNP/indel detection`__. Hamada
   M, Wijaya E, Frith MC, Asai K. Bioinformatics. 2011

   __ http://bioinformatics.oxfordjournals.org/content/27/22/3085.long

   Describes probabilistic alignment using sequence quality data, and

7. `A mostly traditional approach improves alignment of
   bisulfite-converted DNA`__. Frith MC, Mori R, Asai K. Nucleic
   Acids Res. 2012 40(13):e100.

   __ http://nar.oxfordjournals.org/content/40/13/e100.long

   This describes alignment of bisulfite-converted DNA, and an update
   for use of fastq quality data that allows for non-uniform base

8. `An approximate Bayesian approach for mapping paired-end DNA reads
   to a reference genome`__. Shrestha AM, Frith MC.
   Bioinformatics. 2013 29(8):965-72.

   __ http://bioinformatics.oxfordjournals.org/content/29/8/965.long

   This describes the algorithm used by last-pair-probs.

9. `Improved search heuristics find 20,000 new alignments between
   human and mouse genomes`__. Frith MC, Noé L. Nucleic Acids

   __ http://nar.oxfordjournals.org/content/42/7/e59.long

   This describes sensitive DNA seeding (MAM8 and MAM4).

10. `Frameshift alignment: statistics and post-genomic
    applications`__. Sheetlin SL, Park Y, Frith MC, Spouge JL.
    Bioinformatics. 2014 30(24):3575-82.

    __ http://bioinformatics.oxfordjournals.org/content/30/24/3575.long

    Describes DNA-versus-protein alignment allowing for frameshifts.

11. `Split-alignment of genomes finds orthologies more accurately`__.
    Frith MC, Kawaguchi R. Genome Biology. 2015 16:106.

    __ http://www.genomebiology.com/content/16/1/106

    Describes the split alignment algorithm, and its application to
    whole genome alignment.

12. `Training alignment parameters for arbitrary sequencers with
    LAST-TRAIN`__. Hamada M, Ono Y, Asai K, Frith MC.

    __ https://academic.oup.com/bioinformatics/article-lookup/doi/10.1093/bioinformatics/btw742

    Describes last-train.

LAST of course owes its ideas to much previous research. Only
implementations that are directly used in LAST are listed here.

* `The Gumbel pre-factor k for gapped local alignment can be estimated
  from simulations of global alignment`__. Sheetlin S, Park Y, Spouge
  JL. Nucleic Acids Res. 2005 33(15):4987-94.

  __ http://nar.oxfordjournals.org/content/33/15/4987.long

  Describes how E-values are calculated.

* `New finite-size correction for local alignment score
  distributions`__. Park Y, Sheetlin S, Ma N, Madden TL, Spouge JL.
  BMC Res Notes. 2012 5:286.

  __ http://www.biomedcentral.com/1756-0500/5/286

  Describes a correction that makes the E-values more accurate for

* `GNU Parallel - The Command-Line Power Tool`__. Tange O. ;login:
  The USENIX Magazine. 2011:42-47.

  __ https://www.usenix.org/publications/login/february-2011-volume-36-number-1/gnu-parallel-command-line-power-tool

  It seems traditional not to cite this kind of ingredient, which is
  unfortunate because the same reasons for citation apply.