Detailed papers on LAST

LAST has many ingredients, some of which are described in these papers. If you find an ingredient useful, please cite the corresponding paper. Citation is important because it provides feedback on which research work was useful, and helps to justify the research to society.

  1. Adaptive seeds tame genomic sequence comparison. Kiełbasa SM, Wan R, Sato K, Horton P, Frith MC. Genome Res. 2011 21(3):487-93.

    This describes the main algorithms used by LAST.

  2. Incorporating sequence quality data into alignment improves DNA read mapping. Frith MC, Wan R, Horton P. Nucleic Acids Res. 2010 38(7):e100.

    How LAST uses sequence quality data.

  3. Parameters for accurate genome alignment. Frith MC, Hamada M, Horton P. BMC Bioinformatics. 2010 11:80.

    Choice of score parameters, ambiguity of alignment columns, and gamma-centroid alignment.

  4. A new repeat-masking method enables specific detection of homologous sequences. Frith MC. Nucleic Acids Res. 2011 39(4):e23.

    This describes the tantan algorithm for finding simple / low-complexity / tandem repeats, which reliably prevents non-homologous alignments, unlike other repeat finders.

  5. Gentle masking of low-complexity sequences improves homology search. Frith MC. PLoS One. 2011 6(12):e28819.

    This describes what LAST does with repeats after they have been found.

  6. Probabilistic alignments with quality scores: an application to short-read mapping toward accurate SNP/indel detection. Hamada M, Wijaya E, Frith MC, Asai K. Bioinformatics. 2011 27(22):3085-92.

    Describes probabilistic alignment using sequence quality data, and LAMA alignment.

  7. A mostly traditional approach improves alignment of bisulfite-converted DNA. Frith MC, Mori R, Asai K. Nucleic Acids Res. 2012 40(13):e100.

    This describes alignment of bisulfite-converted DNA, and an update for use of fastq quality data that allows for non-uniform base frequencies.

  8. An approximate Bayesian approach for mapping paired-end DNA reads to a reference genome. Shrestha AM, Frith MC. Bioinformatics. 2013 29(8):965-72.

    This describes the algorithm used by last-pair-probs.

  9. Improved search heuristics find 20,000 new alignments between human and mouse genomes. Frith MC, Noé L. Nucleic Acids Res. 2014 42(7):e59.

    This describes sensitive DNA seeding (MAM8 and MAM4).

  10. Frameshift alignment: statistics and post-genomic applications. Sheetlin SL, Park Y, Frith MC, Spouge JL. Bioinformatics. 2014 30(24):3575-82.

    Describes DNA-versus-protein alignment allowing for frameshifts.

  11. Split-alignment of genomes finds orthologies more accurately. Frith MC, Kawaguchi R. Genome Biology. 2015 16:106.

    Describes the split alignment algorithm, and its application to whole genome alignment.

  12. Training alignment parameters for arbitrary sequencers with LAST-TRAIN. Hamada M, Ono Y, Asai K, Frith MC. Bioinformatics. 2017 33(6):926-928.

    Describes last-train.

  13. A Simplified Description of Child Tables for Sequence Similarity Search. Frith MC, Shrestha A. IEEE/ACM Trans Comput Biol Bioinform. 2018.

    Describes how LAST uses child tables.

External methods

LAST of course owes its ideas to much previous research. Only implementations that are directly used in LAST are listed here.