Description of scripts that accompany LAST
==========================================

last-dotplot.py
---------------

This script makes a dotplot, a.k.a. Oxford Grid, of alignments in LAST
tabular format. It requires the Python Imaging Library to be
installed. To get a usage message::

  last-dotplot.py --help

To make a png-format dotplot of alignments in a file called "al"::

  last-dotplot.py al al.png

To get a nicer font, try something like::

  last-dotplot.py -f /usr/share/fonts/truetype/freefont/FreeSans.ttf al al.png

If the fonts are located somewhere different on your computer, change
this as appropriate. To turn off the text and margins completely::

  last-dotplot.py -s0 al al.png

To limit the plot to 500x500 pixels::

  last-dotplot.py -x500 -y500 al al.png

If there are too many chromosomes, the dotplot will be very cluttered,
or the script might give up with an error message. So you may want to
remove alignments involving fragmentary chromosomes first. For
example, you could use "grep -v" to remove alignments involving
chromosomes with names like "chr1_random"::

  grep -v 'random' al > plotme
  last-dotplot.py plotme plotme.png

maf-convert.py
--------------

This script can convert MAF-format alignments to tabular format. This
is needed to feed MAF alignments to last-dotplot.py. Usage::

  maf-convert.py tab my-alignments.maf > my-alignments.tab

It can also convert MAF to AXT format, should you wish to do that::

  maf-convert.py axt my-alignments.maf > my-alignments.axt

last-reduce-alignments.sh
-------------------------

This script removes "uninteresting" alignments from LAST genome
comparisons in MAF format. Roughly speaking, it removes paralog
alignments and keeps ortholog alignments.

More precisely, if region A in genome 1 aligns with region B in genome
2, but if A also aligns more strongly with a different region X and B
aligns more strongly with a different region Y, then the alignment of
A with B is removed.
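
The removal rule just described can be sketched in a few lines of
Python. This is only an illustration, not the script's actual code. It
assumes that "aligns more strongly" means that another alignment with
a strictly higher score fully covers the same interval, and the names
``Aln``, ``is_dominated`` and ``reduce_alignments`` are hypothetical:

```python
from typing import List, NamedTuple


class Aln(NamedTuple):
    """One pairwise alignment (hypothetical, simplified record)."""
    score: int
    chrom1: str
    start1: int  # start in genome 1
    end1: int    # end in genome 1 (exclusive)
    chrom2: str
    start2: int  # start in genome 2
    end2: int    # end in genome 2 (exclusive)


def is_dominated(a: Aln, alns: List[Aln], genome: int) -> bool:
    """True if a higher-scoring alignment covers a's interval in this genome.

    ASSUMPTION: full coverage by a strictly higher score stands in for
    "aligns more strongly"; the real script may use a finer criterion.
    """
    for b in alns:
        if b is a or b.score <= a.score:
            continue
        if genome == 1:
            same_chrom = b.chrom1 == a.chrom1
            covered = b.start1 <= a.start1 and a.end1 <= b.end1
        else:
            same_chrom = b.chrom2 == a.chrom2
            covered = b.start2 <= a.start2 and a.end2 <= b.end2
        if same_chrom and covered:
            return True
    return False


def reduce_alignments(alns: List[Aln]) -> List[Aln]:
    """Drop an alignment only if it is dominated in BOTH genomes."""
    return [a for a in alns
            if not (is_dominated(a, alns, 1) and is_dominated(a, alns, 2))]
```

The quadratic loop is chosen for clarity; a real implementation would
work on sorted input instead of comparing every pair.
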
This procedure is conservative: it is unlikely to remove one-to-one
orthologs, but it may keep some paralogs, e.g. if the ortholog is
(wholly or partially) deleted in one genome. The usage is simple::

  last-reduce-alignments.sh my-alignments.maf > reduced-alignments.maf

There is also an option to remove alignments more aggressively: if A
aligns more strongly with X *or* B aligns more strongly with Y, then
the alignment of A with B is removed. This is still unlikely to
remove one-to-one orthologs, but it may cause some regions that are
alignable to something to be aligned to nothing. This option is
selected with "-d"::

  last-reduce-alignments.sh -d my-alignments.maf > reduced-alignments.maf

maf-join.py
-----------

This script joins two or more sets of pairwise (or multiple)
alignments into multiple alignments::

  maf-join.py aln1.maf aln2.maf aln3.maf > joined.maf

The top genome in each input file should be the same, and the script
simply joins alignment columns that are at the same position in the
top genome. IMPORTANT LIMITATION: alignment columns with gaps in the
top sequence get joined arbitrarily, and probably wrongly. Please
disregard such columns in downstream analyses. Each input file must
have been sorted using maf-sort.sh (but the output of
last-reduce-alignments.sh is already in the right order). For an
example of using LAST and maf-join.py, see multiMito.sh in the
examples directory.

maf-swap.py
-----------

This script changes the order of the sequences in MAF-format
alignments. You can use option "-n" to move the "n"th sequence to the
top (it defaults to 2)::

  maf-swap.py -n3 my-alignments.maf > my-swapped.maf

maf-sort.sh
-----------

This sorts MAF-format alignments by sequence name, then start
position, then end position, of the top sequence.

last-remove-dominated.py
------------------------

This script is used by last-reduce-alignments.sh.
It reads sorted alignments, and removes alignments of A in genome 1
with B in genome 2 if A also aligns more strongly to a different
region X.

maf2html.py
-----------

This script converts MAF-format alignments to a human-friendly HTML
format::

  maf2html.py multiMito.maf > multiMito.html

You can change the number of letters per line using the "-l" option::

  maf2html.py -l50 multiMito.maf > multiMito.html

Each alignment column gets coloured according to its probability,
given by MAF lines starting with 'p'. (To get MAF lines starting with
'p', run lastal with option -j4 or -j5.) If an alignment has multiple
'p' lines (e.g. after using maf-join.py), then the column-wise
products are used.

last-map-probs.py
-----------------

This script calculates "mapping probabilities" for alignments in LAST
tabular format. If one query sequence participates in more than one
alignment, the script ASSUMES that ONE of these alignments is the true
mapping. The script estimates a probability of each alignment being
the true mapping (based on their scores). It writes the alignments
plus an extra column with the probabilities::

  last-map-probs.py my-alignments.tab > my-alignments-with-probs

This script does not indicate whether alignments are significantly
similar: for that, you need lastex. Furthermore, if you feed this
script weak alignments that might exist just by chance, then its
assumption is unlikely to be valid.

Limitations
-----------

1) last-reduce-alignments.sh and last-remove-dominated.py do not work
   with centroid alignments.

2) The scripts that read MAF format work with the simple subset of MAF
   produced by lastal, but they don't necessarily work with more
   complex MAF data from elsewhere.
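
To make "the simple subset of MAF" in limitation 2 more concrete, here
is a minimal Python sketch of a reader for it. It is not the code used
by the scripts. It assumes that each alignment block starts with an
'a' line and contains 's' lines with the standard MAF fields (name,
start, size, strand, sequence size, aligned text), and it ignores every
other line type, including 'p' lines. The names ``SeqLine``,
``read_maf_blocks`` and ``top_sequence_key`` are hypothetical. The
sort key mirrors the ordering described for maf-sort.sh (sequence
name, then start, then end, of the top sequence):

```python
from typing import Iterable, Iterator, List, NamedTuple, Tuple


class SeqLine(NamedTuple):
    """Fields of one MAF 's' line."""
    name: str
    start: int
    size: int
    strand: str
    seq_size: int
    text: str


def read_maf_blocks(lines: Iterable[str]) -> Iterator[List[SeqLine]]:
    """Yield each alignment block as a list of its 's' lines."""
    block: List[SeqLine] = []
    for line in lines:
        fields = line.split()
        if not fields or fields[0] == "a":
            # A blank line or a new 'a' line closes the current block.
            if block:
                yield block
            block = []
        elif fields[0] == "s":
            block.append(SeqLine(fields[1], int(fields[2]), int(fields[3]),
                                 fields[4], int(fields[5]), fields[6]))
        # Comment lines and other line types are skipped in this sketch.
    if block:
        yield block


def top_sequence_key(block: List[SeqLine]) -> Tuple[str, int, int]:
    """Sort key: name, start and end of the block's top sequence."""
    top = block[0]
    return (top.name, top.start, top.start + top.size)
```

For example, ``sorted(read_maf_blocks(open("my-alignments.maf")),
key=top_sequence_key)`` would order the blocks by their top sequence.
MAF files from other sources may contain further line types or
conventions that this sketch does not handle.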