This script makes a dotplot, a.k.a. Oxford Grid, of pair-wise sequence
alignments in MAF or LAST tabular format. It requires the `Python
Imaging Library <https://pillow.readthedocs.io/>`_ to be installed.
It can be used like this::

  last-dotplot my-alignments my-plot.png

The output can be in any format supported by the Imaging Library::

  last-dotplot alns alns.gif

To get a nicer font, try something like::

  last-dotplot -f /usr/share/fonts/liberation/LiberationSans-Regular.ttf alns alns.png

or::

  last-dotplot -f /Library/Fonts/Arial.ttf alns alns.png

If the fonts are located somewhere different on your computer, change
the path accordingly.

If there are too many sequences, the dotplot will be very cluttered,
or the script might give up with an error message. You can exclude
sequences with names like "chrUn_random522" like this::

  last-dotplot -1 'chr[!U]*' -2 'chr[!U]*' alns alns.png

Option "-1" selects sequences from the 1st (horizontal) genome, and
"-2" selects sequences from the 2nd (vertical) genome. 'chr[!U]*' is
a *pattern* that specifies names starting with "chr", followed by any
character except U, followed by anything.

==========  =============================
Pattern     Meaning
----------  -----------------------------
``*``       zero or more of any character
``?``       any single character
``[abc]``   any character in abc
``[!abc]``  any character not in abc
==========  =============================

If a sequence name has a dot (e.g. "hg19.chr7"), the pattern is
compared to both the whole name and the part after the dot.

You can specify more than one pattern, e.g. this gets sequences with
names consisting of "chr" followed by one or two characters::

  last-dotplot -1 'chr?' -1 'chr??' alns alns.png
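
The matching rules described above can be pictured with a short Python
sketch. It is an illustration only, not the script's own code: it assumes
the patterns behave like the shell-style wildcards of the standard
``fnmatch`` module, and that "the part after the dot" means everything
after the first dot. The helper name ``name_matches`` is hypothetical.

```python
from fnmatch import fnmatchcase


def name_matches(name, patterns):
    """Return True if a sequence name matches at least one pattern.

    Assumption: each pattern is tried against the whole name and, if the
    name contains a dot, also against the part after the first dot.
    """
    candidates = [name]
    if "." in name:
        candidates.append(name.split(".", 1)[1])
    return any(fnmatchcase(c, p) for c in candidates for p in patterns)


print(name_matches("chr7", ["chr[!U]*"]))             # True
print(name_matches("chrUn_random522", ["chr[!U]*"]))  # False
print(name_matches("hg19.chr7", ["chr?", "chr??"]))   # True
print(name_matches("chr10_alt", ["chr?", "chr??"]))   # False
```

Giving several patterns acts as a logical "or": a name is kept if any one
of them matches.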

You can also specify a sequence range; for example this gets the first
1000 bases of chr9::

  last-dotplot -1 chr9:0-1000 alns alns.png
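
Since ``chr9:0-1000`` selects the first 1000 bases, the range appears to
use a zero-based start and an exclusive end. The sketch below shows one
way such an argument could be split into its parts. The function name
``parse_seq_range`` and its return convention are assumptions made for
illustration, not taken from the script.

```python
def parse_seq_range(spec):
    """Split 'pattern:start-end' into (pattern, start, end).

    A plain pattern without a numeric range yields (pattern, None, None).
    """
    name, sep, coords = spec.rpartition(":")
    if sep:
        start, dash, end = coords.partition("-")
        if dash and start.isdigit() and end.isdigit():
            return name, int(start), int(end)
    return spec, None, None


pattern, start, end = parse_seq_range("chr9:0-1000")
print(pattern, end - start)  # chr9 1000
```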

Show a help message, with default option values, and exit.

Show progress messages & data about the plot.

-1 PATTERN, --seq1=PATTERN
    Which sequences to show from the 1st (horizontal) genome.
-2 PATTERN, --seq2=PATTERN
    Which sequences to show from the 2nd (vertical) genome.
-x WIDTH, --width=WIDTH
    Maximum width in pixels.
-y HEIGHT, --height=HEIGHT
    Maximum height in pixels.
-c COLOR, --forwardcolor=COLOR
    Color for forward alignments.
-r COLOR, --reversecolor=COLOR
    Color for reverse alignments.

Put the 1st genome's sequences left-to-right in order of: their
appearance in the input (0), their names (1), their lengths (2).

Put the 2nd genome's sequences top-to-bottom in order of: their
appearance in the input (0), their names (1), their lengths (2).

Trim unaligned sequence flanks from the 1st (horizontal) genome.

Trim unaligned sequence flanks from the 2nd (vertical) genome.

Number of pixels between sequences.

Color for pixels between sequences.

-f FILE, --fontfile=FILE
    TrueType or OpenType font file.
-s SIZE, --fontsize=SIZE
    TrueType or OpenType font size.

Text rotation for the 1st genome: h(orizontal) or v(ertical).

Text rotation for the 2nd genome: h(orizontal) or v(ertical).

Show sequence lengths for the 1st (horizontal) genome.

Show sequence lengths for the 2nd (vertical) genome.
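
The three sequence orderings described earlier (appearance in the input,
names, lengths) can be sketched as follows. This is a rough illustration
under stated assumptions: names are compared as plain strings, and
"lengths" is taken to mean longest first, which the text does not say.
The function ``order_sequences`` is hypothetical.

```python
def order_sequences(seqs, mode):
    """Order (name, length) tuples, given in input order, by mode 0, 1 or 2."""
    if mode == 1:
        # Assumption: plain string comparison of names.
        return sorted(seqs, key=lambda s: s[0])
    if mode == 2:
        # Assumption: longest sequence first.
        return sorted(seqs, key=lambda s: s[1], reverse=True)
    # Mode 0: keep the order of appearance in the input.
    return list(seqs)


seqs = [("chr2", 300), ("chr10", 100), ("chr1", 500)]
for mode in (0, 1, 2):
    print(mode, [name for name, length in order_sequences(seqs, mode)])
```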

These options read annotations of sequence segments, and draw them as
colored horizontal or vertical stripes. This looks good only if the
annotations are reasonably sparse: e.g. you can't sensibly view 20000
gene annotations in one small dotplot.

Read `BED-format
<https://genome.ucsc.edu/FAQ/FAQformat.html#format1>`_
annotations for the 1st genome. They are drawn as stripes, with
coordinates given by the first three BED fields. The color is
specified by the RGB field if present, else pale red if the
strand is "+", pale blue if "-", or pale purple otherwise.

Read BED-format annotations for the 2nd genome.

Read repeat annotations for the 1st genome, in RepeatMasker .out
or rmsk.txt format. The color is pale purple for "low
complexity" and "simple repeats", else pale red for "+" strand
and pale blue for "-" strand.

Read repeat annotations for the 2nd genome.
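
The default stripe colors described above for BED and repeat annotations
amount to two small decision rules, sketched here. The color names are
placeholders rather than the actual RGB values used by the script, the
function names are hypothetical, and treating only an ``r,g,b`` triple in
the 9th BED field as "RGB field present" is an assumption. The class
names ``Low_complexity`` and ``Simple_repeat`` are the spellings used in
RepeatMasker output.

```python
# Placeholder color names; the script's real RGB values are not shown here.
PALE_RED, PALE_BLUE, PALE_PURPLE = "pale red", "pale blue", "pale purple"


def bed_stripe_color(fields):
    """Choose a color for one BED record, given its split fields."""
    # An itemRgb value such as "255,128,0" (9th field) takes priority.
    if len(fields) > 8 and fields[8].count(",") == 2:
        return tuple(int(x) for x in fields[8].split(","))
    strand = fields[5] if len(fields) > 5 else ""
    if strand == "+":
        return PALE_RED
    if strand == "-":
        return PALE_BLUE
    return PALE_PURPLE


def repeat_stripe_color(repeat_class, strand):
    """Choose a color for one repeat annotation."""
    if repeat_class in ("Low_complexity", "Simple_repeat"):
        return PALE_PURPLE
    # Any strand other than "+" is treated as the reverse strand.
    return PALE_RED if strand == "+" else PALE_BLUE


print(bed_stripe_color(["chr1", "0", "100"]))
print(repeat_stripe_color("LINE", "+"))
```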

Read gene annotations for the 1st genome in `genePred format
<https://genome.ucsc.edu/FAQ/FAQformat.html#format9>`_.

Read gene annotations for the 2nd genome.

Color for protein-coding regions.

Unsequenced gap options
~~~~~~~~~~~~~~~~~~~~~~~

Note: these "gaps" are *not* alignment gaps (indels): they are regions
that have not been sequenced.

Read unsequenced gaps in the 1st genome from an agp or gap file.

Read unsequenced gaps in the 2nd genome from an agp or gap file.

--bridged-color=COLOR
    Color for bridged gaps.
--unbridged-color=COLOR
    Color for unbridged gaps.

An unsequenced gap will be shown only if it covers at least one whole
pixel.

Colors can be specified in `various ways described here
<http://effbot.org/imagingbook/imagecolor.htm>`_.