doc/last-dotplot.txt
author Martin C. Frith
Tue Apr 04 11:51:15 2017 +0900 (2017-04-04)
Added last-dotplot options to show BED features.
last-dotplot
============

This script makes a dotplot, a.k.a. Oxford Grid, of pair-wise sequence
alignments in MAF or LAST tabular format.  It requires the Python
Imaging Library to be installed.  It can be used like this::

  last-dotplot my-alignments my-plot.png

The output can be in any format supported by the Imaging Library::

  last-dotplot alns alns.gif

To get a nicer font, try something like::

  last-dotplot -f /usr/share/fonts/truetype/freefont/FreeSans.ttf alns alns.png

If the fonts are located somewhere different on your computer, change
this as appropriate.

Choosing sequences
------------------

If there are too many sequences, the dotplot will be very cluttered,
or the script might give up with an error message.  You can exclude
sequences with names like "chrUn_random522" like this::

  last-dotplot -1 'chr[!U]*' -2 'chr[!U]*' alns alns.png

Option "-1" selects sequences from the 1st genome, and "-2" selects
sequences from the 2nd genome.  'chr[!U]*' is a *pattern* that
specifies names starting with "chr", followed by any character except
U, followed by anything.

==========  =============================
Pattern     Meaning
----------  -----------------------------
``*``       zero or more of any character
``?``       any single character
``[abc]``   any character in abc
``[!abc]``  any character not in abc
==========  =============================
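
The wildcards in this table have the same meaning as in Python's
standard ``fnmatch`` module, so you can try out a pattern before
running the script.  The following is only a sketch: it assumes that
matching is case-sensitive (as with ``fnmatch.fnmatchcase``), the
sequence names are made up, and ``select`` is a hypothetical helper,
not part of last-dotplot.

```python
from fnmatch import fnmatchcase

# Made-up sequence names, for illustration only.
names = ["chr1", "chr10", "chrX", "chrUn_random522", "scaffold7"]

def select(names, pattern):
    """Return the names that match a shell-style wildcard pattern."""
    return [n for n in names if fnmatchcase(n, pattern)]

# "chr", then any character except U, then anything.
print(select(names, "chr[!U]*"))

# "chr" followed by exactly one character.
print(select(names, "chr?"))
```

Quote the pattern on the command line, as in the example above, so
that the shell does not expand it before the script sees it.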

If a sequence name has a dot (e.g. "hg19.chr7"), the pattern is
compared to both the whole name and the part after the dot.
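
This rule can be sketched as follows.  The function ``name_matches``
is hypothetical, and the sketch assumes that "the part after the dot"
means everything after the first dot, which the text above does not
specify for names with several dots.

```python
from fnmatch import fnmatchcase

def name_matches(name, pattern):
    """True if the pattern matches the whole name or the part after a dot."""
    if fnmatchcase(name, pattern):
        return True
    # Assumption: split at the first dot only.
    _, dot, rest = name.partition(".")
    return bool(dot) and fnmatchcase(rest, pattern)

print(name_matches("hg19.chr7", "chr?"))    # matches the part after the dot
print(name_matches("hg19.chr7", "hg19.*"))  # matches the whole name
```

So a pattern such as 'chr?' works whether or not the names carry a
genome prefix.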

You can specify more than one pattern, e.g. this gets sequences with
names starting with "chr" followed by one or two characters::

  last-dotplot -1 'chr?' -1 'chr??' alns alns.png

You can also specify a sequence range; for example this gets the first
1000 bases of chr9::

  last-dotplot -1 chr9:0-1000 alns alns.png
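
Since 0-1000 selects the first 1000 bases, the range appears to use a
zero-based start and an exclusive end, so its length is end minus
start.  A minimal sketch of splitting such an argument, where
``parse_selector`` is a hypothetical helper and not the script's own
code:

```python
def parse_selector(text):
    """Split 'chr9:0-1000' into ('chr9', 0, 1000).

    A bare pattern with no range gives (pattern, None, None).
    """
    name, colon, span = text.rpartition(":")
    if colon:
        start, dash, end = span.partition("-")
        if dash and start.isdigit() and end.isdigit():
            return name, int(start), int(end)
    return text, None, None

name, start, end = parse_selector("chr9:0-1000")
print(name, start, end, end - start)
```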

Options
-------

  -h, --help
      Show a help message, with default option values, and exit.
  -1 PATTERN, --seq1=PATTERN
      Which sequences to show from the 1st genome.
  -2 PATTERN, --seq2=PATTERN
      Which sequences to show from the 2nd genome.
  -x WIDTH, --width=WIDTH
      Maximum width in pixels.
  -y HEIGHT, --height=HEIGHT
      Maximum height in pixels.
  -f FILE, --fontfile=FILE
      TrueType or OpenType font file.
  -s SIZE, --fontsize=SIZE
      TrueType or OpenType font size.
  -c COLOR, --forwardcolor=COLOR
      Color for forward alignments.
  -r COLOR, --reversecolor=COLOR
      Color for reverse alignments.
  --trim1
      Trim unaligned sequence flanks from the 1st genome.
  --trim2
      Trim unaligned sequence flanks from the 2nd genome.
  --bed1=FILE
      Read `BED-format
      <https://genome.ucsc.edu/FAQ/FAQformat.html#format1>`_
      annotations for the 1st genome.  They are drawn as rectangles,
      with coordinates given by the first three BED fields.  The color
      is specified by the RGB field if present, else pale red if the
      strand is "+", pale blue if "-", or pale purple otherwise.
  --bed2=FILE
      Read BED-format annotations for the 2nd genome.
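
The color rule for BED annotations can be sketched like this.  In BED
format the strand is the 6th field and the RGB value (itemRgb) is the
9th.  The function ``bed_rectangle`` is hypothetical, the color names
stand in for whatever actual values the script uses, and treating an
RGB field that is not an "R,G,B" triple (such as "0") as absent is an
assumption.

```python
def bed_rectangle(line):
    """Return (chrom, start, end, color) for one BED line."""
    fields = line.split()
    chrom, start, end = fields[0], int(fields[1]), int(fields[2])
    strand = fields[5] if len(fields) > 5 else "."
    rgb = fields[8] if len(fields) > 8 else ""
    if rgb.count(",") == 2:
        # Assumption: only an "R,G,B" triple counts as a present RGB field.
        color = tuple(int(x) for x in rgb.split(","))
    elif strand == "+":
        color = "pale red"
    elif strand == "-":
        color = "pale blue"
    else:
        color = "pale purple"
    return chrom, start, end, color

print(bed_rectangle("chr1 100 200 geneA 0 +"))
print(bed_rectangle("chr1 100 200 geneB 0 - 100 200 255,0,0"))
print(bed_rectangle("chr1 100 200"))
```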

Unsequenced gap options
~~~~~~~~~~~~~~~~~~~~~~~

Note: these "gaps" are *not* alignment gaps (indels): they are regions
of unknown sequence.

  --gap1=FILE
      Read unsequenced gaps in the 1st genome from an agp or gap file.
  --gap2=FILE
      Read unsequenced gaps in the 2nd genome from an agp or gap file.
  --bridged-color=COLOR
      Color for bridged gaps.
  --unbridged-color=COLOR
      Color for unbridged gaps.

An unsequenced gap will be shown only if it covers at least one whole
pixel.
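
To estimate whether a gap will be visible at a given plot size, the
rule above can be sketched as below.  This is an illustration under
stated assumptions, not the script's own code: it assumes each pixel
represents a whole number of bases (``bases_per_pixel``), that pixel
number i spans bases i * bases_per_pixel up to but not including
(i + 1) * bases_per_pixel, and that gap coordinates are zero-based
with an exclusive end.  The function name is hypothetical.

```python
def covers_whole_pixel(gap_start, gap_end, bases_per_pixel):
    """True if [gap_start, gap_end) contains at least one complete pixel."""
    # Index of the first pixel that begins at or after the gap start
    # (ceiling division on integers).
    first = -(-gap_start // bases_per_pixel)
    # That pixel ends at (first + 1) * bases_per_pixel.
    return (first + 1) * bases_per_pixel <= gap_end

print(covers_whole_pixel(1000, 2000, 1000))  # exactly one pixel
print(covers_whole_pixel(500, 1500, 1000))   # straddles two pixels
```

Under these assumptions a gap shorter than one pixel is never shown,
and a gap shorter than two pixels is shown only if it happens to line
up with a pixel boundary.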