
                                  diffseq 



Function

   Find differences between nearly identical sequences

Description

   diffseq takes two overlapping, nearly identical sequences and reports
   the differences between them, together with any features that overlap
   with these regions. GFF files of the differences in each sequence are
   also produced.

   diffseq finds the region of overlap of the input sequences and then
   reports differences within this region, like a local alignment.

   The start and end positions of the overlap are reported.

   diffseq should be of value when looking for SNPs, differences between
   strains of an organism and anything else that requires the differences
   between sequences to be highlighted.

   The sequences can be very long. The program finds all exact matches of
   sequence words of a given size (10 by default). It then reduces these
   to a minimal set of non-overlapping matches: it sorts the matches by
   size, largest first, and for each match removes any smaller matches
   that overlap it. The result is a set of the longest ungapped
   alignments between the two sequences that do not overlap with each
   other. The mismatched regions between these matches are reported.
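
   The matching procedure described above can be sketched in a few lines
   of Python. This is only an illustration of the idea, not the EMBOSS
   implementation: the function names (word_matches,
   select_non_overlapping, differences) are hypothetical, the index is a
   plain dictionary, and the sketch assumes the retained matches occur in
   the same order in both sequences.

```python
from collections import defaultdict


def word_matches(a, b, wordsize=10):
    """Find maximal ungapped exact matches as (start_a, start_b, length)."""
    index = defaultdict(list)
    for i in range(len(a) - wordsize + 1):
        index[a[i:i + wordsize]].append(i)
    matches = []
    for j in range(len(b) - wordsize + 1):
        for i in index.get(b[j:j + wordsize], ()):
            # Skip hits that are just the interior of a longer match.
            if i > 0 and j > 0 and a[i - 1] == b[j - 1]:
                continue
            length = wordsize
            while (i + length < len(a) and j + length < len(b)
                   and a[i + length] == b[j + length]):
                length += 1
            matches.append((i, j, length))
    return matches


def _overlaps(start1, len1, start2, len2):
    return start1 < start2 + len2 and start2 < start1 + len1


def select_non_overlapping(matches):
    """Keep the longest matches first; drop any that overlap a kept one."""
    kept = []
    for i, j, n in sorted(matches, key=lambda m: -m[2]):
        if not any(_overlaps(i, n, ki, kn) or _overlaps(j, n, kj, kn)
                   for ki, kj, kn in kept):
            kept.append((i, j, n))
    return sorted(kept)


def differences(a, b, wordsize=10):
    """Return the mismatched regions lying between consecutive matches."""
    kept = select_non_overlapping(word_matches(a, b, wordsize))
    return [(a[i1 + n1:i2], b[j1 + n1:j2])
            for (i1, j1, n1), (i2, j2, _) in zip(kept, kept[1:])]
```

   For example, differences("aaaaccccggggtttt", "aaaaccccagggtttt", 4)
   yields a single region in which "g" in the first sequence corresponds
   to "a" in the second.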

   It should be possible to find differences between sequences that are
   megabases long.

Usage

   Here is a sample session with diffseq


% diffseq tembl:ap000504 tembl:af129756 
Find differences between nearly identical sequences
Word size [10]: 
Output report [ap000504.diffseq]: 
Output features [AP000504.diffgff]: 
Second output features [AF129756.diffgff]: 


Command line arguments

   Standard (Mandatory) qualifiers:
  [-asequence]         sequence   Sequence USA
  [-bsequence]         sequence   Sequence USA
   -wordsize           integer    The similar regions between the two
                                  sequences are found by creating a hash table
                                  of 'wordsize'd subsequences. 10 is a
                                  reasonable default. Making this value larger
                                  (20?) may speed up the program slightly,
                                  but will mean that any two differences
                                  within 'wordsize' of each other will be
                                  grouped as a single region of difference.
                                  This value may be made smaller (4?) to
                                  improve the resolution of nearby
                                  differences, but the program will go much
                                  slower.
  [-outfile]           report     Output report file name
  [-aoutfeat]          featout    File for output of first sequence's features
  [-boutfeat]          featout    File for output of second sequence's
                                  features

   Additional (Optional) qualifiers:
   -globaldifferences  boolean    Normally this program will find regions of
                                  identity that are the length of the
                                  specified word-size or greater and will then
                                  report the regions of difference between
                                  these matching regions. This works well and
                                  is what most people want if they are working
                                  with long overlapping nucleic acid
                                  sequences. You are usually not interested in
                                  the non-overlapping ends of these
                                  sequences. If you have protein sequences or
                                  short RNA sequences however, you will be
                                  interested in differences at the very ends .
                                  It this option is set to be true then the
                                  differences at the ends will also be
                                  reported.

   Advanced (Unprompted) qualifiers: (none)
   Associated qualifiers:

   "-asequence" associated qualifiers
   -sbegin1            integer    Start of the sequence to be used
   -send1              integer    End of the sequence to be used
   -sreverse1          boolean    Reverse (if DNA)
   -sask1              boolean    Ask for begin/end/reverse
   -snucleotide1       boolean    Sequence is nucleotide
   -sprotein1          boolean    Sequence is protein
   -slower1            boolean    Make lower case
   -supper1            boolean    Make upper case
   -sformat1           string     Input sequence format
   -sdbname1           string     Database name
   -sid1               string     Entryname
   -ufo1               string     UFO features
   -fformat1           string     Features format
   -fopenfile1         string     Features file name

   "-bsequence" associated qualifiers
   -sbegin2            integer    Start of the sequence to be used
   -send2              integer    End of the sequence to be used
   -sreverse2          boolean    Reverse (if DNA)
   -sask2              boolean    Ask for begin/end/reverse
   -snucleotide2       boolean    Sequence is nucleotide
   -sprotein2          boolean    Sequence is protein
   -slower2            boolean    Make lower case
   -supper2            boolean    Make upper case
   -sformat2           string     Input sequence format
   -sdbname2           string     Database name
   -sid2               string     Entryname
   -ufo2               string     UFO features
   -fformat2           string     Features format
   -fopenfile2         string     Features file name

   "-outfile" associated qualifiers
   -rformat3           string     Report format
   -rname3             string     Base file name
   -rextension3        string     File name extension
   -rdirectory3        string     Output directory
   -raccshow3          boolean    Show accession number in the report
   -rdesshow3          boolean    Show description in the report
   -rscoreshow3        boolean    Show the score in the report
   -rusashow3          boolean    Show the full USA in the report

   "-aoutfeat" associated qualifiers
   -offormat4          string     Output feature format
   -ofopenfile4        string     Features file name
   -ofextension4       string     File name extension
   -ofdirectory4       string     Output directory
   -ofname4            string     Base file name
   -ofsingle4          boolean    Separate file for each entry

   "-boutfeat" associated qualifiers
   -offormat5          string     Output feature format
   -ofopenfile5        string     Features file name
   -ofextension5       string     File name extension
   -ofdirectory5       string     Output directory
   -ofname5            string     Base file name
   -ofsingle5          boolean    Separate file for each entry

   General qualifiers:
   -auto               boolean    Turn off prompts
   -stdout             boolean    Write standard output
   -filter             boolean    Read standard input, write standard output
   -options            boolean    Prompt for standard and additional values
   -debug              boolean    Write debug output to program.dbg
   -verbose            boolean    Report some/full command line options
   -help               boolean    Report command line options. More
                                  information on associated and general
                                  qualifiers can be found with -help -verbose
   -warning            boolean    Report warnings
   -error              boolean    Report errors
   -fatal              boolean    Report fatal errors
   -die                boolean    Report deaths


   Standard (Mandatory) qualifiers

   [-asequence] (Parameter 1)
      Description:    Sequence USA
      Allowed values: Readable sequence
      Default:        Required

   [-bsequence] (Parameter 2)
      Description:    Sequence USA
      Allowed values: Readable sequence
      Default:        Required

   -wordsize
      Description:    The similar regions between the two sequences are
                      found by creating a hash table of 'wordsize'd
                      subsequences. 10 is a reasonable default. Making
                      this value larger (20?) may speed up the program
                      slightly, but will mean that any two differences
                      within 'wordsize' of each other will be grouped as
                      a single region of difference. This value may be
                      made smaller (4?) to improve the resolution of
                      nearby differences, but the program will go much
                      slower.
      Allowed values: Integer 2 or more
      Default:        10

   [-outfile] (Parameter 3)
      Description:    Output report file name
      Allowed values: Report output file

   [-aoutfeat] (Parameter 4)
      Description:    File for output of first sequence's features
      Allowed values: Writeable feature table
      Default:        $(asequence.name).diffgff

   [-boutfeat] (Parameter 5)
      Description:    File for output of second sequence's features
      Allowed values: Writeable feature table
      Default:        $(bsequence.name).diffgff

   Additional (Optional) qualifiers

   -globaldifferences
      Description:    Normally this program will find regions of identity
                      that are the length of the specified word-size or
                      greater and will then report the regions of
                      difference between these matching regions. This
                      works well and is what most people want if they are
                      working with long overlapping nucleic acid
                      sequences. You are usually not interested in the
                      non-overlapping ends of these sequences. If you
                      have protein sequences or short RNA sequences
                      however, you will be interested in differences at
                      the very ends. If this option is set to be true
                      then the differences at the ends will also be
                      reported.
      Allowed values: Boolean value Yes/No
      Default:        No

   Advanced (Unprompted) qualifiers

   (none)
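
   As a rough illustration of how the qualifiers listed above combine, the
   following Python sketch assembles a diffseq command line. The helper
   name diffseq_command is hypothetical, and the sketch assumes that a
   boolean qualifier such as -globaldifferences can be switched on by
   giving it without a value. Only qualifiers documented in this section
   are used.

```python
def diffseq_command(asequence, bsequence, wordsize=10,
                    globaldifferences=False, outfile=None,
                    aoutfeat=None, boutfeat=None):
    """Build an argument list for diffseq from the documented qualifiers."""
    if wordsize < 2:
        # The allowed values for -wordsize are "Integer 2 or more".
        raise ValueError("wordsize must be an integer of 2 or more")
    cmd = ["diffseq",
           "-asequence", asequence,
           "-bsequence", bsequence,
           "-wordsize", str(wordsize)]
    if globaldifferences:
        # Assumption: a bare boolean qualifier sets the option to true.
        cmd.append("-globaldifferences")
    for qualifier, value in (("-outfile", outfile),
                             ("-aoutfeat", aoutfeat),
                             ("-boutfeat", boutfeat)):
        if value is not None:
            cmd += [qualifier, value]
    cmd.append("-auto")  # turn off prompts
    return cmd
```

   The resulting list could be passed to subprocess.run(cmd, check=True),
   assuming diffseq is installed and on the PATH.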

Input file format

   This program reads in two nucleic acid sequence USAs or two protein
   sequence USAs.

  Input files for usage example

   'tembl:ap000504' is a sequence entry in the example nucleic acid
   database 'tembl'

  Database entry: tembl:ap000504

ID   AP000504   standard; DNA; HUM; 100000 BP.
XX
AC   AP000504; BA000025;
XX
SV   AP000504.1
XX
DT   28-SEP-1999 (Rel. 61, Created)
DT   22-AUG-2001 (Rel. 68, Last updated, Version 3)
XX
DE   Homo sapiens genomic DNA, chromosome 6p21.3, HLA Class I region, section
DE   3/20.
XX
KW   .
XX
OS   Homo sapiens (human)
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC   Eutheria; Primates; Catarrhini; Hominidae; Homo.
XX
RN   [1]
RP   1-100000
RA   Hirakawa M., Yamaguchi H., Imai K., Shimada J.;
RT   ;
RL   Submitted (21-SEP-1999) to the EMBL/GenBank/DDBJ databases.
RL   Mika Hirakawa, Japan Science and Technology Corporation (JST), Advanced
RL   Databases Department; 5-3, Yonbancho, Chiyoda-ku, Tokyo 102-0081, Japan
RL   (E-mail:mika@tokyo.jst.go.jp, URL:http://www-alis.tokyo.jst.go.jp/,
RL   Tel:81-3-5214-8491, Fax:81-3-5214-8470)
XX
RN   [2]
RA   Shiina S., Tamiya G., Oka A., Inoko H.;
RT   "Homo sapiens 2,229,817bp genomic DNA of 6p21.3 HLA class I region";
RL   Unpublished.
XX
DR   SWISS-PROT; O00299; CLI1_HUMAN.
DR   SWISS-PROT; O43196; MSH5_HUMAN.
DR   SWISS-PROT; O95445; APOM_HUMAN.
DR   SWISS-PROT; O95865; DDH2_HUMAN.
DR   SWISS-PROT; O95867; NG24_HUMAN.
DR   SWISS-PROT; P13862; KC2B_HUMAN.
XX
CC   This sequence is conducted by Tokai University as a JST sequencing
CC   Team.
CC   Principal Investigator: Hidetoshi Inoko Ph.D
CC   Phone:+81-463-93-1121, Fax:+81-463-94-8884,
CC   The sequence is submitted by Human Genome Sequencing in ALIS
CC   project of JST
CC   Japan Science and Technology Corporation (JST)
CC   5-3, Yonbancyo, Chiyoda-ku, Tokyo, 102-0081 Japan
CC   For further infomation about this sequences, please visit our
CC   sequence archive Web site (http://www-alis.tokyo.jst.go.jp/HGS/top.


  [Part of this file has been deleted for brevity]

     gggtggatca tgaggtcaag agatcgagac tatcctggct aacatgatga aaccccgtct     97080
     ctactaaaaa tacaaaaaat tagctgggca tggtggcggg cacctgtagt cccagctact     97140
     cgggaggctg agtcaggaga atggtgtgaa cccaggagac ggagcttgca gtgagctgag     97200
     gtcgcaccac tgcactccag cctgggtgat agagcgagac tctgtctcaa aaaaaaaaaa     97260
     aaaaaaaaaa aaaacaaaaa ttagccgggt gtggtggcag gcaacttaat cccagctact     97320
     tgggaggcag aggcaggaga atcgtttgaa cctgggaggc ggaggttgaa gagaatagaa     97380
     gctctgctgg tccagagaag gattgggcca gggctctggg agaccaggga gaaagagggc     97440
     acatgtggtc cctgttgact gtgagggtgg gaatctgagg aaggctttgg ctcattgccc     97500
     cttgggtttg tccacagcca tccttcccct gcggagtatg tcgaggtgct ccaggagcta     97560
     cagcggctgg agagtcgcct ccagcccttc ttgcagcgct actacgaggt tctgggtgct     97620
     gctgccacca cggactacaa taacaatgtg agccctttga tggccctgcc ctttctcctc     97680
     agccccagta ctcccaaaac agaacaggct gaaatacaga taactctttc cctccctgga     97740
     aaaacattgc aacagggcca ggtgcagtgg ctcacgcctg taatcccagc actttgggag     97800
     gccaaggtgg gcggatcatc tgagatcggg agtttgagac cagcctggcc aacatggtgc     97860
     aaccccatct ctactgaaaa tataaacatt agctggatgt agtggtgcac acctgtaatc     97920
     ccagctactc aggaggctga ggcaggagaa tcgctagaac tcgggaggag ggggttgcag     97980
     tgagccgaga ttgcactact gcactctagc ctgggtgaca gagcgagact gtctcaaaaa     98040
     acaaaacaaa acaaaaaaac acacattgca acaaaacaat ttctctctaa acctgtaagt     98100
     gattttgtcc tcccttacag agaaggtgat aatctttgct gtaagcactg tcctcgtatc     98160
     gtaccccttg tgcccctgaa tgaatttaga aaatgtaaag tacaggagat cagtatatga     98220
     tgacttactg attcatagta gtgttttaat aggatgttcc ttatgtgaat aagatataat     98280
     ttatttgcaa agatttggtc tacatgtaaa cttccaagga tataactgaa agttttggag     98340
     gacatggtat tctcagtagg cattattgct tttattagtg agatggactc cagcttgata     98400
     ttttctgcct ttttgtgttt ggctggttgt gcgcagcacg agggccggga ggaggatcag     98460
     cggttgatca acttggtagg ggagagcctg cgactgctgg gcaacacctt tgttgcactg     98520
     tctgacctgc gctgcaatct ggcctgcacg cccccacgac acctgcatgt ggtccggcct     98580
     atgtctcact acaccacccc catggtgctc cagcaggcag ccattcccat acaggtgggt     98640
     tagggggagt ctggcctgag ggagagtgag gggtgttgat agagtgaccc agggtagcta     98700
     ctgggcctga aggaggttag gaaaggagga gactggaaac atggtgatga aggctggaga     98760
     tactttagag gtttatcatg aggttttctt ggttaggctc ttgtattttt ctcacatctg     98820
     cctgtccatc tgtctttttc agatcaatgt gggaaccact gtgaccatga caggaaatgg     98880
     gactcggccc cccccaactc ccaatgcaga ggcacctccc cctggtcctg ggcaggcctc     98940
     atccgtggct ccgtcttcta ccaatgtcga gtcctcagct gagggggctc ccccgccagg     99000
     tccagctccc ccgccagcca ccagccaccc gagggtcatc cggatttccc accagagtgt     99060
     ggaacccgtg gtcatgatgc acatgaacat tcaaggtgag aatagttgct ggcgagaaga     99120
     gcaggatcag catgatgagg gaggttcatg ctgaggtgtg agggaacagg gtggggaagg     99180
     gagaggcaca tgctggtggt ggtagcctgg ggaccagagc agaagcttaa gtagacagat     99240
     gtggggggtg tgggggttgg tttgtctttg gaggtgtgtt tgtgtggtga agggagtacc     99300
     tctccctgtt tagatggagg gaaaggcagg ctttctgatt gggggattat gggcctgaag     99360
     tatgcctgat ctcagaagga tatagttagg ccttggccct acctacctca gggccactgt     99420
     ctctgtctcc ctgcccagat tctggcacac agcctggtgg tgttccgagt gctcccactg     99480
     gccccctggg accccctggt catggccaaa ccctgggtaa gagtgagggc atcagggcag     99540
     gctgagctct gggtagagaa agggaagggc tgagtgggtg ggttgaaggg gtccaggttc     99600
     aaggttacat cagacccgcc ccccaggctc caccctcatc cagctgccct ccctgccccc     99660
     tgagttcatg cacgccgtcg cccaccagat cactcatcag gccatggtgg cagctgttgc     99720
     ctccgcggcc gcaggtaatg acctggaagg ggaggcttgg gaggtagggc acagtccatg     99780
     gtggcagctg gctggcaagg gcctggccct cagccctctt cggtctgtct cttctgccac     99840
     ccacaggaca gcaggtgcca ggcttcccaa cagctccaac ccgggtggtg attgcccggc     99900
     ccactcctcc acaggctcgg ccttcccatc ctggagggcc cccagtctct gggacactgg     99960
     tgagcaaggg tcggggagtt ctagtgcgta acagtctagg                          100000
//

  Database entry: tembl:af129756

ID   AF129756   standard; DNA; HUM; 184666 BP.
XX
AC   AF129756;
XX
SV   AF129756.1
XX
DT   12-MAR-1999 (Rel. 59, Created)
DT   29-OCT-1999 (Rel. 61, Last updated, Version 2)
XX
DE   Homo sapiens MSH55 gene, partial cds; and CLIC1, DDAH, G6b, G6c, G5b, G6d,
DE   G6e, G6f, BAT5, G5b, CSK2B, BAT4, G4, Apo M, BAT3, BAT2, AIF-1, 1C7, LST-1,
DE   LTB, TNF, and LTA genes, complete cds.
XX
KW   .
XX
OS   Homo sapiens (human)
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC   Eutheria; Primates; Catarrhini; Hominidae; Homo.
XX
RN   [1]
RP   1-184666
RA   Rowen L., Madan A., Qin S., Shaffer T., James R., Ratcliffe A., Abbasi N.,
RA   Dickhoff R., Loretz C., Madan A., Dors M., Young J., Lasky S., Hood L.;
RT   "Sequence of the human major histocompatibility complex class III region";
RL   Unpublished.
XX
RN   [2]
RP   1-184666
RA   Rowen L.;
RT   ;
RL   Submitted (22-FEB-1999) to the EMBL/GenBank/DDBJ databases.
RL   Department of Molecular Biotechnology, Box 357730 University of Washington,
RL   Seattle, WA 98195, USA
XX
RN   [3]
RP   1-184666
RA   Rowen L.;
RT   ;
RL   Submitted (28-OCT-1999) to the EMBL/GenBank/DDBJ databases.
RL   Multimegabase Sequencing Center, University of Washington, PO Box 357730,
RL   Seattle, WA 98195, USA
XX
DR   EPD; EP11158; HS_TNFA.
DR   EPD; EP11159; HS_TNFB.
DR   SPTREMBL; O00452; O00452.
DR   SPTREMBL; O14931; O14931.
DR   SPTREMBL; O95866; O95866.
DR   SPTREMBL; O95868; O95868.
DR   SPTREMBL; O95869; O95869.
DR   SPTREMBL; O95870; O95870.


  [Part of this file has been deleted for brevity]

     aaaccagttt accaccactc ctaacactaa acttaaatct gactctaaat gtaagtccaa    181740
     tctgagccac aagcctaaag ttgaacttta tcctgcttta tgaattattc atccattcct    181800
     ccatttagtg agtatctgcg tgcctaacac atgctgggca ttgtcctaag gcaggaggga    181860
     catggaggca aagggatcag agaaggtacc agcacctgtg gagcttgtat tccagtgagg    181920
     ccagacggaa aagaaagaaa ctgaagaaga aattggtact atgagaaaat aagacaggct    181980
     gatgttgtaa gagtggcagg gagctacttt taaatacagt agtcagcaaa atcctctttg    182040
     agtgtttggg tggcactgga gctgagaccc aaatgacaaa aaatagtgac caggtaaaag    182100
     tttgggagca aagcatttca ggtaaaggga gcagctactg caaaggctgg aaggcggaac    182160
     caagctgggg gtgttgacga caaacagaag gccagtgtgg ctggagcaga gagagagact    182220
     gggaggcggg tgggagatga ggtcagagag gagggcaggg gccaggtcat gcagggccat    182280
     gcaagaaggg taaagcctct agatttcatc cagccacagg aagcctttaa aggtcgtcag    182340
     agtgtgtggt gcgtgcgtgt gtgtgtgtgt gtgtgtgtgt gttgcagggg agagaggggg    182400
     agggagagag agagagagag agagaagagg gaggtgagca gaggtgattg gatttttttt    182460
     tcttttgaca tggtgtcttg ctctgtggcc taggctggag tgcagtggca ccatcatagc    182520
     ccactgcaac ctcaaaacca tgggctcaag tcatccttcc acctcagctt cccaagtatc    182580
     taggactaca ggtgtgtgcc actgtgcctg gctaatttta aaaaatattt taaaattttt    182640
     gttgagacag ggtctatgct gctcaggctg gtctcgaact cctggtttca agtgatctgc    182700
     ccatcttggc ctcccaaagt ttttttttgt tagtttgaga ggcggtttcg ctcgttgccc    182760
     aggctggagt gcaatgactg atctcatctc actgcaacct ctgcctcctg ggttcaagcg    182820
     attctcctgc ttcagcctcc caagtagctg ggattacagg tgcatgccac cattcccggc    182880
     taattttttg tatttagtag agatggggtt tcaccatgtt agtcaggctg atctcaaact    182940
     cctgacctca ggtgatccgc ctgcctcagc ctcccaaagt tttgggatta caggtgtgag    183000
     ccaccatgct gggccagcct cccaaagttt tgggattaca ggcatgagtc accacactgg    183060
     ccctggattt tttttctttc ttttttttgg agacggagtc tcactctgtt gcccaggctg    183120
     gagtgcaatg gcgtaatctc agctcactgc aacctctgct gcccgggttc aaacgattct    183180
     cctgtcttag cctcctgagt agctgggatt ataggtgcat gccaccatgc ctggctaatt    183240
     tttgtacttt tagtagagaa agtacaccat cttggccagg ctggtctcga actcctgacc    183300
     tcaggtgatc cacttgcgtc ggcctcccaa agtgctggga ttacaggcgt gagacaccgc    183360
     acccagcctt tttttttttt tttcttttaa gacagaatcg ctctgtcacc caggctggag    183420
     tgcagtggca caatctcggc tcactgcaac ctctgcctcc caggtttaag caatccacct    183480
     atgtcagtct cccaagtagc tgggattata ggtgcatgtc accatgcctg gctaattttt    183540
     gtacttttag tatagaaagt acaccatgtt ggccaggctg gtcttgaact cctgacctca    183600
     agtgatccgc ctgcctcagc ctcccgaagt gctggaatta cagacatgtg ccactgcacc    183660
     cggcctggtt ttttttttct aagagatgga gtctcacttt tctgcccagg ttggagtgca    183720
     atggcaccat catagctcac tgcagccttc aactcttggc ctcaggcaat ccttgcacct    183780
     tagcctcgca gtgttgggat tacaggcatg agccactgag ccttgcctgg actttttttt    183840
     ttttttgaga tggcgtctcg ctctgttgcc caggttggag tgctacggca tgatcttggc    183900
     tcactgcaac ttccacctcc caggttcaag cgattctctt gcctcggccc cccgagtagc    183960
     tgggattaca ggcatgcgcc accgtgcctg gctaattttg gtatttttag tagagatagg    184020
     gtttcatcat gttgggcagg ctggtcttga actcctgacc tcgtgatcca cccacctcgg    184080
     cctcccaaag tgctgggatt ataggcatag ccaacgcgcc cagcctggac ttgtttttaa    184140
     aagatcactg tggctcctgt gtttaggctg gctggtagga gacaggtggc agtggcattg    184200
     atggtgaaga gaaaatagtg gcagccatgg agatggagag aagtagacaa gtttgggata    184260
     tattatacat tccaggggta gaaacaacag gactagatga tggattgatg ggtgggagat    184320
     gtagatactg ggagagaagc aggattctga tggatggaaa aactaaaaaa ttctattttg    184380
     ggtgtggtaa gtctaagtct attagacatg caagtagaga tgtcactggg cagatacaca    184440
     tctggatttc aggggcaagg tccaagctag agaaagaaac ctgggcatgg tcagcatgag    184500
     gatggtgttt aaagccatgg aacttatctt gtgcatccct ataagacccc tttgaggcac    184560
     ttgtttcccc tcacaatgga tgcagtgcat cttccattct gaattccaga ggcaacaacc    184620
     tcctgctcct agaagctaaa ctctccagac ttagtcttct gaattc                   184666
//

Output file format

   The output is a standard EMBOSS report file.

   The results can be output in one of several styles by using the
   command-line qualifier -rformat xxx, where 'xxx' is replaced by the
   name of the required format. The available format names are: embl,
   genbank, gff, pir, swiss, trace, listfile, dbmotif, diffseq, excel,
   feattable, motif, regions, seqtable, simple, srs, table, tagseq.

   See: http://emboss.sf.net/docs/themes/ReportFormats.html for further
   information on report formats.

   By default diffseq writes a 'diffseq' report file.

  Output files for usage example

  File: ap000504.diffseq

########################################
# Program: diffseq
# Rundate: Fri Jul 15 2005 12:00:00
# Report_format: diffseq
# Report_file: ap000504.diffseq
# Additional_files: 2
# 1: AP000504.diffgff (Feature file for first sequence)
# 2: AF129756.diffgff (Feature file for second sequence)
########################################

#=======================================
#
# Sequence: AP000504     from: 1   to: 100000
# HitCount: 119
#
# Compare: AF129756     from: 1   to: 184666
#
# AP000504 overlap starts at 1
# AF129756 overlap starts at 6036
#
# (AP000504) start end length sequence
# (AF129756) start end length sequence
#
#
#=======================================


AP000504 847-847 Length: 1
Sequence: a
Sequence: t
AF129756 6882-6882 Length: 1

AP000504 1795-1795 Length: 1
Sequence: g
Sequence: a
AF129756 7830-7830 Length: 1

AP000504 2273-2273 Length: 1
Sequence: t
Sequence:
Feature: repeat_region 7920-8351 rpt_family='MSTB'
AF129756 8307 Length: 0

AP000504 2466-2466 Length: 1
Sequence: g
Sequence: a
Feature: repeat_region 8391-8686 rpt_family='AluSg'
AF129756 8500-8500 Length: 1

AP000504 2655-2658 Length: 4


  [Part of this file has been deleted for brevity]

Sequence: t
Sequence: c
AF129756 99280-99280 Length: 1

AP000504 93696-93696 Length: 1
Sequence: t
Sequence: g
AF129756 99726-99726 Length: 1

AP000504 93860-93860 Length: 1
Sequence: t
Sequence: g
AF129756 99890-99890 Length: 1

AP000504 95451-95451 Length: 1
Sequence: c
Sequence: t
AF129756 101481-101481 Length: 1

AP000504 96650-96650 Length: 1
Sequence: c
Sequence: t
AF129756 102680-102680 Length: 1

AP000504 97273-97274 Length: 2
Sequence: aa
Sequence:
Feature: repeat_region 103299-103402 rpt_family='AluSq'
AF129756 103302 Length: 0

AP000504 97716-97716 Length: 1
Sequence: a
Sequence: g
AF129756 103744-103744 Length: 1

AP000504 97827-97827 Length: 1
Sequence: c
Sequence: t
Feature: repeat_region 103784-104083 rpt_family='AluSx'
AF129756 103855-103855 Length: 1

#---------------------------------------
#
# Overlap_end: 100000 in AP000504
# Overlap_end: 106028 in AF129756
#
# SNP_count: 86
# Transitions: 58
# Transversions: 28
#
#---------------------------------------
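
   A report in this layout is straightforward to post-process. The Python
   sketch below reads blank-line-separated blocks, pairs the two header
   lines with the two "Sequence:" lines, and labels single-base
   substitutions as transitions or transversions. It is a minimal sketch
   based only on the example above: the names parse_diffseq_report,
   parse_block and classify are hypothetical, "Feature:" lines are
   ignored, and the SNP classification is an assumption about how the
   summary counts are derived rather than a description of diffseq
   itself.

```python
import re

# Header lines look like "AP000504 847-847 Length: 1"; a zero-length
# region is given as a single position, e.g. "AF129756 8307 Length: 0".
HEADER = re.compile(r"^(\S+) (\d+)(?:-(\d+))? Length: (\d+)$")
TRANSITIONS = ({"a", "g"}, {"c", "t"})


def parse_block(lines):
    """Turn one difference block into a dict keyed by 'a' and 'b'."""
    headers = [m for m in map(HEADER.match, lines) if m]
    seqs = [line[len("Sequence:"):].strip() for line in lines
            if line.startswith("Sequence:")]
    if len(headers) != 2 or len(seqs) != 2:
        return None  # truncated or unrecognised block
    record = {}
    for key, m, seq in zip("ab", headers, seqs):
        start = int(m.group(2))
        record[key] = {
            "name": m.group(1),
            "start": start,
            "end": int(m.group(3)) if m.group(3) else start,
            "length": int(m.group(4)),
            "sequence": seq,
        }
    return record


def parse_diffseq_report(text):
    """Return one record per complete difference block in the report."""
    records, block = [], []
    for raw in text.splitlines() + [""]:
        line = raw.strip()
        if line and not line.startswith("#"):
            block.append(line)
        elif block:
            record = parse_block(block)
            if record:
                records.append(record)
            block = []
    return records


def classify(record):
    """Label a record as 'transition', 'transversion' or 'other'."""
    a = record["a"]["sequence"].lower()
    b = record["b"]["sequence"].lower()
    if len(a) == 1 and len(b) == 1:
        return "transition" if {a, b} in TRANSITIONS else "transversion"
    return "other"
```

   In the summary of the example report, the SNP count (86) equals the
   sum of the transitions (58) and transversions (28).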

  File: AF129756.diffgff

##gff-version 2.0
##date 2005-07-15
##Type DNA AF129756
AF129756        diffseq conflict        6882    6882    1.000   +       .       Sequence "AF129756.1" ; note "SNP in AP000504" ; replace "a"
AF129756        diffseq conflict        7830    7830    1.000   +       .       Sequence "AF129756.2" ; note "SNP in AP000504" ; replace "g"
AF129756        diffseq conflict        8500    8500    1.000   +       .       Sequence "AF129756.3" ; note "SNP in AP000504" ; replace "g"
AF129756        diffseq conflict        10945   10962   1.000   +       .       Sequence "AF129756.4" ; note "Insertion of 18 bases in AF129756" ; replace ""
AF129756        diffseq conflict        10999   11001   1.000   +       .       Sequence "AF129756.5" ; note "AP000504" ; replace "aaa"
AF129756        diffseq conflict        12915   12915   1.000   +       .       Sequence "AF129756.6" ; note "SNP in AP000504" ; replace "g"
AF129756        diffseq conflict        15139   15139   1.000   +       .       Sequence "AF129756.7" ; note "SNP in AP000504" ; replace "g"
AF129756        diffseq conflict        17192   17192   1.000   +       .       Sequence "AF129756.8" ; note "SNP in AP000504" ; replace "c"
AF129756        diffseq conflict        19761   19761   1.000   +       .       Sequence "AF129756.9" ; note "SNP in AP000504" ; replace "a"
AF129756        diffseq conflict        20291   20291   1.000   +       .       Sequence "AF129756.10" ; note "SNP in AP000504" ; replace "c"
AF129756        diffseq conflict        20462   20462   1.000   +       .       Sequence "AF129756.11" ; note "SNP in AP000504" ; replace "g"
AF129756        diffseq conflict        25686   25686   1.000   +       .       Sequence "AF129756.12" ; note "SNP in AP000504" ; replace "t"
AF129756        diffseq conflict        26192   26192   1.000   +       .       Sequence "AF129756.13" ; note "SNP in AP000504" ; replace "c"
AF129756        diffseq conflict        27227   27227   1.000   +       .       Sequence "AF129756.14" ; note "Insertion of 1 bases in AF129756" ; replace ""
AF129756        diffseq conflict        27837   27837   1.000   +       .       Sequence "AF129756.15" ; note "SNP in AP000504" ; replace "c"
AF129756        diffseq conflict        29328   29328   1.000   +       .       Sequence "AF129756.16" ; note "SNP in AP000504" ; replace "t"
AF129756        diffseq conflict        29458   29458   1.000   +       .       Sequence "AF129756.17" ; note "SNP in AP000504" ; replace "a"
AF129756        diffseq conflict        29629   29629   1.000   +       .       Sequence "AF129756.18" ; note "SNP in AP000504" ; replace "t"
AF129756        diffseq conflict        29646   29646   1.000   +       .       Sequence "AF129756.19" ; note "SNP in AP000504" ; replace "g"
AF129756        diffseq conflict        30838   30838   1.000   +       .       Sequence "AF129756.20" ; note "SNP in AP000504" ; replace "c"
AF129756        diffseq conflict        31349   31349   1.000   +       .       Sequence "AF129756.21" ; note "SNP in AP000504" ; replace "c"
AF129756        diffseq conflict        31901   31901   1.000   +       .       Sequence "AF129756.22" ; note "SNP in AP000504" ; replace "g"
AF129756        diffseq conflict        36682   36682   1.000   +       .       Sequence "AF129756.23" ; note "SNP in AP000504" ; replace "c"
AF129756        diffseq conflict        38225   38226   1.000   +       .       Sequence "AF129756.24" ; note "Insertion of 2 bases in AF129756" ; replace ""
AF129756        diffseq conflict        38379   38379   1.000   +       .       Sequence "AF129756.25" ; note "SNP in AP000504" ; replace "c"
AF129756        diffseq conflict        38537   38537   1.000   +       .       Sequence "AF129756.26" ; note "SNP in AP000504" ; replace "t"
AF129756        diffseq conflict        39114   39114   1.000   +       .       Sequence "AF129756.27" ; note "SNP in AP000504" ; replace "t"
AF129756        diffseq conflict        39816   39816   1.000   +       .       Sequence "AF129756.28" ; note "SNP in AP000504" ; replace "g"
AF129756        diffseq conflict        40807   40807   1.000   +       .       Sequence "AF129756.29" ; note "SNP in AP000504" ; replace "a"
AF129756        diffseq conflict        40977   40977   1.000   +       .       Sequence "AF129756.30" ; note "Insertion of 1 bases in AF129756" ; replace ""
AF129756        diffseq conflict        41204   41204   1.000   +       .       Sequence "AF129756.31" ; note "SNP in AP000504" ; replace "a"
AF129756        diffseq conflict        42548   42548   1.000   +       .       Sequence "AF129756.32" ; note "SNP in AP000504" ; replace "g"
AF129756        diffseq conflict        45315   45315   1.000   +       .       Sequence "AF129756.33" ; note "Insertion of 1 bases in AF129756" ; replace ""
AF129756        diffseq conflict        48382   48382   1.000   +       .       Sequence "AF129756.34" ; note "SNP in AP000504" ; replace "a"
AF129756        diffseq conflict        50635   50635   1.000   +       .       Sequence "AF129756.35" ; note "SNP in AP000504" ; replace "t"
AF129756        diffseq conflict        50809   50809   1.000   +       .       Sequence "AF129756.36" ; note "SNP in AP000504" ; replace "g"
AF129756        diffseq conflict        51286   51286   1.000   +       .       Sequence "AF129756.37" ; note "SNP in AP000504" ; replace "g"
AF129756        diffseq conflict        51645   51645   1.000   +       .       Sequence "AF129756.38" ; note "Insertion of 1 bases in AF129756" ; replace ""
AF129756        diffseq conflict        52388   52388   1.000   +       .       Sequence "AF129756.39" ; note "SNP in AP000504" ; replace "c"
AF129756        diffseq conflict        52646   52646   1.000   +       .       Sequence "AF129756.40" ; note "SNP in AP000504" ; replace "a"
AF129756        diffseq conflict        53596   53596   1.000   +       .       Sequence "AF129756.41" ; note "SNP in AP000504" ; replace "a"
AF129756        diffseq conflict        53621   53621   1.000   +       .       Sequence "AF129756.42" ; note "SNP in AP000504" ; replace "t"
AF129756        diffseq conflict        54883   54883   1.000   +       .       Sequence "AF129756.43" ; note "SNP in AP000504" ; replace "g"
AF129756        diffseq conflict        55377   55377   1.000   +       .       Sequence "AF129756.44" ; note "SNP in AP000504" ; replace "g"
AF129756        diffseq conflict        55571   55571   1.000   +       .       Sequence "AF129756.45" ; note "SNP in AP000504" ; replace "c"
AF129756        diffseq conflict        55611   55611   1.000   +       .       Sequence "AF129756.46" ; note "SNP in AP000504" ; replace "t"
AF129756        diffseq conflict        55655   55661   1.000   +       .       Sequence "AF129756.47" ; note "Insertion of 7 bases in AF129756" ; replace ""


  [Part of this file has been deleted for brevity]

AF129756        diffseq conflict        66604   66604   1.000   +       .       Sequence "AF129756.55" ; note "SNP in AP000504" ; replace "t"
AF129756        diffseq conflict        69445   69445   1.000   +       .       Sequence "AF129756.56" ; note "SNP in AP000504" ; replace "g"
AF129756        diffseq conflict        70182   70183   1.000   +       .       Sequence "AF129756.57" ; note "AP000504" ; replace "ta"
AF129756        diffseq conflict        70195   70195   1.000   +       .       Sequence "AF129756.58" ; note "SNP in AP000504" ; replace "c"
AF129756        diffseq conflict        71102   71102   1.000   +       .       Sequence "AF129756.59" ; note "Insertion of 1 bases in AF129756" ; replace ""
AF129756        diffseq conflict        73566   73566   1.000   +       .       Sequence "AF129756.60" ; note "SNP in AP000504" ; replace "t"
AF129756        diffseq conflict        73758   73758   1.000   +       .       Sequence "AF129756.61" ; note "SNP in AP000504" ; replace "g"
AF129756        diffseq conflict        74597   74597   1.000   +       .       Sequence "AF129756.62" ; note "SNP in AP000504" ; replace "t"
AF129756        diffseq conflict        76175   76176   1.000   +       .       Sequence "AF129756.63" ; note "Insertion of 2 bases in AF129756" ; replace ""
AF129756        diffseq conflict        76463   76463   1.000   +       .       Sequence "AF129756.64" ; note "SNP in AP000504" ; replace "c"
AF129756        diffseq conflict        76710   76710   1.000   +       .       Sequence "AF129756.65" ; note "SNP in AP000504" ; replace "c"
AF129756        diffseq conflict        77331   77331   1.000   +       .       Sequence "AF129756.66" ; note "SNP in AP000504" ; replace "a"
AF129756        diffseq conflict        77597   77597   1.000   +       .       Sequence "AF129756.67" ; note "SNP in AP000504" ; replace "g"
AF129756        diffseq conflict        78092   78092   1.000   +       .       Sequence "AF129756.68" ; note "SNP in AP000504" ; replace "c"
AF129756        diffseq conflict        79671   79671   1.000   +       .       Sequence "AF129756.69" ; note "SNP in AP000504" ; replace "t"
AF129756        diffseq conflict        80042   80042   1.000   +       .       Sequence "AF129756.70" ; note "SNP in AP000504" ; replace "g"
AF129756        diffseq conflict        80115   80115   1.000   +       .       Sequence "AF129756.71" ; note "SNP in AP000504" ; replace "a"
AF129756        diffseq conflict        81882   81882   1.000   +       .       Sequence "AF129756.72" ; note "AP000504" ; replace "tttggaat"
AF129756        diffseq conflict        82132   82132   1.000   +       .       Sequence "AF129756.73" ; note "SNP in AP000504" ; replace "a"
AF129756        diffseq conflict        83649   83649   1.000   +       .       Sequence "AF129756.74" ; note "SNP in AP000504" ; replace "t"
AF129756        diffseq conflict        84290   84290   1.000   +       .       Sequence "AF129756.75" ; note "SNP in AP000504" ; replace "g"
AF129756        diffseq conflict        86465   86465   1.000   +       .       Sequence "AF129756.76" ; note "SNP in AP000504" ; replace "t"
AF129756        diffseq conflict        86842   86844   1.000   +       .
Sequence "AF129756.77" ; note "Insertion of 3 bases in AF129756" ; replace ""
AF129756        diffseq conflict        87014   87014   1.000   +       .
Sequence "AF129756.78" ; note "SNP in AP000504" ; replace "g"
AF129756        diffseq conflict        87102   87102   1.000   +       .
Sequence "AF129756.79" ; note "Insertion of 1 bases in AF129756" ; replace ""
AF129756        diffseq conflict        87605   87605   1.000   +       .
Sequence "AF129756.80" ; note "SNP in AP000504" ; replace "g"
AF129756        diffseq conflict        87893   87893   1.000   +       .
Sequence "AF129756.81" ; note "SNP in AP000504" ; replace "t"
AF129756        diffseq conflict        88359   88359   1.000   +       .
Sequence "AF129756.82" ; note "SNP in AP000504" ; replace "c"
AF129756        diffseq conflict        88635   88635   1.000   +       .
Sequence "AF129756.83" ; note "SNP in AP000504" ; replace "g"
AF129756        diffseq conflict        88750   88750   1.000   +       .
Sequence "AF129756.84" ; note "SNP in AP000504" ; replace "a"
AF129756        diffseq conflict        88822   88826   1.000   +       .
Sequence "AF129756.85" ; note "Insertion of 5 bases in AF129756" ; replace ""
AF129756        diffseq conflict        89118   89118   1.000   +       .
Sequence "AF129756.86" ; note "SNP in AP000504" ; replace "a"
AF129756        diffseq conflict        89738   89738   1.000   +       .
Sequence "AF129756.87" ; note "SNP in AP000504" ; replace "t"
AF129756        diffseq conflict        91271   91271   1.000   +       .
Sequence "AF129756.88" ; note "Insertion of 1 bases in AF129756" ; replace ""
AF129756        diffseq conflict        92311   92311   1.000   +       .
Sequence "AF129756.89" ; note "SNP in AP000504" ; replace "g"
AF129756        diffseq conflict        92345   92345   1.000   +       .
Sequence "AF129756.90" ; note "SNP in AP000504" ; replace "c"
AF129756        diffseq conflict        93979   93979   1.000   +       .
Sequence "AF129756.91" ; note "SNP in AP000504" ; replace "c"
AF129756        diffseq conflict        94959   94959   1.000   +       .
Sequence "AF129756.92" ; note "SNP in AP000504" ; replace "a"
AF129756        diffseq conflict        95246   95246   1.000   +       .
Sequence "AF129756.93" ; note "SNP in AP000504" ; replace "t"
AF129756        diffseq conflict        95809   95810   1.000   +       .
Sequence "AF129756.94" ; note "AP000504" ; replace "aat"
AF129756        diffseq conflict        96756   96756   1.000   +       .
Sequence "AF129756.95" ; note "SNP in AP000504" ; replace "c"
AF129756        diffseq conflict        97713   97713   1.000   +       .
Sequence "AF129756.96" ; note "AP000504" ; replace "tgtgtgtgtgtgtgtgt"
AF129756        diffseq conflict        97827   97827   1.000   +       .
Sequence "AF129756.97" ; note "SNP in AP000504" ; replace "t"
AF129756        diffseq conflict        98195   98195   1.000   +       .
Sequence "AF129756.98" ; note "SNP in AP000504" ; replace "t"
AF129756        diffseq conflict        99280   99280   1.000   +       .
Sequence "AF129756.99" ; note "SNP in AP000504" ; replace "t"
AF129756        diffseq conflict        99726   99726   1.000   +       .
Sequence "AF129756.100" ; note "SNP in AP000504" ; replace "t"
AF129756        diffseq conflict        99890   99890   1.000   +       .
Sequence "AF129756.101" ; note "SNP in AP000504" ; replace "t"
AF129756        diffseq conflict        101481  101481  1.000   +       .
Sequence "AF129756.102" ; note "SNP in AP000504" ; replace "c"
AF129756        diffseq conflict        102680  102680  1.000   +       .
Sequence "AF129756.103" ; note "SNP in AP000504" ; replace "c"
AF129756        diffseq conflict        103744  103744  1.000   +       .
Sequence "AF129756.104" ; note "SNP in AP000504" ; replace "a"
AF129756        diffseq conflict        103855  103855  1.000   +       .
Sequence "AF129756.105" ; note "SNP in AP000504" ; replace "c"

  File: AP000504.diffgff

##gff-version 2.0
##date 2005-07-15
##Type DNA AP000504
AP000504        diffseq conflict        847     847     1.000   +       .
Sequence "AP000504.1" ; note "SNP in AF129756" ; replace "t"
AP000504        diffseq conflict        1795    1795    1.000   +       .
Sequence "AP000504.2" ; note "SNP in AF129756" ; replace "a"
AP000504        diffseq conflict        2273    2273    1.000   +       .
Sequence "AP000504.3" ; note "Insertion of 1 bases in AP000504" ; replace ""
AP000504        diffseq conflict        2466    2466    1.000   +       .
Sequence "AP000504.4" ; note "SNP in AF129756" ; replace "a"
AP000504        diffseq conflict        2655    2658    1.000   +       .
Sequence "AP000504.5" ; note "Insertion of 4 bases in AP000504" ; replace ""
AP000504        diffseq conflict        4951    4953    1.000   +       .
Sequence "AP000504.6" ; note "AF129756" ; replace "tat"
AP000504        diffseq conflict        6600    6600    1.000   +       .
Sequence "AP000504.7" ; note "Insertion of 1 bases in AP000504" ; replace ""
AP000504        diffseq conflict        6868    6868    1.000   +       .
Sequence "AP000504.8" ; note "SNP in AF129756" ; replace "a"
AP000504        diffseq conflict        8218    8221    1.000   +       .
Sequence "AP000504.9" ; note "Insertion of 4 bases in AP000504" ; replace ""
AP000504        diffseq conflict        9096    9096    1.000   +       .
Sequence "AP000504.10" ; note "SNP in AF129756" ; replace "a"
AP000504        diffseq conflict        11149   11149   1.000   +       .
Sequence "AP000504.11" ; note "SNP in AF129756" ; replace "a"
AP000504        diffseq conflict        13718   13718   1.000   +       .
Sequence "AP000504.12" ; note "SNP in AF129756" ; replace "c"
AP000504        diffseq conflict        14248   14248   1.000   +       .
Sequence "AP000504.13" ; note "SNP in AF129756" ; replace "t"
AP000504        diffseq conflict        14419   14419   1.000   +       .
Sequence "AP000504.14" ; note "SNP in AF129756" ; replace "c"
AP000504        diffseq conflict        19643   19643   1.000   +       .
Sequence "AP000504.15" ; note "SNP in AF129756" ; replace "c"
AP000504        diffseq conflict        20149   20149   1.000   +       .
Sequence "AP000504.16" ; note "SNP in AF129756" ; replace "t"
AP000504        diffseq conflict        21316   21319   1.000   +       .
Sequence "AP000504.17" ; note "Insertion of 4 bases in AP000504" ; replace ""
AP000504        diffseq conflict        21797   21797   1.000   +       .
Sequence "AP000504.18" ; note "SNP in AF129756" ; replace "t"
AP000504        diffseq conflict        23288   23288   1.000   +       .
Sequence "AP000504.19" ; note "SNP in AF129756" ; replace "a"
AP000504        diffseq conflict        23418   23418   1.000   +       .
Sequence "AP000504.20" ; note "SNP in AF129756" ; replace "c"
AP000504        diffseq conflict        23589   23589   1.000   +       .
Sequence "AP000504.21" ; note "SNP in AF129756" ; replace "c"
AP000504        diffseq conflict        23606   23606   1.000   +       .
Sequence "AP000504.22" ; note "SNP in AF129756" ; replace "a"
AP000504        diffseq conflict        24798   24798   1.000   +       .
Sequence "AP000504.23" ; note "SNP in AF129756" ; replace "t"
AP000504        diffseq conflict        25309   25309   1.000   +       .
Sequence "AP000504.24" ; note "SNP in AF129756" ; replace "t"
AP000504        diffseq conflict        25861   25861   1.000   +       .
Sequence "AP000504.25" ; note "SNP in AF129756" ; replace "t"
AP000504        diffseq conflict        28039   28040   1.000   +       .
Sequence "AP000504.26" ; note "Insertion of 2 bases in AP000504" ; replace ""
AP000504        diffseq conflict        30644   30644   1.000   +       .
Sequence "AP000504.27" ; note "SNP in AF129756" ; replace "t"
AP000504        diffseq conflict        32339   32339   1.000   +       .
Sequence "AP000504.28" ; note "SNP in AF129756" ; replace "g"
AP000504        diffseq conflict        32497   32497   1.000   +       .
Sequence "AP000504.29" ; note "SNP in AF129756" ; replace "c"
AP000504        diffseq conflict        33074   33074   1.000   +       .
Sequence "AP000504.30" ; note "SNP in AF129756" ; replace "c"
AP000504        diffseq conflict        33776   33776   1.000   +       .
Sequence "AP000504.31" ; note "SNP in AF129756" ; replace "a"
AP000504        diffseq conflict        34767   34767   1.000   +       .
Sequence "AP000504.32" ; note "SNP in AF129756" ; replace "c"
AP000504        diffseq conflict        35163   35163   1.000   +       .
Sequence "AP000504.33" ; note "SNP in AF129756" ; replace "g"
AP000504        diffseq conflict        36507   36507   1.000   +       .
Sequence "AP000504.34" ; note "SNP in AF129756" ; replace "a"
AP000504        diffseq conflict        37760   37762   1.000   +       .
Sequence "AP000504.35" ; note "Insertion of 3 bases in AP000504" ; replace ""
AP000504        diffseq conflict        38680   38683   1.000   +       .
Sequence "AP000504.36" ; note "Insertion of 4 bases in AP000504" ; replace ""
AP000504        diffseq conflict        42347   42347   1.000   +       .
Sequence "AP000504.37" ; note "SNP in AF129756" ; replace "g"
AP000504        diffseq conflict        42637   42638   1.000   +       .
Sequence "AP000504.38" ; note "Insertion of 2 bases in AP000504" ; replace ""
AP000504        diffseq conflict        44602   44602   1.000   +       .
Sequence "AP000504.39" ; note "SNP in AF129756" ; replace "c"
AP000504        diffseq conflict        44776   44776   1.000   +       .
Sequence "AP000504.40" ; note "SNP in AF129756" ; replace "t"
AP000504        diffseq conflict        45253   45253   1.000   +       .
Sequence "AP000504.41" ; note "SNP in AF129756" ; replace "a"
AP000504        diffseq conflict        46354   46354   1.000   +       .
Sequence "AP000504.42" ; note "SNP in AF129756" ; replace "t"
AP000504        diffseq conflict        46612   46612   1.000   +       .
Sequence "AP000504.43" ; note "SNP in AF129756" ; replace "g"
AP000504        diffseq conflict        47562   47562   1.000   +       .
Sequence "AP000504.44" ; note "SNP in AF129756" ; replace "g"
AP000504        diffseq conflict        47587   47587   1.000   +       .
Sequence "AP000504.45" ; note "SNP in AF129756" ; replace "c"
AP000504        diffseq conflict        48849   48849   1.000   +       .
Sequence "AP000504.46" ; note "SNP in AF129756" ; replace "a"
AP000504        diffseq conflict        49343   49343   1.000   +       .
Sequence "AP000504.47" ; note "SNP in AF129756" ; replace "a"


  [Part of this file has been deleted for brevity]

AP000504        diffseq conflict        58685   58685   1.000   +       .
Sequence "AP000504.55" ; note "SNP in AF129756" ; replace "t"
AP000504        diffseq conflict        60558   60558   1.000   +       .
Sequence "AP000504.56" ; note "SNP in AF129756" ; replace "c"
AP000504        diffseq conflict        61209   61209   1.000   +       .
Sequence "AP000504.57" ; note "Insertion of 1 bases in AP000504" ; replace ""
AP000504        diffseq conflict        62958   62959   1.000   +       .
Sequence "AP000504.58" ; note "Insertion of 2 bases in AP000504" ; replace ""
AP000504        diffseq conflict        63402   63402   1.000   +       .
Sequence "AP000504.59" ; note "SNP in AF129756" ; replace "a"
AP000504        diffseq conflict        64139   64140   1.000   +       .
Sequence "AP000504.60" ; note "AF129756" ; replace "at"
AP000504        diffseq conflict        64152   64152   1.000   +       .
Sequence "AP000504.61" ; note "SNP in AF129756" ; replace "t"
AP000504        diffseq conflict        65317   65317   1.000   +       .
Sequence "AP000504.62" ; note "Insertion of 1 bases in AP000504" ; replace ""
AP000504        diffseq conflict        67523   67523   1.000   +       .
Sequence "AP000504.63" ; note "SNP in AF129756" ; replace "c"
AP000504        diffseq conflict        67715   67715   1.000   +       .
Sequence "AP000504.64" ; note "SNP in AF129756" ; replace "c"
AP000504        diffseq conflict        68554   68554   1.000   +       .
Sequence "AP000504.65" ; note "SNP in AF129756" ; replace "a"
AP000504        diffseq conflict        69285   69285   1.000   +       .
Sequence "AP000504.66" ; note "Insertion of 1 bases in AP000504" ; replace ""
AP000504        diffseq conflict        70419   70419   1.000   +       .
Sequence "AP000504.67" ; note "SNP in AF129756" ; replace "a"
AP000504        diffseq conflict        70666   70666   1.000   +       .
Sequence "AP000504.68" ; note "SNP in AF129756" ; replace "t"
AP000504        diffseq conflict        71287   71287   1.000   +       .
Sequence "AP000504.69" ; note "SNP in AF129756" ; replace "c"
AP000504        diffseq conflict        71553   71553   1.000   +       .
Sequence "AP000504.70" ; note "SNP in AF129756" ; replace "a"
AP000504        diffseq conflict        72048   72048   1.000   +       .
Sequence "AP000504.71" ; note "SNP in AF129756" ; replace "t"
AP000504        diffseq conflict        73627   73627   1.000   +       .
Sequence "AP000504.72" ; note "SNP in AF129756" ; replace "g"
AP000504        diffseq conflict        73998   73998   1.000   +       .
Sequence "AP000504.73" ; note "SNP in AF129756" ; replace "a"
AP000504        diffseq conflict        74071   74071   1.000   +       .
Sequence "AP000504.74" ; note "SNP in AF129756" ; replace "t"
AP000504        diffseq conflict        75838   75845   1.000   +       .
Sequence "AP000504.75" ; note "AF129756" ; replace "g"
AP000504        diffseq conflict        76095   76095   1.000   +       .
Sequence "AP000504.76" ; note "SNP in AF129756" ; replace "g"
AP000504        diffseq conflict        77612   77612   1.000   +       .
Sequence "AP000504.77" ; note "SNP in AF129756" ; replace "c"
AP000504        diffseq conflict        78253   78253   1.000   +       .
Sequence "AP000504.78" ; note "SNP in AF129756" ; replace "a"
AP000504        diffseq conflict        80428   80428   1.000   +       .
Sequence "AP000504.79" ; note "SNP in AF129756" ; replace "c"
AP000504        diffseq conflict        80974   80974   1.000   +       .
Sequence "AP000504.80" ; note "SNP in AF129756" ; replace "t"
AP000504        diffseq conflict        81564   81564   1.000   +       .
Sequence "AP000504.81" ; note "SNP in AF129756" ; replace "a"
AP000504        diffseq conflict        81852   81852   1.000   +       .
Sequence "AP000504.82" ; note "SNP in AF129756" ; replace "c"
AP000504        diffseq conflict        82318   82318   1.000   +       .
Sequence "AP000504.83" ; note "SNP in AF129756" ; replace "t"
AP000504        diffseq conflict        82594   82594   1.000   +       .
Sequence "AP000504.84" ; note "SNP in AF129756" ; replace "a"
AP000504        diffseq conflict        82709   82709   1.000   +       .
Sequence "AP000504.85" ; note "SNP in AF129756" ; replace "g"
AP000504        diffseq conflict        83072   83072   1.000   +       .
Sequence "AP000504.86" ; note "SNP in AF129756" ; replace "g"
AP000504        diffseq conflict        83692   83692   1.000   +       .
Sequence "AP000504.87" ; note "SNP in AF129756" ; replace "g"
AP000504        diffseq conflict        86264   86264   1.000   +       .
Sequence "AP000504.88" ; note "SNP in AF129756" ; replace "t"
AP000504        diffseq conflict        86298   86298   1.000   +       .
Sequence "AP000504.89" ; note "SNP in AF129756" ; replace "a"
AP000504        diffseq conflict        87932   87932   1.000   +       .
Sequence "AP000504.90" ; note "SNP in AF129756" ; replace "t"
AP000504        diffseq conflict        88912   88912   1.000   +       .
Sequence "AP000504.91" ; note "SNP in AF129756" ; replace "g"
AP000504        diffseq conflict        89199   89199   1.000   +       .
Sequence "AP000504.92" ; note "SNP in AF129756" ; replace "g"
AP000504        diffseq conflict        89762   89764   1.000   +       .
Sequence "AP000504.93" ; note "AF129756" ; replace "ca"
AP000504        diffseq conflict        90710   90710   1.000   +       .
Sequence "AP000504.94" ; note "SNP in AF129756" ; replace "a"
AP000504        diffseq conflict        91667   91683   1.000   +       .
Sequence "AP000504.95" ; note "AF129756" ; replace "g"
AP000504        diffseq conflict        91797   91797   1.000   +       .
Sequence "AP000504.96" ; note "SNP in AF129756" ; replace "c"
AP000504        diffseq conflict        92165   92165   1.000   +       .
Sequence "AP000504.97" ; note "SNP in AF129756" ; replace "a"
AP000504        diffseq conflict        93250   93250   1.000   +       .
Sequence "AP000504.98" ; note "SNP in AF129756" ; replace "c"
AP000504        diffseq conflict        93696   93696   1.000   +       .
Sequence "AP000504.99" ; note "SNP in AF129756" ; replace "g"
AP000504        diffseq conflict        93860   93860   1.000   +       .
Sequence "AP000504.100" ; note "SNP in AF129756" ; replace "g"
AP000504        diffseq conflict        95451   95451   1.000   +       .
Sequence "AP000504.101" ; note "SNP in AF129756" ; replace "t"
AP000504        diffseq conflict        96650   96650   1.000   +       .
Sequence "AP000504.102" ; note "SNP in AF129756" ; replace "t"
AP000504        diffseq conflict        97273   97274   1.000   +       .
Sequence "AP000504.103" ; note "Insertion of 2 bases in AP000504" ; replace ""
AP000504        diffseq conflict        97716   97716   1.000   +       .
Sequence "AP000504.104" ; note "SNP in AF129756" ; replace "g"
AP000504        diffseq conflict        97827   97827   1.000   +       .
Sequence "AP000504.105" ; note "SNP in AF129756" ; replace "t"
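
   The GFF records listed above can be read with a short script. The
   following is a minimal sketch in Python, not part of diffseq. It
   assumes that each record in the file is a single line with the nine
   tab-separated columns of GFF version 2, and that the listings above
   only wrap the attribute column onto a second line for display. The
   names DiffFeature and parse_diffgff_line are hypothetical.

```python
import re
from typing import NamedTuple


class DiffFeature(NamedTuple):
    """One difference record (hypothetical container, not an EMBOSS type)."""
    seqname: str
    start: int
    end: int
    note: str
    replace: str


# Attribute column looks like: Sequence "X.1" ; note "..." ; replace "t"
_ATTR = re.compile(r'(\w+)\s+"([^"]*)"')


def parse_diffgff_line(line: str) -> DiffFeature:
    """Parse one GFF2 record, assuming nine tab-separated columns."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 9:
        raise ValueError("expected 9 tab-separated GFF columns")
    # Collect tag/value pairs from the attribute column into a dict.
    attrs = dict(_ATTR.findall(fields[8]))
    return DiffFeature(
        seqname=fields[0],
        start=int(fields[3]),
        end=int(fields[4]),
        note=attrs.get("note", ""),
        replace=attrs.get("replace", ""),
    )
```

   An empty 'replace' value, as in the insertion records above, is
   returned as an empty string.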

   In the output report file, the first line is the title giving the
   names of the sequences used.

   The next two non-blank lines state the positions in each sequence
   where the detected overlap between them starts.

   There then follows a set of reports of the mismatches between the
   sequences.
   Each report consists of 4 or more lines.
     * The first line has the name of the first sequence followed by the
       start and end positions of the mismatched region in that sequence,
       followed by the length of the mismatched region. If the mismatched
       region is of zero length in this sequence, then only the position
       of the last matching base before the mismatch is given.
     * If a feature of the first sequence overlaps with this mismatch
       region, then one or more lines starting with 'Feature:' come next
       with the type, position and tag field of the feature.
     * Next is a line starting 'Sequence:' giving the sequence of the
       mismatch in the first sequence.

   This is followed by the equivalent information for the second
   sequence, but in the reverse order: the 'Sequence:' line, then any
   'Feature:' lines, then the line giving the position of the mismatch
   in the second sequence.

   At the end of the report are two non-blank lines giving the positions
   in each sequence where the detected overlap between them ends.

   The last three lines of the report give the counts of SNPs. A SNP is
   defined here as a change of one nucleotide to one other nucleotide;
   deletions, insertions and multi-base changes are not counted.

   If the input sequences are nucleic acid, the counts of transitions
   (Pyrimidine to Pyrimidine or Purine to Purine) and transversions
   (Pyrimidine to Purine, or Purine to Pyrimidine) are also given.
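
   The distinction between transitions and transversions can be
   illustrated with a short Python sketch. This is not the code used by
   diffseq; the function name classify_snp is hypothetical, and the
   sketch assumes DNA input using only the bases a, c, g and t.

```python
# Illustrative only: classify a one-base change as defined in the text.
PURINES = frozenset("ag")
PYRIMIDINES = frozenset("ct")  # assumption: DNA, so 'u' is not handled


def classify_snp(base_a: str, base_b: str) -> str:
    """Return 'transition' or 'transversion' for a single-base change."""
    a, b = base_a.lower(), base_b.lower()
    if len(a) != 1 or len(b) != 1:
        raise ValueError("a SNP is a one-base to one-base change")
    known = PURINES | PYRIMIDINES
    if a not in known or b not in known:
        raise ValueError("expected one of a, c, g, t")
    if a == b:
        raise ValueError("the bases are identical, so this is not a SNP")
    # Same chemical class on both sides means a transition.
    same_class = {a, b} <= PURINES or {a, b} <= PYRIMIDINES
    return "transition" if same_class else "transversion"
```

   For example, classify_snp("a", "g") is a transition (Purine to
   Purine), while classify_snp("a", "t") is a transversion.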

   It should be noted that not all features are reported.

   The 'source' feature found in all EMBL/GenBank feature table entries
   is not reported. It covers the whole sequence, so it overlaps with
   every difference found in that sequence and is therefore
   uninformative. It has been removed from the output report.

   The translation information of CDS features is often extremely long
   and does not add useful information to the report. It has therefore
   been removed from the output report.

Data files

   None

Notes

   If you run out of memory, use a larger word size.

   Using a larger word size increases the distance over which
   neighbouring mismatches are merged and reported as one event. Thus a
   word size of 50 will report two single-base differences that are
   within 50 bases of each other as one mismatch.
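
   The effect of the word size on reporting can be pictured with a
   simplified model in Python. This is a sketch under stated
   assumptions, not the algorithm diffseq implements: it assumes that
   two differences are merged whenever fewer than 'wordsize' matching
   bases lie between them, and the exact boundary behaviour of the real
   program may differ. The function name merge_differences is
   hypothetical.

```python
def merge_differences(positions, wordsize):
    """Group difference positions into (start, end) events.

    Simplified model: a run of matching bases shorter than the word
    size cannot be found as a word match, so the differences on either
    side of it are reported together as one event.
    """
    events = []
    for pos in sorted(positions):
        # Matching bases between the end of the last event and this one.
        if events and pos - events[-1][1] - 1 < wordsize:
            events[-1][1] = pos  # extend the current event
        else:
            events.append([pos, pos])  # start a new event
    return [tuple(event) for event in events]
```

   With differences at positions 100 and 130, this model gives one
   event (100, 130) for a word size of 50, and two separate events for
   a word size of 10.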

References

   None.

Warnings

   None.

Diagnostic Error Messages

   None.

Exit status

   It always exits with status 0.

Known bugs

   None.

See also

   Program name Description

Author(s)

   Gary Williams (gwilliam  rfcgr.mrc.ac.uk)
   MRC Rosalind Franklin Centre for Genomics Research Wellcome Trust
   Genome Campus, Hinxton, Cambridge, CB10 1SB, UK

History

   Written 15th Aug 2000 - Gary Williams.

   18th Aug 2000 - Added writing out GFF files of the mismatched regions

Target users

   This program is intended to be used by everyone and everything, from
   naive users to embedded scripts.

Comments

   None
