
                          SEQSEARCH documentation
                                      
   

CONTENTS

   1.0 SUMMARY 
   2.0 INPUTS & OUTPUTS 
   3.0 INPUT FILE FORMAT 
   4.0 OUTPUT FILE FORMAT 
   5.0 DATA FILES 
   6.0 USAGE 
   7.0 KNOWN BUGS & WARNINGS 
   8.0 NOTES 
   9.0 DESCRIPTION 
   10.0 ALGORITHM 
   11.0 RELATED APPLICATIONS 
   12.0 DIAGNOSTIC ERROR MESSAGES 
   13.0 AUTHORS 
   14.0 REFERENCES 

1.0 SUMMARY

   Generate PSI-BLAST hits (DHF file) from a DAF file

2.0 INPUTS & OUTPUTS

   SEQSEARCH reads a directory of files, each containing either (i) a
   single protein sequence or (ii) a set of protein sequences (aligned
   or unaligned), and generates a DHF file ('domain hits file') of
   sequence relatives (hits) for each file in the input directory. The
   hits are found by searching with PSIBLAST. Only unique hits are
   written: where PSIBLAST returns a set of identical hits, only one of
   them is retained.
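
   The retention of one hit per set of identical hits can be sketched
   as below. This is only an illustration, not the SEQSEARCH source: the
   documentation does not say which fields make two hits identical, so
   the sketch assumes a hit is the tuple (accession, start, end,
   sequence) and that two hits are identical when all four fields match.
   The name unique_hits is hypothetical.

```python
from typing import Iterable, List, Tuple

# Assumed representation of a hit: (accession, start, end, sequence).
Hit = Tuple[str, int, int, str]


def unique_hits(hits: Iterable[Hit]) -> List[Hit]:
    """Keep the first of each set of identical hits, preserving input order."""
    seen = set()
    kept: List[Hit] = []
    for hit in hits:
        if hit not in seen:  # assumption: identical means all fields are equal
            seen.add(hit)
            kept.append(hit)
    return kept


example = [
    ("A00001", 1, 3, "VEA"),
    ("A00001", 1, 3, "VEA"),  # identical to the first hit, so it is dropped
    ("A00002", 1, 3, "VEA"),
]
deduplicated = unique_hits(example)
```
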
   Typically, aligned sequences within a DAF file ('domain alignment
   file') are input and the DHF file output is annotated with domain
   classification data.
   PSIBLAST must be installed on the system that is running SEQSEARCH
   (see 'Notes' below). The base name of an input file is used as the
   base name for the corresponding output file. The paths and extensions
   for the sequence files (input) and domain hits files (output) are
   specified by the user. The name of the BLAST-indexed database to
   search is also user-specified. A log file is also written.
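
   The mapping from an input file to its output file can be sketched as
   follows. This is an illustration only: the function name
   output_path_for, the directory names and the '.daf' extension are
   assumptions made for the example, not SEQSEARCH options.

```python
from pathlib import Path


def output_path_for(input_file: str, out_dir: str, out_ext: str) -> Path:
    """Build the output path: user-given directory and extension, same base name."""
    base = Path(input_file).stem  # file name without directory or final extension
    return Path(out_dir) / f"{base}.{out_ext.lstrip('.')}"


# Hypothetical paths; only the base name '54894' is carried over.
dhf_path = output_path_for("daf/54894.daf", "dhf", ".dhf")
```

   Under these assumptions an input file with base name 54894 yields an
   output file named 54894.dhf in the chosen output directory.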

3.0 INPUT FILE FORMAT

   The format of the domain alignment file is described in DOMAINALIGN
   documentation.
   If other sequences or sequence sets (aligned or unaligned) are used as
   input, all of the common file formats are supported.

  Input files for usage example

  File: swsmall

> Q9WVI4
DDVTMLFSDIVGFTAICAQCTPMQVISMLNELYTRFDHQCGFLDIYKVETIGDAYCVASG
LHRKSLCHAKPIALMALKMMELSEEVLTPDGRPIQMRIGIHSGSVLAGVVGVRMPRYCLF
GNNVTLASKFESGSHPRRINISPTTYQLL
> Q9ERL9
VTMLFSDIVGFTAICSQCSPLQVITMLNALYTRFDQQCGELDVYKVETIGDAYCVAGGLH
RESDTHAVQIALMALKMMELSNEVMSPHGEPIKMRIGLHSGSVFAGVVGVKMPRYCLFGN
NVTLANKFESCSVPRKINVSPTTYRLLKDCPG
> Q9DGG6
EQVSILFADIVGFTKMSANKSAHALVGLLNDLFGRFDRLCEDTKCEKISTLGDCYYCVAG
CPEPRADHAYCCIEMGLGMIKAIEQFCQEKKEMVNMRVGVHTGTVLCGILGMRRFKFDVW
SNDVNLANLMEQLGVAGKVHISEATAKYLDDRYEMEDGKVTERVGQSAVADQLKGLKTYL
I
> Q99396
KELADPVTLIFTDIESSTAQWATQPELMPDAVATHHSMVRSLIENYDCYEVKTVGDSFMI
ACKSPFAAVQLAQELQLRFLRLDWGTTVFDEFYREFEERHAEEGDGKYKPPTARLDPEVY
RQLWNGLRVRVGIHTGLCDIRYDEVTKGYDYYGQTANTAARTESVGNGGQVLMTCETYHS
LSTAERSQFDVTPLGGVPLRGVSEPVEVYQLN
> Q99280
NDSAPKEPTGPVTLIFTDIESSTALWAAHPDLMPDAVATHHRLIRSLITRYECYEVKTVG
DSFMIASKSPFAAVQLAQELQLRFLRLDWETNALDESYREFEEQRAEGECEYTPPTAHMD
PEVYSRLWNGLRVRVGIHTGLCDIRYDEVTKGYDYYGRTSNMAARTESVANGGQVLMTHA
AYMSLSGEDRNQLDVTTLGATVLRGVPEPVRMYQLN
> Q99279
NNNRAPKEPTDPVTLIFTDIESSTALWAAHPDLMPDAVAAHHRMVRSLIGRYKCYEVKTV
GDSFMIASKSPFAAVQLAQELQLCFLHHDWGTNALDDSYREFEEQRAEGECEYTPPTAHM
DPEVYSRLWNGLRVRVGIHTGLCDIIRHDEVTKGYDYYGRTPNMAARTESVANGGQVLMT
HAAYMSLSAEDRKQIDVTALGDVALRGVSDPVKMYQLN
> Q91WF3
VCVLFASVPDFKEFYSESNINHEGLECLRLLNEIIADFDELLSKPKFSGVEKIKTIGSTY
MAATGLNATSGQDTQQDSERSCSHLGTMVEFAVALGSKLGVINKHSFNNFRLRVGLNHGP
VVAGVIGAQKPQYDIWGNTVNVASRMESTGVLGKIQVTEETARAL
> Q91WF3
FHSLYVKRHQGVSVLYADIVGFTRLASECSPKELVLMLNELFGKFDQIAKEHECMRIKIL
GDCYYCVSGLPLSLPDHAINCVRMGLDMCRAIRKLRVATGVDINMRVGVHSGSVLCGVIG
LQKWQYDVWSHDVTLANHMEAGGVPGRVHITGATLALL
> Q8VHH7
NNFMLRIGMNKGGVLAGVIGARKPHYDIWGNTVNVASRMESTGVMGNIQVVEET
> Q8VHH7
FNTMYMYRHENVSILFADIVGFTQLSSACSAQELVKLLNELFARFDKLAAKYHQLRIKIL
GDCYYCICGLPDYREDHAVCSILMGLAMVEAISYVREKTKTGVDMRVGVHTGTVLGGVLG
QKRWQYDVWSTDVTVANKMEAGGIPGRVHISQSTMDCLKGEFDVEPGDGGSRCDYLDEKG
IETYLI
> Q8NFM4
VCVLFASVPDFKEFYSESNINHEGLECLRLLNEIIADFDELLSKPKFSGVEKIKTIGSTY
MAATGLNATSGQDAQQDAERSCSHLGTMVEFAVALGSKLDVINKHSFNNFRLRVGLNHGP
VVAGVIGAQKPQYDIWGNTVNVASRMESTGVLGKIQVTEET
> Q8NFM4
FHSLYVKRHQGVSVLYADIVGFTRLASECSPKELVLMLNELFGKFDQIAKEHECMRIKIL
GDCYYCVSGLPLSLPDHAINCVRMGLDMCRAIRKLRAATGVDINMRVGVHSGSVLCGVIG


  [Part of this file has been deleted for brevity]

> Q83IL8
VEAIKRGTVIDHIPAQIGFKLLSLFKLTETDQRITIGLNLPSGEMGRKDLIKIENTFLSE
EQVDQLALYAPQATVNRIDNYEVVGKSRPSLP
> Q7P144
VEALKQGTVIDHIPAGEGVKILRLFKLTETGERVTVGLNLVSRHMGSKDLIKVENVALTE
EQANELALFAPKATVNVIDNFEVVKKHKLTLP
> Q7MZ14
VEAIRCGTVIDHIPAQVGFKLLSLFKLTETDQRITIGLNLPSNRLGKKDLIKIENTFLTE
QQANQLAMYAPNATVNCIENYEVVKKLPINLP
> Q7MX57
VAAIRNGIVIDHIPPTKLFKVATLLQLDDLDKRITIGNNLRSRSHGSKGVIKIEDKTFEE
EELNRIALIAPNVRLNIIRDYEVVEKRQVEVP
> Q7MHF0
VEAIKNGTVIDHIPAQVGIKVLKLFDMHNSSQRVTIGLNLPSSALGNKDLLKIENVFINE
EQASKLALYAPHATVNQIEDYQVVKKLALELP
> Q58801
VKKITNGTVIDHIDAGKALMVFKVLNVPKETSVMIAINVPSKKKGKKDILKIEGIELKKE
DVDKISLISPDVTINIIRNGKVVEKLKPQIP
> P96175
VEAICNGYVIDHIPSGQGVKILRLFSLTDTKQRVTVGFNLPSHDGTTKDLIKVENTEITK
SQANQLALLAPNATVNIIENFKVTDKHSLALP
> P96111
GIKPIENGTVIDHIAKGKTPEEIYSTILKIRKILRLYDVDSADGIFRSSDGSFKGYISLP
DRYLSKKEIKKLSAISPNTTVNIIKNSTVVEKYRIKLP
> P77919
VSAIKEGTVIDHIPAGKGLKVIEILKLGKLTNGGAVLLAMNVPSKKLGRKDIVKVEGRFL
SEEEVNKIALVAPNATVNIIRDYKVVEKFKVEVP
> P74766
VSKIKNGTVIDHIPAGRAFAVLNVLGIKGHEGFRIALVINVDSKKMGKKDIVKIEDKEIS
DTEANLITLIAPTATINIVREYEVVKKTKLEVP
> P57451
VEAIKSGSVIDHIPEYIGFKLLSLFRFTETEKRITIGLNLPSKKLGRKDIIKIENTFLSD
EQINQLAIYAPHATVNYINEYNLVRKVFPTLP
> P19936
VEAIKCGTVIDHIPAQIGFKLLTLFKLTATDQRITIGLNLPSNELGRKDLIKIENTFLTE
QQANQLAMYAPKATVNRIDNYEVVRKLTLSLP
> P08421
VEAIKCGTVIDHIPAQVGFKLLSLFKLTETDQRITIGLNLPSGEMGRKDLIKIENTFLTE
EQVNQLALYAPQATVNRIDNYDVVGKSRPSLP
> P00478
VEAIKRGTVIDHIPAQIGFKLLSLFKLTETDQRITIGLNLPSGEMGRKDLIKIENTFLSE
DQVDQLALYAPQATVNRIDNYEVVGKSRPSLP
> O58452
VSAIKEGTVIDHIPAGKGLKVIEILGLSKLSNGGSVLLAMNVPSKKLGRKDIVKVEGKFL
SEEEVNKIALVAPTATVNIIRNYKVVEKFKVEVP
> O30129
VSKIKEGTVIDHINAGKALLVLKILKIQPGTDLTVSMAMNVPSSKMGKKDIVKVEGMFIR
DEELNKIALISPNATINLIRDYEIERKFKVSPP
> O26938
VKPIKNGTVIDHITANRSLNVLNILGLPDGRSKVTVAMNMDSSQLGSKDIVKIENRELKP
SEVDQIALIAPRATINIVRDYKIVEKAKVRL

4.0 OUTPUT FILE FORMAT

   SEQSEARCH generates a domain hits file in FASTA-like format (Figure
   1).
   Figure 1 DHF file (FASTA-like format)
   The file (Figure 1) contains two lines per hit (long lines are
   wrapped in the example listings below). The first contains a
   description of the hit in 17 text tokens delimited by '^'. The tokens
   are as follows (a '.' is given where a token does not have a value).
   The first 4 tokens are specific to the sequence of the hit:
     * (i) Accession number of the hit.
     * (ii) Database code from UniProt.
     * (iii - iv) Start and end positions of the hit relative to the full
       length sequence in the UniProt database (files of this type may
       also be generated by using SEQWORDS, in which case a '.' will be
       given for these tokens - see SEQWORDS documentation).

   The next 9 tokens are specific to the domain (or domain family or
   other node) for which the hit was generated:
     * (v) Type of domain; currently either 'SCOP' or 'CATH' is given.
     * (vi) SCOP or CATH domain identifier. This is a 7-character code
       that uniquely identifies the domain in SCOP or CATH.
     * (vii) SCOP or CATH node unique identifier. For example, if the
       domain alignment file was for a SCOP family, the SCOP Sunid for
       the family would be given. This number uniquely identifies the
       node (i.e. family in this case) in the raw SCOP or CATH parsable
       files.
     * (viii) Domain class. Textual description of the 'Class' (SCOP and
       CATH domains).
     * (ix) Domain architecture. Textual description of the
       'Architecture' (CATH only).
     * (x) Domain topology. Textual description of the 'Topology' (CATH
       only).
     * (xi) Domain fold. Textual description of the 'Fold' (SCOP domains
       only).
     * (xii) Domain superfamily. Textual description of the 'Superfamily'
       (SCOP and CATH domains).
     * (xiii) Domain family. Textual description of the 'Family' (SCOP
       only).

   The next 4 tokens are specific to the hit itself:
     * (xiv) Model type. The type of model that was used to generate the
       hit. May have a value of "PSIBLAST" (from PSIBLAST), "HMMER"
       (hidden Markov model from the HMMER package), "SAM" (hidden Markov
       model from the SAM package), "SPARSE" (sparse protein signature),
       "HENIKOFF" (Henikoff profile) or "GRIBSKOV" (Gribskov profile). A
       value of "PSIBLAST" is written by SEQSEARCH; a value of "KEYWORD"
       is written by SEQWORDS.
     * (xv) SC - Score of hit. A floating point value that is the score
       from PSIBLAST (or other search algorithm).
     * (xvi) P-value of hit from search algorithm.
     * (xvii) E-value of hit from search algorithm.

   The second line contains the protein sequence.
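
   The record layout described above can be read with a short script
   such as the sketch below. It is an illustration, not part of
   SEQSEARCH: the names TOKEN_NAMES, parse_header and read_hits are
   hypothetical, and the sketch assumes each hit occupies exactly two
   unwrapped lines (a '>' description line and a sequence line), so
   wrapped listings would need to be rejoined first. A '.' token is
   returned as None.

```python
from typing import Dict, Iterable, Iterator, Optional, Tuple

# Hypothetical field names for the '^'-delimited tokens, in file order.
TOKEN_NAMES = (
    "accession", "db_code", "start", "end",
    "domain_type", "domain_id", "node_id",
    "class", "architecture", "topology", "fold", "superfamily", "family",
    "model_type", "score", "p_value", "e_value",
)

Header = Dict[str, Optional[str]]


def parse_header(line: str) -> Header:
    """Split one description line into named tokens ('.' becomes None)."""
    if not line.startswith(">"):
        raise ValueError("description line must start with '>'")
    tokens = [tok.strip() for tok in line[1:].split("^")]
    if len(tokens) != len(TOKEN_NAMES):
        raise ValueError(
            f"expected {len(TOKEN_NAMES)} tokens, found {len(tokens)}"
        )
    return {
        name: (None if tok == "." else tok)
        for name, tok in zip(TOKEN_NAMES, tokens)
    }


def read_hits(lines: Iterable[str]) -> Iterator[Tuple[Header, str]]:
    """Yield (header, sequence) pairs, assuming two unwrapped lines per hit."""
    content = (ln.strip() for ln in lines if ln.strip())  # skip blank lines
    for header_line in content:
        sequence = next(content, None)
        if sequence is None:
            raise ValueError("description line without a sequence line")
        yield parse_header(header_line), sequence


# First hit of 54894.dhf from the usage example, rejoined onto two lines.
record = [
    "> Q9YBD5^.^1^95^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^"
    "Ferredoxin-like^"
    "Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^"
    "Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^"
    "PSIBLAST^55.30^0.000e+00^2.000e-11",
    "VRKIRSGVVIDHIPPGRAFTMLKALGLLPPRGYRWRIAVVINAESSKLGRKDILKIEGYKPRQRDLEVLG"
    "IIAPGATFNVIEDYKVVEKVKLKLP",
]
hits = list(read_hits(record))
```

   Numeric tokens are left as strings here so that values are passed
   through exactly as written in the file; a caller may convert the
   score and E-value with float() where needed.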

  Output files for usage example

  File: 54894.dhf

> Q9YBD5^.^1^95^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^55.30^0.000e+
00^2.000e-11
VRKIRSGVVIDHIPPGRAFTMLKALGLLPPRGYRWRIAVVINAESSKLGRKDILKIEGYKPRQRDLEVLGIIAPGATFN
VIEDYKVVEKVKLKLP
> Q9UX07^.^1^93^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^59.20^0.000e+
00^1.000e-12
VSKIRNGTVIDHIPAGRALAVLRILGIRGSEGYRVALVMNVESKKIGRKDIVKIEDRVIDEKEASLITLIAPSATINII
RDYVVTEKRHLEVP
> Q9KP65^.^1^92^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^119.00^0.000e
+00^9.000e-31
VEAIKNGTVIDHIPAKVGIKVLKLFDMHNSAQRVTIGLNLPSSALGSKDLLKIENVFISEAQANKLALYAPHATVNQIE
NYEVVKKLALQLP
> Q9K1K9^.^1^92^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^91.90^0.000e+
00^2.000e-22
VEAIEKGTVIDHIPAGRGLTILRQFKLLHYGNAVTVGFNLPSKTQGSKDIIKIKGVCLDDKAADRLALFAPEAVVNTID
NFKVVQKRHLNLP
> Q9JWY6^.^1^92^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^90.40^0.000e+
00^5.000e-22
VEAIEKGTVIDHIPAGRGLTILRQFKLLHYGNAVTVGFNLPSKTQGSKDIIKIKGVCLDDKAADRLALFAPEAVVNTID
HFKVVQKRHLNLP
> Q9HKM3^.^1^93^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^71.90^0.000e+
00^2.000e-16
ISKIRDGTVIDHVPSGKGIRVIGVLGVHEDVNYTVSLAIHVPSNKMGFKDVIKIENRFLDRNELDMISLIAPNATISII
KNYEISEKFQVELP
> Q9HHN3^.^1^93^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^70.40^0.000e+
00^4.000e-16
VSKIQAGTVIDHIPAGQALQVLQILGTNGASDDQITVGMNVTSERHHRKDIVKIEGRELSQDEVDVLSLIAPDATINIV
RDYEVDEKRRVDRP
> Q97FS4^.^1^90^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^42.60^0.000e+
00^1.000e-07
INSIKNGIVIDHIKAGHGIKIYNYLKLGEAEFPTALIMNAISKKNKAKDIIKIENVMDLDLAVLGFLDPNITVNIIEDE
KIRQKIQLKLP
> Q97B28^.^1^93^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^71.90^0.000e+
00^2.000e-16
ISKIKDGTVIDHIPSGKALRVLSILGIRDDVDYTVSVGMHVPSSKMEYKDVIKIENRSLDKNELDMISLTAPNATISII
KNYEISEKFKVELP
> Q970X3^.^1^91^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^71.10^0.000e+
00^3.000e-16
VSKIKNGTVIDHIPAGRALAVLRILKIAEGYRIALVMNVESKKMGKKDIVKIENKEVDEKEANLITLIAPTATINIIRD
YEVVEKKKLKIP
> Q8ZTG2^.^1^93^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^58.00^0.000e+
00^3.000e-12
VSKIENGTVIDHIPAGRALTVLRILGISGKEGLRVALVMNVESKKLGKKDIVKIEGRELTPEEVNIISAVAPTATINII
RNFAVVKKFKVTPP
> Q8ZB38^.^1^92^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^143.00^0.000e
+00^3.000e-38
VEAIKCGTVIDHIPAQIGFKLLSLFKLTATDQRITIGLNLPSKRSGRKDLIKIENTFLTEQQANQLAMYAPDATVNRID
NYEVVKKLTLSLP
> Q8Z130^.^1^92^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^167.00^0.000e
+00^0.000e+00
VEAIKCGTVIDHIPAQVGFKLLSLFKLTETDQRITIGLNLPSGEMGRKDLIKIENTFLTDEQVNQLALYAPQATVNRID
NYDVVGKSRPSLP
> Q8U374^.^1^94^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^82.70^0.000e+
00^1.000e-19
VSAIKEGTVIDHIPAGKGLKVIQILGLGELKNGGAVLLAMNVPSKKLGRKDIVKVEGKFLSEEEVNKIALVAPTATVNI
IREYKVVEKFKVEIP
> Q8TVB1^.^1^92^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^65.40^0.000e+
00^2.000e-14
VKRIEMGTVLDHLPPGTAPQIMRILDIDPTETTLLVAINVESSKMGRKDILKIEGKILSEEEANKVALVAPNATVNIVR
DYSVAEKFQVKPP
> Q8THL3^.^1^92^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^66.50^0.000e+
00^7.000e-15
IQAIENGTVIDHITAGQALNVLRILRISSAFRATVSFVMNAPGARGKKDVVKIEGKELSVEELNRIALISPKATINIIR
DFEVVQKNKVVLP
> Q8PXK6^.^1^92^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^60.70^0.000e+
00^4.000e-13
VQAIESGTVIDHIKSGQALNVLRILGISSAFRATISFVMNAPGAGGKKDVVKIEGKELSVEELNRIALISPKATINIIR
DFVVVQKNNVVLP
> Q8K9H8^.^1^92^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^135.00^0.000e
+00^1.000e-35
VEAIKSGSVIDHIPAHIGFKLLSLFRFTETEKRITIGLNLPSQKLDKKDIIKIENTFLSDDQINQLAIYAPCATVNYIE
KYNLVGKIFPSLP
> Q8DCF7^.^1^92^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^116.00^0.000e
+00^5.000e-30
VEAIKNGTVIDHIPAQVGIKVLKLFDMHNSSQRVTIGLNLPSSALGNKDLLKIENVFINEEQASKLALYAPHATVNQIE
DYQVVKKLALELP
> Q8D1W6^.^1^92^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^113.00^0.000e
+00^5.000e-29
VEAIFGGTVIDHIPAQVGLKLLSLFKWLHTKERITMGLNLPSNQQKKKDLIKLENVLLNEDQANQLSIYAPLATVNQIK
NYIVIKKQKLKLP
> Q8A9S4^.^1^92^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^57.30^0.000e+
00^4.000e-12
VAALKNGTVIDHIPSEKLFTVVQLLGVEQMKCNITIGFNLDSKKLGKKGIIKIADKFFCDEEINRISVVAPYVKLNIIR
DYEVVEKKEVRMP
> Q891I9^.^1^91^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^46.10^0.000e+
00^1.000e-08
ITSIKDGIVIDHIKSGYGIKIFNYLNLKNVEYSVALIMNVFSSKLGKKDIIKIANKEIDIDFTVLGLIDPTITINIIED
EKIKEKLNLELP
> Q87LF7^.^1^92^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^121.00^0.000e
+00^2.000e-31
VEAIKNGTVIDHIPAQIGIKVLKLFDMHNSSQRVTIGLNLPSSALGHKDLLKIENVFINEEQASKLALYAPHATVNQIE
NYEVVKKLALELP
> Q83IL8^.^1^92^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^174.00^0.000e
+00^0.000e+00
VEAIKRGTVIDHIPAQIGFKLLSLFKLTETDQRITIGLNLPSGEMGRKDLIKIENTFLSEEQVDQLALYAPQATVNRID
NYEVVGKSRPSLP
> Q7P144^.^1^92^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^117.00^0.000e
+00^3.000e-30
VEALKQGTVIDHIPAGEGVKILRLFKLTETGERVTVGLNLVSRHMGSKDLIKVENVALTEEQANELALFAPKATVNVID
NFEVVKKHKLTLP
> Q7MZ14^.^1^92^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^139.00^0.000e
+00^6.000e-37
VEAIRCGTVIDHIPAQVGFKLLSLFKLTETDQRITIGLNLPSNRLGKKDLIKIENTFLTEQQANQLAMYAPNATVNCIE
NYEVVKKLPINLP
> Q7MX57^.^1^92^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^72.70^0.000e+
00^1.000e-16
VAAIRNGIVIDHIPPTKLFKVATLLQLDDLDKRITIGNNLRSRSHGSKGVIKIEDKTFEEEELNRIALIAPNVRLNIIR
DYEVVEKRQVEVP
> Q7MHF0^.^1^92^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^116.00^0.000e
+00^5.000e-30
VEAIKNGTVIDHIPAQVGIKVLKLFDMHNSSQRVTIGLNLPSSALGNKDLLKIENVFINEEQASKLALYAPHATVNQIE
DYQVVKKLALELP
> Q58801^.^1^91^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^53.40^0.000e+
00^6.000e-11
VKKITNGTVIDHIDAGKALMVFKVLNVPKETSVMIAINVPSKKKGKKDILKIEGIELKKEDVDKISLISPDVTINIIRN
GKVVEKLKPQIP
> P96175^.^1^92^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^98.10^0.000e+
00^2.000e-24
VEAICNGYVIDHIPSGQGVKILRLFSLTDTKQRVTVGFNLPSHDGTTKDLIKVENTEITKSQANQLALLAPNATVNIIE
NFKVTDKHSLALP
> P96111^.^1^98^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^42.20^0.000e+
00^1.000e-07
GIKPIENGTVIDHIAKGKTPEEIYSTILKIRKILRLYDVDSADGIFRSSDGSFKGYISLPDRYLSKKEIKKLSAISPNT
TVNIIKNSTVVEKYRIKLP
> P77919^.^1^94^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^83.80^0.000e+
00^4.000e-20
VSAIKEGTVIDHIPAGKGLKVIEILKLGKLTNGGAVLLAMNVPSKKLGRKDIVKVEGRFLSEEEVNKIALVAPNATVNI
IRDYKVVEKFKVEVP
> P74766^.^1^93^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^67.30^0.000e+
00^4.000e-15
VSKIKNGTVIDHIPAGRAFAVLNVLGIKGHEGFRIALVINVDSKKMGKKDIVKIEDKEISDTEANLITLIAPTATINIV
REYEVVKKTKLEVP
> P57451^.^1^92^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^131.00^0.000e
+00^2.000e-34
VEAIKSGSVIDHIPEYIGFKLLSLFRFTETEKRITIGLNLPSKKLGRKDIIKIENTFLSDEQINQLAIYAPHATVNYIN
EYNLVRKVFPTLP
> P19936^.^1^92^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^147.00^0.000e
+00^0.000e+00
VEAIKCGTVIDHIPAQIGFKLLTLFKLTATDQRITIGLNLPSNELGRKDLIKIENTFLTEQQANQLAMYAPKATVNRID
NYEVVRKLTLSLP
> P08421^.^1^92^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^168.00^0.000e
+00^0.000e+00
VEAIKCGTVIDHIPAQVGFKLLSLFKLTETDQRITIGLNLPSGEMGRKDLIKIENTFLTEEQVNQLALYAPQATVNRID
NYDVVGKSRPSLP
> P00478^.^1^92^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^175.00^0.000e
+00^0.000e+00
VEAIKRGTVIDHIPAQIGFKLLSLFKLTETDQRITIGLNLPSGEMGRKDLIKIENTFLSEDQVDQLALYAPQATVNRID
NYEVVGKSRPSLP
> O58452^.^1^94^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^85.40^0.000e+
00^2.000e-20
VSAIKEGTVIDHIPAGKGLKVIEILGLSKLSNGGSVLLAMNVPSKKLGRKDIVKVEGKFLSEEEVNKIALVAPTATVNI
IRNYKVVEKFKVEVP
> O30129^.^1^93^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^70.40^0.000e+
00^5.000e-16
VSKIKEGTVIDHINAGKALLVLKILKIQPGTDLTVSMAMNVPSSKMGKKDIVKVEGMFIRDEELNKIALISPNATINLI
RDYEIERKFKVSPP
> O26938^.^1^91^SCOP^.^54894^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Aspartate carbamoyltransferase, Regulatory-chain, N-terminal domain^Aspartate c
arbamoyltransferase, Regulatory-chain, N-terminal domain^PSIBLAST^73.80^0.000e+
00^4.000e-17
VKPIKNGTVIDHITANRSLNVLNILGLPDGRSKVTVAMNMDSSQLGSKDIVKIENRELKPSEVDQIALIAPRATINIVR
DYKIVEKAKVRL

  File: 55074.dhf

> Q9WVI4^.^1^149^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^84.40^0.000e+00^1.000e-19
DDVTMLFSDIVGFTAICAQCTPMQVISMLNELYTRFDHQCGFLDIYKVETIGDAYCVASGLHRKSLCHAKPIALMALKM
MELSEEVLTPDGRPIQMRIGIHSGSVLAGVVGVRMPRYCLFGNNVTLASKFESGSHPRRINISPTTYQLL
> Q9ERL9^.^1^152^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^72.10^0.000e+00^5.000e-16
VTMLFSDIVGFTAICSQCSPLQVITMLNALYTRFDQQCGELDVYKVETIGDAYCVAGGLHRESDTHAVQIALMALKMME
LSNEVMSPHGEPIKMRIGLHSGSVFAGVVGVKMPRYCLFGNNVTLANKFESCSVPRKINVSPTTYRLLKDCPG
> Q9DGG6^.^1^181^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^142.00^0.000e+00^4.000e-37
EQVSILFADIVGFTKMSANKSAHALVGLLNDLFGRFDRLCEDTKCEKISTLGDCYYCVAGCPEPRADHAYCCIEMGLGM
IKAIEQFCQEKKEMVNMRVGVHTGTVLCGILGMRRFKFDVWSNDVNLANLMEQLGVAGKVHISEATAKYLDDRYEMEDG
KVTERVGQSAVADQLKGLKTYLI
> Q99396^.^1^212^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^192.00^0.000e+00^0.000e+00
KELADPVTLIFTDIESSTAQWATQPELMPDAVATHHSMVRSLIENYDCYEVKTVGDSFMIACKSPFAAVQLAQELQLRF
LRLDWGTTVFDEFYREFEERHAEEGDGKYKPPTARLDPEVYRQLWNGLRVRVGIHTGLCDIRYDEVTKGYDYYGQTANT
AARTESVGNGGQVLMTCETYHSLSTAERSQFDVTPLGGVPLRGVSEPVEVYQLN
> Q99280^.^1^216^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^218.00^0.000e+00^0.000e+00
NDSAPKEPTGPVTLIFTDIESSTALWAAHPDLMPDAVATHHRLIRSLITRYECYEVKTVGDSFMIASKSPFAAVQLAQE
LQLRFLRLDWETNALDESYREFEEQRAEGECEYTPPTAHMDPEVYSRLWNGLRVRVGIHTGLCDIRYDEVTKGYDYYGR
TSNMAARTESVANGGQVLMTHAAYMSLSGEDRNQLDVTTLGATVLRGVPEPVRMYQLN
> Q99279^.^1^218^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^258.00^0.000e+00^0.000e+00
NNNRAPKEPTDPVTLIFTDIESSTALWAAHPDLMPDAVAAHHRMVRSLIGRYKCYEVKTVGDSFMIASKSPFAAVQLAQ
ELQLCFLHHDWGTNALDDSYREFEEQRAEGECEYTPPTAHMDPEVYSRLWNGLRVRVGIHTGLCDIIRHDEVTKGYDYY
GRTPNMAARTESVANGGQVLMTHAAYMSLSAEDRKQIDVTALGDVALRGVSDPVKMYQLN
> Q91WF3^.^1^165^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^54.30^0.000e+00^1.000e-10
VCVLFASVPDFKEFYSESNINHEGLECLRLLNEIIADFDELLSKPKFSGVEKIKTIGSTYMAATGLNATSGQDTQQDSE
RSCSHLGTMVEFAVALGSKLGVINKHSFNNFRLRVGLNHGPVVAGVIGAQKPQYDIWGNTVNVASRMESTGVLGKIQVT
EETARAL
> Q91WF3^.^1^158^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^160.00^0.000e+00^0.000e+00
FHSLYVKRHQGVSVLYADIVGFTRLASECSPKELVLMLNELFGKFDQIAKEHECMRIKILGDCYYCVSGLPLSLPDHAI
NCVRMGLDMCRAIRKLRVATGVDINMRVGVHSGSVLCGVIGLQKWQYDVWSHDVTLANHMEAGGVPGRVHITGATLALL
> Q8VHH7^.^1^186^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^178.00^0.000e+00^0.000e+00
FNTMYMYRHENVSILFADIVGFTQLSSACSAQELVKLLNELFARFDKLAAKYHQLRIKILGDCYYCICGLPDYREDHAV
CSILMGLAMVEAISYVREKTKTGVDMRVGVHTGTVLGGVLGQKRWQYDVWSTDVTVANKMEAGGIPGRVHISQSTMDCL
KGEFDVEPGDGGSRCDYLDEKGIETYLI
> Q8NFM4^.^1^161^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^54.70^0.000e+00^9.000e-11
VCVLFASVPDFKEFYSESNINHEGLECLRLLNEIIADFDELLSKPKFSGVEKIKTIGSTYMAATGLNATSGQDAQQDAE
RSCSHLGTMVEFAVALGSKLDVINKHSFNNFRLRVGLNHGPVVAGVIGAQKPQYDIWGNTVNVASRMESTGVLGKIQVT
EET
> Q8NFM4^.^1^158^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^160.00^0.000e+00^0.000e+00
FHSLYVKRHQGVSVLYADIVGFTRLASECSPKELVLMLNELFGKFDQIAKEHECMRIKILGDCYYCVSGLPLSLPDHAI
NCVRMGLDMCRAIRKLRAATGVDINMRVGVHSGSVLCGVIGLQKWQYDVWSHDVTLANHMEAGGVPGRVHITGATLALL
> Q29450^.^1^186^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^179.00^0.000e+00^0.000e+00
FHNLYVKRHQNVSILYADIVGFTRLASDCSPKELVVVLNELFGKFDQIAKANECMRIKILGDCYYCVSGLPVSLPNHAR
NCVKMGLDMCEAIKQVREATGVDISMRVGIHSGNVLCGVIGLRKWQYDVWSHDVSLANRMEAAGVPGRVHITEATLKHL
DKAYEVEDGHGQQRDPYLKEMNIRTYLV
> Q29450^.^1^58^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase ca
talytic domain^PSIBLAST^56.30^0.000e+00^3.000e-11
NSFRLRVGINHGPVIAGVIGARKPQYDIWGNTVNVASRMESTGELGKIQVTEETCTIL
> Q27675^.^1^217^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^167.00^0.000e+00^0.000e+00
NNDAAPKDGDEPVTLLFTDIESSTALWAALPQLMSDAIAAHHRVIRQLVKKYGCYEVKTIGDSFMIACRSAHSAVSLAC
EIQTKLLKHDWGTEALDRAYREFELARVDTLDDYEPPTARLSEEEYAALWCGLRVRVGIHTGLTDIRYDEVTKGYDYYG
DTSNMAARTEAVANGGQVVATEAAWWALSNDERAGIAHTAMGPQGLRGVPFAVEMFQLN
> Q26896^.^1^216^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^195.00^0.000e+00^0.000e+00
NDSAPKEFTDPVTLIFTDIESSTALWAAHPGMMADAVATHHRLIRSLIALYGAYEVKTVGDSFMIACRSAFAAVELARD
LQLTLVHHDWGTVAIDESYRKFEEERAVEDSDYAPPTARLDSAVYCKLWNGLRVRAGIHTGLCDIAHDEVTKGYDYYGR
TPNLAARTESAANGGQVLVTGATYYSLSVAERARLDATPIGPVPLRGVPEPVEMYQLN
> Q26721^.^1^206^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^223.00^0.000e+00^0.000e+00
PVTLIFTDIESSTALWAAHPEVMPDAVATHHRLIRTLISKYECYEVKTVGDSFMIASKSPFAAVQLAQELQLCFLHHDW
GTNAIDESYQQFEQQRAEDDSDYTPPTARLDPKVYSRLWNGLRVRVGIHTGLCDIRRDEVTKGYDYYGRTSNMAARTES
VANGGQVLMTHAAYMSLSAEERQQIDVTALGDVPLRGVPKPVEMYRLN
> Q25263^.^1^217^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^166.00^0.000e+00^0.000e+00
NNDAAPKDGDEPVTLLFTDIESSTALWAALPQLMSDAIAAHHRVIRQLVKKYGCYEVKTIGDSFMIACRSAHSAVSLAC
EIQTKLLKHDWGTEALDRAYREFELARVDTLDDYEPPTARLSEEEYAALWCGLRVRVGIHTGLTDIRYDEVTRGYDYYG
DTSNMAARTEAVANGGQVVATEAAWWALSNDERAGIAHTAMGPQGLRGVPFAVEMFQLN
> Q09435^.^1^161^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^79.80^0.000e+00^2.000e-18
DSVTVFFSDVVKFTILASKCSPFQTVNLLNDLYSNFDTIIEQHGVYKVESIGDGYLCVSGLPTRNGYAHIKQIVDMSLK
FMEYCKSFNIPHLPRENVELRIGVNSGPCVAGVVGLSMPRYCLFGDTVNTASRMESNGKPSLIHLTNDAHSLLTTHYPN
QYE
> Q08828^.^1^186^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^217.00^0.000e+00^0.000e+00
FHKIYIQRHDNVSILFADIVGFTGLASQCTAQELVKLLNELFGKFDELATENHCRRIKILGDCYYCVSGLTQPKTDHAH
CCVEMGLDMIDTITSVAEATEVDLNMRVGLHTGRVLCGVLGLRKWQYDVWSNDVTLANVMEAAGLPGKVHITKTTLACL
NGDYEVEPGYGHERNSFLKTHNIETFFI
> Q08828^.^1^51^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase ca
talytic domain^PSIBLAST^49.70^0.000e+00^3.000e-09
NDFVLRVGINVGPVVAGVIGARRPQYDIWGNTVNVASRMDSTGVQGRIQVT
> Q08462^.^1^186^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^180.00^0.000e+00^0.000e+00
FHNLYVKRHTNVSILYADIVGFTRLASDCSPGELVHMLNELFGKFDQIAKENECMRIKILGDCYYCVSGLPISLPNHAK
NCVKMGLDMCEAIKKVRDATGVDINMRVGVHSGNVLCGVIGLQKWQYDVWSHDVTLANHMEAGGVPGRVHISSVTLEHL
NGAYKVEEGDGDIRDPYLKQHLVKTYFV
> Q08462^.^1^167^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^49.70^0.000e+00^3.000e-09
DCVCVMFASIPDFKEFYTESDVNKEGLECLRLLNEIIADFDDLLSKPKFSGVEKIKTIGSTYMAATGLSAVPSQEHSQE
PERQYMHIGTMVEFAFALVGKLDAINKHSFNDFKLRVGINHGPVIAGVIGAQKPQYDIWGNTVNVASRMDSTGVLDKIQ
VTEETSLVL
> Q07553^.^1^152^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^82.50^0.000e+00^4.000e-19
DCVTILFSDIVGFTELCTTSTPFEVVEMLNDWYTCCDSIISNYDVYKVETIGDAYMVVSGLPLQNGSRHAGEIASLALH
LLETVGNLKIRHKPTETVQLRIGVHSGPCAAGVVGQKMPRYCLFGDTVNTASRMESTGDSMRIHISEATYQLL
> Q07093^.^1^158^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^65.50^0.000e+00^5.000e-14
VTILFSDIVGFTSICSRATPFMVISMLEGLYKDFDEFCDFFDVYKVETIGDAYCVASGLHRASIYDAHRCLDGLKMIDA
CSKHITHDGEQIKMRIGLHTGTVLAGVVGRKMPRYCLFGHSVTIANKFESGSEALKINVSPTTKDWLTKHEGFEFELQP
> Q04400^.^1^189^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^299.00^0.000e+00^0.000e+00
MMFHKIYIQKHDNVSILFADIEGFTSLASQCTAQELVMTLNELFARFDKLAAENHCLRIKILGDCYYCVSGLPEARADH
AHCCVEMGMDMIEAISSVREVTGVNVNMRVGIHSGRVHCGVLGLRKWQFDVWSNDVTLANHMEAGGKAGRIHITKATLN
YLNGDYEVEPGCGGERNAYLKEHSIETFLIL


  [Part of this file has been deleted for brevity]

PTGNVAIVFTDIKNSTFLWELFPDAMRAAIKTHNDIMRRQLRIYGGYEVKTEGDAFMVAFPTPTSALVWCLSVQLKLLE
AEWPEEITSIQDGCLITDNSGTKVYLGLSVRMGVHWGCPVPEIDLVTQRMDYLGPVVNKAARVSGVADGGQITLS
> P22717^.^1^147^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^63.20^0.000e+00^3.000e-13
TILFSDVVTFTNICAACEPIQIVNMLNSMYSKFDRLTSVHDVYKVETIGDAYMVVGGVPVPVESHAQRVANFALGMRIS
AKEVMNPVTGEPIQIRVGIHTGPVLAGVVGDKMPRYCLFGDTVNTASRMESHGLPSKVHLSPTAHRAL
> P21932^.^1^186^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^178.00^0.000e+00^0.000e+00
FNTMYMYRHENVSILFADIVGFTQLSSACSAQELVKLLNELFARFDKLAAKYHQLRIKILGDCYYCICGLPDYREDHAV
CSILMGLAMVEAISYVREKTKTGVDMRVGVHTGTVLGGVLGQKRWQYDVWSTDVTVANKMEAGGIPGRVHISQSTMDCL
KGEFDVEPGDGGSRCDYLDEKGIETYLI
> P20595^.^1^165^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^75.90^0.000e+00^4.000e-17
HKRPVPAKRYDNVTILFSGIVGFNAFCSKHASGEGAMKIVNLLNDLYTRFDTLTDSRKNPFVYKVETVGDKYMTVSGLP
EPCIHHARSICHLALDMMEIAGQVQVDGESVQITIGIHTGEVVTGVIGQRMPRYCLFGNTVNLTSRTETTGEKGKINVS
EYTYRCL
> P20594^.^1^168^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^84.40^0.000e+00^1.000e-19
VQAEAFDSVTIYFSDIVGFTALSAESTPMQVVTLLNDLYTCFDAIIDNFDVYKVETIGDAYMVVSGLPGRNGQRHAPEI
ARMALALLDAVSSFRIRHRPHDQLRLRIGVHTGPVCAGVVGLKMPRYCLFGDTVNTASRMESNGQALKIHVSSTTKDAL
DELGCFQLEL
> P19754^.^1^186^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^217.00^0.000e+00^0.000e+00
FHKIYIQRHDNVSILFADIVGFTGLASQCTAQELVKLLNELFGKFDELATENHCRRIKILGDCYYCVSGLTQPKTDHAH
CCVEMGLDMIDTITSVAEATEVDLNMRVGLHTGRVLCGVLGLRKWQYDVWSNDVTLANVMEAAGLPGKVHITKTTLACL
NGDYEVEPGHGHERNSFLKTHNIETFFI
> P19754^.^1^51^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase ca
talytic domain^PSIBLAST^49.70^0.000e+00^3.000e-09
NDFVLRVGINVGPVVAGVIGARRPQYDIWGNTVNVASRMDSTGVQGRIQVT
> P19687^.^1^161^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^77.10^0.000e+00^1.000e-17
AVQAKRFGNVTMLFSDIVGFTAICSQCSPLQVITMLNALYTRFDRQCGELDVYKVETIGDAYCVAGGLHKESDTHAVQI
ALMALKMMELSHEVVSPHGEPIKMRIGLHSGSVFAGVVGVKMPRYCLFGNNVTLANKFESCSVPRKINVSPTTYRLLKD
CPG
> P19686^.^1^160^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^72.50^0.000e+00^4.000e-16
VQAKKFNEVTMLFSDIVGFTAICSQCSPLQVITMLNALYTRFDQQCGELDVYKVETIGDAYCVAGGLHRESDTHAVQIA
LMALKMMELSNEVMSPHGEPIKMRIGLHSGSVFAGVVGVKMPRYCLFGNNVTLANKFESCSVPRKINVSPTTYRLLKDC
PG
> P18910^.^1^168^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^84.40^0.000e+00^1.000e-19
VQAEAFDSVTIYFSDIVGFTALSAESTPMQVVTLLNDLYTCFDAVIDNFDVYKVETIGDAYMVVSGLPVRNGQLHAREV
ARMALALLDAVRSFRIRHRPQEQLRLRIGIHTGPVCAGVVGLKMPRYCLFGDTVNTASRMESNGEALKIHLSSETKAVL
EEFDGFELEL
> P18293^.^1^168^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^85.90^0.000e+00^4.000e-20
VQAEAFDSVTIYFSDIVGFTALSAESTPMQVVTLLNDLYTCFDAVIDNFDVYKVETIGDAYMVVSGLPVRNGQLHAREV
ARMALALLDAVRSFRIRHRPQEQLRLRIGIHTGPVCAGVVGLKMPRYCLFGDTVNTASRMESNGEALRIHLSSETKAVL
EEFDGFELEL
> P16068^.^1^165^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^75.90^0.000e+00^4.000e-17
HKRPVPAKRYDNVTILFSGIVGFNAFCSKHASGEGAMKIVNLLNDLYTRFDTLTDSRKNPFVYKVETVGDKYMTVSGLP
EPCIHHARSICHLALDMMEIAGQVQVDGESVQITIGIHTGEVVTGVIGQRMPRYCLFGNTVNLTSRTETTGEKGKINVS
EYTYRCL
> P16067^.^1^168^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^84.40^0.000e+00^1.000e-19
VQAEAFDSVTIYFSDIVGFTALSAESTPMQVVTLLNDLYTCFDAIIDNFDVYKVETIGDAYMVVSGLPGRNGQRHAPEI
ARMALALLDAVSSFRIRHRPHDQLRLRIGVHTGPVCAGVVGLKMPRYCLFGDTVNTASRMESNGQALKIHVSSTTKDAL
DELGCFQLEL
> P16066^.^1^168^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^84.80^0.000e+00^7.000e-20
VQAEAFDSVTIYFSDIVGFTALSAESTPMQVVTLLNDLYTCFDAVIDNFDVYKVETIGDAYMVVSGLPVRNGRLHACEV
ARMALALLDAVRSFRIRHRPQEQLRLRIGIHTGPVCAGVVGLKMPRYCLFGDTVNTASRMESNGEALKIHLSSETKAVL
EEFGGFELEL
> P16065^.^1^143^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^82.10^0.000e+00^5.000e-19
VSIFFSDIVGFTALSAASTPIQVVNLLNDLYTLFDAIISNYDVYKVETIGDAYMLVSGLPLRNGDRHAGQIASTAHHLL
ESVKGFIVPHKPEVFLKLRIGIHSGSCVAGVVGLTMPRYCLFGDTVNTASRMESNGLALRIHVS
> O95622^.^1^189^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^301.00^0.000e+00^0.000e+00
MMFHKIYIQKHDNVSILFADIEGFTSLASQCTAQELVMTLNELFARFDKLAAENHCLRIKILGDCYYCVSGLPEARADH
AHCCVEMGMDMIEAISLVREVTGVNVNMRVGIHSGRVHCGVLGLRKWQFDVWSNDVTLANHMEAGGKAGRIHITKATLN
YLNGDYEVEPGCGGERNAYLKEHSIETFLIL
> O95622^.^1^159^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^55.10^0.000e+00^7.000e-11
VAVMFASIANFSEFYVELEANNEGVECLRLLNEIIADFDEIISEDRFRQLEKIKTIGSTYMAASGLNDSTYDKVGKTHI
KALADFAMKLMDQMKYINEHSFNNFQMKIGLNIGPVVAGVIGARKPQYDIWGNTVNVASRMDSTGVPDRIQVTTDMYQV
L
> O75343^.^1^147^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^68.20^0.000e+00^8.000e-15
TILFSDVVTFTNICTACEPIQIVNVLNSMYSKFDRLTSVHAVYKVETIGDAYMVVGGVPVPIGNHAQRVANFALGMRIS
AKEVTNPVTGEPIQLRVGIHTGPVLADVVGDKMPRYCLFGDTVNTASRMESHGLPNKVHLSPTAYRAL
> O60503^.^1^179^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^143.00^0.000e+00^2.000e-37
VSILFADIVGFTKMSANKSAHALVGLLNDLFGRFDRLCEETKCEKISTLGDCYYCVAGCPEPRADHAYCCIEMGLGMIK
AIEQFCQEKKEMVNMRVGVHTGTVLCGILGMRRFKFDVWSNDVNLANLMEQLGVAGKVHISEATAKYLDDRYEMEDGKV
IERLGQSVVADQLKGLKTYLI
> O60266^.^1^186^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^179.00^0.000e+00^0.000e+00
FNTMYMYRHENVSILFADIVGFTQLSSACSAQELVKLLNELFARFDKLAAKYHQLRIKILGDCYYCICGLPDYREDHAV
CSILMGLAMVEAISYVREKTKTGVDMRVGVHTGTVLGGVLGQKRWQYDVWSTDVTVANKMEAGGIPGRVHISQSTMDCL
KGEFDVEPGDGGSRCDYLEEKGIETYLI
> O60266^.^1^54^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like^
Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase ca
talytic domain^PSIBLAST^46.30^0.000e+00^3.000e-08
NNFMLRIGMNKGGVLAGVIGARKPHYDIWGNTVNVASRMESTGVMGNIQVVEET
> O43306^.^1^189^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^287.00^0.000e+00^0.000e+00
MMFHKIYIQKHDNVSILFADIEGFTSLASQCTAQELVMTLNELFARFDKLAAENHCLRIKILGDCYYCVSGLPEARADH
AHCCVEMGVDMIEAISLVREVTGVNVNMRVGIHSGRVHCGVLGLRKWQFDVWSNDVTLANHMEAGGRAGRIHITRATLQ
YLNGDYEVEPGRGGERNAYLKEQHIETFLIL
> O43306^.^1^159^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^55.90^0.000e+00^3.000e-11
VAVMFASIANFSEFYVELEANNEGVECLRLLNEIIADFDEIISEERFRQLEKIKTIGSTYMAASGLNASTYDQVGRSHI
TALADYAMRLMEQMKHINEHSFNNFQMKIGLNMGPVVAGVIGARKPQYDIWGNTVNVSSRMDSTGVPDRIQVTTDLYQV
L
> O30820^.^1^149^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^80.20^0.000e+00^2.000e-18
DEASVLFADIVGFTERASSTAPADLVRFLDRLYSAFDELVDQHGLEKIKVSGDSYMVVSGVPRPRPDHTQALADFALDM
TNVAAQLKDPRGNPVPLRVGLATGPVVAGVVGSRRFFYDVWGDAVNVASRMESTDSVGQIQVPDEVYERL
> O19179^.^1^161^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^82.50^0.000e+00^4.000e-19
VTLYFSDIVGFTTISAMSEPIEVVDLLNDLYTLFDAIIGSHDVYKVETIGDAYMVASGLPQRNGQRHAAEIANMALDIL
SAVGSFRMRHMPEVPVRIRIGLHSGPCVAGVVGLTMPRYCLFGDTVNTASRMESTGLPYRIHVNMSTVRILHALDEGFQ
TEV
> O02740^.^1^162^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^Ferredoxin-like
^Adenylyl and guanylyl cyclase catalytic domain^Adenylyl and guanylyl cyclase c
atalytic domain^PSIBLAST^85.20^0.000e+00^7.000e-20
DLVTLYFSDIVGFTTISAMSEPIEVVDLLNDLYTLFDAIIGSHDVYKVETIGDAYMVASGLPKRNGMRHAAEIANMSLD
ILSSVGTFKMRHMPEVPVRIRIGLHSGPVVAGVVGLTMPRYCLFGDTVNTASRMESTGLPYRIHVSHSTVTILRTLGEG
YEVE
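
   Each record in the listing above starts with a '>' header of
   '^'-separated fields, followed by the hit sequence. The headers appear
   to be wrapped across several lines for display only. The following is a
   minimal Python sketch for splitting one such header once it has been
   joined back into a single line. The field labels are illustrative
   guesses based on the example records and are not names defined by
   SEQSEARCH; section 4.0 remains the reference for the format.

```python
# Field labels are assumptions made for this sketch; they are inferred
# from the example records, not taken from the SEQSEARCH source.
FIELDS = (
    "accession", "code", "start", "end", "scheme", "domain_id", "sunid",
    "klass", "architecture", "topology", "fold", "superfamily", "family",
    "model", "score", "pvalue", "evalue",
)


def parse_dhf_header(line):
    """Split one unwrapped DHF header line into a dict keyed by FIELDS."""
    if not line.startswith(">"):
        raise ValueError("DHF header must start with '>'")
    parts = line[1:].strip().split("^")
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    record = dict(zip(FIELDS, parts))
    # Convert the numeric fields; '.' marks an empty field and stays text.
    for key in ("start", "end"):
        record[key] = int(record[key])
    for key in ("score", "pvalue", "evalue"):
        record[key] = float(record[key])
    return record


# First record of the listing, with its wrapped header joined up again.
header = (
    "> Q08828^.^1^51^SCOP^.^55074^Alpha and beta proteins (a+b)^.^.^"
    "Ferredoxin-like^Adenylyl and guanylyl cyclase catalytic domain^"
    "Adenylyl and guanylyl cyclase catalytic domain^PSIBLAST^49.70^"
    "0.000e+00^3.000e-09"
)
hit = parse_dhf_header(header)
print(hit["accession"], hit["start"], hit["end"], hit["model"], hit["score"])
```

   In the example records the end position equals the length of the
   sequence that follows, which gives a simple consistency check when
   reading a file back in.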

  File: seqsearch.log

//
/ebi/services/idata/pmr/hgmp/test/qa/domainalign-keep/daf/55074.daf
//
/ebi/services/idata/pmr/hgmp/test/qa/domainalign-keep/daf/54894.daf

5.0 DATA FILES

   SEQSEARCH does not require any data files.

6.0 USAGE

  6.1 COMMAND LINE ARGUMENTS

   Standard (Mandatory) qualifiers:
   -mode               menu       This option specifies the mode of SEQSEARCH
                                  operation. SEQSEARCH takes as input a
                                  directory of either i. single sequences or
                                  ii. sets of sequences (unaligned or aligned,
                                  but typically aligned sequences within a
                                  domain alignment file). The user has to
                                  specify which.
  [-inseqspath]        dirlist    This option specifies the location of
                                  sequences, e.g. DAF files (domain alignment
                                  files) (input). SEQSEARCH takes as input a
                                  database of either i. single sequences, ii.
                                  sets of unaligned sequences or iii. sets of
                                  aligned sequences, e.g. a domain alignment
                                  file. A 'domain alignment file' contains a
                                  sequence alignment of domains belonging to
                                  the same SCOP or CATH family. The file is in
                                  clustal format annotated with domain family
                                  classification information. The files
                                  generated by using SCOPALIGN will contain a
                                  structure-based sequence alignment of
                                  domains of known structure only. Such
                                  alignments can be extended with sequence
                                  relatives (of unknown structure) by using
                                  SEQALIGN.
  [-database]          string     Name of BLAST-indexed database to search.
   -niter              integer    This option specifies the number of PSIBLAST
                                  iterations that are performed in a search.
   -evalue             float      This option specifies the threshold E-value
                                  for a PSIBLAST hit to be retained, i.e. for
                                  inclusion in the family.
   -maxhits            integer    This option specifies the maximum number of
                                  PSIBLAST hits that are retained. It should
                                  normally be set high so that nothing is
                                  discarded.
  [-dhfoutdir]         outdir     This option specifies the location of DHF
                                  files (domain hits files) (output). A
                                  'domain hits file' contains database hits
                                  (sequences) with domain classification
                                  information, in FASTA format. The hits are
                                  relatives to a SCOP or CATH family and are
                                  found from a search of a sequence database.
                                  Files containing hits retrieved by PSIBLAST
                                  are generated by using SEQSEARCH.
   -logfile            outfile    This option specifies the name of log file
                                  for the build. The log file contains
                                  messages about any errors arising while
                                  SEQSEARCH ran.

   Additional (Optional) qualifiers: (none)
   Advanced (Unprompted) qualifiers: (none)
   Associated qualifiers:

   "-logfile" associated qualifiers
   -odirectory         string     Output directory

   General qualifiers:
   -auto               boolean    Turn off prompts
   -stdout             boolean    Write standard output
   -filter             boolean    Read standard input, write standard output
   -options            boolean    Prompt for standard and additional values
   -debug              boolean    Write debug output to program.dbg
   -verbose            boolean    Report some/full command line options
   -help               boolean    Report command line options. More
                                  information on associated and general
                                  qualifiers can be found with -help -verbose
   -warning            boolean    Report warnings
   -error              boolean    Report errors
   -fatal              boolean    Report fatal errors
   -die                boolean    Report deaths


   The allowed values and defaults are summarised below; each qualifier
   is described in the list above.

   Standard (Mandatory) qualifiers

   Qualifier       Allowed values                            Default
   -mode           1 (Single sequences)                      1
                   2 (Multiple sequences (e.g. sequence
                   set or alignment))
   [-inseqspath]   Directory with files                      ./
   (Parameter 1)
   [-database]     Any string is accepted                    swissprot
   (Parameter 2)
   -niter          Any integer value                         1
   -evalue         Any numeric value                         0.001
   -maxhits        Any integer value                         1000
   [-dhfoutdir]    Output directory                          ./
   (Parameter 3)
   -logfile        Output file                               seqsearch.log

   Additional (Optional) qualifiers: (none)
   Advanced (Unprompted) qualifiers: (none)
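
   The qualifiers above can also be given on the command line so that
   SEQSEARCH runs without prompting. The following Python sketch only
   assembles such an argument list; it does not run the program. The
   function name and its keyword arguments are hypothetical helpers
   introduced for this illustration. The qualifier names and defaults are
   those listed above, and the example values are those used in the
   session in section 6.2.

```python
import shlex


def seqsearch_argv(inseqspath, database, dhfoutdir, mode=1, niter=1,
                   evalue=0.001, maxhits=1000, logfile="seqsearch.log"):
    """Build an argument list for a non-interactive SEQSEARCH run.

    Defaults mirror the 'Default' column of the qualifier table.
    """
    return [
        "seqsearch",
        "-mode", str(mode),
        "-inseqspath", inseqspath,
        "-database", database,
        "-niter", str(niter),
        "-evalue", str(evalue),
        "-maxhits", str(maxhits),
        "-dhfoutdir", dhfoutdir,
        "-logfile", logfile,
        "-auto",  # general qualifier: turn off prompts
    ]


argv = seqsearch_argv("../domainalign-keep/daf", "swsmall", "./",
                      mode=2, evalue=0.0001, maxhits=100)
# Print a shell-quoted form that could be pasted at a prompt.
print(shlex.join(argv))
```

   The list can be handed to a process launcher such as subprocess.run
   once PSIBLAST and the database are in place (see section 8.0).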

  6.2 EXAMPLE SESSION

   An example of interactive use of SEQSEARCH is shown below.


% seqsearch 
Generate PSI-BLAST hits (DHF file) from a DAF file.
Input mode
         1 : Single sequences
         2 : Multiple sequences (e.g. sequence set or alignment)
Select mode of operation. [1]: 2
Location of sequences, e.g. DAF files (domain alignment files) (input). [./]: ../domainalign-keep/daf
Name of BLAST-indexed database to search. [swissprot]: swsmall
Number of PSIBLAST iterations. [1]: 
Threshold E-value for inclusion in family. [0.001]: 0.0001
Maximum number of hits. [1000]: 100
Location of DHF files (domain hits files) (output). [./]: 
Name of log file for the build. [seqsearch.log]: 
[blastpgp] WARNING: posFindAlignmentDimensions: Attempting to recover data from
 multiple alignment file

[blastpgp] WARNING: posProcessAlignment: Alignment recovered successfully

[blastpgp] WARNING: posFindAlignmentDimensions: Attempting to recover data from
 multiple alignment file

[blastpgp] WARNING: posProcessAlignment: Alignment recovered successfully


PROCESSING /ebi/services/idata/pmr/hgmp/test/qa/domainalign-keep/daf/55074.daf
blastpgp -i ./seqsearch-1234567890.1234.seqin -B ./seqsearch-1234567890.1234.se
qsin -j 1 -e 0.000100 -b 100 -v 100 -d ../../data/structure/swsmall > ./seqsear
ch-1234567890.1234.psiout

PROCESSING /ebi/services/idata/pmr/hgmp/test/qa/domainalign-keep/daf/54894.daf
blastpgp -i ./seqsearch-1234567890.1234.seqin -B ./seqsearch-1234567890.1234.se
qsin -j 1 -e 0.000100 -b 100 -v 100 -d ../../data/structure/swsmall > ./seqsear
ch-1234567890.1234.psiout
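
   The blastpgp commands echoed in the session suggest how the SEQSEARCH
   qualifiers map onto blastpgp options: -niter to -j, -evalue to -e,
   -maxhits to both -b and -v, and the database to -d. The Python sketch
   below rebuilds the echoed command from those values. It is an
   illustration inferred from the session output, not the SEQSEARCH
   source; the function name is hypothetical, and treating -B as optional
   (for single-sequence input) is an assumption.

```python
def blastpgp_command(seqin, psiout, database, niter, evalue, maxhits,
                     seqsin=None):
    """Assemble a blastpgp command line in the form echoed by the session."""
    parts = ["blastpgp", "-i", seqin]
    if seqsin is not None:
        # -B supplies the input alignment used to seed the search.
        parts += ["-B", seqsin]
    parts += [
        "-j", str(niter),        # number of iterations (-niter)
        "-e", f"{evalue:f}",     # E-value threshold (-evalue)
        "-b", str(maxhits),      # alignments reported (-maxhits)
        "-v", str(maxhits),      # one-line descriptions reported (-maxhits)
        "-d", database,          # BLAST-indexed database (-database)
    ]
    return " ".join(parts) + " > " + psiout


stem = "./seqsearch-1234567890.1234"
cmd = blastpgp_command(stem + ".seqin", stem + ".psiout",
                       "../../data/structure/swsmall",
                       niter=1, evalue=0.0001, maxhits=100,
                       seqsin=stem + ".seqsin")
print(cmd)
```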



7.0 KNOWN BUGS & WARNINGS

   None.

8.0 NOTES

   1. Use of PSIBLAST
   PSIBLAST must be installed on the system that is running SEQSEARCH;
   SEQSEARCH runs it through the blastpgp executable. When running
   SEQSEARCH at the HGMP it is essential that the command 'use blast_v2'
   (which runs the script /packages/menu/USE/blast_v2) is given before
   SEQSEARCH is run.
   SEQSEARCH requires a BLAST-indexed database to be present, i.e. both
   the sequence and index files must be present on the system. The name
   of the database to search (the -database qualifier, whose default is
   set in the ACD file) is the name that is given as the -d parameter to
   blastpgp (e.g. blastpgp -d swissprot).
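
   Before starting a long run it can be worth checking that the database
   really is indexed. The Python sketch below assumes a protein database
   indexed with the legacy BLAST formatdb tool, which normally writes
   files with the suffixes .phr, .pin and .psq next to the database name.
   The suffix list and the function name are assumptions of this sketch;
   large databases split into volumes are laid out differently and would
   need a different check.

```python
import tempfile
from pathlib import Path

# Assumption: index files written by legacy formatdb for a protein database.
INDEX_SUFFIXES = (".phr", ".pin", ".psq")


def missing_index_files(database):
    """Return the expected index files that do not exist for 'database'.

    'database' is the path and name that would be given to blastpgp -d.
    """
    return [database + suffix for suffix in INDEX_SUFFIXES
            if not Path(database + suffix).is_file()]


# Demonstration in a scratch directory with one index file left out.
with tempfile.TemporaryDirectory() as tmp:
    db = str(Path(tmp) / "swsmall")
    for suffix in (".phr", ".pin"):
        Path(db + suffix).touch()
    missing = [Path(p).name for p in missing_index_files(db)]

print(missing)
```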

  8.1 GLOSSARY OF FILE TYPES

   File type:   Domain hits file
   Format:      DHF format (FASTA-like).
   Description: Database hits (sequences) with domain classification
                information. The hits are relatives to a SCOP or CATH
                family (or other node in the structural hierarchies) and
                are found from a search of a discriminating element (e.g.
                a protein signature, hidden Markov model, simple
                frequency matrix, Gribskov profile or Henikoff profile)
                against a sequence database.
   Created by:  SEQSEARCH (hits retrieved by PSIBLAST). SIGSCAN (hits
                retrieved by sparse protein signature). LIBSCAN (hits
                retrieved by various types of HMM and profile).
   See also:    N.A.

   File type:   Domain alignment file
   Format:      DAF format (CLUSTAL-like format with domain
                classification information).
   Description: Contains a sequence alignment of domains belonging to
                the same SCOP or CATH family. The file is annotated with
                domain family classification information.
   Created by:  DOMAINALIGN (structure-based sequence alignment of
                domains of known structure).
   See also:    DOMAINALIGN alignments can be extended with sequence
                relatives (of unknown structure) to the family in
                question by using SEQALIGN.

9.0 DESCRIPTION

   Homology search tools such as BLAST make it possible to find relatives
   of a group of related proteins (family, superfamily etc), given one or
   more sequences belonging to the group of interest. For example,
   PSIBLAST can use a sequence alignment as the seed with which to search
   a sequence database. Performing such searches for large datasets, such
   as all families within SCOP or CATH, potentially requires a lot of
   time for preparation of datasets, running jobs and so on, in addition
   to the compute time required for the searches themselves. SEQSEARCH
   automatically performs a PSIBLAST search of a sequence database for
   each file in a directory of sequences or sets of sequences, using the
   contents of each file as the query. Typically, the directory contains
   DAF files (domain alignment files) and the alignments are for a
   certain node (e.g. family, superfamily etc) from SCOP or CATH.
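
   The per-file behaviour described above, together with the rule in
   section 2.0 that an output file takes the base name of its input file,
   can be pictured with a short Python sketch. It only pairs input paths
   with output paths and runs no search. The function name and the
   extensions '.daf' and '.dhf' are assumptions for illustration, since
   paths and extensions are chosen by the user.

```python
import tempfile
from pathlib import Path


def plan_outputs(indir, outdir, in_ext=".daf", out_ext=".dhf"):
    """Pair each input file with the output path sharing its base name."""
    pairs = []
    for infile in sorted(Path(indir).glob("*" + in_ext)):
        pairs.append((infile, Path(outdir) / (infile.stem + out_ext)))
    return pairs


# Demonstration with two empty files named after the example families.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("55074.daf", "54894.daf"):
        (Path(tmp) / name).touch()
    names = [(src.name, dst.name) for src, dst in plan_outputs(tmp, tmp)]

for src_name, dst_name in names:
    print(src_name, "->", dst_name)
```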

10.0 ALGORITHM

   None.

11.0 RELATED APPLICATIONS

See also

   Program name                       Description
   contactcount Count specific versus non-specific contacts
   contacts     Generate intra-chain CON files from CCF files
   domainalign  Generate alignments (DAF file) for nodes in a DCF file
   domainrep    Reorder DCF file to identify representative structures
   domainreso   Remove low resolution domains from a DCF file
   interface    Generate inter-chain CON files from CCF files
   libgen       Generate discriminating elements from alignments
   matgen3d     Generate a 3D-1D scoring matrix from CCF files
   psiphi       Phi and psi torsion angles from protein coordinates
   rocon        Generates a hits file from comparing two DHF files
   rocplot      Performs ROC analysis on hits files
   scorecmapdir Contact scores for cleaned protein chain contact files
   seqalign     Extend alignments (DAF file) with sequences (DHF file)
   seqfraggle   Removes fragment sequences from DHF files
   seqsort      Remove ambiguous classified sequences from DHF files
   seqwords     Generates DHF files from keyword search of UniProt
   siggen       Generates a sparse protein signature from an alignment
   siggenlig    Generate ligand-binding signatures from a CON file
   sigscan      Generate hits (DHF file) from a signature search
   sigscanlig   Search ligand-signature library & write hits (LHF file)

12.0 DIAGNOSTIC ERROR MESSAGES

   The following 3 types of message might appear in the log file:
   WARN Could not open for reading my.file
   WARN No PSIBLAST hits therefore no output file my.file
   WARN Could not open for writing my.file
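
   When many families are processed, it can help to tally these messages
   per file. The Python sketch below matches the three message texts
   listed above. The category labels, the function name and the sample
   log lines are invented for the example, and the exact spacing of real
   log lines may differ, so whitespace is matched loosely.

```python
import re

# One pattern per documented message; the trailing group is the file name.
PATTERNS = {
    "unreadable_input": re.compile(
        r"^WARN\s+Could not open for reading\s+(.+)$"),
    "no_hits": re.compile(
        r"^WARN\s+No PSIBLAST hits therefore no output file\s+(.+)$"),
    "unwritable_output": re.compile(
        r"^WARN\s+Could not open for writing\s+(.+)$"),
}


def classify_warnings(lines):
    """Return (category, file name) for every recognised WARN line."""
    found = []
    for line in lines:
        text = line.strip()
        for kind, pattern in PATTERNS.items():
            match = pattern.match(text)
            if match:
                found.append((kind, match.group(1)))
                break
    return found


# Made-up log content, for illustration only.
sample_log = [
    "//",
    "/path/to/daf/55074.daf",
    "WARN  No PSIBLAST hits therefore no output file 55074.dhf",
    "//",
    "/path/to/daf/54894.daf",
    "WARN  Could not open for writing 54894.dhf",
]
found_warnings = classify_warnings(sample_log)
for kind, name in found_warnings:
    print(kind, name)
```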

13.0 AUTHORS

   Ranjeeva Ranasinghe (rranasin@rfcgr.mrc.ac.uk)
   Jon Ison (jison@rfcgr.mrc.ac.uk)
   MRC Rosalind Franklin Centre for Genomics Research, Wellcome Trust
   Genome Campus, Hinxton, Cambridge, CB10 1SB, UK

14.0 REFERENCES

   Please cite the authors and EMBOSS.
   Rice P, Longden I and Bleasby A (2000) "EMBOSS - The European
   Molecular Biology Open Software Suite" Trends in Genetics,
   16(6):276-277.
   
   See also http://emboss.sourceforge.net/

  14.1 Other useful references

   Altschul et al, Nuc. Acids. Res. 25:3389-3402, 1997
