
                               rebaseextract 



Function

   Extract data from REBASE

Description

   The Restriction Enzyme database (REBASE) is a collection of
   information about restriction enzymes and related proteins. It
   contains published and unpublished references, recognition and
   cleavage sites, isoschizomers, commercial availability, methylation
   sensitivity, crystal and sequence data. DNA methyltransferases, homing
   endonucleases, nicking enzymes, specificity subunits and control
   proteins are also included. Most recently, putative DNA
   methyltransferases and restriction enzymes, as predicted from analysis
   of genomic sequences, are also listed.

   The home page of REBASE is: http://rebase.neb.com/

   This program derives recognition site and cleavage information from
   the "withrefm" file of a REBASE distribution. It creates three files
   in the EMBOSS data subdirectory REBASE: a pattern file, a reference
   file and a supplier file.

   By default it also produces an 'embossre.equ' file, a file of
   preferred isoschizomers calculated from the restriction enzyme
   prototypes in the "withrefm" file. Specify -noequivalences to turn
   this off. You may edit 'embossre.equ' to contain your available
   restriction enzymes.

   The EMBOSS programs that find restriction cutting sites use the data
   files produced by this program and will not work without them.

   Running this program may be the job of your system manager.

Usage

   Here is a sample session with rebaseextract


% rebaseextract 
Extract data from REBASE
Full pathname of WITHREFM file: withrefm
Full pathname of PROTO file: proto


Command line arguments

   Standard (Mandatory) qualifiers:
  [-infile]            infile     Full pathname of WITHREFM file
  [-protofile]         infile     Full pathname of PROTO file

   Additional (Optional) qualifiers:
   -[no]equivalences   boolean    This option calculates an embossre.equ file
                                  using restriction enzyme prototypes in the
                                  withrefm file.

   Advanced (Unprompted) qualifiers: (none)
   Associated qualifiers: (none)
   General qualifiers:
   -auto                boolean    Turn off prompts
   -stdout              boolean    Write standard output
   -filter              boolean    Read standard input, write standard output
   -options             boolean    Prompt for standard and additional values
   -debug               boolean    Write debug output to program.dbg
   -verbose             boolean    Report some/full command line options
   -help                boolean    Report command line options. More
                                  information on associated and general
                                  qualifiers can be found with -help -verbose
   -warning             boolean    Report warnings
   -error               boolean    Report errors
   -fatal               boolean    Report fatal errors
   -die                 boolean    Report deaths


   Qualifier           Description                       Allowed values  Default

   Standard (Mandatory) qualifiers
   [-infile]           Full pathname of WITHREFM file    Input file      Required
   (Parameter 1)
   [-protofile]        Full pathname of PROTO file       Input file      Required
   (Parameter 2)

   Additional (Optional) qualifiers
   -[no]equivalences   This option calculates an         Boolean value   Yes
                       embossre.equ file using           Yes/No
                       restriction enzyme prototypes
                       in the withrefm file.

   Advanced (Unprompted) qualifiers
   (none)
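
   As an illustration of the qualifiers listed above, the following
   Python sketch assembles a command line for an unattended run. The
   helper name rebaseextract_command is hypothetical; only the
   qualifiers -infile, -protofile, -noequivalences and -auto are taken
   from this page. It builds the argument list without running
   anything.

```python
import shlex

def rebaseextract_command(withrefm, proto, equivalences=True):
    """Build an argument list for a non-interactive rebaseextract run.

    Hypothetical helper: it only uses qualifiers documented on this page.
    """
    cmd = ["rebaseextract", "-infile", withrefm, "-protofile", proto]
    if not equivalences:
        # -noequivalences suppresses the embossre.equ file
        cmd.append("-noequivalences")
    cmd.append("-auto")  # turn off prompts
    return cmd

if __name__ == "__main__":
    # Show the command as it would be typed in a shell.
    print(shlex.join(rebaseextract_command("withrefm", "proto",
                                           equivalences=False)))
```

   The list can be passed to subprocess.run(cmd, check=True) on a
   machine where EMBOSS is installed; check=True then raises an
   exception if the exit status is not 0.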

Input file format

   The first input file must be the "withrefm" file of a REBASE
   distribution. The second input file is the REBASE "proto" file.

   For example, the withrefm file for REBASE version 005 is at:
   ftp://ftp.neb.com/pub/rebase/withrefm.005
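
   Judging from the sample file shown in this section, each enzyme
   record is a run of lines tagged <1> to <8>, and long values are
   hard-wrapped onto untagged continuation lines. The Python sketch
   below reads such records. It is not the parser used by
   rebaseextract; the function names are hypothetical, and the meaning
   of fields 1 to 3 (name, isoschizomers, recognition sequence) is
   assumed from the order of the entries in the file header.

```python
import re

TAG = re.compile(r"^<(\d)>(.*)$")  # a tagged line such as "<3>G^CGC"

def parse_withrefm(lines):
    """Yield one dict per record: field number -> list of raw lines."""
    record, current = {}, None
    for raw in lines:
        line = raw.rstrip("\n")
        m = TAG.match(line)
        if m:
            number = int(m.group(1))
            if number == 1:
                if 1 in record:      # a complete record precedes this one
                    yield record
                record = {}          # drop any fragment without a <1> line
            current = number
            record[current] = [m.group(2)]
        elif not line.strip():
            current = None           # a blank line ends the current field
        elif current is not None:
            record[current].append(line)  # wrapped continuation line
    if 1 in record:
        yield record

def isoschizomers(record):
    """Rejoin the hard-wrapped field 2 and split it on commas."""
    joined = "".join(record.get(2, []))
    return [name for name in joined.split(",") if name]

if __name__ == "__main__":
    sample = [
        "<7>N",
        "",
        "<1>HspAI",
        "<2>HhaI,AspLEI,Hin",
        "GUI,HinP1I",
        "<3>G^CGC",
        "<4>",
    ]
    for rec in parse_withrefm(sample):
        print(rec[1][0], rec[3][0], isoschizomers(rec))
```

   Continuation lines are joined without a separator because the
   sample wraps in the middle of names (for example "Hin" followed by
   "GUI"). Field 8 is left as raw lines, since wrapped references
   cannot be told apart from new ones by this simple rule.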

  Input files for usage example

  File: withrefm


REBASE version 106                                              withrefm.106

    =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
    REBASE, The Restriction Enzyme Database   http://rebase.neb.com
    Copyright (c)  Dr. Richard J. Roberts, 2001.   All rights reserved.
    =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Rich Roberts                                                    May 31 2001


<ENZYME NAME>   Restriction enzyme name.
<ISOSCHIZOMERS> Other enzymes with this specificity.
<RECOGNITION SEQUENCE>
                These are written from 5' to 3', only one strand being given.
                If the point of cleavage has been determined, the precise site
                is marked with ^.  For enzymes such as HgaI, MboII etc., which
                cleave away from their recognition sequence the cleavage sites
                are indicated in parentheses.

                For example HgaI GACGC (5/10) indicates cleavage as follows:
                                5' GACGCNNNNN^      3'
                                3' CTGCGNNNNNNNNNN^ 5'

                In all cases the recognition sequences are oriented so that
                the cleavage sites lie on their 3' side.

                REBASE Recognition sequences representations use the standard
                abbreviations (Eur. J. Biochem. 150: 1-5, 1985) to represent
                ambiguity.
                                R = G or A
                                Y = C or T
                                M = A or C
                                K = G or T
                                S = G or C
                                W = A or T
                                B = not A (C or G or T)
                                D = not C (A or G or T)
                                H = not G (A or C or T)
                                V = not T (A or C or G)
                                N = A or C or G or T



                ENZYMES WITH UNUSUAL CLEAVAGE PROPERTIES:

                Enzymes that cut on both sides of their recognition sequences,
                such as BcgI, Bsp24I, CjeI and CjePI, have 4 cleavage sites
                each instead of 2.



  [Part of this file has been deleted for brevity]

<6>S.A. Thompson
<7>N
<8>Morgan, R.D., Unpublished observations.
Morgan, R.D., Xu, Q., US Patent Office, 2001.
Xu, Q., Morgan, R., Blaser, M., Unpublished observations.

<1>HspAI
<2>HhaI,AspLEI,BcaI,BspLAI,BstHHI,CcoP95I,CfoI,Csp1470I,FnuDIII,Hin6I,Hin7I,Hin
GUI,HinP1I,HinS1I,HinS2I,Hpy99III,HpyF10I,HsoI,MnnIV,NgoEII,SciNI
<3>G^CGC
<4>
<5>Haemophilus species A
<6>S.K. Degtyarev
<7>I
<8>Rechkunova, N.I., Prikhod'ko, E.A., Shevchenko, A.V., Degtyarev, S.K., Unpub
lished observations.

<1>KpnI
<2>Acc65I,AhaB8I,Asp718I,BspJ106I,Eco149I,Esp19I,KpnK14I,MvsI,MvsAI,MvsBI,MvsCI
,MvsDI,MvsEI,NmiI,Sau10I,SthI,SthAI,SthBI,SthCI,SthDI,SthEI,SthFI,SthGI,SthHI,S
thJI,SthKI,SthLI,SthMI,SthNI,Uba76I,Uba85I,Uba86I,Uba87I,Uba1201I
<3>GGTAC^C
<4>4(6)
<5>Klebsiella pneumoniae OK8
<6>ATCC 49790
<7>ABCDEFGHIJKLMNOQRSTU
<8>Kiss, A., Finta, C., Venetianer, P., (1991) Nucleic Acids Res., vol. 19, pp.
 3460.
Smith, D.I., Blattner, F.R., Davies, J., (1976) Nucleic Acids Res., vol. 3, pp.
 343-353.
Tomassini, J., Roychoudhury, R., Wu, R., Roberts, R.J., (1978) Nucleic Acids Re
s., vol. 5, pp. 4055-4064.

<1>NotI
<2>CciNI,CspBI,MchAI
<3>GC^GGCCGC
<4>?(4)
<5>Nocardia otitidis-caviarum
<6>ATCC 14630
<7>ABCDEFGHJKLMNOQRSTU
<8>Borsetti, R., Wise, D., Qiang, B.-Q., Schildkraut, I., Unpublished observati
ons.
Morgan, R.D., Unpublished observations.
Morgan, R.D., Benner, J.S., Claus, T.E., US Patent Office, 1994.
Qiang, B.-Q., Schildkraut, I., (1987) Methods Enzymol., vol. 155, pp. 15-21.

<1>TaqI
<2>CviSIII,EsaBC3I,HpyV,Hpy26II,HpyF14III,HpyF16I,HpyF23I,HpyF24I,HpyF26III,Hpy
F30I,HpyF35I,HpyF40II,HpyF42IV,HpyF45I,HpyF49I,HpyF52I,HpyF59III,HpyF62II,HpyF6
4I,HpyF65II,HpyF66IV,HpyF71I,HpyF73II,HpyJP26II,PpaAII,Taq20I,Tbr51I,TfiA3I,Tfi
Tok4A2I,TfiTok6A1I,TflI,Tsc4aI,Tsp32I,Tsp32II,Tsp358I,Tsp505I,Tsp510I,TspAK13D2
1I,TspAK16D24I,TspNI,TspVi4AI,TspVil3I,Tth24I,TthHB8I,TthRQI
<3>T^CGA
<4>4(6)
<5>Thermus aquaticus YTI
<6>J.I. Harris
<7>ABCDEFGIJLMNOQRSTU
<8>Anton, B.P., Brooks, J.E., Unpublished observations.
Fomenkov, A., Xiao, J.-P., Dila, D., Raleigh, E., Xu, S.-Y., (1994) Nucleic Aci
ds Res., vol. 22, pp. 2399-2403.
McClelland, M., (1981) Nucleic Acids Res., vol. 9, pp. 6795-6804.
Sato, S., Hutchison, C.A. III, Harris, J.I., (1977) Proc. Natl. Acad. Sci. U. S
. A., vol. 74, pp. 542-546.
Zebala, J.A., (1993) Diss. Abstr., vol. 54, pp. 1394-1398.
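
   The ambiguity codes listed in the header of the withrefm file above
   map directly onto regular-expression character classes. The Python
   sketch below shows one way to turn a recognition sequence into a
   pattern and search a sequence with it. The function names are
   hypothetical, only the strand as written is searched, and this is
   not how the EMBOSS programs themselves match sites.

```python
import re

# Ambiguity codes as given in the withrefm header.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "M": "[AC]", "K": "[GT]",
    "S": "[CG]", "W": "[AT]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}

def site_to_regex(site):
    """Translate a recognition sequence into a regex string.

    The cut marker '^' is ignored; an unknown letter raises KeyError.
    """
    return "".join(IUPAC[base] for base in site.upper() if base != "^")

def find_sites(site, sequence):
    """Return 0-based start positions of all matches, overlaps included."""
    # A lookahead makes each match zero-width, so overlapping sites are found.
    pattern = re.compile("(?=" + site_to_regex(site) + ")")
    return [m.start() for m in pattern.finditer(sequence.upper())]

if __name__ == "__main__":
    print(find_sites("GGTAC^C", "AAGGTACCTT"))   # KpnI site from the sample
    print(find_sites("GG^CC", "GGCCGGCC"))       # HaeIII site, two matches
```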

  File: proto


REBASE version 305                                              proto.305

    =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
    REBASE, The Restriction Enzyme Database   http://rebase.neb.com
    Copyright (c)  Dr. Richard J. Roberts, 2003.   All rights reserved.
    =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Rich Roberts                                                    Apr 30 2003




            TYPE II ENZYMES
            ---------------

BseYI                          CCCAGC (-5/-1)
BsiYI                          CCNNNNN^NNGG
BsrI                           ACTGG (1/-1)
HaeIII                         GG^CC
HpaII                          C^CGG
Ksp632I                        CTCTTC (1/4)
MaeII                          A^CGT



            TYPE I ENZYMES
            ---------------

EcoAI                          GAGNNNNNNNGTCA
EcoBI                          TGANNNNNNNNTGCT
EcoDI                          TTANNNNNNNGTCY
EcoDR2                         TCANNNNNNGTCG
EcoDR3                         TCANNNNNNNATCG
EcoDXXI                        TCANNNNNNNRTTC
EcoEI                          GAGNNNNNNNATGC
EcoKI                          AACNNNNNNGTGC



            TYPE III ENZYMES
            ---------------

EcoPI                          AGACC
EcoP15I                        CAGCAG (25/27)
HinfIII                        CGAAT
StyLTI                         CAGAG
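
   Each data line of the proto sample above holds an enzyme name, a
   recognition sequence and, for some enzymes, a pair of cut offsets
   in parentheses. The Python sketch below reads lines of that shape.
   It is based only on the sample shown here: the function name is
   hypothetical, the section headings (TYPE I, II, III) are ignored,
   and lines in any other layout are skipped rather than reported.

```python
import re

# name, whitespace, site (uppercase letters and '^'), optional "(a/b)"
PROTO_LINE = re.compile(
    r"^(?P<name>\S+)\s+(?P<site>[A-Z^]+)"
    r"(?:\s+\((?P<top>-?\d+)/(?P<bottom>-?\d+)\))?\s*$"
)

def parse_proto(lines):
    """Map enzyme name -> (recognition sequence, cut offsets or None)."""
    prototypes = {}
    for raw in lines:
        m = PROTO_LINE.match(raw.rstrip("\n"))
        if not m:
            continue  # header text, section titles, rules, blank lines
        cuts = None
        if m.group("top") is not None:
            cuts = (int(m.group("top")), int(m.group("bottom")))
        prototypes[m.group("name")] = (m.group("site"), cuts)
    return prototypes

if __name__ == "__main__":
    sample = [
        "            TYPE II ENZYMES",
        "            ---------------",
        "",
        "BseYI                          CCCAGC (-5/-1)",
        "HaeIII                         GG^CC",
    ]
    for name, (site, cuts) in parse_proto(sample).items():
        print(name, site, cuts)
```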

Output file format

  Output files for usage example

  File: embossre.equ

Bsc4I BsiYI
Bse1I BsrI
BshI HaeIII
BsiSI HpaII
Bsu6I Ksp632I
HpyCH4IV MaeII

  Directory: REBASE

   This directory contains output files.

   The output files are held in the REBASE subdirectory of the EMBOSS
   data directory. There are three:
     * embossre.enz Enzyme pattern file
     * embossre.ref Enzyme references
     * embossre.sup Enzyme suppliers

   By default rebaseextract also produces an 'embossre.equ' file in the
   EMBOSS data directory, a file of preferred isoschizomers calculated
   from the restriction enzyme prototypes in the "withrefm" file.
   Specify -noequivalences to turn this off. You may edit
   'embossre.equ' to contain your available restriction enzymes.
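
   In the sample 'embossre.equ' above, every line holds two enzyme
   names separated by a space, and the second name on each line also
   appears in the proto file. Assuming that layout holds in general,
   the Python sketch below reads the pairs, groups them by the second
   name and writes them back, which may help when editing the file by
   script. The function names are hypothetical.

```python
from collections import defaultdict

def read_equivalences(lines):
    """Return (first, second) name pairs from embossre.equ-style lines."""
    pairs = []
    for line in lines:
        fields = line.split()
        if len(fields) == 2:       # skip blank or malformed lines
            pairs.append((fields[0], fields[1]))
    return pairs

def group_by_second(pairs):
    """Map each second-column name to the first-column names paired with it."""
    groups = defaultdict(list)
    for first, second in pairs:
        groups[second].append(first)
    return dict(groups)

def format_equivalences(pairs):
    """Render pairs back into the two-column text layout."""
    return "".join(f"{first} {second}\n" for first, second in pairs)

if __name__ == "__main__":
    sample = ["Bsc4I BsiYI\n", "Bse1I BsrI\n", "BshI HaeIII\n"]
    pairs = read_equivalences(sample)
    print(group_by_second(pairs))
    print(format_equivalences(pairs), end="")
```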

Data files

   The "withrefm" file of a REBASE distribution is the input file for
   this program.

Notes

   The home page of REBASE is: http://rebase.neb.com/

   Running this program may be the job of your system manager.

   The ready-made files produced by this program may already be available
   at the REBASE web site: http://rebase.neb.com/rebase/rebase.files.html
   or http://rebase.neb.com/rebase/rebase.f37.html

References

    1. Nucleic Acids Research 27: 312-313 (1999).

Warnings

   The program will warn you if the input file is incorrectly formatted.

Diagnostic Error Messages

Exit status

   It exits with status 0 unless an error is reported.

Known bugs

See also

   Program name   Description
   aaindexextract Extract data from AAINDEX
   cutgextract    Extract data from CUTG
   printsextract  Extract data from PRINTS
   prosextract    Build the PROSITE motif database for use by patmatmotifs
   tfextract      Extract data from TRANSFAC

Author(s)

   Alan Bleasby (ajb@ebi.ac.uk)
   European Bioinformatics Institute, Wellcome Trust Genome Campus,
   Hinxton, Cambridge CB10 1SD, UK

History

   Completed 12th April 1999

Target users

   This program is intended to be used by administrators responsible for
   software and database installation and maintenance.

Comments

   None
