
                                  geecee 



Function

   Calculates fractional GC content of nucleic acid sequences

Description

   geecee calculates the fraction of G+C bases in each input nucleic
   acid sequence.

   It reads in nucleic acid sequences, counts the 'G' and 'C' bases in
   each one, and writes out that count as a fraction (in the interval
   0.0 to 1.0) of the length of the whole sequence.
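
   The calculation described above can be sketched in a few lines of
   Python. This is an illustration only, not the geecee source code:
   the function name gc_fraction is hypothetical, and the handling of
   case and of ambiguity codes (only the letters G and C are counted,
   case-insensitively) is an assumption rather than documented
   behaviour.

```python
def gc_fraction(sequence: str) -> float:
    """Return the fraction of G and C bases in a nucleic acid sequence.

    Assumption: only the letters G and C are counted, in either case;
    every other character still contributes to the sequence length.
    """
    if not sequence:
        raise ValueError("cannot compute GC content of an empty sequence")
    seq = sequence.upper()
    gc_count = seq.count("G") + seq.count("C")
    return gc_count / len(seq)


# First ten bases of the example entry: 1 G and 2 C out of 10 bases.
print(f"{gc_fraction('aagcttaaac'):.2f}")  # prints 0.30
```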

Usage

   Here is a sample session with geecee:


% geecee tembl:hhtetra 
Calculates fractional GC content of nucleic acid sequences
Output file [hhtetra.geecee]: 
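
   For use from a script, the same run can be started without prompts.
   The sketch below is one possible way to do this from Python. It uses
   only the -sequence, -outfile and -auto qualifiers listed under
   "Command line arguments"; the helper name build_geecee_command is
   hypothetical, and the example assumes that geecee is on the PATH and
   that the 'tembl' test database is configured.

```python
import shutil
import subprocess


def build_geecee_command(sequence_usa: str, outfile: str) -> list[str]:
    """Build a non-interactive geecee command line as an argument list."""
    return [
        "geecee",
        "-sequence", sequence_usa,  # parameter 1: sequence USA
        "-outfile", outfile,        # parameter 2: output file name
        "-auto",                    # turn off prompts
    ]


if __name__ == "__main__":
    cmd = build_geecee_command("tembl:hhtetra", "hhtetra.geecee")
    if shutil.which("geecee"):
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(cmd, check=True)
    else:
        print("geecee not found on PATH; would run:", " ".join(cmd))
```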


Command line arguments

   Standard (Mandatory) qualifiers:
  [-sequence]          seqall     Sequence database USA
  [-outfile]           outfile    Output file name

   Additional (Optional) qualifiers: (none)
   Advanced (Unprompted) qualifiers: (none)
   Associated qualifiers:

   "-sequence" associated qualifiers
   -sbegin1             integer    Start of each sequence to be used
   -send1               integer    End of each sequence to be used
   -sreverse1           boolean    Reverse (if DNA)
   -sask1               boolean    Ask for begin/end/reverse
   -snucleotide1        boolean    Sequence is nucleotide
   -sprotein1           boolean    Sequence is protein
   -slower1             boolean    Make lower case
   -supper1             boolean    Make upper case
   -sformat1            string     Input sequence format
   -sdbname1            string     Database name
   -sid1                string     Entryname
   -ufo1                string     UFO features
   -fformat1            string     Features format
   -fopenfile1          string     Features file name

   "-outfile" associated qualifiers
   -odirectory2         string     Output directory

   General qualifiers:
   -auto                boolean    Turn off prompts
   -stdout              boolean    Write standard output
   -filter              boolean    Read standard input, write standard output
   -options             boolean    Prompt for standard and additional values
   -debug               boolean    Write debug output to program.dbg
   -verbose             boolean    Report some/full command line options
   -help                boolean    Report command line options. More
                                  information on associated and general
                                  qualifiers can be found with -help -verbose
   -warning             boolean    Report warnings
   -error               boolean    Report errors
   -fatal               boolean    Report fatal errors
   -die                 boolean    Report deaths


   Standard (Mandatory) qualifiers    Allowed values         Default
   [-sequence] (Parameter 1)
     Sequence database USA            Readable sequence(s)   Required
   [-outfile] (Parameter 2)
     Output file name                 Output file            <sequence>.geecee

   Additional (Optional) qualifiers   Allowed values         Default
   (none)

   Advanced (Unprompted) qualifiers   Allowed values         Default
   (none)

Input file format

   geecee reads one or more nucleic acid sequences, specified as a USA
   (Uniform Sequence Address).

  Input files for usage example

   'tembl:hhtetra' is a sequence entry in the example nucleic acid
   database 'tembl'

  Database entry: tembl:hhtetra

ID   HHTETRA    standard; DNA; VRL; 1272 BP.
XX
AC   L46634; L46689;
XX
SV   L46634.1
XX
DT   06-NOV-1995 (Rel. 45, Created)
DT   04-MAR-2000 (Rel. 63, Last updated, Version 3)
XX
DE   Human herpesvirus 7 (clone ED132'1.2) telomeric repeat region.
XX
KW   telomeric repeat.
XX
OS   Human herpesvirus 7
OC   Viruses; dsDNA viruses, no RNA stage; Herpesviridae; Betaherpesvirinae.
XX
RN   [1]
RP   1-1272
RX   MEDLINE; 96079055.
RA   Secchiero P., Nicholas J., Deng H., Xiaopeng T., van Loon N., Ruvolo V.R.,
RA   Berneman Z.N., Reitz M.S. Jr., Dewhurst S.;
RT   "Identification of human telomeric repeat motifs at the genome termini of
RT   human herpesvirus 7: structural analysis and heterogeneity";
RL   J. Virol. 69(12):8041-8045(1995).
XX
FH   Key             Location/Qualifiers
FH
FT   source          1..1272
FT                   /db_xref="taxon:10372"
FT                   /organism="Human herpesvirus 7"
FT                   /strain="JI"
FT                   /clone="ED132'1.2"
FT   repeat_region   207..928
FT                   /note="long and complex repeat region composed of various
FT                   direct repeats, including TAACCC (TRS), degenerate copies
FT                   of TRS motifs and a 14-bp repeat, TAGGGCTGCGGCCC"
FT   misc_signal     938..998
FT                   /note="pac2 motif"
FT   misc_feature    1009
FT                   /note="right genome terminus (...ACA)"
XX
SQ   Sequence 1272 BP; 346 A; 455 C; 222 G; 249 T; 0 other;
     aagcttaaac tgaggtcaca cacgacttta attacggcaa cgcaacagct gtaagctgca        60
     ggaaagatac gatcgtaagc aaatgtagtc ctacaatcaa gcgaggttgt agacgttacc       120
     tacaatgaac tacacctcta agcataacct gtcgggcaca gtgagacacg cagccgtaaa       180
     ttcaaaactc aacccaaacc gaagtctaag tctcacccta atcgtaacag taaccctaca       240
     actctaatcc tagtccgtaa ccgtaacccc aatcctagcc cttagcccta accctagccc       300
     taaccctagc tctaacctta gctctaactc tgaccctagg cctaacccta agcctaaccc       360
     taaccgtagc tctaagttta accctaaccc taaccctaac catgaccctg accctaaccc       420
     tagggctgcg gccctaaccc tagccctaac cctaacccta atcctaatcc tagccctaac       480
     cctagggctg cggccctaac cctagcccta accctaaccc taaccctagg gctgcggccc       540
     taaccctaac cctagggctg cggcccgaac cctaacccta accctaaccc taaccctagg       600
     gctgcggccc taaccctaac cctagggctg cggccctaac cctaacccta gggctgcggc       660
     ccgaacccta accctaaccc taaccctagg gctgcggccc taaccctaac cctagggctg       720
     cggccctaac cctaacccta actctagggc tgcggcccta accctaaccc taaccctaac       780
     cctagggctg cggcccgaac cctagcccta accctaaccc tgaccctgac cctaacccta       840
     accctaaccc taaccctaac cctaacccta accctaaccc taaccctaac cctaacccta       900
     accctaaccc taaccctaac cctaaccccg cccccactgg cagccaatgt cttgtaatgc       960
     cttcaaggca ctttttctgc gagccgcgcg cagcactcag tgaaaaacaa gtttgtgcac      1020
     gagaaagacg ctgccaaacc gcagctgcag catgaaggct gagtgcacaa ttttggcttt      1080
     agtcccataa aggcgcggct tcccgtagag tagaaaaccg cagcgcggcg cacagagcga      1140
     aggcagcggc tttcagactg tttgccaagc gcagtctgca tcttaccaat gatgatcgca      1200
     agcaagaaaa atgttctttc ttagcatatg cgtggttaat cctgttgtgg tcatcactaa      1260
     gttttcaagc tt                                                          1272
//
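
   As a cross-check, the base composition on the SQ line of this entry
   is enough to reproduce the expected result by hand: (455 C + 222 G)
   / 1272 bases = 677 / 1272, which is about 0.53. The short Python
   sketch below does the same arithmetic; the helper name
   composition_from_sq is hypothetical and the regular expressions
   assume the SQ line layout shown in this entry.

```python
import re

SQ_LINE = "SQ   Sequence 1272 BP; 346 A; 455 C; 222 G; 249 T; 0 other;"


def composition_from_sq(line: str) -> tuple[int, dict[str, int]]:
    """Extract the sequence length and per-base counts from an EMBL SQ line."""
    length = int(re.search(r"Sequence (\d+) BP", line).group(1))
    counts = {
        base: int(number)
        for number, base in re.findall(r"(\d+) ([ACGT]|other);", line)
    }
    return length, counts


length, counts = composition_from_sq(SQ_LINE)
gc = (counts["G"] + counts["C"]) / length
print(f"{gc:.2f}")  # prints 0.53
```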

Output file format

  Output files for usage example

  File: hhtetra.geecee

#Sequence   GC content
HHTETRA       0.53

   The first non-blank line is the title line. Subsequent lines consist
   of two columns of data.
     * The first column is the name of the sequence.
     * The second column is the fractional G+C content of the sequence
       (a value between 0.0 and 1.0, not a percentage).
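
   A file in this two-column layout is straightforward to read back
   into a script. The following Python sketch shows one way to do so;
   the function name parse_geecee_output is hypothetical, and the code
   assumes that lines starting with '#' are title lines and that the
   GC value is the last whitespace-separated field on each data line.

```python
def parse_geecee_output(text: str) -> dict[str, float]:
    """Map each sequence name to its fractional GC content."""
    results = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the title line
        # Split once from the right so the value is always the last field.
        name, value = line.rsplit(None, 1)
        results[name] = float(value)
    return results


example = "#Sequence   GC content\nHHTETRA       0.53\n"
print(parse_geecee_output(example))  # prints {'HHTETRA': 0.53}
```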

Data files

   None.

Notes

   None.

References

   None.

Warnings

   None.

Diagnostic Error Messages

   None.

Exit status

   0 on successful completion.

Known bugs

   None.

See also

   Program name         Description
   cpgplot      Plot CpG rich areas
   cpgreport    Reports all CpG rich regions
   newcpgreport Report CpG rich areas
   newcpgseek   Reports CpG rich regions

Author(s)

   Richard Bruskiewich (r.bruskiewich@cgiar.org)
   while he was at:
   Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge,
   CB10 1SA, UK.

History

   Completed 18th June 1999.

Target users

   This program is intended to be used by everyone and everything, from
   naive users to embedded scripts.

Comments

   None.
