Automated splice site analysis compiles data, from a wide variety of
sources, on mutations that result in genetic disorders.
The procedure for running automated splice site analysis is as follows:
- Obtain the mutation table
- Splice site analysis uses a script called
run1.splice (a modification of run.splice).
- run1.splice uses two input files, input and eein.
- Once the parameters in the files input and eein are set, run the
script. On the first run, give the command as run1.splice followed by
the accession number; on later runs no arguments are required.
- run1.splice calls the Perl program ee.pl, which generates the file
exoncoor.txt containing the exon boundaries.
- Then run2.splice is called, which generates the file containing the
natural sites and the information content of each natural site. A
pop.html file containing the information content of the natural sites
is also generated.
- Then new.pl is called separately for donor and acceptor to create a
separate instruction file for each, named donor.inst and its acceptor
counterpart. Delila is then run individually for donor and
acceptor, followed by scan. The parameters of scanp are set so that the
data file is obtained.
- Then real.pl is called to identify the base changes in the data file
and record them in three files, increased.html, decreased.html and
equal.html, which list the sites where the information content value
increases, decreases or does not change, respectively.
- Sort.pl is called, which takes all the sites where the information
content varies and sorts them by the mutation responsible. All the
sites where the information content changes are grouped under the
respective mutation.
It contains three lines.
The first line corresponds to the location of the mutation data file.
The second line corresponds to the location where the exon coordinates
are stored.
The third line specifies the cds position. If it starts with '0', the
cds start should not be added to get the nucleotide position; if it
starts with '1', the cds start should be added to obtain the correct
location of the nucleotide referred to.
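
The three-line layout and the '0'/'1' rule above can be sketched in
Python as follows. This is only an illustration: the names InputConfig,
parse_input_text and nucleotide_position are hypothetical, and treating
"adding the cds start" as a plain sum is an assumption, since the exact
offset convention is not stated here.

```python
from dataclasses import dataclass


@dataclass
class InputConfig:
    mutation_file: str    # line 1: location of the mutation data file
    exon_coord_file: str  # line 2: location of the exon coordinates
    add_cds_start: bool   # line 3: True when the line starts with '1'


def parse_input_text(text: str) -> InputConfig:
    """Parse the three-line file described above (hypothetical helper)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) < 3:
        raise ValueError("expected three non-empty lines")
    flag = lines[2][0]
    if flag not in ("0", "1"):
        raise ValueError("third line must start with '0' or '1'")
    return InputConfig(lines[0], lines[1], flag == "1")


def nucleotide_position(mutation_pos: int, cds_start: int,
                        cfg: InputConfig) -> int:
    """Apply the cds rule; plain addition is an assumption."""
    if cfg.add_cds_start:
        return mutation_pos + cds_start
    return mutation_pos


# Placeholder file names, for illustration only.
cfg = parse_input_text("mutations.txt\nexoncoor.txt\n1\n")
print(nucleotide_position(10, 100, cfg))
```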
example: input file
The first line specifies the accession number.
The second line specifies the mrna.txt file to be used.
The third line specifies the chromosome number.
The fourth line specifies the location where the output files are to be
stored. It also serves as the name of the gene.
The fifth line specifies the link to the mutation table.
The sixth line specifies the link to the genbank entry.
The lines following specify the references.
database of human type I and type III collagen mutations
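
A file with the six fixed lines and trailing reference lines described
above could be read as in the following Python sketch. The function
name parse_eein, the dictionary keys and the sample values are all
hypothetical; only the order of the lines comes from the description.

```python
def parse_eein(text: str) -> dict:
    """Split the file into six named fields plus a list of references."""
    lines = text.splitlines()
    if len(lines) < 6:
        raise ValueError("expected at least six lines")
    keys = (
        "accession",            # line 1
        "mrna_file",            # line 2
        "chromosome",           # line 3
        "output_dir",           # line 4, also used as the gene name
        "mutation_table_link",  # line 5
        "genbank_link",         # line 6
    )
    record = dict(zip(keys, (ln.strip() for ln in lines[:6])))
    # Every remaining non-empty line is treated as a reference.
    record["references"] = [ln.strip() for ln in lines[6:] if ln.strip()]
    return record


# Placeholder values, not taken from any real configuration.
sample = "\n".join([
    "ACC0001",
    "mrna.txt",
    "17",
    "GENE1",
    "mutation_table.html",
    "genbank_entry.html",
    "first reference",
    "second reference",
])
print(parse_eein(sample)["output_dir"])
```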
input : eein
output : exoncoor.txt
It reads the accession number given in eein.
The codon start position (cds) is identified from the mrna text file.
The exon boundaries are calculated from the mrna text file and the
output is written to the file exoncoor.txt.
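
The layout of the mrna text file is not described here, so the Python
sketch below assumes, purely for illustration, that it resembles a
GenBank feature table with "exon" and "CDS" lines of the form
start..end. The function name exon_boundaries and the sample text are
hypothetical, and this is not the logic of ee.pl itself.

```python
import re

# Matches lines such as "exon 1..120" or "CDS join(50..120,300..350)".
# Only the first start..end pair on the line is captured.
FEATURE_RE = re.compile(r"^\s*(exon|CDS)\s+(?:join\()?<?(\d+)\.\.>?(\d+)")


def exon_boundaries(text: str):
    """Return (cds_start, sorted list of (exon_start, exon_end))."""
    cds_start = None
    exons = []
    for line in text.splitlines():
        match = FEATURE_RE.match(line)
        if match is None:
            continue
        kind = match.group(1)
        start, end = int(match.group(2)), int(match.group(3))
        if kind == "CDS":
            if cds_start is None:  # keep the first CDS start seen
                cds_start = start
        else:
            exons.append((start, end))
    return cds_start, sorted(exons)


# Made-up feature lines in the assumed layout.
sample_features = (
    "     exon            1..120\n"
    "     CDS             join(50..120,300..350)\n"
    "     exon            300..400\n"
)
print(exon_boundaries(sample_features))
```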
input: input, eein, exoncoor.txt, instp
output: inst, mut_list
It reads the accession number from the eein file.
It reads the mutation data file.
Each mutation represented in the data file is converted to delila
instructions.
This program is executed individually for acceptor and donor with a
different Ribl matrix for each, so that different inst values are
generated for acceptor and donor.
The range of the window used in the delila instruction is equal to
twice the length of the Ribl matrix used.
mut_list contains the natural splice sites relative to the mutation.
If the mutation takes place in the middle of the exon, the natural
splice site is not located and is reported as NA (not applicable).
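
The window rule above (a range equal to twice the length of the Ribl
matrix) can be illustrated with a short Python sketch. Centring the
window on the mutation position is an assumption, since only the total
range is stated; the function name and the matrix lengths used in the
example are hypothetical.

```python
from typing import Tuple


def delila_window(mutation_pos: int, matrix_length: int) -> Tuple[int, int]:
    """Return (start, end) so that end - start == 2 * matrix_length.

    Assumes the window is centred on the mutation position.
    """
    if matrix_length <= 0:
        raise ValueError("matrix_length must be positive")
    return mutation_pos - matrix_length, mutation_pos + matrix_length


# Hypothetical matrix lengths; donor and acceptor use different matrices,
# so the same mutation yields a different window for each.
for label, length in (("donor", 10), ("acceptor", 28)):
    start, end = delila_window(1000, length)
    print(label, start, end, end - start)
```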
output: pop.html, natural.txt
other programs used: natural.pl, popup.pl, delila, scan
This is a bash script used to generate pop.html, a file which contains
the information content of the natural sites. The file exoncoor.txt
(output from ee.pl) is used as input. The file natural.txt contains the
locations of the natural sites. After the natural sites are extracted,
instructions are formed and delila is executed. The scanp value is
reduced to a very low value and the book obtained from delila is
scanned.
The pop.html containing the information content is generated using
Delila is run separately for donor and acceptor. More information about
delila is available from the link.
input: data , eein
output: increased.html, decreased.html, equal.html, gene.html,
increased.txt, decreased.txt, equal.txt
The chromosome and gene are read from the eein file. The output files
are stored in the directory named after the gene.
The data file is scanned and all the changes in the information content
values are categorized by type of change as increased, decreased or
no change. The changes are tabulated in the corresponding html files
(increased.html, decreased.html and equal.html).
All the changes observed when scanning under either donor or acceptor
are tabulated in separate "total" tables for donor and acceptor.
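
The three-way categorization described above can be sketched in Python
as follows. The record layout (a dictionary with ri_before and
ri_after), the function names and the tolerance parameter are
hypothetical; the format of the data file is not described here.

```python
def classify_change(ri_before: float, ri_after: float,
                    tolerance: float = 0.0) -> str:
    """Label a site by how its information content changed."""
    delta = ri_after - ri_before
    if delta > tolerance:
        return "increased"
    if delta < -tolerance:
        return "decreased"
    return "equal"


def categorize(sites):
    """Split site records into the three groups used for the output files."""
    buckets = {"increased": [], "decreased": [], "equal": []}
    for site in sites:
        label = classify_change(site["ri_before"], site["ri_after"])
        buckets[label].append(site)
    return buckets


# Made-up values, for illustration only.
example_sites = [
    {"position": 101, "ri_before": 4.0, "ri_after": 6.5},
    {"position": 202, "ri_before": 8.0, "ri_after": 2.5},
    {"position": 303, "ri_before": 5.0, "ri_after": 5.0},
]
for label, members in categorize(example_sites).items():
    print(label, [s["position"] for s in members])
```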
input: increased.txt, decreased.txt, equal.txt
All the changes in information content observed under both donor and
acceptor are tabulated according to the mutation responsible.
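
Grouping the changes under the mutation that caused them can be
sketched in Python as below. The tuple layout (mutation, site type,
position, change) and the mutation labels are hypothetical; the layout
of increased.txt, decreased.txt and equal.txt is not described here.

```python
from collections import defaultdict


def group_by_mutation(records):
    """Group (mutation, site_type, position, delta) tuples by mutation."""
    grouped = defaultdict(list)
    for mutation, site_type, position, delta in records:
        grouped[mutation].append((site_type, position, delta))
    # Within each mutation, order by site type and then by position.
    for changes in grouped.values():
        changes.sort(key=lambda change: (change[0], change[1]))
    return dict(grouped)


# Made-up records, for illustration only.
example_records = [
    ("mutA", "donor", 120, 2.5),
    ("mutB", "acceptor", 310, -4.0),
    ("mutA", "acceptor", 95, -1.5),
    ("mutA", "donor", 110, 0.0),
]
for mutation, changes in sorted(group_by_mutation(example_records).items()):
    print(mutation, changes)
```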
input: natural.txt, data
It generates an html file containing the information content of the
natural sites.
It generates the list of natural sites found in the accession number.
output: menu.html, donor.html, acceptor.html
It generates a list of menus for donor and acceptor.