Statistical approaches to automated gene identification without teacher.
Gorban A. N., Zinovyev A. Yu., Popova T. G.
Institut des Hautes Etudes Scientifiques Preprint. IHES M/01/34.
Online version: http://www.ihes.fr/PREPRINTS/M01/Resu/resu-M01-34.html
Abstract
An overview of statistical methods of gene identification is given. Particular attention is paid to methods that do not require a training set of already known genes. Following this analysis, several statistical approaches are proposed for computational exon identification in whole genomes. For several genomes, an optimal window length for averaging the GC-content function and for calculating codon frequencies has been found. A self-training procedure based on clustering in the multidimensional space of codon frequencies is proposed.
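The two windowed quantities named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `gc_content` and `codon_frequencies`, the `phase` parameter, and the handling of non-ACGT symbols are assumptions made only for illustration, and the window length is left as a free parameter, since the abstract does not state the optimal values.

```python
from collections import Counter
from itertools import product

# All 64 codons in a fixed order, so every window maps to a vector of equal length.
CODONS = ["".join(p) for p in product("ACGT", repeat=3)]


def gc_content(seq, window):
    """Return the GC fraction of every window of the given length (step 1)."""
    seq = seq.upper()
    if window <= 0 or window > len(seq):
        raise ValueError("window must be between 1 and len(seq)")
    # Count G/C in the first window, then update the count incrementally.
    count = sum(1 for base in seq[:window] if base in "GC")
    values = [count / window]
    for i in range(window, len(seq)):
        count += (seq[i] in "GC") - (seq[i - window] in "GC")
        values.append(count / window)
    return values


def codon_frequencies(seq, start, window, phase=0):
    """Return 64 codon frequencies for seq[start:start + window].

    Triplets are non-overlapping and read from offset `phase` (0, 1 or 2)
    inside the window. Triplets containing symbols other than A, C, G, T
    are skipped (an assumption of this sketch).
    """
    segment = seq.upper()[start + phase:start + window]
    triplets = (segment[i:i + 3] for i in range(0, len(segment) - 2, 3))
    valid = set(CODONS)
    counts = Counter(t for t in triplets if t in valid)
    total = sum(counts.values())
    return [counts[c] / total if total else 0.0 for c in CODONS]
```

Each window then yields a point in a 64-dimensional codon-frequency space. Under the same assumptions, these vectors could be passed to any clustering routine as a starting point for a self-training scheme of the kind the abstract describes.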