Statistical approaches to automated gene identification without teacher.
Gorban A. N., Zinovyev A. Yu., Popova T. G.
Institut des Hautes Etudes Scientifiques Preprint. IHES M/01/34.
Russian title: Статистический подход к автоматической идентификации генов без учителя (A statistical approach to automated gene identification without a teacher)
An overview of statistical methods of gene identification is given. Particular attention is paid to methods that do not need a training set of already known genes. Following this analysis, several statistical approaches are proposed for computational exon identification in whole genomes. For several genomes, an optimal window length for averaging the GC-content function and for calculating codon frequencies has been found. A self-training procedure based on clustering in the multidimensional space of codon frequencies is proposed.
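The two windowed quantities named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `frame` parameter, and the choice of non-overlapping triplets are assumptions made for illustration, and the optimal window length reported in the preprint is not reproduced here.

```python
from collections import Counter
from itertools import product

# All 64 codons in a fixed order, so every window maps to a vector of equal length.
CODONS = ["".join(p) for p in product("ACGT", repeat=3)]
CODON_SET = set(CODONS)


def gc_content(seq, window):
    """Return the GC fraction of every window of length `window` (step 1)."""
    seq = seq.upper()
    if window <= 0 or window > len(seq):
        raise ValueError("window must be between 1 and len(seq)")
    is_gc = [1 if base in "GC" else 0 for base in seq]
    count = sum(is_gc[:window])
    values = [count / window]
    # Slide the window one base at a time, updating the running count.
    for i in range(window, len(seq)):
        count += is_gc[i] - is_gc[i - window]
        values.append(count / window)
    return values


def codon_frequencies(seq, start, window, frame=0):
    """Return a 64-dimensional codon frequency vector for one window.

    Triplets are read without overlap, beginning at `start + frame`
    (hypothetical parameters). Triplets containing symbols other than
    A, C, G, T are skipped.
    """
    segment = seq.upper()[start + frame:start + window]
    triplets = (segment[i:i + 3] for i in range(0, len(segment) - 2, 3))
    counts = Counter(t for t in triplets if t in CODON_SET)
    total = sum(counts.values())
    return [counts[c] / total if total else 0.0 for c in CODONS]
```

Vectors produced by `codon_frequencies` for successive windows are points in a 64-dimensional space, which is the kind of input a clustering step such as the one mentioned in the abstract would operate on.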