Welcome to the Bruns Laboratory web page. We have identified a number of
novel human ancient sequence genes with significant homologs in the D. melanogaster and C. elegans genomes that as yet do not match known proteins or
motifs in current databases. The identifiers for these genes, the homology
relations of the corresponding human and model system proteins, and an abstract
describing the work are presented here. Click on the underlined hyperlinks for
detailed information on the novel human ancient sequence genes.
Click here for the spreadsheet data for the chr1-chr9 genes
Click here for the spreadsheet data for the chr10-chr20 genes
Human Genes of Ancient Origin.
G.A. Bruns (1,2), R.E. Eisenman (1). 1) Genetics Division, Children's Hosp, Boston, MA; 2) Dept of
Pediatrics, Harvard Medical School, Boston, MA.
Various categories of human genes
of unknown function may be of interest to identify. Examples include those with
predominant expression in one organ system; those with expression at very early
stages of development or fleeting expression in a developmental process; or
those with significant expression only in primates. We have focused on finding
novel human genes that have extensive homology in D. melanogaster and C. elegans
but do not match known proteins, protein families or motifs in current
databases. These encode new families of proteins of ancient origin. As
extensive cross-phylum amino acid sequence conservation is frequently a hallmark of
proteins involved in fundamental cellular processes, functional analysis of some
of the proteins may elucidate previously unknown pathways. Mutations at loci
encoding ancient sequences also often underlie human inherited disorders.
More than 40 genes of this class were recognized by database
search. Unique human UniGene sequences were analyzed with BLAST to find those with
significant homology (E values, e-005 to e-067) in the D. melanogaster EST database. Any initial match to a
homolog of known function or a defined motif in the UniGene or nr db was
excluded. For each candidate the human protein was sought from the nr database,
which often yielded the D. melanogaster relative. The fly homolog was also obtainable from FlyBase. For
all but 4 genes, a C. elegans homolog was retrieved from the nr or
Wormpep databases. In some cases Wormpep or FlyBase provided an identification of a
gene that was not available from the nr annotation. Such loci were excluded. This
indicates the importance of using multiple databases when seeking novel genes.
All candidate genes were analyzed with Pfam. The human proteins were aligned to
the fly and worm homologs using the NCBI BLAST 2 Sequences algorithm. The E
values ranged from e-05 to e-135, with the majority at
e-30 or lower. EST representation for the human genes was retrieved
from the UniGene db. The gene identifiers, homology relationships of the
proteins plus RNA data will be presented and posted at
http://tnt.tch.harvard.edu/bruns/.
Amer. J. Hum. Genet. 71, 397 (2002)
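The candidate selection step described in the abstract (keep human sequences with a significant fly EST match, drop any that also match a protein of known function or a defined motif) can be illustrated with a minimal Python sketch. It is not the procedure actually used in the work. The `Hit` record, the `select_candidates` function, the database labels, the identifiers in the example data and the 1e-5 cutoff are all assumptions made for illustration; the abstract reports E values from e-005 to e-067 for the fly EST matches but does not state a formal threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One BLAST hit (hypothetical record layout, not a real BLAST output format)."""
    query: str            # human sequence identifier (made-up values below)
    subject: str          # identifier of the matched sequence
    database: str         # assumed labels: "fly_est" or "nr"
    evalue: float         # BLAST E value of the match
    known_function: bool  # True if the subject has a known function or defined motif


def select_candidates(hits, max_evalue=1e-5):
    """Return queries with a significant fly EST hit and no significant
    hit to a sequence of known function. The cutoff is an assumption."""
    with_fly_hit = {
        h.query for h in hits
        if h.database == "fly_est" and h.evalue <= max_evalue
    }
    annotated = {
        h.query for h in hits
        if h.known_function and h.evalue <= max_evalue
    }
    return sorted(with_fly_hit - annotated)


# Made-up example data: only Hs.A has a strong fly hit and no known-function match.
example_hits = [
    Hit("Hs.A", "fly1", "fly_est", 1e-40, False),
    Hit("Hs.B", "fly2", "fly_est", 1e-20, False),
    Hit("Hs.B", "kinaseX", "nr", 1e-50, True),
    Hit("Hs.C", "fly3", "fly_est", 1e-2, False),
]
candidates = select_candidates(example_hits)
```

In this sketch Hs.B is dropped because of its match to an annotated protein, and Hs.C is dropped because its fly hit falls above the assumed cutoff.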
For questions on these loci, contact Dr. G.
Bruns at bruns@enders.tch.harvard.edu