Welcome to the Bruns Laboratory web page. We have identified a number of novel human ancient sequence genes that have significant homology in the D. melanogaster and C. elegans genomes but as yet do not match known proteins or motifs in current databases. The identifiers for these genes, the homology relations of the corresponding human and model system proteins, and an abstract describing the work are presented here. Click on the underlined hyperlinks for detailed information on the novel human ancient sequence genes.

 

Click here for the spreadsheet data for the chr1-chr9 genes
Click here for the spreadsheet data for the chr10-chr20 genes

 

 

Human Genes of Ancient Origin.

G.A. Bruns1,2, R.E. Eisenman1. 1) Genetics Division, Children's Hosp, Boston, MA; 2) Dept of Pediatrics, Harvard Medical School, Boston, MA.

   Various categories of human genes of unknown function may be of interest to identify. Examples include those with predominant expression in one organ system; those with expression at very early stages of development or fleeting expression in a developmental process; or those with significant expression only in primates. We have focused on finding novel human genes that have extensive homology in D. melanogaster and C. elegans but do not match known proteins, protein families or motifs in current databases. These encode new families of proteins of ancient origin. As extensive cross-phylum amino acid sequence conservation is frequently a hallmark of proteins involved in fundamental cellular processes, functional analysis of some of these proteins may elucidate previously unknown pathways. Mutations at loci encoding ancient sequences also often underlie human inherited disorders.
    More than 40 genes of this class were recognized by database search. Unique human Unigene sequences were analyzed with BLAST to find those with significant homology (E values, e-005 to e-067) in the D. melanogaster EST database. Any sequence whose initial match was to a homolog of known function or a defined motif in the Unigene or nr database was excluded. For each candidate, the human protein was sought in the nr database, a search that often also yielded the D. melanogaster relative. The fly homolog was also obtainable from FlyBase. For all but 4 genes, a C. elegans homolog was retrieved from the nr or Wormpep databases. In some cases, Wormpep or FlyBase provided a gene identification not available from the nr annotation; such loci were excluded. This indicates the importance of using multiple databases when seeking novel genes. All candidate genes were analyzed with Pfam. The human proteins were aligned to the fly and worm homologs using the NCBI BLAST 2 Sequences algorithm. The E values ranged from e-05 to e-135, with the majority at e-30 or lower. EST representation for the human genes was retrieved from the Unigene database. The gene identifiers, homology relationships of the proteins plus RNA data will be presented and posted at http://tnt.tch.harvard.edu/bruns/.
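
The screening step described above (keep human sequences with a significant fly match, drop any whose match has a known function or motif) might be sketched roughly as follows. This is an illustration only, not the scripts used in the work: the `Hit` record, its field names, the identifiers in the usage example, and the 1e-5 cutoff (taken from the upper end of the reported E value range) are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One BLAST match; all field names here are hypothetical."""
    query_id: str         # human Unigene sequence identifier
    subject_id: str       # matched D. melanogaster EST identifier
    evalue: float         # BLAST E value of the match
    known_function: bool  # True if the match is to a characterized protein or motif


def parse_evalue(text: str) -> float:
    """Convert shorthand such as 'e-05' or '3e-30' to a float."""
    text = text.strip().lower()
    if text.startswith("e"):
        text = "1" + text  # 'e-05' is read as 1e-05
    return float(text)


def select_candidates(hits, max_evalue=1e-5):
    """Return the best significant hit per query, excluding annotated queries.

    A query is dropped entirely if any of its hits is flagged as having a
    known function or motif (an assumed reading of the exclusion rule).
    """
    best = {}
    excluded = set()
    for hit in hits:
        if hit.known_function:
            excluded.add(hit.query_id)
            continue
        if hit.evalue <= max_evalue:
            current = best.get(hit.query_id)
            if current is None or hit.evalue < current.evalue:
                best[hit.query_id] = hit
    return {q: h for q, h in best.items() if q not in excluded}


# Usage with made-up identifiers:
hits = [
    Hit("Hs.1", "fly_est_1", parse_evalue("e-067"), False),
    Hit("Hs.2", "fly_est_2", parse_evalue("e-030"), True),   # known function
    Hit("Hs.3", "fly_est_3", 1e-3, False),                   # not significant
]
candidates = select_candidates(hits)
```

In this sketch only `Hs.1` survives: `Hs.2` is removed because its match is annotated, and `Hs.3` fails the E value cutoff. Cross-checking the survivors against FlyBase and Wormpep, as the abstract describes, would be a separate step.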

Amer. J. Hum. Genet. 71, 397 (2002)

 

For questions on these loci, contact Dr. G. Bruns at bruns@enders.tch.harvard.edu

 
