ACS Publications Division - Journals/Magazines
November 2001, Vol. 4, No. 11, pp 26–38, 40.
Focus: High Throughput / Robotics
Feature Article

Ecce homology:
A primer on comparative genomics


The weed, the fly, the “worm”, the mouse—their genomes yield pieces of the puzzle that is human flesh.

Now that the human genome has been sequenced (for the most part, regardless of errors in individual base pairs or calculations of numbers of genes), questions concerning what to do with it abound. And the answer, for the most part, in isolation, is absolutely nothing. Without a massive cloning and medical experimentation program that would put the twisted imagination of a Mengele to shame, we cannot practically get the information on the function of most human genes from the human genome itself.

To be sure, those genetic mistakes that nature has perpetrated on unfortunate individuals can be tracked from womb to tomb, as it were. And comparative microarrays can show us what genes turn on and off from among a panoply of gene candidates during testing of a variety of clinically acceptable treatments in humans. Gene sequencing also can predict protein structures, some of which can be constructed and their functions tested for a variety of activities in vitro. But we cannot (or certainly should not) blithely manipulate human genes (at least in humans) just to see what they do—how they go wrong, how they can be moved about, or how they affect everything from embryonic development to the senescence of old age.

Happily, evolution provides an answer. A gene is a gene is a gene. And whether they be in mouse or fly, “worm” (actually the nematode Caenorhabditis elegans) or weed, the kinds and activities of an amazingly wide variety of genes are essentially the same—conserved and co-opted over geological time to meet the needs of millions of species, including Homo sapiens.

Comparative genomics is the ultimate key to functional genomics. We know what’s happening in the human because the same thing happens in Arabidopsis or Drosophila, in the nematode C. elegans, or in whatever model species proves most homologous for a particular trait of interest, be it metabolic pathway, structural component, or disease. This was recognized from the start of the Human Genome Project in the late 1980s, when monies were set aside to fund research into several other model species genomes, including those of the fruit fly, Arabidopsis, and the laboratory mouse.

Brother worm and sister fly
Saint Francis of Assisi purportedly acknowledged the family status of all creatures by referring to them as siblings to himself. Never were truer words spoken, at least in assigning relationships between human beings and the rest of life on earth. Profound gene homologies have been detected between humans and a wide variety of model species being used in genome projects around the world.

The percentage of human genes homologous to those in these model species has been determined through complex database comparisons that only modern computing could make possible. Because database algorithms such as BLAST can perform rapid and complex sorting and sequence comparisons across a multitude of databases, including comparing proteins to nucleic acids and vice versa, homologue assignment of similar genes from other species can be made simply and easily. With the rise of the Internet came the ability to extend one researcher’s laboratory results and link them with those of thousands of others across the world, creating a virtual genomics community.
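To make the idea of a sequence comparison concrete, the following is a minimal sketch of a Smith-Waterman local alignment score in Python. It is not BLAST, which relies on heuristic seeding and statistical scoring to search whole databases quickly; it only illustrates the underlying notion of scoring the best-matching region shared by two sequences. The function name and the scoring values (match +2, mismatch -1, gap -2) are assumptions chosen for illustration.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best Smith-Waterman local alignment score of a and b.

    Scoring values are illustrative assumptions, not BLAST defaults.
    """
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1]
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    # Two made-up sequences that share the region "ACGT"
    print(local_alignment_score("GGACGTGG", "TTACGTTT"))
```

A higher score means a longer or cleaner shared region. Real homology searches add a statistical estimate of how likely such a score would arise by chance.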

Since 1996, BLAST databases, which incorporate genome information from a wide variety of species, have been so successful and comprehensive as to provide a greater than 50% likelihood of finding a homologous match in the system for any unknown gene presented for examination by a researcher.

One of the more striking examples of pervasive functional gene homology across the evolutionary spectrum has been the discovery, initially in Drosophila, of the homeotic selector genes. The same homeotic genes that control body segmentation in the lowly fruit fly appear to have been tapped to control the body plan in vertebrates in general, including human beings.

These homologies were first discovered by using the fly gene as a probe for screening the genomes of other species. The widespread homology at the gene level has led researchers to look for similar homeotic genes in Arabidopsis, where they have been found to play a role in the structural development of flowers. The products of the homeotic genes are a variety of unique transcription factors that act as modulators of the other genes necessary for a variety of developmental processes. Similarly, and to many, surprisingly, the seemingly alien, multifaceted eye of Drosophila has proved to have striking genetic homologies with the human eye. And Arabidopsis genes that prevent apoptosis have been found to be similar to those in humans.

In other examples, as reported at the Interactive Fly site at Purdue, homologous assignments for a single Drosophila embryonic brain cDNA library can easily be made to known genes in the Xenopus frog, the slime mold, C. elegans, the mouse, the garden snail, the rat, the chicken, yeast, and the human. That is as much proof of the functional utility of comparative genomics in a single example as one could hope for. Similar assignments are being used to try to determine the functions of genes discovered via the Human Genome Project. Homology is everywhere, if we know how to use it.

Particularly useful in the quest for homology are the burgeoning advances in DNA microarray technology for rapid screening and comparisons. Incyte Genomics, Inc. (Palo Alto), for example, markets a line of arrays with a broad selection of gene clones, including those of humans, rats, Arabidopsis, and mice. A wide variety of companies, such as Research Genetics (Huntsville, AL) and Mergen Ltd. (San Leandro, CA), also offer human, mouse, and rat genes on arrays. Many of these systems are used to study diseases ranging from cancer to diabetes to infections—researchers search for pattern homologies that may provide physiological models, and, ultimately, druggable targets.

A summary of some of the percentage gene homologies between common model organisms is available online. Human genes share 47, 63, 38, 15, and 20% homology with the fruit fly, the mouse, C. elegans, baker’s yeast, and Arabidopsis, respectively.

Diseases and destinies
Within these percentages, a surprising number of genes involved in the development of human diseases show homologies to other species, from bacteria to yeast, from fruit flies to mice. For example, a human gene involved in cystic fibrosis has a yeast homologue in a gene that codes for metal resistance; a gene related to Type 1 neurofibromatosis has a yeast homologue that codes for a regulatory protein; and a human gene associated with amyotrophic lateral sclerosis is homologous to a yeast superoxide dismutase gene. Highly striking is a yeast homologue (SGS1) of a gene associated with Werner’s syndrome (a disease resulting in premature aging). Researchers at MIT demonstrated that a mutant SGS1 yeast shows a shorter life span and has an early onset of sterility, which is a senescence indicator.

In the past two years, a series of Drosophila studies have found homologies and models in the fruit fly genome for various human diseases. When Ethan Bier and his colleagues at the University of California, San Diego, screened 929 human disease genes against the completed Drosophila genome, 548 (representing 714 different diseases) showed a high degree of similarity in their predicted amino acid sequences ( 06_01/Homophila_database.cfm).
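As a quick check on the scale of that result, the share of screened disease genes with a close fly counterpart can be worked out from the two counts quoted above. The variable names are illustrative only.

```python
# Counts quoted in the text for the human disease gene screen
screened = 929  # human disease genes compared against the Drosophila genome
matched = 548   # genes showing a high degree of predicted sequence similarity

fraction = matched / screened
print(f"{fraction:.1%} of screened disease genes had a close fly match")
```

On these figures, roughly six in ten of the screened disease genes had a close counterpart in the fly.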

Because the vast majority of human genes have a mouse counterpart, significant progress in both identifying and studying human diseases through the creation or discovery of mouse mutants is also under way. A chromosome-by-chromosome gene homology map is available that compares the human and mouse genomes. A typical example of success in this area is the discovery that a mutation in the mouse mdx gene produces a muscle disease analogous to the Duchenne’s muscular dystrophy mutation in humans. “The two genes produce proteins that function in very similar ways and that are clearly required for normal muscle development and function in the corresponding species.”

Knowing by a knockout
Sequence prediction for the proteins produced by discovered genes and subsequent protein modeling can sometimes provide information on function, especially if the homologous protein is well known and understood. But for unknown proteins, or for genes with multiple effects, the best method for determining what a gene does in an organism requires tampering with its operation in a test animal—just to see what happens. One means to do this is through the use of genetic “knockout” organisms—those in which a particular gene has been deliberately disabled.

Although yeast has proved a valuable knockout organism for a wide variety of eukaryotic studies, especially on basic cellular growth and metabolism, mice have become the preferred model organism for the examination of human diseases, because of their more obvious complexities and similarities with human physiology. Some of the most significant knockout lines to date include those with one or more of the known oncogenes affected, including those for skin cancer, breast cancer, and various other tumorous conditions.

Many companies are banking on the potential benefits of knockout mice for drug discovery. For example, Lexicon Genetics, Inc. (The Woodlands, TX), uses patented gene-targeting technologies to alter specific DNA in mouse embryonic stem cells, which are then cloned to create their knockout mice. Generating more than 1500 new knockouts per week, the company has a living database of over 170,000 knockout mice in which randomly chosen gene sequences have been altered. Given the large homology between human and mouse genes, the hope is that a significant number of these knockouts will eventually turn out to be in clinically interesting, disease-related sites.

But knockouts are not a panacea. For example, mice that lack a copy of the gene that produces huntingtin (a protein of unknown function thought to be instrumental in human Huntington’s disease) fail to develop past an early embryonic stage. This outcome indicates the profound necessity of the protein for life but severely limits the use of the knockout mice for studying the development of the human disease. And, of course, the presence of a gene product can just as easily be important to causing disease as its absence, as in sickle cell anemia, where it is the aberrant hemoglobin, not its lack, that is at fault. And what of multigene interactions with nonhomologous partners, present in the normal “host” but not in the model system?

For these reasons, many researchers use the flip side of knockouts—transgenic techniques that add, rather than delete, a gene or genes in the mouse or other model systems. Transgenic animals provide positive expression of the protein of interest in order to allow researchers to try to determine its effects. Using such transgenics, the behavior of specific human gene alleles alone and in a variety of combinations can be studied and compared in the model organism.

Genoming for druggables
Of course, homologies between related organisms are also of use in the search for new drugs to attack a wide variety of pathogens and parasites. Researchers in nematode genomics, for example, are comparing the genomes of parasitic species with that of C. elegans to try to find exploitable differentials and to understand pathological interactions. They hope to use this knowledge to discover new drugs that will attack human hookworm and a variety of related animal diseases caused by nematodes.

Comparing human with microbial genomes may also allow for the development of bacterial- or fungal-specific antibiotics without equivalent targets in the human cell that could cause toxicity. This strategy is of particular interest for attacking a wide variety of tropical parasites that share the human body systemically as well as the eukaryotic lifestyle, making them difficult to treat without endangering the patient. (It is hoped that comparative genomics may indicate potential sites for attack in the parasite that are lacking in the host.)

Ecce homo (sapiens)
Ultimately, it appears that the only way to behold the truth of human nature is to hold it up to the mirror of our biological kindred. In a way then, comparative genomics can almost be considered the equivalent to functional evolution—making use of our evolutionary past to predict and control our biomedical future. Whether through sequence homology in genes and proteins, knockout genes or gene insertions in model organisms, or even plants and animals, the use of genomes other than our own provides the most powerful, and what most researchers consider the only morally palatable, method to determine who we are and how we work—not to mention how our drugs work on us, as well.

Mark S. Lesney is a senior editor of Modern Drug Discovery. Send your comments or questions regarding this article to the Editorial Office by fax at 202-776-8166 or by post at 1155 16th Street, NW, Washington, DC 20036.
