ACS Publications Division - Journals/Magazines
February 2002
Vol. 5, No. 2, pp 26–30.
Focus: Combinatorial Chemistry
Feature Article
Rationalizing combi-chem


In the search for new drugs, a little knowledge is a dependable thing.

Linda Melton
Historically, rational drug design meant using traditional wet chemistry (organic synthesis methods) following a one test tube/one scientist model. Combinatorial chemistry was envisioned to change all that. For some, part of the goal was to obviate the need for a rational approach completely. A large enough library with the right high-throughput screening assay promised the ultimate brute-force method for obtaining bioactive compounds without biologists and organic chemists getting in the way. That is certainly a long-term view.

But in the short term, “smaller and focused” may still be better. The most promising hope for obtaining new drug leads through combinatorial chemistry is to choose templates and functional group additions based on prior knowledge of bioreactive compounds. The first and simplest method of doing this is to try to mimic a known biologically active molecule or family of molecules. Analysis of the structures of known drugs can provide initial design and subsequent building blocks for creating plausible active compounds. Combinatorial synthesis can then attempt to emulate or improve upon this prior knowledge.

By a mix and match of in vitro and virtual combinatorial approaches, structural knowledge of the known chemical entity can (hopefully) be parlayed into the production of something new and improved. This is a mass-production version of structure-based ligand design. It is merely modeling new drugs on old.

Alternatively, one can proceed from knowledge of a disease-related protein or receptor and use a combinatorial approach to design chemically likely binding molecules. Such an approach may become more common as the results of the proteomics revolution provide researchers with differentially expressed proteins that are specific to various diseases but whose function is unknown (or whose function does not seem to play a role in disease). Knowledge of 3-D geometry, obtained empirically through NMR and X-ray crystallography, or virtually from predictions based on amino acid sequence, can hint at potential active sites or junctures that would be logical binding targets.

A third approach is more general and exploratory. It is often used to form or screen larger libraries and depends on molecular structures in a less constrained fashion than the search for specific biologically designed molecules discussed above. This method simply relies on a series of guesstimates to determine whether a potential compound produced or building block used can be considered “druglike”. This type of design depends heavily on bioinformatics techniques.

But all this drug design is accomplished, of course, by making libraries of compounds. These are created in various ways, only some of which are applicable to both random- and knowledge-based design.

Libraries of (combinatorial) congress
To produce small libraries (i.e., containing hundreds to thousands of compounds), parallel synthesis is frequently used. This is most often the case when preexisting template information is available and a particular drug or bioactive compound is being mimicked. In parallel synthesis, each reaction sequence is carried out in its own compartment, at the same time as all the others. The main benefit of this technique is that there is no difficulty determining the synthetic heritage of the individual compounds produced: each member in a particular well or tube is identical to every other. These kinds of libraries, when small, can be created using either solution techniques or solid-phase (bead) supports to equivalent advantage.
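The bookkeeping advantage of parallel synthesis can be sketched in a few lines of Python. This is only an illustration: the two-step chemistry, the building-block labels, and the 12-column plate layout are assumptions made for the example, not details from any cited library.

```python
from itertools import product
from string import ascii_uppercase


def plate_map(first_blocks, second_blocks, columns=12):
    """Assign every (first, second) building-block pair to its own well.

    Wells are named row-letter + column-number ('A1', 'A2', ...), so the
    synthetic history of each compound can be read straight off its position.
    """
    layout = {}
    for index, pair in enumerate(product(first_blocks, second_blocks)):
        row, col = divmod(index, columns)
        layout[f"{ascii_uppercase[row]}{col + 1}"] = pair
    return layout


# Hypothetical building-block labels, for illustration only.
amines = ["amine_1", "amine_2", "amine_3"]
acids = ["acid_1", "acid_2", "acid_3", "acid_4"]

wells = plate_map(amines, acids)
print(len(wells))    # 12 wells, one known compound in each
print(wells["A1"])   # ('amine_1', 'acid_1')
```

Because the well name alone identifies the product, no decoding step is needed after screening, which is the point the paragraph above makes about synthetic heritage.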

Split, or split-pool, synthesis is a mixing technique that can produce libraries of thousands to tens of thousands of compounds. Although the sequence can vary, starting scaffolds typically are produced on beads, reacted, pooled, and then subjected to a round of chemistry. The resulting products can be split and then reacted again, with or without repooling, in however many multiples are desired or required.

The greater the number of steps and/or the more frequent the pooling, the larger the potential number of library members produced. To increase the manageable size of such libraries, tagging is frequently used. Tagged, or encoded, methods of synthesis allow the largest possible libraries to be constructed—from thousands to hundreds of thousands and more. Tagging usually involves the use of fluorescent compounds or radiolabels that are attached directly to beads or to the starting framework of the molecules themselves. Such huge libraries are generally not used for rational design. And, as could be expected, the different sizes of combinatorial libraries are as much related to their functions as to the techniques used.
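The arithmetic behind split-pool library sizes can be made concrete with a small simulation. It is a rough sketch under stated assumptions: three rounds of chemistry, three hypothetical building blocks per round, and beads divided evenly among the reaction vessels after each pooling. Real protocols vary in all of these.

```python
import math
import random
from collections import Counter


def split_pool(n_beads, steps, seed=0):
    """Simulate split-pool synthesis and return each bead's building-block history."""
    rng = random.Random(seed)
    beads = [() for _ in range(n_beads)]  # every bead starts as a bare scaffold
    for blocks in steps:
        rng.shuffle(beads)  # pool and mix all beads
        # Split the pool evenly across one vessel per building block,
        # then record which block each bead received in this round.
        beads = [history + (blocks[i % len(blocks)],)
                 for i, history in enumerate(beads)]
    return beads


# Hypothetical building-block labels, for illustration only.
steps = [["A1", "A2", "A3"], ["B1", "B2", "B3"], ["C1", "C2", "C3"]]

beads = split_pool(n_beads=2000, steps=steps)
library = Counter(beads)  # distinct compounds and how many beads carry each

max_size = math.prod(len(blocks) for blocks in steps)  # 3 * 3 * 3
vessels = sum(len(blocks) for blocks in steps)         # 3 + 3 + 3
print(max_size, vessels)  # 27 9
print(len(library))       # distinct compounds actually made (at most 27)
```

The number of possible products grows as the product of the building-block counts, while the number of reactions run grows only as their sum. Each bead still carries a single compound, but its history is lost in the pool, which is why tagging becomes attractive as libraries grow.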

According to Dolle (1), libraries for lead discovery, for example, are typically large (>5000 members) and deal with searches based on little or no preconceived biological information. (However, no combinatorial approach is undertaken without any advance biological preconceptions, if for no other reason than that some form of assay must be used to determine biological effectiveness.)

In rational combinatorial chemistry, the libraries tend to be smaller. These include the so-called targeted libraries, which “contain a pharmacophore known to interact with a specific (or family of) molecular target” (1). A typical rational approach is to create optimization (or lead development) libraries “where a lead exists and an attempt is being made to improve its potency, selectivity, pharmaceutical profile, et cetera.”

Looking for “like” in all the right places
One set of criteria for “druglike” status is the Lipinski “rule of 5” (named because of its emphasis on the number 5 and multiples of 5), which predicts that poor absorption and permeability of potential drug candidates will occur if
  • there are more than 5 hydrogen-bond donors (expressed as the sum of –OHs and –NHs),
  • the molecular weight is more than 500,
  • the logP is more than 5, or
  • there are more than 10 hydrogen-bond acceptors (expressed as the sum of nitrogens and oxygens).
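The four cutoffs listed above translate directly into a simple filter. The sketch below assumes the descriptors have already been calculated elsewhere; the `Descriptors` class, the function name, and both example compounds are hypothetical and serve only to show the logic.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed properties of one candidate compound (hypothetical container)."""
    h_bond_donors: int      # sum of OHs and NHs
    h_bond_acceptors: int   # sum of nitrogens and oxygens
    molecular_weight: float
    log_p: float


def rule_of_5_violations(d):
    """Return the list of rule-of-5 cutoffs that a compound exceeds."""
    violations = []
    if d.h_bond_donors > 5:
        violations.append("more than 5 hydrogen-bond donors")
    if d.molecular_weight > 500:
        violations.append("molecular weight over 500")
    if d.log_p > 5:
        violations.append("logP over 5")
    if d.h_bond_acceptors > 10:
        violations.append("more than 10 hydrogen-bond acceptors")
    return violations


# Made-up descriptor values, for illustration only.
small_candidate = Descriptors(h_bond_donors=2, h_bond_acceptors=6,
                              molecular_weight=350.4, log_p=3.1)
large_candidate = Descriptors(h_bond_donors=6, h_bond_acceptors=12,
                              molecular_weight=720.0, log_p=5.8)

print(rule_of_5_violations(small_candidate))       # []
print(len(rule_of_5_violations(large_candidate)))  # 4
```

Returning the list of violations rather than a yes/no answer leaves the choice of threshold (any one violation, or two or more) to whoever is triaging the library.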

For each of these criteria, roughly 80–90% of the actual drugs examined fell below the cutoff (2). Various architectural criteria are also being considered. Mark Murcko and colleagues, for example, examined 5120 distinct drug molecules and found that 32 particular geometric frameworks (using atoms as vertices and bonds as edges) accounted for 50% of these drugs (3). Such frameworks make an obvious start or a potential goal for combinatorial synthesis.
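Treating atoms as vertices and bonds as edges makes framework extraction a small graph exercise. The sketch below uses one common simplification, repeatedly stripping terminal atoms so that only ring systems and the linkers between them remain. It is not presented as the exact procedure of reference 3, and the numbered example molecule is hypothetical.

```python
def framework(bonds):
    """Return the bonds left after repeatedly removing terminal (degree-1) atoms.

    `bonds` is an iterable of (atom, atom) pairs. What survives is the set of
    rings plus any chains linking them; side chains are pruned away.
    """
    neighbors = {}
    for a, b in bonds:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    terminals = [atom for atom, nbrs in neighbors.items() if len(nbrs) == 1]
    while terminals:
        atom = terminals.pop()
        if atom not in neighbors or len(neighbors[atom]) != 1:
            continue  # already removed, or no longer a terminal atom
        (nbr,) = neighbors.pop(atom)
        neighbors[nbr].discard(atom)
        if len(neighbors[nbr]) == 1:
            terminals.append(nbr)  # pruning exposed a new chain end

    return {frozenset((a, b)) for a in neighbors for b in neighbors[a]}


# Hypothetical molecule: two six-membered rings joined through atom 7,
# with a one-atom substituent (14) and a two-atom side chain (15, 16).
bonds = [
    (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1),         # ring 1
    (1, 7), (7, 8),                                         # linker
    (8, 9), (9, 10), (10, 11), (11, 12), (12, 13), (13, 8), # ring 2
    (10, 14), (3, 15), (15, 16),                            # side chains
]

core = framework(bonds)
core_atoms = sorted(set().union(*core))
print(len(core))    # 14 bonds: two rings plus the linker
print(core_atoms)   # atoms 1 through 13
```

Two molecules that differ only in their side chains reduce to the same core under this scheme, which is how a large set of drugs can collapse onto a small number of shared frameworks.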

In rational design, the “druglike” concept is important for deciding which molecular scaffoldings might prove useful for starting off a combinatorial library with a reasonable chance of providing potentially valuable leads. Selecting among “druglike” criteria is also critical to choosing whether to continue examining a potential test candidate that such a library might produce.

Several criteria are currently in vogue for deciding whether a new and unknown compound (or a portion of an old drug or compound) is “druglike”. These include potential for absorption and permeability into cells, as well as the presence of known bioactive functional groups (see box, “Looking for ‘like’ in all the right places”).

Tackling (building) blocks
When synthesizing libraries using a rational or knowledge-based approach, it is also important to choose building blocks to add to the template that

  • might contribute to the formation of larger druglike frameworks, as discussed above, and
  • contain biologically active functional groups in accessible forms that could modify frameworks in a way likely to produce reactive molecules.

Both templates and building blocks can thus be chosen rationally.

Because “the vast majority of marketed pharmaceuticals are low-molecular-weight nonpeptide, nonpolymeric entities, . . . it is logical that small organic molecules that can display functional groups would surface as a scaffolding approach” (1). This is especially pertinent in creating combinatorial building blocks. For such reasons, a wide variety of specific small-molecule scaffolds have been designed for differential introduction of functional groups.

The wit and wisdom of synthetic organic chemists comes into play here. In one example, an all-cis-substituted cyclopentane library was developed such that “by clever use of a cyclic anhydride, a methyl ester, and a Boc-protected amine, it is a relatively straightforward task to sequentially introduce four different functional groups as desired” in each of the scaffold’s four active sites (4).

Natural product templates are becoming very common as scaffolds. Molecules used as starting points have included tri-substituted purine libraries, flavone derivatives, benzofurans and benzopyrans, steroids, taxoids (based on the valued anticancer agent Taxol), and a wide variety of natural alkaloid frameworks, to name a few (5).

The key is to start with bioactive models and modify them with known reactive groups in order to produce a physiological effect better than, or antagonistic to, that of known drugs or hormones. However, “despite having a priori an active compound as a starting point, a rational approach to library design is only possible in cases where prior QSAR (quantitative structure–activity relationship) data is available or when structural information on the complex between the natural product bound to its biomolecule target is known. This . . . can help define potential positions for diversification and dictate the type and number of building blocks” (5). So targeting is key.

Turning to mimetics
beta-Turn mimetics are nonpeptide small molecules that mimic the shape of the reverse turn in folded peptides. They have been an important focus of medical research for years, and mimetics have been found that act as integrin antagonists and inhibitors of the human neutrophile receptor, among other physiological effects. Recently, Ellman used combinatorial techniques based on known active structures to produce a beta-turn mimetic library using solid-phase synthesis that incorporated a wide variety of side-chain functionalities (1).

Mimetics of the beta-pleated sheets of proteins have also been produced. These sheet motifs are natural recognition elements for various medically important enzymes, such as HIV protease, and the mimetics are being examined for their potential as inhibitors of proteolytic activity.

Tackling targets

For the present, a structurally (and informationally) conservative approach at the start may make biological as well as financial and practical sense. That a concept such as “druglike” structures can succeed at all shows that, despite the nearly infinite array of possible structures, nature is surprisingly conservative in the compounds that we have already discovered to be medicinally active and yet safe in biological systems. Apparently, nature is also fairly sparing with the targets those drugs seem to act upon.

According to Klaus Gubernator (CombiChem, Inc., San Diego) and Hans-Joachim Böhm (Hoffmann-La Roche AG, Basel, Switzerland), “Of the top 100 pharmaceutical drugs, 18 bind to seven transmembrane receptors, 10 to nuclear receptors, 16 to ion channels, and the remainder generally inhibit enzymes” (6). In fact, one class of surface receptors, the G-protein-coupled receptors, is an especially important therapeutic target. Various groups not only are intensely studying the structure–activity relationship of drugs that bind to these receptors, but also are screening the human genome to find more receptors that are currently unknown.

Compounds that bind to DNA and various components of the immune system are also under investigation. And a host of other processes provide possible targets with which drugs have been found to interact. But the key point is that in medicine, as in so many endeavors, a few players do most of the work—making a rational approach to combinatorial drug design that tries to take advantage of this principle even more viable.

Mother (nature) knows best?
At present then, the wealth of untapped natural targets and the explosion of knowledge about the structure and function of bioreactive natural products seem capable of providing a vast pool of starting information for rational combinatorial chemistry to draw upon. Not only does the use of this information prevent drug researchers from reinventing the wheel, it also gives a starting point for understanding deeper levels of physiological processes—which may also provide access to new drugs and targets. The marriage of genomics and proteomics may prove most powerful in the way that it expands this knowledge base and gives us more natural targets and more knowledge of what it is to be “druglike”, so that rational combinatorial chemistry may proceed apace.


  1. Dolle, R. E. Comprehensive Survey of Combinatorial Library Synthesis: 1999. J. Comb. Chem. 2000, 2 (5), 383–433.
  2. Lipinski, C. A.; et al. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv. Drug Delivery Rev. 1997, 23, 3–25.
  3. Bemis, G. W.; Murcko, M. A. The properties of known drugs. 1. Molecular frameworks. J. Med. Chem. 1996, 39, 2887–2893.
  4. Combinatorial Chemistry and Molecular Diversity in Drug Discovery; Gordon, E. M., Kerwin, J. F., Jr., Eds.; John Wiley & Sons: New York, 1999.
  5. Hall, D. G.; Manku, S.; Wang, F. Solution- and solid-phase strategies for the design, synthesis and screening of libraries based on natural product templates. J. Comb. Chem. 2001, 3 (2), 125–150.
  6. Gubernator, K.; Böhm, H-J. Structure-Based Ligand Design; Wiley-VCH: New York, 1998.

Mark S. Lesney is a senior editor of Modern Drug Discovery. Send your comments or questions regarding this article to the Editorial Office by fax at 202-776-8166 or by post at 1155 16th Street, NW; Washington, DC 20036.
