April 2002
Vol. 5, No. 4, pp. 45–48.
the toolbox

The right road to drug discovery?

Fragment-based screening casts doubt on the Lipinski route.

Lead discovery is a high-risk endeavor. Several major pharmaceutical companies have acknowledged that they are only successful in identifying a high-quality lead for a druggable protein target in around one out of four attempts (1). Thus, although chemistry and screening throughputs have massively increased over the past decade, lead discovery productivity has not necessarily increased accordingly. This inability to identify multiple high-quality leads that are novel, tractable, and efficiently optimizable remains a key bottleneck in today’s drug discovery environment.

Despite continued and unprecedented levels of investment in high-throughput screening (HTS) and combinatorial chemistry (combichem) technologies, these two techniques clearly do not provide solutions for all targets. Although most pharmaceutical companies have derived significant value from these technologies, there continues to be a gap in lead discovery productivity that the industry is seeking new methods to fill.

So, where might an increase in productivity be found? Some enhancement will probably come from further refining of the HTS and combichem approaches. Improvements in productivity also might come from the many innovative approaches now being developed, including knowledge-based technologies such as pharmacophore-based screening (2), virtual screening (3), and systems-based searches (4).

This article focuses on the concepts and precedents for fragment-based screening and explores the argument that the industry may currently be focused on making and screening the wrong types of compounds for lead generation.

The wrong compounds?
Increasing throughput has driven HTS and combichem technologies, but the issue of improving the quality of compounds entering the lead discovery process has only recently been addressed. Although many people in the industry focused on this drive for higher throughput, some continued to focus on issues such as what sorts of compounds make good leads. From this work, people are now beginning to ask whether the sorts of compounds typically synthesized and screened are too druglike rather than leadlike.

Many researchers say that compounds with good drug properties may not necessarily make the best leads for further optimization. That is, leadlike properties and druglike properties, although not mutually exclusive, are significantly different (5, 6).

Over the past decade, the industry has been active in defining druglike properties. The much-cited “Lipinski rule of five” (7) derives empirically from the vast amount of data that the industry has gathered on properties that maximize an oral drug candidate’s probability of surviving development: molecular weight (MW) no greater than 500, no more than 5 hydrogen bond donors, no more than 10 hydrogen bond acceptors, and ClogP no greater than 5.
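As a rough illustration, the four criteria can be written as a simple count of violations. This is a minimal sketch, not a validated filter: the `Compound` record and the function name are hypothetical, the property values are assumed to come from whatever cheminformatics software the reader already uses, and treating the boundary values (for example, MW of exactly 500) as passing is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    """Hypothetical record of precomputed properties for one compound."""
    mw: float          # molecular weight
    h_donors: int      # number of hydrogen bond donors
    h_acceptors: int   # number of hydrogen bond acceptors
    clogp: float       # calculated logP


def rule_of_five_violations(c: Compound) -> int:
    """Count how many of the four rule-of-five limits a compound exceeds."""
    checks = [
        c.mw > 500,
        c.h_donors > 5,
        c.h_acceptors > 10,
        c.clogp > 5,
    ]
    return sum(checks)


# Made-up property values, for illustration only.
candidate = Compound(mw=350.0, h_donors=2, h_acceptors=5, clogp=2.5)
print(rule_of_five_violations(candidate))
```

A count of zero means the compound sits inside all four limits; it says nothing about whether the compound is a good lead, which is the point the following paragraphs take up.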

Although these rules are useful for assessing the risk profile of an oral drug candidate entering development, they do not necessarily define the properties of a good lead. Such druglike property rules, however, have been applied almost universally to the design and selection of compounds for lead discovery.

Rather than studying only druglike properties, Mike Hann and his colleagues at GlaxoSmithKline (Stevenage, U.K.) studied a set of more than 450 pairs of commercial drugs and their corresponding leads (5, 8). Thus, for the first time, a large body of data was analyzed from which the differences between historical druglike and leadlike properties could be derived.

On average, historical leads had lower MW, lower lipophilicity (ClogP), fewer aromatic rings, fewer hydrogen bond acceptors, and lower Andrews binding energy functions than the corresponding final drug. Other independent work also concluded that libraries consisting of compounds with MW = 100–350 and ClogP = 1–3 are highly likely to be greatly superior for finding leads to those comprising druglike compounds with higher MW and ClogP (6).
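The leadlike window quoted above (MW = 100–350, ClogP = 1–3) can be applied as a simple range filter over a screening collection. The sketch below assumes that both ends of each range are inclusive and that MW and ClogP have already been calculated elsewhere; the function name and the example values are hypothetical.

```python
def is_leadlike(mw: float, clogp: float) -> bool:
    """True if a compound falls inside the leadlike window from ref 6."""
    return 100 <= mw <= 350 and 1 <= clogp <= 3


# Made-up (name, MW, ClogP) entries, for illustration only.
collection = [
    ("fragment A", 180.0, 1.4),
    ("druglike B", 480.0, 4.5),
    ("fragment C", 310.0, 2.8),
]

leadlike = [name for name, mw, clogp in collection if is_leadlike(mw, clogp)]
print(leadlike)
```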

The MW and lipophilicity of initial leads typically increase during the lead optimization process. Thus, if the initial lead is already too druglike, then the optimization process that is likely to be needed to tailor the molecule to the new receptor or enzyme will likely result in a higher MW and a more lipophilic drug candidate. The candidate may thus no longer possess druglike properties. This suggests that when looking for leads, the guidelines given by Lipinski should be lowered so that leads that are found by HTS give more “room” for further property optimization.

Complexity and hit rate
In considering whether leads should be less complex than drugs, another theoretical factor was highlighted by Hann’s group: the effect of increasing the complexity of the compounds screened on the hit rate. In a simple statistical model of the interactions between receptors and ligands, as the systems become more complex, the chance of a useful match for a randomly chosen compound falls dramatically.

This trend might in hindsight be intuitive, but the exponential severity of the fall in the number of “ligand–receptor interaction matches” as the lead’s complexity increases is not. Additionally, making a large number of compounds via combichem tends to produce more complex compounds, which have a lower chance of matching the receptor as the number of interactions to be satisfied grows. This view is completely consistent with the observations of historical lead and drug pair differences discussed above.

A key aspect of molecular recognition is the probability that any one of a molecule’s features is compatible with those of a designated binding site. Effective molecular recognition is essentially the matching of properties of a molecule with its binding pocket through complementarity of shape and electronic properties, such as charge and hydrophobicity. In their simplest form, these are localized recognition elements that are highly detrimental to binding if incorrectly matched but that are beneficial (or neutral) if correct.

Molecular recognition can be simply modeled in one dimension by a sequence of plus (+) and minus (–) symbols (not necessarily charge-related) that represent the features of a binding site, and another sequence that represents the ligand molecule. A successful recognition event is thus a + ligand matching a – binding site (or vice versa). The model allows the binding site to contain a variable number of features by expanding the length of the sequence, but it does not incorporate flexibility in either the ligand or the binding site.

Figure 1. Binding modes. Example of ligand–receptor matching possibilities for a random receptor of complexity 12 and a ligand of complexity 4.

Figure 1 illustrates this model for the case of a binding site with 12 features. Consider how many different ways a ligand of complexity 4 with features (+ + – +) can fit. In this example, there are two ways the ligand can match.
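The counting exercise can be sketched in a few lines of code. The function below slides the ligand along the binding site and counts the placements at which every ligand feature faces its opposite. The 12-feature site string used here is a hypothetical example chosen to give two matches for the (+ + – +) ligand; it is not necessarily the sequence shown in Figure 1, and the function name is illustrative.

```python
def count_matches(site: str, ligand: str) -> int:
    """Count placements where every ligand feature faces the opposite sign."""
    width = len(ligand)
    placements = range(len(site) - width + 1)
    return sum(
        all(l != s for l, s in zip(ligand, site[start:start + width]))
        for start in placements
    )


# Hypothetical binding site with 12 features ('+' and '-' are plain ASCII).
site = "--+-++--+-+-"
ligand = "++-+"

print(count_matches(site, ligand))
```

As in the model described above, neither string is allowed to flex: a placement counts only if all four features are complementary at once.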

This model can be used to explore the effect that the increasing complexity of the ligand (as indicated by its number of features) has on its chance of matching a binding site of given complexity. This model was used to calculate the probability that a randomly chosen molecule might match the binding site. In this extreme model, one mismatch is defined as sufficient to totally obviate binding.

The probability of finding a match decays exponentially as the size of the ligand increases, because as ligand complexity grows there are far more ways of obtaining a mismatch than a match. Working against this is the fact that when a complex ligand does match, the observed affinity will be high. One interpretation of this model is that the industry, in trying to quickly identify high-affinity matches, has focused on screening complex druglike compounds, but that in doing so it has unwittingly screened compounds whose complexity leads on statistical grounds to low hit rates. A way out of this dilemma, then, is to initially screen simpler and more leadlike compounds that have a higher probability of efficiently binding (matching), even if they deliver less active starting points.
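The decay can be reproduced with a small simulation. This sketch is not the published calculation: it assumes that each feature of the site and the ligand is independently + or – with equal probability, that a single mismatch rules out a placement, and that a compound counts as a hit if at least one placement matches. The function names, trial count, and seed are arbitrary choices.

```python
import random


def count_matches(site, ligand):
    """Count placements where every ligand feature faces the opposite sign."""
    width = len(ligand)
    return sum(
        all(l != s for l, s in zip(ligand, site[start:start + width]))
        for start in range(len(site) - width + 1)
    )


def expected_matches(site_len: int, ligand_len: int) -> float:
    """Mean number of matching placements: each of the possible placements
    matches with probability (1/2) ** ligand_len under the assumptions above."""
    return (site_len - ligand_len + 1) * 0.5 ** ligand_len


def match_probability(site_len, ligand_len, trials=2000, seed=0):
    """Monte Carlo estimate of the chance of at least one matching placement."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        site = rng.choices("+-", k=site_len)
        ligand = rng.choices("+-", k=ligand_len)
        if count_matches(site, ligand) > 0:
            hits += 1
    return hits / trials


for ligand_len in (2, 4, 6, 8, 10):
    print(ligand_len,
          expected_matches(12, ligand_len),
          match_probability(12, ligand_len))
```

Under these assumptions, each added ligand feature halves the chance that any given placement matches, which is where the roughly exponential fall in hit rate comes from.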

Consider a simple pyrimidine-based library with three variable points, each of which could be one of 100 different constituents. Testing every possible druglike molecule would require the production and screening of 10^6 compounds (100 × 100 × 100 = 1,000,000). However, by adopting a fragment-based approach, only 300 (100 + 100 + 100) would need to be made and screened to explore the same chemical space (Figure 2). This sampling efficiency is based on the additive nature of the fragments, as compared to the multiplicative procedure needed for the larger druglike compounds.
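The arithmetic behind this comparison is shown below for the three-position, 100-constituent example. The function names are illustrative, and the fragment count assumes that each constituent is screened once, on its own.

```python
def full_library_size(positions: int, constituents: int) -> int:
    """Every combination is made: sizes multiply across positions."""
    return constituents ** positions


def fragment_library_size(positions: int, constituents: int) -> int:
    """Each constituent is screened separately: sizes add across positions."""
    return positions * constituents


print(full_library_size(3, 100))      # 1000000
print(fragment_library_size(3, 100))  # 300
```

The gap widens quickly: adding a fourth variable position would multiply the full library by another factor of 100 but add only 100 more fragments.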

Results and prospects
Papers that illustrate the potential benefits of fragment-based screening are now starting to appear. Stephen Fesik’s group at Abbott Laboratories (Abbott Park, IL) demonstrated the ability of NMR-based screening to efficiently identify potent low-molecular-weight adenosine kinase inhibitors (9). Markus Boehringer and his colleagues at F. Hoffmann-La Roche (Basel, Switzerland) identified novel inhibitors for DNA gyrase where HTS failed (10).

In each case, the molecules derived were novel and thus unlikely to have been exactly represented in screening collections for HTS work. However, the strength of the fragment approach is the potential to use less complex starting points and some evidence of activity to work into areas of chemistry that have not been previously explored. Patentability (also known as novelty) continues to be a necessity for undertaking an expensive drug discovery campaign, and these methods present opportunities to explore chemical structures that are not already embedded in corporate collections or suppliers’ catalogs.

Several new companies are basing at least part of their lead discovery strategy on the coupling of fragment-based screening to innovative assay technologies. Examples include Astex Technology (www.astex-technology.com), which uses high-throughput X-ray crystallography; Triad Therapeutics (www.triadthera.com), which uses NMR; and Graffinity (www.graffinity.com), which uses surface plasmon resonance.

Hits can be weak
Although there are potentially significant benefits to fragment-based screening, there are also issues. In particular, the hits identified are likely to be weak (the smaller and less functionalized compounds screened are likely to yield only weak hits of tens to hundreds of micromolar), and therefore there is a need for high confidence that these hits are valid and can be rapidly optimized. Weak leads will require more optimization than typical druglike leads (and therefore may not shorten cycle times), but in so doing they may develop into new chemistries that are not yet represented in “druglike” molecular screening collections.

So, will fragment-based screening prove to be a major breakthrough technology? Or will it, like many other technologies, not quite meet its early promise? Only results and time will tell, but with the rapid growth in the field it will not be long before we know what position fragment-based screening will assume in the armory of methods that will ultimately be needed to find tractable leads for every important target.

We thank Harren Jhoti, Andrew Leach, and Drake Eggleston for their constructive input and discussions in the preparation of this article.


  1. Milne, G. M. Accelerating R&D Productivity. Presented at Drug Discovery Technology, Boston, August 2001; www.drugdisc.com/multimedia.
  2. Beno, B. R.; Mason, J. S. Drug Disc. Today 2001, 6, 251–258.
  3. Walters, W. P.; Stahl, M. T.; Murcko, M. A. Drug Disc. Today 1998, 3, 160–178.
  4. Frye, S. V. Chem. Biol. 1999, 6, R3–R7.
  5. Hann, M. M.; Leach, A. R.; Harper, G. J. Chem. Inf. Comput. Sci. 2001, 41, 856–864.
  6. Teague, S.; Davis, A. M.; Leeson, P. D.; Oprea, T. Angew. Chem. Int. Ed. 1999, 38, 3743–3748.
  7. Lipinski, C. A.; et al. Adv. Drug Delivery Rev. 2001, 46, 3–26.
  8. Sneader, W. Drug Prototypes and their Exploitation; John Wiley & Sons: New York, 1996.
  9. Hajduk, P. J.; et al. J. Med. Chem. 2000, 43, 4781–4786.
  10. Boehm, H. J.; et al. J. Med. Chem. 2000, 43, 2664–2674.

Robin Carr is vice president of drug discovery at Astex Technology (Cambridge, U.K.), and Mike Hann is director of computational and structural sciences at GlaxoSmithKline’s Medical Research Centre (Stevenage, U.K.). Send your comments or questions regarding this article to mdd@acs.org or the Editorial Office by fax at 202-776-8166 or by post at 1155 16th Street, NW; Washington, DC 20036.
