Combinatorial Chemistry: Novel Strategies, Chemistry,
Purification and Chemoinformatics
(source: Drug & Market Development Publications, Oct 1999)
- The introduction of combinatorial chemistry
in the mid-eighties ushered in a new paradigm
for organic chemistry, and recent years have
seen several reports on new leads from
combinatorial libraries, as well as their
successful optimization as drug candidates.
- Although some companies still rely on the
production of huge libraries to increase the
chances of finding new leads, the majority of
those involved in combinatorial drug design have
shifted strategies. Many now attempt to reduce
the effort and costs associated with synthesis
and screening by carefully designing focused
libraries with optimized diversity and drug-like
characteristics.
- Solid phase chemistry continues to play an
important role, as large libraries of relatively
pure compounds can be produced once the
time-consuming reaction optimization has been
performed. However, liquid phase synthesis,
sometimes in combination with polymer-bound
reagents or scavenger reagents, offers
significant advantages; the range of applicable
chemistry is much broader and most problems
associated with solid-phase synthesis are not
encountered.
- For small, focused libraries, automated
parallel synthesis in liquid phase is the most
cost-effective strategy. Due to the increasing
importance of such libraries, automated
purification is now an integrated process in the
production of compounds for biological testing.
All steps can be automated, from the input of
impure compounds in a certain format to the
output of pure compounds in the same
format.
- In recent years, combinatorial chemistry has
also been used to optimize organic reaction
conditions, to produce materials with desired
optical, electric and magnetic properties, and
to discover new catalysts and polymers.
- Two recent IBC UK (www.ibc-uk.com)
conferences, "Combinatorial Chemistry '99:
Novel Strategies, Chemistry, Purification &
Chemoinformatics" and "Exploiting the Promise of
Combinatorial Chemistry," held June 23-25, 1999,
in London, UK,
offered a forum for discussion of the latest
approaches to combinatorial chemistry.
Introduction and Overview
The earliest drugs originated in folk medicine.
As societies advanced, more rational approaches
were adopted, but accidental discoveries still led
to many valuable drugs such as chlordiazepoxide,
cisplatin, penicillin, pethidine (meperidine),
sulfamidochrysoidin and warfarin, to name a few.
Even acetylsalicylic acid was a serendipitous
discovery, as its inventor, the Bayer chemist Felix
Hoffmann, did not know that this compound was not
simply a prodrug of salicylic acid (as supposed by
him), but a unique therapeutic in its own right; in
addition to its fever-reducing and pain-killing
properties, it was found to possess antiplatelet
aggregation activity.
For almost a century, organic synthesis, animal
experiments, and trial and error governed the
search for new drugs. In recent years, however, new
targets from genomics, high-throughput test
systems, molecular modelling and computer-aided
drug design have added new dimensions to lead
discovery and optimization. With the ongoing
progress in protein crystallography and NMR,
structure-based ligand design has become more and
more important. Many successful drugs have proven
the value of such rational approaches.
Combinatorial chemistry had a slow start.
Although the development of the Ugi multicomponent
reaction in 1962 and the Merrifield solid-phase
synthesis in 1963 offered, in principle, the
necessary tools to synthesize libraries of small
organic compounds, the first combinatorial
syntheses did not appear until 20 years later,
in the work of A. Furka, M. Geysen and
R. Houghten on the production of peptide (later
peptide and nucleic acid) libraries. From about
1990 onwards, small molecules were also
synthesized, usually as multi-component mixtures.
Nowadays, combinatorial chemistry is most often
applied to produce libraries of single, pure
compounds with drug-like properties.
Correspondingly, the sizes of the libraries have
become smaller and smaller and, as a consequence,
the balance between the necessary effort for
reaction optimization and the number of resulting
compounds has become unfavorable. New synthetic
procedures as well as rational design approaches
needed to be developed. Combinatorial chemistry
only appears to force chance into improving success
in drug research. The mere synthesis of enormous
numbers of organic compounds, without rational
design, is a waste of time and resources.
Combinatorial chemistry will increase the success
rate in drug research only in combination with
appropriate selection procedures, such as filtering
compounds by properties and/or virtual
screening.
Combinatorial chemistry has been widely adopted
by the pharmaceutical and biotechnology industries.
From the first technological developments to the
current widespread integration in discovery and
development at a highly automated level, there have
been questions raised regarding the ultimate extent
of diversity and the leads that have been
generated. Correspondingly, the most important
topics at the Combinatorial Chemistry '99
conference (see introductory bullets) were
chemoinformatics tools for rational library design,
molecular diversity, sublibrary selection and
virtual screening of combinatorial libraries. The
shift in synthetic and design strategies was
clearly reflected in a number of presentations at
the conference.
Novel Strategies
Taken together, library synthesis and
high-throughput screening (HTS) allow the
development and screening of huge numbers of
molecules. However, the futility of this approach
has become apparent, as there will always be many
more molecules to make than can be screened and
tested. As such, several strategies for the design
of appropriate molecules for lead generation and
their optimization into drug-like molecules have
been developed at Glaxo Wellcome (Stevenage,
Hertfordshire, UK). Sublibraries are selected from
virtual libraries by considering the desired
properties of the final products. As illustrated by
Mike Hann, Group Leader Computational Chemistry,
the company is using a range of Web-based tools in
this approach, including a generalized genetic
algorithm (GA) approach for design against any
property, a GA approach that generates diversity
within a focused design, and an efficacy and
efficiency score for fast monomer selection.
At SmithKline Beecham (King of Prussia, PA),
high-throughput lead discovery technology uses the
efficiency of split-and-mix combinatorial syntheses
while still allowing the assay of individual
compounds. Michael Moore, Associate Director
Combinatorial and Chemical Technologies, reported
on high-loading, large-diameter polystyrene beads
that are used as the polymeric support. Single beads
are arrayed and cleaved, yielding compounds in the
range of 10-20 nmol per bead (i.e., about 5-10 µg
at a molecular weight of about 500),
which is sufficient for multiple biological assays
as well as for compound identification by
LC/MS.
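The split-and-mix principle behind this approach can be illustrated with a small simulation. This is a minimal sketch rather than a description of the SmithKline Beecham process: the building-block labels, the bead count and the even, round-robin splitting are all assumptions made for illustration.

```python
import itertools
import random


def split_and_mix(building_blocks_per_step, n_beads, seed=0):
    """Simulate split-and-mix synthesis on a pool of beads.

    Each bead is a list of the building blocks coupled to it so far.
    In every step the pool is mixed, split into one portion per
    building block, and each portion is coupled with its block.
    """
    rng = random.Random(seed)
    beads = [[] for _ in range(n_beads)]
    for blocks in building_blocks_per_step:
        rng.shuffle(beads)  # mixing: bead order carries no history
        for i, bead in enumerate(beads):
            # splitting: bead i goes to the vessel of block i mod len(blocks)
            bead.append(blocks[i % len(blocks)])
    return ["-".join(bead) for bead in beads]


# Hypothetical building blocks: three steps with three blocks each.
steps = [["A1", "A2", "A3"], ["B1", "B2", "B3"], ["C1", "C2", "C3"]]
beads = split_and_mix(steps, n_beads=500)

possible = {"-".join(p) for p in itertools.product(*steps)}
n_couplings = sum(len(s) for s in steps)
print(len(possible), "possible compounds from", n_couplings, "coupling reactions")
print(len(set(beads)), "distinct compounds found on", len(beads), "beads")
```

Every bead passes through exactly one vessel per step and therefore carries a single compound, which is why individual beads can be arrayed and assayed separately even though the synthesis itself is pooled.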
For successful lead optimization libraries, the
library design has to be focused around a lead and
biased towards "drug-like" compounds. Heuristic
optimization strategies for combinatorial library
design include the application of multicenter
pharmacophores for the description of molecular
diversity space, considering conformational
flexibility of the molecules. For the design of
drug-like properties and for enhancing hit-to-lead
properties of lead optimization libraries, two
approaches were presented by David E. Clark,
Research Fellow Computer-Assisted Drug Design,
Rhone-Poulenc Rorer (Dagenham, Essex, UK). The
first seeks to maximize the overlap of computed
physical properties (e.g., ClogP and MW) between
the designed library and a collection of known drug
molecules. The second considers bioavailability by
using the "rule of five" and calculating polar
surface areas of the compounds as a measure of
their ability to permeate biological membranes. The
techniques were illustrated by the optimization of
a 4,5-bis-arylimidazole library with respect to
Caco-2 cell permeability.
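A property filter of the second kind might look like the following sketch. The four "rule of five" cut-offs are the published ones; the tolerance of one violation, the polar surface area limit of 140 square angstroms and the three compounds with their precomputed descriptor values are assumptions chosen for illustration, not data from the presentation.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    name: str
    mol_weight: float   # g/mol
    clogp: float        # calculated octanol/water log P
    h_donors: int       # number of OH and NH groups
    h_acceptors: int    # number of N and O atoms
    psa: float          # polar surface area, square angstroms


def rule_of_five_violations(c):
    """Count how many of the four rule-of-five limits are exceeded."""
    return sum([
        c.mol_weight > 500,
        c.clogp > 5,
        c.h_donors > 5,
        c.h_acceptors > 10,
    ])


def is_drug_like(c, max_violations=1, max_psa=140.0):
    """Assumed policy: at most one violation and PSA below a cut-off."""
    return rule_of_five_violations(c) <= max_violations and c.psa <= max_psa


# Hypothetical library members with made-up descriptor values.
library = [
    Compound("cmpd-1", 342.4, 3.1, 1, 4, 58.0),
    Compound("cmpd-2", 612.8, 6.2, 2, 7, 95.0),
    Compound("cmpd-3", 455.5, 2.4, 4, 9, 162.0),
]
selected = [c.name for c in library if is_drug_like(c)]
print(selected)  # ['cmpd-1']
```

Here cmpd-2 fails on two rule-of-five limits (weight and ClogP) and cmpd-3 fails only on polar surface area, which shows why the two criteria are worth checking separately.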
Combinatorial Chemistry and Methodology
Scavenger resins, polymer-bound reagents, catch
and release methods, and resin capture have become
increasingly important in the production of
combinatorial libraries. Such strategies allow
organic synthesis using simple, parallel product
purification by filtration, avoiding silica-gel
chromatography and/or extraction. Reaction
procedures for parallel synthesis of libraries of
tertiary amines (from diamines and
polystyrene-TsCl), amides and sulfonamides (using a
polystyrene-DMAP reagent), 2-aminothiazoles and
1,2,3-thiadiazoles using catch and release and
resin capture techniques, and polymeric reagents
for the chlorination of acids and alcohols and for
Mitsunobu and Wittig reactions
(polystyrene-triphenylphosphine) were presented by
Bernd Renneberg, Applications Chemist, Argonaut
Technologies AG (Muttenz, Switzerland). New soluble
polymer-supported catalysts, reagents and synthetic
targets as adjuncts to solution-phase library
development were described by Paul Wentworth, Jr.
of The Scripps Research Institute & Skaggs
Institute for Chemical Biology (La Jolla, CA).
Examples are covalent scavenger reagents to remove
unwanted starting materials or by-products, resin
capture reagents to transfer products in solution
to solid phase for further modification, or
"fishing-out" reagents to remove products from a
complex reaction mixture. In parallel, new resin
supports for solid-phase chemistry have been
developed by a novel cross-linking strategy. The
physicochemical properties of PEG cross-linked
polystyrene resin supports (i.e., increased
mechanical stability, better swelling properties
and better intra-resin diffusion in poor-swelling
solvents) demonstrate the utility of this
material.
The development and application of resin-bound
selenium as a traceless linker in solid-phase organic
syntheses of small non-peptide compounds were
described by Thomas Ruhland, Research Chemist, H.
Lundbeck A/S (Copenhagen, Denmark). Compounds are
attached by direct loading without the requirement
of an auxiliary spacer, as demonstrated by the
synthesis of a small-sized library of single alkyl
aryl ethers by the Mitsunobu reaction. The
selenium-alkyl bond that attaches the products to
the resin is smoothly cleaved under radical
conditions. Disadvantages of the selenium linker,
however, are its incompatibility with certain
reaction conditions (i.e., radical chemistry, some
oxidation reactions, Pd catalysis) and the toxicity
of selenium.
Anthony D. Baxter, Chief Scientific Officer,
Oxford Asymmetry International (Abingdon, Oxon,
UK), and Lutz Weber, Chief Scientific Officer,
Morphochem AG (Martinsried, Germany), described the
respective approaches of their companies for the
synthesis of lead discovery libraries. Oxford
Asymmetry's approach to library synthesis via
novel building blocks uses conformationally
restricted templates having at least three sites
of diversity. These are coupled
with preselected monomers. As guidelines for
"drug-like" molecules, a slightly modified "rule of
five" is applied. Privileged structures include
biphenyls, biphenyl ethers, benzodiazepines,
benzoxazines, spirohydantoins and other heterocyclic
templates. Different functional groups are attached
to appropriate linkers to enhance the diversity of
a library that is based on one template.
The Morphochem approach towards high diversity
combinatorial chemistry, on the other hand,
includes the generation of libraries where
substituents and backbones are varied at the same
time. Once a biologically interesting molecule has
been identified, rather conventional combinatorial
chemistry techniques may follow. Methods that allow
the generation of high diversity libraries by
systematic variation of reaction types, instead of
starting materials of just one reaction, have been
developed. The most efficient optimization of lead
structures is performed by application of
intelligent selection and filter functions (e.g.,
ClogP, MW and Pfizer rules); in addition,
affinity estimations, neural networks, genetic
algorithms and pattern recognition methods are
applied. However, such strategies require the
synthesis of "random access" chemical libraries
instead of today's systematic combinatorial
libraries.
Automation, Purification and Analytics
Michael Moore of SmithKline Beecham presented
the Myriad Core System and the Irori AutoSort 10K
system as equipment for medium- and large-scale
array synthesis of combinatorial libraries.
High-throughput purification is performed at
SmithKline Beecham with a Gilson AutoPrep HPLC; 192
samples per day are purified, using LC/MS data of
the crude sample and UV-triggered fractionation.
Most compound fractions are collected into a single
tube. Overall, this is combined into a
high-throughput production process that includes
high-throughput synthesis, isolation, analysis,
purification, quality control and registration.
With advances in automated parallel synthesis,
the construction of large combinatorial arrays has
become possible; however, this has resulted in a
corresponding challenge for the analyst to provide
meaningful information on such large sample
libraries. As illustrated by Ashley B. Sage,
Project Leader LC/MS Applications, Micromass UK
Ltd. (Wythenshawe, Manchester, UK), the use of
LC/MS in combinatorial chemistry has the following
requirements:
- Automated high-throughput characterization
and exact mass determination of arrays using an
orthogonal acceleration time-of-flight (oa-TOF)
mass spectrometer
- The design and implementation of a
multiplexed electrospray system integrated into
a mass spectrometer that is capable of rapidly
sampling multiple liquid streams, facilitating
and increasing sample analysis throughput
- Multi-milligram quantity purification of
compounds using both reversed-phase and
normal-phase HPLC conditions
- Fraction collection by LC/MS including
automated sample tracking and reporting
A high-throughput organic chemistry (HTOC)
process was designed at Biotage, a division of Dyax
Corporation (Charlottesville, VA), to convert
crude reaction mixtures into purified compounds of
known identity and weight. The Parallex HPLC,
presented by Patrick Coffey, Vice President
Automated Chemistry, is a four-column preparative
HPLC system with deep-well microtiter plate formats
for input and output. All samples and fractions are
bar-coded and all information is tracked. The user
can specify rules for selection, combination,
dissolution and wash steps and interactively
fine-tune the selections. If fractions need to be
combined into smaller volumes, the corresponding
plates are sent to an evaporator and redissolved to
place identical fractions into the final
destination sites. As a result, the input format of
an impure library is reproduced as the same format
of purified and analytically characterized
compounds. The estimated costs of the entire
process are less than $10 per 10 mg of compound, which
compares favorably with the significantly higher
costs of synthesis of the impure samples.
Various software tools for structure
verification by MS, MS/MS and NMR spectra were
presented by Herbert Thiele, Manager Software
Development, Bruker Daltonik GmbH (Bremen,
Germany). With a corresponding increase in
computational effort, NMR spectral shifts can be:
(a) estimated from increment systems [e.g.,
SpecTool], (b) predicted [WIN-SpecEdit,
SpecInfo, CSEARCH, ACD], (c) simulated
[LAOCOON, WIN-DAISY], or (d) calculated by
quantum mechanical treatment of the molecule
[HyperNMR, NMR-Cindo, Gaussian]. Practical
limitations of the empirical approaches include the
limited bond coverage of the topology code, through-space
effects, tautomers, delocalized bonds and charges,
and stereochemical and solvent influences.
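The simplest of these approaches, estimation from an increment system, amounts to adding tabulated substituent contributions to a base value. The sketch below uses a Shoolery-type additivity rule for the protons of an X-CH2-Y group; it is not taken from any of the programs named above, and the three increments are approximate textbook values that should be checked against a published table before use.

```python
# Approximate Shoolery-type increments (ppm) for X-CH2-Y protons.
# Illustrative values only; verify against a published table before use.
BASE_SHIFT_PPM = 0.23
INCREMENTS_PPM = {
    "CH3": 0.47,
    "C6H5": 1.85,
    "Cl": 2.53,
}


def estimate_ch2_shift(substituent_x, substituent_y):
    """Estimate the 1H shift of a CH2 group as base value plus increments."""
    return (
        BASE_SHIFT_PPM
        + INCREMENTS_PPM[substituent_x]
        + INCREMENTS_PPM[substituent_y]
    )


print(round(estimate_ch2_shift("Cl", "Cl"), 2))    # dichloromethane: 5.29
print(round(estimate_ch2_shift("C6H5", "Cl"), 2))  # benzyl chloride: 4.61
```

Because each substituent contributes a fixed amount regardless of its surroundings, a scheme of this kind cannot capture through-space, stereochemical or solvent effects, which is the kind of limitation listed above.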
Chemoinformatics: Design and Virtual Screening of Libraries
The development of automatic 3D structure
generators (e.g., the program CORINA) has made
possible the large-scale investigation of the
relationships between 3D structures and biological
activity. Much controversy presently centers around
2D vs. 3D descriptors, with 2D descriptors often
outperforming 3D descriptors. However, molecules
are 3D objects, and biological activity is a
reflection of the 3D properties of the molecules.
Novel ways to code chemical structures for
correlating biological activity and determining
chemical diversity were presented by Johann
Gasteiger, University of Erlangen-Nuremberg
(Erlangen, Germany). The relationships between 3D
structures and biological activities are explored
by statistical analyses, pattern recognition
methods and neural networks for similarity
perception.
RECAP is a retrosynthetic combinatorial analysis
procedure for identifying biologically privileged
fragments for use in the synthesis of targeted
libraries. This powerful tool, described by Duncan
B. Judd, Group Leader Lead Design, Discovery
Technology, Glaxo Wellcome R&D, involves the
use of databases of compounds with known biological
activity that are electronically "cleaved" at bonds
amenable to combinatorial chemistry (e.g., amides,
esters, amines, ureas, ethers, biphenyls and
sulfonamides). The fragments and motifs can be
readily used as building blocks to prepare
combinatorial libraries containing biologically
privileged substructure motifs that may be used in
lead generation or lead optimization.
ChemSpace is an approach for the generation of
both focused and diverse libraries. Topomeric
searching of huge virtual libraries built around
lead compounds facilitates the rapid and
generalized definition of structure-activity
relationships by guiding the synthetic chemistry
program. Implementation of ChemSpace in the
production of designed libraries of chemical
compounds was discussed by Anthony Cooper, Managing
Director, Tripos Receptor Research (Bude, Cornwall,
UK). In cooperation with Bristol-Myers Squibb
(Princeton, NJ), ChemSpace was validated by
searching for angiotensin II antagonists in a virtual
library of 2.6 billion compounds, from which, after
several filtering processes, 425 compounds were
selected and synthesized. Whereas there were 63
hits in this set, no highly active compounds were
found in a control set of randomly chosen
compounds.
Nick Perry, Molecular Modelling, Knoll
Pharmaceuticals (Nottingham, UK), posed the
question, "Does more diverse mean
more informative?" Much effort has been
expended in devising methods to select structurally
diverse subsets from compound libraries, with the
implicit assumption that testing a diverse subset
in a biological screen is more effective for lead
finding than testing a random subset. MDL keys and
Ghose and Crippen fingerprints were used to define
dissimilarity metrics and two recursive selection
methods were compared, with the Most Descriptive
Compound method performing better than the Minimum
Separation method. The efficacy of the selection
procedure was measured as the improvement compared
with random selection of compounds representing
different biological activity-types. Typically,
structurally diverse subsets give better
biological coverage when compared to randomly
selected subsets.
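Dissimilarity-based selection of this general kind can be sketched as follows. The example is a generic greedy max-min picker using Tanimoto distance on fingerprint bit sets; it is not the Most Descriptive Compound or Minimum Separation method compared in the talk, and the five toy fingerprints are invented for illustration.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def maxmin_select(fingerprints, n_select, start=0):
    """Greedy max-min picking.

    Repeatedly add the compound whose nearest already-selected
    neighbour is the most dissimilar (largest minimum distance).
    """
    selected = [start]
    # Distance from each compound to its closest selected compound.
    min_dist = [1.0 - tanimoto(fp, fingerprints[start]) for fp in fingerprints]
    while len(selected) < min(n_select, len(fingerprints)):
        candidate = max(
            (i for i in range(len(fingerprints)) if i not in selected),
            key=lambda i: min_dist[i],
        )
        selected.append(candidate)
        for i, fp in enumerate(fingerprints):
            d = 1.0 - tanimoto(fp, fingerprints[candidate])
            if d < min_dist[i]:
                min_dist[i] = d
    return selected


# Hypothetical fingerprints: three chemotypes, two with a close analogue.
fps = [
    {1, 2, 3, 4},    # 0
    {1, 2, 3, 5},    # 1: close analogue of 0
    {10, 11, 12},    # 2: second chemotype
    {10, 11, 13},    # 3: close analogue of 2
    {20, 21},        # 4: third chemotype
]
print(maxmin_select(fps, 3))  # [0, 2, 4]
```

The picker takes one representative from each chemotype and skips the close analogues, which is the behaviour a diverse subset is meant to show. Whether such a subset is also more informative in a biological screen is the question the comparison with random selection addresses.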
Chemically related compounds most often show
similar biological activities. This fundamental
relationship has led to the discovery and stepwise
optimization of many valuable drugs. However,
pitfalls that may arise in the design of a
combinatorial library were discussed by Hugo
Kubinyi, BASF AG (Ludwigshafen, Germany). In many
cases, presumably closely related compounds show
very different modes of action and/or biological
potencies, for example tricyclic antihistamines,
neuroleptics and antidepressants, the female and
male sex hormones, agonists and antagonists, and
isosteric analogues. Other minor structural
modifications result in surprisingly different
binding modes.
Exploiting the Promise of Combinatorial Chemistry:
Applications in Drug Research and Catalysis
Genomics and proteomics offer a host of novel
therapeutic targets. Combinatorial chemistry holds
the promise of decreasing overall development times
by increasing the probability of success for lead
discovery against "difficult" targets. John J.
Baldwin, Chief Science and Technology Officer,
Pharmacopeia, Inc. (Cranbury, NJ) illustrated the
use of large encoded libraries in lead discovery.
Success stories on micromolar and submicromolar
ligands against different protein kinases and
nanomolar inhibitors against the bradykinin
receptor-1 (B1) were presented without providing
structural details.
Combinatorial chemistry and structure-based drug
design are effective methods for finding lead
compounds, but both have their limitations. The
PRO_SELECT methodology, discussed by Stephen C.
Young, Head of Synthetic Chemistry, Proteus
Molecular Design Ltd. (Macclesfield, Cheshire, UK),
combines the benefits of both. A huge virtual
combinatorial library is first reduced in size by
applying several filters and then screened for
complementarity against a receptor structure. The
resulting focused sublibrary has a higher
hit-per-compound ratio than a random and diverse
library of related chemistry. The PRO_SELECT
approach has been used to create small libraries
targeted against the therapeutically important
serine proteases thrombin (virtual library of 37.5
million compounds reduced to a 72-member
sublibrary), trypsin, and factor Xa; a high
proportion of "hits" was obtained. For factor Xa,
the process was used iteratively, starting from a
small template molecule. This ultimately led to a
series of highly potent and selective molecules.
The same approach could be used in designing
slightly larger combinatorial libraries focused on
targets with less clearly defined structures.
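The general shape of such a funnel (enumerate a virtual library, apply cheap filters, then rank the survivors with a more expensive score) can be sketched in a few lines. This is not the PRO_SELECT implementation: the monomer sets, the mass filter and especially the scoring function are hypothetical stand-ins, with the score taking the place of a real assessment of complementarity to the receptor.

```python
import itertools

# Hypothetical monomer sets: (label, mass contribution in g/mol).
CORE_MASS = 150.0
R1 = [("a", 15.0), ("b", 77.0), ("c", 91.0), ("d", 155.0)]
R2 = [("e", 29.0), ("f", 43.0), ("g", 127.0)]
R3 = [("h", 17.0), ("i", 45.0), ("j", 183.0)]


def enumerate_library():
    """Yield every combination of one monomer per substitution site."""
    for (n1, m1), (n2, m2), (n3, m3) in itertools.product(R1, R2, R3):
        yield {"name": n1 + n2 + n3, "mass": CORE_MASS + m1 + m2 + m3}


def passes_filters(cmpd, max_mass=450.0):
    """Cheap first-stage filter (assumed cut-off)."""
    return cmpd["mass"] <= max_mass


def placeholder_score(cmpd):
    """Stand-in for a receptor complementarity score (higher is better)."""
    return -abs(cmpd["mass"] - 350.0)


def focused_sublibrary(n_keep):
    """Filter the virtual library, then keep the best-scoring members."""
    survivors = [c for c in enumerate_library() if passes_filters(c)]
    survivors.sort(key=placeholder_score, reverse=True)
    return survivors[:n_keep]


virtual_size = len(R1) * len(R2) * len(R3)
sub = focused_sublibrary(5)
print(virtual_size, "virtual compounds reduced to", len(sub))
print([c["name"] for c in sub])
```

The virtual library size is the product of the monomer counts at each site, which is how a handful of reagent lists grows into millions of candidates. Running the cheap filters first keeps the costly scoring step to a manageable number of structures.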
Resistance of bacteria to most antibiotics used
in the clinic is now a serious problem. Thus, the
discovery of new, medicinally useful
anti-infectives against resistant gram-positive
bacteria is a very important issue. A series of
novel small molecules (e.g., 3,4-disubstituted
2-(indol-3-yl)-tetrahydroisoquinolines) synthesized
via combinatorial methods and demonstrating very
interesting in vitro and in vivo activity against
resistant gram-positive infections were presented
by James R. Hauske, Sr. Vice President Discovery,
Sepracor, Inc. (Marlborough, MA). In this study,
combinatorial methods provided both lead discovery
and lead optimization libraries that were linked to
in vitro ADME, toxicological and physicochemical
property screens affording compounds with drug-like
properties.
An overview of combinatorial catalysis and the
state-of-the-art in industry was provided by Thomas
Bradshaw, President, The Catalyst Group
(Springhouse, PA). The high-throughput screening of
either heterogeneous or homogeneous catalysts for
the highest activity and yield for a specific
reaction requires the optimization of
multidimensional operating parameters (e.g., pH,
temperature, solvent or gas phase, and pressure)
to maximize yield and purity of the reaction
product(s). Although combinatorial catalyst
development and optimization is still in its
infancy, commercial commitments already exist to
develop this tool further; there have been several
collaborations between large companies and the
combinatorial materials science company Symyx
(Santa Clara, CA) and other venture capital
companies. These collaborations include synthesis of
catalysts using combinatorial approaches, the
evaluation of catalyst activities using robotic
systems, and process development for novel
catalyzed reactions.
Summary and Conclusions
Drug research is an evolutionary process. In the
same manner that nature developed higher organisms
from more primitive forms, lead structure search
and optimization follow evolutionary principles.
Combinatorial chemistry will not change this.
However, combinatorial chemistry does speed up drug
discovery in two different ways: (1) the production
of large numbers of chemically diverse libraries
may increase the yield of new leads, and (2)
rational lead optimization by automated parallel
synthesis reduces the time needed for each
evolutionary cycle.
Drug properties are more important than the
chemical accessibility of a library. Thus,
chemistry-driven combinatorial libraries are less
attractive than rationally designed libraries.
"Drug-like" properties, good oral bioavailability
and metabolic stability are preconditions for
valuable leads. Similarity can be better defined
than diversity, or the lack of similarity.
Chemically similar compounds may have very
different biological activities; their "biological
similarity" significantly depends on the target. On
the other hand, very different compounds may have
identical biological activities.
Large libraries may have a higher probability of
producing hits (following a nonlinear dependence!),
ease the derivation of SARs and/or QSARs, and allow
a broader patent coverage. However, despite all
this, large libraries are most often a waste of
time and resources because small libraries need
less effort for reaction optimization and
production (e.g., liquid phase instead of solid
phase synthesis) and generate a much higher
diversity.
Structure-based and computer-assisted design of
protein ligands supplement combinatorial chemistry
in drug research. Virtual screening will become
increasingly important in the design of
combinatorial libraries. Computer-assisted,
flexible, combinatorial construction of drugs
within the binding pocket of the biological target
using diverse building blocks will be possible in
the near future. The necessary tools are already
available, but scoring functions have to be
improved.
Finding active compounds in libraries is not, in
itself, a success. What is decisive for industrial
success is not "me too," but "me better," "me faster,"
"me first" or even "me only." Combinatorial chemistry
will not replace classical chemistry. Only minor
segments of chemistry and of the universe of
biologically active compounds can be covered by
combinatorial libraries; only simple and robust
equipment is suited for automated synthesis in
organic chemistry laboratories. In essence,
combinatorial chemistry is a tool. It will not
directly produce development candidates for
clinical investigations, but a lot of important
information will result that will enable the
medicinal chemists to find new drugs much more
easily and much faster.