4.4 – Combinatorial Chemistry

Introduction to Combinatorial Chemistry

Once an assay has been developed for testing HVA-like compounds for dopamine inhibition, the next step is to synthesize a library of test compounds to screen. As HVA is the lead compound (Figure 4.4.a), the goal is to synthesize thousands of new compounds that share the core aromatic structure but differ in other aspects. This is done through an approach known as combinatorial chemistry. Combinatorial chemistry is the mixing and matching of substituents at different sites on the molecule to create libraries of unique compounds.

The line-bond drawing of homovanillyl alcohol shows a central six-membered aromatic ring with alternating single and double bonds. Clockwise from the top of the six-membered aromatic ring, there are the following substituents: at carbon #2 there is a CH2CH2OH group; at carbon #5 there is a hydroxyl group (OH); at carbon #6 there is a methoxy group (OCH3).
Figure 4.4.a. The chemical structure of homovanillyl alcohol, HVA, the lead compound for this drug discovery process.

Combinatorial chemistry can be applied to HVA by manipulating the diversity sites on the compound. Diversity sites are parts of a molecule that can be substituted with different atoms or functional groups. HVA has several diversity sites, three of which are labelled as R1, R2 and R3 (Figure 4.4.b).

Three line-bond drawings: homovanillyl alcohol (left diagram) and two skeletal structures similar to homovanillyl alcohol (middle and right diagrams). Refer to the previous diagram in Figure 4.4.a for the structure of homovanillyl alcohol along with the numbering of carbon atoms from the top of the six-membered aromatic ring. In the left diagram, three aspects are highlighted in red: the methoxy group at carbon #6 of the six-membered ring (top left), the hydroxy group at carbon #5 of the ring (bottom left), and the hydroxy group that is part of the CH2CH2OH group on carbon #2 (right). The middle diagram is like the left diagram, except that the three red portions have been replaced with R1 (top left), R2 (bottom left), and R3 (right). The right diagram is like the middle diagram, except that all the other atoms that are not shown in red are labeled as “Not diversity sites”.
Figure 4.4.b. Three diversity sites in HVA are shown in red. The main aromatic structure and ethylene chain on the compound remain the same, seen in the black outline. Grey arrows point to sites that are not diversity sites for this molecule.

With these three diversity sites, various functional groups or substituents can be added to one or more of them to make different compounds. For example, one substituent, fluorine (F), can be substituted at one or more of the sites to make 8 unique compounds, including HVA.

A chart with line-bond drawings of compounds similar or identical to homovanillyl alcohol. Above the chart, the line-bond drawing shows the same diagram as the middle diagram in Figure 4.4.b, with R1, R2, and R3 highlighted in red. This is the skeletal structure. In the first row of the chart, the left cell says “Changing zero sites (original molecule)”, the middle cell shows the structure of homovanillyl alcohol, as described in Figure 4.4.a, and the right cell shows possible substituents for R1 = methoxy (CH3O) or fluoro (F), R2 = hydroxy (OH) or fluoro (F), and R3 = hydroxy (OH) or fluoro (F).   In the second row of the chart, the left cell says “Changing one site”, and the subsequent cells in this row show three molecules, in which a fluorine atom replaces one substituent at R1, R2, or R3 in the skeletal structure shown at the top of the diagram.   In the third row of the chart, the left cell says “Changing two sites”, and the subsequent cells in this row show three molecules, in which two fluorine atoms replace two of the three substituents R1, R2, and/or R3, in the skeletal structure shown at the top of the diagram.   In the fourth row of the chart, the left cell says “Changing three sites”, and the subsequent cell in this row shows one molecule, in which three fluorine atoms replace all three substituents R1, R2, and R3, in the skeletal structure shown at the top of the diagram. The formula below the chart reads “2x2x2 = 8 total compounds”.
Figure 4.4.c. By introducing a different substituent (a fluorine atom) at one or more diversity sites on HVA, eight unique compounds can be synthesized.
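
The mixing and matching in Figure 4.4.c can also be written out as a short program. The sketch below is only an illustration: it assumes each compound can be represented by the group found at R1, R2, and R3, and it uses plain text labels (such as "OCH3") rather than real chemical structures.

```python
from itertools import product

# Substituent options at each diversity site, as listed in Figure 4.4.c.
# The first entry at each site is the group found in HVA itself.
site_options = {
    "R1": ["OCH3", "F"],
    "R2": ["OH", "F"],
    "R3": ["OH", "F"],
}

# Every compound is one choice per site, so the library is the
# Cartesian product of the three option lists.
library = list(product(*site_options.values()))

for r1, r2, r3 in library:
    print(f"R1={r1:<5} R2={r2:<3} R3={r3}")

print(len(library), "compounds")  # 2 x 2 x 2 = 8
```

The first compound printed is HVA itself (OCH3, OH, OH), and the last has fluorine at all three sites.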

To further expand this idea, we can introduce a second functional group to substitute at each site of diversity, such as chlorine (Cl). Now there are even more possibilities for mixing-and-matching (Figure 4.4.d).

A chart with line-bond drawings of compounds similar or identical to homovanillyl alcohol. Above the chart, the line-bond drawing shows the same diagram as the middle diagram in Figure 4.4.b, with R1, R2, and R3 highlighted in red. This is the skeletal structure. In the first row of the chart, the left cell says “Changing zero sites (original molecule)”, the middle cell shows the structure of homovanillyl alcohol, as described in Figure 4.4.a, and the right cell shows possible substituents for R1 = methoxy (CH3O), fluoro (F), or chloro (Cl), R2 = hydroxy (OH), fluoro (F), or chloro (Cl), and R3 = hydroxy (OH), fluoro (F), or chloro (Cl).   In the second row of the chart, the left cell says “Changing one site”, and the subsequent cells in this row show six molecules, in which a fluorine atom or a chlorine atom replaces one substituent at R1, R2, or R3 in the skeletal structure shown at the top of the diagram.   In the third row of the chart, the left cell says “Changing two sites”. The subsequent cells in this row show twelve molecules, in which two of the three substituents R1, R2, and/or R3, in the skeletal structure shown at the top of the diagram are replaced by some combination of two fluorine atoms, two chlorine atoms, or one fluorine atom and one chlorine atom.   In the fourth row of the chart, the left cell says “Changing three sites”. The subsequent cell in this row shows eight molecules, in which all three substituents R1, R2, and R3, in the skeletal structure shown at the top of the diagram are replaced by some combination of fluorine and chlorine atoms.
Figure 4.4.d. By introducing two different substituents (a fluorine atom or a chlorine atom) at one or more diversity sites on HVA, 27 unique compounds can be synthesized.
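
The row-by-row counts in Figure 4.4.d (1, 6, 12, and 8 compounds) can be checked with a small tally. This is a minimal sketch under the same assumption as before: a compound is represented only by the text labels of the groups at R1, R2, and R3.

```python
from collections import Counter
from itertools import product

original = ("OCH3", "OH", "OH")  # groups at R1, R2, R3 in HVA
new_groups = ["F", "Cl"]         # substituents introduced in Figure 4.4.d

# Each site either keeps its original group or takes one of the new ones.
options = [[group] + new_groups for group in original]
library = list(product(*options))

# Tally compounds by how many sites differ from HVA.
changed = Counter(
    sum(a != b for a, b in zip(compound, original)) for compound in library
)

for n_sites in sorted(changed):
    print(n_sites, "site(s) changed:", changed[n_sites])
print("total:", len(library))  # 3 x 3 x 3 = 27
```

The tally gives 1 + 6 + 12 + 8 = 27 compounds, which matches three choices at each of three sites.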

The complete set of compounds that can be produced is called a combinatorial library. The library size refers to the total number of compounds in the library. It is a function of both the number of substituents and the number of diversity sites, expressed in the equation:

Library size = (# of substituents)^(# of diversity sites)

The equation for the size of the combinatorial library derives from combinatorics. For example, in Figure 4.4.c, there are two different substituents at three diversity sites. To create any molecule in the library, you have two choices for the first diversity site, two choices for the second diversity site, and two choices for the third diversity site. Therefore, there are 2 × 2 × 2 = 2^3 = 8 different possibilities.

Typically, only about one in 5000 to 10 000 candidate molecules becomes an approved drug. Therefore, chemists must create libraries of at least 5000 compounds to screen. This is where combinatorial chemistry plays an important role. By increasing the number of substituents at each diversity site, thousands of novel drug candidates can be produced. All these compounds are synthesized by automated robotic processes, which simplifies and speeds up the synthetic process.

For example, with 18 different substituents at each of the three diversity sites, a total of 5832 unique compounds can be synthesized (18^3 = 5832), all of which can be screened for dopamine inhibition in the hopes of developing a potential therapeutic. Figure 4.4.e below shows a sample of 18 substituents that could potentially be used for this process.

 

18 line-bond drawings of compounds similar or identical to homovanillyl alcohol. The structure of homovanillyl alcohol is shown at the top, as described in Figure 4.4.a, with the hydroxy group (OH) at the bottom left of the molecule highlighted in red. The other 17 line-bond drawings show variations of this structure, in which the red hydroxy group is replaced by a variety of possible functional groups such as halogens, amines, amides, esters, ethers, and thiols.
Figure 4.4.e. A sample of 18 possible substituents at R2, including the original -OH group in HVA, shown in red, with the remaining backbone of the molecule shown in black.
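
The choice of 18 substituents can be motivated with a quick calculation: it is the smallest number of substituents per site that pushes a three-site library past 5000 compounds. The sketch below assumes the same set of substituents is available at every diversity site; the function names are illustrative only.

```python
def library_size(n_substituents: int, n_sites: int) -> int:
    """Library size = (# of substituents) ^ (# of diversity sites)."""
    return n_substituents ** n_sites


def min_substituents(target: int, n_sites: int) -> int:
    """Smallest number of substituents per site giving at least `target` compounds."""
    n = 1
    while library_size(n, n_sites) < target:
        n += 1
    return n


print(library_size(17, 3))        # 4913, just short of 5000
print(library_size(18, 3))        # 5832
print(min_substituents(5000, 3))  # 18
```

With 17 substituents the library falls just short of 5000 compounds, so 18 is the minimum for three diversity sites.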

 

(The full solution to this problem can be found in Chapter 5.3)

 

Identifying Diversity Sites on a Library of Molecules

Given several molecules in the combinatorial library, it is possible to infer the number of substituents and diversity sites present, and calculate the maximum size of the combinatorial library. For example, three molecules in a library of drug candidates are shown in Figure 4.4.f.

Three line-bond drawings, in which each compound has a five-membered singly bonded carbon-containing ring, with various substituents outside of the ring. In the left diagram, clockwise from the top, the top carbon atom in the ring is bonded to an amine group (-NH2). The next carbon atom in the ring is bonded to a hydroxymethyl group (-CH2OH). The next carbon atom in the ring is bonded to a methyl group (-CH3). The next carbon atom in the ring has no extra substituents. The last carbon atom in the ring is bonded to an isopropyl group (-CH(CH3)2).  In the middle diagram, clockwise from the top, the top carbon atom in the ring has no extra substituents. The next carbon atom in the ring is bonded to a hydroxymethyl group (-CH2OH). The next carbon atom in the ring is bonded to a methyl group (-CH3). The next carbon atom in the ring is bonded to a chlorine atom (-Cl). The last carbon atom in the ring has no extra substituents.  In the right diagram, clockwise from the top, the top carbon atom in the ring is bonded to an amine group (-NH2). The next carbon atom in the ring is bonded to a methyl group (-CH3). The next carbon atom in the ring is bonded to a methyl group (-CH3). The next carbon atom in the ring is bonded to a chlorine atom (-Cl). The last carbon atom in the ring is bonded to an isopropyl group (-CH(CH3)2).
Figure 4.4.f. Three different molecules that are part of a combinatorial library.

To determine the number of diversity sites, the first step is to determine the core structure that stays constant throughout all the molecules. In this example, the core structure includes carbon atoms within the ring and outside of the ring, shown in blue in Figure 4.4.g.

The same three line-bond drawings as shown in Figure 4.4.f, with a portion of each molecule highlighted in blue. The blue portion includes the five-membered ring as well as the carbon atoms connected to positions 2 and 3 of the ring, when numbered clockwise from the top of the ring.
Figure 4.4.g. Three different molecules that are part of a combinatorial library. The portion shown in blue is the core structure conserved in all compounds within the combinatorial library.

After determining the core structure, the diversity sites can be identified (Figure 4.4.h). These are the locations that have different substituents in the three molecules. For example, diversity site 1 could have the substituent NH2 (in the molecules on the left and right) or the substituent H (in the molecule in the center). Similarly, diversity site 2 could have the substituents OH or H, diversity site 3 could have the substituents Cl or H, and diversity site 4 could have the substituents H or isopropyl. The carbon group on the bottom right highlighted by the gray arrow is commonly mistaken for a diversity site because it is a substituent outside of the ring; however, because this carbon group appears in all three molecules, it does not change and is therefore not a diversity site.

The same three line-bond drawings as shown in Figure 4.4.f, with the same portion highlighted in blue as shown in Figure 4.4.g. Additionally, yellow dots with numbers inside are shown in each of the three drawings. The number 1 is shown at the top-most carbon in the ring. The number 2 is shown at the carbon atom on the top right, which is the carbon atom connected to carbon 2 in the ring (when numbered clockwise from the top of the ring). The number 3 is shown at the carbon atom on bottom left of the ring. The number 4 is shown at the carbon atom on the top left of the ring. The same numbering scheme appears in all three molecules.
Figure 4.4.h. Three different molecules that are part of a combinatorial library. The portion shown in blue is the core structure while the yellow dots represent the four diversity sites which have variable substituents.

The next step is to identify the different substituents that are used to create this library. In these three compounds, there are a total of five different substituents present on the various diversity sites: NH2, OH, H, Cl, and isopropyl. When constructing a full library of drug candidates, all five of these substituents can be used at each of the four diversity sites, so a total of 5^4 = 625 unique compounds can be synthesized.

A line-bond diagram showing the core structure of the molecule shown in Figure 4.4.f, 4.4.g, and 4.4.h, with the same portion highlighted in blue. Clockwise from the top, the top carbon atom in the ring is connected to R1. The next carbon atom in the ring is bonded to another carbon atom that is then connected to R2. The next carbon atom in the ring is bonded to a methyl group (-CH3). The next carbon atom in the ring is connected to R3. The last carbon atom in the ring is connected to R4. The possible substituents listed for R1 include NH2, OH, Cl, H, and isopropyl. The same possible substituents are listed for R2, R3, and R4. The formula at the bottom of the diagram reads “Size of library = (# of substituents) raised to the exponent of (# of diversity sites)”. This is equal to 5 raised to the exponent 4, which is equal to 625 unique compounds.
Figure 4.4.i. The core structure of compounds in the combinatorial library is shown in blue and the four diversity sites are shown in red, indicated as R1, R2, R3, and R4. There are five substituents that can be present at each diversity site. Thus, the library size is 5^4, or 625.
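
The reasoning in this section (find what stays constant, find what varies, collect the substituents, then apply the library-size equation) can be sketched in code. The position labels below ("C1", "C2-CH2", and so on) are hypothetical names invented for this illustration, and each molecule is reduced to the group attached at each position as described for Figure 4.4.f.

```python
# Group attached at each position of the three molecules in Figure 4.4.f.
# "C2-CH2" is the carbon attached to ring carbon 2; "H" means no extra substituent.
# Position labels are hypothetical and used only for this sketch.
molecules = [
    {"C1": "NH2", "C2-CH2": "OH", "C3": "CH3", "C4": "H",  "C5": "isopropyl"},
    {"C1": "H",   "C2-CH2": "OH", "C3": "CH3", "C4": "Cl", "C5": "H"},
    {"C1": "NH2", "C2-CH2": "H",  "C3": "CH3", "C4": "Cl", "C5": "isopropyl"},
]

# A diversity site is a position whose group differs between molecules.
diversity_sites = [
    position
    for position in molecules[0]
    if len({molecule[position] for molecule in molecules}) > 1
]

# Collect every substituent seen at any diversity site.
substituents = {
    molecule[position] for molecule in molecules for position in diversity_sites
}

# Library size = (# of substituents) ^ (# of diversity sites).
max_library_size = len(substituents) ** len(diversity_sites)

print(diversity_sites)       # C3 is excluded: its methyl group never changes
print(sorted(substituents))
print(max_library_size)      # 5 ^ 4 = 625
```

The methyl group at C3 appears in all three molecules, so it is treated as part of the core rather than as a diversity site.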

The following video includes a worked example from a previous CHEM 1AA3 test or exam that students struggled with. Try solving it on your own before looking at the solution.

 

(The full solution to this problem can be found in Chapter 5.3)

 

Key Takeaways

  • In the drug discovery process, combinatorial chemistry can be applied to a lead compound of choice, such as HVA, to create a library of compounds to test.
  • Combinatorial chemistry involves “mixing and matching” various substituents at specific positions on the lead compound called diversity sites. The functional groups at these sites can be replaced with other substituents of choice.
    • The resulting library of different but similar compounds is called a combinatorial library.
  • The number of compounds in a library is given by the equation below:
    • Library size = (# substituents)^(# diversity sites)

Key terms in this chapter:

Key term: Definition
Combinatorial chemistry: The process of modifying a lead compound with various functional groups at multiple sites to create a large library of compounds.
Diversity site: A position on a lead compound at which the functional group can be replaced with other functional groups of choice.
Combinatorial library: A library, or large collection, of unique compounds, all with the same core structure derived from a lead compound.

Diversity in Chemistry: Árpád Furka 

The field of combinatorial chemistry is much broader than what is discussed in this chapter: in addition to diverse small molecules, larger polymers made of amino acids and nucleotides are also synthesized. Árpád Furka is considered one of the pioneers of combinatorial chemistry, having developed a method known as split-and-mix synthesis. This technique uses solid-phase synthesis, in which a growing peptide is bonded to a solid bead and reactants are added to the resin to react with the growing chain. As the name suggests, the beads are split into portions, each portion is reacted with a different amino acid, and the portions are then mixed back together, so that different combinations of amino acids build up on the beads. The cycle can be repeated many times to synthesize millions of different combinations of peptides or DNA for testing. This technique was first developed in 1982 at the Eötvös Loránd University in Budapest, Hungary. Born into poverty, Furka faced many challenges: he spent much of his adolescence working to support his family and fell behind his peers academically because he did not have the opportunity to study. Despite this, after attending school in his early twenties, Furka became a trailblazer in combinatorial chemistry, developing one of the most powerful techniques in the field.

A flow-chart depicting different coloured dots in red, yellow, green and blue. At the top row of the flow-chart is one green dot, which can combine with a red, yellow, or blue dot to create two dots bonded together, shown in the next row of the flow-chart. Each of these pairs of dots can then combine with a red, yellow, or blue dot to create three dots bonded together, shown in the last row of the flow-chart.
The split-and-mix synthesis technique pioneered by Furka.
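
The growth in the number of sequences during split-and-mix synthesis can be illustrated with an idealized sketch. It assumes that every portion of beads contains every chain made so far and that each coupling step works perfectly. The one-letter codes A, G, and L (alanine, glycine, leucine) are arbitrary examples, not the building blocks used by Furka.

```python
def split_and_mix(building_blocks, cycles):
    """Idealized split-and-mix: return every chain present after `cycles` rounds."""
    pool = [""]  # bare beads with nothing attached yet
    for _ in range(cycles):
        # Split: one portion per building block, and couple that block
        # to the end of every chain in the portion.
        portions = [[chain + block for chain in pool] for block in building_blocks]
        # Mix: recombine all portions into a single pool.
        pool = [chain for portion in portions for chain in portion]
    return pool


peptides = split_and_mix(["A", "G", "L"], cycles=3)
print(len(peptides))   # 3 ^ 3 = 27 tripeptides
print(peptides[:5])
```

The count follows the same equation as the library size in this chapter: the number of building blocks raised to the power of the number of cycles.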

 



License


Organic Chemistry and Chemical Biology for the Students by the Students! (and the Profs...) Copyright © 2023 by Emma Abreu; Anumta Amir; Anthony Chibba; Jim Ghoshdastidar; Sharonna Greenberg; Angela Liang; Layla Vulgan; and Shuoyang Wang is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.
