November 14, 2012
Past Project Reports
Computer-aided drug discovery and development are challenging computational approaches that play critical roles in the early phases of a drug project. Our research is developing cutting-edge computational methods to better identify promising drug leads; we are applying these methods to discover new inhibitors of the protease from dengue (and related) viruses. In Phase 1 of our project, we used the Autodock program to screen a library of ~2.3 million small molecules against proteases from dengue, West Nile, and hepatitis C viruses. This program predicted binding conformations and energies between each small molecule and each protease. This conventional approach typically has a high false-positive rate, and thus not surprisingly only a single novel dengue drug lead was authenticated by laboratory testing of the best computer predictions from Phase 1.
Phase 2 calculations were implemented to improve the binding energy predictions. These calculations used conformations from Phase 1 (which are typically very reliable) and detailed computationally demanding molecular dynamics simulations to accurately estimate entropy and solvation terms. Phase 2 calculations, using 2000 molecules with the best energies from Phase 1, were recently completed against the dengue protease and are being analyzed. Phase 2 calculations against West Nile virus protease have resulted in a large number of incomplete (i.e., failed) calculations, apparently due to instability in the molecular dynamics simulations. We are investigating approaches to make these routines more robust and stable. Phase 2 calculations against hepatitis C virus protease are ongoing.
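The hand-off from Phase 1 to Phase 2 described above amounts to a ranking step. The sketch below is illustrative only: it assumes a hypothetical list of (ligand ID, predicted energy) pairs, and the function name `select_for_phase2` and the sample IDs are not taken from the project's actual code. It treats lower (more negative) energies as better, as is conventional for docking scores.

```python
import heapq

def select_for_phase2(phase1_results, n_best=2000):
    """Return the n_best ligands with the lowest (most favorable) predicted energies.

    phase1_results: iterable of (ligand_id, energy) pairs (hypothetical format).
    The result is sorted from best (lowest energy) to worst.
    """
    return heapq.nsmallest(n_best, phase1_results, key=lambda pair: pair[1])

# Toy data; the real Phase 1 output covers millions of ligands per protease.
results = [("lig_a", -7.2), ("lig_b", -9.8), ("lig_c", -5.1), ("lig_d", -8.4)]
top = select_for_phase2(results, n_best=2)
print(top)  # [('lig_b', -9.8), ('lig_d', -8.4)]
```

With `n_best=2000`, the same call would produce the kind of short list that was passed on to the Phase 2 free energy calculations.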
In addition, we have recently completed Phase 2 free energy of binding calculations against several test systems (trypsin, HCV, ER agonist) to determine the effectiveness of these calculations against diverse protein systems. The small molecule libraries for these test systems contain several dozen known inhibitors and several thousand decoy (i.e., non-binding) molecules. In all cases, Phase 2 calculations were significantly better than Phase 1 AutoDock calculations at identifying known inhibitors in the screened libraries. We are using these test systems to investigate how to further improve the Phase 2 calculations to identify even greater numbers of binders from the test libraries.
Figure 1 from March 09, 2011 update
Figure 2 from March 09, 2011 update
Distribution of Hepatitis C, Dengue, and West Nile around the world.
March 09, 2011
Many apologies for the interval between updates. And, thank you for your patience and computer cycles as we move into the final stages of this challenging project.
Phase 2 has been fully optimized, tested, and launched. The initial results of our large-scale free energy of binding calculations, launched during the past year on test systems, have supported our expectations that these calculations would provide improved enrichment rates and reduced false positive rates relative to virtual screening based on traditional docking programs (in this case, AutoDock). More details will be provided below.
While we continue to benchmark test cases (based on the systems developed by Shoichet and Irwin laboratories [http://dud.docking.org/ ]) to better understand the strengths and limitations of our phased approach, we have launched Phase 2 production runs for the dengue and West Nile virus proteases, and are readying hepatitis C virus protease for Phase 2 free energy calculations. Each Phase 2 production run is expected to take ~4 months of wall-clock time (based on the turn-around times we are seeing with the test systems).
For completeness, we completed laboratory testing (i.e., using protease inhibition assays) of ~50 compounds predicted from the Phase 1 simulations as potential inhibitors of the dengue and West Nile proteases. Not surprisingly (actually, quite expected), none of the tested Phase 1 "hits" showed activities that were more promising than some of our already discovered dengue and West Nile protease inhibitors (see, for instance, Tomlinson and Watowich, Anthracene-based inhibitors of dengue virus NS2B-NS3 protease. 2011. Antiviral Res. 89, 127-35). It is the Phase 2 simulations that were established to correctly predict compounds that will be good protease inhibitors.
For those interested, here are some details about the initial tests that were done to validate our ongoing free energy calculations. A small-scale virtual screening experiment, using a collection of 16 binders and 14 non-binders to the lysozyme protein, showed that free energy of binding calculations could accurately discriminate between binders and non-binders. The success rates of the free energy of binding predictions (green curve labeled DG in the figure to the right) closely followed the ideal curve (yellow curve), and were a marked improvement on the Autodock (red curve) predictions. These results supported the ability of our two-phase approach to reduce false positives in virtual screening experiments.
A recently completed medium-scale virtual screening experiment (using our Phase 1 and Phase 2 strategy) examined the estrogen receptor agonist system available from Shoichet and Irwin's "Directory of Useful Decoys" (DUD; http://dud.docking.org/). This test system contained ~70 known small-molecule binders and ~2,300 chemically similar ligands that were assumed to be non-binders (i.e., decoys). From a practical point of view, we are most interested in the early success rate of virtual screening calculations, since this will dictate our success at validating predicted inhibitors in the laboratory. It is practical (based on time and budget constraints) to test only a few dozen to a few hundred of the best computer predictions. Thus, it is important that the computer's best predictions correlate highly with true binders. In the figure to the right, we compare the early success rates for Autodock (Phase 1; red curve) and free energy (Phase 2; green curve) calculations. On average, the top Autodock-predicted "hits" had a success rate of ~15% while the top free-energy-predicted "hits" had a success rate of ~40%. This success rate varied as a function of the number of top predicted "hits" that were examined, but typically the free energy calculations identified ~3 times more binders relative to the Autodock calculations.
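For readers who want to see how an early success rate is computed, here is a minimal sketch. It assumes a hypothetical ranked list of compound IDs (best prediction first) and a set of known binders; the function names and toy data are illustrative and are not the analysis code used for the figures. The enrichment factor shown is a common companion measure, added here as an assumption rather than something reported above.

```python
def early_success_rate(ranked_ids, known_binders, top_n):
    """Fraction of the top_n ranked compounds that are known binders."""
    top = ranked_ids[:top_n]
    if not top:
        return 0.0
    hits = sum(1 for cid in top if cid in known_binders)
    return hits / len(top)

def enrichment_factor(ranked_ids, known_binders, top_n):
    """Early success rate divided by the rate expected from random picking."""
    total = len(ranked_ids)
    n_binders = sum(1 for cid in ranked_ids if cid in known_binders)
    if total == 0 or n_binders == 0:
        return 0.0
    random_rate = n_binders / total
    return early_success_rate(ranked_ids, known_binders, top_n) / random_rate

# Toy library: 10 compounds ranked best-first, 2 of which are known binders.
ranked = ["c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8", "c9", "c10"]
binders = {"c1", "c4"}
print(early_success_rate(ranked, binders, top_n=5))  # 0.4
print(enrichment_factor(ranked, binders, top_n=5))   # 2.0
```

Applied to the approximate figures quoted above (~70 binders among ~2,370 compounds), random picking would succeed roughly 3% of the time, so success rates of ~15% and ~40% would correspond to enrichment factors of roughly 5 and 13.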
We are investigating methods to further improve the success rate of the free energy calculations, since we would like early success rates that exceed 80%. Moreover, we are completing Phase 2 calculations on additional DUD test systems to determine if the observed free energy-based improvements to the success rate are applicable to different target proteins. This is necessary, since virtual screening success rates vary dramatically as a function of the target protein. If one is interested, within the free virtual screening portal that we constructed (http://docking.utmb.edu) we have posted results (http://docking.utmb.edu/dudresults) obtained from running several virtual screening programs against the 40 protein targets (and libraries) that are part of the DUD database.
Finally, we are honored to receive a grant from the IBM International Foundation to support expanded testing of compounds predicted by World Community Grid calculations. The IBM International Foundation grant is based in part on the cash prize received from the recent Jeopardy! win by IBM's Watson supercomputer.
All my best,
June 25, 2009
A recent review from our lab discusses computational efforts to find small molecule drug-like leads that disable the dengue virus NS3 protease. These structure-based grid-centric approaches can be applied to most disease targets.
April 13, 2009
We are back on track with our DiscoveringDengueDrugs-Together project, and life is returning to some semblance of normalcy in Galveston and at UTMB. We thank World Community Grid members for their support and patience during the past 6 months following Hurricane Ike.
The Autodock phase (Phase 1) of our dengue project is now complete! We have screened (i.e., computationally tested) more than 3 million potential drug candidates against each of 10 different proteases from dengue, West Nile, and hepatitis C viruses. Inhibiting these target proteases prevents virus replication. We continue to analyze the 30 million Phase 1 results. We are using biochemical, cell-based, and small animal assays to characterize some of the inhibitors identified during our recently completed Phase 1 calculations.
The CHARMM phase (Phase 2) of our project is being optimized at the Texas Advanced Computing Center (TACC; www.tacc.utexas.edu) in Austin, Texas and ported by IBM to the grid for testing. This phase will perform extensive molecular dynamics calculations to accurately calculate binding free energies for the best potential drug candidates predicted in Phase 1. These calculations will significantly reduce the number of false positive predictions produced in Phase 1, thereby increasing our success rate for identifying experimentally active compounds from ~10% to >80%. Launch of Phase 2 is anticipated for early summer, 2009.
As we prepare to launch Phase 2 of our project to discover drugs for dengue, West Nile, and hepatitis C infections, we will also begin Autodock calculations in support of a collaborative drug discovery effort against leishmaniasis. This work is done in partnership with researchers from Universidad de Antioquia, Colombia (http://www.pecet-colombia.org). Leishmaniasis affects ~12 million people throughout the tropics, subtropics, and southern Europe, with ~2 million new cases each year. The disease is spread by the bite of sand flies infected with the Leishmania parasite. Although a handful of antimicrobials exist to treat some forms of leishmaniasis, concerns about their modes of delivery, effectiveness, resistance, and cost spur our drug discovery efforts into novel anti-Leishmania drugs. For this project, scientists in Colombia identified a set of enzymes critical for survival of the Leishmania parasite. Atomic structures exist for each of these enzymes, thus allowing us to computationally examine our drug candidate library for compounds that prevent the Leishmania enzymes from functioning.
As reported earlier, this project is now running better than ever before. We have moved our main computational tasks (those required to prepare workunits for World Community Grid and analyze results returned from the grid) to the supercomputers housed at TACC. This establishes a nice synergy between one of the world's most powerful supercomputers and the world's largest computer grid. Our storage capabilities have increased in size and robustness with redundant storage occurring at TACC and locally on our 12 TByte IBM DS3200 disk system.
As stated before, we greatly appreciate the computer time that has been unselfishly provided by the members of the World Community Grid!
January 30, 2009
In response to the following question posted on the World Community Grid forum:
Q: "Obviously, we know the project is back up and running again (it appears to still being 'throttled back' though), and thus, I was wondering, have all the scientists/staff managed to get their lives (and labs) back in full working order yet?"
A: Good question! We have asked IBM to rein in this project a little; hence, we are not receiving as much processing time as other projects. There were several factors for this decision. (1) Although we are up and running, we are not yet fully recovered at UTMB. The campus and community (and hence our staff) continue to rebuild, but this will take time and money (and only one of those items is assured). (2) We are still evaluating our new disk storage arrangement. Until we have more experience with its operation, we decided to temporarily reduce the data flow we receive from the grid (since each target generates ~0.5 TByte of information, we are seeing how well we handle 1 TByte/month of incoming data before again ramping up this project). (3) We have ~4 more antiviral targets (mainly HCV) and 6 Leishmania targets to examine in Phase 1. However, we are behind in analyzing and testing Phase 1 results received for the dengue and West Nile protease targets. Thus, we are playing catch-up with the results we have received from your computers. Delays in analyzing Phase 1 results are due to Hurricane Ike and to shifting the efforts of our computational chemists more towards the Phase 2 calculations. Delays in "wet bench" testing of Phase 1 results are due to Hurricane Ike damage and the time-consuming nature of these experiments.
So, at this point we are hard-pressed to keep up with the data generated by the grid (too many ever-increasing powerful computers out there). However, I suspect that the grid will need to grow even more to return our Phase 2 calculations as fast as we would like ... we will see shortly ...
January 18, 2009 update
On January 18, 2009, our World Community Grid project "Discovering Dengue Drugs-Together" resumed sending out workunits to grid clients around the world. This project has been off-line since September 13, 2008, when Hurricane Ike devastated Galveston, Texas and our university. We thank World Community Grid members for their support and patience as we rebuild our damaged infrastructure and community. Much is left to do to fully recover, but at least this project is restored.
Our initial projections as to how long it would take to restart operations were clearly overly optimistic. Recounting the myriad of large and mundane problems that were faced to get back on-line is a little mind-numbing at this point. However, in brief, most personnel on this project lost their homes and possessions, and needed to focus on family recovery. Our main computer cluster sustained damage and remains unstable (contrary to initial assessments). Disk storage components were lost to flood waters and have taken time to replace. Institutional support for this project was reduced as the University of Texas Medical Branch cut ~25% of its staff.
On the positive side, this project is now running better than ever before. We have moved our main computational tasks (those required to prepare workunits for World Community Grid and analyze results returned from the grid) to the supercomputers housed at the Texas Advanced Computing Center (TACC; www.tacc.utexas.edu) in Austin, Texas. Our pre- and post-processing calculations execute 10x faster in this environment compared to our in-house 30-node computer cluster. Thus, we have established a nice synergy between one of the world's most powerful supercomputers and the world's largest computer grid. Our storage capabilities have increased in (1) size with the local addition of IBM's 12 TByte DS3200 disk subsystem and (2) robustness with redundant storage occurring at TACC and locally. Programs required to accurately calculate free energy of binding (Phase 2 of this project) have been tested and optimized at TACC, and are being ported to the grid. Thus, the computational side of this project is more powerful and dynamic than before Hurricane Ike.
The current tasks before us are clear. (1) We will resume pre-clinical characterization of potential dengue, West Nile, and hepatitis C virus protease inhibitors predicted by our Phase 1 calculations. Several Phase 1 compounds have shown promising in vitro activity. (2) We will launch Phase 2 free energy of binding calculations to reduce the large false-positive rates associated with virtual screening. Finally, our newly established synergistic World Community Grid/TACC environment will help us tackle additional global health projects; with IBM's support we are about to launch a collaborative drug discovery effort with researchers in South America to target the Leishmania parasite.
As stated before, we greatly appreciate the computer time that has been unselfishly provided by the members of the World Community Grid. Your interest and support have made this grid one of the most powerful computing resources on the planet!
December 17, 2008
Hello from UTMB! No, we are not inactive. Yes, we are bringing this project back online. Yes, it is taking us longer than expected since we continually run into unexpected personnel and infrastructure surprises. But, behind the scenes, we are working hard and are committed to bringing this project back online. The coding and testing of Phase 2 (free energy calculations) of this project is occurring even as Phase 1 sits apparently silent.
Our main LINUX cluster used for pre/post processing of workunits and storage of the vast amount of information generated by your crunching computers was offline. Now, only our 12 TB IBM DS3200 storage system is offline. We are reinstalling third-party interface boards and drivers to bring this system up (if this fails, we will move this storage system to another cluster, and work from there). Surprising as it may seem, this is taking time.
Thank you for your continued patience. Best wishes to all for this Holiday Season from the DDD-T group at UTMB.
September 30, 2008
A quick update on our project, which was greatly impacted by Hurricane Ike when it tore through the Houston-Galveston area ~16 days ago. Houston, the 4th largest city in the US, is recovering quickly from this massive storm. Power is largely restored to most residents (including my own home). It was very strange and quiet to be without electricity and internet for many days.
Unfortunately, Galveston and the University of Texas Medical Branch (where this project originates) were very hard hit by the hurricane's storm surge. Fortunately, prudent preparations kept most residents and staff safe. However, it will take Galveston many months (to years) to completely rebuild and recover from the storm's destruction. Galveston remains largely without power, and water is problematic in many places. Yet, even without power, cleanup and rebuilding operations are in full swing.
Our current priority remains getting our students and researchers dug out, dried off, and moved into new apartments (in my lab alone, over 60% of the students/post-docs lost their apartments and all possessions due to flooding and wind damage). Once our staff and their families are secured and resettled, the campus repaired, and our labs reopened, our DDDT project will return to full production mode. We are working hard to have this accomplished within the next 2 weeks. Until then, thank you for your help and keep your computers dry and crunching fast.
Pre-Ike, we tested the first set of dengue and West Nile inhibitors predicted by our Phase 1 calculations (0202 and 0203 workunits). We were delighted to find several compounds that were highly effective protease inhibitors. Once our labs reopen, we will test the antiviral activity of these compounds in cell culture. We are very encouraged by the preliminary Phase 1 results, and look forward to starting our Phase 2 calculations to improve the success rate of our antiviral drug discovery calculations.
June 3, 2008 update
• Our Discovering Dengue Drugs - Together project has currently benefited from >6,000 years of computer processing time on World Community Grid. We are extremely grateful and indebted to those individuals who have unselfishly volunteered their computers to help us search for cures to dengue, West Nile, and Hepatitis C diseases.
• Trypsin protease (Protein Data Bank entry 1EB2) has been screened against our 2.2 million member ligand library to provide a negative control for inhibitor discovery (termed workgroup 0401).
• Hepatitis C virus protease was used as a target to screen a combined drug and lead-like library containing 2.2 million compounds. These calculations (termed workgroup 0501) were completed in ~2 months.
• A second optimization of the Autodock program and its parameters was completed in late April 2008. This included modifying the code to enable multiple ligands to be combined into a single workunit for delivery to end-user computers. Although this increased the execution time for each workunit, this has decreased network traffic on the World Community Grid servers and enabled more ligands for the DiscoveringDengueDrugs-Together project to be distributed. This has allowed our workflow to increase by ~2- to 3-fold.
• Recoding and porting the CHARMM molecular dynamics program for use on World Community Grid is underway. The CHARMM program will be the main software for Phase 2 of this project. It will enable us to accurately calculate free energies of binding for tens of thousands of the best fit protease-ligand structures determined in the Phase 1 Autodock calculations. These calculations are necessary to correct false-positive energy predictions made by the Autodock scoring function.
• A variant of the dengue virus NS3 protease was used as a target to screen a combined drug and lead-like library containing 2.2 million compounds. These calculations (termed workgroup 0202) were completed in <1 month.
• A variant of the West Nile virus NS3 protease was used as a target to screen a combined drug and lead-like library containing 2.2 million compounds. These calculations (termed workgroup 0302) were completed in <1 month.
• We are preparing to launch a new version of the Phase 1 Autodock software that contains error checking embedded within each workunit. This will eliminate the use of duplicate (or triplicate) calculations to validate each workunit. This improved code should allow us to reach our goal of screening our ligand library against a single protein in less than 2 weeks.
• The combined results from screening our ligand library against the dengue virus NS3 protease and a dengue protease variant are being analyzed in our laboratory to identify common hits (i.e., compounds predicted to fit into the protease active site). These results will be filtered to remove inhibitors predicted to bind trypsin, since broad-spectrum inhibitors may have adverse human effects. Compounds with promising inhibition constants and specific dengue protease binding will be synthesized and evaluated in cell culture and animal models.
• The combined results from screening our ligand library against West Nile and dengue proteases are being analyzed in our laboratory to identify potential flavivirus protease inhibitors. These results will be filtered to remove compounds predicted to bind trypsin, since broad-spectrum protease inhibitors may have adverse human effects. Compounds with promising inhibition constants and specific flavivirus protease binding will be synthesized and evaluated in cell culture and animal models.
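The hit-filtering step described in the last two items above (keep compounds predicted to hit every target of interest, then discard any that are also predicted to bind the trypsin control) can be sketched with set operations. This is an illustration under assumed inputs: the function name, the target labels, and the compound IDs are hypothetical, and the real analysis also weighs predicted inhibition constants.

```python
def specific_common_hits(hits_by_target, offtarget_hits):
    """Compounds predicted to hit every target but not the off-target control.

    hits_by_target: dict mapping a target label to its predicted hit IDs.
    offtarget_hits: IDs predicted to bind the negative control (e.g. trypsin).
    """
    hit_sets = [set(ids) for ids in hits_by_target.values()]
    if not hit_sets:
        return set()
    common = set.intersection(*hit_sets)   # hits shared by all targets
    return common - set(offtarget_hits)    # drop broad-spectrum binders

# Hypothetical predicted hits for two dengue protease targets and trypsin.
hits = {
    "dengue_ns3": {"cmpd_1", "cmpd_2", "cmpd_3", "cmpd_5"},
    "dengue_ns3_variant": {"cmpd_2", "cmpd_3", "cmpd_4", "cmpd_5"},
}
trypsin_hits = {"cmpd_3", "cmpd_9"}
print(sorted(specific_common_hits(hits, trypsin_hits)))  # ['cmpd_2', 'cmpd_5']
```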
January 22, 2008 update
• An initial optimization of docking parameters was completed in early Fall, 2007. This allowed us to reduce computation time ~20-fold without impacting the accuracy of pose predictions.
• Dengue virus NS3 protease (Protein Data Bank entry 2FOM) was used as a target to screen a combined drug and lead-like library containing 2.2 million compounds. These calculations (termed workgroup 0201) were completed in ~2 months. Grid resources for these calculations were limited to <10% of World Community Grid to minimize file transfer bottlenecks resulting from packing only a single ligand into each workunit.
• Predicted inhibitors obtained from screening our ligand library against dengue virus NS3 protease are being analyzed in our laboratory using a protease inhibition assay. Compounds with promising inhibition constants will be further evaluated in cell culture and animal models.
• West Nile virus NS3 protease (Protein Data Bank entry 2FP7) was used as a target to screen a combined drug and lead-like library containing 2.2 million compounds. These calculations (termed workgroup 0301) were completed in ~2 months and ended in early January '08. Grid resources for these calculations were also limited to <10% of World Community Grid to minimize file transfer bottlenecks resulting from packing only a single ligand into each workunit.
• Predicted inhibitors obtained from screening our ligand library against West Nile virus NS3 protease are being analyzed in our laboratory using a protease inhibition assay. Compounds with promising inhibition constants will be further evaluated in cell culture and animal models.
• Additional docking calculation optimizations are entering beta testing. These optimizations will enable us to reach our goal of screening our ligand library against a single protein in less than 2 weeks. This involves packaging multiple ligands (~10) into a single workunit, thereby minimizing file transfer bottlenecks. In addition, checksum error checking will be embedded within each workunit, eliminating the use of redundant calculations to validate each workunit.
• Trypsin protease (Protein Data Bank entry 1EB2) is being screened against our ligand library to provide a negative control for inhibitor discovery (termed workgroup 0401).
• Hepatitis C virus protease is being readied for computational screening.
• Our Discovering Dengue Drugs - Together project has currently benefited from 3,626 years of computer processing time on World Community Grid. We are grateful to those individuals who have unselfishly volunteered their computers to help us find cures for dengue, West Nile, and Hepatitis C diseases.
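The workunit packaging idea in the list above (about 10 ligands per workunit, each carrying its own checksum) can be sketched as follows. This is not the grid's actual workunit format: the dictionary layout, the function names, and the use of SHA-256 as the checksum are assumptions made for illustration.

```python
import hashlib

def pack_workunits(ligands, per_unit=10):
    """Group ligand records into workunits and attach a checksum to each.

    ligands: list of strings, one per ligand record (hypothetical format).
    SHA-256 stands in here for whatever checksum the real workunits embed.
    """
    units = []
    for start in range(0, len(ligands), per_unit):
        batch = ligands[start:start + per_unit]
        payload = "\n".join(batch).encode("utf-8")
        units.append({"ligands": batch,
                      "sha256": hashlib.sha256(payload).hexdigest()})
    return units

def verify_workunit(unit):
    """Recompute the checksum and compare it with the one stored in the unit."""
    payload = "\n".join(unit["ligands"]).encode("utf-8")
    return hashlib.sha256(payload).hexdigest() == unit["sha256"]

ligands = [f"ligand_{i:03d}" for i in range(25)]
units = pack_workunits(ligands, per_unit=10)
print(len(units))                           # 3
print([len(u["ligands"]) for u in units])   # [10, 10, 5]
print(all(verify_workunit(u) for u in units))  # True
```

At ~10 ligands per workunit, a 2.2 million compound library would need roughly 220,000 workunits instead of 2.2 million, which is where the reduction in file transfers comes from.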
August 21, 2007 update
• Phase 1 (Autodock) of this project was launched August 21, 2007.