At last year’s BIO CEO & Investor Conference, BIO and BioMedTracker presented data showing that drugs in development for oncology indications had the lowest overall clinical trial success rate of all therapeutic areas in development since 2004. We have updated these numbers to include clinical trials through December 2011. Although overall success rates did improve slightly, oncology remained in last place.
The chart below illustrates why oncology’s overall success rate is so low relative to other therapeutic areas. The overall Phase I to FDA approval rate (2004-2011) for oncology was 6.7%, vs. 12.1% for all other therapeutic areas combined. This nearly twofold gap is driven primarily by the big drop in Phase III success for oncology trials:
For this year’s BIO CEO & Investor Conference, we hosted an oncology panel to ask “Why?”: Why do oncology Phase IIIs have the lowest success rate? Is there a trend or defining characteristic in trial design that separates successful trials from unsuccessful ones? Are there characteristics of Phase II trials that are predictive of Phase III success? Where is trial design headed, and will it enable a reversal of course?
The panel, moderated by Michael King, Jr. of Rodman & Renshaw, featured an experienced lineup of oncologists from both industry and academia:
- Mohammad Azab, MD, M Sc, MBA, CMO, Astex
- Michael M. Morrissey, PhD, President & CEO, Exelixis
- Robert F. Ozols, MD, PhD, SVP, Fox Chase Cancer Center
- Konstantin H. Dragnev, MD, Hematologist/Oncologist, Dartmouth-Hitchcock Medical Center
Why is oncology so much more difficult?
Dr. Azab kicked off the discussion by pointing out that there is an endpoint issue in oncology. The gold standard is the overall survival endpoint, but it is rare to get solid data for survival in Phase II. Typically Phase IIs are run for progression-free survival (PFS) or response rates (overall, complete, and partial response rates). This is very different from infectious disease or lipid-lowering drugs, where the Phase II endpoint is well defined and translates directly into the Phase III design. For example, unlike endpoints used in oncology, hepatitis C and HIV have highly quantifiable markers such as viral load that provide direct, predictive value for Phase III. In oncology, the Phase II can be set up differently from the Phase III, as the Phase II is often run just to gain initial evidence to justify more rigorous, quantitative investigation.
The real change needed to improve the success in oncology is to run better Phase II trials, according to Dr. Azab. And this does not necessarily mean randomized or double-blind, but it means running trials with clear, robust criteria for success.
Is it a design issue?
BioMedTracker’s Michael Hay presented data from our recent BIO/BioMedTracker analysis of Phase II cancer trial design components. A subset of the data is presented below, and shows Phase II trial characteristics for trials that made it to Phase III. These 97 Phase II trials were divided into those that were successful in Phase III vs those suspended after reporting Phase III results. In terms of numbers, 46 were advanced from Phase III (went on to NDA/BLA submission with the FDA), and 51 were suspended after Phase III results.
For major trial parameters such as randomized/non-randomized, open-label/double-blind, and single/multi-arm, the difference between successful and unsuccessful programs was not statistically significant. In other words, more complex randomized, double-blind, and multi-arm Phase II trials do not stand out as more likely to translate into a successful Phase III and eventual approval. Looking at the numbers, it seems single-arm, open-label trials in Phase II are just as good at driving success in Phase III. The panelists, as noted below, gave hints as to why this may be the case.
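To make "not statistically significant" concrete, here is a minimal sketch of how one such comparison could be checked with a two-sided Fisher's exact test. The group sizes (46 successful, 51 suspended) come from the analysis above, but the split between randomized and non-randomized Phase II trials within each group is purely hypothetical, and the source does not say which test was actually used.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Add up every table that is no more likely than the observed one.
    total = sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, total)


# HYPOTHETICAL counts for illustration only (not from the BIO/BioMedTracker data):
# rows = Phase III outcome, columns = Phase II design (randomized, non-randomized).
successful = (20, 26)  # 46 programs that advanced to NDA/BLA
suspended = (24, 27)   # 51 programs suspended after Phase III

p_value = fisher_exact_two_sided(*successful, *suspended)
print(f"two-sided p-value: {p_value:.3f}")
```

With proportions this close (about 43% vs. 47% randomized in the made-up table), the p-value lands well above the conventional 0.05 threshold, which is the pattern the analysis describes.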
Single-arm, open-label Phase II trials in recent approvals
Recent news drove the discussion on this topic. Pfizer’s (PFE) Xalkori was approved for lung cancer in 2011 based on two single-arm, open-label Phase II trials with 255 patients total. Response rates were high (~60%) and durable, and the patient population was enriched via biomarker. Curis’ (CRIS) Erivedge was recently approved based on a 104-patient single-arm, open-label Phase II trial with response-rate primary endpoints (see the Genentech/Roche Phase II release).
Roche’s (RHHBY) Zelboraf, approved for melanoma in 2011, had a biomarker and solid response rates in a 132-patient single-arm, open-label Phase II, and was approved after a 675-patient Phase III trial. Pfizer’s (PFE) Inlyta was approved last month for RCC with a 723-patient randomized, open-label trial after conducting a 62-patient single-arm, open-label Phase II. Incyte’s (INCY) Jakafi was approved for myelofibrosis after two randomized, double-blind Phase III trials in 525 patients, following a 153-patient open-label Phase I/II study.
Dr. Dragnev, who specializes in lung cancer, explained that we are moving from non-mechanism-based therapies to refined mechanism-based therapies. Thus biomarkers will be more predictive, and we will see more trials that are small and focused. However, getting the right biomarker is a challenge. Sometimes we get it right (Xalkori), and sometimes not (early EGFR work), so we need to spend more time in Phase I validating the biomarkers.
The real answer: Biomarkers
Although the number one recipe for success in the future may be biomarkers, Dr. Azab noted that Avastin is a good example of not being able to find a biomarker, but developing a great drug. However, the industry is realizing that successful drugs without some form of companion diagnostic will be the exception in the future.
Dr. Azab expanded on what is meant by “biomarker”: It does not only mean finding the readout that correlates with clinical activity. Biomarkers can also tell you when the targeted pathway is being hit, even in the absence of clinical activity. The latter, when used early on, can help drive plans for progressing compounds designed for a specific biological hypothesis. As Chief Medical Officer at Astex (ASTX), Dr. Azab is searching for biomarkers in every ongoing Phase I trial as he believes biomarkers must be identified early on in order to drive the Phase II studies.
Dr. Ozols brought up that Herceptin would not have been approved in breast cancer had Genentech not had the HER2 diagnostic. He added to Dr. Dragnev’s point that we need predictive Phase I trials. In the past, we focused on toxicity, PK/PD, and dosing, but now with all the antibodies in the clinic, we often do not need to worry about dose toxicity as most biologics have low side effects in cancer. The focus must be on identifying activity early on.
One key obstacle with both types of biomarkers is obtaining the biopsy sample, and obtaining it in sufficient quantity and quality for analysis. A further complication for biomarkers is the complexity of the disease itself. Most indications can be broken down into a mix of many different diseases at the molecular level. However, there are other non-molecular selection criteria available. When we look back at OSI’s Tarceva, the EGFR mutations were correlated with phenotype such that selection by phenotype alone (female, non-smokers) could enrich the patient population. In addition, the rash that developed in responding patients was also indicative of clinical activity.
At Exelixis (EXEL), bone scans are used to correlate PFS and other clinical benefit endpoints for cabozantinib. CEO Michael Morrissey said he would be reluctant to bring any drug forward if it did not have some diagnostic/biomarker associated with it. Dr. Azab agreed, but added that the readout must actually provide evidence of hitting the intended target or pathway.
The grey area: endpoints
One issue brought up by the panel is the disagreement and inconsistency at the agency level on survival endpoints, surrogate endpoints, and single-arm studies. One example is accepting PFS vs. overall survival. In the case of Avastin in ovarian cancer, there was solid PFS, but FDA wanted overall survival. However, it seems the EMEA will accept PFS. Dr. Ozols added that many oncologists feel that PFS is a clinically valid endpoint for approval, but that there is some debate over the degree of PFS required. For example, almost everyone would agree that a 12-month difference is meaningful, but we enter into grey areas near two months. So the debate remains over how many months of PFS actually represent clinical benefit.
Just last week, we saw the ODAC vote 12:1 against Amgen’s (AMGN) Xgeva. Even though Xgeva is the first drug to significantly increase bone metastasis-free survival in castration-resistant prostate cancer, the four months was less than what FDA wanted, which was six months. When Dacogen was reviewed for AML today, the vote was 10:3 against, and median survival was seven months with p=0.11. So, some would view the FDA as being “hard-line” in the grey zone. Of course, there are exceptions, as illustrated by Xalkori being approved on response rates.
Dr. Ozols pointed to the recent termination of the PARP inhibitors for ovarian cancer. AstraZeneca’s olaparib program was suspended due to a lack of clarity around how PFS outcomes in the Phase II study would translate into overall survival benefit in a larger Phase III. Although the Phase II study had high PFS scores, and many physicians found the study to be a success, there was no certainty on how to select patients to definitively meet significant overall survival endpoints. This case also emphasizes that there is a need to select beyond hereditary mutations (like BRCA) and into post-hereditary mutations that arise over time. Perhaps these biomarkers are harder to identify due to the heterogeneity of the cancers found in any given population under study (in this case, finding markers for homologous repair deficiency that correlate with clinical response).
Another example is BiPar’s iniparib, a PARP inhibitor studied in breast cancer in Phase II and III. The Phase II was a success in that the 123-patient randomized trial demonstrated statistical significance for survival and high response rates. However, the 519-patient randomized Phase III did not meet its primary endpoint (overall survival). Although some of this may be attributed to design variables that were modified, it shows that simply using randomized trials and looking for survival is not the answer to all the failures we are seeing. What can happen is that, for various reasons, the control (chemo-only) group has a very poor response (low survival), and this boosts the statistics for the active arm.
Dr. Dragnev pointed out that gemcitabine was approved without survival endpoints, but rather on patient-reported outcomes. It took 14 years before the next approval without survival, and only a couple of drugs in the last 30 years were approved with patient-reported outcomes. Dr. Ozols added that reviewers essentially gloss over the Quality of Life (QoL) endpoint, as it is not considered scientific enough to be used as a surrogate. Dr. Azab argued that patient-reported responses have proven difficult in terms of data collection completeness and consistency.
Another issue with the FDA panel process, according to Dr. Ozols (who was an ODAC member in the 1990s), is that there can be panels where there is not an expert on the particular indication under review. As our understanding of cancer deepens, the complexities surrounding each indication grow. At the same time, drug products are becoming more specialized in down-regulating tissue-specific pathways. All of this may require more specialists to evaluate therapies in niche sub-indications. Dr. Ozols added that the ODAC has become more conservative over time.
Why aren’t there more adaptive trials in oncology today?
Synta (SNTA), Infinity (INFI), and Sunesis (SNSS) were mentioned as having taken an adaptive design approach to their ongoing trials. Exelixis (EXEL) CEO Michael Morrissey added that some cabozantinib trials were run in such a manner as to select patients based on active histological factors, allowing for recruitment in an iterative fashion. One of the issues with adaptive trials is that they are not as easy to register with FDA. Statistical evaluation and logistics also become top issues at registration.
Regarding the statistical evaluation, and the PFS vs. overall survival discussion earlier, Dr. Azab offered the quote of the conference: “The joke among statisticians is that statisticians are the people that think Columbus did not discover America, because at the initial trial design his endpoint was to discover India!” Adaptive trials suffer from the same kind of criticism. A sponsor might set up a trial pursuing one endpoint or patient group, but may find a better group or surrogate endpoint along the way. This creates a complicated set of data to interpret from a statistical viewpoint.
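One reason statisticians are wary of switching endpoints or patient groups midstream can be shown with a short calculation. The sketch below uses the standard familywise error formula, 1 - (1 - alpha)^k, under the simplifying assumption that the k candidate comparisons are independent and each is tested at the same level; the function name and the choice of alpha = 0.05 are illustrative and not taken from the panel discussion.

```python
def familywise_error_rate(alpha: float, k: int) -> float:
    """Chance of at least one false positive across k independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** k


# Illustrative only: how the false-positive risk grows as more endpoints or
# subgroups become candidates for the "winning" comparison.
rates = {k: familywise_error_rate(0.05, k) for k in (1, 2, 5)}

for k, rate in rates.items():
    print(f"{k} candidate comparison(s): {rate:.1%} chance of a spurious 'win'")
```

Real adaptive designs address this with pre-specified adaptation rules and adjusted significance thresholds, which is part of what makes their statistical evaluation more involved.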