Commentary

# Time to progression ratio: promising new metric or just another metric?

Ming-Wen An1, Sumithra J. Mandrekar2

1Department of Mathematics and Statistics, Vassar College, Poughkeepsie, NY, USA; 2Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA

Correspondence to: Ming-Wen An. Department of Mathematics and Statistics, Vassar College, 124 Raymond Avenue, Poughkeepsie, NY 12604, USA. Email: mian@vassar.edu.

Submitted Sep 07, 2016. Accepted for publication Sep 11, 2016.

doi: 10.21037/atm.2016.10.21

Cirkel et al. (1) evaluated an alternative metric based on tumor progression for assessing the efficacy of targeted therapies. The time to progression (TTP) ratio, the metric they propose, is defined as the ratio TTP2/TTP1, where TTP2 is the time to progression while on treatment and TTP1 is the time to progression prior to the start of treatment (Figure 1). TTP1 is computed as the time from baseline to progression in the absence of treatment and as such represents natural disease progression. The TTP ratio addresses two important aspects in the evaluation of treatment efficacy. First, by focusing on progression, rather than response, it acknowledges cytostatic drug activity. Second, by using TTP1, it accounts for intrinsic tumor growth.

A similar metric, the progression-free survival (PFS) ratio, proposed by Von Hoff et al. (2), uses time to progression on prior treatment (TTP1*), rather than TTP1, in the denominator (Figure 1). One limitation of the Von Hoff metric, as the authors note, is that the success of the previous treatment is a major determinant of efficacy of the treatment of interest (p.14). An important assumption with TTP1 is that there is a sufficient washout period between any prior treatments and measurement of TTP1, to avoid any lingering effects of prior treatments on TTP1. Nevertheless, even if TTP1 does indeed accurately represent natural disease progression, measuring TTP1 requires withholding treatment for a patient until documented progression, which in many instances may not be ethical or feasible, for example, in the setting of advanced or refractory disease.
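As a minimal sketch of the two definitions discussed above, the following Python snippet computes both ratios for a single patient. The function names and the example durations are hypothetical and are not taken from Cirkel et al. (1) or Von Hoff et al. (2); the only point is that the two metrics share a numerator and differ in the denominator.

```python
def ttp_ratio(ttp1_weeks, ttp2_weeks):
    """TTP ratio as described in this commentary: TTP2 / TTP1.

    ttp1_weeks: time to progression before treatment start (no treatment).
    ttp2_weeks: time to progression while on the treatment of interest.
    """
    if ttp1_weeks <= 0 or ttp2_weeks <= 0:
        raise ValueError("both progression times must be positive")
    return ttp2_weeks / ttp1_weeks


def pfs_ratio(ttp1_star_weeks, ttp2_weeks):
    """PFS ratio as described in this commentary: TTP2 / TTP1*.

    ttp1_star_weeks: time to progression on the PRIOR treatment.
    """
    if ttp1_star_weeks <= 0 or ttp2_weeks <= 0:
        raise ValueError("both progression times must be positive")
    return ttp2_weeks / ttp1_star_weeks


# Hypothetical patient: all three durations are made-up illustration values.
example_ttp_ratio = ttp_ratio(ttp1_weeks=10, ttp2_weeks=25)
example_pfs_ratio = pfs_ratio(ttp1_star_weeks=20, ttp2_weeks=25)
print(example_ttp_ratio, example_pfs_ratio)
```

For the same on-treatment time, the two ratios can differ substantially whenever the prior treatment was itself effective, which is the dependence on prior-treatment success noted above.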

A challenge with using a TTP-based metric, whether the TTP ratio or the PFS ratio, is deciding how to handle patients who die before disease progression or who are lost to follow-up. To their credit, Cirkel et al. (1) clearly stated that such patients are considered non-evaluable (p.8). However, excluding these patients from the overall analysis will bias the results; more discussion on the extent of this bias is needed. A related challenge is that of interval censoring. For example, in Cirkel et al. (1), tumor assessments were performed every 8 weeks until RECIST-defined progression. But if the tumor progression occurred between assessments, which is often the case, then progression is only recorded at the next assessment, so the observed TTP overestimates the true TTP and the resulting TTP ratio will be biased. This problem is further exacerbated if a patient misses one of their assessments. These challenges arise with any TTP-based metric, and it is essential to assess the impact of these issues on the robustness of the metric and its interpretation. We applaud the authors for utilizing centralized assessment of volumetric measurements with at least two independent observers and obtaining replicable results. However, real-time central reading of scans is usually not practical in multi-center trials, as clinical decision making is based on local assessment of progression. Thus, metrics that are robust to “reader” variability are needed. From a clinical trial standpoint, it would be immensely helpful to develop methodology for tumor-based measurement metrics that can address these issues.
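The interval-censoring concern can be illustrated with a small simulation. This is a sketch under stated assumptions, not a reanalysis of any trial: the 8-week assessment schedule comes from the text above, while the exponential distribution of true progression times, its 20-week mean, and the helper name `observed_ttp` are hypothetical choices made purely for illustration.

```python
import math
import random

ASSESSMENT_INTERVAL_WEEKS = 8  # schedule reported for Cirkel et al. (1)


def observed_ttp(true_ttp, interval=ASSESSMENT_INTERVAL_WEEKS, missed_visits=()):
    """Time of the first attended assessment at or after true progression.

    missed_visits holds 1-based visit numbers the patient did not attend.
    """
    visit = max(math.ceil(true_ttp / interval), 1)
    while visit in missed_visits:
        visit += 1  # progression is only detected at the next attended visit
    return visit * interval


# Single-patient illustration: true progression at week 9.
print(observed_ttp(9))                     # recorded at the week-16 scan
print(observed_ttp(9, missed_visits={2}))  # week-16 scan missed, recorded at week 24

# Assumed (hypothetical) distribution of true progression times.
random.seed(0)
true_times = [random.expovariate(1 / 20.0) for _ in range(10_000)]
observed = [observed_ttp(t) for t in true_times]
mean_bias = sum(o - t for o, t in zip(observed, true_times)) / len(true_times)
print(f"mean overestimate of TTP: {mean_bias:.2f} weeks")
```

Under this schedule, every observed time is at least as large as the true time, and the overestimate grows by a full interval for each missed assessment.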

The TTP ratio metric faces other challenges, which are also shared by other metrics based on tumor measurements (3-5). First, it is not clear from the manuscript how the TTP ratio can be used as an early endpoint, since it is based on time to progression, which can be quite long in many disease settings with effective therapies and therefore may still delay the detection of promising treatments. Second, demonstrating superior predictive ability is not equivalent to demonstrating statistically significant differences between two groups via, for example, hazard ratios or Kaplan-Meier curves. These correspond to measures of association. Instead, appropriate statistical measures of discrimination, e.g., the concordance index, and calibration, e.g., goodness-of-fit statistics, should be utilized for evaluating predictive ability [Steyerberg et al. (6)]. For example, the authors highlight that, among RECIST-defined stable disease (SD) patients, the TTP ratio can differentiate between those with better or worse overall survival, an important clinical distinction. However, the claim needs to be validated, as the results are based on a small sample size (n=28; p.11) and solely on statistically significant differences in Kaplan-Meier survival curves, and even so, the log-rank P value is 0.0496.
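To make the distinction between association and discrimination concrete, the sketch below computes a concordance index (Harrell's C) by hand for right-censored survival data. It is an illustrative implementation, not the analysis used in any of the cited papers; the five patients, their survival times, and their metric values are entirely hypothetical, and the metric is assumed to be oriented so that larger values predict longer survival.

```python
def concordance_index(times, events, scores):
    """Harrell's C for right-censored data.

    times:  observed survival or censoring times.
    events: 1 if death was observed, 0 if censored.
    scores: metric values, where a HIGHER score is assumed to predict
            LONGER survival. Tied scores count as half-concordant.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is usable only if the shorter time is an observed event.
            if times[i] < times[j] and events[i]:
                usable += 1
                if scores[i] < scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Hypothetical data: overall survival in months, event flags, metric values.
os_months = [5, 8, 12, 20, 30]
death_observed = [1, 1, 0, 1, 0]
metric = [0.6, 1.1, 0.9, 1.5, 2.0]

c_index = concordance_index(os_months, death_observed, metric)
print(c_index)
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect ranking of patients; unlike a log-rank P value, this quantity does not depend on dichotomizing the metric at a particular cutoff.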

A few additional questions arose as we read the paper. First, how were the “fewer lesions” selected as target lesions (p.10) when determining the TTP ratio as opposed to RECIST? How are target lesions defined for the TTP ratio such that this difference arises? Second, how does the TTP ratio “identify potential patient groups that might benefit from treatment” in a manner that other tumor metrics cannot? Finally, a patient was classified as a responder if the TTP ratio was <0.7. The authors explain that their choice of cutoff was based on Von Hoff et al. (2), who used a cutoff that corresponds to 0.75 for the TTP ratio. The authors chose a stricter cutoff since TTP1 is determined without treatment, in contrast to TTP1* under Von Hoff et al. (2), which is determined under previous treatment. While it is reasonable to choose a stricter cutoff, it is not clear that 0.7 is the best cutoff. Were alternative cutoffs considered?

We thank the authors for their exploratory evaluation of this new metric, thus contributing to the growing literature on identifying alternative tumor metrics. A number of the issues we raise here are common challenges with tumor measurement data. We hope that such continued efforts will pave the way towards developing more robust tumor metrics predictive of long-term clinical outcomes.

## Acknowledgements

None.

## Footnote

Provenance: This is a Guest Commentary commissioned by Section Editor Jianrong Zhang, MD (Department of Thoracic Surgery, the First Affiliated Hospital of Guangzhou Medical University, Guangzhou Institute of Respiratory Disease, Guangzhou, China).

Conflicts of Interest: The authors have no conflicts of interest to declare.

## References

1. Cirkel GA, Weeber F, Bins S, et al. The time to progression ratio: a new individualized volumetric parameter for the early detection of clinical benefit of targeted therapies. Ann Oncol 2016;27:1638-43.
2. Von Hoff DD, Stephenson JJ Jr, Rosen P, et al. Pilot study using molecular profiling of patients' tumors to find potential targets and select treatments for their refractory cancers. J Clin Oncol 2010;28:4877-83.
3. An MW, Han Y, Meyers JP, et al. Clinical Utility of Metrics Based on Tumor Measurements in Phase II Trials to Predict Overall Survival Outcomes in Phase III Trials by Using Resampling Methods. J Clin Oncol 2015;33:4048-57.
4. An MW, Dong X, Meyers J, et al. Evaluating Continuous Tumor Measurement-Based Metrics as Phase II Endpoints for Predicting Overall Survival. J Natl Cancer Inst 2015;107.
5. Mandrekar SJ, An MW, Meyers J, et al. Evaluation of alternate categorical tumor metrics and cut points for response categorization using the RECIST 1.1 data warehouse. J Clin Oncol 2014;32:841-50.
6. Steyerberg EW, Vickers AJ, Cook NR, et al. Assessing the performance of prediction models: a framework for traditional and novel measures. Epidemiology 2010;21:128-38.

Cite this article as: An MW, Mandrekar SJ. Time to progression ratio: promising new metric or just another metric? Ann Transl Med 2016;4(Suppl 1):S43. doi: 10.21037/atm.2016.10.21