How to improve assessment of balance in baseline characteristics of clinical trial participants—example from PROSEVA trial data

Emir Festic, Bhupendra Rawal, Ognjen Gajic

Abstract

Randomization is expected to balance group assignment independently of the participant and the investigator, and thereby to avoid systematic error. However, it is recognized that randomized groups are never completely identical. Trial reports generally include a table of baseline characteristics, in which investigators report demographic and pertinent clinical variables by assigned group, together with a P value for each variable, in an attempt either to support balanced assignment or to indicate that the balance between groups was not ideal. The recently published PROSEVA trial showed a relative risk reduction of more than 50% in 28-day mortality among ARDS patients in the prone group compared with the supine group. To demonstrate a novel approach and to illustrate how imbalance in baseline characteristics between groups could have contributed to the large observed effect, we pooled pertinent baseline clinical variables from the trial in a meta-analysis-like manner. In addition to quantifying each difference, we classified each variable’s “quality”, that is, its probable effect on the outcome, as likely beneficial or likely harmful. After pooling pertinent dichotomous variables by their probable effect on the outcome, it appeared that approximately 37% (18% to 60%) of the observed PROSEVA trial effect could have been due to differences in baseline clinical characteristics. The main limitation of this approach is that all variables are assumed to carry similar weight on the outcome. Interestingly, the weights of beneficial and harmful effects on the outcome were very similar. The proposed method assesses not only the magnitude of the difference between the intervention groups but also the pooled probability of a beneficial or harmful effect on the outcome. As such, it could be useful as a secondary measure of imbalance in trials with unexpectedly large observed effects.
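
The pooling step described above can be sketched in code. This is only an illustration under stated assumptions: the abstract does not specify the pooling model, so a fixed-effect inverse-variance pool of log odds ratios is assumed here. The function names, the orientation convention, and the counts are hypothetical, and the counts are not PROSEVA data.

```python
import math


def log_odds_ratio(a, n1, c, n2):
    """Log odds ratio (group 1 vs group 2) and its variance for a
    characteristic present in a of n1 and c of n2 participants."""
    b, d = n1 - a, n2 - c
    # Assumption: apply a 0.5 continuity correction when any cell is zero.
    if 0 in (a, b, c, d):
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def pooled_imbalance(variables):
    """Pool baseline imbalances across dichotomous variables.

    Each item is (count_prone, n_prone, count_supine, n_supine, harmful),
    where `harmful` says whether the characteristic is presumed to worsen
    the outcome. Returns (pooled OR, lower 95% limit, upper 95% limit);
    a pooled OR above 1 means the imbalance favours the prone group.
    """
    num = den = 0.0
    for a, n1, c, n2, harmful in variables:
        lor, var = log_odds_ratio(a, n1, c, n2)
        # Orient every variable the same way: a harmful characteristic that
        # is rarer in the prone group counts in favour of the prone group.
        oriented = -lor if harmful else lor
        weight = 1.0 / var  # inverse-variance weight
        num += weight * oriented
        den += weight
    pooled = num / den
    se = math.sqrt(1.0 / den)
    return tuple(math.exp(pooled + z * se) for z in (0.0, -1.96, 1.96))


if __name__ == "__main__":
    # Hypothetical counts for illustration only (NOT taken from PROSEVA).
    example = [
        (40, 200, 60, 200, True),   # presumed harmful, rarer in prone group
        (90, 200, 75, 200, False),  # presumed beneficial, commoner in prone
        (30, 200, 28, 200, True),   # presumed harmful, roughly balanced
    ]
    or_, lower, upper = pooled_imbalance(example)
    print(f"pooled OR favouring prone: {or_:.2f} ({lower:.2f} to {upper:.2f})")
```

Orienting each variable by its presumed direction before pooling lets the pooled estimate reflect whether the imbalances favour one group on the whole, rather than only how large each one is. The inverse-variance weights shown here reflect the precision of each baseline comparison, not the prognostic importance of the variable.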