The World Health Organization (WHO) designated the pneumonia caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) as coronavirus disease 2019 (COVID-19). COVID-19 has become a global public health crisis. As of May 8, 2020, more than 3.5 million cases of COVID-19 and 250,000 deaths had been reported to the WHO. As has been said, “We cannot end the pandemic until we address the inequalities that are fueling it” (1).
Although most patients exhibit only mild symptoms, a considerable proportion progress to serious disease, with hypoxia worsening to respiratory distress or, in some cases, respiratory failure. Approximately 5% of patients need intensive care, including mechanical ventilation (MV) (2-4). The progression of respiratory distress is important to the survival and prognosis of patients. Because patients with respiratory distress usually need equipment such as ventilators, predicting respiratory distress in advance would leave medical personnel better equipped to deal with emergencies. It would also provide a basis for allocating medical resources and help improve patient prognosis. Because of the variability of the COVID-19 virus and the shortage of healthcare resources in the affected areas, it is very important to allocate resources effectively to high-risk patients with worsening disease (5). Preparations for respiratory distress should be made in advance to avoid emergency intubation or cardiopulmonary resuscitation, which can endanger the lives and safety of medical staff.
Although several studies have analyzed the prognostic factors for mortality outcomes in COVID-19 (2,6,7), the predictors of respiratory distress have not been reported fully and in detail. Thus, in this study, we retrospectively evaluated patients with COVID-19 to determine the factors that can predict respiratory distress and the need for MV in the early stages of the disease. This will facilitate reasonable early triage of patients and adequate preparation for the provision of the required medical resources.
We present the following article in accordance with the TRIPOD reporting checklist (available at http://dx.doi.org/10.21037/atm-20-4977).
Study design and data collection
This was a retrospective study of 1,150 patients, 12–97 years of age, who were hospitalized in the infection ward of Wuhan No. 1 Hospital, China. The infection ward was established on February 12, 2020, and closed on March 19, 2020. We excluded 286 patients who did not have a COVID-19 diagnosis at the time of discharge from the hospital. The remaining 864 patients were diagnosed as having COVID-19 pneumonia (ICD-10 code U07.100x001) or COVID-19 infection (ICD-10 code U07.100x002), or had a clinical diagnosis of COVID-19 (ICD-10 code U07.100x003) according to Diagnosis and Treatment of COVID-19 (trial version 7) (8). COVID-19 infection was confirmed in the laboratory by local health authorities. In China, pediatricians take care of children <14 years old, and those above 14 years old are treated in an adult specialty. Therefore, we excluded one patient who was under the age of 14.
Derivation cohort: data for this cohort were extracted from the electronic medical records (EMRs) designed for patients with “respiratory distress” at the infection ward of Wuhan No. 1 Hospital, China (briefly designated as IWCH-COVID-19 in this paper). The clinical characteristics, laboratory findings, treatment, procedure, and outcome data of 863 patients who were discharged from the hospital (including patients who died or were transferred to another hospital) before March 19, 2020, were extracted from EMRs.
The data were de-identified and cleaned before analysis. The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013). This study was approved by the Ethics Committee of the First Affiliated Hospital of Nanjing Medical University (No. 2020-SR-163), and informed consent from study participants was waived.
Patients were divided into two groups depending on whether they developed respiratory distress within 30 days: patients without respiratory distress (NRD group) and patients with respiratory distress before discharge (RD group). Respiratory distress was defined as the need for MV or being diagnosed with acute respiratory distress syndrome (ARDS, ICD-10 code J80.x00) or respiratory failure (ICD-10 codes J96.000, J96.900). Patient survival time in the RD group was defined as the number of days from the time of admission to the time of starting MV, while the patient survival time in the NRD group was counted as the number of days from the day of admission to the day of discharge (or day of death if the patient in the NRD group had died). Due to the recommendation of immediate high-flow nasal oxygen and prone positioning for patients with moderately severe hypoxemia (9) or respiratory distress (10), the patients undergoing these procedures were placed in the NRD group.
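The time-to-event definition above can be sketched in a few lines of Python. This is only an illustration of the stated rule, not the authors' code: the function name `time_to_event` and its arguments are hypothetical, and the sketch assumes that a recorded MV start date marks the event while discharge or death in the NRD group ends follow-up without an event.

```python
from datetime import date
from typing import Optional, Tuple


def time_to_event(admission: date,
                  end_of_followup: date,
                  mv_start: Optional[date] = None) -> Tuple[int, int]:
    """Return (days, event) for one patient.

    event = 1: respiratory distress; time runs from admission to MV start.
    event = 0: no respiratory distress; time runs from admission to
               discharge (or death), passed in as end_of_followup.
    """
    if mv_start is not None:
        return (mv_start - admission).days, 1
    return (end_of_followup - admission).days, 0


# Hypothetical patient who started MV 8 days after admission
rd_patient = time_to_event(date(2020, 2, 12), date(2020, 3, 1), date(2020, 2, 20))
# Hypothetical patient discharged without respiratory distress
nrd_patient = time_to_event(date(2020, 2, 12), date(2020, 3, 1))
```

The resulting (days, event) pairs are the inputs that Kaplan-Meier and Cox analyses expect.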
In our analysis, we included variables that were available in the hospital’s EMR system and known or thought to affect respiratory distress risk according to related research. For each patient, the laboratory test value closest to the date of entry into the cohort was used, and missing values were imputed where necessary, as described below. Two demographic characteristics (age and gender); five clinical measurements [systolic blood pressure (SBP), diastolic blood pressure (DBP), respiratory rate, pulse, and temperature at admission]; seven comorbidities (hypertension, diabetes, coronary heart disease, chronic obstructive lung disease, cerebrovascular disease, carcinoma, and chronic kidney disease); and fourteen laboratory test results [white blood cell count, neutrophil count, lymphocyte count, hemoglobin, platelet count, albumin, alanine aminotransferase (ALT), aspartate aminotransferase, serum creatinine, lactate dehydrogenase, creatine kinase, D-dimer, interleukin-6 (IL-6), and complement C3 (C3)] that are routinely measured were initially identified as potential predictors on the basis of existing research (2,6,7) (Table S1). The first laboratory test results obtained within 3 days of admission were used to represent the patient’s status at admission, and the results were divided into classes using the reference values below:
- White blood cell count: (3.50–9.50)×10⁹/L;
- Neutrophil count: (1.80–6.30)×10⁹/L;
- Lymphocyte count: (1.10–3.20)×10⁹/L;
- Hemoglobin: 130–175 g/L;
- Platelet count: (125–350)×10⁹/L;
- Albumin: 40.0–55.0 g/L;
- ALT: 9–60 IU/L;
- Aspartate aminotransferase: 15–40 IU/L;
- Serum creatinine concentration: 53–106 µmol/L;
- Lactate dehydrogenase: 114–250 IU/L;
- Creatine kinase: 0–171 IU/L;
- D-dimer: 0–1.00 mg/L;
- IL-6: 0–5.90 pg/mL;
- C3: 0.90–1.80 g/L.
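The classification against these reference values can be sketched as follows. The dictionary keys and the `classify` function are hypothetical names chosen for illustration. The treatment of values exactly on a boundary is an assumption: this sketch counts both bounds as inside the reference range.

```python
# Reference ranges copied from the list above (lower bound, upper bound).
REFERENCE_RANGES = {
    "white_blood_cell_count": (3.50, 9.50),     # x10^9/L
    "neutrophil_count": (1.80, 6.30),           # x10^9/L
    "lymphocyte_count": (1.10, 3.20),           # x10^9/L
    "hemoglobin": (130.0, 175.0),               # g/L
    "platelet_count": (125.0, 350.0),           # x10^9/L
    "albumin": (40.0, 55.0),                    # g/L
    "alt": (9.0, 60.0),                         # IU/L
    "aspartate_aminotransferase": (15.0, 40.0), # IU/L
    "serum_creatinine": (53.0, 106.0),          # umol/L
    "lactate_dehydrogenase": (114.0, 250.0),    # IU/L
    "creatine_kinase": (0.0, 171.0),            # IU/L
    "d_dimer": (0.0, 1.00),                     # mg/L
    "il_6": (0.0, 5.90),                        # pg/mL
    "complement_c3": (0.90, 1.80),              # g/L
}


def classify(test_name: str, value: float) -> str:
    """Label a result as 'low', 'normal', or 'high' against its range.

    Assumption: values equal to a bound are treated as 'normal'.
    """
    low, high = REFERENCE_RANGES[test_name]
    if value < low:
        return "low"
    if value > high:
        return "high"
    return "normal"


# Example: a platelet count of 137 x10^9/L falls inside 125-350
platelet_class = classify("platelet_count", 137)
```

Turning each laboratory value into a category is what lets the later models report hazard ratios per class (for example, platelet count of 125–350 versus below 125).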
The combinations of missing values were analyzed by cluster analysis of patterns of missingness. The IWCH-COVID-19 cohort had missing information on SBP (1.64%), DBP (1.76%), pulse (0.23%), white blood cell count (7.03%), neutrophil count (7.03%), lymphocyte count (7.03%), hemoglobin (7.03%), platelet count (7.03%), albumin (9.50%), ALT (9.50%), aspartate aminotransferase (9.50%), serum creatinine concentration (10.20%), lactate dehydrogenase (30.95%), creatine kinase (30.95%), D-dimer (48.77%), C3 (89.21%), and IL-6 (96.95%). The completeness of predictors can be found in Table S2.
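As a small illustration, the missingness percentages reported above can be screened against a cutoff. The 90% threshold matches the reason given for excluding IL-6 in the imputation step. The function name `split_by_missingness` is hypothetical, and this sketch is not the cluster analysis of missingness patterns itself.

```python
# Percent missing per predictor, as reported for the IWCH-COVID-19 cohort.
MISSING_PCT = {
    "SBP": 1.64, "DBP": 1.76, "pulse": 0.23,
    "white blood cell count": 7.03, "neutrophil count": 7.03,
    "lymphocyte count": 7.03, "hemoglobin": 7.03, "platelet count": 7.03,
    "albumin": 9.50, "ALT": 9.50, "aspartate aminotransferase": 9.50,
    "serum creatinine": 10.20, "lactate dehydrogenase": 30.95,
    "creatine kinase": 30.95, "D-dimer": 48.77, "C3": 89.21, "IL-6": 96.95,
}


def split_by_missingness(missing_pct, threshold=90.0):
    """Split predictors into (kept, dropped) by percent missing.

    A predictor is dropped when its missingness exceeds the threshold.
    """
    kept = [name for name, pct in missing_pct.items() if pct <= threshold]
    dropped = [name for name, pct in missing_pct.items() if pct > threshold]
    return kept, dropped


kept, dropped = split_by_missingness(MISSING_PCT)
```

With these figures only IL-6 exceeds the cutoff. C3, at 89.21% missing, stays just below it and remains a candidate predictor.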
Missing predictor data were imputed by multiple imputation (R package mice), assuming data were missing at random. Because of its high rate of missing data (>90%), we excluded IL-6 test results. The imputation model contained all the predictors except IL-6 and was used to generate the estimation dataset of 20 missing variables. Estimates were combined using Marshall’s adaptation of Rubin’s rules (11).
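For readers unfamiliar with pooling after multiple imputation, the basic form of Rubin's rules for a single coefficient can be sketched as below. This shows only the standard rule for a point estimate and its variance, not Marshall's adaptation for other prognostic quantities. The function name `pool_rubin` and the toy numbers are illustrative assumptions, not values from the study.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed datasets (basic Rubin's rules).

    estimates: the coefficient estimated in each imputed dataset
    variances: its squared standard error in each imputed dataset
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    pooled = mean(estimates)        # average of the m estimates
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Toy example with three imputed datasets (illustrative numbers only)
pooled_beta, total_var = pool_rubin([1.0, 1.2, 1.4], [0.04, 0.04, 0.04])
```

The total variance is larger than the within-imputation variance alone, which reflects the extra uncertainty introduced by the missing data.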
Grouped Kaplan-Meier analysis, which is a nonparametric approach, was applied to compare the different characteristics of the IWCH-COVID-19 cohort. The time unit was set as days.
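The Kaplan-Meier product-limit estimator itself is short enough to sketch in plain Python. The study used R packages for this step, so the code below is only a minimal illustration of the technique with toy data; `kaplan_meier` is a hypothetical helper name.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate of the event-free probability.

    times:  follow-up time in days for each patient
    events: 1 if the event occurred at that time, 0 if censored
    Returns a list of (time, probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing the same follow-up time
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            surv *= 1 - n_events / at_risk
            curve.append((t, surv))
        at_risk -= n_leaving
    return curve


# Toy data: five patients, two of them censored
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
```

For a grouped analysis, the same function would be applied separately to each level of a characteristic (for example, men and women) and the resulting curves compared.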
The proportional hazards (PH) assumption was tested by a statistical significance test and graphical diagnostics based on the scaled Schoenfeld residuals. A univariate Cox PH model was used to evaluate the risk of respiratory distress in the IWCH-COVID-19 cohort, with associations expressed as hazard ratios (HRs) (Figure S1, step 1). When appropriate, the 95% confidence interval (CI) was calculated, and P<0.01 was considered statistically significant. Akaike’s information criterion (AIC) was calculated for each model. The top 12 statistically significant predictors, ranked by lowest AIC, were included in a multivariable Cox PH model (Figure S1, step 2). A backward stepwise procedure was used to evaluate the final statistical significance of the predictors.
Performance of risk prediction
The predictive performance of the selected risk scores was assessed by internal validation. Discrimination of the final model was indicated by the Harrell C statistic. Discrimination describes the ability of the model to distinguish between patients with and without respiratory distress. A higher C statistic indicated better performance, and a value of 0.5 indicated discrimination no better than chance, that is, a prediction model that is not clinically useful. AIC was used to evaluate the steps used in model development. Bootstrap resampling was applied in the internal validation.
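To make the C statistic concrete, the following sketch computes Harrell's concordance on toy data. It is a simplified illustration, not the implementation used in the study: `harrell_c` is a hypothetical name, and pairs with tied follow-up times are simply not counted as comparable here.

```python
def harrell_c(times, events, risk_scores):
    """Harrell's C: share of comparable pairs ranked correctly by the model.

    A pair (i, j) is comparable when patient i had the event and a strictly
    shorter follow-up time than patient j. It is concordant when the model
    gave patient i the higher risk score. Tied scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: the highest score belongs to the earliest event, and so on
c_perfect = harrell_c([1, 2, 3, 4], [1, 1, 0, 1], [4, 3, 2, 1])
c_random = harrell_c([1, 2, 3, 4], [1, 1, 0, 1], [1, 1, 1, 1])
```

A model that ranks every comparable pair correctly scores 1.0, while a model that gives every patient the same score lands at 0.5.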
All statistical analyses were carried out in R software version 3.5.3. The rms, survival, survminer, and pec packages in R were used in the analysis.
A total of 1,150 patients were hospitalized at the infection ward of Wuhan No. 1 Hospital. After excluding 287 (24.96%) patients, comprising those whose COVID-19 was not confirmed as of March 19, 2020, and one child (<14 years of age), the remaining 863 (75.04%) patients were placed in the IWCH-COVID-19 cohort in the final analysis (Figure 1). The median age of the 863 patients was 62.0 years [interquartile range (IQR), 51.0–70.0], with a range of 16–97 years, and most patients were female, with 388 men (44.96%). There were 323 (37.43%) older people (age >65 years). Of the 863 patients, 319 suffered from comorbidities such as hypertension (228 patients, 26.42%), diabetes (89 patients, 10.31%), coronary heart disease (56 patients, 6.49%), chronic obstructive lung disease (6 patients, 0.70%), cerebrovascular disease (16 patients, 1.85%), carcinoma (11 patients, 1.29%), and chronic kidney disease (4 patients, 0.46%). IWCH-COVID-19 cohort characteristics are presented in Table 1.
In the IWCH-COVID-19 cohort, 60 patients (6.95%) developed respiratory distress within 30 days of admission. Among these, 53 needed ventilatory support, of whom 33 were men (62.26%) and 20 were women (37.74%). The survival curves of eight predictors (gender, comorbidity diabetes, comorbidity carcinoma, comorbidity chronic obstructive lung disease, neutrophil count, D-dimer, platelet count, and C3) are shown in Figure 2, and other predictors are shown in Figure S2.
Risk prediction model development and performance
In univariate Cox models, gender, age, pulse, temperature, white blood cell count, neutrophil count, lymphocyte count, hemoglobin, platelet count, creatine kinase, D-dimer, C3, and high-sensitive cardiac troponin I were all statistically significant predictors (P<0.01). The top 12 predictors according to the AIC value were included in a multivariate Cox model using a backward stepwise selection procedure. Five predictors—temperature, platelet count, neutrophil count, D-dimer, and C3—were found to be statistically significant (P<0.01) in predicting respiratory distress in COVID-19. All these procedures and results can be found in Table S3.
The final five predictors were included in a multivariate Cox model (called the final model). Neutrophil count >6.3×10⁹/L, D-dimer ≥1.00 mg/L, and temperature at admission ≥37.3 °C had a significant positive association with outcomes of respiratory distress in the final model [HR, 8.286 (1.833–37.446), P<0.01; HR, 7.835 (3.737–16.423), P<0.001; and HR, 3.299 (1.673–6.503), P<0.001, respectively; Table 2]. C3 level of 0.9–1.8 g/L, platelet count >350×10⁹/L, and platelet count of 125–350×10⁹/L had a significant negative association with outcomes of respiratory distress in the final model [HR, 0.268 (0.143–0.504), P<0.001; HR, 0.096 (0.028–0.326), P<0.001; and HR, 0.209 (0.106–0.411), P<0.001, respectively; Table 2]. Figure 3 shows the forest plot of the HR.
Table 2 shows the performance of the final model, with a C statistic of 0.891 (0.867–0.915) and an AIC of 567.65. The bootstrap-validated C statistic of the final model was 0.866 (0.842–0.890).
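Hazard ratios and their confidence intervals live on a log scale, which the short sketch below makes explicit. It assumes the reported intervals are Wald-type 95% CIs, symmetric around the log hazard ratio. That is an assumption about how the intervals were computed, and `log_hr_and_se` is a hypothetical helper.

```python
import math


def log_hr_and_se(hr, ci_low, ci_high, z=1.96):
    """Recover the log hazard ratio and its standard error from an HR and CI.

    Assumes a Wald-type interval: exp(beta - z*se) to exp(beta + z*se).
    """
    beta = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se


# Temperature at admission >= 37.3 C: HR 3.299 (1.673-6.503)
beta_temp, se_temp = log_hr_and_se(3.299, 1.673, 6.503)

# Under the Wald assumption the HR is the geometric mean of its CI bounds
geometric_mean_temp = math.sqrt(1.673 * 6.503)
```

The geometric mean of the bounds reproduces the reported HR closely for the temperature, D-dimer, and neutrophil estimates, which is consistent with the symmetric-on-log-scale assumption.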
Application of the model
As an example to illustrate the use of this risk model, consider a male patient aged 58 years for whom the following data were recorded at admission: temperature, 37.8 °C; platelet count, 137×10⁹/L; D-dimer, 0.89 mg/L; C3, 1.1 g/L; neutrophil count, 6.5×10⁹/L (these values were obtained from the first test after the patient was admitted to the hospital). For this patient, the risk is calculated as follows:
$$\text{Thirty-day respiratory distress risk} = \left(1 - 0.97839^{\exp\left[1.1518\times(1-0.0985) - 1.7335\times(1-0.8335) + 2.1551\times(0-0.687) - 1.4017\times(1-0.8183) + 2.7162\times(1-0.0996)\right]}\right)\times 100\% = 8.98\%$$
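The worked example can be reproduced numerically. In the sketch below, the constants are copied from the formula. Two readings are assumptions on our part: the mapping of each term to a predictor (inferred from the order in which the patient's values are listed) and the interpretation of the subtracted constants as centering values. The names `TERMS` and `risk_30d` are hypothetical.

```python
import math

BASELINE_SURVIVAL_30D = 0.97839  # base of the power in the formula

# indicator name -> (coefficient, centering constant), as printed in the formula
TERMS = {
    "temperature_ge_37_3": (1.1518, 0.0985),
    "platelet_125_to_350": (-1.7335, 0.8335),
    "d_dimer_ge_1_00": (2.1551, 0.687),
    "c3_0_90_to_1_80": (-1.4017, 0.8183),
    "neutrophil_gt_6_30": (2.7162, 0.0996),
}


def risk_30d(indicators):
    """Thirty-day respiratory distress risk in percent.

    indicators maps each name in TERMS to 1 (condition met) or 0 (not met).
    """
    linear_predictor = sum(
        coef * (indicators[name] - center)
        for name, (coef, center) in TERMS.items()
    )
    return (1 - BASELINE_SURVIVAL_30D ** math.exp(linear_predictor)) * 100


# Example patient: 37.8 C, platelets 137, D-dimer 0.89, C3 1.1, neutrophils 6.5
example_patient = {
    "temperature_ge_37_3": 1,
    "platelet_125_to_350": 1,
    "d_dimer_ge_1_00": 0,
    "c3_0_90_to_1_80": 1,
    "neutrophil_gt_6_30": 1,
}
example_risk = risk_30d(example_patient)
```

Evaluating the function for the example patient gives a value that rounds to the 8.98% stated in the text. Switching the D-dimer indicator to 1 raises the estimate, since that term carries a positive coefficient.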
Furthermore, the 15- and 30-day survival estimates can be calculated based on the nomogram of the final model shown in Figure 4.
This paper proposes a predictive risk model of respiratory distress in COVID-19 patients. Analysis with this model showed that, in the IWCH-COVID-19 cohort, a neutrophil count >6.30×10⁹/L, D-dimer ≥1.00 mg/L, temperature at admission ≥37.3 °C, platelet count <125×10⁹/L, and C3 <0.90 g/L were risk factors for predicting the likelihood of respiratory distress in COVID-19 inpatients. These five predictors yielded a statistically significant result and demonstrated useful discrimination and excellent calibration of the model.
Comparison with previous studies
Some Kaplan-Meier survival plots for different prognostic factors have been mentioned in previous studies (12). The trends of platelet, neutrophil, and lymphocyte counts were similar to those in the plots by Chen (12). In our Kaplan-Meier survival curves, male gender, D-dimer ≥1.00 mg/L, and C3 <0.90 g/L also indicated a higher risk of respiratory distress. The trends of IL-6 and comorbidity with coronary heart disease were not so obvious in our dataset.
Some previous research analyzed the risk factors for fatal outcome or critical illness in COVID-19 patients admitted to hospital. Zhou et al. (2) and Huang et al. (13) found higher D-dimer levels on admission could predict a poor prognosis. Fan et al. (14) indicated that a higher neutrophil count and a lower platelet count were associated with poor outcomes. Lippi et al. (15) and Ruan et al. (16) also reported that low platelet count was associated with increased risk of severe disease and mortality in patients with COVID-19. Our predictive model was consistent with the results reported in these studies. Furthermore, in a study by Ko et al. (17), it was concluded that Middle East respiratory syndrome patients with a lower platelet count were at higher risk of developing respiratory failure, which is consistent with the findings of our research.
Some studies have mentioned older age, comorbidities such as diabetes (18), hypertension (14,19,20), lymphocytopenia (14,21), leukocytosis, IL-6 (22), and high hypersensitive troponin I (16) to be risk predictors for fatality. Other research made no definite conclusions on the association between diabetes and the morbidity or mortality of COVID-19 patients (23). We also used these predictors as potential predictors. Gender, age, number of breaths, pulse, comorbid diabetes, and high-sensitive cardiac troponin I >0.026 were statistically significant in our univariate Cox models. However, these factors were not included in our final multivariate Cox model.
Strengths and limitations
One strength of this study is that it proposes a comprehensive risk prediction model for respiratory distress in COVID-19 patients. We not only investigated whether respiratory distress occurred but also performed survival analysis based on the time of its first occurrence, because we were able to trace the exact time of these events.
One weakness of this study is that we used data from the infection department of only one grade A hospital, as this was the only data to which we obtained authorized access. A multicenter cohort, which could reflect information on predictors and outcomes more completely, was not available to us. We also considered a model with interaction terms but could not fit one because of the small number of positive samples. Furthermore, smoking and body mass index (BMI) have been mentioned in the literature several times as potential predictors of respiratory distress in COVID-19 patients, but due to the lack of records, smoking and BMI were not included as potential predictors in our research. Another limitation is that patients aged >80 years may not have the same reference values for their laboratory test results, as there is no standard method for analyzing them. Additionally, COVID-19 virus infection mainly affects older adults, so there are not enough samples to correct for the lack of homogeneity across age groups.
The predictive model generated in this study based on the factors obtained at admission can be used to calculate the risk of respiratory distress within 30 days of admission; specifically, the nomogram can be used to calculate the risk and classify patients at an early stage. The model can be very helpful for the early allocation of medical resources, attention to severe disease progression, and improved prognosis. The results may help guide the clinical management of patients with severe COVID-19, especially when limited resources need to be strategically allocated.
Funding: This work was supported by grants from the National key Research & Development Plan of the Ministry of Science and Technology of the People’s Republic of China (Grant No. 2018YFC1314900, 2018YFC1314901), 2019 Provincial Special Guide Fund Project for the Development of Modern Service Industry [2019 (783)], the 2018 Projects of Jiangsu Province Department of Industry and Information Technology (Grant No. 2018419) and the 2016 Projects of Nanjing Science Bureau (Grant No. 201608003). YL is the guarantor of this paper.
Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at http://dx.doi.org/10.21037/atm-20-4977
Data Sharing Statement: Available at http://dx.doi.org/10.21037/atm-20-4977
Peer Review File: Available at http://dx.doi.org/10.21037/atm-20-4977
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at http://dx.doi.org/10.21037/atm-20-4977). The authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013). This study was approved by the Ethics Committee of the First Affiliated Hospital of Nanjing Medical University (No. 2020-SR-163), and informed consent from study participants was waived.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
- WHO. Coronavirus disease (COVID-19) weekly epidemiological update and weekly operational update. 2020. Available online: https://www.who.int/emergencies/diseases/novel-coronavirus-2019/situation-reports (Accessed 5 April 2020).
- Zhou F, Yu T, Du R, et al. Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study. Lancet 2020;395:1054-62.
- Wang D, Hu B, Hu C, et al. Clinical characteristics of 138 hospitalized patients with 2019 novel coronavirus-infected pneumonia in Wuhan, China. JAMA 2020;323:1061-9.
- Guan WJ, Ni ZY, Hu Y, et al. Clinical characteristics of coronavirus disease 2019 in China. N Engl J Med 2020;382:1708-20.
- Emanuel EJ, Persad G, Upshur R, et al. Fair allocation of scarce medical resources in the time of Covid-19. N Engl J Med 2020;382:2049-55.
- Yang X, Yu Y, Xu J, et al. Clinical course and outcomes of critically ill patients with SARS-CoV-2 pneumonia in Wuhan, China: a single-centered, retrospective, observational study. Lancet Respir Med 2020;8:475-81.
- Wu C, Chen X, Cai Y, et al. Risk factors associated with acute respiratory distress syndrome and death in patients with coronavirus disease 2019 pneumonia in Wuhan, China. JAMA Intern Med 2020;180:934-43.
- Department of Medical Administration. Diagnosis and Treatment Protocol for COVID-19 (Trial Version 7). Available online: http://www.nhc.gov.cn/yzygj/s7653p/202003/46c9294a7dfe4cef80dc7f5912eb1989.shtml (Accessed May 8 2020).
- Matthay MA, Aldrich JM, Gotts JE. Treatment for severe acute respiratory distress syndrome from COVID-19. Lancet Respir Med 2020;8:433-4.
- WHO. Clinical management of COVID-19. Available online: https://www.who.int/publications-detail/clinical-management-of-severe-acute-respiratory-infection-when-novel-coronavirus-(ncov)-infection-is-suspected (Accessed April 8 2020).
- Marshall A, Altman DG, Holder RL, et al. Combining estimates of interest in prognostic modelling studies after multiple imputation: current practice and guidelines. BMC Med Res Methodol 2009;9:57.
- Chen R, Sang L, Jiang M, et al. Longitudinal hematologic and immunologic variations associated with the progression of COVID-19 patients in China. J Allergy Clin Immunol 2020;146:89-100.
- Huang C, Wang Y, Li X, et al. Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. Lancet 2020;395:497-506.
- Fan BE, Chong VCL, Chan SSW, et al. Hematologic parameters in patients with COVID-19 infection. Am J Hematol 2020;95:E131-4.
- Lippi G, Plebani M, Henry BM. Thrombocytopenia is associated with severe coronavirus disease 2019 (COVID-19) infections: a meta-analysis. Clin Chim Acta 2020;506:145-8.
- Ruan Q, Yang K, Wang W, et al. Clinical predictors of mortality due to COVID-19 based on an analysis of data of 150 patients from Wuhan, China. Intensive Care Med 2020;46:846-8.
- Ko JH, Park GE, Lee JY, et al. Predictive factors for pneumonia development and progression to respiratory failure in MERS-CoV infected patients. J Infect 2016;73:468-75.
- Huang I, Lim MA, Pranata R. Diabetes mellitus is associated with increased mortality and severity of disease in COVID-19 pneumonia - A systematic review, meta-analysis, and meta-regression. Diabetes Metab Syndr 2020;14:395-403.
- Xie J, Covassin N, Fan Z, et al. Association between hypoxemia and mortality in patients with COVID-19. Mayo Clin Proc 2020;95:1138-47.
- Chen R, Liang W, Jiang M, et al. Risk factors of fatal outcome in hospitalized subjects with coronavirus disease 2019 from a nationwide analysis in China. Chest 2020;158:97-105.
- Shi SJ, Li H, Liu M, et al. Mortality prediction to hospitalized patients with influenza pneumonia: PO2 /FiO2 combined lymphocyte count is the answer. Clin Respir J 2017;11:352-60.
- Liu F, Li L, Xu M, et al. Prognostic value of interleukin-6, C-reactive protein, and procalcitonin in patients with COVID-19. J Clin Virol 2020;127:104370.
- Hussain A, Bhowmik B, do Vale Moreira NC. COVID-19 and diabetes: knowledge in progress. Diabetes Res Clin Pract 2020;162:108142.