
Worldwide, immunoglobulin A (IgA) nephropathy (IgAN) is one of the most common types of primary glomerulonephritis; in Asian regions, IgAN accounts for ~45.3% of cases of primary glomerulonephritis. Clinical presentation of IgAN varies widely, from isolated hematuria to rapid progression to kidney failure. Patients may also present with a variety of histologic lesions, from mild mesangial hypercellularity to crescentic glomerulonephritis and diffuse sclerosis. Recent long-term studies have found that the prognosis for IgAN is poor; 30% to 40% of patients develop end-stage renal disease (ESRD) within 10 to 25 years of initial diagnosis.
Previous studies have identified multiple risk factors affecting the prognosis of IgAN, including baseline urinary protein excretion >1 g per day, hypertension, decreased glomerular filtration rate (GFR), hyperuricemia, male sex, and scores indicating severe pathology. Prediction of IgAN prognosis has been achieved via a variety of scoring systems that are, according to researchers in China, limited by small derivation sample sizes, varying pathologic scoring standards, inclusion of relatively few variables, and poor clinical applicability.
The researchers, led by Tingyu Chen, MD, and Xiang Li, PhD, recently conducted a multicenter retrospective cohort study designed to use machine learning to build a prognostic prediction and risk stratification system (Nanjing IgAN Risk Stratification System) that combines clinical and pathologic variables to assist physicians in predicting kidney prognosis quickly and accurately. The system was described in the American Journal of Kidney Diseases [2019;74(3):300-309].
The study utilized data from 2047 patients with IgAN and long-term follow-up. Inclusion criteria were ≥18 years of age, follow-up >12 months, estimated GFR (eGFR) ≥30 mL/min/1.73 m², proteinuria with protein excretion ≥0.5 g per day, and a biopsy specimen with ≥8 total glomeruli on periodic acid-Schiff staining. Patients who progressed to ESRD or had a 50% reduction in eGFR within the first 12 months of follow-up were also included. Exclusion criteria were secondary causes of mesangial IgA deposits such as IgA vasculitis and autoimmune disorders, or comorbid conditions such as diabetes mellitus or Alport syndrome.
The data were retrieved consecutively from the Nanjing Glomerulonephritis Registry between January 2006 and June 2009 (derivation cohort). Multicenter data were retrieved from 18 renal centers between January 1997 and June 2010 (validation cohort). Follow-up data were updated to August 2017.
Two risk models were constructed: a prediction model using eXtreme Gradient Boosting (XGBoost), which would require a computer for accurate risk prediction, and a simpler restricted variable scoring scale model (SSM) derived by stepwise Cox regression for risk stratification.
In the derivation and validation cohorts with complete medical records, median follow-up was 7.9 and 7.8 years, respectively. Most patients in the two cohorts were treated with renin-angiotensin system blockade, resulting in good blood pressure control during follow-up. The 5- and 10-year kidney survival rates were 96.8% and 92%, respectively, in the derivation cohort; the rates in the validation cohort were 83.8% and 76.4%, respectively.
The XGBoost model, trained on 36 variables, had a C statistic of 0.89 (95% confidence interval [CI], 0.87-0.94) for the derivation cohort and 0.84 (95% CI, 0.80-0.88) for the validation cohort when the 10 most important variables, ranked by XGBoost importance score, were used as input (Table 1). Of the machine learning models and traditional regression models evaluated, the XGBoost model achieved the best prediction performance.
Table 1. Variables Selected Using XGBoost and the Corresponding Variable Importance Score
Variables | Importance Score* |
Tubular atrophy/interstitial fibrosis (%) | 0.156 |
Serum albumin (g/L) | 0.125 |
Global sclerosis (% of glomeruli) | 0.109 |
Hypertension before biopsy | 0.078 |
Serum uric acid (mmol/L) | 0.063 |
Microscopic hematuria (RBC count/mL) | 0.063 |
Age at biopsy (years) | 0.063 |
Urine protein (g/d) | 0.063 |
Mean mesangial score | 0.047 |
Serum creatinine (mg/dL) | 0.047 |
Abbreviation: RBC, red blood cell.
*The Importance Score is the relative number of times a variable is used to distribute the data across all trees.
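For readers unfamiliar with the C statistic reported above, the sketch below shows one common way to compute it for time-to-event data (Harrell's concordance index). It is an illustration only: the function name `harrell_c` and the five-patient toy dataset are hypothetical and are not taken from the study, and the researchers' exact computation is not described here.

```python
from itertools import combinations


def harrell_c(times, events, risks):
    """Harrell's concordance statistic for right-censored data.

    times  : follow-up time for each patient
    events : 1 if the event occurred, 0 if the patient was censored
    risks  : predicted risk score (higher = event expected sooner)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient a has the shorter follow-up.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the shorter follow-up ended in an event.
        # Tied follow-up times are skipped to keep the sketch simple.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # higher predicted risk had the earlier event
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied predictions count as half
    return concordant / comparable


# Made-up toy data (NOT from the study): years of follow-up, event flag, predicted risk.
toy_times = [2.0, 4.0, 6.0, 8.0, 10.0]
toy_events = [1, 1, 0, 1, 0]
toy_risks = [0.9, 0.4, 0.5, 0.3, 0.1]

toy_c = harrell_c(toy_times, toy_events, toy_risks)
print(f"C statistic on toy data: {toy_c:.3f}")
```

A value of 0.5 corresponds to ranking patients no better than chance, and 1.0 to perfect ranking of who reaches the event first.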
The SSM was constructed using three variables (tubular atrophy/interstitial fibrosis, global sclerosis, and urine protein excretion). Scores corresponding to these variables were added together to obtain the patient’s risk score (Table 2).
Table 2. Derived Score of the Scoring Scale Model
Characteristic | Beta | HR (95% CI) | P | Derived Score* |
T1 | 1.234 | 3.43 (1.84-6.42) | <.001 | 1 |
T2 | 1.901 | 6.69 (3.15-14.22) | <.001 | 2 |
Global sclerosis >25% | 1.004 | 2.73 (1.40-5.32) | .003 | 1 |
Urine protein >1 g/d | 0.561 | 1.75 (1.06-2.91) | .03 | 1 |
Abbreviations: CI, confidence interval; HR, hazard ratio; T1, tubular atrophy-interstitial fibrosis 25%-50%; T2, tubular atrophy-interstitial fibrosis >50%.
*Scores were derived from the coefficients of the variables in the stepwise Cox regression model.
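In a Cox regression model, the hazard ratio for a variable is the exponential of its beta coefficient, so the Beta and HR columns of Table 2 can be checked against each other. The short sketch below does that check; the variable names are illustrative, and it does not attempt to reproduce how the researchers converted coefficients into the derived scores, which is not detailed here.

```python
import math

# Beta coefficients as reported in Table 2.
betas = {
    "T1": 1.234,
    "T2": 1.901,
    "Global sclerosis >25%": 1.004,
    "Urine protein >1 g/d": 0.561,
}

# Cox model: hazard ratio = exp(beta), rounded to 2 decimals as in the table.
hazard_ratios = {name: round(math.exp(beta), 2) for name, beta in betas.items()}

for name, hr in hazard_ratios.items():
    print(f"{name}: HR = {hr}")
```

The computed values match the point estimates in the HR column of Table 2.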
Using the SSM risk score, the 5-year risks for the combined event in the validation cohort for scores of 0 through 4 points were 2.7%, 4.8%, 16.4%, 30.8%, and 72.4%, respectively. The low-risk group (SSM risk score 0-1 point) included the majority of patients with IgAN in the derivation (69.8%; 713/1022) and validation (66.0%; 677/1025) cohorts.
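The additive scoring in Table 2 and the validation-cohort 5-year risks can be combined into a small lookup, sketched below. The function and constant names are hypothetical, and the handling of values exactly at 25% and 50% tubular atrophy/interstitial fibrosis is an assumption based on the T1 and T2 definitions in the table footnote. This is an illustration of the arithmetic, not the researchers' online tool and not a substitute for clinical judgment.

```python
# 5-year risk (%) of the combined event by SSM score, validation cohort (as reported).
FIVE_YEAR_RISK_PERCENT = {0: 2.7, 1: 4.8, 2: 16.4, 3: 30.8, 4: 72.4}


def ssm_score(taif_percent, global_sclerosis_percent, urine_protein_g_per_day):
    """Sum the derived scores from Table 2 (range 0-4)."""
    score = 0
    # Tubular atrophy/interstitial fibrosis: T2 (>50%) = 2 points, T1 (25%-50%) = 1 point.
    # Treating exactly 25% and exactly 50% as T1 is an assumption of this sketch.
    if taif_percent > 50:
        score += 2
    elif taif_percent >= 25:
        score += 1
    # Global sclerosis >25% of glomeruli = 1 point.
    if global_sclerosis_percent > 25:
        score += 1
    # Urine protein >1 g/d = 1 point.
    if urine_protein_g_per_day > 1:
        score += 1
    return score


def is_low_risk(score):
    """The article defines the low-risk group as an SSM score of 0-1 point."""
    return score <= 1


# Hypothetical patient: 30% tubular atrophy/interstitial fibrosis,
# 10% global sclerosis, urine protein 1.2 g/d.
example_score = ssm_score(30, 10, 1.2)
print(example_score, FIVE_YEAR_RISK_PERCENT[example_score], is_low_risk(example_score))
```

For this hypothetical patient the score is 2 points (1 for T1, 1 for urine protein), which corresponds to the reported 16.4% 5-year risk in the validation cohort.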
In both cohorts, Kaplan-Meier analysis showed that survival free of the combined event during follow-up was significantly better (P<.001) for patients without T1, T2, global sclerosis >25%, or urine protein excretion >1 g/day. The SSM successfully stratified the patients (P<.001).
The researchers cited some limitations to the study: the cohort is not from a prospective therapeutic trial, and the therapeutic interventions were variable; the prediction model was developed using data from a Chinese population, possibly limiting the generalizability of the findings to other ethnic groups.
In conclusion, the researchers said, “We established and externally validated the Nanjing IgAN Risk Stratification System including an accurate XGBoost prediction and a simplified SSM for risk stratification, both of which show promising performance and have better prediction power than the absolute renal risk. The Nanjing IgAN Risk Stratification System is accessible online with consideration of both model prediction performance and interpretation. The Nanjing IgAN Risk Stratification System could be easily implemented in clinical practice for physicians and patients to stratify risk and predict kidney prognosis quickly and accurately, thereby serving as a more favorable tool to strengthen individualized treatment and management in patients with IgAN.”
Takeaway Points
- Researchers in China developed and validated a system to aid in predicting long-term outcomes and stratifying risk in patients with immunoglobulin A nephropathy (IgAN).
- Two risk models were developed: (1) a prediction model using eXtreme Gradient Boosting (XGBoost) that requires a computer for accurate risk prediction, and (2) a simpler restricted variable scoring scale model (SSM) derived by stepwise Cox regression for risk stratification.
- The two models, both of which showed promising performance and better prediction power than the absolute renal risk, resulted in the Nanjing IgAN Risk Stratification System.