2008 USRDS Annual Data Report

Two: Chronic kidney disease identified in the claims data

Identifying chronic kidney disease is a significant challenge, as most datasets lack the biochemical data that identify the disease with greater precision than diagnosis codes do. And while random samples such as the NHANES dataset do include biochemical information, such studies rarely include event rates or economic data, making it difficult to evaluate access to care for this high-risk population, or to examine the interactions of CKD with diabetes and cardiovascular disease.

The USRDS obtains several datasets which we have, in past ADRs, used to assess the recognized CKD population. This year we have developed methods to examine this population by looking at services performed by providers when CKD diagnoses are reported. In this chapter we define two such methods, using claims data and looking as well at the new ICD-9-CM diagnosis codes for CKD, introduced in 2006. Data come from the Medstat MarketScan database, in which most of those covered are self-insured, and from the Ingenix i3 dataset, in which most are not. We compare the prevalence of CKD in these datasets with a population-based estimate from the NHANES samples, providing a sense of the under-recognition of this disease.

On the next page we present a schematic for defining the CKD population in the Medicare 5 percent sample. With the point prevalent method we identify CKD from diagnosis codes within a single calendar year. Since this approach defines the disease only from codes in that year (excluding patients without CKD codes in that year, even if they had codes in prior years), it may undercount the number of actual cases.

To address this potential loss of patients we also developed a period prevalent method. All patients alive on January 1 of a year have data reviewed in the prior year for a diagnosis of CKD, defining a point prevalent population known to have CKD. New cases of CKD within the current year are then added, completing a period prevalent cohort. This method is similar to the one used to define period prevalent cohorts in the ESRD population. Additional information on these two approaches is available in Appendix A.
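The difference between the two cohort definitions can be sketched in a few lines of Python. This is a simplified illustration only: the record layout `(patient_id, service_date, has_ckd_code)`, the function names, and the `alive_jan1` set are hypothetical, and the full case definition (claim counts by setting, exclusions) described in Appendix A is not reproduced here.

```python
from datetime import date

# Hypothetical claim records: (patient_id, service_date, has_ckd_code).

def point_prevalent(claims, year):
    """Patients with at least one CKD-coded claim dated within `year`."""
    return {pid for pid, d, ckd in claims if ckd and d.year == year}

def period_prevalent(claims, year, alive_jan1):
    """Patients alive on January 1 of `year` who had a CKD code in the
    prior year, plus patients newly coded with CKD during `year`."""
    prior = {
        pid for pid, d, ckd in claims
        if ckd and d.year == year - 1 and pid in alive_jan1
    }
    new_cases = point_prevalent(claims, year) - prior
    return prior | new_cases

claims = [
    ("a", date(2005, 3, 1), True),   # coded in 2005 only, alive in 2006
    ("b", date(2006, 5, 1), True),   # first coded in 2006
    ("c", date(2005, 7, 1), True),   # coded in 2005, not alive on Jan 1, 2006
    ("d", date(2006, 1, 9), False),  # claim without a CKD code
]
alive = {"a", "b", "d"}

point_2006 = point_prevalent(claims, 2006)
period_2006 = period_prevalent(claims, 2006, alive)
```

In this toy example patient "a" is missed by the point prevalent method but captured by the period prevalent method, which is the loss of patients the second approach is meant to address.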

Using the point prevalent method, with a single year to define the CKD population, 6.4 percent of the Medicare population carries a diagnosis of CKD. Using the two-year period prevalent method, the number rises to 9 percent. The latter method provides a case definition that can be used to address expenditures in a way similar to that used with the ESRD population. Because the Medstat and Ingenix i3 populations, however, have fewer years of consistent claims, we have used the first method to examine them.

On the next spread we use the point prevalent method to look at the prevalence of CKD in the Medicare population, comparing that identified through a full constellation of diagnosis codes (discussed in Appendix A) to that identified solely through the new ICD-9-CM diagnosis codes for CKD. It is clear that the comprehensive codes provide a greater yield, a reality of the early introduction of any new codes to the payment system and to providers. For this reason we also use the comprehensive codes to illustrate the increasing amount of recognized CKD, which is still significantly less than the actual prevalence of the disease indicated by population estimates (Chapter One).

With Medicare’s coding methods required across all state and regional fiscal intermediaries, the consistency of billing rules and procedures may differ from that in the employer group health plans. Medicare payment rules, for example, indicate that all claims now need to use the new ICD-9-CM codes for CKD, and allow no grace or phase-in periods. Our analyses show that the adoption of these codes in the Medicare system was much greater than in the private sector. The USRDS Coordinating Center will monitor these coding changes and report on their utility in health services research. In the early phase of the changes, however, it appears that the full constellation of kidney disease codes still provides a better indication of recognized CKD than do the new 585.x ICD-9-CM diagnosis codes on their own.

Trends in the amount of recognized CKD (Figures 2.3 and 2.5) do suggest that providers are documenting more CKD in their service claims, though the prevalence of CKD is still far less than that noted in the population estimates reported from the NHANES 1999–2006 data. Despite these limitations, the specificity of the disease codes is very high, at over 90 percent; the sensitivity, however, is much lower. The high specificity allows researchers to assess care in those known to have the disease, providing insights into access to care in this high-risk population.


As mentioned earlier, laboratory data allow a more accurate identification of CKD. The Ingenix i3 dataset contains information on prescription drugs and laboratory data reported to United Healthcare contract laboratories, and these data provide a different view of the prevalence of CKD than that identified in the ICD-9-CM diagnosis codes.

We use two formulas, based on serum creatinine levels, to define estimated glomerular filtration rates (eGFRs): the Modification of Diet in Renal Disease (MDRD) equation and that published by Rule et al. The prevalence of CKD as defined through an eGFR <60 ml/min/1.73 m2 is greater than that defined from the diagnosis codes. In addition, the comorbidity burden based on this CKD definition shows a high prevalence of congestive heart failure and other cardiovascular disease. In Figure 2.6 we illustrate the relationship between CKD stage reported from the claims and that identified from the laboratory data. As CKD stage, identified from the diagnosis codes, progresses, the estimated GFR decreases. There are, however, notable exceptions, which may relate to early use of the codes in the employer group health plan (EGHP) claims data. At Stage 5 (ICD-9-CM code 585.5), the mean laboratory eGFR appears to be 35 ml/min/1.73 m2, higher than the expected level of <15 from the definition. These irregularities are important for researchers to consider, and the relationship between the timing of the laboratory data and the reporting of the diagnosis codes should be explored. We will continue to assess these areas.
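As a rough illustration of the laboratory-based definition, the sketch below computes an eGFR with the commonly cited four-variable MDRD form and maps it to Stages 3–5 using the conventional cutoffs of 60, 30, and 15 ml/min/1.73 m2. Several things here are assumptions rather than details taken from this chapter: the leading coefficient (186 is assumed; a value of 175 is used with standardized creatinine assays), the function names, and the omission of the equation's race term, which mirrors the chapter's note that race is unavailable in the Ingenix data. The exact specification used in the report, and the Rule equation, are in Appendix A and are not reproduced.

```python
def egfr_mdrd(scr_mg_dl, age_years, female, coefficient=186.0):
    """Estimated GFR (ml/min/1.73 m2), four-variable MDRD form.

    The race multiplier is intentionally omitted (assumption: race
    is not recorded), so values are underestimates for African
    American individuals.
    """
    egfr = coefficient * scr_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742
    return egfr

def ckd_stage_from_egfr(egfr):
    """Map an eGFR to CKD Stage 3, 4, or 5; None if eGFR >= 60."""
    if egfr < 15:
        return 5
    if egfr < 30:
        return 4
    if egfr < 60:
        return 3
    return None

# Hypothetical patient: 50-year-old man, serum creatinine 1.0 mg/dl.
example = egfr_mdrd(1.0, 50, female=False)
example_stage = ckd_stage_from_egfr(example)
```

Under these cutoffs, a mean eGFR of 35 among patients coded 585.5 would fall in the Stage 3 range, which is the kind of irregularity noted above.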

There are clearly important challenges in any attempt to define the CKD population from administrative datasets. The Medicare 5 percent data appear to provide a greater reporting of recognized CKD, possibly because of the consistency of coding and billing procedures compared to those used by EGHPs. The low prevalence of reported CKD in the EGHP population would appear to be addressed with more complete laboratory data on this population. And there is little doubt that the Medicare data would also be improved with the addition of laboratory data, the basis of the case definition. Any development of CKD cohort definitions from administrative data should acknowledge that there is significant underreporting of the disease. As awareness of CKD grows in the medical community, based on studies of the disease’s interaction with diabetes and cardiovascular disease, it is likely that reported codes will increase. The new initiatives passed by Congress to address CKD education in Stage 4 may influence provider recognition of the condition, as may the quality improvement assessments currently being discussed in relation to CKD performance measures. Lastly, since CKD in its advanced stages has such an impact on morbidity and mortality, it may be worth considering a registration system for CKD, with surveillance data that will further the analyses of outcomes and access to care.

figure 2.1 point & period prevalent populations for 2006, estimated from 5 percent Medicare sample using standard methods (see Appendix A for further details). CHF, diabetes, & CKD determined from claims.

figure 2.2 The standard methodology (one or more inpatient diagnosis codes or two or more outpatient codes) identifies a higher percentage of patients with CKD (6.4 percent) than that obtained with the new stage-specific codes (4.2 percent for all 585 codes combined). Within these stage-specific codes, the most commonly used codes are 585.9 (unknown) and 585.3 (Stage 3). The use of these codes began late in 2005, and providers are still adjusting to their use. Both their use and their accuracy should increase over time.
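The "one or more inpatient or two or more outpatient codes" rule can be sketched as follows. The claim layout, the function name, and the small code set are hypothetical and for illustration only; the full list of qualifying diagnosis codes and any further conditions (for example, how claim dates are handled) are described in Appendix A and are not encoded here.

```python
from collections import Counter

# Illustrative subset only: the stage-specific 585.x codes named in this chapter.
CKD_CODES = {"585.1", "585.2", "585.3", "585.4", "585.5", "585.9"}

def identify_ckd(claims, ckd_codes=CKD_CODES):
    """Return patient ids meeting the claims-based CKD definition.

    claims: iterable of (patient_id, setting, diagnosis_code), where
    setting is "inpatient" or "outpatient" (hypothetical layout).
    """
    inpatient, outpatient = Counter(), Counter()
    for pid, setting, code in claims:
        if code not in ckd_codes:
            continue
        if setting == "inpatient":
            inpatient[pid] += 1
        else:
            outpatient[pid] += 1
    candidates = set(inpatient) | set(outpatient)
    return {p for p in candidates if inpatient[p] >= 1 or outpatient[p] >= 2}

claims = [
    ("a", "inpatient", "585.3"),    # one inpatient code qualifies
    ("b", "outpatient", "585.9"),   # a single outpatient code does not
    ("c", "outpatient", "585.3"),
    ("c", "outpatient", "585.4"),   # second outpatient code qualifies
    ("d", "inpatient", "250.00"),   # not a CKD code
]
ckd_patients = identify_ckd(claims)
```

Requiring a second outpatient claim is a common way to reduce false positives from rule-out diagnoses, at some cost in sensitivity.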

figure 2.3 Among Medicare patients, the prevalence of CKD identified through claims has increased dramatically in the last ten years, from 1.8 percent in 1995 to 6.4 percent in 2006. Much of this growth is probably due to increasing recognition and/or coding of earlier stages of CKD.

table 2.a Data here show generally increasing levels of comorbidity with increasing CKD stage as defined by the 585.1–585.5 codes. Although the pattern is not universal, agreement appears to be best between the old method and ICD-9-CM code 585.4.

figure 2.4 The Medstat population age 20–64 shows about one-tenth the CKD prevalence of the Medicare cohort age 65 and older (shown in Figure 2.2). The distribution of new stage-specific codes compared to the old method is similar to that found in the Medicare cohort; there appears, however, to be less use of the new stage codes in the Medstat data. Of these stage-specific codes, 585.3 is the most commonly used in the Medstat cohort. Again, the use and accuracy of these codes in the EGHP datasets should improve over time.

figure 2.5 With a pattern similar to that observed in the Medicare data, the prevalence of claims-identified CKD has risen substantially since 1999. Increased recognition and coding of earlier-stage CKD again most likely accounts for much of this increase.

table 2.b As in the Medicare cohort, Medstat data show generally increasing levels of comorbidity with increasing CKD stage, as defined by the new codes. In this dataset, the best general agreement between the old method and the new codes seems to be with Stage 3, identified with ICD-9-CM code 585.3.

table 2.c This table illustrates the percentage of patients with metabolic-related abnormalities in CKD Stages 3–5, using the two different estimating equations. Among non-CKD patients, the percentage with each abnormality is similar for both equations. For patients with CKD of Stages 3–5, however, the percentages are considerably higher when CKD is identified using the Rule method as compared to the MDRD equation. The largest difference is observed for Stage 3, which agrees with the general finding that the MDRD formula identifies significantly more individuals as Stage 3 than does the Rule equation, but similar numbers within Stages 4 and 5.


figure 2.6 Agreement between CKD as identified from claims (old versus new codes) and as identified through eGFR (using the MDRD equation) shows a pattern of generally decreasing eGFR with increasing stage, and “unknown” stage from codes being a mixture of all stages. The average eGFR for patients with CKD, identified through the old method, seems to correspond approximately to Stage 2 as identified in the claims.

table 2.d This table presents a detailed assessment of agreement between eGFR calculated from serum creatinine and CKD identified from claims (using the “old” method). The right-most column shows the (unchanging) proportion with CKD from claims, about 1.9 percent. Each row represents a different eGFR cutoff used to identify patients with CKD, representing a changing gold standard. The parameters of sensitivity (the probability of having CKD from claims, given an eGFR less than cutoff), specificity (the probability of not having CKD identified from claims, given an eGFR > cutoff), positive predictive value (PPV, the probability of having an eGFR < cutoff given CKD identified in claims), and negative predictive value (NPV, the probability of having an eGFR > cutoff given CKD not identified in claims) are shown for each level of eGFR. Although the choice of the eGFR cutoff giving “best” agreement with claims is a tradeoff among these four parameters, the column labeled Kappa is a measure of overall agreement between eGFR and CKD from claims. This value is highest (0.38) when an eGFR cutoff of <45 ml/min/1.73 m2 is used, suggesting that CKD from claims generally identifies patients with an eGFR <45. Some caveats are important to mention. These analyses of eGFR using serum creatinine are on patients with measured creatinines, a subset of all Ingenix patients. Also, data on race are not available; eGFR is calculated for each individual without considering race, and is therefore an underestimate for African American individuals.
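For readers who want to see how these agreement measures relate to one another, the sketch below derives them from a single 2×2 table in which eGFR below a chosen cutoff is treated as the gold standard. The function name and the counts are invented for illustration and are not taken from Table 2.d; kappa is computed with the standard Cohen's kappa formula, which is assumed (not confirmed by the text) to be the statistic reported.

```python
def agreement_stats(tp, fp, fn, tn):
    """Agreement between claims-identified CKD and eGFR < cutoff.

    tp: CKD in claims and eGFR < cutoff
    fp: CKD in claims but eGFR >= cutoff
    fn: no CKD in claims but eGFR < cutoff
    tn: no CKD in claims and eGFR >= cutoff
    """
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement expected from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "kappa": (observed - expected) / (1 - expected),
    }

# Invented counts for 1,000 hypothetical patients at one eGFR cutoff.
stats = agreement_stats(tp=20, fp=10, fn=30, tn=940)
```

Repeating the calculation at each eGFR cutoff, with the claims classification held fixed, gives one row of statistics per cutoff in the manner the table describes.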

figures 2.7 & 2.8 We show here the distribution of CKD stages, using claims-based methodology in the Medicare, Medstat, and Ingenix i3 datasets, and lab-based methodology in the Ingenix i3 dataset. Claims-based case identification is much more frequent for Medicare patients. The discrepancy between lab-based and claims-based case identification in the Ingenix i3 dataset is notable, with claims suggesting that just 0.13 percent of subjects have CKD Stages 3–5, compared to 10.5 percent identified with laboratory-based estimates. We further show claims-based comparisons of CKD prevalence using the three claims-based methods, and we apply two lab-based methods to the U.S. population and to Ingenix i3 patients. Among adults younger than 65, CKD prevalence estimates are broadly similar with the two lab-based methods. It is likely that the vast majority of subjects with CKD in the Ingenix i3 database are never formally identified.

table 2.e In the Medicare data, the prevalence of each comorbid condition generally increases with increasing CKD stage defined by ICD-9-CM codes 585.1 through 585.5. Not surprisingly, the prevalence of each comorbidity is considerably less in patients without any codes for CKD compared to those with CKD identified through either the traditional or new codes within Medicare, or in the Medstat or Ingenix i3 datasets. Comorbidity generally increases by age, but there are inconsistent differences by gender and race.

figures 2.9, 2.10, 2.11, & 2.12 The figures below show the percentage of patients with diabetes, congestive heart failure, hypertension, and cancer, by dataset and CKD status, using the traditional method of identifying CKD and, for the Medicare dataset, the new ICD-9-CM codes. Across all four conditions, the prevalence of each comorbidity is similar between the Ingenix i3 and Medstat datasets. In the Medicare cohort there is no clear increase in comorbidity prevalence with increasing CKD stage using the new codes, with the exception of CHF, for which the relationship is clearer.

figure 2.13 Plotting patient-level comorbidity (CVD, diabetes, hypertension) presence (0 versus 1) by estimated GFR, and then fitting a smoothed curve to each comorbid condition, shows a clear pattern of increasing comorbidity by decreasing eGFR. Diabetes and overall cardiovascular disease show similar relationships, while hypertension prevalence is higher at every level of eGFR.



figures 2.2–3 & table 2.a point prevalent general Medicare patients age 65 & older, surviving all of 2006 with Medicare as primary payor & not enrolled in an HMO. ESRD patients excluded. CKD & other comorbidities defined by diagnosis codes in 2006.

figures 2.4–5 & table 2.b point prevalent Medstat patients age 20–64, surviving all of 2006, & enrolled in a fee-for-service plan. ESRD patients excluded. CKD & other comorbidities defined by diagnosis codes in 2006. • *All codes: CKD identified through one or more inpatient/outpatient institutional claims (inpatient hospitalization, skilled nursing facility, or home health agency), or two or more institutional claims (outpatient) or physician/supplier claims, the method used in other USRDS studies. **CKD identified through the 585.x ICD-9-CM codes. ^In USRDS analyses, patients with ICD-9-CM code 585.6 are considered to have code 585.5; see Appendix A for details.

table 2.c & figure 2.6 point prevalent Ingenix i3 patients age 20–64, surviving all of 2006 & enrolled in a fee-for-service plan. ESRD patients excluded. CKD defined by eGFR in Table 2.c & by diagnosis codes in Figure 2.6; see Appendix A for details of MDRD & Rule methods. Last serum creatinine value of 2006 used for eGFR calculation. Uric acid & parathyroid hormone abnormalities defined by ≥95th percentile from NHANES data; calcium abnormality defined by ≤5th percentile from NHANES data; & glucose abnormality based on normal range from Ingenix i3 data. Reduced HDL: <40 mg/dl in men, <50 mg/dl in women; based on criteria proposed by the National Cholesterol Education Program (NCEP) Adult Treatment Panel III (ATP III), with elevated triglycerides ≥150 mg/dl. WHO anemia: males, hemoglobin <13 g/dl; females, hemoglobin <12 g/dl. Error bars in Figure 2.6 show 25th & 75th percentiles.

table 2.d point prevalent Ingenix i3 patients age 20–64, surviving all of 2006 & enrolled in a fee-for-service plan. eGFR calculated using MDRD method, using last serum creatinine value of 2006.

figures 2.7–8 Medicare: point prevalent general Medicare patients age 65 & older, surviving all of 2006 with Medicare as primary payor & not enrolled in an HMO. Medstat & Ingenix i3: point prevalent Medstat & Ingenix i3 patients age 20–64, surviving all of 2006 & enrolled in a fee-for-service plan. ESRD patients excluded. CKD & comorbidities defined by diagnosis codes in 2006. NHANES: NHANES 1999–2006 participants, age 20 & older. CHF & diabetes are self-reported; CKD defined through serum creatinine values. • CKD stages: Stage 3, 30 ≤ eGFR < 60 ml/min/1.73 m2; Stage 4, 15 ≤ eGFR < 30; Stage 5, eGFR < 15. ^In USRDS analyses, patients with ICD-9-CM code 585.6 are considered to have code 585.5; see Appendix A for details.

table 2.e & figures 2.9–12 Medicare: point prevalent general Medicare patients age 65 & older, surviving all of 2006 with Medicare as primary payor & not enrolled in an HMO. Medstat & Ingenix i3: point prevalent Medstat & Ingenix i3 patients age 20–64, surviving all of 2006 & enrolled in a fee-for-service plan. ESRD patients excluded. CKD & other comorbidities defined by diagnosis codes in 2006. Table values are comorbidity prevalence within each cell defined by patient demographics (rows) & CKD status (old vs. new codes).

figure 2.13 point prevalent Ingenix i3 patients, surviving all of 2006. Comorbidity defined by diagnosis codes in 2006, & eGFR calculated from mean of serum creatinine levels in 2006. In USRDS analyses, patients with ICD-9-CM code 585.6 are considered to have code 585.5; see Appendix A for details.