
Effective Clinical Practice

Changing Disease Definitions: Implications for Disease Prevalence. Analysis of the Third National Health and Nutrition Examination Survey, 1988-1994

Effective Clinical Practice, March/April 1999

Lisa M. Schwartz and Steven Woloshin

For author affiliations, current addresses, and contributions, see end of text.

Context. In the hope of extending treatment benefits to patients with early disease, various professional societies have recommended changing several common disease definitions by lowering the threshold value for diagnosis.

Count. Number of Americans labeled "diseased" under new definitions for diabetes, hypertension, hypercholesterolemia, and being overweight.

New cases = Number of cases under new definition - Number of cases under old definition

Data Source. Adult participants (age ≥17 years) in the Third National Health and Nutrition Examination Survey (1988-1994).

Results. Adopting the new definitions would dramatically inflate disease prevalence. Changing the threshold for diabetes from a fasting glucose level of greater than or equal to 140 mg/dL to greater than or equal to 126 mg/dL would result in 1.7 million new cases. Redefining hypertension as systolic blood pressure greater than or equal to 140 mm Hg instead of greater than or equal to 160 mm Hg or diastolic blood pressure greater than or equal to 90 mm Hg instead of greater than or equal to 100 mm Hg would create 13 million new hypertensive patients. For hypercholesterolemia (a cholesterol level of greater than or equal to 200 mg/dL instead of greater than or equal to 240 mg/dL) and being overweight (body mass index greater than or equal to 25 kg/m2 instead of greater than or equal to 27 kg/m2), the number of new cases would be 42 million and 29 million, respectively. The new definitions ultimately label 75% of the adult U.S. population as diseased.

Conclusions. If these modest changes in disease definition were adopted, great numbers of people would be considered diseased. The extent to which new "patients" would ultimately benefit from early detection and treatment of these conditions is unknown. Whether they would experience important physical or psychological harm is an open question.


Preventing the serious consequences of chronic disease is an important focus of our health care system. It is often assumed that such prevention is best accomplished through early identification and treatment. One approach to early identification is to lower the diagnostic threshold for disease (e.g., changing the cut-off value that defines high blood pressure).

The benefit of treating patients early in the course of disease is often inferred from studies demonstrating the benefit of treating patients with more advanced disease. For example, the efficacy of improved glycemic control in preventing diabetic complications (1,2) and the benefit of lowering blood pressure among patients with moderate or severe hypertension (3, 4) have raised questions about whether these results can be extrapolated to patients with milder forms of disease (i.e., would people with glucose intolerance benefit from identification and treatment?). In fact, several recent studies have demonstrated some benefit in treating mild hypertension (5) and lowering "normal" cholesterol levels. (6) Perhaps inspired by such findings, various professional societies have recommended lowering the threshold value that defines "abnormal" for several common conditions--in essence, expanding the definition of disease to include increasingly mild disease (or even "pre-disease" forms).

We considered the implications of recently suggested changes in the definition of four common conditions: diabetes, (7) hypertension, (8) hypercholesterolemia, (6) and being overweight. (9) We selected these conditions because they are familiar to physicians and the general public and because in each case, important policy makers, professional organizations, or researchers have called for lowering the threshold value used to define being "diseased." By using data from the Third National Health and Nutrition Examination Survey (NHANES III), we examined how lowering each of these four thresholds would affect disease prevalence in the U.S. adult population.


Methods

Data Source

All data presented in this study are from NHANES III, the most recent (1988-1994) national examination study conducted by the National Center for Health Statistics, Centers for Disease Control and Prevention, to assess the health and nutritional status of the civilian, noninstitutionalized population of the United States. The NHANES is a valuable public resource that the federal government makes easily available and encourages researchers to use. Practical advice for using the NHANES database is provided in the Appendix. The complex sampling design, data collection methods, and weighting approach have been described elsewhere. (10, 11)

Briefly, a national sample of approximately 34,000 people 2 months of age or older was selected to participate in NHANES III. Data collection included household interviews (e.g., sociodemographic information and dietary and health questionnaires) and standardized medical examinations that included various blood tests. Overall, 73% of the selected sample underwent examination. We limited our analyses to the 20,040 men and women 17 years of age or older for whom examination data were available.

Calculating Disease Prevalence

We estimated the prevalence of diabetes, hypertension, hypercholesterolemia, and being overweight under the old and the new disease definitions and calculated the net change (i.e., number of new cases). The threshold values and sources of these definitions are presented in Table 1.

Diabetes, Hypertension, and Hypercholesterolemia

Under both the old and the new disease definitions, we calculated prevalence as the total number of Americans 17 years of age or older who met the criteria for a given condition, regardless of whether the diagnosis was established. Figure 1, which gives the "back-of-the-envelope" calculation, shows how we calculated the number of "diseased" patients under the new and the old definitions of disease.

"Disease" was defined as the combination of individuals reporting the condition (self-reported cases) and those who did not report the condition but were found to meet diagnostic criteria (undiagnosed reservoir cases). Our method was as follows:

1. Self-reported cases: We counted individuals with a self-reported diagnosis of disease (i.e., those who responded "yes" to the question, "Have you ever been told by a doctor or other health professional that you have ___?"). In this way, we accounted for persons with established diagnoses who receive adequate treatment and would therefore be missed if only a laboratory-based definition of abnormal (e.g., glucose level ≥140 mg/dL for diabetes) were used.

2. Undiagnosed reservoir cases: We identified all individuals whose laboratory values exceeded a specific diagnostic threshold for fasting plasma glucose level, fasting total cholesterol level, and blood pressure based on an average of up to six measurements taken during the NHANES examination (operational details are available elsewhere (13)). In this way, we accounted for patients with abnormal results on laboratory tests but without an established diagnosis--cases that would have been missed if only a self-report-based definition of disease were used.
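The two-part case definition above can be expressed as a short sketch. It is illustrative only and is not the authors' analysis code (the study used Stata); the `Participant` fields and the sample records are hypothetical, and the thresholds are the fasting glucose values for diabetes given in the abstract.

```python
from dataclasses import dataclass
from typing import Optional

# Fasting glucose thresholds for diabetes (mg/dL), as stated in the abstract.
OLD_THRESHOLD = 140.0
NEW_THRESHOLD = 126.0


@dataclass
class Participant:
    self_reported: bool               # answered "yes" to "Have you ever been told..."
    fasting_glucose: Optional[float]  # mg/dL; None if no measurement is available


def classify(p, threshold):
    """Return the case category for one participant under a given threshold."""
    if p.self_reported:
        return "self-reported"
    if p.fasting_glucose is not None and p.fasting_glucose >= threshold:
        return "undiagnosed reservoir"
    return "not a case"


def count_cases(participants, threshold):
    """Unweighted count of self-reported plus undiagnosed reservoir cases."""
    return sum(classify(p, threshold) != "not a case" for p in participants)


# Hypothetical records, for illustration only.
sample = [
    Participant(True, 110.0),   # diagnosed and controlled: a case under both definitions
    Participant(False, 150.0),  # undiagnosed under both definitions
    Participant(False, 130.0),  # a case only under the new definition
    Participant(False, 100.0),
    Participant(False, None),
]

old_cases = count_cases(sample, OLD_THRESHOLD)
new_cases = count_cases(sample, NEW_THRESHOLD)
print(old_cases, new_cases, new_cases - old_cases)
```

Counting self-reported cases first is what keeps treated patients with normal laboratory values from being dropped. The counts here are unweighted; national estimates additionally require the sampling weights.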

Being Overweight

Because there was no self-report question on being overweight, we calculated the number of individuals whose body mass index exceeded the old threshold (27 kg/m2) and the number of individuals whose body mass index exceeded the new threshold (25 kg/m2). Body mass index was calculated by using the following equation:

Body mass index = Weight (kg) / [Height (m)]^2
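As a minimal sketch, the body mass index calculation and the two overweight thresholds can be written as follows. It assumes the standard definition (weight in kilograms divided by the square of height in meters) and treats the thresholds as "greater than or equal to," as in the abstract; the function names and the example person are hypothetical.

```python
OLD_BMI_THRESHOLD = 27.0  # kg/m2, old definition of overweight
NEW_BMI_THRESHOLD = 25.0  # kg/m2, new definition of overweight


def bmi(weight_kg, height_m):
    """Body mass index: weight (kg) divided by height (m) squared."""
    return weight_kg / height_m ** 2


def is_overweight(weight_kg, height_m, threshold):
    """True if the body mass index meets or exceeds the threshold."""
    return bmi(weight_kg, height_m) >= threshold


def weight_cutoff_kg(height_m, threshold):
    """Weight at which a person of the given height reaches the threshold."""
    return threshold * height_m ** 2


# Hypothetical person: 1.75 m tall, 80 kg.
print(round(bmi(80.0, 1.75), 1))
print(is_overweight(80.0, 1.75, OLD_BMI_THRESHOLD))
print(is_overweight(80.0, 1.75, NEW_BMI_THRESHOLD))
print(weight_cutoff_kg(1.75, NEW_BMI_THRESHOLD), weight_cutoff_kg(1.75, OLD_BMI_THRESHOLD))
```

The `weight_cutoff_kg` helper inverts the equation, which is one way height-specific cut-off values of the kind described for Table 3 could be derived.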

Statistical Analysis

We present the distribution of each variable for the U.S. adult population and counts of Americans labeled as "diseased" under the old and the new thresholds for defining "diseased." All analyses incorporated sampling weights to adjust for differential probability of selection (given the complex sampling design) and to account for nonresponse. All analyses used the SVY series of commands in Stata 5.0 (College Station, Texas).
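To illustrate how sampling weights turn survey records into national counts, the sketch below sums the weights of records that meet a threshold. The records and weights are invented for illustration and are not NHANES values; the thresholds are the fasting glucose cut-offs from the abstract.

```python
def weighted_count(records, threshold):
    """Sum the sampling weights of records whose value meets the threshold.

    Each record is a (value, weight) pair; the weight is the number of
    people in the population that the record represents.
    """
    return sum(weight for value, weight in records if value >= threshold)


# Hypothetical (fasting glucose in mg/dL, sampling weight) pairs.
records = [
    (150.0, 9000.0),
    (130.0, 12000.0),
    (110.0, 15000.0),
    (95.0, 14000.0),
]

old_count = weighted_count(records, 140.0)   # cases under the old threshold
new_count = weighted_count(records, 126.0)   # cases under the new threshold
new_cases = new_count - old_count            # net change, as in the abstract

population = sum(weight for _, weight in records)
new_prevalence = new_count / population

print(old_count, new_count, new_cases, new_prevalence)
```

This sketch covers only point estimates. Standard errors also require the stratum and primary sampling unit designations, as discussed in the Appendix.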


Results

Figures 2, 3, 4, and 5 show the distribution of each disease in the U.S. adult population, with lines marking the threshold values defining "diseased" under the old and the new recommendations. Table 2 shows the number of Americans labeled "diseased" under the old and the new recommendations and demonstrates that adopting each of these newly proposed definitions will substantially inflate the prevalence of each condition.


Diabetes

The American Diabetes Association recently lowered the threshold fasting glucose level that defines diabetes from ≥140 mg/dL to ≥126 mg/dL (7) (Figure 2). The newly adopted definition creates 1.7 million new cases of diabetes.


Hypertension

In the past, the Joint National Committee on High Blood Pressure advocated treatment for all patients with moderate to severe hypertension (i.e., systolic blood pressure ≥160 mm Hg or diastolic blood pressure ≥100 mm Hg) and patients with mild hypertension (i.e., systolic blood pressure ≥140 mm Hg or diastolic blood pressure ≥90 mm Hg) who had evidence of target organ damage or major risk factors for cardiovascular disease. (8) For other patients with mild hypertension, "in the absence of target organ damage and other major risk factors, some physicians may elect to withhold antihypertensive drug therapy." (12) In their most recent report, however, the Committee suggested treatment for all patients with mild hypertension. In essence, their definition of hypertension requiring treatment has changed from systolic blood pressure ≥160 mm Hg or diastolic blood pressure ≥100 mm Hg to ≥140 mm Hg or ≥90 mm Hg, respectively. This change causes an additional 13 million Americans to meet criteria for antihypertensive therapy. Figure 3 shows how the change in the systolic blood pressure cut-off changes the prevalence of hypertension.

Another notable change in the latest report is the creation of a new disease category called high-normal blood pressure (systolic blood pressure 130 to 139 mm Hg or diastolic blood pressure 85 to 89 mm Hg). Patients with blood pressure in this range would be prescribed either lifestyle modifications or drug therapy, depending on their predicted risk for cardiovascular disease. Implementation of this diagnostic strategy would add 31 million new cases of disease to the 38 million cases established under the old definition.


Hypercholesterolemia

The recent publication of the results of the Air Force/Texas Coronary Atherosclerosis Prevention Study (AFCAPS/TexCAPS), which demonstrated a reduction in cardiovascular mortality among patients with "normal" cholesterol levels, has stimulated discussion about lowering the threshold value for "high" cholesterol levels. (6) One strategy (based on the entry criteria from the AFCAPS trial) would lower the threshold cholesterol level for treatment from 240 mg/dL to 200 mg/dL. Adopting this strategy would create 42 million new cases, almost doubling the number of people for whom pharmacologic treatment would be recommended. Figure 4 shows that the new disease threshold, a cholesterol level of 200 mg/dL, falls at about the median value of the U.S. population; consequently, this threshold labels about half of the entire U.S. population as "diseased."

Being Overweight

The National Heart, Lung, and Blood Institute recently recommended that being overweight be defined as a body mass index ≥25 kg/m2 rather than ≥27 kg/m2. (9) To facilitate interpretation, Table 3 presents the height-specific cut-off values that define "overweight" according to the old and the new definitions. This change would result in 29 million new diagnoses of being overweight. Figure 5 shows that this definition also falls at about the median value of the U.S. population distribution and therefore would label about half of the U.S. adult population as diseased.

Any Condition

Table 2 presents the number of people labeled with any one of these four conditions. Under the old disease definitions, 109 million people, or 58% of the U.S. adult population, would be labeled as diseased. If all four new disease definitions are implemented, 75% of the U.S. adult population (over 140 million people) will be labeled as having at least one disease. The Venn diagram (Figure 6) shows that these four conditions overlap substantially and that many patients will receive more than one diagnosis.

Adequacy of Current Treatment and Detection

To understand the adequacy of current treatment for patients with known disease under the old definitions, we further categorized these patients by whether their disease at the time of self-report was adequately controlled (i.e., whether their laboratory values were less than or greater than the old cut-off values defining disease). Figure 7 shows the proportion of cases under the old definition that are diagnosed and adequately treated, diagnosed and inadequately treated, or undiagnosed. This figure highlights the fact that a substantial proportion of patients with diagnosed disease are inadequately treated: 12% of hypertensive patients, 41% of diabetic patients, and 23% of hypercholesterolemic patients. In addition, a substantial number of individuals have undiagnosed disease under the old disease definitions: the undiagnosed reservoir of disease ranges from 12% for hypertension to 43% for hypercholesterolemia.


Discussion

In the hope of extending treatment benefits to patients with early disease, various professional societies have recommended changing several common disease definitions by lowering the threshold value for diagnosis. Adopting these seemingly modest changes would dramatically inflate the prevalence of diabetes, hypertension, hypercholesterolemia, and being overweight. In the case of the latter two conditions, half the U.S. adult population would be classified as diseased. If all four conditions are considered, three quarters of the U.S. adult population would receive at least one diagnosis.


An important limitation of our study is that, with the exception of blood pressure, all of the data presented are based on single measurements. Although there is no reason to doubt the meticulousness of the NHANES assessments, it is likely that in some cases, the values obtained represent measurement error. Moreover, it is unrealistic to assume that clinicians would base treatment decisions on single measurements; the diagnosis of diabetes, for example, requires two abnormal values. (7) However, the purpose of our analysis was to establish a reasonable estimate of the probable impact of the proposed changes. Even if the true impact were half or one quarter of our estimate, a large number of people would be affected.

Another potential limitation comes from our use of self-reported disease prevalence to determine established diagnoses. It is possible that patients may be mistaken about their condition. Some may not recall having received a diagnosis (especially a diagnosis of mild disease that did not require active treatment), others may erroneously recall a diagnosis, and still others may choose not to acknowledge a diagnosis in a survey. Although we have no way of estimating the amount of misreporting, it probably only represents a small fraction of the large changes in prevalence noted, and overreporting will cancel out underreporting to some extent. Finally, the potential for nonresponse bias should be noted; however, its effect is probably very small because the calculation of sample weights included statistical methods to reduce such bias (10) and because nonresponse could only result in underrepresentation of cases under both the old and the new disease definitions.

Implications of Lower Diagnostic Thresholds

Although it is appealing to believe that individual patients or society would benefit from the identification and treatment of earlier disease, there are three reasons to question this assumption and to be cautious in adopting these new disease definitions.

First, the supporting evidence is incomplete. For example, randomized trial data show that treating low-risk patients who have normal cholesterol levels reduced acute major cardiovascular events (fatal and nonfatal myocardial infarction, unstable angina, or sudden death) from 10.9 to 6.8 per 1000 patient-years, but all-cause mortality did not differ statistically between the study groups (and was actually slightly higher in the intervention group: 4.6 deaths vs. 4.4 per 1000 patient-years). (6) The definitional changes for diabetes and for being overweight are not based on trials but solely on extrapolations from the experience of patients with more advanced disease (e.g., patients with overt diabetes or morbid obesity). The new definition of mild hypertension (i.e., systolic blood pressure ≥140 mm Hg or diastolic blood pressure ≥90 mm Hg) has limited support from one randomized trial in which treating mild diastolic hypertension (diastolic blood pressure 90 to 109 mm Hg) resulted in a small absolute reduction in stroke rates but no change in overall coronary events or all-cause mortality. (5)

The second reason is practical: There are competing priorities. A focus on finding new patients will compete with treating existing ones. Finding new cases of disease will take time, effort, and money--resources that will have to come from somewhere. One limited resource is physician time and attention. An unintended consequence of identifying new, milder cases of chronic disease may be to distract physicians from treating unrelated conditions; for example, a recent study reported that patients with emphysema were less likely to receive lipid-lowering agents than patients without emphysema, and women with diabetes were less likely to be prescribed estrogen replacement therapy than women without diabetes. (14) Multiplying the number of diagnoses leaves less time and energy to deal with each one. Given the inadequacy with which medical conditions are currently detected and treated under the old definitions, it does not seem sensible to change disease definitions to detect even milder cases. Figure 7 shows that a substantial proportion of patients with known disease do not receive adequate treatment, a finding supported by a recent, large Veterans Affairs study of hypertensive patients (15); this suggests that diagnosing disease is easier than treating it. Moreover, a large reservoir of undiagnosed disease already exists. Rather than lowering thresholds to identify more patients with milder disease, it may make more sense to identify patients who meet the old disease criteria and focus efforts on more adequately treating their disease.

Third, diagnosis and treatment are potentially harmful. Regardless of whether newly identified patients receive treatment, simply labeling them with a diagnosis carries potentially important physical and psychological consequences. (16-19) Considering only the new definitions of the four conditions we studied, 75% of the entire U.S. adult population would be labeled as having at least one chronic disease (compared with 58% under the old definitions). The impact of such ubiquitous labeling is difficult to quantify but is probably substantial. In a nation already obsessed with weight and body image and in which eating disorders (e.g., anorexia nervosa and bulimia) are prevalent, labeling half of the population "overweight," for example, may be traumatic.

Treatment side effects represent another potential harm. Even if serious side effects are rare, the enormous increase in the number of people exposed to treatment means that more will occur. The cardiac valvular abnormalities related to the use of dexfenfluramine and fenfluramine (i.e., Phen-Fen) to treat obesity are a recent salient example. Other serious side effects include hypoglycemia in the treatment of diabetes, potentially fulminant hepatic necrosis resulting from use of 3-hydroxy-3-methylglutaryl coenzyme A reductase inhibitors (i.e., the statins) to lower cholesterol levels, and syncope or ischemic events with aggressive lowering of blood pressure.

The potential harms noted are most relevant for the new cases. Because these patients have the mildest disease, they stand to gain the least from diagnosis and treatment. The potential harms of disease labeling and treatment side effects, however, are probably similar across all levels of disease; thus, the balance of benefit against harm will be least favorable for patients who received diagnoses based on the new definitions. If the benefit of treatment is small, untoward effects may overwhelm the benefits, and the new diagnostic criteria could actually result in net harm.

Lower diagnostic thresholds will not only raise the prevalence of disease, they will appear to improve disease outcomes. That is, to the extent that the disease variable being redefined (e.g., the systolic blood pressure value considered to be abnormal) is associated with severity or prognosis, people with cases under the new criteria will be less sick than those with cases diagnosed by the old criteria. For entirely spurious reasons (i.e., a "Will Rogers effect" (20)), the new definitions will make everyone appear healthier. For example, the average fasting blood glucose level among diabetic patients would appear to decrease by about 13 mg/dL and the average total cholesterol level among hypercholesterolemic patients would seem to decrease by about 25 mg/dL, even though each individual patient's laboratory results remain unchanged. Statistics used to summarize the net benefits of treatment will be affected in a similar way: Amputation rates for diabetic patients and stroke rates for hypertensive patients, for example, will appear to decrease, even if the patients are not doing any better, because the new patients with mild disease dilute the overall outcomes. The foregoing considerations argue against the widespread adoption of the new definitions outside the context of a trial; otherwise, we may never learn their true impact.
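The spurious improvement described above can be illustrated with a small numerical sketch. The glucose values below are invented and are not NHANES data; the point is only that the average among "diabetic patients" falls when the threshold is lowered, even though no individual value changes.

```python
def mean_among_cases(values, threshold):
    """Average value among people labeled as cases under the threshold."""
    cases = [v for v in values if v >= threshold]
    return sum(cases) / len(cases)


# Hypothetical fasting glucose values (mg/dL); nobody's value changes.
glucose = [100, 110, 128, 132, 150, 170, 190]

mean_old = mean_among_cases(glucose, 140)  # cases: 150, 170, 190
mean_new = mean_among_cases(glucose, 126)  # cases: 128, 132, 150, 170, 190

print(mean_old)             # 170.0
print(mean_new)             # 154.0
print(mean_old - mean_new)  # 16.0
```

The two people added under the lower threshold have the mildest values, so they pull the group average down. The same dilution would apply to any outcome rate that is correlated with the redefined variable.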

Our analyses demonstrate that adopting the seemingly modest proposed disease definition changes would label great numbers of people as diseased. How much these new "patients" will ultimately benefit from the early detection and treatment of these conditions is unknown. Whether they will experience important physical or psychological harm is an open question.


References

1. The effect of intensive treatment of diabetes on the development and progression of long-term complications in insulin-dependent diabetes mellitus. The Diabetes Control and Complications Trial Research Group. N Engl J Med. 1993;329:977-86.

2. Ohkubo Y, Kishikawa H, Araki E, et al. Intensive insulin therapy prevents the progression of diabetic microvascular complications in Japanese patients with non-insulin-dependent diabetes mellitus: a randomized prospective 6-year study. Diabetes Res Clin Pract. 1995;28:103-17.

3. Effects of treatment on morbidity in hypertension. Results in patients with diastolic blood pressures averaging 115 through 129 mm Hg. JAMA. 1967;202:1028-34.

4. Effects of treatment on morbidity in hypertension. II. Results in patients with diastolic blood pressure averaging 90 through 114 mm Hg. JAMA. 1970;213:1143-52.

5. MRC trial of treatment of mild hypertension: principal results. Medical Research Council Working Party. Br Med J (Clin Res Ed). 1985;291:97-104.

6. Downs JR, Clearfield M, Weis S, Whitney E, Shapiro DR, Beere PA. Primary prevention of acute coronary events with lovastatin in men and women with average cholesterol levels: results of AFCAPS/TexCAPS. Air Force/Texas Coronary Atherosclerosis Prevention Study. JAMA. 1998;279:1615-22.

7. Report of the Expert Committee on the Diagnosis and Classification of Diabetes Mellitus. Diabetes Care. 1997;20:1183-97.

8. The sixth report of the Joint National Committee on prevention, detection, evaluation, and treatment of high blood pressure. Arch Intern Med. 1997;157:2413-46.

9. National Heart, Lung, and Blood Institute. Clinical guidelines on the identification, evaluation and treatment of overweight and obesity in adults. 1998.

10. Third National Health and Nutrition Examination Survey, 1988-1994, NHANES III Household Adult and Laboratory Data Files (CD-ROM). Hyattsville, MD: U.S. Department of Health and Human Services. National Center for Health Statistics, Centers for Disease Control and Prevention; 1996. Public Use Data File Documentation Number 76200.

11. National Center for Health Statistics. Plan and Operation of the Third National Health and Nutrition Examination Survey, 1988-94. Hyattsville, MD: U.S. Department of Health and Human Services, Public Health Service, Centers for Disease Control and Prevention, National Center for Health Statistics; 1994 DHHS publication no. (PHS) 94-1308.

12. The fifth report of the Joint National Committee on Detection, Evaluation, and Treatment of High Blood Pressure (JNC V). Arch Intern Med. 1993;153:154-83.

13. Gunter EW, Lewis BG, Koncikowski SM. Laboratory Procedures Used for the Third National Health and Nutrition Examination Survey (NHANES III), 1988-1994. Hyattsville, MD: Centers for Disease Control and Prevention; 1996.

14. Redelmeier DA, Tan SH, Booth GL. The treatment of unrelated disorders in patients with chronic medical diseases. N Engl J Med. 1998;338:1516-20.

15. Berlowitz DR, Ash AS, Hickey EC, et al. Inadequate management of blood pressure in a hypertensive population. N Engl J Med. 1998;339:1957-63.

16. Bergman AB, Stamm SJ. The morbidity of cardiac nondisease in schoolchildren. N Engl J Med. 1967;276:1008-13.

17. MacDonald LA, Sackett DL, Haynes RB, Taylor DW. Labelling in hypertension: a review of the behavioural and psychological consequences. J Chronic Dis. 1984;37:933-42.

18. Cadman D, Chambers LW, Walter SD, Ferguson R, Johnston N, McNamee J. Evaluation of public health preschool child developmental screening: the process and outcomes of a community program. Am J Public Health. 1987;77:45-51.

19. Feldman W. How serious are the adverse effects of screening? J Gen Intern Med. 1990;5(5 Suppl):S50-3.

20. Feinstein AR, Sosin DM, Wells CK. The Will Rogers phenomenon. Stage migration and new diagnostic techniques as a source of misleading statistics for survival in cancer. N Engl J Med. 1985;312:1604-8.


The authors thank Peter Mogielnicki, MD, for helpful review of the manuscript.


The views expressed herein do not necessarily represent the views of the Department of Veterans Affairs or the United States Government.

Grant Support

Drs. Woloshin and Schwartz are supported by Veterans Affairs Career Development Awards in Health Services Research and Development and by a U.S. Army Medical Research and Materiel Command Breast Cancer Research Program New Investigator Award (DAMD17-96-MM-6712).


Lisa M. Schwartz, MD, MS, and Steven Woloshin, MD, MS, Veterans Affairs Outcomes Group (111B), Veterans Affairs Medical & Regional Office Center, 215 North Main Street, White River Junction, VT 05009.

Appendix: Practical Advice to Users of NHANES III

1. Description: The NHANES III is an extraordinarily rich data source for performing nationally representative, cross-sectional analyses. A wide array of sociodemographic and examination data are available (e.g., physical examinations, blood tests, x-rays, ultrasonograms, spirometry, and electrocardiography).

2. Accessing the database: The National Center for Health Statistics World Wide Web site ( me.htm) has links to NHANES and other surveys (e.g., National Health Interview Survey) conducted by the Centers for Disease Control and Prevention. The site allows users to download reports and some data sets. Users can also e-mail questions about options and get useful and timely responses. For simple, preliminary tabular analysis, the NHANES III data set in the Statistical Export and Tabulation software is available on CD-ROM (PC systems only) for $20 from the Government Printing Office and the National Technical Information Service. For purchasing instructions, go to http://w For complex analyses (i.e., anything other than simple tabulations), users should obtain the ASCII CD-ROM data set. ASCII versions of the data set on CD-ROM are available from Data Dissemination Branch, National Center for Health Statistics, Centers for Disease Control and Prevention, 6525 Belcrest Road, Room 1064, Hyattsville, MD 20782; telephone, 301-436-8500; fax, 301-436-4258; e-mail,

3. Manuals: The CD-ROM includes extensive documentation of operational issues (e.g., how various laboratory tests and examinations were performed). The discussion of statistical issues (e.g., complex sampling design, accounting for nonresponse, and weighting strategies) is very helpful and is a useful resource on its own.

4. Using the data set: To facilitate statistical analyses, we suggest downloading data from the NHANES CD-ROM. Be aware that the data sets are big; our working data set consisted of 33 variables and used about 6 MB. The first step in creating your own data set is to download the records of all individuals in the sample that correspond to the sampling weights being used (see below). Once the data set is loaded into the appropriate software (e.g., Stata [College Station, Texas], SUDAAN [Research Triangle Institute, Research Triangle Park, North Carolina]), one can select a subset of records to correspond to the specific population under analysis.

5. Analyses: To estimate national variables, analyses need to use the proper sampling weights and the primary sampling unit (PSU) and stratum designation to account for the "complex, multi-stage sample design of NHANES." Although use of sampling weights will allow for correct estimates of central tendency or prevalence, the PSU and stratum variables are necessary to correctly estimate variance and standard errors and to perform any testing of statistical significance. Because PSU and stratum designation reflect a person's chance of selection, individuals have only one PSU and stratum variable assigned to them. The analytic and reporting guideline (available in Section V of the reference manuals and reports of the CD-ROM on-line documentation) discusses this issue in detail.

Different research questions require different sampling weights. There are nine sampling weights that, in essence, indicate the number of people represented by each observation, adjust for nonresponse, and compensate for inadequacies in the sample frame (e.g., the number of people without a fixed address). Researchers must select the appropriate sampling weight (from the nine available choices) for the specific variables being analyzed. The sampling weights reflect three categories of examinations: site (the mobile examination center, at home, or both), timing (specimens collected in the morning, afternoon or evening, or any time), and special subsets (allergy or central nervous system disease). We present a simplified version of Table 5.1 from the reference manual of the CD-ROM to help guide selection of the appropriate sampling weight (Appendix Table).

Users need a statistical package that allows the proper use of sampling weights plus stratum and PSU variables to incorporate the complex sample design into the analyses. Although we used the SVY series of commands in Stata 5.0, other packages, such as SUDAAN, offer the same capability.
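As a rough illustration of why the stratum and PSU variables matter, the sketch below estimates a weighted total and its variance using a common textbook approximation (between-PSU variation within each stratum, treating PSUs as sampled with replacement). This is an assumption made for illustration; it is not taken from the NHANES documentation and should not be read as the exact computation performed by Stata or SUDAAN. All records are hypothetical.

```python
from collections import defaultdict


def weighted_total_and_variance(rows):
    """Estimate a weighted total and its design-based variance.

    rows: iterable of (stratum, psu, weight, y), where y is 1 for a case
    and 0 otherwise. The variance is the sum over strata of
    n / (n - 1) * sum((z - mean_z) ** 2), where z are the weighted PSU
    totals in that stratum and n is the number of PSUs.
    """
    psu_totals = defaultdict(lambda: defaultdict(float))
    for stratum, psu, weight, y in rows:
        psu_totals[stratum][psu] += weight * y

    total = 0.0
    variance = 0.0
    for stratum, psus in psu_totals.items():
        z = list(psus.values())
        n = len(z)
        if n < 2:
            raise ValueError("each stratum needs at least two PSUs")
        mean_z = sum(z) / n
        total += sum(z)
        variance += n / (n - 1) * sum((v - mean_z) ** 2 for v in z)
    return total, variance


# Hypothetical records: (stratum, PSU, sampling weight, case indicator).
rows = [
    (1, "A", 100.0, 1), (1, "A", 100.0, 0),
    (1, "B", 200.0, 1), (1, "B", 200.0, 1),
    (2, "A", 150.0, 1),
    (2, "B", 150.0, 0), (2, "B", 50.0, 1),
]

total, variance = weighted_total_and_variance(rows)
standard_error = variance ** 0.5
print(total, variance, round(standard_error, 1))
```

The point estimate depends only on the weights, whereas the variance depends on how observations cluster within PSUs and strata. Ignoring the design and treating the records as a simple random sample would generally give a different, and often too small, standard error.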

Finally, Appendix B of the analytic reporting guidelines gives recommended minimal sample sizes to achieve stable estimates. Even though NHANES is a large sample, multiple levels of stratification lead to small cell size, depending on the analysis. For example, the NHANES III supports analyses stratified by some ethnicities (e.g., non-Hispanic White, non-Hispanic Black, and Mexican-American) but not others (e.g., Native American or Asian-Pacific Islanders). The analytic and reporting guidelines (available on the CD-ROM NHANES III Reference Manuals and Reports, Section V) discuss all of the foregoing issues in detail.