The prevalence and incidence of chronic conditions have implications for policy and healthcare utilization. Valid information about risk factors is important in reducing the burden of chronic diseases. Although systems that rank the strength of recommendations about effective interventions rate all evidence from observational studies as low, the prevalence of and risk factors for chronic diseases can be evaluated only in observational studies. Public policy decisions should be based on applicable and unbiased results from high-quality studies. Assessing the quality of observational studies is an important part of evidence-based reports prepared for the Agency for Healthcare Research and Quality (AHRQ). An extensive review examined all available systems for rating the strength of scientific evidence and concluded that future efforts need to identify valid and reliable quality ratings for observational studies. Methodological aspects such as selective treatment assignment, access to health care, or provider characteristics may differ in importance between studies that examine treatment effects and studies that examine the prevalence of chronic conditions or the association of disease risk factors with patient mortality and morbidity. Therefore, quality evaluation that is part of grading a body of evidence must be tailored to the methodological aspects and quality standards of nontherapeutic observational studies. The present collaborative project sought to develop valid and reliable quality criteria for observational studies that examine the incidence or prevalence of chronic conditions and risk factors for diseases. We propose criteria for the design, reporting standards, and assessment of nontherapeutic observational studies in systematic reviews and evidence-based reports.
We developed two checklists, one for studies of incidence or prevalence and another for studies of risk factors, based on our literature review and in collaboration with experts from other Evidence-based Practice Centers and the Centers for Disease Control and Prevention. The protocol for constructing the checklists was based on a conceptual model for developing indexes, rating scales, and other appraisals that describe and measure symptoms, physical signs, and other clinical phenomena in clinical medicine. We defined external validity as the extent to which the results of a study can be generalized to the target population. Applicability may differ from external validity in how the target population is defined: well-designed studies from other countries can have good external validity yet low applicability to the U.S. population. We defined internal validity as the extent to which the results of a study are correct for the subjects studied and the associations detected are truly caused by the exposure. We defined the biases that the checklists should address but avoided labeling them in quality evaluation, both because definitions of biases differ and because previously labeled biases (selection, information, differential verification, context, treatment paradox, disease progression, and others) apply to interventional studies.