Volume 22, Number 10—October 2016
Dispatch
Daily Reportable Disease Spatiotemporal Cluster Detection, New York City, New York, USA, 2014–2015
Table 1
Feature | Selection | Notes |
---|---|---|
Geographic aggregation | Census tract (defined using US Census 2000 boundaries) of residential address at time of report* | The less data are spatially aggregated, the more precisely areas with elevated rates can be identified. New York City has 2,216 census tracts in an area of 305 square miles. |
Date of interest for analysis | Event date, defined using hierarchy of onset date → diagnosis date (collection date of first specimen testing positive) → report date → date event created in surveillance database | Defining reportable disease clusters according to when case-patients became ill is preferred. However, onset date is missing for most case-patients who have not yet been interviewed, and each case needs a date to be included in analysis. Thus, the best available proxy for onset date is used. Because we use daily data (rather than weekly, monthly, or yearly data), the time precision is specified as day on the SaTScan (http://www.satscan.org/) input tab. The time precision parameter indicates the temporal resolution of the data in the case file. |
Study period | 1 y for most diseases, ending the day before analysis† | One year is a reasonable choice, balancing the need for a period long enough to establish a stable local baseline for each spatial unit, yet short enough to avoid variable secular trends (e.g., geographically different increases in the underlying population of a spatial unit). Analyses are run each morning using data with event dates through the previous day. |
Case inclusion criteria | Include all reported cases, regardless of current status (e.g., confirmed, probable, suspected, pending, noncase)† | Depending on the disease, cases initially might be assigned a transient pending status and, upon investigation, be reclassified as a case (confirmed, probable, or suspected) or a noncase. Timeliness is preserved by analyzing all reported cases, including noncases and pending cases, regardless of whether they ultimately will be confirmed. By analyzing all reported cases, case inclusion criteria are consistent across the study period. If instead the case file were restricted to confirmed and pending cases, then analyses would be biased toward false signaling, as some cases with an initial pending status will be ultimately reclassified after investigation as a noncase. This reclassification process is complete for the baseline but ongoing for the current period of interest (1), and the speed of reclassification might vary geographically. |
Day-of-week variable | Include a variable that indicates the day of the week (1–7) | The analysis automatically adjusts for day-of-week effects but not for space by day-of-week interaction. Including this variable in the SaTScan case file accounts for how the daily pattern of health-seeking behavior and diagnoses might vary geographically. |
*Exception to residential address at time of report: if the residential address is not geocodable (e.g., because the case-patient is not a resident of the city or because a post office box is reported instead of a street address), then the geocoded work address, if available, is substituted.
†For exceptions, see online Technical Appendix (http://wwwnc.cdc.gov/EID/article/22/10/16-0097-Techapp1.pdf).
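The event-date hierarchy, study period, and day-of-week variable in Table 1 can be illustrated with a minimal Python sketch. It is not the authors' code. The field names (`onset_date`, `diagnosis_date`, `report_date`, `created_date`, `tract`), the helper functions, the 365-day reading of "1 y", and the Monday = 1 numbering of weekdays are all assumptions made for illustration. The row layout (location ID, case count, date, covariate) follows the usual shape of a SaTScan case file, but the exact file format should be checked against the SaTScan documentation.

```python
from datetime import date, timedelta

# Order of preference from Table 1. The dictionary keys are hypothetical field names.
DATE_HIERARCHY = ("onset_date", "diagnosis_date", "report_date", "created_date")


def event_date(record):
    """Return the first non-missing date in the hierarchy (best proxy for onset)."""
    for field in DATE_HIERARCHY:
        value = record.get(field)
        if value is not None:
            return value
    raise ValueError("record has no usable date")


def study_window(analysis_date, days=365):
    """Return (start, end) of a window of `days` days ending the day before analysis.

    Assumption: "1 y" is treated as 365 days.
    """
    end = analysis_date - timedelta(days=1)
    start = end - timedelta(days=days - 1)
    return start, end


def case_file_row(record):
    """Build one case row: (tract, case count, date at day precision, day of week).

    Assumption: day of week is numbered 1 (Monday) through 7 (Sunday);
    the table specifies only the range 1-7.
    """
    d = event_date(record)
    return (record["tract"], 1, d.strftime("%Y/%m/%d"), d.isoweekday())


# Hypothetical case-patient not yet interviewed: onset date is missing,
# so the diagnosis date is used as the event date.
example = {
    "tract": "TRACT_A",  # made-up identifier
    "onset_date": None,
    "diagnosis_date": date(2015, 3, 4),
    "report_date": date(2015, 3, 6),
    "created_date": date(2015, 3, 6),
}
row = case_file_row(example)
window = study_window(date(2015, 3, 5))
```

In this sketch every reported record produces a row regardless of case status, in line with the inclusion criteria in the table; filtering to the study window would be applied to the event date returned by `event_date`.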
Page created: September 19, 2016
Page updated: September 19, 2016
Page reviewed: September 19, 2016