Comparing adaptive and fixed bandwidth-based kernel density estimates in spatial cancer epidemiology

Authors: Lemke, Dorothea
Mattauch, Volkmar Robert
Heidinger, Oliver
Pebesma, Edzer J.
Hense, Hans-Werner
Division/Institute: FB 05: Medizinische Fakultät
FB 14: Geowissenschaften
Document types: Article
Media types: Text
Publication date: 2015
Date of publication on miami: 14.04.2015
Modification date: 09.01.2023
Edition statement: [Electronic ed.]
Source: International Journal of Health Geographics 14 (2015) 15, 1-10
DDC Subject: 610: Medicine and health
License: CC BY 4.0
Language: English
Notes: Funded by the Open Access Publication Fund 2014/2015 of the Deutsche Forschungsgemeinschaft (DFG) and the Westfälische Wilhelms-Universität Münster (WWU Münster).
Format: PDF document
ISSN: 1476-072X
URN: urn:nbn:de:hbz:6-49299486816
Other Identifiers: DOI: 10.1186/s12942-015-0005-9
Permalink: https://nbn-resolving.de/urn:nbn:de:hbz:6-49299486816
Digital documents: s12942-015-0005-9.pdf

Background: Monitoring spatial disease risk (e.g. identifying risk areas) is of great relevance in public health research, especially in cancer epidemiology. A common strategy uses case-control studies and estimates a spatial relative risk function (sRRF) via kernel density estimation (KDE). This study was set up to evaluate sRRF estimation methods by comparing fixed with adaptive bandwidth-based KDE and assessing how well each detects ‘risk areas’, using case data from a population-based cancer registry.

Methods: The sRRFs were estimated within a defined area, using locational information on incident cancer cases and on a spatial sample of controls drawn from a high-resolution population grid that is known to underestimate the resident population in urban centers. The spatial extent of the areas with underestimated resident population was quantified with population reference data and used in this study as ‘true risk areas’. Sensitivity and specificity analyses were conducted by spatial overlay of the ‘true risk areas’ and the significant (α = .05) p-contour lines obtained from the sRRF.

Results: We observed that the fixed bandwidth-based sRRF behaved conservatively in identifying these urban ‘risk areas’, that is, it showed reduced sensitivity but increased specificity, due to oversmoothing, as compared to the adaptive risk estimator. In contrast, the latter appeared more competitive through variance stabilization, resulting in a higher sensitivity, while its specificity was equal to that of the fixed risk estimator. Halving the originally determined bandwidths led to a simultaneous improvement of sensitivity and specificity for the adaptive sRRF, while the specificity was reduced for the fixed estimator.

Conclusion: The fixed risk estimator tends to oversmooth in urban areas while overestimating the risk in rural areas. The use of an adaptive bandwidth regime attenuated this pattern but led in general to a higher false positive rate because, in our study design, the majority of true risk areas were located in urban areas. However, there is a strong need for further optimizing the bandwidth selection methods, especially for the adaptive sRRF.
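
The comparison described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the record does not name the software they used. It assumes a Gaussian kernel, Abramson-style adaptive bandwidths (each point's bandwidth scaled by the inverse square root of a pilot density), and a regular evaluation grid supplied as an array of coordinates. All function and parameter names (gaussian_kde_2d, abramson_bandwidths, log_relative_risk, sensitivity_specificity, h0, pilot_h) are hypothetical. The significance step behind the p-contour lines is not reproduced; the overlay function only expects a boolean mask of flagged grid cells.

```python
import numpy as np

def gaussian_kde_2d(points, grid, bandwidths):
    """Gaussian KDE evaluated at grid locations.

    bandwidths: a scalar (fixed KDE) or one value per point (adaptive KDE).
    points has shape (N, 2), grid has shape (G, 2).
    """
    h = np.broadcast_to(np.asarray(bandwidths, dtype=float), (len(points),))
    # Squared distance between every grid location and every point: (G, N)
    d2 = ((grid[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
    kernels = np.exp(-0.5 * d2 / h**2) / (2.0 * np.pi * h**2)
    return kernels.mean(axis=1)

def abramson_bandwidths(points, h0, pilot_h):
    """Per-point bandwidths: smaller where the pilot density is high."""
    pilot = gaussian_kde_2d(points, points, pilot_h)
    gamma = np.exp(np.mean(np.log(pilot)))  # geometric mean of pilot density
    return h0 * (pilot / gamma) ** -0.5

def log_relative_risk(cases, controls, grid, h_cases, h_controls):
    """Log ratio of case density to control density on the grid."""
    f = gaussian_kde_2d(cases, grid, h_cases)
    g = gaussian_kde_2d(controls, grid, h_controls)
    return np.log(f) - np.log(g)

def sensitivity_specificity(flagged, truth):
    """Overlay flagged cells with 'true risk area' cells (boolean arrays)."""
    flagged = np.asarray(flagged, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.sum(flagged & truth)
    fn = np.sum(~flagged & truth)
    tn = np.sum(~flagged & ~truth)
    fp = np.sum(flagged & ~truth)
    return tp / (tp + fn), tn / (tn + fp)
```

Passing a scalar bandwidth to log_relative_risk gives the fixed estimator, and passing the output of abramson_bandwidths for cases and controls gives an adaptive one. In this sketch, halving the bandwidths as described in the Results corresponds to halving the scalar, or h0.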