Understanding the Complexities of Public Health through Data Mining

Christopher Stephens

C3 - Centro de Ciencia de la Complejidad - UNAM

Disease is a property of complex adaptive systems (CAS). Unfortunately, there is currently no adequate theoretical framework within which to understand either complexity or adaptation. In the absence of theory per se, phenomenology and taxonomy become even more important. In the last few decades the data revolution has made available vast quantities of data associated with CAS and, in particular, with disease and public health, with which we can potentially develop a much better understanding of the dynamics of CAS. However, the vast majority of this data is "non-scientific" or "coincidental", in the sense that it does not come from controlled scientific experiments designed to examine specific hypotheses. In this talk I will argue that data mining, as a phenomenological approach to understanding and modeling CAS, offers the most promise for developing a better understanding of them in practical settings such as public health and, indeed, that in its automated form it is the only feasible way of analyzing the exponentially increasing amounts of data that are becoming available.

I will illustrate these points using data sets associated with three important diseases: influenza (or influenza-like illness, ILI), type 2 diabetes and Leishmaniasis, an important emerging zoonosis. For influenza, I will discuss what we can learn from citizen participation systems, using data mining of risk factors for ILI in the data associated with the Mexican REPORTA system. I will also discuss the difficulties of symptomatic diagnosis. For type 2 diabetes, I will use analyses of large-scale public health surveys in Mexico to discuss risk factors for diabetes and the relationship between nutrition, obesity and diabetes, showing that quantity, not quality, is the most important factor for obesity. Finally, in the case of Leishmaniasis, I will show how large numbers of spatial data sets can be mined by converting them into complex inference networks which, in the case of species distributions, can then be used to predict reservoirs of Leishmaniasis or other emerging diseases.

Contents © 2013 Flávio Codeço Coelho - Powered by Nikola