From Text to Insights: NLP-Driven Classification of Infectious Diseases Based on Ecological Risk Factors

  • Saviour Inyang, Akwa Ibom State University
  • Imeh Umoren, Akwa Ibom State University
Keywords: Classification, Natural Language Processing, Ecology, Infectious Disease


Many factors influence the emergence of infectious diseases. Some arise from natural processes, such as the gradual emergence of viruses over time, while others result from human activity. Human activities are an integral part of the ecosystem, and their ecological dimension in particular can encourage disease transmission. Health ecologists examine changes in biological, physical, social, and economic settings to understand how these changes affect the mental and physical well-being of individuals. This research therefore adopts a Framework-Based Method (FBM) for classifying infectious diseases. The FBM outlines every phase of the classification process, providing a structured and reproducible approach. Results, reported as confusion-matrix accuracy and Kappa for each model, are: XGB, accuracy 76%, Kappa 73%; RF, accuracy 65%, Kappa 60%; SVM, accuracy 63%, Kappa 58%; ANN, accuracy 71%, Kappa 67%; LDA, accuracy 76%, Kappa 73%; GBM, accuracy 60%, Kappa 53%; KNN, accuracy 43%, Kappa 34%; and DT, accuracy 37%, Kappa 29%. Furthermore, the deep learning model BERT was integrated with the best classification model, XGBoost, to create an interactive interface through which users can classify infectious diseases. This integration enhances user experience and accessibility, contributing to the practical application of machine learning and natural language processing in ecological disease classification.
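
For readers unfamiliar with the two metrics reported above, the following minimal Python sketch shows how accuracy and a Kappa statistic can be derived from a confusion matrix. It assumes the reported Kappa is Cohen's kappa, which the abstract does not state explicitly. The function name accuracy_and_kappa and the 3-class matrix are hypothetical illustrations, not the authors' code or data.

```python
from typing import List, Tuple


def accuracy_and_kappa(cm: List[List[int]]) -> Tuple[float, float]:
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    Rows are true classes, columns are predicted classes.
    """
    n = sum(sum(row) for row in cm)
    if n == 0:
        raise ValueError("confusion matrix contains no samples")
    k = len(cm)

    # Observed agreement: share of samples on the diagonal (this is accuracy).
    observed = sum(cm[i][i] for i in range(k)) / n

    # Agreement expected by chance, from the row and column marginals.
    row_totals = [sum(row) for row in cm]
    col_totals = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)

    if expected == 1.0:
        # Kappa is undefined when chance agreement is already perfect.
        return observed, float("nan")

    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical 3-class confusion matrix, for illustration only.
example_cm = [
    [8, 1, 1],
    [2, 7, 1],
    [1, 1, 8],
]
acc, kappa = accuracy_and_kappa(example_cm)
print(f"accuracy={acc:.3f}, kappa={kappa:.3f}")
```

Kappa corrects accuracy for the agreement expected by chance, which is why each model's Kappa in the results is lower than its accuracy.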

Author Biography

Imeh Umoren, Akwa Ibom State University

Department of Computer Science


