Integrating Socioeconomic Determinants with Graph and Transformer Models for Equitable Public Health Forecasting


Sreeja Poduri

Abstract

This study presents a socioeconomic-aware, county-level public health forecasting framework that integrates data from the U.S. Census, Medicare, and the CDC Social Vulnerability Index (SVI) to predict health outcomes and identify disparities across regions. The proposed model uses deep autoencoders to capture latent socioeconomic patterns, Graph Neural Networks (GNNs) to represent inter-county relationships, and transformer-based temporal modeling to predict dynamic health trends. A fairness-aware loss function is designed to promote equitable performance for disadvantaged counties by reducing prediction bias across vulnerable populations. In the experiments, the Random Forest baseline outperformed Linear Regression, achieving a lower MAE (~90 vs. 98) with a comparable RMSE (~105), while fairness optimization reduced error for vulnerable counties by approximately 70%. Feature importance analysis identified Obesity Rate (%) and Broadband Coverage (%) as the dominant predictors, highlighting the intersection of health behavior and digital access. Policy simulations further indicated that a 10% increase in broadband coverage could lower predicted hospitalization rates by up to 3% in several counties. Overall, the results support the framework's ability to combine accuracy, interpretability, and fairness, providing a scalable, data-driven tool for equitable public health planning and resource allocation.
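The abstract does not state the form of the fairness-aware loss, so the following is only a minimal sketch of one common construction: overall mean squared error plus a penalty on the gap between the error of vulnerable and non-vulnerable counties. The function name `fairness_aware_loss`, the boolean `vulnerable` flag, the weight `lam`, and the toy numbers are all assumptions made for illustration, not details taken from the article.

```python
from typing import Sequence


def fairness_aware_loss(
    y_true: Sequence[float],
    y_pred: Sequence[float],
    vulnerable: Sequence[bool],
    lam: float = 1.0,
) -> float:
    """Overall MSE plus lam times the absolute gap in group-wise MSE.

    Hypothetical form: the article does not specify its loss function.
    """
    # Squared error for each county.
    sq_err = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    overall = sum(sq_err) / len(sq_err)

    # Split the errors by the (assumed) vulnerability flag.
    vul = [e for e, v in zip(sq_err, vulnerable) if v]
    non = [e for e, v in zip(sq_err, vulnerable) if not v]
    if not vul or not non:
        return overall  # the gap is undefined when only one group is present

    # Penalize the difference between the two groups' mean squared error.
    gap = abs(sum(vul) / len(vul) - sum(non) / len(non))
    return overall + lam * gap


# Toy values for illustration only, not data from the study.
y_true = [100.0, 120.0, 90.0, 110.0]
y_pred = [98.0, 121.0, 80.0, 104.0]
vulnerable = [False, False, True, True]

loss = fairness_aware_loss(y_true, y_pred, vulnerable, lam=0.5)
print(loss)
```

With `lam=0` the function reduces to plain MSE. Larger values of `lam` trade some overall accuracy for a smaller error gap between the two groups, which is the kind of trade-off a fairness-aware objective is meant to expose.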

Article Details

How to Cite
Sreeja Poduri, “Integrating Socioeconomic Determinants with Graph and Transformer Models for Equitable Public Health Forecasting”, Int. J. Comput. Eng. Res. Trends, vol. 10, no. 11, pp. 70–81, Nov. 2023.
Section
Research Articles
