Real-Time Predictive Analytics for Early Homelessness Prevention: A
Machine Learning Approach
Affiliations
1 Business Administration, Westcliff University, N/A
2 Electrical and Electronic Engineering, Ahsanullah University of Science and Technology, N/A
3 Engineering and Technology, Khulna University, N/A
Abstract
Homelessness is a complex and persistent societal issue, often exacerbated by economic
instability, housing shortages, and systemic inequities. Existing strategies primarily rely on
reactive interventions, which, while essential, fail to provide proactive solutions for prevention.
This study presents a novel machine learning-based framework for early homelessness
prediction, integrating key socioeconomic, housing, and public health indicators. Using a
real-world dataset, we compare the predictive performance of two machine learning models,
Random Forest and XGBoost, to assess their effectiveness in identifying high-risk populations.
The results demonstrate that the Random Forest model consistently outperforms XGBoost,
achieving a lower Mean Absolute Error (MAE) of 12.46, a lower Mean Squared Error (MSE) of
44,534.73, and a higher R² score of 0.996, indicating a superior fit. Feature importance analysis
reveals that total homeless counts (pit_tot_hless_pit_hud) and individual homelessness rates are
the most critical predictive factors, while economic conditions and housing market pressures also
play significant roles. Furthermore, residual analysis and error distribution comparisons
show that the Random Forest model maintains more stable and consistent predictive
performance across different demographic and geographic groups. Our research stands apart by
integrating a high-dimensional, multi-source dataset to enhance predictive accuracy while
addressin...
Keywords:
Homelessness, machine learning, XGBoost, Random Forest
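
For readers unfamiliar with the three evaluation metrics named in the abstract (MAE, MSE, and R²), the following is a minimal sketch of their standard definitions. The function names and the toy values are hypothetical, chosen only for illustration; they are not drawn from the study's dataset and this is not the authors' evaluation code.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average squared difference; large errors are penalized more heavily."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy data: observed vs. predicted homeless counts for four regions.
y_true = [100.0, 250.0, 400.0, 1200.0]
y_pred = [110.0, 240.0, 390.0, 1260.0]

mae = mean_absolute_error(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)
r2 = r2_score(y_true, y_pred)
```

Because MSE squares each error, it is always at least the square of MAE, and a large gap between the two usually means that a small number of large errors dominate. In this toy example, the single 60-unit miss accounts for most of the MSE.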