Real-Time Predictive Analytics for Early Homelessness Prevention: A
Machine Learning Approach
Affiliations
1 Department of Business Administration, Westcliff University, 17877 Von Karman Ave, 4th floor, Irvine, CA 92614, USA
2 Department of Electrical and Electronic Engineering, Ahsanullah University of Science and Technology, Tejgaon, Dhaka-1208, Bangladesh
3 Department of Engineering Management, Trine University, University Ave, Angola, IN 46703, USA
4 Department of Marketing Analytics and Insights, Wright State University, 3640 Colonel Glenn Hwy, Dayton, OH 45435, USA
5 Department of Business Administration, International American University, 3440 Wilshire Blvd, STE 1000, Los Angeles, CA 90010
Abstract
Homelessness is a complex and persistent societal issue, often exacerbated by economic
instability, housing shortages, and systemic inequities. Existing strategies rely primarily on
reactive interventions, which, while essential, do not prevent homelessness before it occurs.
This study presents a novel machine learning-based framework for early homelessness
prediction, integrating key socioeconomic, housing, and public health indicators. Utilizing a
real-world dataset, we compare the predictive performance of two machine learning models,
Random Forest and XGBoost, to assess their effectiveness in identifying high-risk populations.
The results demonstrate that the Random Forest model consistently outperforms XGBoost,
achieving a lower Mean Absolute Error (MAE) of 12.46, a lower Mean Squared Error (MSE) of
44,534.73, and a higher R² score of 0.996, indicating a superior fit. Feature importance analysis
reveals that total homeless counts (pit_tot_hless_pit_hud) and individual homelessness rates are
the most critical predictive factors, while economic conditions and housing market pressures also
play significant roles. Furthermore, residual analysis and error distribution comparisons
illustrate that the Random Forest model maintains a more stable and consistent predictive
capability across different demographic and geographic groups. Our research stands apart by
integrating a high-dimensional, multi-source dataset to enhance predictive accuracy while
addressin...
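The three evaluation metrics quoted above (MAE, MSE, and R²) can be illustrated with a minimal sketch in plain Python. The function names and toy values below are assumptions made for illustration only; they are not the study's code or data.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average squared difference; large errors are penalized more heavily."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - (residual SS / total SS)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy values (not from the study's dataset).
y_true = [100.0, 200.0, 300.0, 400.0]
y_pred = [110.0, 190.0, 310.0, 390.0]

mae = mean_absolute_error(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)
r2 = r2_score(y_true, y_pred)
```

Because MSE squares each error, a small number of large misses can dominate it. This is one way a model can report a low MAE alongside a much larger MSE.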
Keywords:
Homelessness, machine learning, XGBoost, Random Forest