Intelligent Urban Parking Recommendation System using Spatiotemporal Graph Neural Network, Large Language Model and Vision Assistant

Authors

    Muhtada Zuhair Ali, Department of Electrical and Computer Engineering, Urmia University, Urmia, Iran
    Jamshid Bagherzadeh, Professor, Department of Electrical and Computer Engineering, Urmia University, Urmia, Iran
    Parviz Rashidi-Khazaee *, Assistant Professor, Department of Information Technology and Computer Engineering, Urmia University of Technology, Urmia, Iran, p.rashidi@uut.ac.ir

Keywords:

Smart Parking, Urban Traffic Management, Parking Recommendation, Spatiotemporal Graph Learning, Kolmogorov–Arnold Networks, Temporal Transformer, LLMaVA, YOLOv12

Abstract

The growing imbalance between urban parking demand and available capacity leads to excessive cruising, traffic congestion, and unnecessary emissions, which collectively degrade urban traffic efficiency. To address this challenge, this paper proposes STGNN-LLMaVA, a multimodal framework for intelligent urban parking recommendation that combines a Spatiotemporal Graph Neural Network (STGNN) with a Large Language Model and Vision Assistant (LLMaVA) to jointly model visual perception, semantic context, and dynamic spatiotemporal dependencies. Parking-slot occupancy is inferred from surveillance images using You Only Look Once version 12 (YOLOv12). In parallel, LLMaVA generates compact semantic and temporal descriptions of each scene, capturing factors such as congestion, visibility, and surrounding activity. These vision- and language-derived features are embedded into parking-node representations and processed by a GraphKAN–Temporal Transformer backbone. In this backbone, a Kolmogorov–Arnold Network models nonlinear spatial interactions, while a causal Temporal Transformer captures evolving availability patterns to produce top-k parking recommendations. Experiments on four real-world parking datasets show that STGNN-LLMaVA consistently outperforms strong Graph Neural Network (GNN)-based and Large Language Model (LLM)-based baselines, with improvements of up to 24.7% in HitRate@10, 10.5% in NDCG@10, and 9.8% in MRR@10. These results indicate that integrating vision-based occupancy estimation, language-driven contextual reasoning, and spatiotemporal graph learning provides an effective, scalable, and data-efficient solution for sustainable smart parking management.
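The three ranking metrics reported above can be illustrated with a minimal sketch. It assumes binary relevance (a parking lot is either a correct recommendation or not) and uses the textbook definitions of HitRate@k, MRR@k, and NDCG@k; the abstract does not state the paper's exact formulation, and the function names and lot identifiers below are hypothetical.

```python
import math


def hit_rate_at_k(ranked, relevant, k=10):
    """Return 1.0 if any relevant item appears in the top-k list, else 0.0."""
    return 1.0 if any(item in relevant for item in ranked[:k]) else 0.0


def mrr_at_k(ranked, relevant, k=10):
    """Reciprocal rank of the first relevant item within the top k (0.0 if none)."""
    for rank, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked, relevant, k=10):
    """Binary-relevance NDCG: discounted gain divided by the ideal discounted gain."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, item in enumerate(ranked[:k], start=1)
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical example: the model ranks four lots; only "P1" was actually suitable.
ranked_lots = ["P3", "P7", "P1", "P9"]
suitable = {"P1"}
print(hit_rate_at_k(ranked_lots, suitable))  # 1.0
print(mrr_at_k(ranked_lots, suitable))       # 0.333... (first hit at rank 3)
print(ndcg_at_k(ranked_lots, suitable))      # 0.5
```

In practice each metric would be averaged over all test queries; the per-query form is shown here only to make the definitions concrete.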

Published

2026-09-01

Submitted

2026-02-02

Revised

2026-06-03

Accepted

2026-06-10

Section

Articles

How to Cite

Zuhair Ali, M., & Bagherzadeh, J. (2026). Intelligent Urban Parking Recommendation System using Spatiotemporal Graph Neural Network, Large Language Model and Vision Assistant. Management Strategies and Engineering Sciences, 1-17. https://msesj.com/index.php/mses/article/view/349
