CityPulse: Proof-of-Concept for Real-Time Traffic Data Analytics and Congestion Prediction
Abstract
CityPulse is a proof-of-concept big data pipeline for real-time urban mobility analytics, built from scalable, containerized components and requiring no physical sensor infrastructure. The system simulates the ingestion of 11 million traffic-related records describing urban phenomena such as vehicle congestion, GPS coordinates, and weather conditions. Records are ingested through a Dockerized Apache Kafka cluster coordinated by ZooKeeper and processed in real time with Apache Spark Structured Streaming. To remain robust under load, the architecture adds a temporary storage layer that buffers Spark output before it is committed to a centralized data warehouse. This design improves write efficiency and fault tolerance and enables batch processing of intermediate results. The refined data feeds a lightweight machine learning module and is served through a Flask backend with a React-based frontend for visualization and interaction. Stress testing shows that the system sustains a throughput of over 300K records/min, with latency rising by only 10% under full load. With its modular Docker-based deployment, CityPulse offers a cost-effective, reproducible analytics solution for traffic congestion monitoring in resource-constrained environments, particularly in developing regions such as Cameroon. Because the proof of concept relies on synthetic traffic data, its findings depend on assumptions about data realism and may not reflect all the complexities and uncertainties of real-world sensor deployments.
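The buffering idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the StagingBuffer class, the commit callback, the batch size of 3, and the record fields (segment_id, congestion) are all hypothetical names chosen for illustration, and a plain Python list stands in for the data warehouse. The sketch only shows the general pattern of accumulating processed records and committing them in batches, keeping them pending if a commit fails.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Record = Dict[str, float]


@dataclass
class StagingBuffer:
    """Hypothetical staging layer: collects records and commits them in batches."""

    commit: Callable[[List[Record]], None]  # assumed warehouse writer interface
    batch_size: int = 3                     # illustrative value, not from the paper
    pending: List[Record] = field(default_factory=list, init=False)

    def add(self, record: Record) -> None:
        """Stage one record and flush once a full batch has accumulated."""
        self.pending.append(record)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        """Commit all staged records as one batch."""
        if not self.pending:
            return
        # If commit raises, the records stay in `pending` and can be retried.
        self.commit(list(self.pending))
        self.pending.clear()


# Usage: a list stands in for the centralized data warehouse.
warehouse: List[List[Record]] = []
buffer = StagingBuffer(commit=warehouse.append, batch_size=3)

for i in range(7):
    buffer.add({"segment_id": float(i), "congestion": 0.1 * i})
buffer.flush()  # commit the final partial batch

batch_sizes = [len(batch) for batch in warehouse]
```

With seven records and a batch size of three, the stand-in warehouse receives two full batches and one partial batch instead of seven individual writes, which is the kind of write-efficiency gain the abstract attributes to the temporary storage layer.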
Article Details

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
IJCERT Policy:
The published work presented in this paper is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license. This means that the content of this paper can be shared, copied, and redistributed in any medium or format, as long as the original author is properly attributed. Additionally, any derivative works based on this paper must also be licensed under the same terms. This licensing agreement allows for broad dissemination and use of the work while maintaining the author's rights and recognition.
By submitting this paper to IJCERT, the author(s) agree to these licensing terms and confirm that the work is original and does not infringe on any third-party copyright or intellectual property rights.