Replication based Fault Tolerant Algorithm for Component Based Distributed System
Abstract
Component-based distributed systems are growing rapidly because they are interoperable and allow components to be reused in a location-transparent manner. When components owned by third parties are reused, reliable mechanisms are indispensable for the smooth functioning of the system. Different kinds of faults may occur once a distributed system is deployed and running, so the system must tolerate faults and continue to serve its clients reliably. Replication of server components is one strategy for fault tolerance; however, it must be applied with care because it introduces overhead. In this paper, we propose a replication-based fault-tolerant algorithm for component-based distributed systems, named the Replication based Reliable and Fault Tolerant Algorithm (RRFTA). The algorithm uses a replication manager and fault detectors at the local and global levels. The distributed system used as the case study for the empirical evaluation has a number of server components running on different machines at the server side. The proposed algorithm ensures that replicas are used appropriately to keep the system functioning smoothly. The experimental results show the efficiency of the proposed algorithm in terms of its replication strategy.
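To make the roles named in the abstract more concrete, the following is a minimal illustrative sketch of how a replication manager might cooperate with local and global fault detectors. It is not the RRFTA algorithm itself, whose details the abstract does not give. The heartbeat-timeout detection rule, the failover policy (promote the first healthy replica), and all names (`Replica`, `LocalFaultDetector`, `failed_hosts`, `ReplicationManager`) are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    """A server component replica (hypothetical model)."""
    name: str
    host: str
    last_heartbeat: float = 0.0


class LocalFaultDetector:
    """Assumed local detector: suspects replicas whose heartbeat is stale."""

    def __init__(self, timeout: float):
        self.timeout = timeout

    def suspected(self, replicas, now: float):
        return [r for r in replicas if now - r.last_heartbeat > self.timeout]


def failed_hosts(replicas, suspected):
    """Assumed global view: a host has failed if every replica on it is suspected."""
    suspected_names = {r.name for r in suspected}
    hosts = {r.host for r in replicas}
    return {
        h for h in hosts
        if all(r.name in suspected_names for r in replicas if r.host == h)
    }


class ReplicationManager:
    """Tracks replicas and fails over to a healthy one when the primary is suspected."""

    def __init__(self, replicas, timeout: float):
        self.replicas = list(replicas)
        self.detector = LocalFaultDetector(timeout)
        self.primary = self.replicas[0]

    def heartbeat(self, name: str, now: float) -> None:
        # Record that the named replica reported in at time `now`.
        for r in self.replicas:
            if r.name == name:
                r.last_heartbeat = now

    def check(self, now: float) -> Replica:
        # Promote the first healthy replica if the current primary is suspected.
        suspected = {r.name for r in self.detector.suspected(self.replicas, now)}
        if self.primary.name in suspected:
            healthy = [r for r in self.replicas if r.name not in suspected]
            if not healthy:
                raise RuntimeError("no healthy replica available")
            self.primary = healthy[0]
        return self.primary
```

In this sketch, clients would always be routed to `ReplicationManager.check(now)`, so a stale primary is replaced transparently. A real system would also need state transfer between replicas and a way to keep the replication overhead bounded, which the sketch does not address.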
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.