
A Hybrid Fault-Tolerant Software-Defined Networking Architecture for Reliable Operation of Critical Infrastructure Networks

Mission Franklin

Abstract


Critical infrastructure networks such as power grids, industrial control systems, and transportation platforms require highly reliable and resilient communication frameworks to ensure uninterrupted operation. Software-Defined Networking (SDN) has emerged as a promising paradigm for managing such networks due to its centralized control, programmability, and global network visibility. However, the separation of control and data planes in SDN introduces reliability challenges, particularly the risk of controller failures, control-plane bottlenecks, and delayed recovery from network faults, which are unacceptable in mission-critical environments. This study proposes a fault-tolerant SDN architecture designed to enhance the reliability and operational continuity of critical infrastructure networks. The architecture integrates a distributed controller framework with state replication, proactive failure detection, and fast data-plane recovery mechanisms to mitigate the impact of controller, link, and switch failures. A hybrid recovery strategy is adopted, combining local fast rerouting at the data plane with coordinated control-plane failover to minimize service disruption and recovery latency. The proposed architecture is evaluated through simulation under multiple fault scenarios representative of critical infrastructure environments. Performance metrics such as recovery time, packet loss, throughput, controller failover latency, and network availability are analysed and compared with conventional single-controller SDN architectures. Results demonstrate significant improvements in fault recovery speed, service availability, and resilience under both control-plane and data-plane failures. The findings highlight the potential of fault-tolerant SDN architectures as a viable solution for enhancing the reliability of critical infrastructure networks.
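The hybrid recovery strategy described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class names (`Switch`, `ControllerCluster`), the heartbeat timeout, and the priority-ordered master selection are all assumptions chosen for illustration. The sketch only shows the division of labour the abstract describes, where the data plane reroutes locally while the control plane fails over between replicated controllers.

```python
from dataclasses import dataclass, field


@dataclass
class Switch:
    """Hypothetical data-plane element with a precomputed backup port."""
    name: str
    primary_port: int
    backup_port: int
    live_ports: set = field(default_factory=set)

    def output_port(self):
        # Local fast reroute: the switch picks the backup port itself,
        # without a round trip to the controller.
        for port in (self.primary_port, self.backup_port):
            if port in self.live_ports:
                return port
        return None  # both paths down; a controller must compute a new route


@dataclass
class ControllerCluster:
    """Hypothetical replicated controllers monitored by heartbeats."""
    controllers: list          # listed in failover priority order (assumption)
    heartbeat_timeout: float   # seconds of silence before a controller is considered failed
    last_heartbeat: dict = field(default_factory=dict)

    def heartbeat(self, controller, now):
        self.last_heartbeat[controller] = now

    def master(self, now):
        # Control-plane failover: the first controller whose heartbeat
        # is still fresh acts as master.
        for c in self.controllers:
            seen = self.last_heartbeat.get(c)
            if seen is not None and now - seen <= self.heartbeat_timeout:
                return c
        return None


# Link failure: the primary port goes down, the switch falls back locally.
sw = Switch("s1", primary_port=1, backup_port=2, live_ports={1, 2})
sw.live_ports.discard(1)
port_after_link_failure = sw.output_port()

# Controller failure: c1 stops sending heartbeats, c2 takes over.
cluster = ControllerCluster(["c1", "c2"], heartbeat_timeout=3.0)
cluster.heartbeat("c1", now=0.0)
cluster.heartbeat("c2", now=0.0)
cluster.heartbeat("c2", now=4.0)
master_after_timeout = cluster.master(now=5.0)

print(port_after_link_failure, master_after_timeout)
```

In this sketch the link failure is handled entirely inside `Switch.output_port`, so recovery time does not depend on controller reachability, while `ControllerCluster.master` models the slower, coordinated failover path. A real deployment would also need the state replication between controllers that the abstract mentions, which is omitted here.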




