White Paper

Demonstrating Segment Routing Potential for Utility IP/MPLS Networks

Utility operational technology (OT) networks often encounter complexities in managing communications from substations to control centers, traditionally relying on protocols such as Resource Reservation Protocol (RSVP). To examine how utilities can improve network scalability and efficiency, this paper compares the performance of segment routing (SR) with that of RSVP with traffic engineering extensions (RSVP-TE) in a controlled laboratory setup. The results indicate that SR is a practical option for modernizing utility communication infrastructure.


Technology Over Time

In the mid-1980s and early 1990s, utilities began deploying private operational technology (OT) networks to deliver communications from substations to control centers. The systems often used Synchronous Optical Network (SONET) technology for resilient communications for the latency-sensitive critical applications required for grid operations. Those networks served the industry well for almost 25 years.

Information technology (IT) networks advanced during this same period using packet-based solutions that take advantage of modern protocols. The resulting networks perform like SONET but offer capacity and features for future growth.

Utility OT networks have since transformed legacy SONET and time-division multiplexing (TDM) networks into modern internet protocol (IP) and Ethernet networking solutions. A standard technology that emerged from these modernization efforts is internet protocol multiprotocol label switching (IP/MPLS). IP/MPLS can transport ancillary communications commonly found in OT networks using a control plane running an interior gateway protocol (IGP). IP/MPLS distributes labels with dedicated protocols, such as resource reservation protocol (RSVP), while the data plane transports customer traffic.

Traditional IP/MPLS OT networks that use RSVP have been well-suited for the utility industry. However, protocols for label distribution like RSVP carry additional complexities, posing challenges for network operators in troubleshooting and scaling.

 

Segment Routing Overview

SR is a network routing technique that emerged in 2013 and is recognized for its efficiency and simplicity in large-scale environments. SR quickly found favor among hyperscale web providers and major tech corporations for its scalability and flexibility in managing complex network infrastructures. SR eliminates the need for separate label distribution protocols, such as Label Distribution Protocol (LDP) and RSVP, and allows a shift toward a network managed centrally through a path computation element (PCE).

SR builds end-to-end paths within interior gateway protocol (IGP) topologies by encoding each path as a sequence of instructions referred to as segments. Extensions to the link-state IGPs add optional data elements, known as type-length-values (TLVs), that routers within the SR domain use to exchange additional information, such as segment identifiers (SIDs).

Segment Identifiers

SR uses SIDs to instruct how packets move through a network. The SR method defines several types of SIDs, each with a distinct purpose:

  • A prefix SID is a sub-TLV linked to a router prefix, which must be unique within an IGP domain.
  • A node SID, a specific form of prefix SID, is associated with the router’s loopback address.
  • An adjacency SID can be allocated by a router to any connection between two nodes in an IGP topology. Adjacency SIDs are only locally significant, so the same value may be reused on another router in the SR domain.

Global and local SIDs are assigned from the SR global block (SRGB) and the SR local block (SRLB), respectively. Within an SR MPLS domain, the SRGB is established networkwide, and it is standard practice to maintain uniform SRGB ranges across all routers in the domain.
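
As a minimal sketch of why uniform SRGB ranges matter: in SR-MPLS, a prefix or node SID is commonly advertised as an index, and each router derives the MPLS label by adding that index to its SRGB base. The SRGB ranges and the SID index below are illustrative assumptions, not values from the test network described later in this paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Srgb:
    """SR global block: a contiguous MPLS label range (assumed values)."""
    base: int
    size: int

    def label_for(self, sid_index: int) -> int:
        """Return the MPLS label for a prefix or node SID index."""
        if not 0 <= sid_index < self.size:
            raise ValueError(f"SID index {sid_index} is outside the SRGB")
        return self.base + sid_index


# Hypothetical SRGB shared by every router in the domain.
uniform_srgb = Srgb(base=16000, size=8000)
# Hypothetical node SID index.
node_sid_index = 190

label = uniform_srgb.label_for(node_sid_index)
print(label)  # 16190

# A router with a different SRGB maps the same index to a different label.
other_srgb = Srgb(base=20000, size=8000)
print(other_srgb.label_for(node_sid_index))  # 20190
```

With a uniform SRGB, the same node SID resolves to the same label on every router, which tends to make label traces easier to read during troubleshooting.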

After the network has converged, SIDs facilitate source routing to a destination, functioning similarly to the labels traditionally propagated in an MPLS network signaled with LDP or RSVP. Segment routing operates on individual nodes but can be integrated with a PCE to establish centralized network control.

Path Computation Element

With typically greater processing capabilities and memory than standard network elements, the PCE is the centralized source for traffic engineering in SR. A PCE can be a computer or a network node and is equipped to make complex routing decisions considering factors like metrics, latency and jitter.

A key advantage of the PCE is its centralized and holistic view of the network instead of the distributed per-node methodology. This centralized approach and processing allows more rapid and informed routing decisions, speeding up the process compared to waiting for multiple nodes to reach convergence. Combining PCE with SR paves the way for implementing software-defined networks.

Label Switch Path Configuration

There are several ways to configure segment-routed label switch paths (LSPs) in an SR network, including:

  • Intermediate System-to-Intermediate System (SR-ISIS)
  • Open Shortest Path First (SR-OSPF)
  • Traffic Engineering (SR-TE)

SR-ISIS and SR-OSPF LSPs manage the traffic flow from the source to the destination using labels propagated via the IGP. They use the destination router node SID to guide traffic routing within the network and are designed to follow the shortest-path IGP route, which limits traffic engineering options. Applying optimizations within the IGP serves as an effective strategy to alter the path selection in SR-ISIS and SR-OSPF LSPs.

A notable advantage of SR-ISIS and SR-OSPF LSPs is the capability for any-to-any labeled routing across an OT network, eliminating the need for setting up individual LSPs. Once the IGP network converges, it allows for designing and implementing services from one location to another without complexities. In contrast, LSP architectures like RSVP-TE require provisioning individual unidirectional LSPs.

While SR-ISIS and SR-OSPF configurations have limited traffic engineering, SR-TE LSPs use a local Constrained Shortest Path First (CSPF) algorithm or a PCE to calculate paths through the network. CSPF-calculated SR-TE LSPs can be steered using user-specified constraints such as shared risk link groups (SRLGs), administrative groups, traffic engineering (TE) metrics and hop counts. Because SR LSPs hold state only at the headend, bandwidth cannot be used as a constraint without introducing a PCE.
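
The constraint-based calculation described above can be sketched as a two-step process: prune links that violate the constraints, then run a shortest-path search on the TE metric over what remains. The topology, administrative group names and SRLG numbers below are hypothetical, and the function is a simplified illustration rather than any router's actual CSPF implementation. Hop-count limits are omitted for brevity.

```python
import heapq

# Hypothetical links: (node_a, node_b, te_metric, admin_groups, srlgs)
LINKS = [
    ("A", "B", 10, {"fiber"}, {1}),
    ("B", "C", 10, {"fiber"}, {2}),
    ("A", "D", 10, {"microwave"}, {3}),
    ("D", "C", 10, {"fiber"}, {4}),
    ("A", "C", 10, {"fiber"}, {1}),
]


def cspf(links, src, dst, exclude_groups=(), exclude_srlgs=()):
    """Prune ineligible links, then run Dijkstra on the TE metric."""
    graph = {}
    for a, b, metric, groups, srlgs in links:
        if groups & set(exclude_groups) or srlgs & set(exclude_srlgs):
            continue  # link violates a constraint, so it is not eligible
        graph.setdefault(a, []).append((b, metric))
        graph.setdefault(b, []).append((a, metric))

    heap = [(0, [src])]  # entries are (cost so far, path so far)
    visited = set()
    while heap:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, metric in graph.get(node, []):
            if neighbor not in visited:
                heapq.heappush(heap, (cost + metric, path + [neighbor]))
    return None  # no path satisfies the constraints


print(cspf(LINKS, "A", "C"))                     # (10, ['A', 'C'])
print(cspf(LINKS, "A", "C", exclude_srlgs={1}))  # (20, ['A', 'D', 'C'])
print(cspf(LINKS, "A", "C", exclude_groups={"microwave"}, exclude_srlgs={1}))  # None
```

Excluding SRLG 1 removes both links that share that risk group, so the calculation falls back to the longer path through D. Adding the administrative group exclusion leaves no eligible path at all.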

The three SR LSP implementation options can be combined with Loop-Free Alternate (LFA) routes to offer reliable, fast reroute functionalities.

Segment Routing Fast Reroute

Networks have depended on fast-reroute (FRR) mechanisms to lower the impact of node or link failures and maintain fault tolerance and resilience. When a failure occurs, traffic is immediately rerouted to a precomputed FRR path. After the IGP reconverges, traffic shifts to the newly identified shortest IGP path.

Enabling LFA within the IGP is an FRR mechanism that can be used on all three SR LSP configurations. LFA uses a neighbor as a next-hop backup path if the primary traffic path fails, with three variations: loop-free alternate (LFA), remote-LFA (R-LFA) and topology-independent LFA (TI-LFA).

Loop-Free Alternate

LFA defaults to selecting the backup next hop directly adjacent to the node computing the LFA.

To protect against a link failure, the backup path chosen by LFA can pass through the primary shortest-path next-hop node but not over the link that connects the local router to that node. To protect against a node failure, the backup path must avoid the primary shortest-path next-hop node entirely. Figure 1 illustrates this concept; all link costs in the figure are identical.

Figure 1: LFA example.
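
The link-protecting and node-protecting conditions can be expressed as distance inequalities, following the basic LFA specification (RFC 5286). For a computing router S, primary next hop E, destination D and candidate neighbor N, the neighbor is loop-free if dist(N, D) < dist(N, S) + dist(S, D), and it is also node-protecting if dist(N, D) < dist(N, E) + dist(E, D). The sketch below checks both on a hypothetical topology with uniform link costs; it is not the topology shown in Figure 1.

```python
import heapq


def spf(graph, source):
    """Dijkstra shortest-path distances from source to every node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue
        for neighbor, cost in graph[node].items():
            nd = d + cost
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist


# Hypothetical topology: triangle S-E-N with destination D behind E.
LINKS = [("S", "E", 1), ("E", "D", 1), ("S", "N", 1), ("N", "E", 1)]
graph = {}
for a, b, cost in LINKS:
    graph.setdefault(a, {})[b] = cost
    graph.setdefault(b, {})[a] = cost
dist = {node: spf(graph, node) for node in graph}


def lfa_candidates(s, primary_next_hop, dest):
    """Return {neighbor: (link_protecting, node_protecting)} for router s."""
    results = {}
    for n in graph[s]:
        if n == primary_next_hop:
            continue
        link_ok = dist[n][dest] < dist[n][s] + dist[s][dest]
        node_ok = link_ok and (
            dist[n][dest] < dist[n][primary_next_hop] + dist[primary_next_hop][dest]
        )
        results[n] = (link_ok, node_ok)
    return results


print(lfa_candidates("S", "E", "D"))  # {'N': (True, False)}
```

In this example, N protects against failure of the S-E link because its own shortest path to D does not loop back through S. It does not protect against failure of node E, because that path still runs through E.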

Remote-LFA

R-LFA is used when LFA cannot provide a backup through a directly connected neighbor. Instead, R-LFA creates a logical neighbor to the node running the LFA calculation to serve as a suitable backup route in case of a link or node failure.

An R-LFA neighbor must be reachable through the IGP using the shortest path SR LSP that does not traverse the protected link. It must also forward packets toward their destination after exiting the SR tunnel without tracking back through the originating node.

Determining the tunnel endpoint requires calculating the extended P-space and the Q-space; any node in the intersection of these spaces is a suitable tunnel endpoint.
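
The P-space and Q-space calculation can be sketched from the definitions alone. Here the P-space of a router is taken to be the set of nodes it reaches without any shortest path (equal-cost paths included) crossing the protected link, the extended P-space is the union of the P-spaces of the router and its other neighbors, and the Q-space is the set of nodes that reach the far end of the protected link without crossing it. The four-node ring with uniform costs is a hypothetical example, not the topology in Figure 2, and the code is a simplified illustration rather than a vendor algorithm.

```python
import heapq


def spf(graph, source):
    """Dijkstra shortest-path distances from source to every node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue
        for neighbor, cost in graph[node].items():
            nd = d + cost
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist


# Hypothetical four-node ring; S protects the link S-E.
LINKS = [("S", "E", 1), ("E", "D", 1), ("D", "N", 1), ("N", "S", 1)]
graph = {}
for a, b, cost in LINKS:
    graph.setdefault(a, {})[b] = cost
    graph.setdefault(b, {})[a] = cost
dist = {node: spf(graph, node) for node in graph}


def crosses_link(x, y, a, b):
    """True if any shortest path (ECMP included) from x to y uses link a-b."""
    cost = graph[a][b]
    return dist[x][y] in (
        dist[x][a] + cost + dist[b][y],
        dist[x][b] + cost + dist[a][y],
    )


def p_space(root, s, e):
    """Nodes that root reaches without crossing the protected link s-e."""
    return {y for y in graph if y not in (s, e) and not crosses_link(root, y, s, e)}


def extended_p_space(s, e):
    """Union of the P-spaces of s and of its neighbors other than e."""
    nodes = p_space(s, s, e)
    for neighbor in graph[s]:
        if neighbor != e:
            nodes |= p_space(neighbor, s, e)
    return nodes


def q_space(s, e):
    """Nodes that reach e without crossing the protected link s-e."""
    return {y for y in graph if y not in (s, e) and not crosses_link(y, e, s, e)}


print(sorted(p_space("S", "S", "E") & q_space("S", "E")))     # []
print(sorted(extended_p_space("S", "E") & q_space("S", "E")))  # ['D']
```

In the ring, no directly connected neighbor qualifies as a basic LFA, and the plain P-space does not intersect the Q-space. The extended P-space does, which yields D as the remote tunnel endpoint.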

Figure 2 illustrates a scenario in which an R-LFA is necessary because of the absence of LFA backup options. In the figure, all link costs are uniform.

Figure 2: R-LFA example.

Topology-Independent LFA

During network failures, traffic is rerouted to a predetermined FRR path to address the issue immediately.

Following traffic rerouting and the reconvergence of the IGP, traffic then shifts to the newly identified shortest IGP path. However, the paths chosen for immediate network repair, such as those identified in LFA and R-LFA, might differ from the optimal IGP path determined after network reconvergence. This discrepancy can lead to traffic being rerouted multiple times.

TI-LFA ensures that the selected backup path aligns with the optimal IGP path post-reconvergence, minimizing the need for multiple path switches.

Figure 3 illustrates a scenario in which the LFA-calculated default backup, typically the best choice when both LFA and R-LFA backups are available, does not align with the IGP path after reconvergence.

Figure 3: TI-LFA example 1.

Conversely, Figure 4 shows how implementing TI-LFA influences the selection of an R-LFA backup over an LFA backup, which would be the default preference without TI-LFA. This results in the R-LFA backup matching the IGP path after reconvergence, ensuring that there is only the initial switch to the backup path.

Figure 4: TI-LFA example 2.
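
The path that TI-LFA aims to match can be found by removing the protected link from the topology and rerunning the shortest-path calculation. The sketch below shows only that step, on a hypothetical four-node ring with one cut-through link and uniform costs. A real TI-LFA implementation would go on to encode the resulting path as a list of segments; that step is not shown, and the node names and metrics are assumptions.

```python
import heapq


def shortest_path(links, src, dst, failed_link=None):
    """Return (cost, path) from src to dst, optionally ignoring one link."""
    graph = {}
    for a, b, cost in links:
        if failed_link and {a, b} == set(failed_link):
            continue  # simulate the failure by leaving the link out
        graph.setdefault(a, {})[b] = cost
        graph.setdefault(b, {})[a] = cost

    heap = [(0, [src])]  # entries are (cost so far, path so far)
    visited = set()
    while heap:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, link_cost in graph.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(heap, (cost + link_cost, path + [neighbor]))
    return None


# Hypothetical ring A-B-C-D with a cut-through link A-C.
LINKS = [
    ("A", "B", 10),
    ("B", "C", 10),
    ("C", "D", 10),
    ("D", "A", 10),
    ("A", "C", 10),
]

primary = shortest_path(LINKS, "A", "B")
post_convergence = shortest_path(LINKS, "A", "B", failed_link=("A", "B"))

print(primary)           # (10, ['A', 'B'])
print(post_convergence)  # (20, ['A', 'C', 'B'])
```

If the precomputed backup follows the post-convergence path, traffic does not need to move a second time once the IGP has reconverged.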

Test Setup

The test design replicates a typical utility's private OT network, which carries communications from substations to control center operating environments.

The physical setup (Figure 5) features four Nokia 7705 SAR-8 routers arranged in a ring topology with a single cut-through link. This platform has an established network space and is recognized for supporting ancillary equipment and protocols typically found in a substation environment.

Figure 5: Segment routing setup diagram.

The link configuration is uniform: every link runs at 10 Gbps and uses /31 addressing. The routers are identified by their names, marked as 7705-XXX, where XXX represents a unique identifier corresponding to the fourth octet in each router’s system address.
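
As a small illustration of the addressing and naming scheme, the sketch below uses Python's standard `ipaddress` module to show that a /31 contains exactly two addresses, one per end of a point-to-point link, and to derive a router name from the fourth octet of a system address. The addresses are hypothetical documentation-range values, not those used in the lab.

```python
import ipaddress

# Hypothetical point-to-point link subnet.
link = ipaddress.ip_network("192.0.2.0/31")
endpoints = [str(address) for address in link]
print(endpoints)  # ['192.0.2.0', '192.0.2.1']


def router_name(system_address: str) -> str:
    """Build a 7705-XXX style name from the fourth octet of the address."""
    fourth_octet = int(ipaddress.ip_address(system_address)) & 0xFF
    return f"7705-{fourth_octet}"


# Hypothetical system addresses whose last octets match the router names
# mentioned in this paper.
print(router_name("198.51.100.198"))  # 7705-198
print(router_name("198.51.100.190"))  # 7705-190
```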

Time and frequency synchronization across the network was achieved using two Oscilloquartz OSA 5410 clocks, each with a GNSS connection. The Nokia 7705 SAR-8 routers used the IEEE 1588-2008 Precision Time Protocol (PTP) profile, optimized for precise clock synchronization over networks like Ethernet. This protocol supports sub-microsecond synchronization accuracy, essential for utility OT network environments. A virtual service router network resource controller (VSR-NRC) was also integrated to verify and enhance network configuration during the testing phase (see Figure 6).

Figure 6: Test equipment.

Test Logistics

Services set up for testing included:

  • Virtual Private Routed Network (VPRN) for Ethernet traffic.
  • Ethernet Pipe (Epipe) for Ethernet traffic.
  • Circuit Emulation Pipe (Cpipe) for C37.94 traffic.

All Ethernet services originated from router 7705-198 and terminated at router 7705-190. The C37.94 circuit applied a loopback at the 7705-190 router.

During testing, no specific traffic engineering techniques were employed, resulting in all traffic following the optimal IGP path through the network (see Figure 7).

Figure 7: Logistical services.

Test Protocol

The test network (Figure 5) was configured similarly for both RSVP-TE and SR tests, and the outlined test procedures in this section were applied uniformly in both scenarios. The key difference between the test setups was that one setup implemented LSPs signaled by RSVP with FRR, while the other used SR-ISIS LSPs with TI-LFA implemented in the IGP. In the RSVP testing phase, configurations such as LFA, R-LFA and TI-LFA were not included within the IGP setup.

This testing did not involve segment-routed LSPs signaled by a PCE. Instead, all segment-routed LSPs were based on SR-ISIS to highlight segment routing’s fundamental functionality and repair time capabilities when TI-LFA is integrated into the IGP.

SR-TE LSPs, with or without PCE assistance, are viable options for setting up teleprotection channels. Their effectiveness is notably enhanced when integrated with the active multipath (AMP) feature in Nokia 7705-SAR8v2 routers. However, the primary objective of this testing was to demonstrate the simpler setup and reduced overhead of segment-routed LSPs compared to RSVP-signaled LSPs. Therefore, SR-TE LSPs were not implemented in these tests.

Baselines were established for each network setup before failover testing was conducted. The following steps were applied for each network disruption test:

  1. Obtain initial results using the C37.94 and Ethernet testing equipment.
  2. Start the C37.94 and Ethernet test sets, maintaining data transmission via the direct connection between routers 7705-198 and 7705-190.
  3. Disrupt the connection between routers 7705-198 and 7705-190 by removing the fiber-optic cable from port 1/2/6 on router 7705-190.
  4. Restore the link between routers 7705-198 and 7705-190 three minutes post-disruption by reconnecting the fiber to port 1/2/6 on router 7705-190.

Key Findings

The performance of the RSVP and SR networks was largely comparable, with only minor differences noted in the responses to the disruption and restoration processes.

One significant aspect was the data loss time (DLT) for the C37.94 circuit within the RSVP network (see Figure 8). This circuit experienced a slightly higher frequency of data loss incidents on the RSVP network than on the SR network. This occurrence was attributed to the process involved in the restoration and resignaling of the RSVP LSP, and it highlights a key difference in how the two networks handle disruptions.

Figure 8: C37.94 RSVP vs. SR.

The data for the Ethernet streams indicates a very similar level of performance between the two networks (see Figure 9). However, a marginal difference was observed in the packet loss ratio. During the periods of disruption and restoration, the RSVP network demonstrated a slightly better performance, with a packet loss ratio of 0.0076% compared to the SR network’s 0.011%. This difference, while minor, points to the nuanced efficiencies of the RSVP network in managing packet loss under network stress conditions.

Figure 9: Ethernet streams RSVP vs. SR.
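
To put the two packet loss ratios in perspective, the short calculation below converts each percentage into packets lost per million packets sent. The per-million normalization is chosen here only for illustration; the paper does not state the total number of packets sent during the tests.

```python
# Packet loss ratios reported for the disruption and restoration periods.
RSVP_LOSS_PCT = 0.0076
SR_LOSS_PCT = 0.011


def lost_per_million(loss_pct: float) -> float:
    """Convert a loss percentage into packets lost per million sent."""
    return loss_pct / 100 * 1_000_000


rsvp_lost = lost_per_million(RSVP_LOSS_PCT)
sr_lost = lost_per_million(SR_LOSS_PCT)

print(f"RSVP: {rsvp_lost:.0f} lost per million")            # RSVP: 76 lost per million
print(f"SR:   {sr_lost:.0f} lost per million")              # SR:   110 lost per million
print(f"Difference: {sr_lost - rsvp_lost:.0f} per million")  # Difference: 34 per million
```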

Conclusion

Implementing SR in OT networks removes the requirement for traditional label distribution protocols, such as LDP and RSVP, and will greatly simplify network deployments. This streamlines the setup process and enhances the overall efficiency of OT IP/MPLS network operations without sacrificing reliability or performance.

Benefits confirmed as a result of testing include:

  • SR can operate without unidirectional LSP builds, particularly when using SR-ISIS or SR-OSPF LSPs. This feature allows for any-to-any site-labeled routing, showcasing a significant improvement in network traffic management and flexibility over traditional LDP- and RSVP-signaled OT IP/MPLS networks.
  • SR eliminates the requirement for session state at intermediary nodes, which reinforces the efficiency and simplicity of SR in network operations. This characteristic simplifies network architecture, reducing the complexity and overhead typically associated with traditional OT IP/MPLS networking approaches.
  • The capability of SR to support traffic engineering for mission-critical applications allows an essential layer of functionality on OT IP/MPLS networks to be retained. This is particularly important for applications that demand high network performance and reliability levels, like teleprotection, ensuring that network infrastructure continues to meet the strict demands of mission-critical applications.
  • Integrating TI-LFA in the IGP with SR deployments mirrors the FRR capabilities implemented by RSVP in current OT IP/MPLS networks. This integration maintains network reliability and performance despite any unexpected disruptions or network changes. The ability of SR to provide robust and resilient network operations is essential to operating in any OT IP/MPLS network.
  • The test results demonstrate a comparable performance level between segment-routed LSPs and RSVP-signaled LSPs. This performance equivalence solidifies SR as an effective solution in modern OT IP/MPLS network environments.
  • Incorporating SR into existing OT IP/MPLS networks facilitates an efficient overlay with minimal operational disruption, primarily because of the protocol’s inherent routing preference mechanisms that favor RSVP over SR tunnels. This compatibility negates the requirement of a comprehensive network overhaul, maintaining the integrity of existing OT IP/MPLS networks.

SR emerges as an optimal solution for new OT IP/MPLS network deployments. Its ability to integrate into and enhance current network frameworks shows that it can optimize network performance and provide high reliability within the existing OT IP/MPLS topology. Not only can SR further advance OT IP/MPLS network infrastructures, but it also serves as a stepping stone toward integrating software-defined networking (SDN) and aligns with modern requirements for network robustness and streamlined operational efficiency.

 


Author

Andrew Silvius, PE

Senior Technical Consultant