• Distribution Automation Architecture for Utilities
  • DA Solution FLISR Use Case using SEL devices over CR mesh
  • DA FLISR Use case with SEL
  • Cisco Resilient (CR) Mesh - Design Considerations for Centralized FLISR use case
  • Solution Architecture and Components Selection
  • Solution Deployment Models for DA
  • Design Considerations for DA Feeder Automation Deployments Based on 900MHz ISM Band Solution
  • DA Feeder Automation using Cellular Service (3G/4G) Solution
  • Glossary
  • Distribution Automation - Feeder Automation Design Guide

    The Cisco Distribution Automation - Feeder Automation Design Guide provides a comprehensive explanation of the end-to-end Cisco Smart Grid Field Area Network (FAN) solution design, which was developed for the utility industry in the Americas region and leverages the license-free ISM band (902-928 MHz) for last-mile connectivity of distribution grid devices. The document describes the two most common Distribution Automation use cases for monitoring and control of distribution line equipment: Volt/VAR and Fault Location, Isolation, and Service Restoration (FLISR). It also includes information about the system's architecture, solution components, product choices, design models, and design considerations. This design targets implementations that use the customer's Substation Private WAN as backhaul for the Resilient Mesh network to transport data from grid devices in the field to the Control and Operation Centers. The document concludes with a high-level overview of a Feeder Automation design based on public cellular service that leverages Cisco's Cellular Industrial Router (IR) Series products.

    Executive Summary

    Several key business drivers underlie the optimization of the distribution grid enabled by this solution. A pervasive, highly available, and well-designed communications network helps enable increased reliability and availability while also reducing OpEx.

    Cisco Systems is addressing the networking needs of the utility industry. Specifically, this Distribution Automation - Feeder Automation Design Guide highlights the communications solutions that address utility distribution grid use cases such as SCADA transport, FLISR, and line voltage monitoring, which in turn enable applications such as Volt/VAR Control. Field devices such as transformers can also offer predictive maintenance opportunities that help eliminate customer outages and expensive unscheduled repairs and truck rolls.

    The Cisco Distribution Automation validated solution, which is part of the Cisco portfolio of industry-leading, validated, and secure networking solutions for substation automation, Utility WAN, and Field Area Network Advanced Meter Infrastructure (FAN AMI), provides the following unique capabilities for distributed control and protection operations:

    • Cisco Resilient Mesh and cellular networking with FlexVPN technologies that are cost-effectively built to scale for the large number of Distribution Automation devices being enabled in the distribution grid

    • An IT-preferred security architecture, including hardware and software certification management, firewall, and malware protection with robust encryption to help ensure secure network communications and edge applications

    • Enhanced management and serviceability by Cisco Field Network Director (FND) with Zero Touch Deployment (ZTD) and plug-and-play (PnP) functionality to help enable deployment and enhance operations

    • High availability that is designed in the headend and Wide Area Network (WAN), with redundant control center support

    • Edge application capabilities within FND lifecycle-managed Cisco equipment that include deployment, monitoring, upgrading, and troubleshooting

    • End-to-end testing and validation, which are completed and documented with various Distribution Automation device vendors and use cases

    The recent enhancements to Cisco Resilient Mesh have increased the available bandwidth on the 900 MHz field area network nearly tenfold over the first generation, while also reducing the latency between hops, enabling peer-to-peer communication, and equipping the network with enhanced security features. Cisco has transformed a previously low-performance wireless mesh network that was designed for smart metering into a network that is suitable for Distribution Automation use cases.

    Cellular can be applied to areas or use cases where extremely high performance is needed. Because both mesh and cellular networks are managed under a single Field Network Director (FND) system, the customer receives a consistent, intuitive management experience.

    As a foundational element to any Cisco network, this DA architecture leverages enhanced security from the control center to the edge of the distribution network. The result is a reliable, scalable, and highly available DA network via wired and wireless, and a cellular WAN that supports large-scale DA deployments and secures communications to redundant control centers.

    Deployment, ongoing operation, and management is simplified via standards-based protocols and ZTD tools for proven large scale DA network provisioning. This is all addressed in detail as part of this design guide.

    This document covers this DA communications solution, which is based on industry-leading innovations in Cisco Resilient Mesh and cellular networking technologies that are built into the Cisco CGR 1240 and CGR 1120 Connected Grid Routers; the Cisco IR510 and IR530 Wi-SUN mesh industrial router product family; the IR807, IR809, and IR1101 Industrial Router cellular gateways; and the Cisco FND management system.

    Navigator

    The table describes the chapters in this document:

    Chapter
    Description

    Distribution Automation Architecture for Utilities

    Review of the Utility Industry Distribution Use Cases: Volt/VAR and FLISR. It is intended for readers who are unfamiliar with the industry DA applications.

    Solution Architecture and Components Selection

    Introduction to the Cisco solution's product portfolio, product characteristics, and usage guidance for product selection based on utility footprint, with reference links to documentation across the three main tiers: NAN, WAN, and Energy Operations Center or DC.

    Solution Deployment Models for DA

    Describes at a high level the different industry DA architectures (centralized versus distributed) and the three Cisco FAN design options available to support these DA architectures.

    Design Considerations for DA Feeder Automation Deployments Based on 900MHz ISM Band Solution

    Explains in detail the Cisco FAN Distribution Automation design based on the 900 MHz ISM band spectrum. It also contains the design specifications and functional description of aspects such as RF communication, network infrastructure, routing, security, and QoS across the FAN tiers.

    DA Feeder Automation using Cellular Service (3G/4G) Solution

    Overview of the Cisco FAN DA design based on Public Cellular Service solution.

    Audience

    The intended audience for this guide includes, but is not limited to, system architects, network/compute/systems engineers, field consultants, Cisco Customer Experience (CX) specialists, partners, and customers.

    The solution encompasses multiple technology domains, from infrastructure to switching and routing to security and network management. Readers should be familiar with the following transport technologies: radio (IEEE 802.15.4 based on the 900 MHz ISM band, IEEE 802.11 Wi-Fi, and cellular 3G/4G), IEEE 802.3 Ethernet, and overlay Virtual Private Networks (FlexVPN). The solution uses the following industry-standard protocols, among others: IPv4 and IPv6, 6LoWPAN, RPL, BGP, NAT (MAP-T), IKEv2, 802.1x, 802.11i, SNMP, and CoAP.

    Document Objective and Scope

    This design guide provides a comprehensive explanation of the Cisco FAN system design based on standard unlicensed ISM 900MHz radio band frequency for Utilities Distribution Automation applications. It includes information about the system's architecture, possible deployment models, and guidelines for implementation and configuration. The guide also recommends best practices and potential issues when deploying the reference architecture.

    Use Cases/Services/Deployment Models

    This guide addresses the following technology use cases:

    • DA grid devices connectivity using unlicensed frequency radio: ISM 902-928 MHz band available in certain countries in the Americas market, standard IEEE 802.15.4g/e based on Option 2 - OFDM modulation with higher physical data rates of up to 1.2 Mbps.

    • Radio optimization features: Adaptive Modulation, Adaptive Data Rates, and High Availability for the mesh coordinator.

    • Edge software optimization features: for customer application edge deployment with dedicated resources.

    • Advanced mesh IPv6 routing with peer-to-peer communication.

    • Newly released products: CGR 1000 Wireless Module (WPAN), IR510 DA Gateway, and IR530 DA Range Extender.

    • Solution WAN design options.

    • End-to-end solution security and network management.

    Distribution Automation Architecture for Utilities

    This chapter includes the following major topic:

    • Distribution Automation Use Cases

    Cisco Systems has taken a holistic approach to Distribution Automation and, in this release, the focus is the utility distribution system. The goal of Distribution Automation in the utility grid is real-time adjustment to changing loads, distributed generation, and failure conditions within the distribution grid, usually without operator intervention. The IT infrastructure includes real-time data acquisition and communication with utility databases and other automated systems. Accurate modeling of distribution operations supports optimal decision making at the control center and in the field, which depends heavily on a highly reliable and high-performing communications infrastructure. This document addresses these communications requirements as an architecture and addresses the key use cases below.

    Distribution Automation technologies are commercially available for wide-scale utility deployments. The key for the utility is to identify and unlock the value that these solutions provide. The applications with the greatest potential are those that directly affect operations and efficiency, such as management of peak load via demand response, predictive technologies for advanced maintenance or equipment replacement, secure communications for equipment, and system restoration technologies.

    Automated control of devices in distribution systems is the closed-loop control of switching devices, voltage controllers, and capacitors based on recommendations of the distribution optimization algorithms. These closed loop systems often have rigorous communications systems requirements that vary from manufacturer to manufacturer and by application. The communications system must meet the most rigorous standards and do so at scale. Volt/VAR control is one of the key applications to optimize the distribution grid for the utility.

    A utility fault may occur when a short circuit occurs between two phase lines, or for other reasons. A fault in any one of the lines can affect a large number of customers. Before the fault on the line can be corrected, it has to be identified and isolated from the larger utility network. This identification and isolation is done by placing reclosers in the network. The reclosers are in turn connected to the recloser controller. The recloser controller is connected to a gateway, which establishes a connection to the control center.

    When a fault is identified, the reclosers perform the trip operation and the fault is isolated from the larger network. This trip operation can be automated or can be sent from the control center. Once the fault is corrected, the close operation on the circuit, which is done from the control center, can be executed. This is commonly referred to as FLISR, and is also one of the key use cases for a utility in a grid optimization effort.

    This Distribution Automation architecture addresses the utility requirements for Volt/VAR and FLISR via a robust communications infrastructure that addresses the two predominant distribution automation schemes:

    In Europe, portions of South America, and Asia, the distribution scheme is based on a more centralized transformer design and is commonly referred to as the Secondary Substation.

    In North America, portions of South America, and along the Pacific Rim, the distribution scheme is based on a decentralized transformer model and this scheme will be referred to throughout this document as a Feeder Network.

    The architecture in Figure 1 leverages the latest technologies and recent enhancements to best address these use cases and topologies, with a variety of cell-based gateways for the Secondary Substation as well as a combination of 900 MHz mesh and cell gateways at the edge. The architecture addresses the requirements for these edge services and communications, including the edge as NAN, the backhaul as WAN, and the Operations and Control Centers commonly referred to as the Headend.

    Figure 1 Reference Architecture for Distribution Automation

     


    The Headend provides aggregation and security for and between the distribution automation applications typically at the Utility control center. This architecture leverages a secure WAN aggregation for scalability since feeder sections may scale to hundreds or more devices with the DA network scaling to thousands of feeder segments and Secondary Substation networks with over 100,000 nodes.

    As part of this architecture, the WAN segment is referred to in two modes: On-Net and Off-Net:

    • On-Net is a high speed communications network owned and operated by the utility; examples include SDH/SONET, Carrier Ethernet, or MPLS as the most common.

    • On the other hand, the Off-Net network is a service provider-leveraged network that can be based on the same technologies but as a shared service that often includes pre-negotiated service level agreements.

    The WAN segment for DA networks is often a cellular backhaul connection because building out a private network in numerous and remote locations, especially in the Secondary Substation model, is frequently cost prohibitive. The NAN Mesh offers opportunities to leverage the On-Net network as backhaul when the radio network gateway can be co-located at a utility-owned facility such as a substation or depot.

    The edge or the NAN is built on a small form factor gateway or NAN router connected to the edge device such as a Capacitor Bank Controller (CBC) or voltage line monitor based on application or service. The connection to the edge device is often serial, but is rapidly moving to Ethernet. The NAN router can be configured to deliver edge services such as adaptation for serial connections via raw socket encapsulation or translation from serial protocols like IEC-101 to the packet-based IEC-104 protocol. The NAN router also provides security services such as 802.1x port-based authentication, encryption, and routing with possible alternate backhaul options, thus providing a secure connection for the edge device to the control center. The backhaul in the case of Secondary Substations is most often cellular with some satellite or DSL options.
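    The serial adaptation mentioned above (raw socket encapsulation) can be pictured as a simple byte-forwarding loop between the IED's serial port and a TCP session toward the SCADA front end. The following Python sketch is purely illustrative: it assumes the third-party pyserial package, and the port name and SCADA address are hypothetical; it is not the NAN router's configuration or implementation.

        # Illustrative raw-socket serial encapsulation (not device configuration).
        import socket
        import serial  # third-party pyserial package (assumption)

        SERIAL_PORT = "/dev/ttyS1"         # hypothetical RS-232 port wired to the IED
        SCADA_FEP = ("192.0.2.10", 25000)  # hypothetical SCADA front-end processor (TCP)

        def forward_serial_to_tcp():
            """Tunnel raw serial bytes (e.g., IEC-101) between the IED and SCADA over TCP."""
            ied = serial.Serial(SERIAL_PORT, baudrate=9600, timeout=1)
            with socket.create_connection(SCADA_FEP) as scada:
                scada.settimeout(0.1)
                while True:
                    chunk = ied.read(256)        # bytes emitted by the IED
                    if chunk:
                        scada.sendall(chunk)     # encapsulate the serial payload in the TCP stream
                    try:
                        reply = scada.recv(256)  # poll/control bytes coming back from SCADA
                    except socket.timeout:
                        continue
                    if reply:
                        ied.write(reply)         # de-encapsulate back onto the serial line

        if __name__ == "__main__":
            forward_serial_to_tcp()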

    Cisco Resilient Mesh is the latest version of the 900 MHz Connected Grid Mesh radio, with significant performance improvements that now make it applicable to many Distribution Automation applications and use cases. However, it is recognized that Resilient Mesh may not be applicable for all use cases. The Distribution Feeder network will likely be a combination of mesh, where the 900 MHz radio network is feasible and hop count and latency meet application requirements, with cellular to augment based on hop count, application performance, or latency requirements.

    Distribution Automation Use Cases

    Distribution Automation (DA) refers to the monitoring and control of devices located on the distribution feeders, such as line reclosers, load break switches, sectionalizers, capacitor banks and line regulators, and devices located in the distribution substation. DA is an overlay network deployed in parallel to the distribution feeder. It enables two-way communication between controllers used in the distribution feeder and the intelligence application that resides in the Utility control center or Secondary Substation for improving grid reliability, availability, and control. Figure 2 depicts a radial distribution feeder:

    Figure 2 Distribution Feeder

     


    In Figure 2, the distribution feeder can be observed coming out of the Secondary Substation; various distribution automation controllers (IEDs), such as the recloser controller, voltage regulator controller, and capacitor bank controller, are positioned along the distribution feeder. Key functions and operations of Distribution Automation include protecting the distribution system, managing faults, measuring energy usage, managing assets, and controlling and managing system performance. European feeders are largely three-phase, and most European countries have a standard secondary voltage of 220, 230, or 240 V.

    Use Cases

    The following use cases of Distribution Automation will be discussed in this design guide:

    • Volt/VAR Regulation

    • Fault Location, Isolation, and Service Restoration (FLISR)

    The radial feeder distribution system design is considered for the Volt/VAR regulation use cases and the parallel feeder distribution system is considered for the FLISR use cases. Cisco DA Gateways are also well suited to other feeder deployments, such as mesh and loop distribution feeder designs.

    Volt/VAR Control Use Cases and Benefits

    This use case addresses automating the dynamic and efficient delivery of power. Utilities look to achieve large savings by enhancing the efficiency of their power distribution infrastructure; in other words, by improving the effectiveness of the flow of electricity. In order to evaluate the process, it is important to review the differences between what is called real power and reactive power.

    • Real power is used to run all lights, devices, and production lines. It is the power that "does the work."

    • Reactive power does not contribute anything to doing work, but it does cause conductors to heat up and it takes up a certain amount of "space" in the wires.

    The more reactive power flowing on a line, the less "room" there is for real power, and the less efficient the distribution system becomes.

    Today, in order to eliminate or at least minimize reactive power flows, utilities have deployed devices such as capacitor banks or special transformers on their local distribution systems, typically located at substations or on the feeder. These devices work to keep reactive power flows down, making the full capacity of the conductor available for real power. This process is known as Volt/VAR regulation or control (a short worked example follows the list below):

    • Power Factor Regulation/VAR Compensation—Improves efficiency of energy supply by ensuring voltage and current are in phase when supplied to the customer.

    • Conservation Voltage Regulation—At times of peak load, ensures the minimum required voltage level is supplied to the customer.

    • Volt/VAR Control—Power factor regulation + Conservation voltage regulation.
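    To make the arithmetic behind power factor regulation concrete, the short sketch below computes apparent power and power factor from real and reactive power and shows how switching in a capacitor bank reduces the reactive component. The kW/kVAR values and the bank size are made-up illustration figures, not measurements from a validated deployment.

        import math

        def power_factor(p_kw, q_kvar):
            """Power factor = real power / apparent power, where S = sqrt(P^2 + Q^2)."""
            s_kva = math.hypot(p_kw, q_kvar)
            return p_kw / s_kva if s_kva else 1.0

        # Illustrative feeder loading (hypothetical values)
        p_kw, q_kvar = 4000.0, 1800.0
        print(f"Before compensation: PF = {power_factor(p_kw, q_kvar):.3f}")   # ~0.912

        # Closing a 1200 kVAR capacitor bank supplies reactive power locally,
        # so less reactive power has to flow down the feeder.
        q_after = q_kvar - 1200.0
        print(f"After closing a CBC: PF = {power_factor(p_kw, q_after):.3f}")  # ~0.989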

    Volt/VAR Actors

    Figure 3 depicts the actors used in the Volt/VAR use case: Load Tap Changers, Voltage Regulators, and Capacitor Bank Controllers (CBCs).

    Figure 3 Volt/VAR Actors

     


    Voltage Regulator and Load Tap Controllers

    Voltage regulation functions are performed using the Voltage Regulator/Load Tap Controller actors. Voltage can be raised or lowered based on load conditions. Voltage Regulators are types of transformers that make small adjustments to voltage levels in response to changes in load. They are installed in substations (where they are called load tap changers) and along distribution feeders to regulate downstream voltage. Voltage Regulators have multiple "raise" and "lower" positions and can automatically adjust according to feeder configurations, loads, and device settings.

    Capacitor Bank Controllers

    CBCs are used to supply reactive power. Utilities use capacitors to compensate for reactive power requirements caused by inductive loads from customer equipment, transformers, or overhead lines. Compensating for reactive power reduces the total amount of power that needs to be provided by power plants, resulting in a flatter voltage profile along the feeder and less energy wasted from electrical losses in the feeder. A distribution capacitor bank consists of a group of capacitors connected together. Capacitor banks are mounted on substation structures, distribution poles, or are "pad-mounted" in enclosures.

    Volt/VAR Application Communication Flow

    In Figure 4, Volt/VAR and Supervisory Control and Data Acquisition (SCADA) applications are hosted in the DSO control center and RTU and load tap controllers are located in the Secondary Substation. The remote terminal unit (RTU) acts as an outstation device that proxies the poll and/or control command to various field devices like the CBC and end-of-line voltage monitor. This guide covers the use case scenario where the Volt/VAR application flow between the Intelligent Electronic Device (IED) and SCADA happens via RTU and the distribution feeder type considered is radial. A direct application flow from field devices to the control center for the Volt/VAR use case will be covered in future guides.
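    The RTU's role as a proxy outstation can be sketched as follows: it keeps a local data image current by scanning the field devices, answers SCADA polls from that image, and relays control commands to the addressed device. The Python sketch below is a schematic illustration only; the device names are hypothetical and the real exchange runs over IEC 60870-5-104/101 rather than local function calls.

        # Schematic of the RTU-as-proxy pattern described above (illustrative only).
        class ProxyRTU:
            def __init__(self, field_devices):
                self.field_devices = field_devices   # e.g., CBCs, end-of-line voltage monitor
                self.data_image = {}                 # last known measured values per device

            def scan_field_devices(self):
                """Background poll of the IEDs; keeps the RTU's local data image current."""
                for name, device in self.field_devices.items():
                    self.data_image[name] = device.read_measurements()

            def handle_scada_poll(self):
                """Answer a SCADA event-class poll from the local data image (proxy response)."""
                return dict(self.data_image)

            def handle_scada_command(self, device_name, command):
                """Relay a control command (e.g., a CROB close) to the addressed field device."""
                return self.field_devices[device_name].execute(command)

        class CapacitorBankController:
            def __init__(self):
                self.closed = False
            def read_measurements(self):
                return {"status": "closed" if self.closed else "open"}
            def execute(self, command):
                self.closed = (command == "close")
                return "ACK"

        rtu = ProxyRTU({"CBC-1": CapacitorBankController()})
        rtu.scan_field_devices()
        print(rtu.handle_scada_poll())                      # {'CBC-1': {'status': 'open'}}
        print(rtu.handle_scada_command("CBC-1", "close"))   # ACK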

    Figure 4 Volt/VAR Block Diagram

     


    The detailed application flow between different actors for power factor regulation is depicted in Figure 5:

    Figure 5 Power Factor Regulation Flows

     


    1. Event class data poll from the RTU to the following devices:

    • Substation meter: poll measured Value (Short Floating Point) registers (0 to 4)

    • All CBC(s): poll measured Value (Short Floating Point) (0) and double point command (0)

    • End-of-line voltage monitor: poll measured Value (Short Floating Point) register (0)

    2. The Volt/VAR Optimization processor processes the data received from the devices and makes a control command decision based on the power factor calculation.

    3. The control command is sent via SCADA to the RTU and on to the CBCs, closing Capacitor Bank Controller N by writing to a Control Relay Output Block (CROB) command register in T104.

    4. Event class data poll from the RTU to the following devices:

    • Substation meter: poll measured Value (Short Floating Point) registers (0 to 4)

    • All CBC(s): poll measured Value (Short Floating Point) (0) and double point command (0)

    • End-of-line voltage monitor: poll measured Value (Short Floating Point) register (0)

    5. The above steps are repeated for all the CBCs on the feeder line to maintain a Power Factor value close to 1 (a simplified sketch of this control loop follows these steps).
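    A minimal Python sketch of this poll/compute/command loop is shown below. It stands in for the IEC 60870-5-104 exchanges with plain function stubs; the device names, capacitor bank size, loading values, and power factor target are illustrative assumptions, not the validated implementation.

        import math

        # Hypothetical plant model standing in for the RTU polls of the substation meter,
        # the CBCs, and the end-of-line voltage monitor (IEC 60870-5-104 in the real system).
        feeder = {"p_kw": 4000.0, "q_kvar": 1800.0,
                  "cbc_states": {f"CBC-{i}": "open" for i in range(1, 4)},
                  "cbc_size_kvar": 600.0}

        def poll_event_class_data():
            """Stand-in for the RTU event-class poll (measured values and command states)."""
            return dict(feeder)

        def send_crob_close(cbc_name):
            """Stand-in for the SCADA -> RTU -> CBC close command (CROB register in T104)."""
            feeder["cbc_states"][cbc_name] = "closed"
            feeder["q_kvar"] -= feeder["cbc_size_kvar"]   # local VAR support reduces feeder Q

        def power_factor(p_kw, q_kvar):
            s_kva = math.hypot(p_kw, q_kvar)
            return p_kw / s_kva if s_kva else 1.0

        TARGET_PF = 0.98   # illustrative target, close to unity
        for cbc in feeder["cbc_states"]:
            data = poll_event_class_data()                   # steps 1 and 4: poll
            pf = power_factor(data["p_kw"], data["q_kvar"])  # step 2: VVO computation
            print(f"Power factor = {pf:.3f}")
            if pf >= TARGET_PF:
                break
            send_crob_close(cbc)                             # step 3: close the next CBC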

    Figure 6 Conservation Voltage Regulation (CVR)

     


    Figure 6 depicts the detailed call flow involved in conservation voltage regulation.

    1. Event class data poll from the RTU to the following devices:

    • Substation meter: poll measured Value (Short Floating Point) registers (0 to 4)

    • All CBC(s): poll measured Value (Short Floating Point) (0) and double point command (0)

    • End-of-line voltage monitor: poll measured Value (Short Floating Point) register (0)

    2. The Volt/VAR Optimization processor processes the data received from the devices and makes a control command decision based on the power factor calculation.

    3. The control command is sent via SCADA to the RTU and on to the load tap controller to lower/raise the LTC by writing to a Control Relay Output Block (CROB) command register in T104.

    4. Event class data poll from the RTU to the following devices:

    • Substation meter: poll measured Value (Short Floating Point) registers (0 to 4)

    • All CBC(s): poll measured Value (Short Floating Point) (0) and double point command (0)

    • End-of-line voltage monitor: poll measured Value (Short Floating Point) register (0)

    5. The above steps are repeated to maintain a Power Factor value close to 1 along the feeder line.

    Fault Location, Isolation, and Service Restoration (FLISR)

    FLISR Use Case and Benefits

    FLISR is the process for dealing with fault conditions on the electrical grid. The following occurs as part of this process:

    1. Detects (and locates) faults

    2. Isolates the faults to the smallest segment of the grid possible

    3. Restores as much service as possible while the fault is isolated

    FLISR includes automatic sectionalizing and restoration and automatic circuit reconfiguration. These applications accomplish DA operations by coordinating operation of field devices, software, and dedicated communication networks in order to automatically determine the location of a fault and rapidly reconfigure the flow of electricity so that some or all of the customers can avoid experiencing outages. Because FLISR operations rely on rerouting power, they typically require feeder configurations that contain multiple paths to single or multiple other substations. This creates redundancies in the power supply for customers located downstream or upstream of a downed power line, fault, or other grid disturbance.

    The benefits of FLISR include:

    • Consumers experience minimal outage.

    • Utilities improve the System Average Interruption Duration Index (SAIDI) and the System Average Interruption Frequency Index (SAIFI) numbers and avoid financial penalties being levied by the regulator.

    FLISR application control can be implemented in the following modes:

    • Supervised Mode: In supervised mode of operation there is no automatic control; the system delivers information to the operator, who initiates manual control actions. Restoration time is longer in this approach. Please refer to the Distribution Automation - Secondary Substation 1.0 Implementation Guide, which addresses this use case, at the following URL:

    https://salesconnect.cisco.com/#/search/Secondary%2520Substation%2520Implementation%2520Guide/content

    • Semi-Automatic Mode: A mix of automatic and supervised control is followed. The DA system automatically isolates the fault and performs the upstream part of the restoration; the upstream section is between the substation and the faulted section. Manual restoration is performed on the downstream section, which is between the faulted section and the end of the feeder. This guide addresses this mode of operation. In this mode, communication happens between IEDs in the field and the Distribution Management System (DMS) application residing in the control center.

    • Fully Automatic Mode: Isolation and restoration happen automatically without any dispatcher intervention. Communication happens directly between a group of associated IEDs. Restoration is very fast (<1 second), but this mode is a complex approach to deploy.

    How FLISR Works

    Figure 7 is divided into four parts (A, B, C, and D) to show how FLISR operations typically work (a simplified sketch of the switching sequence follows Figure 7).

    • In Example A of Figure 7, the FLISR system locates the fault, typically using line sensors that monitor the flow of electricity, measures the magnitudes of fault currents, and communicates conditions to other devices and grid operators.

    • Once located, FLISR opens switches on both sides of the fault: one immediately upstream and closer to the source of power supply (Example B of Figure 7), and one downstream and further away (Example C of Figure 7).

    • The fault is now successfully isolated from the rest of the feeder. With the faulted portion of the feeder isolated, FLISR next closes the normally open tie switches to neighboring feeders. This re-energizes the unfaulted portion(s) of the feeder and restores services to all customers served by these unfaulted feeder sections from another substation/feeder (Example D of Figure 7).

    Figure 7 How FLISR Works

     

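    The switching sequence above amounts to: open the switch just upstream of the faulted section, open the switch just downstream of it, then close the normally open tie to a neighboring feeder. The sketch below illustrates that logic on a small radial feeder model; the segment and switch names are assumed for illustration, and this is not the algorithm of any particular FLISR product.

        # Simplified FLISR switching logic (illustrative only; segment and switch
        # names are hypothetical).

        # A radial feeder modeled as ordered segments, each bounded by two switches,
        # plus a normally open tie switch to a neighboring feeder at the far end.
        switches = {"SW1": "closed", "SW2": "closed", "SW3": "closed", "TIE": "open"}
        segments = [("Seg1", "SW1", "SW2"), ("Seg2", "SW2", "SW3"), ("Seg3", "SW3", "TIE")]

        def flisr(faulted_segment):
            """Isolate the faulted segment and back-feed the downstream segments via the tie."""
            for name, upstream, downstream in segments:
                if name == faulted_segment:
                    switches[upstream] = "open"     # open switch closer to the source (B)
                    switches[downstream] = "open"   # open switch further downstream (C)
                    switches["TIE"] = "closed"      # close the tie to the neighboring feeder (D)
                    return
            raise ValueError(f"unknown segment {faulted_segment}")

        flisr("Seg2")
        print(switches)  # {'SW1': 'closed', 'SW2': 'open', 'SW3': 'open', 'TIE': 'closed'}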

    FLISR Actors

    • Recloser: The circuit recloser is a self-contained device with the necessary monitoring circuit to detect and interrupt over-current conditions and automatically reclose the line.

    • Sectionalizing Switch or Remote Control Switch: Remote Control Switches can be load break or fault interrupting devices.

    • Remote Fault Indicator: Used to detect faults.

    • Distribution Management System (DMS): The DMS application residing in the DSO control center is an intelligent application, the brain of FLISR systems, which performs the circuit reconfiguration logic.

    Figure 8 depicts a parallel feeder distribution system. Two distribution feeders come out of two different Secondary Substations, and each feeder has a recloser associated with it. Remote Fault Indicators and remote control switches are distributed across both feeders. RCS 3 is, by default, an open switch.

    Figure 8 FLISR Parallel Feeders

     


    Figure 9 FLISR Application Communication Flow

     


    In Figure 9, the application flow can be observed happening directly from feeder devices to the DMS application in the DSO control center. The flow is summarized below:

    • Remote Fault Indicator (RFI) 1 reports to the Distribution Management System (DMS) whenever it encounters a fault.

    • Recloser 2 opens and sends a report to the DMS when it encounters a temporary fault.

    • Recloser 2 opens and sends a report to the DMS when it encounters a permanent fault.

    • Remote Control Switch (RCS) 2 reports a no-voltage status to the DMS.

    • RCS 2 opens when it encounters a fault for the second time and sends a report to the DMS.

    • The DMS issues a close command to RCS 3.

    • The DMS initiates a periodic poll (every minute) for all feeder devices.

    • The DMS initiates a solicited periodic poll (once every 5 minutes) for all feeder devices.

    DA Solution FLISR Use Case using SEL devices over CR mesh

    The Cisco Resilient (CR) mesh solution provides a reliable communication infrastructure, with bandwidth capacity and low latency that meet the DA use case performance requirements.

    Schweitzer Engineering Laboratories (SEL) is one of the major utility grid equipment and DA solution vendors in North America. Cisco and SEL have collaborated on a joint validation of the FLISR use case over Cisco CR mesh. This joint validation used Fault Location, Isolation, and Service Restoration (FLISR) products from SEL operating over Cisco Resilient Mesh. Operational measurements such as trip time, data alignment, service restoration, and operational consistency were recorded, and the application was found to work very well over CR mesh.

    This section provides guidance to utilities showing that a privately owned and operated FAN radio network in the ISM 902-928 MHz band, such as Cisco Resilient Mesh, can serve as a multi-service solution that works well with Distribution Automation applications. Cisco is committed to validating the major DA use cases, such as FLISR, within its indoor labs and outdoor pilot locations.

    Schweitzer Engineering Laboratories (SEL) has a comprehensive solution for the DA FLISR application that can be deployed in a centralized architecture. The solution uses a controller device, which can be located in the control center, to provide advanced restoration capabilities. Combined with the Cisco Resilient Mesh communication infrastructure, the FLISR application can operate in fully automatic mode.

    The following SEL equipment is used for validation:

    • SEL Real-Time Automation Controller (RTAC)

    • SEL Advanced Recloser Control (SEL-651R)

    Cisco Resilient Mesh and SEL FLISR Architecture

    The Cisco Resilient Mesh and SEL FLISR architecture is shown in Figure 10. The SEL FLISR deployment follows a centralized FLISR architecture in which the field devices act as outstations. These outstation devices connect to Cisco IR510 DA gateways, enabling the communication path. The Cisco CR mesh, comprising IR510 and IR530 devices, can be aggregated at a CGR 1000 Series router, usually located in the substation or along the feeder. The CGR 1000 Series router communicates with the control center over a private WAN, or over public cellular and fiber networks. Cisco Resilient Mesh provides the communication infrastructure for the SEL FLISR architecture.

    Figure 10 Cisco Resilient Mesh and SEL FLISR Architecture

     


    The SEL reclosers connect to the Cisco Resilient Mesh Industrial Routers (IR510) using Ethernet.

    The Cisco IR510 DA gateways establish a reliable mesh wireless network based on a signal-propagation radio design. For instances where the radio signal cannot cover a certain area or signal levels are weak, Cisco Range Extenders (IR530) can be deployed to increase signal coverage for that area. CR mesh, based on the ISM 902-928 MHz band and the IEEE 802.15.4g/e standard using OFDM modulation with a physical data rate of up to 1.2 Mbps, can support the performance requirements of the FLISR application.

    The SEL and Cisco DA solutions help utilities lower the SAIDI (System Average Interruption Duration Index) and SAIFI (System Average Interruption Frequency Index) performance metrics, which reflect the reliability of their power distribution network.

    DNP3/IP messages between the SEL-651R recloser and the SEL controller are reliably routed through the mesh network via the most optimal path to the field area border router, such as the Cisco CGR 1000 router, known as the mesh exit point, and then via either a fiber network or a cellular network to the Control Center location.

    The SEL FLISR solution can also be deployed in a centralized architecture where one or more RTAC devices are installed in the Control Center and each controller services an area that is not necessarily bound to a substation service area. This approach has a lower deployment cost.

    The Distribution Automation system consists of a DA controller (DAC) that communicates with multiple recloser controls, switch controls, feeder relays, and a wide range of other intelligent electronic devices (IEDs). The DAC is implemented on the SEL Real-Time Automation Controller (RTAC) family of controllers and can be applied to a wide variety of distribution network arrangements. The DAC addresses two major control objectives for the power distribution system: automatic reconfiguration (AR) and dynamic feeder optimization (DFO). This design focuses on automatic reconfiguration since it has the more stringent communications requirements.

    The DAC functions to detect permanent faults and open-phase conditions on the distribution network. The DAC acts to isolate the affected zone of the feeder and restore power to the unaffected zones of the feeder from the normal source and from alternative sources, if available. The DAC also functions to detect overload conditions on the distribution network, and acts to mitigate overloads by transferring load to adjacent feeders or by load shedding if alternative capacity is unavailable. In addition, the DAC functions to detect station events and loss-of-source events, acting to isolate the affected station and restore power to the unaffected feeders from alternative sources, if available.

    DA FLISR Use case with SEL

    Solution Overview

    Cisco Resilient Mesh networks can support the SEL FLISR application over a variety of topologies and places in the network. The contents of the subsequent sections are summarized below.

    These topologies were considered during the joint validation activity:

    • FLISR Urban Topology

    • FLISR Rural Topology

    Each topology is described with the following:

    • FLISR Topology-based one-line diagram (with segment division)

    • FLISR Topology-based SEL device to Cisco device mapping diagram

    • FLISR Topology-based SEL device to Cisco device mapping table

    FLISR Fault scenarios are described below:

    For the SEL FLISR use case in Rural/Urban topology, the following faults in the power line are considered:

    • Fault with Lock Out

    • Open Phase Fault

    • Loss of Source Fault

    Each of these FLISR fault scenarios is considered in three states:

    • Normal state

    • Fault state

    • Restored or reconfigured state

    SEL FLISR Use case Validation Call Flow Sequence

    The call flow sequence of the SEL FLISR use case validation uses the DNP3/IP application protocol. UDP is the recommended protocol at the transport layer to transport the DNP3/IP messages. The two topologies for SEL FLISR use case validation are:

    • FLISR Urban Topology

    • FLISR Rural Topology

    Figure 11 FLISR - Call Flow Event Sequence - Application DNP3 between DAC and IED


    The Call flow sequence in Figure 11 is categorized into these major blocks:

    • Class123 (Read/Response)

    • Class0123 (Read/Response)

    • Select/Response

    • Operate/Response

    • Unsolicited Response/Confirmation

    Of the commands listed above, Class123 is a periodic command; the others, such as Unsolicited Response, are event-driven commands.

    Class0123 is also used as a periodic command, but with a longer polling interval; the interval is once every 30 minutes in the case of FLISR over radio.

    In Figure 11, the Class123 event is configured to operate periodically every “N” seconds.
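    A minimal scheduling sketch of these poll classes is shown below. It mirrors the intervals described in the text (Class123 every "N" seconds, Class0123 every 30 minutes) while abstracting away the DNP3/IP transport; the interval values, runtime, and device names are assumptions for illustration only.

        import time

        # Illustrative DNP3 poll scheduler (no real DNP3/IP transport).
        CLASS123_INTERVAL_S = 10          # "N" seconds; assumed value for illustration
        CLASS0123_INTERVAL_S = 30 * 60    # integrity (Class0123) poll once every 30 minutes

        def read_class123(ied):
            print(f"Class123 read  -> {ied}")

        def read_class0123(ied):
            print(f"Class0123 read -> {ied}")

        def poll_loop(ieds, runtime_s=30):
            last_event_poll = last_integrity_poll = 0.0
            start = time.monotonic()
            while time.monotonic() - start < runtime_s:
                now = time.monotonic()
                if now - last_event_poll >= CLASS123_INTERVAL_S:
                    for ied in ieds:
                        read_class123(ied)       # periodic event-class poll
                    last_event_poll = now
                if now - last_integrity_poll >= CLASS0123_INTERVAL_S:
                    for ied in ieds:
                        read_class0123(ied)      # slower integrity poll (and one at startup)
                    last_integrity_poll = now
                time.sleep(1)

        poll_loop(["SEL-651R-1", "SEL-651R-2"], runtime_s=30)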

    In the following FLISR Fault scenarios, these faults are introduced in specific segments of the Urban/Rural topology:

    • Fault with LockOut

    • Open Phase

    • Loss of Source

    When one of these faults is introduced into a particular segment of the SEL FLISR (Urban/Rural) topology, the SEL device in the affected segment raises "Unsolicited Response" messages to the SEL FLISR Controller (DAC) located in the control center.

    The DAC then sends an acknowledgment "Confirmation" message to the corresponding SEL device (Recloser Controller).

    The “Unsolicited Response” conveys a change of state in the network. The SEL DA FLISR Controller performs “Class0123 Read” of all the related set of SEL devices to get the holistic view of the SEL FLISR Topology status.

    In response to the Class0123 read request from the DAC, the SEL devices then respond with “Class 0123 Response” to the SEL FLISR Controller. The set of related SEL devices includes devices in the affected feeder section(s) and any adjacent sections connected with a normally open point.

    The SEL FLISR Controller performs a FLISR computation and decides which SEL recloser devices it must open and which it must close. To accomplish this, the SEL FLISR Controller uses a "Select" message followed by an "Operate" message. Alternatively, the DA FLISR controller can directly issue "Operate" commands instead of Select before Operate. For each "Select" or "Operate" message from the SEL FLISR Controller, the recloser responds with a "Response" message.

    SEL recommends using direct Operate, since Select before Operate is a remnant of the era when auxiliary relays were used. Direct Operate also reduces the load on the mesh by a fraction; bandwidth conserved is bandwidth available for other application traffic.

    When the status of the SEL device changes, the SEL recloser controller updates the SEL FLISR DA Controller device using “Unsolicited response” message, which is acknowledged back by DAC using “Confirmation” message.

    When the SEL FLISR DA Controller requires a holistic view of the current status of the topology, it sends “Class0123 Read” to all the related SEL recloser controller devices. The current status of the related device is updated back in the corresponding “Class0123 Response” message sent by SEL recloser controller device to the DAC. The set of related SEL devices includes devices in the affected feeder section(s) and any adjacent sections connected with a normally open point.

    This completes the SEL FLISR Event call flow sequence. This sequence could happen over SEL FLISR Urban/Rural topology, which are discussed in upcoming sections.
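    Under the assumptions noted in the comments (in-memory stand-ins for the DNP3/IP exchanges and hypothetical device names), the sketch below summarizes the controller side of this sequence: an unsolicited report triggers a Confirmation, a Class0123 read of the related devices, a restoration decision, direct Operate commands (per the SEL recommendation above), and a verification read.

        # Schematic of the DAC-side event handling described above (illustrative only;
        # the real exchange is DNP3/IP, and device names are hypothetical).

        def confirm(report):                     # DAC acknowledges the unsolicited response
            print(f"Confirm -> {report['device']}")

        def class0123_read(devices):             # integrity read of all related devices
            return {d: {"state": "open" if d == "REC-1" else "closed"} for d in devices}

        def direct_operate(device, action):      # direct Operate (no Select), per SEL guidance
            print(f"Operate {action} -> {device}")
            return "Response: success"

        def flisr_decision(statuses):
            # Placeholder decision: open the switch downstream of the locked-out recloser
            # and close the normally open tie point (assumed names).
            return ["REC-3"], ["TIE-5"]

        def on_unsolicited(report, related_devices):
            confirm(report)                               # 1. acknowledge the event report
            statuses = class0123_read(related_devices)    # 2. holistic view of the topology
            to_open, to_close = flisr_decision(statuses)  # 3. restoration computation
            for dev in to_open:
                direct_operate(dev, "open")               # 4. isolate the faulted section
            for dev in to_close:
                direct_operate(dev, "close")              # 5. back-feed via the tie point
            class0123_read(related_devices)               # 6. verification read

        on_unsolicited({"device": "REC-1", "event": "lockout"},
                       ["REC-1", "REC-3", "TIE-5"])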

    Note: Throughout this section, the terms DAC and SEL FLISR DA Controller are used interchangeably.

    Cisco SEL FLISR Use case – Urban Topology

    FLISR Topology, Urban Area - One-line diagram:

    Figure 12 FLISR Topology, Urban Area – One-line diagram

     


    The one-line diagram in Figure 12 describes the FLISR topology in an urban area. The topology has been divided into ten segments (Seg1, Seg2, ..., Seg10). Red boxes represent energized (closed) reclosers, while green boxes represent unenergized (normally open) reclosers.

    • Seg 1, Seg 2, and Seg 3 are energized from Transformer 1

    • Seg 4 is energized from Transformer 2

    • Seg 5, Seg 6, and Seg 7 are energized from Transformer 3

    • Seg 8, Seg 9, and Seg 10 are energized from Transformer 4

    • Reclosers 2, 4, 5, and 8 are in a Normally Open state

    Urban FLISR Topology - SEL device to Cisco device mapping

    Figure 13 FLISR Topology, Urban Area – SEL reclosers to Cisco device mapping

     


    This topology captures the 1-to-1 mapping of SEL recloser devices to Cisco IR510 devices. The controller device is located in the primary control center. The CR mesh is aggregated at the Field Area Network aggregator, a CGR 1000 Series router, which can be located in the substation. The communication between the substation and the control center can be over a public or private WAN. Each SEL device is positioned behind an IR510 and connected using Ethernet.

    In the above figure:

    • The device representing the DA CONTROLLER (SEL3530-2) is located in the primary control center.

    • The device representing Recloser 9 (SEL3505-9) is associated with IR510-13.

    • The device representing Recloser 6 (SEL3505-6) is associated with IR510-12.

    • The device representing Recloser 1 (SEL3505-1) is associated with IR510-1.

    • The device representing Recloser 10 (SEL3505-10) is associated with IR510-23.

    • The device representing Recloser 7 (SEL3505-7) is associated with IR510-22.

    • The device representing Recloser 2 (SEL3505-2) is associated with IR510-21.

    • The device representing Recloser 8 (SEL3505-8) is associated with IR510-33.

    • The device representing Recloser 4 (SEL3505-4) is associated with IR510-32.

    • The device representing Recloser 5 (SEL3505-5) is associated with IR510-42.

    • The device representing Recloser 3 (SEL3505-3) is associated with IR510-4.

    The following table captures the individual mapping of each SEL device to its Cisco mesh device, along with the mesh hop depth. Each mapped SEL/Cisco device pair is located on the mesh. The FLISR Controller (SEL3530-2) is located in the control center; the rest of the SEL devices (SEL 3505) are located along the substation and feeder.

    Table 1 FLISR Urban Topology Components

    One-Line Diagram Dev Label | SEL Device Name | Cisco Mesh Device Name | Mesh Node Hop Depth
    Rec1             | SEL3505-1  | IR510-1  | 1
    Rec6             | SEL3505-6  | IR510-12 | 1
    Rec9             | SEL3505-9  | IR510-13 | 1
    Rec2             | SEL3505-2  | IR510-21 | 2
    Rec7             | SEL3505-7  | IR510-22 | 2
    Rec10            | SEL3505-10 | IR510-23 | 2
    Rec4             | SEL3505-4  | IR510-32 | 3
    Rec8             | SEL3505-8  | IR510-33 | 3
    Rec3             | SEL3505-3  | IR510-4  | 4
    Rec5             | SEL3505-5  | IR510-42 | 4
    FLISR Controller | SEL3530-2  | N/A      | N/A

    The next section discusses a few different types of FLISR Faults (Fault with LockOut, Open Phase, Loss of Source), each of them in three different states (Normal, Fault, Restored).

    FLISR Fault scenario - Fault with Lock Out

    In the case of a permanent fault, the recloser goes into the lockout state until the fault is resolved by a utility technician. For example, when a tree branch falls on the distribution line and conductors break, each recloser that notices the fault trips and sends an unsolicited DNP3/IP message upstream to the DA FLISR Controller. This covers the fault location identification phase of FLISR.

    When the fault location is identified, the DA controller gets the overall current status of the topology by sending a Class0123 poll to all the related reclosers. This group includes devices in the affected feeder section(s) and any adjacent sections connected with a normally open point. Then the DAC performs the FLISR restoration computation, and proceeds to the FLISR restoration phase.

    In the FLISR restoration phase, the DA Controller sends DNP3/IP commands to the respective reclosers that need to change state in order to restore service. Once the DA FLISR controller finishes the reconfiguration, it performs one round of Class0123 polling to ensure stability. This covers the fault isolation and service restoration phases of FLISR.

    Normal state

    The Urban One-line topology in normal operational state is shown below.

    Figure 14 Urban Topology - Fault with Lockout - Normal State

     


    Fault state

    When a fault is introduced in Segment 2, between Recloser 1 and Recloser 2 at point B, Recloser 1 changes its state from Normally Closed (NC) to Open. This results in loss of power to the following segments, shown in the figure below:

    • Segment 2 (where the fault occurred)

    • Segment 3 (where the fault did not occur)

    Figure 15 Urban Topology - Fault with Lockout - Fault State

     


    In the faulted state, customers in the faulted region in segment 2 experience a loss of power; customers in the non-faulted region in segment 3 also experience a loss of power.

    The moment the fault occurs, Recloser 1, which detects the fault (in segment 2), changes its state from Closed to Open and enters the Lockout state. Because a recloser going into Lockout indicates a permanent fault, the DA FLISR controller then sends Class0123 polls.

    Points to note in the topology:

    • Recloser 1: Changes its state from Normally Closed to Open and is locked out.

    • Recloser 3: May continue to be in the Closed state, but there is no power on the feeder line.

    • Consumers of power located in Segment 2 and Segment 3 experience loss of service.

    Restored state

    The FLISR SEL DA controller response sequence is outlined below:

    1. Change the state of Recloser 3, located between segments 2 and 3, from Normally Closed to Open.

    2. Change the state of Recloser 5, located between segments 3 and 8, from Normally Open to Closed, provided Transformer 4 does not become overloaded by serving segment 3 (see the capacity-check sketch below).

    This results in energizing segment 3 with power from the Transformer 4 feeder.
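    A minimal sketch of the capacity check implied by step 2 is shown below; the load figures, transformer rating, and safety margin are assumed values used only to illustrate the decision.

        # Illustrative capacity check before closing a normally open tie (values assumed).
        TRANSFORMER4_RATING_KVA = 10000
        transformer4_present_load_kva = 7200
        segment3_load_kva = 1800

        def can_back_feed(present_load, pickup_load, rating, margin=0.9):
            """Close the tie only if the alternate source stays within a safety margin."""
            return present_load + pickup_load <= margin * rating

        if can_back_feed(transformer4_present_load_kva, segment3_load_kva,
                         TRANSFORMER4_RATING_KVA):
            print("Open Recloser 3, then close Recloser 5 (back-feed segment 3 from Transformer 4)")
        else:
            print("Leave Recloser 5 open: Transformer 4 would be overloaded")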

    Figure 16 Urban Topology - Fault with Lockout - Restored State

     


    The results for customers are as follows:

    • Customers in the non-faulty region in segment 3 have power service restored with the help of FLISR.

    • Due to fault isolation, the power outage is restricted to customers in the faulty (and isolated) region in segment 2.

    Summary:

    • Fault Location Identification - identified in segment 2.

    • Service Restoration - Restoring the power to the non-faulty segment (segment 3).

    • Isolation - Fault has been restricted to affected customers in segment 2 alone.

    FLISR Fault scenario – Open Phase

    An Open Phase fault applies to three-phase circuits, where one of the line voltages is lost. One cause could be a bad or loose street pole line jumper that interrupts the line. The loss of voltage triggers a DNP3/IP unsolicited message to the DA Controller. The DA controller then initiates the necessary action, similar to the Fault with Lock Out scenario.

    Normal state

    Figure 17 FLISR Urban Topology – Open Phase - Normal State

     


    Figure 17 portrays the normal state of the FLISR Urban topology. Take note of segment 8. The reclosers corresponding to this segment are Recloser 5, Recloser 8, and Recloser 10. Of these three reclosers, Reclosers 5 and 8 are Normally Open. Recloser 10 is in the Closed state, thus energizing segment 8 with the power source from Transformer 4.

    Fault state

    The Open Phase fault occurs at point B of Recloser 10, as highlighted in Figure 18 with a red "X".

    Figure 18 FLISR Urban Topology – Open Phase - Fault State


    Figure 18 portrays the fault state of the FLISR Urban topology. When the fault occurs in segment 8, customers in that segment experience a loss of power. The participating reclosers report the change of state using DNP3/IP Unsolicited Response messages to the FLISR DA controller located in the control center.

    The DA controller sends DNP3/IP Class0123 polling from all the related reclosers in the FLISR Urban topology to get a holistic view of the topology status. The set of related SEL devices includes devices in the affected feeder section(s) and any adjacent sections connected with a normally open point.

    Note: In comparison to the Fault with Lock Out scenario, in the fault state only customers in the faulty segment 8 experience loss of power; all the other segments still have power.

    Restored state

    Segment 8 is chosen to illustrate the following point: in cases where no segment other than the faulty one is affected, the FLISR restoration action can be to do nothing.

    Figure 19 FLISR Urban Topology – Open Phase - Restored State

     


    In this state:

    • The DA controller makes the decision to change the state of Recloser 10 from normally closed to Open.

    • Due to fault isolation, the power outage is restricted to customers in the faulty region in segment 8.

    • No customers in any other segment are affected.

    Summary:

    • Fault Location Identification - identified in segment 8.

    • Service Restoration - No FLISR restoration is needed in this case; only isolation of the faulty segment.

    • Isolation - Fault has been restricted to affected customers in segment 8 alone.

    FLISR Fault scenario – Loss of Source

    Loss of source applies to a fault that occurs within the substation yard. For example, a bus fault at the substation could cause an outage for the entire feeder originating from that substation. In this scenario, the relay within the substation notifies the SEL DA controller of the loss of power, which then initiates the restoration process.

    Normal state

    Figure 20 Urban Topology – Loss of Source - Normal State

     


    Figure 20 portrays the normal state of the FLISR Urban topology. Notice the feeder line from Transformer 3, which includes segment 7, segment 6, and segment 5.

    In the normal operational state:

    • Reclosers 4 and 8 are Normally Open. Circuit Breaker 3 and Reclosers 6 and 7 are Closed.

    • All three segments (5, 6, and 7) derive their power source from Transformer 3.

    Fault state:

    The third feeder transformer was taken out of service to simulate a loss of power. The substation feeder breaker (Brk3) tripped and all the downstream customers lost power.

    Figure 21 Urban Topology – Loss of Source - Fault State

     


    In fault state:

    • Reclosers 4 and 8 are still in the Normally Open state.

    • Circuit Breaker 3 and Reclosers 6 and 7 are in the Closed state.

    • All three segments (5, 6, and 7) experience loss of power, as Transformer 3 is out of service.

    The circuit breaker and recloser devices send DNP3/IP unsolicited messages (about the loss of power) upstream to the SEL DA controller device located in the control center. The SEL DA controller performs DNP3/IP Class0123 polling of all the related reclosers in the FLISR Urban topology to get a holistic view of the current state of the topology.

    Restored state

    The SEL FLISR DA Controller finds the optimal way to restore service and, in this case, it chooses to:

    • Change Circuit Breaker 3 from the Closed state to the Open state.

    • Change the Recloser 8 state from Normally Open to Closed.

    This results in energizing segments 5, 6, and 7 with power from the Transformer 4 feeder.

    Figure 22 Urban Topology – Loss of Source - Restored State

     


    In this restored state:

    • Customers in all the affected segments (5, 6, and 7) have power service restored with the help of FLISR.

    • Due to fault isolation, Transformer 3 is taken out of the picture; the affected segments are now served by power from Transformer 4.

    Summary:

    • Fault Location - identified the fault in Transformer 3.

    • Isolation - Fault has been restricted to Transformer 3, which has been taken out of service.

    • Service Restoration - Restored the power to the affected segments using an alternate power source (Transformer 4).

    Cisco SEL FLISR Use case – Rural Topology

    FLISR Topology – Rural Area - One-line diagram:

    Figure 23 FLISR Topology (Rural Area) – One-line diagram

     


    Figure 23 captures the one-line diagram of the FLISR topology in a rural area. The topology has been divided into eleven segments (Seg1, Seg2, ..., Seg11). Red boxes represent energized (closed) power lines, and green boxes represent unenergized (open) power lines. The segments are energized as follows:

    • Seg 1-6 are energized from Transformer 1

    • Seg 7-11 are energized from Transformer 2

    Rural FLISR Topology - SEL device to Cisco device mapping:

    Figure 24 FLISR Topology (Rural Area) – SEL reclosers to Cisco device mapping diagram

     


    Figure 24 captures the 1-to-1 mapping of SEL recloser devices to Cisco IR510 devices. The DA controller device is located in the primary control center. The CR mesh is aggregated at the Field Area Network aggregator, a CGR 1000 Series router, which can be located in the substation. The communication between the substation and the control center can be over a public or private WAN. Each SEL device is positioned behind an IR510 and connected using Ethernet.

    The following table captures the individual mapping of each SEL device to its Cisco mesh device, along with the mesh hop depth. Each mapped SEL/Cisco device pair is located on the mesh. The FLISR Controller (SEL3530-2) is located in the control center; the other SEL devices (SEL 3505s) are located along the substation and feeder.

    Table 2 FLISR Rural Topology Components

    One-Line Diagram Dev Label | SEL Device Name | Cisco Mesh Device Name | Mesh Node Hop Depth
    Rec1             | SEL3505-1  | IR510-1  | 1
    Rec2             | SEL3505-2  | IR510-2  | 2
    Rec3             | SEL3505-3  | IR510-3  | 3
    Rec4             | SEL3505-4  | IR510-4  | 4
    Rec5             | SEL3505-5  | IR510-5  | 5
    Rec6             | SEL3505-6  | IR510-6  | 6
    Rec7             | SEL3505-7  | IR510-7  | 7
    Rec8             | SEL3505-8  | IR510-8  | 8
    Rec9             | SEL3505-9  | IR510-9  | 9
    Rec10            | SEL3505-10 | IR510-10 | 10
    FLISR Controller | SEL3530-2  | N/A      | N/A

    A few different types of FLISR faults (Fault with LockOut, Open Phase, and Loss of Source) are discussed, each in three different states (Normal, Fault, and Restored).

    SEL FLISR Rural Topology - FLISR Fault scenario - Fault with LockOut

    This failure scenario in the Rural topology is very similar to "SEL FLISR Urban Topology - FLISR Fault scenario - Fault with LockOut". From the SEL FLISR use case point of view, there is not much difference: affected segments send DNP3/IP unsolicited messages to the SEL FLISR DA Controller located in the control center, which in turn performs Class0123 polling to get a holistic view of the FLISR topology. The SEL DA controller then performs the required restoration operations. However, the underlying topology of the Cisco RF mesh is different: the Rural topology uses a hierarchical multi-hop topology.

    Normal state

    In the normal operational state, the Rural one-line topology looks like Figure 25 below.

    Figure 25 Rural Topology - Fault with Lockout - Normal State

     

    386555.jpg

    In Figure 25 above:

    blank.gifRecloser 6 is in the Normally Open (NO) state. Every other device is in the closed state (the line is energized).

    blank.gifSegments 1-6 are powered by the feeder from transformer1.

    blank.gifSegments 7-11 are powered by the feeder from transformer2.

    Fault state:

    When a fault is introduced in Segment 3 (between Recloser 2 and Recloser 3), Recloser 2, which participates in that segment, changes its state from Normally Closed (NC) to Open. This results in loss of power to the following segments (as shown in the figure below):

    blank.gifSegment 3 (between Reclosers 2 and 3) - where the fault occurred

    blank.gifSegments 4-6 (between Reclosers 3 and 6) - where the fault did not occur

    Figure 26 Rural Topology - Fault with Lockout - Fault State

     

    386556.jpg

    In this state, customers in the faulty region in segment 3 are experiencing loss of power. Along with them, customers in the non-faulty regions belonging to segments 4-6 are also experiencing loss of power.

    The moment the fault occurred, Recloser 2, participating in segment 3, changed its state from Closed to Open. Meanwhile, the following reclosers would send DNP3/IP unsolicited messages upstream to the SEL FLISR DA controller:

    blank.gifRecloser 2 – conveys the fault.

    blank.gifReclosers 3-6 – convey the loss of power.

    The SEL FLISR DA controller then performs Class 0123 polling on every related recloser to get a holistic view of the topology before making any restoration decision.

    Restored state

    Figure 27 Rural Topology - Fault with Lockout - Restored State

     

    386557.jpg

    The FLISR SEL DA controller decides:

    1.blank.gif To change the state of Recloser 3 (located between segments 3 and 4) from Normally Closed to Open.

    2.blank.gif To change the state of Recloser 6 (located between segments 6 and 7) from Normally Open to Closed. This results in energizing segments 4-6 from the transformer2 feeder.

    In this state:

    blank.gifCustomers in the non-faulty regions in segments 4-6 would have power service restored with the help of FLISR.

    blank.gifDue to fault isolation, the power outage is restricted to customers in the faulty region in segment 3.

    Summary:

    blank.gifFault Location Identification - the fault was identified in segment 3.

    blank.gifService Restoration – restored power to the non-faulty segments (segments 4-6).

    blank.gifIsolation - the fault has been restricted to the affected customers in segment 3 alone.

    SEL FLISR Rural Topology - FLISR Fault scenario – Open Phase

    An Open Phase fault applies to three-phase circuits, where one of the line voltages is lost. One possible cause is a bad or loose street pole line jumper that interrupts the line. The loss of voltage again triggers a DNP3/IP unsolicited message to the DA controller, which then initiates the necessary actions, similar to the Fault with Lockout scenario.

    Normal state

    Figure 28 FLISR Rural Topology – Open Phase - Normal State

     

    386558.jpg

    Figure 28 portrays the normal state of the FLISR Rural topology. The emphasis is placed on segments 7 and 8. The reclosers corresponding to these segments are Reclosers 6, 7, and 8.

    In Figure 28 above:

    blank.gifRecloser 6 is in the Normally Open state. Every other device is in the closed state (the line is energized).

    blank.gifSegments 1-6 are powered by feeder from transformer1

    blank.gifSegments 7-11 are powered by feeder from transformer2

    Fault state

    The Open Phase fault occurs at point A of Recloser 7, as highlighted with a red “X” in Figure 29 below.

    Figure 29 FLISR Rural Topology – Open Phase - Fault State

    386559.jpg

    Figure 29 portrays the fault state of the FLISR Rural topology when the fault occurs in segment 8.

    This results in loss of power to the following segments:

    blank.gifSegment 8 (between Reclosers 8 and 7) - where the fault occurred

    blank.gifSegment 7 (between Reclosers 7 and 6) - where the fault did not occur

    In this state, customers in the faulty segment (8) are experiencing loss of power. Along with them, customers in the non-faulty segment (7) are also experiencing loss of power. Before the fault occurred, both of these segments (7 and 8) received power from transformer2.

    The moment the fault occurred in segment 8, the participating reclosers sent DNP3/IP unsolicited messages upstream to the SEL FLISR DA controller located in the control center. The SEL FLISR DA controller then performs Class 0123 polling on every related recloser in the FLISR Rural topology to get a holistic view of the topology before making any restoration decision.

    Restored state

    Figure 30 FLISR Rural Topology – Open Phase - Restored State

     

    386560.jpg

    The FLISR SEL DA controller decides:

    1.blank.gif To change the state of Recloser 8 (located between segments 9 and 8) from Closed to Open.

    2.blank.gif To change the state of Recloser 7 (located between segments 8 and 7) from Closed to Open.

    3.blank.gif To change the state of Recloser 6 (located between segments 6 and 7) from Normally Open to Closed. This results in energizing segment 7 from the transformer1 feeder.

    In this state:

    blank.gifCustomers in the non-faulty region in segment 7 would have power service restored with the help of FLISR.

    blank.gifDue to fault isolation, the power outage is restricted to customers in the faulty region (segment 8).

    Summary:

    blank.gifFault Location Identification - the fault was identified in segment 8.

    blank.gifService Restoration – restored power to the non-faulty segment (segment 7).

    blank.gifIsolation - the fault has been restricted to the affected customers in segment 8 alone.

    SEL FLISR Rural Topology - FLISR Fault scenario – Loss of Source:

    Loss of Source applies to faults that occur within the substation yard, for example, a failed transformer or a bus fault in the substation that causes an outage on the entire feeder originating from that substation. In this scenario, the relay within the substation notifies the SEL DA controller of the loss of power, and the controller then initiates the restoration process.

    Normal state

    Figure 31 Rural Topology – Loss of Source - Normal State

     

    386561.jpg

    Figure 31 portrays the normal state of the FLISR Rural topology. The emphasis is placed on the entire feeder line from transformer1 (which includes segments 1-6). In the normal operational state:

    blank.gifRecloser 6 is in the Normally Open state. All other devices (circuit breakers and reclosers) are in the Closed state.

    blank.gifSegments 1-6 derive power from transformer1.

    blank.gifSegments 7-11 derive power from transformer2.

    Fault state

    The first feeder transformer was taken out of service to simulate a loss of source. Once transformer1 is taken out of service, all downstream customers in segments 1-6 experience loss of power.

    Figure 32 Rural Topology – Loss of Source - Fault State

     

    386562.jpg

    In fault state:

    blank.gifSegments 1-6 would experience loss of power, as transformer1 is out of service.

    blank.gifRecloser 6 is still in the Normally Open state.

    The circuit breaker and recloser devices send DNP3/IP unsolicited messages upstream to the SEL FLISR DA controller located in the control center about the loss of power on the feeder line. The SEL DA controller performs DNP3/IP Class 0123 polling of all the related reclosers in the FLISR Rural topology to get a holistic view of the current state of the topology.

    Restored state

    The SEL FLISR DA Controller finds the optimal way to restore service and, in this case, chooses to:

    blank.gifChange Circuit Breaker 1 from the Closed state to the Open state.

    blank.gifChange Recloser 6 from the Normally Open state to the Closed state, provided the additional load will not cause an overload on transformer2.

    blank.gifThis results in energizing segments 1-6 from the transformer2 feeder.

    Figure 33 Rural Topology – Loss of Source - Restored State

     

    386563.jpg

    In this state:

    blank.gifCustomers in all the affected segments (1-6) would have power service restored with the help of FLISR.

    blank.gifDue to fault isolation, transformer1 is isolated by opening Circuit Breaker 1. The affected segments are now served by transformer2.

    Summary:

    blank.gifFault Location Identification - the fault was identified at transformer1.

    blank.gifService Restoration – restored power to the affected segments using the alternate power source (transformer2).

    blank.gifIsolation - the fault has been restricted to the section between transformer1 and Circuit Breaker 1.

    Cisco Resilient (CR) Mesh - Design Considerations for Centralized FLISR use case

    This section covers common design considerations, followed by capacity planning of the CR mesh for deployment of the Fault Location, Isolation, and Service Restoration (FLISR) use case. It also discusses the number of Distribution Automation (DA) gateways that could be positioned in the CR mesh for the FLISR use case, along with a few mesh topology combinations.

    It is vital to dissect and understand the application requirements and the traffic characteristics it exhibits in order to determine whether the CR mesh can cater to it. The first step is to understand the traffic profile of the application being considered for deployment on the CR mesh. A few questions to consider are listed below:

    blank.gifUnderstanding the packet profile of the application traffic, for example, the FLISR application traffic profile.

    blank.gifWhat subset of the packet profile is periodic? These packets would be exchanged even without any FLISR event.

    blank.gifWhat subset of the packet profile is event driven and would be exchanged only when there is a FLISR event? For example, an unsolicited report is generated only when there is a fault in the feeder, followed by the FLISR restoration phase.

    blank.gifWhat is the latency requirement of the application? For example, 100 ms vs. 1 second vs. 5 minutes.

    blank.gifIs it a time-critical application or a noncritical application?

    blank.gifHow many devices participate in the FLISR traffic profile under analysis?

    blank.gifWhat is the accepted maximum duration for the completion of the FLISR use case: 2 minutes, 5 minutes, or another value?

    blank.gifWhat is the impact of the DA FLISR algorithm on the CR mesh? For example, when a FLISR fault event occurs, does the FLISR DA controller have to interact with all the nodes in the CR mesh or only with a subset of them? This affects the number of nodes (IR510) competing for the mesh capacity.

    Additional factors to consider are:

    blank.gifThe number of packets of varying sizes being transmitted (very small, small, medium, and large packet sizes).

    blank.gifThe classification of the packets being transmitted (some may be periodic, some event driven).

    blank.gifThe frequency of packets being transmitted (is the application bandwidth intensive?).

    blank.gifThe area and distance that need to be aggregated (Urban vs. Rural) by the CGR and CR mesh.

    blank.gifThe transport layer used for application traffic (UDP vs. TCP), with the recommendation being UDP.

    blank.gifDNP3 security, if used, which increases the payload size.

    blank.gifThe average number of FLISR events per day.

    Evaluating Number of DA devices in mesh for given FLISR Application in single PAN – A Methodology

    Figure 34 Methodology for evaluating number of DA devices in mesh for FLISR Application

     

    386567.jpg

    Can a given FLISR application be run over the CR mesh? The answer lies in understanding the characteristics of the given FLISR application in terms of the payload sizes exchanged and their respective weightage. Weightage in this case is the number of occurrences of a particular payload size in the FLISR packet profiling. Based on this understanding of the FLISR application traffic profile, a FLISR mix of payload sizes can be formed. It is then recommended to run performance and latency tests for the FLISR mix of payload sizes across various ranks.

    The obtained throughput and latency can then be evaluated to check whether they satisfy the FLISR application's bandwidth and latency requirements. Combining this observation with the maximum accepted duration for completion of the FLISR use case, the number of DA devices that could be supported in the CR mesh can be derived.

    Table 3 FLISR Mix of Payload sizes for 10 participating devices – A sample

     

    FLISR packet type | Payload size (in Bytes) | Number of datagrams | Direction of Application Traffic
    Class123 Read, Class123 Response | 26 | 29 | DA Controller (DAC) to IED
    Class123 Read, Class123 Response | 20 | 29 | IED to DAC
    Class0123 Read, Class0123 Response | 29 | 46 | DAC to IED
    Class0123 Read, Class0123 Response | 112 | 43 | IED to DAC
    Select/Operate, Response | 37 | 24 | DAC to IED
    Select/Operate, Response | 39 | 24 | IED to DAC
    Response, Unsolicited Response, Unsolicited Confirmation | 35 | 18 | IED to DAC
    Response, Unsolicited Response, Unsolicited Confirmation | 45 | 47 | IED to DAC
    Response, Unsolicited Response, Unsolicited Confirmation | 59 | 11 | IED to DAC
    Response, Unsolicited Response, Unsolicited Confirmation | 91 | 10 | IED to DAC
    Response, Unsolicited Response, Unsolicited Confirmation | 138 | 8 | IED to DAC
    Response, Unsolicited Response, Unsolicited Confirmation | 206 | 5 | IED to DAC
    Response, Unsolicited Response, Unsolicited Confirmation | 17 | 78 | DAC to IED
    Total number of Datagrams for 10 DA devices | | 372 Datagrams |
     

    Note: The 372 datagrams in the table above correspond to the application goodput in the sample FLISR use case. All the discussion below revolves around goodput to derive the number of traffic-generating DA (utility) devices in the CR mesh.
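
    The same total can be reproduced directly from the payload mix. The following is a minimal sketch, assuming only the sample values from Table 3, that sums the per-payload datagram counts and bytes for one FLISR cycle; the numbers are illustrative, not measured results.

    # Minimal sketch: reproduce the Table 3 datagram total for 10 DA devices.
    # The (payload_size_bytes, datagram_count) pairs are the sample values above.
    flisr_mix = [
        (26, 29), (20, 29),          # Class123 read/response
        (29, 46), (112, 43),         # Class0123 read/response
        (37, 24), (39, 24),          # Select/Operate and response
        (35, 18), (45, 47), (59, 11),
        (91, 10), (138, 8), (206, 5), (17, 78),   # unsolicited reporting
    ]

    total_datagrams = sum(count for _, count in flisr_mix)
    total_bytes = sum(size * count for size, count in flisr_mix)

    print(f"Datagrams per FLISR cycle (10 devices): {total_datagrams}")  # 372
    print(f"Application payload per cycle: {total_bytes} bytes")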

    Common Design Considerations

    This section covers the common design considerations, the various stages involved, and the different types of traffic that consume and compete for the available mesh capacity.

    Figure 35 CR Mesh Design Considerations

     

    386568.jpg

    RF Planning

    Before the actual RF mesh deployment, theoretical planning of the RF site needs to be carried out with the help of RF planning tools such as ATDI. Details such as radio and antenna parameters are entered into the RF planning tool. Locations for the IR510, IR530, and CGR also need to be fed into the tool.

    The objective of this RF planning exercise is to predict the received signal strength of any given RF link.

    RF Site Survey

    Using the results from the RF planning tool, the theoretical data must be verified with live field tests. In this phase, CR mesh devices such as the IR510, IR530, and CGR are deployed in the planned locations, with antennas mounted at the desired height. Check the received and forward signal strength of any given RF link, and ensure that the RSSI values measured during this RF site survey phase for any particular RF device are in the expected range, as planned during the RF planning phase.

    In this phase, the presence of any interference is also evaluated. Interference could be temporary or permanent. The noise floor should also be evaluated with the help of a spectrum analyzer.

    Note: If in-band interference is affecting the performance numbers of a node, you may consider improving the quality of the links by adding a range extender. To maintain stronger signal strength, target an RSSI greater than or equal to -80 dBm.

    For out-of-band interference, consider using a bandpass filter.

    Interference could be of transient nature or of permanent nature.

    blank.gifTransient interference conditions could be due to weather changes, foliage, obstacles in front of the antennas, or metallic objects in the way that change after a short time. In these instances, the mesh self-heals and retries transmission across all channels; when conditions improve, the retry mechanism resolves the issue and performance returns to normal.

    blank.gifPermanent interference: As part of the RF site survey, if the mean reading indicates that certain sections of the ISM band are excessively used, the operator may consider notching out the affected channels.

    Note: CR Mesh performance would be proportional to the number of channels being used.

    RF Radio medium

    RF radio is a half-duplex medium: the radio can either transmit or receive at any point in time. The IR510, IR530, and CGR support OFDM and 2FSK modulations. In OFDM, five data rates are supported (1200, 800, 400, 200, and 50 Kbps). Each of these modulations has a different data rate and RSSI budget. For more details on the supported data rates, refer to the IR510 datasheet.

    Data Rate vs Packet Rate vs Goodput

    Theoretical maximum throughput for MAC/PHY technologies is expressed in Kbps or Mbps. Another measurement metric is packet rate, which signifies the number of packets that can be transmitted in any given second. The packet rate varies with the packet size under consideration. For example, comparing individual packet sizes of 64 and 512 bytes, the packet rate for 64-byte packets is higher than the packet rate for 512-byte packets. For a range of packet sizes, a packet mix can be composed, and the maximum packet rate obtained for the chosen packet mix.

    Figure 36 Max Data Rate vs Packet rate vs Goodput

     

    386569.jpg

    The maximum data rate represents the physical rate of the chosen modulation/PHY technology (for example, OFDM 800). For a chosen modulation and maximum data rate, and for the chosen packet size (or mix of packet sizes), a certain number of packets (the packet rate) can be transmitted. Of these, some packets could get dropped due to congestion and may have to be retransmitted. Since the retransmissions also have to use the same physical capacity, the net effective capacity that is available for application traffic, after discounting the retransmissions and lower layer communication overheads, is referred to as goodput in this context.

    The key to extracting the maximum performance out of the CR mesh lies in designing the network to minimize retransmissions and reduce the possibility of congestion. It is also vital to clearly understand the characteristics of the application traffic that the mesh needs to carry and to consider it while designing the network.

    Note: One important recommendation is to plan the CR mesh for 40-50% utilization of its full capacity, leaving the rest for peak traffic, and with some room for future growth.

    It is not good practice to plan the CR mesh for 90-100% utilization of its capacity, which invites instability given the congestion and retransmission behavior inherent in a half-duplex medium.

    Also, UDP as the Layer 4 transport protocol offers better performance than TCP over the CR mesh; hence, the use of UDP is highly recommended.

    It is also recommended to configure RPL (Routing Protocol for Low-Power and Lossy Networks) to operate in storing mode, which eliminates the source routing requirement for downstream communication to the lower rank devices. This reduces the source routing header overhead, which in turn can improve mesh performance. Storing mode, when enabled, has the additional benefit of allowing peer-to-peer traffic inside the PAN.

    It is highly recommended to mark latency-sensitive critical traffic with a DSCP value of 18 (AF21), and the next level of moderately critical traffic with a DSCP value of 10 (AF11). All other traffic can either be marked with a DSCP value of 0 or left unmarked. Ideally, the QoS DSCP values are marked on the packets by the utility controller devices located in the field as well as by the devices located in the control center. For more details, refer to the Quality of Service section.
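
    As an illustration of this marking recommendation, the following is a minimal sketch of an application endpoint setting the recommended DSCP values on its outbound UDP (DNP3/IP) traffic; the controller address and payload shown are hypothetical placeholders.

    import socket

    DSCP_AF21 = 18   # latency-sensitive critical traffic (recommended above)
    DSCP_AF11 = 10   # moderately critical traffic

    def open_marked_udp_socket(dscp: int) -> socket.socket:
        """Open a UDP socket whose outgoing packets carry the given DSCP value.

        The DSCP field occupies the upper six bits of the IP TOS byte,
        hence the left shift by two bits.
        """
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp << 2)
        return sock

    # Hypothetical example: send a placeholder DNP3/IP datagram marked as AF21
    # toward the DA controller (documentation address, standard DNP3 port 20000).
    sock = open_marked_udp_socket(DSCP_AF21)
    sock.sendto(b"\x05\x64", ("192.0.2.10", 20000))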

    CR Mesh Capacity Planning

    The following are some of the different types of traffic that consume and compete for the available mesh capacity.

    blank.gifCR Mesh control plane traffic (RPL routing, keepalives, multicast control plane, and so on).

    blank.gifCR Mesh devices (IR510, IR530) onboarding traffic (IEEE 802.1x, IEEE 802.11i, DHCPv6, and so on).

    blank.gifCR Mesh devices management traffic (CoAP, CSMP, and so on).

    blank.gifCR Mesh devices troubleshooting traffic (ping, traceroute, and so on).

    blank.gifCR Mesh devices Upgrade and other necessity driven traffic (Firmware upgrade, IOx upgrade, security update, and so on).

    blank.gifApplication traffic – periodic (Example, periodic DNP3 Class123 polling every 5 minutes)

    blank.gifApplication traffic – event driven (Example, power outage, fault identification, and successive poll/control traffic).

    blank.gifDifferent types of protocols used for Application communication – (Example, DNP3, MODBUS, IEC 60870, DLMS/COSEM, and so on.)

    blank.gifUtility device management traffic (Example, Accessing Utility controller device behind IR510)

    Figure 37 Mesh Capacity Planning

     

    386570.jpg

    Figure 37 captures a sample mesh capacity allocation for the different types of traffic that compete for bandwidth.

    Considering the mesh capacity available between the CGR and all Rank 1 IR510/IR530 devices as 100%, plan 50% of it for the NMS and application use cases. Leave the remaining 50% as a buffer for upgrades and other necessity-driven traffic, and for future scaling of the number of mesh devices. Of the planned 50% of mesh capacity, 10% can be assumed for control plane traffic and NMS-related operations (mesh device onboarding, management, and troubleshooting). The remaining 40% of the mesh capacity can be planned for the application use case, which includes both periodic and event-driven traffic.

    Notes:

    While the recommendation for planned capacity is 50%, it could be increased up to a maximum of 70%, leaving 30% of headroom.

    The numbers referenced in the section below are for demonstration purposes only, to illustrate the procedure. The actual performance and latency numbers for the given application need to be derived, as mentioned in the subsection Evaluating Number of DA devices in mesh for given FLISR Application in single PAN – A Methodology.

    Because the planning is done for the worst case and considers the weakest links, the obtained result could be better than expected.

    The goodput obtained during the performance and latency tests for the given FLISR mix of payload sizes can serve as a good reference for the mesh capacity. For example, if the goodput obtained at Rank 1 is 70 datagrams per second for the FLISR mix of payload sizes, these 70 datagrams/s can be treated as 100% of the mesh capacity for all Rank 1 nodes. Planning for 40% of the mesh capacity at Rank 1 translates to 40% of 70 = 28 Datagrams per second (DPS) for the entire Rank 1.

    This 40% capacity (that is, 28 Datagrams per second) at Rank 1 can then be planned for sharing with lower rank nodes. If the FLISR application use case requires 372 datagrams (for 10 devices) to complete, then 10 devices positioned at Rank 1 would take (372/28 =) ~14 seconds' worth of mesh capacity at 40% loading to complete the FLISR use case, as worked through in the sketch below. In reality, this could complete faster, as there might be unused mesh capacity from the remaining 60%. Again, this is a sample number used only to demonstrate the procedure.
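
    The following is a minimal sketch of the capacity-split and completion-time arithmetic described above, using only the sample values from this section (70 DPS Rank 1 goodput, the 40/10/50 split, and the 372-datagram FLISR cycle); none of these figures are measured CR mesh numbers.

    # Sample capacity-planning arithmetic from this section (illustrative values).
    rank1_goodput_dps = 70        # 100% goodput measured at Rank 1 (sample)
    app_share, nms_share = 0.40, 0.10

    app_dps = rank1_goodput_dps * app_share   # 28 DPS planned for the application
    nms_dps = rank1_goodput_dps * nms_share   # 7 DPS planned for NMS/control plane

    flisr_datagrams = 372                     # one FLISR cycle for 10 devices (Table 3)
    completion_s = flisr_datagrams / app_dps  # ~13.3 s, rounded to ~14 s in the text

    print(f"Application share: {app_dps:.0f} DPS, NMS share: {nms_dps:.0f} DPS")
    print(f"FLISR completion at 40% loading: ~{completion_s:.1f} seconds")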

    In the case of peer-to-peer traffic, there could be multiple simultaneous transmissions in the mesh at different hops and on different channels. This adds extra capacity to the mesh that is not considered in the explanation above.

    Determining the number of DA devices and mesh depth in the CR mesh for given application:

    The depth or the number of hops/ranks in the CR mesh could be dictated by the latency requirement of the use case application.

    Note: The next section assumes a few numbers to demonstrate the concept. These numbers do not represent the actual performance or latency numbers of the CR mesh. Run the performance and latency tests to arrive at the actual numbers pertaining to your CR mesh deployment.

    Determining the mesh depth

    To demonstrate an example, certain assumptions are made and the procedure to determine the mesh depth is shown below.

    Assumptions:

    blank.gifApplication requirement: Mesh needs to cater to an application that requires 100ms of one-way latency.

    blank.gifThe latency requirement of 100ms is between CR mesh device (IR510) and control center.

    blank.gifThe per-hop latency, based on the test results, is found to be in the range of 10-20 ms.

    Assuming the worst-case latency of 20 ms per hop, 5 hops would consume the 100 ms latency budget. Given such data points, it is recommended to keep the mesh depth to 5 hops or fewer.

    Similarly, if the application can tolerate 1 second of one-way latency, the theoretical derivation would be (1000 ms/20 ms =) 50 hops. However, it is recommended not to exceed a rank depth of 8-10, especially when the CR mesh is also used for latency-sensitive traffic.

    Note: Although this section recommends 8-10 rank depth maximum, if the requirement is to reach remote areas with poor connectivity, you could consider increasing the hop depth with range extenders. Choosing to operate at a lower data rate (For example, OFDM 50 Kbps) along with increased hop depth is another option to consider.
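
    The depth derivation above can be expressed as a simple calculation. The following is a minimal sketch assuming the sample latency figures from this section (a worst-case per-hop latency of 20 ms and the 8-10 rank ceiling); actual per-hop latency must come from your own tests.

    import math

    def max_mesh_depth(one_way_budget_ms: float,
                       worst_hop_latency_ms: float,
                       recommended_cap: int = 10) -> int:
        """Recommended maximum rank depth for a one-way latency budget."""
        theoretical = math.floor(one_way_budget_ms / worst_hop_latency_ms)
        # Regardless of the theoretical figure, this guide recommends staying
        # within 8-10 ranks when latency-sensitive traffic shares the mesh.
        return min(theoretical, recommended_cap)

    print(max_mesh_depth(100, 20))    # -> 5 hops for a 100 ms budget
    print(max_mesh_depth(1000, 20))   # theoretical 50 hops, capped at 10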

    Determining the number of DA devices in the CR mesh

    As a prerequisite for this section, performance tests must be run for the given application packet profile, as mentioned in the section “Evaluating Number of DA devices in mesh for given FLISR Application in single PAN – A Methodology”, to obtain the goodput measured in Datagrams per second (DPS) for the given application mix of payload sizes (for example, the FLISR payload mix).

    This section considers only 40% of the mesh goodput for the given application at any rank (hop), as per the recommended design. In other words, the intended application load can go up to a maximum of 40% of the mesh capacity plus retransmissions. Meanwhile, at any given second, the remaining unused capacity is also available for the application to use, for even better performance.

    Note: The data in the table below considers the goodput between the mesh nodes and the control center, without considering the extra capacity obtained when running peer-to-peer traffic.

    To obtain the output in the table below, bidirectional, multiple-flow iperf3 tests need to be executed with a condensed version of the multiple payload sizes and their respective weightages, as referenced in Table 3.
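
    The following is a minimal sketch of how such a test run could be driven, assuming an iperf3 server is reachable at the headend; the server address, offered load, and durations are placeholders, and each payload size in the FLISR mix would be exercised in its own run (or against separate server instances if truly simultaneous directions are needed).

    import subprocess

    SERVER = "2001:db8::10"   # hypothetical headend iperf3 server (started with: iperf3 -s)

    def run_udp_flow(payload_bytes: int, streams: int, reverse: bool = False):
        """Run one iperf3 UDP test for a single payload size from the FLISR mix."""
        cmd = [
            "iperf3", "-c", SERVER,
            "-u",                      # UDP, as recommended for CR mesh traffic
            "-l", str(payload_bytes),  # datagram size taken from Table 3
            "-P", str(streams),        # number of parallel flows
            "-b", "10K",               # per-stream offered load (placeholder)
            "-t", "60",                # test duration in seconds
        ]
        if reverse:
            cmd.append("-R")           # measure the server-to-client direction
        return subprocess.run(cmd, capture_output=True, text=True)

    # One run per direction: upstream (IED to DAC) and downstream (DAC to IED)
    # samples, here using the 112-byte and 29-byte Class0123 payload sizes.
    upstream = run_udp_flow(112, streams=3)
    downstream = run_udp_flow(29, streams=3, reverse=True)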

    Table 4 Mesh Capacity planning – GOODPUT Allocation table – A sample

     

    Rank (also referred to as hop) | 100% Capacity (Datagrams per Second) | 40% Capacity planned for application use case (DPS) | 10% Capacity planned for NMS and its activities (DPS) | 50% Capacity planned for peak performance + future expansion (DPS)
    Rank1 | 70 | 28 | 7 | 35
    Rank1-3 | 35 | 14 | 3.5 | 17.5
    Rank1-5 | 24 | 9.4 | 2.3 | 11.7
    Rank1-10 | 17 | 6.8 | 1.7 | 8.5

    Notes:

    The numbers shown in the above table are only sample values, chosen for the purpose of demonstrating the methodology used to derive the number of DA devices in the mesh. Actual numbers could vary depending on the chosen application packet profiling and its corresponding performance and latency numbers over the CR mesh.

    The data in Table 4 above assumes OFDM modulation with a data rate of 800 Kbps. The numbers would vary according to the chosen modulation and data rate.

    The 100% capacity assumed in the above table refers to the multiple-node throughput at the same rank. For example, 70 DPS refers to the cumulative throughput of multiple nodes in Rank 1. If only one node is present in Rank 1, the 100% capacity could be approximately 2X (for example, 2 x 70 DPS = 140 DPS). Only multiple-node throughput has been considered throughout the calculation.

    The data in the above table is for data-originating devices only (devices generating utility goodput traffic). The inclusion of range extenders anywhere in the path does not affect the calculation of the number of DA devices.

    For example, in the case of a three-rank hierarchy (Rank1-3), if the rank 3 device does not have a good communication link with the rank 2 device, including a range extender between the rank 2 and rank 3 devices is still fine, as range extenders only relay the existing traffic and do not introduce any new data traffic. Hence, a mesh hierarchy with 3 IR510s and one IR530 should be considered Rank 1-3 with respect to the above goodput allocation table.

    Similarly, a mesh hierarchy with 5 IR510s and 2 IR530s could be considered under the Rank1-5 category with respect to the above goodput allocation table.

    Data points required to derive the number of DA devices, for given application use case:

    blank.gifGoodput DPS for all devices in Rank 1 (for example, if 8 nodes are in Rank 1, obtain the cumulative DPS for all 8 nodes). This corresponds to the value 28 in the table above.

    blank.gifGoodput DPS for a 3-node hierarchy under any Rank 1 node. This corresponds to the value 14 in the table above for the Rank1-3 hierarchy.

    blank.gifGoodput DPS for a 5-node hierarchy under any Rank 1 node. This corresponds to the value 9.4 in the table above for the Rank1-5 hierarchy.

    blank.gifGoodput DPS for a 10-node hierarchy under any Rank 1 node. This corresponds to the value 6.8 in the table above for the Rank1-10 hierarchy.

    Data point taken based on FLISR Application packet profiling for 10 devices:

    blank.gifTotal number of datagrams required for one FLISR application use case to complete (with 10 devices) = 372 datagrams.

    The section below is discussed as multiple cases. The mesh could contain multiple Rank 1 devices. The term “10 nodes per rank1 hierarchy” refers to a linear mesh of 9 CR mesh devices under one CR mesh Rank 1 device (IR510/IR530). Similarly, the term “3 nodes per rank1 hierarchy” refers to a linear mesh of 2 CR mesh devices under one Rank 1 device.

    The application use case considered for the case study below is FLISR, which took 372 datagrams to complete for a total of 10 participating devices. This is used as the reference for completion of the FLISR use case per 10 nodes in the cases below.

    Table 5 Number of DA devices in mesh – summary table

    Number of DA devices in mesh | If maximum accepted duration for FLISR is 2 minutes | If maximum accepted duration for FLISR is 4-5 minutes
    Case1: Considering all nodes in Rank1 | 30-40 | 45-60
    Case2: Considering up to 10 nodes per rank1 hierarchy | 40 | 50-60
    Case3: Considering up to 5 nodes per rank1 hierarchy | 30 | 50-60
    Case4: Considering up to 3 nodes per rank1 hierarchy | 20-30 | 40-60

     

    Table 5 captures the summary of recommendations for the different scenarios. For more details, refer to the sections below.

    Figure 38 Number of DA devices in CR mesh – Two Phases

     

    386571.jpg

    In Figure 38, the strategy to derive the number of DA devices in the CR mesh can be described in the following phases (the sketch after this list works through the same arithmetic):

    blank.gifNumber of devices derived based on the CGR WPAN goodput rate. This is scaling to fit within the WPAN capacity at any given second. Example: 7 devices, each with a goodput rate of 4 datagrams per second (DPS), are catered for by a WPAN with a goodput rate of 28 DPS. Therefore, 7 devices could be allowed to transmit at any given second.

    blank.gifConsidering some time buffer, arrive at a benchmark unit for further time scaling.

    blank.gifTime scaling: number of devices derived based on the application's accepted maximum duration limit. This is scaling to fit the number of DA devices over a period of time. For example, if the total number of datagrams for the use case to complete is 72, then 7 devices (each with a goodput rate of 4 DPS) would take (72 / (7*4) =) ~3 seconds to complete the use case.

    blank.gifAssuming a 3X time buffer, consider 12 seconds instead of 3 seconds for one use case.

    blank.gifTime taken for 1 set of 7 devices = 12 seconds.

    blank.gifThis way, if the application's accepted maximum duration limit is 120 seconds, then the mesh could cater to a maximum of 10 sets (120 s / 12 s per set of 7 devices). Theoretically, these 10 sets with 7 devices each could go up to a maximum of 70 devices.

    blank.gifThe resulting capacity scaling count = 7 devices; time scaling count = 70 devices.

    blank.gifThis means that 70 is the number of DA devices that could be deployed over the CR mesh for the given application traffic characteristics.
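
    The following is a minimal sketch of this two-phase derivation, using only the sample figures from Figure 38 (28 DPS WPAN goodput, 4 DPS per device, a 72-datagram use case, a 3X time buffer, and a 120-second accepted duration); it is not a measured CR mesh result.

    import math

    def da_device_count(wpan_goodput_dps: float, per_device_dps: float,
                        usecase_datagrams: int, max_duration_s: float,
                        time_buffer_factor: float = 3.0) -> int:
        # Phase 1 - capacity scaling: devices whose combined goodput fits the
        # WPAN goodput at any given second.
        devices_per_set = math.floor(wpan_goodput_dps / per_device_dps)

        # Phase 2 - time scaling: seconds for one set to finish the use case,
        # padded with the time buffer to form the benchmark unit, then the
        # number of such sets that fit within the accepted maximum duration.
        seconds_per_set = math.ceil(usecase_datagrams /
                                    (devices_per_set * per_device_dps))
        benchmark_unit_s = seconds_per_set * (1 + time_buffer_factor)
        sets = math.floor(max_duration_s / benchmark_unit_s)

        return devices_per_set * sets

    # Figure 38 sample: 7 devices per set, 12-second benchmark unit, 10 sets.
    print(da_device_count(28, 4, 72, 120))   # -> 70 devices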

    Case1: Considering all nodes in Rank1.

    This case assumes all the devices are positioned at the Rank 1 level itself.

    Data point assumption:

    blank.gifGoodput rate at Rank 1 = 28 DPS

    With a goodput of 28 DPS, the 372 datagrams for the FLISR application could take (372/28 =) ~14 seconds. In fact, it could take less than 14 seconds, considering that application traffic is planned for only 40% of the mesh capacity and any portion of the remaining 60% could also be available, offering the possibility of performance better than 14 seconds. However, the calculation below assumes 14 seconds, as per the 40% design recommendation.

    The 372-datagram FLISR packet profile was based on 10 devices. So, 10 Rank 1 devices could take approximately 14-15 seconds to complete the FLISR use case.

    Table 6 Number of DA devices – All Rank1 nodes - scaling type and considerations

    Scaling Phase | Time Duration | Number of Rank1 nodes that could be served | Scaling Description
    Base Capacity | 15 seconds | 10 | 10 Rank1 nodes would consume 15 seconds of CGR WPAN (40%) capacity.
    Base Capacity | 1 minute | 30-40 | 30-40 Rank1 nodes would consume 45-60 seconds of CGR WPAN (40%) capacity.
    Benchmark Unit for Time Scaling (with time buffer) | 2-3 minutes | 30-40 | Unit of measurement for further scaling of nodes based on the maximum acceptable duration for the FLISR application. Assumes 1-2 minutes of time buffer.
    Time Scaling based on Benchmark Unit | 4-5 minutes | 45-60 | Scaling based on the maximum acceptable duration for the FLISR application. Time scaling assumes 1.5 times the benchmark unit range.

    The above table is a theoretical extrapolation. In practice, it is advisable to reduce the number of devices that compete for the same transmission medium.

    The number of devices that can be positioned in the mesh depends on the accepted maximum duration for the FLISR use case. If the accepted maximum duration is 2-3 minutes, consider positioning 30-40 Rank 1 nodes. If the accepted maximum duration is 4-5 minutes, consider positioning 45-60 Rank 1 nodes.

    Case2: Considering up to 10 nodes per rank1 hierarchy

    This case assumes the Application latency requirement permits 10 hops of IR510/IR530 to be positioned under any Rank1 device.

    Data point assumptions:

    blank.gifGoodput DPS for Rank1-10 nodes = 6.8.

    blank.gifUp to 10 nodes are considered per rank1 hierarchy.

    blank.gifGoodput rate at rank1 = 28.

    With a goodput of 6.8 (~7) DPS, the 372 datagrams for the FLISR application could take (372/6.8 =) ~55 seconds.

    At the rate of 7 DPS per Rank 1 branch, and with a goodput rate of 28 DPS for all of Rank 1, (28/7 =) 4 Rank 1 devices and their 10-node hierarchies should be able to transmit comfortably, without much retransmission. This translates to 4 Rank 1 branches * 10 devices per branch = 40 devices.

    This works when the accepted duration for the FLISR use case to complete is 2 minutes or more (considering 2 x 55 seconds).

    Figure 39 Considering up to 10 nodes per rank1 hierarchy

     

    386572.jpg

    If the accepted maximum duration for FLISR to complete is 4-5 minutes, some congestion and retransmission at Rank 1 can be afforded. Hence, more branches could be added to increase the number of DA devices positioned in the CR mesh. For example, if 2-3 minutes can serve 4 Rank 1 branches' worth of capacity, then, allowing some buffer for retransmission, 4-5 minutes could serve 5-6 Rank 1 branches' worth of capacity, each with a hierarchy of nodes underneath it.

    Table 7 Number of DA devices – up to 10 nodes per rank1 hierarchy - scaling type and considerations

    Scaling Phase | Time Duration | Number of nodes that could be served | Scaling Description
    Base Capacity | 55 seconds | 4 Rank1 x 10 nodes/Rank1 = 40 nodes | 40 nodes would consume 55 seconds of CGR WPAN (40%) capacity.
    Benchmark Unit for Time Scaling (with time buffer) | 2-3 minutes | 4 Rank1 x 10 nodes/Rank1 = 40 nodes | Unit of measurement for further scaling of nodes based on the maximum acceptable duration for the FLISR application. Considers 2 minutes of time buffer.
    Time Scaling based on Benchmark Unit | 4-5 minutes | 5-6 Rank1 x 10 nodes/Rank1 = 50-60 nodes | Scaling based on the maximum acceptable duration for the FLISR application. Time scaling assumes 1.25-1.5 times the benchmark unit range.

    Case3: Considering up to 5 nodes per rank1 hierarchy

    This case assumes the Application latency requirement permits 5 hops of IR510/IR530 to be positioned in any Rank1 hierarchy.

    Data point assumptions:

    blank.gifGoodput DPS for Rank1-5 nodes = 9.4

    blank.gifUp to 5 nodes are considered per rank1.

    blank.gifGoodput rate at rank1 = 28.

    blank.gifConsidered FLISR packet profiling has 372 datagrams for 10 devices.

    Figure 40 Considering up to 5 nodes per rank1 hierarchy

     

    386573.jpg

    All the calculations below assume only 40% of the full capacity, factoring in multiple nodes in the same rank. For example, 28 DPS at Rank 1 is the cumulative goodput of all Rank 1 nodes, at 40% capacity.

    With a goodput of 9.4 DPS per branch of 5 nodes, the combined goodput for 10 nodes = 18.8 DPS.

    Time taken to complete the FLISR application per 10 nodes = (372/18.8) = ~20 seconds.

    The CGR WPAN (at 40% capacity) can serve FLISR application traffic worth 15 nodes at any point in time (for example, 9.4 DPS * 3 = 28.2 DPS, close to the 28 DPS offered at Rank 1).

    In 20 seconds, 15 nodes' worth of FLISR application traffic can be served.

    This scaling limit comes from the CGR WPAN per-second capability (again at 40% planned capacity). In reality, any unused or available mesh capacity from the remaining 60% would contribute to even better performance.

    Another level of device scaling can be achieved based on the maximum accepted duration for completion of the FLISR use case (for example, 1 minute, 2 minutes, 5 minutes, or N minutes). To achieve that, consider some time buffer and arrive at a benchmark unit of measurement. For example, at the rate of 15 nodes per 20 seconds, theoretically 45 nodes could be served in 1 minute. However, allowing some time buffer, only 30 nodes are considered for a 2-minute interval, and only 15-30 nodes for a 1-minute interval.

    Table 8 Number of DA devices – up to 5 nodes per rank1 hierarchy - scaling type and considerations

    Scaling Phase | Time Duration | Number of nodes that could be served | How many Rank1 nodes (along with its branch)? | Scaling Description
    Base Capacity | 20 seconds | 15 | 3 Rank1 * 5 nodes/branch | Scaling to CGR WPAN 40% capacity.
    Base Capacity | 1 minute | 15-30 | 3-6 Rank1 * 5 nodes/branch | Scaling to CGR WPAN 40% capacity for 60 seconds.
    Benchmark Unit for Time Scaling (with time buffer) | 2 minutes | 30 | 6 Rank1 * 5 nodes/branch | Unit of measurement for further scaling of nodes based on the maximum acceptable duration for the FLISR application. Time buffer considered = 1 minute.
    Time Scaling based on Benchmark Unit | 4-5 minutes | 50-60 | 12 Rank1 * 5 nodes/branch | Scaling based on the maximum acceptable duration for the FLISR application. Time scaling assumes 2 times the benchmark unit range.

    As per the table above, the number of devices that can be positioned in the mesh depends on the accepted maximum duration for the FLISR use case. If the accepted maximum duration is 2 minutes, consider positioning 30 devices. If the accepted maximum duration is 4-5 minutes, consider positioning 50-60 devices (with 12 Rank 1 nodes and 4 devices under each Rank 1 node). The above design leaves the mesh sufficient time buffer to deliver better FLISR performance.

    Note: Additional mesh devices can be positioned on the mesh, considering the buffer that was accounted for at the beginning. However, it is recommended to monitor the network for its expected performance before adding more devices, in incremental stages.

    Case4: Considering up to 3 nodes per rank1 hierarchy

    This case assumes the Application latency requirement permits 3 hops of IR510/IR530 to be positioned in any Rank1 hierarchy.

    Data point assumptions:

    blank.gifGoodput DPS for Rank1-3 nodes = 14

    blank.gifUp to 3 nodes are considered per rank1.

    blank.gifGoodput rate at rank1 = 28.

    blank.gifConsidered FLISR packet profiling has 372 datagrams for 10 devices.

    Figure 41 Considering up to 3 nodes per rank1 hierarchy

     

    386574.jpg

    All the calculations below assume only 40% of the full capacity, factoring in multiple nodes in the same rank. For example, 28 DPS at Rank 1 is the cumulative goodput of all Rank 1 nodes, at 40% capacity.

    At the rate of 14 DPS per Rank 1 branch, and with a goodput rate of 28 DPS for all of Rank 1, (28/14 =) 2 Rank 1 devices and their 3-node hierarchies should be able to transmit comfortably at the full planned application capacity. This translates to positioning 2 Rank 1 branches * 3 devices per branch = 6 devices without much congestion or retransmission.

    The reference 372 datagrams were profiled for 10 devices. This translates to 224 datagrams for 6 devices.

    With a goodput of 28 DPS for 6 devices, the 224 datagrams for the FLISR application could take (224/28 =) 8 seconds for 6 devices. In 16-20 seconds, goodput from 12 devices could be served. In 24-30 seconds, goodput from 18-20 devices could be served.

    Another level of device scaling can be achieved based on the maximum accepted duration for completion of the FLISR use case (for example, 1 minute, 2 minutes, 5 minutes, or N minutes). To achieve that, consider some time buffer and arrive at a benchmark unit of measurement. For example, at the rate of 18 nodes per 30 seconds, theoretically 72 nodes could be served in 2 minutes. However, allowing some time buffer, only 30 nodes are considered for a 2-minute interval, and only 20-30 nodes for a 1-minute interval.
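
    The Case 4 arithmetic above scales the reference datagram count to the smaller device set before dividing by the available goodput. The following is a minimal sketch of that scaling, using only the sample figures from this case (372 datagrams per 10 devices, 28 DPS at Rank 1, and 6 devices transmitting comfortably).

    import math

    # Case 4 sample arithmetic (illustrative values from this section).
    reference_datagrams, reference_devices = 372, 10
    rank1_goodput_dps = 28            # 40% of Rank 1 capacity
    devices = 6                       # 2 Rank 1 branches * 3 nodes per branch

    # Scale the reference FLISR profile from 10 devices down to 6 devices.
    datagrams = math.ceil(reference_datagrams * devices / reference_devices)  # ~224

    seconds = datagrams / rank1_goodput_dps       # ~8 seconds for 6 devices
    print(f"{datagrams} datagrams, ~{seconds:.0f} s for {devices} devices")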

    Table 9 Number of DA devices – up to 3 nodes per rank1 hierarchy - scaling type and considerations

    Scaling Phase | Time Duration | Number of nodes that could be served | How many Rank1 nodes (along with its branch)? | Scaling Description
    Base Capacity | 30 seconds | 18-20 | 6 Rank1 * 3 nodes/branch | Scaling to CGR WPAN 40% capacity.
    Base Capacity | 1 minute | 20-30 | 6-10 Rank1 * 3 nodes/branch | Scaling to CGR WPAN 40% capacity for 60 seconds. Considers an additional 50% of nodes for another 30 seconds.
    Benchmark Unit for Time Scaling (with time buffer) | 2 minutes | 20-30 | 10 Rank1 * 3 nodes/branch | Unit of measurement for further scaling of nodes based on the maximum acceptable duration for the FLISR application. Considers 1 minute of time buffer.
    Time Scaling based on Benchmark Unit | 4-5 minutes | 40-60 | Up to 20 Rank1 * 3 nodes/branch | Scaling based on the maximum acceptable duration for the FLISR application. Time scaling assumes 2 times the benchmark unit range.

    As per the table above, the number of devices that can be positioned in the mesh depends on the accepted maximum duration for the FLISR use case. If the accepted maximum duration is 2 minutes, consider positioning 20-30 devices. If the accepted maximum duration is 5 minutes, consider positioning 40-60 devices. The above design attempts to reduce congestion as much as possible and is expected to deliver better FLISR performance.

    Scaling Up – Adding more nodes

    The CR mesh should be able to handle much more scale than what was mentioned in the tables above. Still, all the planning done so far has used only 40% of the mesh capacity for application goodput communication, and a time buffer has also been considered while deriving the number of DA devices in the mesh. It is recommended to start with 50% of the recommended scale, evaluate the mesh performance, profile the mesh utilization for the applicable use cases, and then take data-driven incremental steps to add more nodes.

    Note: The more the mesh is designed to reduce congestion and retransmission, the better the performance will be.

    Given the lossy nature of the radio medium, it is better to design for moderate scale and achieve good performance than to design for high scale and suffer low performance.

    Solution Architecture and Components Selection

    This chapter, which covers the DA places in the network, the solution architecture for various DA application flows, and details about the solution components that make up the DA architecture, includes the following major topics:

    blank.gifPlaces in the Network

    blank.gifApplication Flows

    blank.gifFAN Layer Infrastructure Components

    blank.gifWAN Layer Infrastructure Components

    blank.gifHeadend Layer Infrastructure Components

    Places in the Network

    This section describes the Smart Grid Multi-Services Field Area Network solution's building blocks for Distribution Automation applications. The Cisco FAN solution offers end-to-end physical and network security to detect unauthorized access to the Distribution Network's assets and to prevent access to the DA communication infrastructure, as well as network services such as QoS and multi-services (DA, AMI, DER, and multi-tenant). The solution offers a centralized network management system that simplifies network and device provisioning through an automation process called ZTD, together with an advanced graphical interface for large-scale network visualization and management. In addition, customers can deploy and manage their own or third-party applications at the edge in the NAN block, enabling new services and functionality for the existing Distribution grid assets.

    Figure 42 Cisco FAN Solution Building Blocks

     

    256054.jpg

    Neighborhood Area Network

    The NAN is the last mile of the network communication infrastructure, connecting the Utility Distribution Infrastructure assets for DA, AMI, and Remote Workforce Management to the rest of the company's communication infrastructure: the Substation WAN and Data Center or Control Centers. The grid equipment, which includes feeder capacitor banks, voltage regulators, reclosers, end-of-line meters, and transformer meters, is connected to Cisco FAN solution DA gateway devices such as the IR510, IR800, and IR1100 using serial (RS232 or RS485) interfaces, or Ethernet for newer grid equipment. Cisco offers two solutions based on standard technologies:

    blank.gifCisco Resilient Mesh, which leverages the unlicensed ISM spectrum in the 900 MHz band, IEEE 802.15.4g/e, and 6LoWPAN for customers interested in deploying a private solution that offers low OpEx.

    blank.gifPublic Cellular DA, which uses standard public cellular 3G/4G technology and requires a monthly cellular service in order to function.

    Data from the electric grid devices is transported using one of the Cisco FAN solutions to different aggregation points, which represent exit points out of the NAN block. The aggregation points are referred to as Field Area Routers (FARs), which use the Cisco modular CGR 1000 Series router to forward the traffic upstream towards the utility energy management systems. The FAR devices provide an interface to the WAN block that acts as the backhaul for the NAN.

    This infrastructure, which enables customers to monitor and control the Distribution Network and perform measurement of electricity consumed and produced by prosumers, provides the foundation for advanced applications like Distribution Automation, Distributed Energy Resources (DER), and Demand Response (DR), which is a program for optimizing energy usage during peak periods.

    NANs also serve as a foundation for future virtual power plants, which are comprised of distributed power generation stations, residential energy storage (for example, in combination with electric vehicle charging), and small-scale trading communities.

    In the AMI scenario, the connected grid endpoints in the NAN are the smart meters that are part of the mesh radio network. These smart meters are IP-enabled grid devices with an embedded IPv6-based communication stack that are powered by the Cisco IPv6 CG-Mesh SDK.

    Refer to the Cisco Developer Network (CDN) to learn more about IP-enablement for partner technologies.

    Wide Area Network

    The WAN tier is responsible for providing the communications overlay between the NAN block (through its FARs) and the Data Center or Cloud Services block. The Cisco FAN solution is agnostic to the customer's WAN infrastructure. This may be:

    blank.gifPrivate high speed in the case of the Substation On-Net, which leverages customers' dark fiber infrastructure

    blank.gifOff-Net, which is based on public services offered by Service Providers like Layer 2 Carrier Ethernet or Layer 3 MPLS service combined with a VPN design to ensure privacy and security

    Popular WAN backhaul options are Fiber, Ethernet, and Cellular 3G/4G. Other backhaul types for FARs may be a public or private IEEE 802.11 Wi-Fi smart cities infrastructure, satellite, or Private Cellular (if utilities have purchased cellular spectrum).

    The architecture assumes established network connectivity from the NAN to the headend systems. The security of data forwarded over the WAN is critically important; incorporating IPSec tunnels in the design helps meet this requirement.

    Data Centers

    The Data Center block represents all customer Control Center or Energy Operational Center locations that are used to manage the Distribution Network. At a minimum, a customer will have a Primary Control Center and a Back-up Control Center for disaster recovery scenarios. For large utilities, the Data Center block could have a hierarchical layout with an additional layer besides the Primary/Back-up Control Centers. This secondary layer contains the regional Control Centers that only manage a subset of the entire Distribution Network. The EOC hosts all applications necessary to operate, manage, and secure the FAN sites and equipment. It is typically located in the Utility facility and may be co-located with the Utility IT Data Center in some cases. In a multi-service FAN deployment, the EOC will host the applications and associated servers performing the tasks required by DA, AMI, DER, Remote Asset Management, and Remote Workforce Management.

    Cloud Network

    With the recent trends in the IT industry, customers can leverage new services that are based on Cloud Services offered by different companies. Small utilities that don't desire to maintain a Data Center infrastructure and prefer to subscribe to Cloud Services could implement Cisco Cloud Services like Jasper or Kinetic for the FAN solution.

    Application Flows

    The communication infrastructure is designed to support the different Energy Systems application flows. In general, the customer energy systems are located in the Data Center or Control Center locations and use a Publish/Subscribe application architecture to monitor and control the Distribution Grid assets in the NAN. The data exchange flows between a Primary station located in the DC and Subordinate stations located in the NAN over the WAN infrastructure. FlexVPN technology makes the solution agnostic to the WAN transport infrastructure and ensures that data is secure as it travels across the WAN.

    Figure 43 depicts the following main application flow types:

    1.blank.gif Publish-Subscribe between the DMS and grid assets

    2.blank.gif Publish-Subscribe between the DMS and Edge Compute applications, and between Edge Compute applications and grid assets

    3.blank.gif Publish-Subscribe between the DMS and the Substation RTU, and between the Substation RTU and grid assets

    4.blank.gif Publish-Subscribe between grid assets using peer-to-peer communication

    Figure 43 Application Traffic Flow Types

     

    256298.jpg

    FAN Layer Infrastructure Components

    FAN IEEE 802.15.4g/e Devices

    The FAN DA solution uses the following products to build a radio mesh network based on the IEEE 802.15.4g/e standard. The radio network requires a radio mesh Personal Area Network (PAN) coordinator; this function is supported by the solution FARs. Cisco offers the CGR 1240 and CGR 1120, which customers can use as FAR devices.

    256340.jpg  

    256339.jpg  

    Cisco CGR 1240
    Cisco CGR 1120

    One big difference between the two products is that the CGR1240 is an IP67-rated device that can be deployed outdoors, whereas the CGR 1120 is rated as an IP30 device and is targeted to indoor substation installations. The other main difference is the number of modular slots. The CGR1240 has four module slots that give customers the ability to deploy multiple services or backhaul interfaces; the CGR1120 only has two slots.

    Lastly, the CGR 1240 is powered by AC, while the CGR 1120 supports both AC and DC.

    Table 10 Cisco DA CGR Routers Product Overview

    Feature | CGR 1240 | CGR 1120
    Input Power | AC | AC/DC
    AC Phase | 1 | 3
    Environment Rating | IP67 (outdoor) | IP30 (indoor)
    Battery Back-up | Yes (optional) | No
    IRIG-B | Yes | No
    Module Slots | 4 | 2
    Fast Ethernet | 4 | 6

    Table 11 Cisco DA CGR Router Part Numbers

     

     | CGR 1240 | CGR 1120
    Product Part Number | CGR1240/K9 | CGR1120/K9
    Description | CGR1240 w/ 4 module slots, 2 GE, 2 serial, 4 FE LAN, Wi-Fi, GPS | CGR 1120 w/ 2 module slots, 2 GE, 2 serial, 6 FE LAN, Wi-Fi, GPS

    Note: This solution design recommends software release 15.7.03 and mesh release 6.0.19.

    Table 12 Cisco DA CGR Router Software Overview

    Product Part Number

    15703M

    Software Image Name

    cgr1000-universalk9-bundle.SPA.157-3.M3.bin

     

    Note: For additional information on CGR1240, refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/support/routers/1240-connected-grid-router/model.html

    Note: For additional information on CGR1120, refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/support/routers/1120-connected-grid-router/model.html

    The new Connected Grid Module (CGM) Orthogonal frequency-division multiplexing (OFDM) module supports OFDM modulation and data rates based on the IEEE 802.15.4g Option 2, which is standard for DA FLISR and Volt/VAR applications.

    Figure 44 CGM WPAN OFDM Module

     

    256366.jpg

    Table 13 Cisco DA Resilient Mesh OFDM 900 MHz Module

    Feature
    CGM WPAN OFDM Module

    OFDM Phy Data Rates (Kbps)

    1200,800,400,200

    4FSK Phy Data Rates

    -

    2FSK Phy Data Rates

    150,50

    Table 14 Cisco DA OFDM 900 MHz Module Part Number

    Hardware Part Number

    CGM WPAN-OFDM-FCC

    Description

    OFDM 915Mhz WPAN module for CGR router

    Table 15 Cisco DA OFDM 900 MHz Module Software

    Software Part Number

    CGR1K-WPAN-FCC-6.0

    Software Image Name

    cg-mesh-bridge-6.0-6019-ir510-7bee575.bin

    Image Size

    0.18MB

    Besides the CGM WPAN module, customers can add the CGM Compute Server (CGM-SRV) module to enable Edge Compute on the aggregation points of the mesh network.

    Figure 45 CGM Edge Compute Server Module

     

    256337.jpg

    Table 16 Cisco DA CGR Compute Modules Part Numbers

    Part Number | Description
    CGM-SRV-64 | 64GB Server Module for the CGR1000
    CGM-SRV-128 | 128GB Server Module for the CGR1000

    Table 17 Cisco DA CGR Compute Module Software Overview

    Hardware Part Number

    SW-CGM-SRV-10

    Software Image Name

    cgr1000-compute-1.7.0.1.SPA

    Image Size

    72.93MB

    Virtual Machine OS

    Ubuntu 14.04 and 15.10 Windows 7 and 10

    Note: For additional information on the CGR CGM-SRV compute modules, refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/support/docs/cloud-systems-management/iox/212038-Configure-CGM-SRV-IOx-Module-on-CGR1xxx.html

    FAR routers can also be configured with 4G modems for backup network services. If the primary backhaul link fails, the traffic from the field devices is transported back to the Control Center over a Public Cellular network, using an overlay VPN (FlexVPN) to secure the data across the service provider network.

    Figure 46 North America CGR 4G Cellular Modules

     

    256338.jpg

    Table 18 Cisco CGR DA Cellular Modules compatible with Resilient Mesh Module Overview

    Feature
    CGM-4G-LTE-EA-900
    CGM-4G-LTE-MNA
    Comments

    Co-existence with 900Mhz

    Yes

    No

    CGM-4G-LTE-EA-900 includes RF filter

    Cellular Band Support

    LTE bands 1, 2, 3, 4, 7, 12,13, 17, 20, 25, 26, 29, 41

    LTE band 2, 4, 13, 17, 25

    --

    Market

    US (AT&T and Verizon), Canada

    US (AT&T and Verizon), Canada

    --

    FAN DA Deployments

    Back-up Solution for DA Unlicensed Frequency Standard Based Solution

    FAN DA Public Cellular Service Solution

    --

    Table 19 Cisco DA CGR Cellular Modules compatible with Resilient Mesh Module Part Numbers

    Part Number
    Description

    CGM-4G-LTE-EA-900

    4G/LTE CGM for North America

    CGM-4G-LTE-MNA

    4G/LTE CGM for US - ATT, Verizon, Sprint, Canada

    Table 20 Cisco DA CGR Cellular Modules compatible with Resilient Mesh Module Software Overview

    Part Number
    Carrier Firmware
    Description

    CGM-4G-LTE-EA-900

    FW-7455-LTE-AT

    FW Switching Load for 7455 AT&T and T-Mobile

    FW-7455-LTE-VZ

    FW Switching Load for 7455 Verizon

    CGM-4G-LTE-MNA

    FW-MC7354M-LTE-AT

    FW Switching for MC7354 Multi Carriers North America ATT

    FW-MC7354-LTE-CA

    FW Switching Load for MC7354 Canada

    FW-MC7354M-LTE-VZ

    FW Switching for MC7354 Multi Carriers North America Verizon

    Note: For additional information on CGR CGM Cellular Modules, refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/products/collateral/routers/1000-series-connected-grid-routers/datasheet-c78-730624.html

    The IR510 is the next-generation DA Gateway that connects utility grid assets to the Cisco Resilient Mesh network. It supports higher data rates based on OFDM modulation and is best suited for DA applications. It has dedicated hardware resources for Edge Compute applications, so that customers can deploy their own applications at the edge of the network.

    Figure 47 IR510 DA Gateway

     

    256346.jpg

    Table 21 Cisco DA Gateway Product Overview

    Feature
    IR510
    Comments

    Environment Rating

    IP30 (indoor)

    Requires enclosure for outdoor deployments

    OFDM PHY Data Rates (kbps)

    1200,800,400,200

    --

    4FSK Phy Data Rates

    -

    --

    2FSK PHY Data Rates (kbps)

    150,50

    --

    GPS

    Yes

    --

    Edge Compute

    Yes

    --

    Power Consumption

    <14W

    --

    Table 22 Cisco DA Gateway Product Part Number

    Product Part Number

    IR510-OFDM-FCC/K9

    Description

    IR510 915Mhz WPAN router w/ 2 serial,1 FE LAN

    Table 23 Cisco DA Gateway Software Overview

    Software Part Number

    SW-IR510-WPAN-6.0

    Soft. Image Name

    cg-mesh-dagw-6.0-6019-ir510-7bee575.bin

    Image Size

    0.64MB

    Note: For additional information on IR510, refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/products/collateral/routers/500-series-wpan-industrial-routers/datasheet-c78-730550.html

    When grid assets are not close enough to each other (for example, in rural areas with long feeders), or when no clear line of sight exists between nodes, customers can deploy the Cisco Range Extender to provide additional signal coverage. The latest range extender product release is the IR530, which supports the higher OFDM data rates.

    Figure 48 IR530 Range Extender

     

    256348.jpg

    Table 24 Cisco DA Range Extender Product Overview

    Feature
    IR530
    OFDM Data Rates (kbps)

    1200,800,400,200

    4FSK Phy Data Rates

    -

    2FSK PHY Data Rates (kbps)

    150,50

    Antenna Diversity

    No

    Battery Back-up

    Yes

    GPS

    No

    Table 25 Cisco DA Range Extender Part Number

    Part Number
    Description

    IR530SB-OFD-FCC/K9

    IR530 with single antenna and battery, 915MHz-WPAN. For North and South America except Brazil.

    Table 26 Cisco DA Range Extender Software Overview

    Description
    IR530
    Software Product Part Number

    SW-IR530-WPAN-6.0

    Software Image Name

    cg-mesh-node-6.0-6019-ir530-7bee575.bin

    Image Size

    0.50MB

    Table 27 Cisco DA Wi-Fi Product Part Numbers for North America

    Part Number
    Description
    Comments

    IW3702-2E-B-K9

    Industrial Wireless AP 3702, 4 RF ports on top/btm, B dom

    US

    IW3702-4E-B-K9

    Industrial Wireless AP 3702, 4 RF ports on top, B reg domain

    US

    AIR-AP1572EAC-B-K9

    802.11ac Outdoor AP, External-Ant, AC-power, Reg. Domain-B

    US

    IW3702-2E-A-K9

    Industrial Wireless AP 3702, 4 RF ports on top/btm, A dom

    Canada/Mexico

    IW3702-4E-A-K9

    Industrial Wireless AP 3702, 4 RF ports on top, A reg domain

    Canada/Mexico

    AIR-AP1572EAC-A-K9

    802.11ac Outdoor AP, External-Ant, AC-power, Reg. Domain-A

    Canada/Mexico

    Note: For additional information on IR530, refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/products/collateral/routers/500-series-wpan-industrial-routers/datasheet-c78-740201.html

    FAN IEEE 802.11 (Wi-Fi) Devices

    Cisco IoT Wi-Fi products can be used for DA use cases that require very low latency and high bandwidth, for example, FLISR with IEC 61850 GOOSE messaging. IoT Wi-Fi products can also be used to extend backhaul connectivity for Resilient Mesh FAR devices, as highlighted in Figure 54.

    256356.jpg  

    256325.jpg  

    Cisco IW3702
    Cisco AP 1572EAC

    Table 28 Cisco DA Wi-Fi Products Overview

    Feature
    IW3702
    AP1572
    Power Options

    DC, PoE

    AC, DC, PoE

    RF Output Power

    Up to 23 dBm

    Up to 30 dBm

    Environmental Rating

    IP67

    IP67

    Flat mount

    Yes

    No

    Fiber SFP

    No

    Yes

    Operating Temp.

    -58°F to +167°F

    -40°F to 149°F

    Notes:

    For additional information on IW3702, refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/products/collateral/wireless/industrial-wireless-3700-series/datasheet-c78-734968.html

    For additional information on AP1572, refer to the following URL:

    blank.gif https://www.cisco.com/c/dam/en/us/products/collateral/wireless/aironet-1570-series/datasheet-c78-732348.pdf

    FAN Cellular Devices

    For utility companies that are also interested in deploying Cellular Services for Distribution Automation applications, Cisco offers a variety of products that address specific customer requirements. For example, some customers might have power budget limitations at overhead line pole locations and require DA Gateways with low power consumption.

    256352.jpg  

    256349.jpg  

    Cisco IR1101
    Cisco IR807

    256350.jpg  

    256351.jpg  

    Cisco IR809
    Cisco IR829

    Table 29 Cisco DA Cellular Products Overview

    Feature
    IR1101
    IR807
    IR809
    IR829
    Comments
    Market

    US and Europe

    US and Europe

    Globally

    Globally

    --

    Environment Rating

    IP30

    IP30

    IP30

    IP40

    --

    Modular

    2, with extension module

    Fixed

    Fixed

    Fixed

    --

    Additional Slot

    1

    No

    No

    No

    --

    Typical Power Consumption

    <10W

    ~6.7W

    ~15W

    ~40W ~70W with PoE

    --

    LTE bands

    2600 MHz (band 7)
    2100 MHz (band 1)
    1900 MHz (band 2 PCS)
    1800 MHz (band 3)
    1700/2100 MHz (band 4 AWS)
    900 MHz (band 8)
    850 MHz (band 5)
    800 MHz (band 20)
    700 MHz (band 5,12,13, 28) Dependent on Cellular modem card

    2600 MHz (band 7)
    2100 MHz (band 1)
    1900 MHz (band 2 PCS)
    1900 MHz (band 25 extended PCS)
    1800 MHz (band 3)
    1700/2100 MHz (band 4 AWS)
    900 MHz (band 8)
    850 MHz (band 5)
    850 MHz (band 26 extended CLR)
    800 MHz (band 20)
    700 MHz (band 12, 13, 17)

    2600 MHz (band 7)
    2100 MHz (band 1)
    1900 MHz (band 2 PCS)
    1900 MHz (band 25 extended PCS)
    1800 MHz (band 3)
    1700/2100 MHz (band 4 AWS)
    1700 MHz (band 9, 39)
    900 MHz (band 8)
    850 MHz (band 5,18,19)
    800 MHz (band 20)
    700 MHz (band 13, 17)

    2600 MHz (band 7,38)
    2500 MHz (band 41)
    2300 (band 40)
    2100 MHz (band 1)
    1900 MHz (band 2 PCS,25,39)
    1800 MHz (band 3)
    1700 MHz (band 4)
    1500 (band 21)
    900 MHz (band 8)
    850 MHz (band 5, 18, 19, 26)
    800 MHz (band 20)
    700 MHz (band 12, 13, 17, 28, 29)

    --

    Dual-LTE

    Yes, Active/Active (future)

    No

    No

    Yes, Active/Active

    Only specific IR829 models

    Dual-SIM

    Yes

    Yes, Active/Back-up

    Yes, Active/Back-up

    Yes

    --

    Gyroscope & Accelerometer

    No

    No

    Yes

    Yes

    --

    Gigabit Ethernet

    1 port onboard 1 port module

    No

    2

    4

    IR1101 requires additional module

    Fiber Interface

    SFP

    No

    No

    SFP

    --

    Fast Ethernet

    4

    2

    Shared with Gigabit ports

    Shared with Gigabit ports

    --

    LoRaWAN Support

    No

    Yes

    Yes

    Yes

    --

    PoE/PoE+

    No

    No

    No

    Yes/Yes

    --

    WIFI

    No

    No

    No

    Yes

    --

    Serial Interface

    1 RS232

    2

    2

    2

    --

    Edge Compute

    Yes

    No

    Yes

    Yes

    --

    Table 30 Cisco DA Cellular Products Hardware Part Numbers

    Part Number
    Description
    Comments

    IR807G-LTE-NA-K9

    807 Low-power Industrial ISR, 4G/LTE multimode for N.America

    AT&T

    IR807G-LTE-VZ-K9

    807 Low-power Industrial ISR, 4G/LTE multimode for Verizon

    Verizon

    IR809G-LTE-NA-K9

    809 Industrial ISR, 4G/LTE multimode ATT/Canada

    AT&T

    IR809G-LTE-VZ-K9

    809 Industrial ISR, 4G/LTE multimode Verizon

    Verizon

    IR829M-2LTE-EA-BK9

    829 Industrial ISR, Dual LTE US, WiFi, POE, SSD connector, FCC

     

    Notes:

    For additional information on IR1101, refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/support/routers/1101-industrial-integrated-services-router/model.html

    For additional information on IR807, refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/support/routers/807-industrial-integrated-services-routers/model.html

    For additional information on IR809, refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/support/routers/809-industrial-router/model.html

    For additional information on IR829, refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/support/routers/829-industrial-router/model.html

    For comparison information on Edge Compute products, refer to the following URL:

    blank.gif https://developer.cisco.com/docs/iox/#!platform-support-matrix/platform-support-matrix

    WAN Layer Infrastructure Components

    The FAN DA solution uses Cisco FlexVPN as an overlay technology to connect the FAN layer to the Control Center layer leveraging an existing substation WAN infrastructure or Service Provider Cellular services. Depending on the DA design type, the VPN tunnel is established directly between the FAR router or the Cellular DA Gateway router and the Control Center Headend Router (HER).

    For cases where the DA FAN traffic needs to be routed and processed within the closest substation, the design requires an additional routing device (ISR4400 or ASR1000) to perform the MAP-T address translation and act as the FlexVPN spoke device instead of the FAR.

    Substation Network Services

    When address translation is required (one MAP-T domain per substation), customers can choose between the Cisco ISR4400 series and the Cisco ASR1000 series. Customers can also use virtual routers (CSR1000v) if compute resources are available within the substation.

    Note: If customers already have routers that can provide these services, then no additional equipment within the substation needs to be acquired.

    256354.jpg  

    256330.jpg  

    Cisco ISR4331
    Cisco ASR1001-X

    256341.jpg  

    Cisco CSR1000v

    Table 31 Cisco Substation WAN Routers Overview

    Feature
    ISR4331
    ASR1001-X
    CSR1000v
    Comments
    Virtual Router

    No

    No

    Yes

    --

    Gigabit SFP Ethernet Ports (Fiber)

    2

    6

    No

    --

    Power Source

    AC or DC

    AC or DC

    N/A

    --

    Redundant Power

    No

    Yes

    N/A

    --

    Throughput

    100/300 Mbps

    2.5/5/10/20 Gbps

    10, 50, 100, 250, and 500 Mbps, and 1, 2.5, 5 Gbps

    For higher performance rates additional licenses are required

    Operating Temp.

    32°F to +104°F

    32°F to +104°F

    N/A

    --

    Table 32 Cisco Substation WAN Routers Part Numbers

    Part Number
    Description

    ISR4331/K9

    Cisco ISR 4331 (3GE, 2NIM, 1SM, 4G FLASH, 4G DRAM, IPB)

    ASR1001-X

    Cisco ASR1001-X Chassis, 6 built-in GE, Dual P/S, 8GB DRAM

    L-CSR-100M-SEC=

    CSR 1000V e-PAK 10Mbps Security Package

     

     

     

     

    Notes:

    For additional information on Cisco ISR4331 router, refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/products/collateral/routers/4000-series-integrated-services-routers-isr/data_sheet-c78-732542.html#Product-Specifications

    For additional information on Cisco ASR1001-X router, refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/products/collateral/routers/asr-1000-series-aggregation-services-routers/datasheet-c78-731632.html?cachemode=refresh

    For additional information on Cisco CSR1000v router, refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/products/collateral/routers/cloud-services-router-1000v-series/datasheet-c78-733443.html

    The following components are optional and recommended as best practices to increase the security of the substation network and provide additional services at the edge of the network.

    Customers can deploy security services within the substation network to build a layer of protection between the utility substation WAN, LAN, and FAN layers. A variety of hardened firewalls, such as the Cisco ISA 3000 and ASA5506H-X, are available, as well as non-hardened ASA5500 series firewall appliances for substations with HVAC systems.

    256353.jpg  

    256326.jpg  

    Substation Firewall - ISA3000
    Cisco ASA5506H-X

    Table 33 Cisco Substation Firewall Products Overview

    Feature
    ISA3000
    ASA5506H-X
    Power Options

    DC

    AC

    Power Redundancy

    Yes

    No

    Mounting Kit

    Din Rail

    Rack

    Environmental Rating

    IP40

    IP30

    Alarm Contacts

    Yes

    No

    Hardware bypass

    Yes

    No

    Gigabit Ethernet Ports

    4

    4

    Fiber Ports

    Yes, 2

    No

    Fiber SFP

    No

    Yes

    Operating Temp.

    -40°F to +165°F

    -4°F to +140°F

     

     

    Notes:

    For additional information on the Cisco hardened ISA3000 firewall, refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/support/security/industrial-security-appliance-3000/model.html

    For additional information on the Cisco hardened ASA5506H-X firewall, refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/products/security/asa-firepower-services/ruggedized.html

    For additional information on the Cisco non-hardened Next-Generation firewalls, refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/products/security/firewalls/index.html#~products

    Besides security services, customers can also deploy hardened Compute Gateways in the substation using the IC3000 appliance. The IC3000 is a dedicated device for Edge Compute applications within the substation layer, and additional appliances can be deployed to meet the growing demand for compute processing at the edge of the network.

    Figure 49 Cisco IC3000

     

    256344.jpg
    Part Number
    Description

    IC-3000-2C2F-K9

    Cisco IC3000 Industrial Compute Gateway

    Part Number
    Description

    IOTFND-SOFTWARE-K9

    IoT FND License for Managing IC3000 industrial compute: 1, 3, 5-year license

    Note: For additional information on the Cisco hardened IC3000 Compute Gateway, refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/products/collateral/routers/3000-series-industrial-compute-gateways/datasheet-c78-741204.html

    WAN Control Center

    In the Control Center, the HERs aggregate all the FlexVPN tunnels from the remote FAR devices or Cellular DA Gateways. More details can be found in Cisco Headend Router Overview.

    Headend Layer Infrastructure Components

    Application Layer

    The Cisco Resilient Mesh Headend Infrastructure is flexible and modular, and it can be integrated into any Utility Control Center network infrastructure. The Headend infrastructure can be deployed as a new infrastructure block, separate from the existing utility grid control systems, that allows only telemetry communication and access to the Cisco Resilient Mesh network management system to flow between the two systems. Furthermore, some of the headend components, such as the Certificate Infrastructure and the IP address and name management tools, can be re-used if the utility has already deployed them.

    Figure 50 Headend Infrastructure Components

     

    256055.jpg

    Note: The Headend Infrastructure design is outside the scope of this document. Please check SalesConnect for the Cisco Field Area Network Full Headend Implementation Guide. If you do not have access, please reach out to your local sales account team.

    blank.gif https://salesconnect.cisco.com/open.html?c=db570d3f-3212-4659-a306-5f65aeab862b

    Certificate Infrastructure Overview

    The Cisco solution requires a Public Key Infrastructure (PKI) to deploy and manage the solution's device certificates. This approach provides a secure communication infrastructure between components that can also scale to a very large number of devices. The solution requires two types of certificates: RSA for devices with ample memory, such as HER or CGR routers, and ECC for constrained devices with limited hardware resources, such as the IR510/IR530 and AMI meters. ECC certificates are also smaller, which reduces bandwidth utilization in the mesh network. Cisco supports the Microsoft Certificate Authority services that are part of Microsoft Windows Server 2016 and 2012.
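    The size difference is easy to see in isolation. The minimal sketch below, using the open-source Python cryptography package (purely illustrative, not part of the solution's toolchain), compares an RSA-2048 signature with an ECDSA P-256 signature of the same payload.

```python
# Illustrative comparison of RSA vs. ECC signature sizes (not part of the FAN toolchain).
# Requires the "cryptography" package: pip install cryptography
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, ec, padding

message = b"sample CSMP-style telemetry payload"

rsa_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)  # HER/CGR-class cert
ec_key = ec.generate_private_key(ec.SECP256R1())                          # IR510/IR530-class cert

rsa_sig = rsa_key.sign(message, padding.PKCS1v15(), hashes.SHA256())
ec_sig = ec_key.sign(message, ec.ECDSA(hashes.SHA256()))

# RSA-2048 signatures are 256 bytes; ECDSA P-256 signatures are roughly 70-72 bytes (DER encoded)
print(f"RSA-2048 signature: {len(rsa_sig)} bytes")
print(f"ECDSA P-256 signature: {len(ec_sig)} bytes")
```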

    User and Device Security Management Overview

    In order to manage the system's users and devices, a directory service is required. Microsoft Active Directory is one of the supported options, and it can be easily integrated with the Certificate Authority services.

    In addition, a Network Policy Service (NPS) is required to perform device authorization. The Cisco solution uses the Cisco Identity Services Engine (ISE), since it offers additional capabilities over other products, such as ease of integration with the rest of the Cisco infrastructure and VPN services. Alternatively, utilities can use Microsoft NPS.

    Note: For additional information, refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/products/security/identity-services-engine/index.html

    Address and Name Management Overview

    In order to ease deployment and address large-scale requirements, the Cisco DA solution requires a Dynamic Host Configuration Protocol (DHCP) service to dynamically allocate IPv4 and IPv6 addresses to the infrastructure. A DNS service is also required to easily identify the solution components; DNS plays an important role for Edge Compute applications. Cisco Network Registrar (CNR) offers both services and is the preferred component. Other services can be used as long as they meet the solution requirements.

    Note: For additional information, refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/products/cloud-systems-management/network-registrar/index.html

    Field Network Director Overview

    The Cisco Field Network Director (FND) is the main solution management system that provides Fault, Configuration, Accounting, Performance, and Security (FCAPS) services.

    Note: For additional information, refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/products/cloud-systems-management/iot-field-network-director/index.html

    Hardware Security Module or Software Security Module Overview

    The FAN solution requires a security module to digitally sign the IPv6 CSMP messages between the FND and Field Devices (IR510) to provide message authenticity. Customers can use a hardware appliance like SafeNet for the best level of security or leverage software functionality as a low-cost alternative. The SSM is licensed under the FND product.

    Note: For additional information on SafeNet HSM, refer to the following URL:

    blank.gif https://safenet.gemalto.com/data-encryption/hardware-security-modules-hsms/safenet-network-hsm/

    Tunnel Provisioning Server Overview

    In order to ease and automate the deployment of the solution, Cisco has developed the Tunnel Provisioning Server (TPS) to help provision the initial VPN configuration for the FAR devices. Since it communicates with devices in less secure zones, it acts as a configuration proxy service between the FAR devices and the Cisco FND.

    Registration Authority Overview

    Cisco routers provide Registration Authority (RA) services for the initial device onboarding with the Certificate Infrastructure. Network components such as the FARs, Field Devices, and the NMSs receive the proper certificates once they have been authenticated and authorized.

    Note: For additional information, refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/products/routers/4000-series-integrated-services-routers-isr/index.html

    Compute Infrastructure Layer

    The Headend software components can be installed on any type of compute infrastructure as long as it meets the products' installation hardware requirements. For customers that use Cisco UCS servers or are new to Cisco's server product family, the next section provides guidance on product selection for installing the various FND components.

    Cisco UCS Servers Overview

    Note: The data in the table below represents reference numbers for general guidance. For more accurate information, please contact the Cisco account team for the latest product capabilities numbers.

    256334.jpg  

    256335.jpg  

    Cisco UCS C220 M5
    Cisco UCS C240 M5

    256336.jpg  

    Cisco UCS C480 M5

    Table 34 Cisco DC Compute Products Overview

    Feature
    C220 M5
    C240 M5
    C480 M5
    Rack Space

    1 RU

    2 RU

    4 RU

    Power Supplies

    2

    2

    4

    CPU Sockets

    2

    2

    2 or 4

    CPU Intel 81xx

    24 cores @2.7Ghz
    Max: 48 cores
    4 cores @3.6Ghz
    Max: 8 cores

    24 cores @2.7Ghz
    Max:48 cores
    4 cores @3.6Ghz
    Max:8 cores

    24 cores @2.7Ghz
    Max:96 cores
    4 cores @3.6Ghz
    Max:16 cores

    CPU Intel 61xx

    18 cores @2.7Ghz
    Max:36 cores
    8 cores @3.5Ghz
    Max:16 cores

    18 cores @2.7Ghz
    Max:36 cores
    8 cores @3.5Ghz
    Max:16 cores

    18 cores @2.7Ghz
    Max:72 cores
    8 cores @3.5Ghz
    Max:32 cores

    CPU Intel 51xx

    4 cores @3.6Ghz
    Max: 8 cores

    4 cores @3.6Ghz
    Max: 8 cores

    4 cores @3.6Ghz
    Max: 16 cores

    Max Virtual Core

    Up to 56

    Up to 56

    Up to 112

    Memory @2666MHz

    Up to 128GB

    Up to 128GB

    Up to 6TB

    Disks Slots

    10

    26

    32

    Disk RPMs

    Min.15K

    Min.15K

    Min15K

    RAID

    Raid 10

    Raid 10

    Raid 10

    Max Storage (15K disks)

    9TB

    23.4TB

    28.8TB

    PCI Express

    2

    6

    12

    FND Components Recommendation

    FND Application, TPS

    FND Database

    FND Database

    Scale

    Small to Large

    Small to Medium

    Medium and Large

    For additional information on FND 4.3 Components Hardware requirements, refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/td/docs/routers/connectedgrid/iot_fnd/release_notes/4_3/rn-iot-fnd-4-3.html#88464

    Table 35 Cisco Compute Hardware Part Numbers

    Part Number
    Description

    UCSC-C220-M5SX

    UCS C220 M5 10 SFF front drives with no CPU, memory, HDD, PCIe cards, or power supply

    UCSC-C240-M5SX

    UCS C240 M5 24 SFF front drives with option for two SFF rear drives with no CPU, memory, HDD, PCIe cards, or power supply

    UCSC-C480-M5

    UCS C480 M5 standard base chassis w/o CPU, mem, HDD, PCIe, PSU

    Notes:

    For ordering and server configuration information on UCS C220 M5, refer to the following URL:

    blank.gif https://www.cisco.com/c/dam/en/us/products/collateral/servers-unified-computing/ucs-c-series-rack-servers/c220m5-sff-specsheet.pdf

    For ordering and server configuration information on UCS C240 M5, refer to the following URL:

    blank.gif https://www.cisco.com/c/dam/en/us/products/collateral/servers-unified-computing/ucs-c-series-rack-servers/c240m5-sff-specsheet.pdf

    For ordering and server configuration information UCS C480 M5, refer to the following URL:

    blank.gif https://www.cisco.com/c/dam/en/us/products/collateral/servers-unified-computing/ucs-c-series-rack-servers/c480-m5-high-performance-specsheet.pdf

    Network Layer

    Customers can leverage the existing switching and routing infrastructure within their Control Center or acquire new equipment to build a dedicated environment for the FAN DA solution. For the latter case, customers can select products listed in the next sections.

    Switching and Routing Infrastructure Overview

    Cisco has a large portfolio of switching products for Enterprise as well as Industrial customers. For the Control Center switching infrastructure, the Catalyst 9200 switching family is a great platform to use because of its cost and its support for network automation (SD-Access). The Catalyst 9200 family offers two types of chassis: a modular one that allows customers to upgrade the uplink ports later, and one with a fixed configuration. Both versions are stackable, which allows the platform to be used for small or large Control Center networks.

    For Control Centers interconnected by a private WAN and where customers would like to distribute precision timing from the Control Center to remote Substations, the Cisco IE5000 switches can be used to complement the Catalyst 9200 switching infrastructure. The Cisco IE 5000 switch can be connected directly to a GPS interface and, through an additional license, can work as a PTP Grandmaster.

    256379.jpg  

    Cisco Catalyst 9200

    256380.jpg  

    Cisco IE 5000

    Table 36 Cisco DC Switching Products Overview

    Feature
    9200-24T
    9200-48T
    IE5000 12S12P-10G
    Rack Space

    1RU

    1RU

    1 RU

    Power Supplies

    2

    2

    2

    Input Power

    AC

    AC

    AC or DC

    Stacking

    StackWise

    StackWise

    Horizontal Stacking via 2x10 ports

    Power Supply Redundancy

    Yes

    Yes

    Yes

    Forwarding Line Rate

    160Gbps

    160Gbps

    64Gbps

    Switching Bandwidth

    128Gbps

    176Gbps

    128Gbps

    Forwarding Rate

    190.4 Mpps@64B

    190.4 Mpps@64B

    95.23 Mpps @64B

    10/100/1000 Ports

    24

    48

    12

    100/1000 SFP Ports

    No

    No

    12

    1/10G

    4

    4

    4

    # of VLANs

    4096

    4096

    1000

    Basic L3 Routing

    Routed Access (RIP, EIGRP Stub, OSPF - 1000 routes), PBR

    Routed Access (RIP, EIGRP Stub, OSPF - 1000 routes), PBR

    IPv4 Static Routing

    Advanced L3 Routing
    EIGRP, HSRP, IS-IS, OSPF

    Yes, with Network Advantage Subscription

    Yes, with Network Advantage Subscription

    Yes, with IP Services License

    L3 Multicast Routing

    Yes, with Network Advantage Subscription

    Yes, with Network Advantage Subscription

    Yes, IP Services Lic.

    Redundancy Protocols

    MSTP, PVRST+

    MSTP, PVRST+

    REP, PTP, MRP

    Software Define (SD)

    SD-Access

    SD-Access

    SDA extended node

    Timing

    No

    No

    GPS, IRIG-B, PTP (1588v2)

    Table 37 Cisco DC Switching Products Part Numbers

    Part Number
    Description
    Comments

    C9200L-24T-4X-A

    Catalyst 9200L 24-port data only, 4 x 10G, Network Advantage

    --

    C9200L-24P-4X-A

    Catalyst 9200L 24-port PoE+, 4 x 10G, Network Advantage

    PoE+

    C9200L-48P-4G-A

    Catalyst 9200L 48-port PoE+, 4 x 1G, Network Advantage

    PoE+

    C9200L-48T-4X-A

    Catalyst 9200 48-port data only, 4 x 10G, Network Advantage

    --

    IE-5000-12S12P-10G

    IE5000 12x1G SFP+12x10/100/1000 + 4 1G/10G LAN BASE

    Layer 3 requires additional license (L-IE5000-RTU=)

    Notes:

    For more information on Catalyst 9200 switches, refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/products/collateral/switches/catalyst-9200-series-switches/nb-06-cat9200-ser-data-sheet-cte-en.html

    For more information on IE 5000 switches, refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/products/collateral/switches/industrial-ethernet-5000-series-switches/datasheet-c78-734967.html

    Cisco Headend Router Overview

    To secure communication over any type of utility WAN implementations, the Cisco DA solution uses an overlay VPN technology. Therefore, the solution becomes agnostic to the transport infrastructure, maintaining the same design functionality, which simplifies the manageability of the network.

    The HERs terminate the VPN tunnels from the FAN devices, in particular the FAR. For large deployments, multiple HERs can be configured to operate in a cluster. Cisco has a large portfolio of WAN routers that match any deployment size of the DA solution.

    Customers can choose different models from the same routing product family for consistency and ease of management. The following products can be used as HER devices.

    Note: The data in the table below represents reference numbers (single dimension) for general guidance. For more accurate information, please contact the Cisco account team for the latest product capabilities numbers.

    256355.jpg  

    256341.jpg  

    Cisco ISR4451-X
    Cisco CSR1000v

    256329.jpg  

    256332.jpg  

    DA-FA-DG-73.jpg  

    Cisco ASR1001-X
    Cisco ASR1002-X
    Cisco ASR1006-X

    Table 38 Cisco Headend Router Products Overview

    Feature
    ISR4451
    CSR1000v (4vCPU 4G)
    ASR1001-X
    ASR1002-X
    ASR1006-X
    Comments
    Virtual Router

    No

    Yes

    No

    No

    No

    --

    Modular Chassis

    Yes

    No

    Yes

    Yes

    Yes

    --

    Onboard Gigabit SFP Ethernet Ports (Fiber)

    4

    No

    6

    6

    0

    ASR 1006 requires separate Ethernet module

    Onboard TenGigabit SFP Ethernet Ports (Fiber)

    No

    N/A

    2

    2

    0

    ASR 1006-X requires a separate Ethernet module

    Power Source

    AC or DC

    N/A

    AC or DC

    AC or DC

    AC or DC

    --

    Redundant Power

    No

    N/A

    Yes

    Yes

    Yes

    --

    Software Redundancy

    No

    No

    Yes

    Yes

    Yes

    --

    Hardware Redundancy

    No

    No

    No

    No

    Yes

    --

    Throughput

    2 Gbps

    10, 50, 100, 250, and 500 Mbps, and 1, 2.5, 5 Gbps

    2.5 to 20 Gbps

    5 to 36 Gbps

    40 to 100 Gbps

    For higher performance rates additional licenses are required

    Encryption Throughput (IMIX)

    1.3Gbps

    Up to 1 Gbps

    Up to 5 Gbps

    Up to 4 Gbps

    13/16 Gbps

    CSR1000v based on UCS C240M3 with SR-IOV

    BGP Adjacencies

    2800

    2000

    4000

    6000

    6000

    --

    FlexVPN Tunnels

    3000

    1700

    4000

    10,000

    10,000

    --

    Max IPv4/IPv6 Routes

    1 Million

    270,000

    2 Million

    3 Million

    4 Million

    --

    Operating Temp.

    32°F to +104°F

    N/A

    32°F to +104°F

    32°F to +104°F

    32°F to +104°F

    --

    WAN Deployments

    Small

    Small

    Medium

    Medium

    Large

    --

    Table 39 Cisco Headend Router Products Part Numbers

    Part Number
    Description

    ISR4451-X/K9

    Cisco ISR 4451 (4GE, 3NIM, 2SM, 8G FLASH,4G DRAM)

    ASR 1001-X

    Cisco ASR1001-X Chassis, 6 built-in GE, Dual P/S, 8GB DRAM

    ASR 1002-X

    Cisco ASR1002-X Chassis, 6 built-in GE, Dual P/S, 4GB DRAM

    ASR 1006-X

    Cisco ASR1006-X Chassis

    Note: For CSR1000v product details, please reach out to your Cisco sales account team.

    Network Security Devices Overview

    Network firewalls and end-node agents provide security boundaries between the Grid Operation Systems and the different components of the Cisco Headend that communicate with devices outside the utility's physical security boundary. Cisco offers a comprehensive portfolio of security network appliances and software applications.

    Note: The data in the table below represents reference numbers (single dimension) for general guidance. For more accurate information, please contact the Cisco account team for the latest product capabilities numbers.

    256328.jpg  

    256357.jpg  

    Cisco ASAv
    Cisco NGFWv

    256327.jpg  

    256342.jpg  

    Cisco ASA 5525-X
    Cisco Firepower 2110

    256343.jpg  

    Cisco Firepower 4110

    Table 40 Cisco DC Firewall Product Overview

    Feature
    ASAv10
    NGFWv
    ASA5525-X
    NGFW 2110
    NGFW 4110
    Virtual Device

    Yes

    Yes

    No

    No

    No

    High Availability

    Yes

    Yes

    Yes

    Yes

    Yes

    Active/Active Failover

    No

    No

    Yes

    Yes

    Yes

    Stateful inspection firewall throughput

    500 Mbps

    1.9 Gbps

    2 Gbps

    3.0 Gbps

    35Gbps

    Maximum concurrent sessions, with AVC

    100,000

    100,000

    500,000

    1 Million

    10 Million

    Maximum VPN Peers

    250 - 1vCPU

    Dependent on host hardware

    750

    1500

    10000

    IPSec VPN Throughput (1024B TCP w/Fastpath)

    125 Mbps

    Dependent on host hardware

    300 Mbps

    500 Mbps

    8 Gbps

    Transparent Firewall

    Yes

    Yes

    Yes

    Yes

    Yes

    Multi-Context

    No

    No

    Yes

    Yes

    Yes

    Clustering Support

    No

    No

    Yes

    No

    Yes, up to 16

    EtherChannel

    No

    No

    Yes

    Yes

    Yes

    IPv6

    Yes

    Yes

    Yes

    Yes

    Yes

    Dynamic Routing

    OSPF, EIGRP

    OSPF/EIGRP/
    BGP

    OSPF/EIGRP/
    BGP

    OSPF/EIGRP/
    BGP

    OSPF/EIGRP/
    BGP

    NAT

    IPv4/IPv6 NAT46/64/66

    IPv4/IPv6 NAT46/64/66

    IPv4/IPv6 NAT46/64/66

    IPv4/IPv6 NAT46/64/66

    IPv4/IPv6 NAT46/64/66

    Multicast Routing

    Yes, IPv4

    Yes, IPv4

    Yes, IPv4

    Yes, IPv4

    Yes, IPv4

    DA Deployments

    Small

    Small

    Small

    Medium

    Large

    Table 41 Cisco DC Firewall Product Part Numbers

    Part Number
    Description

    L-ASAV10S-STD

    Cisco ASAv10 (1 Gbps) with all firewall features licensed

    FPRTD-V-K9

    Cisco Firepower Threat Defense Virtual Appliance

    ASA5525-K9

    ASA 5525-X with SW, 8GE Data, 1GE Mgmt, AC, 3DES/AES

    FPR2110-NGFW-K9

    Cisco Firepower 2110 NGFW Appliance, 1U

    FPR4110-NGFW-K9

    Cisco Firepower 4110 NGFW Appliance, 1U, 2 x NetMod Bays

    Notes:

    For additional information on ASAv, refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/products/collateral/security/adaptive-security-virtual-appliance-asav/datasheet-c78-733399.html?cachemode=refresh

    For additional information on NGFWv, refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/support/security/firepower-ngfw-virtual/tsd-products-support-series-home.html

    For additional information on FRP2110, refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/products/collateral/security/firepower-ngfw/datasheet-c78-736661.html

    For additional information on FRP4110, refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/products/collateral/security/firepower-ngfw/data_sheet-c78-736661.html

    Solution Deployment Models for DA

    This chapter includes the following major topics:

    blank.gifUtility SCADA Systems Architecture Overview

    blank.gifCisco DA Feeder Automation Solution based on Standard Unlicensed 900MHz ISM Band

    blank.gifCisco DA Feeder Automation Solution using Public Cellular Service (3G/4G)

    blank.gifCisco DA Feeder Automation based on Hybrid Design: Cellular & 900MHz ISM

    Utility SCADA Systems Architecture Overview

    Traditionally, SCADA system architecture is centralized and uses a single tier between the SCADA Primary stations, located in the Primary and Back-up Control Centers, and the SCADA clients, which are typically RTUs located in the substations. This architecture has worked well as communication technology evolved and made its way onto the distribution grid network for asset monitoring and control. Figure 51 is an example of such a system.

    Figure 51 Traditional, Centralized SCADA System Architecture

     

    256057.jpg

    However, with the introduction of Distributed Energy Resources (DER) and new requirements to increase the reliability of the distribution grid, customers in both the European and North American markets have started considering a two-tier SCADA architecture, sometimes called the Distributed SCADA architecture. It moves some of the decision logic from the Control Centers closer to the substations in order to achieve better system reaction times and to make the system more flexible to changes in the grid.

    Figure 52 Distributed SCADA System Architecture

     

    256058.jpg

    In the next couple of sections, you will learn about the different Cisco FAN DA designs available for implementation that will support both types of SCADA architectures. The Private Network solution is a better option if you have a distributed SCADA architecture or believe you will be moving towards one in the near future.

    Cisco DA Feeder Automation Solution based on Standard Unlicensed 900MHz ISM Band

    Utility customers that prefer a private solution can opt to run their Distribution Automation systems using only the Cisco Resilient Mesh design based on the license-free 900 MHz ISM band. This design eliminates dependencies on third-party services and allows customers to operate the network end to end, from provisioning to configuration management to troubleshooting. It offers a lower operating cost model, since there is no monthly recurring service cost for transport, and it eases the regulatory compliance process. It is also less vulnerable to the weather-related or crowding events that affect a cellular service. This design is well suited for large or small utilities, can be deployed in urban or rural territories, and supports both centralized and distributed SCADA systems.

    For utilities with a distribution substation fiber-rich WAN network, the FAR can be installed within the substation yard. This type of implementation is better suited for Distributed SCADA systems or future Smart Grid implementations where the Grid control needs to be regionalized for fast control response.

    The design can leverage a VPN service over the existing substation WAN if encryption is required or if the WAN is not IPv6 capable.

    Figure 53 DA Design Based on 900MHz ISM Band Solution

     

    256299.jpg

    For substations that don't have a high-speed WAN connection, customers can choose to use the Cisco Outdoor Wi-Fi mesh solution to backhaul the Resilient Mesh traffic towards the closest substation that has a fiber WAN connection.

    The Wi-Fi backhaul can use a mesh topology similar to the Resilient Mesh, or point-to-point Wi-Fi links, to extend the substation connectivity. An additional benefit of this design is that customers can deploy DA applications that require high bandwidth and low latency, such as ultra-fast FLISR based on IEC 61850 or microPMUs. The FARs can be installed anywhere in the field where the utility has assets.

    Figure 54 Wi-Fi Backhaul for Location without Fiber

     

    256060.jpg

    Cisco DA Feeder Automation Solution using Public Cellular Service (3G/4G)

    As an alternative to the 900 MHz ISM band, customers can use a design based only on the Cisco Cellular products for DA applications. This design has a higher OpEx because of the recurring Cellular service fees, but in countries where the 900MHz band is not supported, Public Cellular is the only reasonable option.

    Besides cost, customers need to take into consideration cellular base station congestion, which can be caused temporarily by crowd events or permanently by over-subscription once a cell exceeds its originally planned capacity. Public Cellular service does not offer service priority among its tenants. Cellular technology also uses asymmetric bandwidth, designed to offer a higher download speed than upload speed; the DA traffic profile is the opposite, with most of the traffic and bandwidth requirements on the uplink, because the SCADA primary system sends small requests and the grid field devices respond with larger amounts of data.

    Each distribution automation grid asset is connected to a Cisco DA gateway that builds a VPN tunnel over the Service Provider Cellular network towards the Control Centers to encrypt the data. Service Providers support both hub-and-spoke and any-to-any topologies for peer-to-peer communication, and the Cisco VPN services can be configured to match the Service Provider topology for optimal traffic flow. This design accommodates both Centralized and Distributed SCADA implementations. For the latter scenario, a cellular DA gateway might need to be installed within the distribution substation even if the substation has fiber or other high-speed connectivity; traffic from nearby field devices can then share the same cellular base station with the Substation DA Gateway, which reduces communication latency for a two-tier SCADA implementation. The Substation DA Gateway can also be used as a back-up for the substation's primary WAN link. Without a cellular gateway in the substation, traffic between grid field devices and nearby substation RTUs hairpins at the Control Center, since that is the only place that interconnects the Fiber and Cellular networks.

    Figure 55 Cellular DA Design

     

    256061.jpg

    Cisco DA Feeder Automation based on Hybrid Design: Cellular & 900MHz ISM

    For customers that prefer a Distribution Automation design based on private, provider-managed Cellular services, but whose territory has underdeveloped cellular coverage or pockets with no service, or where certain locations require more bandwidth than the ISM band can offer, a hybrid DA design combining cellular and Resilient Mesh is more appropriate. The 900MHz solution complements the cellular service by allowing customers to automate feeders located in new development areas or in rural areas that lack cellular coverage. Customers can install the FARs close to the edge of the cellular coverage area and extend network connectivity using Field Devices and the ISM 900 MHz band. Figure 56 is an example of such a deployment.

    Figure 56 Hybrid DA Design

     

    256062.jpg

    Design Considerations for DA Feeder Automation Deployments Based on 900MHz ISM Band Solution

    This chapter focuses on the technical design considerations that a network architect or engineer needs to understand in order to successfully develop a detailed design for a FAN network based on the 900MHz ISM band. It includes the following major topics:

    blank.gifEnd Devices Connectivity

    blank.gifIP Address Schema

    blank.gifFragmentation and Reassembly

    blank.gifNetwork Routing

    blank.gifNetwork Services

    blank.gifNetwork Security

    blank.gifNetwork Management System

    blank.gifNetwork Availability and Resiliency

    blank.gifNetwork Scalability

    blank.gifNetwork Flexibility

    blank.gifRF Design Considerations

    This first version of the document will focus on designs for which the FAR or CGR routers are installed within the utility's substation premises and leverage the utility's Substation Private WAN as backhaul connectivity.

    The design makes the following assumptions:

    blank.gifCustomers have a private WAN deployed to all their substation locations.

    blank.gifFAR devices will be installed within the substation premises.

    blank.gifFAR device backhaul connectivity will be via Ethernet (fiber or copper) to the Substation LAN or WAN block.

    blank.gifCustomer's SCADA system uses a centralized architecture and traffic is tunneled from the FAR router back to the Control Center HER using a single MAP-T domain.

    blank.gifThe Control Center network design is out of scope.

    Figure 57 High Level Design Topology

     

    256299.jpg

    Future releases of the document will also cover deployments where the FAR routers are deployed anywhere in the field and use Public Cellular Services for backhaul connectivity.

    The overall design effort will be broken down into the solution's three main layers:

    blank.gifFAN Resilient Mesh Infrastructure

    blank.gifWAN Infrastructure

    blank.gifHeadend Infrastructure

    The main emphasis is on the FAN Resilient Mesh Infrastructure layer, but the document also briefly touches on the remaining layers. References to other Design Guides that provide the details needed to develop an end-to-end design are also included.

    End Devices Connectivity

    Today, most Distribution Automation grid devices are connected to the Resilient Mesh communication equipment via legacy, low-speed asynchronous serial interfaces (RS232 or RS485). In order to transport the serial traffic from grid devices to the grid management systems located in the Control Centers over an IP network, Resilient Mesh uses a feature called Raw Socket to encapsulate the serial traffic in IPv4 packets over a UDP or TCP session. In this case, each field device (IR510) requires an IPv4 address to establish a session with the SCADA DMS servers or other terminating devices.
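    To make the Raw Socket behavior concrete, the stand-alone sketch below relays bytes between a serial-attached RTU and a SCADA front end over a TCP session. It only illustrates the encapsulation idea: on the IR510 the Raw Socket feature is built in and configured through FND rather than scripted, and the serial port, host, and TCP port shown are placeholders.

```python
# Stand-alone illustration of the Raw Socket idea: serial bytes from an RTU are
# carried unchanged inside a TCP session toward the SCADA front end, and replies
# are written back to the serial line. Placeholder values: /dev/ttyS1, 192.0.2.10:25000.
# Requires pyserial: pip install pyserial
import socket
import serial

SCADA_HOST, SCADA_PORT = "192.0.2.10", 25000                   # SCADA front end (placeholder)

rtu = serial.Serial("/dev/ttyS1", baudrate=9600, timeout=0.2)  # RS232-attached grid device
scada = socket.create_connection((SCADA_HOST, SCADA_PORT))
scada.settimeout(0.2)

while True:
    frame = rtu.read(256)          # e.g., a DNP3 or Modbus frame sent by the RTU
    if frame:
        scada.sendall(frame)       # encapsulate the serial payload in the TCP session
    try:
        reply = scada.recv(256)    # poll/control data coming back from the SCADA primary
        if reply:
            rtu.write(reply)       # replay it on the serial line toward the RTU
    except socket.timeout:
        pass
```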

    Figure 58 Serial Grid Device Connectivity

     

    256063.jpg

    Distribution grid devices that support Ethernet interfaces can be connected to the IR510 Ethernet 0 port using IPv4 or IPv6 protocol.

    Figure 59 Ethernet Grid Device Connectivity

     

    256064.jpg

    With the introduction of Edge Compute capabilities, customers can now run grid applications local to the distribution grid devices. This enables customers to perform more advanced grid operations where legacy grid devices have limited capabilities. In addition, customers can leverage IoT architectures and protocols like Constrained Application Protocol (CoAP), MQ Telemetry Transport (MQTT), and Data Distribution Service (DDS) to deploy a middleware message bus where data manipulation can be performed. For example, data from different distribution grid device types can be normalized and exchanged between multiple devices or systems across the Resilient Mesh. For these scenarios, the Edge Compute application container will require additional IPv4 addresses (and IPv6 addresses in the future).

    Note: Customers are responsible for developing or acquiring the third-party Edge applications hosted on the Cisco FAN DA devices. Cisco provides a management solution for deploying and monitoring such applications at scale.

    Figure 60 Edge Compute Application Connectivity

     

    256065.jpg

    IP Address Schema

    FAN Resilient Mesh Layer

    Mesh Address Space

    The Cisco Resilient Mesh solution was developed with a forward-looking mindset to overcome IPv4 address exhaustion and to support millions of end devices and sensors. Therefore, routing within the mesh is based on native IPv6. Since IPv4 is still prevalent for end devices, especially in the Utility Industry where IPv6 adoption is in its infancy because of limited equipment vendor support, the Resilient Mesh also leverages industry standards for address translation between IPv4 and IPv6 as traffic flows over the mesh.

    In summary, the mesh requires both IPv4 and IPv6 networks in order to transport the end grid devices traffic.

    Design Guidance:

    The size of the mesh network address space and the number of subnets will be based on the following factors:

    a.blank.gif The utility's number of end devices: capacitor bank controllers, reclosers, etc.

    b.blank.gif The utility's number of take-out points (exit points) out of the mesh. This depends on the utility's substation WAN fiber presence, as well as the end-device density of an area, since the ratio of Field Devices per FAR needs to be considered.

    c.blank.gif Number of Field Devices per Mesh PAN ID, including expected number of devices that would perform PAN migration

    d.blank.gif Number of mesh range extenders required to close any RF signal coverage gaps

    e.blank.gif Number of edge compute applications and number of locations where application(s) will be deployed

    f.blank.gif Future growth, additional Smart Grid use cases: DER, etc.
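    To show how these factors translate into numbers, the back-of-the-envelope sketch below estimates the PAN count, and therefore the number of /64 prefixes, from purely hypothetical device counts; every value is an assumption for illustration, not a recommendation.

```python
# Back-of-the-envelope sizing sketch; all counts are hypothetical illustrations.
import math

end_devices = 8000         # capacitor bank controllers, reclosers, voltage sensors, ...
range_extenders = 1200     # IR530s needed to close RF coverage gaps
devices_per_pan = 300      # planned nodes per CGR WPAN interface (PAN)
migration_headroom = 0.20  # reserve for nodes migrating in from adjacent PANs

nodes = end_devices + range_extenders
pans = math.ceil(nodes / (devices_per_pan * (1 - migration_headroom)))

print(f"Mesh nodes to address: {nodes}")
print(f"PANs required (one IPv6 /64 each): {pans}")
# A single /48 allocation contains 2**16 = 65,536 /64 subnets, so PAN prefixes,
# edge-compute subnets, and future use cases (DER, etc.) fit with ample headroom.
print(f"/64 subnets available in a /48: {2**16}")
```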

    In general, each Cisco mesh device requires the following address types, based on the design's redundancy and resiliency requirements, the substation WAN's support for IPv6, and the customer's requirements for edge compute applications:

    Figure 61 FAN Mesh Device Interfaces and IP Address Allocation

    Device Functional Name
    Device Type
    Interface Names
    IPv4 prefix
    IPv6 prefix
    Comments
    FAR

    CGR 1120 or 1240

    Loopback

    Yes

    Yes

     

    Backhaul Primary Ethernet Interface

    Yes

    Optional

    If WAN supports IPv6

    Backhaul Redundant Ethernet Interface (Optional)

    Yes

    Optional

    If WAN supports IPv6

    Backhaul Back-up Cellular Interface (Optional)

    Yes

    Optional

    If Cellular backup is required

    FlexVPN Tunnel Interface

    Yes

    Yes

     

    Mesh Interface (WPAN)

    N/A

    Yes

     

    Mesh WPAN HA Interface

    Yes

    N/A

    Uses HSRP

    Compute Module Interfaces

    Yes

    N/A

     

    IOX Applications Interfaces

    Yes

    Future

    If application requires IPv6

    FD

    IR510

    Mesh Interface (WPAN)

    N/A

    Yes

     

    Address Translation (MAP-T)

    Yes

    Yes

    Additional consideration

    IOX Applications

    N/A

    Optional

     

    Range Extender

    IR530

    Mesh Interface (WPAN)

    N/A

    Yes

     

    IPv6, like IPv4, uses two types of network addresses:

    blank.gifGlobally Unicast Addresses (GUA)

    blank.gifUnique Local Addresses (ULA)

    Global IPv6 prefix: Obtained through one of the five Regional Internet Registries (RIR): AFRINIC, APNIC, ARIN, LACNIC, or RIPE. The entity requesting the prefix from the RIR must be registered with the RIR as either a Local Internet Registry (LIR) or end-user organization. As an alternative, a global prefix might be obtained from an ISP.

    A utility should consider registering as a LIR to obtain its own IPv6 prefix and therefore be fully independent from any churn in the ISP addressing architecture.

    RIRs define policies regarding the allocation of an IPv6 prefix and the prefix size. By default, an RIR allocates a /32 prefix to a LIR and a /48 to an end-user organization. The RIR policies also define how larger or smaller prefixes can be allocated to a LIR or an end-user organization.

    Note: For additional information on RIR, refer to the following URL:

    blank.gif https://www.ripe.net/publications/docs/ripe-707

    A justification that is based on the number of sites and hosts must be given for the non-default allocation. The number of FAN sites and subnets drive the decision to register as a LIR or as an end-user organization and further justify the requests made for prefix allocation and size.

    ULA IPv6 prefix: A Unique Local Address (ULA) IPv6 prefix, documented in RFC 4193, is allowed to be "nearly unique." It starts with the FC00::/7 value, but the following 40-bit Global ID allows an addressing space far greater than the three private IPv4 prefixes (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16) documented in RFC 1918. The size of the Global ID effectively produces pseudo-uniqueness. Note, however, that no central registration of ULA prefixes currently exists.
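    As an illustration of how "nearly unique" is achieved, the sketch below follows the RFC 4193 derivation (SHA-1 over an NTP-format timestamp and an EUI-64-style identifier, keeping the least-significant 40 bits as the Global ID) to produce a ULA /48. It is a demonstration of the algorithm only, not a provisioning tool.

```python
# Sketch of the RFC 4193 ULA Global ID derivation (demonstration only).
import hashlib
import ipaddress
import time
import uuid

# 64-bit NTP-format timestamp and an EUI-64-style identifier built from the local MAC
ntp_time = int((time.time() + 2208988800) * 2**32).to_bytes(8, "big")
mac = uuid.getnode().to_bytes(6, "big")
eui64 = mac[:3] + b"\xff\xfe" + mac[3:]

digest = hashlib.sha1(ntp_time + eui64).digest()
global_id = digest[-5:]                                 # least-significant 40 bits

prefix_bytes = bytes([0xFD]) + global_id + bytes(10)    # fd00::/8 | Global ID | zeros
ula = ipaddress.IPv6Network(f"{ipaddress.IPv6Address(prefix_bytes)}/48")
print(ula)   # e.g., fd3c:1a2b:9d4e::/48, "nearly unique" and never advertised to the Internet
```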

    The main differences between selecting a global or ULA IPv6 prefix are the following:

    blank.gifA global prefix requires registration with the RIR, either as a LIR or as an end-user organization. This requires paperwork and fees before an IPv6 prefix allocation can be justified and obtained. A ULA does not require this registration.

    blank.gifFiltering at the border of the utility routing domain:

    blank.gifA ULA IPv6 prefix must NEVER be advertised to the Internet routing table.

    blank.gifA global IPv6 prefix or portions of its address space might be advertised to the Internet routing table and incoming traffic MUST be properly filtered to block any undesirable traffic.

    blank.gifInternet access: A ULA-based addressing architecture requires the IPv6-to-IPv6 Network Prefix Translation (IPv6 NPT, RFC 6296) device(s) to be located at the Internet border. Remote workforce management use cases, such as third-party technicians connecting to their corporate network from a FAN site or an FND operator using the Google map features, might require Internet access. For web access, web proxies can be a solution.

    blank.gifOnce an IPv6 prefix has been allocated for the FAN, a hierarchy numbering the regions, districts, sites, subnets, and devices must be properly structured. IPv6 addressing is classless, but the 128-bit address can be split into a routing prefix (the upper 64 bits) and an Interface Identifier (IID, the lower 64 bits). A hierarchical structure eases the configuration of route summarization and filtering.
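    A hedged sketch of such a hierarchy is shown below, carving one /48 allocation into regions, districts, and per-PAN /64 prefixes with Python's ipaddress module; the documentation prefix and the /52 and /56 boundaries are arbitrary choices for illustration, not a recommendation.

```python
# Illustrative carving of a hierarchical FAN addressing plan out of one /48
# allocation (region -> district -> PAN). The prefix is a documentation value
# and the /52 and /56 boundaries are arbitrary illustration choices.
import ipaddress

fan = ipaddress.IPv6Network("2001:db8:aa00::/48")       # example FAN allocation

regions = list(fan.subnets(new_prefix=52))              # 16 regions, one /52 each
districts = list(regions[0].subnets(new_prefix=56))     # 16 districts per region
pans = list(districts[0].subnets(new_prefix=64))        # 256 PANs (one /64 per CGR WPAN)

print(regions[0])    # 2001:db8:aa00::/52
print(districts[0])  # 2001:db8:aa00::/56
print(pans[5])       # 2001:db8:aa00:5::/64
# Summarization follows the hierarchy: a region can advertise a single /52
# upstream instead of hundreds of individual /64 PAN prefixes.
```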

    Design Guidance:

    Cisco supports both address space implementations and strongly recommends that customers using the Global Unicast Address space filter the DA solution's IPv6 prefixes at the company's Internet security border.

    For customers that choose to use ULA address space, it is good practice to ensure that NAT66 is not enabled between the DA IPv6 prefixes and the company's global Internet prefixes, in order to avoid routing traffic from the Internet into the DA infrastructure.

    Besides unicast addresses, the FAN DA solution also requires IPv6 multicast address allocation. Firmware upgrades for the FAN DA devices are delivered using multicast to optimize distribution of the software over the LLN.

    Mesh Layer 3 Boundaries

    The Cisco Resilient Mesh is a Layer 3 IPv6-routed network. Each Field Area Router (CGR) mesh radio interface (WPAN) is a Layer 3 boundary and defines a Personal Area Network (PAN) that shares the same Layer 3 network prefix among all devices associated with that PAN ID. Therefore, the address prefix length assigned to each CGR WPAN interface is directly related to the maximum number of field grid devices connected to the PAN, the number of Range Extenders (IR530) within the PAN, and the number of field devices expected to migrate to the PAN from adjacent PANs.

    Figure 62 CGR WPAN and IPv6 Address Assignment

     

    256066.jpg

    For FAR High Availability (HA) designs, a direct Ethernet link is required between the two CGR routers; it is used as a heartbeat link to detect CGR failure and for replication of the WPAN state, as described in FAN Infrastructure Layer. The HSRP protocol runs between the routers, so the IPv4 subnet prefix should be at least a /29 (six usable addresses, enough for both physical interfaces plus the HSRP virtual address).

    Address Translation

    For scenarios where the end grid devices are connected to the Field Devices (IR510) via serial interfaces or via Ethernet using IPv4, address translation is necessary to translate IPv4 packets into IPv6 packets and vice versa, allowing end-to-end communication between islands of IPv4 networks over the Mesh IPv6 network, as shown in Figure 63. Tunneling is another method, but because of the additional encapsulation overhead it is not suited for Low-power and Lossy Networks (LLNs).

    Figure 63 Address Translation (IPv4 and IPv6)

     

    256067.jpg

    The Cisco Resilient Mesh network uses IETF address and protocol translation, in particular MAP-T, which is a double stateless NAT64 translation. The benefit of stateless address translation is that Cisco devices do not have to maintain translation state, and DA devices can migrate within the mesh without affecting traffic flow over the new paths.

    Note: Currently, software version 6.0 running on the IR510 supports up to 15 unique NAT44 entries.

    MAP-T uses several key concepts: MAP-T domains; the mapping rules, namely the Default Mapping Rule (DMR), Basic Mapping Rule (BMR), and Forwarding Mapping Rule (FMR); and the Border Router (BR).

    Note: For additional MAP-T information, refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/td/docs/ios-xml/ios/ipaddr_nat/configuration/15-mt/nat-15-mt-book/iadnat-mapt.pdf
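    The stateless nature of the mapping can be illustrated by how the per-device IPv6 prefix is derived from the Basic Mapping Rule. The sketch below assumes a 1:1 mapping (share ratio 1, no port sets) and hypothetical prefixes; it is a conceptual illustration, not the exact IR510 algorithm.

        import ipaddress

        # Hypothetical Basic Mapping Rule (BMR) for one MAP-T domain (values are assumptions).
        BMR_IPV6 = ipaddress.IPv6Network("2001:db8:100::/56")
        BMR_IPV4 = ipaddress.IPv4Network("192.168.0.0/24")

        def end_user_ipv6_prefix(ipv4: str) -> ipaddress.IPv6Network:
            """Derive the per-device IPv6 prefix by appending the IPv4 host bits
            to the BMR IPv6 prefix (share ratio 1, no port sets). The rule is
            stateless, so any node in the domain computes the same mapping."""
            suffix = int(ipaddress.IPv4Address(ipv4)) - int(BMR_IPV4.network_address)
            prefix_len = BMR_IPV6.prefixlen + (32 - BMR_IPV4.prefixlen)   # 56 + 8 = 64
            base = int(BMR_IPV6.network_address) | (suffix << (128 - prefix_len))
            return ipaddress.IPv6Network((base, prefix_len))

        print(end_user_ipv6_prefix("192.168.0.25"))   # 2001:db8:100:19::/64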

    Design Guidance:

    Based on the utility SCADA architecture and the traffic patterns, MAP-T can be implemented in two types of designs:

    blank.gifSingle MAP-T domain per HER, which is suited for centralized SCADA where all traffic from the field grid devices flows to the Control Center SCADA systems and peer-to-peer communication between grid devices is either not required or can tolerate high latency. Depending on the FlexVPN design, the HER will learn from the FAR routers (CGRs) the host routes (/128) to the IR510 devices in each PAN. If FlexVPN spoke-to-spoke communication is not configured in the WAN, spoke-to-spoke traffic will hairpin at the Control Center. This design is suited for small-to-medium size networks with thousands of grid devices.

    blank.gifOne MAP-T domain per substation, which is optimal for use cases like FLISR that require peer-to-peer communication and have low latency requirements; the traffic does not have to hairpin at the Control Center HER device. This design requires an additional routing device in the substation to perform the Border Router functions. The advantages of this design are that it scales to millions of devices, provides optimal routing, and advertises only summaries, not host routes, in the WAN block.

    Figure 64 Single MAP-T Domain

     

    256068.jpg

    Figure 65 Multiple MAP-T Domains

     

    256069.jpg

    Table 42 MAP-T Configuration Options

    Field Grid Device Interface | Control Center Device Interface | Raw Socket Required | MAP-T Required
    Serial | Serial | Yes, IPv4 | Yes
    Serial | Ethernet IPv4 | Yes, IPv4 | Yes
    Serial | Ethernet IPv6 | Yes, IPv6 | No
    Ethernet IPv4 | Ethernet IPv4 | No | Yes
    Ethernet IPv6 | Ethernet IPv4 | No | Yes
    Ethernet IPv6 | Ethernet IPv6 | No | No

    Recommendation:

    For grid equipment upgrades or new grid equipment installations, customers should consider selecting vendor equipment that supports the IPv6 protocol, or at least work with grid equipment vendors to add IPv6 capabilities to their devices. This will help the industry adopt IPv6 faster and will simplify the FAN DA design by allowing end devices to communicate natively over IPv6.

    Wide Area Network Layer

    The FAN WAN layer uses a VPN overlay design to make the solution agnostic to the different utility customer distribution WAN network designs. The VPN service runs between the FAR devices or the Substation WAN routers and the HERs located in the Control Center.

    Customers can allocate any private IPv4 and IPv6 address space as long as the space is unused within the company. The best approach is to configure the VPN devices in a dual-stack configuration.

    The size of the address space depends on the following factors:

    blank.gifNumber of FAR devices (CGRs)

    blank.gifNumber of Control Centers

    blank.gifRedundancy requirements for the HER within a Control Center

    blank.gifRatio of FARs per HER device
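
    Together, these factors drive the size of the overlay address block. The sketch below shows the rough arithmetic, assuming hypothetical counts and one overlay address per FAR-to-HER tunnel plus addresses for the HERs themselves; it is an illustration, not a sizing rule.

        import math

        # Illustrative sizing inputs (assumptions, not recommendations).
        far_count       = 2000   # CGRs in the deployment
        control_centers = 2
        hers_per_cc     = 2      # HER redundancy within each Control Center

        # One overlay address per FAR-to-HER tunnel, plus the HER addresses.
        addresses_needed = far_count * control_centers * hers_per_cc + control_centers * hers_per_cc
        host_bits = math.ceil(math.log2(addresses_needed)) + 1   # +1 bit of growth headroom
        print(f"~{addresses_needed} addresses -> at least a /{32 - host_bits} IPv4 block")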

    Note: For additional information regarding DA WAN FlexVPN design, refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/td/docs/solutions/Verticals/Distributed-Automation/Secondary-Substation/DG/DA-SS-DG/DA-SS-DG-doc.html

    Headend Layer

    The Headend infrastructure address space depends heavily on the existing IP address schema used in the Control Center. In general, the new FAN DA Headend infrastructure requires both IPv4 and IPv6 address space.

    Table 43 Headend Infrastructure Devices and IP Address Allocation

    Device Functional Name | Device Type | IPv4 Prefix | IPv6 Prefix
    AD | Server | Yes | Optional
    CA - RSA | Server | Yes | Optional
    SUB-CA RSA | Server | Yes | Optional
    CA - ECC | Server | Yes | Optional
    CNAR | Server | Yes | Yes
    NPS | Server/Service | Yes | Optional
    ISE | Server | Yes | Optional
    TPS | Server | Yes | Optional
    FND | Server | Yes | Yes
    FND-DB | Server | Yes | Optional
    RA | Router | Yes | Optional
    HER | Router | Yes | Yes
    FD | IR510 | N/A | Yes

    Note: For additional information regarding the Cisco Field Area Network Full Headend Implementation Guide, refer to the following URL. If you don't have access to the resource, contact the local sales account team.

    blank.gif https://salesconnect.cisco.com/ - /search/headend/content

    Fragmentation and Reassembly

    FAN Resilient Mesh Layer

    End Devices MTU

    The grid devices connected using Ethernet interfaces use a Maximum Transmission Unit (MTU) of 1500 Bytes. The Ethernet standard does not support fragmentation, so it is the responsibility of the sender device to fragment packets over 1500 Bytes. If a link between the sender and receiver devices has a smaller MTU, as in VPN designs or Cellular transport services, then a network device in the path will perform network fragmentation, which is a resource-intensive process that degrades network service or simply causes packets to be dropped.

    With IPv6, based on RFC 8200, network devices no longer perform fragmentation; the RFC guidance is that the minimum IPv6 MTU configurable on an interface is 1280 bytes, which means packets smaller than that are almost guaranteed not to be fragmented. For transport interfaces with an MTU lower than 1280 bytes, an additional protocol must be implemented to perform fragmentation and reassembly below Layer 3. This was the case with the initial IEEE 802.15.4, where the Layer 2 MTU was 127 bytes and the mesh used the 6LoWPAN Adaptation Layer to perform fragmentation and reassembly. Further, upper layer protocols and applications should not send packets larger than 1500 Bytes unless they are sure the receiver can reassemble packets above 1500 Bytes.

    In order to avoid these types of scenarios, customers acquiring new grid equipment should ensure, at minimum, that the equipment can interpret IPv4 ICMP message type 3, "Destination Unreachable", with Code 4, "Fragmentation Needed and Don't Fragment was Set", from the DA Gateways and fragment the packets into smaller sizes based on the lowest Path MTU in the ICMP message. Ideally, utilities should buy equipment that has Path Maximum Transmission Unit Discovery (PMTUD) capabilities. PMTUD is even more critical for grid equipment supporting IPv6 communications due to the lack of in-network fragmentation.

    For legacy grid devices that do not support PMTUD, the customer can manually set the MTU on the device interface to the smallest MTU value on the communication path, after taking into consideration all protocol header overhead in the case of the WAN VPN. For protocols like DNP3 this is not a concern, since the DNP3 data link MTU is 292 Bytes.

    Resilient Mesh MTU

    The DA Gateways and FAR devices support PMTUD over the mesh radio and can signal to end devices when the application needs to reduce the packet size to avoid network fragmentation.

    Originally, the IEEE 802.15.4 PHY Service Data Unit (PSDU) maximum size was 127 Bytes, and the Cisco Mesh used the 6LoWPAN adaptation layer to perform fragmentation/reassembly functions between the Layer 3 IPv6 and the Mesh Layer 2.

    The IEEE 802.15.4-2015 standard increased the original PSDU size from 127 Bytes to 2047 Bytes, so the 6LoWPAN adaptation layer no longer performs fragmentation and reassembly; it only performs header compression for the IR510 within the mesh.

    Figure 66 shows that the grid device MTU should not be set higher than 1472 Bytes to avoid packet drops if end devices do not support PMTUD. The CGR maximum Mesh MTU is 1500 Bytes, of which 28 Bytes are used by IPv6 headers. Over the WAN, assuming it supports the full Ethernet 1500-Byte MTU, the IPSec and GRE headers add up to about 112 Bytes when using IPv4 as transport and 164 Bytes for IPv6 transport. To accommodate future growth as encryption algorithms evolve, it is recommended to set both the IPv4 and IPv6 MTU to 1300 Bytes.
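
    The arithmetic behind these numbers can be captured in a few lines; the overhead values below are the approximate figures quoted above.

        # Approximate overheads quoted in this section (Bytes).
        WAN_MTU          = 1500   # full Ethernet MTU assumed on the WAN
        FLEXVPN_OVERHEAD = {"ipv4_transport": 112, "ipv6_transport": 164}  # IPSec + GRE
        MESH_L3_OVERHEAD = 28     # IPv6 headers added over the mesh

        for transport, overhead in FLEXVPN_OVERHEAD.items():
            print(transport, "max tunnel payload:", WAN_MTU - overhead)
        # -> 1388 and 1336 Bytes respectively; rounding down to 1300 leaves headroom
        #    for future, larger encryption headers.
        print("grid device MTU without PMTUD:", WAN_MTU - MESH_L3_OVERHEAD)   # 1472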

    Recommendation:

    The smallest link MTU will be over the WAN FlexVPN tunnel; therefore, in order to avoid fragmentation or packet drops, customers should also set the Grid Field Device MTU for both IPv4 and IPv6 to 1300 Bytes when devices do not support PMTUD. If devices support PMTUD, it is best to leave the Grid field devices at their default MTU and rely on the PMTUD feature, which works with UDP and TCP applications.

    Note: DNP3 and Modbus use small PDU sizes and are not prone to fragmentation. Other device management protocols could benefit from PMTUD or manual configuration of the MTU on the Grid field device.

    Note: Smaller packets take less airtime to transmit and are less exposed to packet errors due to interference or collisions.

    Figure 66 End Device Maximum MTU for Ethernet Connectivity

     

    256070.jpg

    Grid field devices connected via a Serial interface will use the Raw Socket feature to packetize the DNP3 serial data over the IP network. The Raw Socket session can be configured to use UDP or TCP for transport, but UDP is the recommended configuration since the Mesh has a built-in reliability function at Layer 2, as does the DNP3 protocol.

    A Raw Socket session from the DA Gateways can be terminated in the Control Center in multiple scenarios. For IP SCADA systems, the Raw Socket can be terminated directly on the DMS Front End Processor (FEP) system, as depicted in Figure 67. When configuring Raw Socket over UDP, customers should set the Maximum Packet Length for the session to a value that matches the protocol used; for example, for DNP3, set the Packet Length to around 292 Bytes.

    Note: To avoid DMS packet reassembly, enable the message interleaving delay on the Grid field device's serial port.

    Figure 67 Raw Socket Packet Length for Serial Connectivity

     

    256071.jpg

    Note: The default Packet Timer is 10 milliseconds.
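
    The interaction between the Packet Length and the Packet Timer can be sketched as follows. The 292-Byte length and the 10 ms timer come from the guidance above; the buffering logic itself is a simplified illustration of how a Raw Socket session packetizes serial data, not the actual IR510 implementation.

        import time

        PACKET_LENGTH = 292      # DNP3 data link MTU, per the guidance above
        PACKET_TIMER = 0.010     # default Packet Timer: 10 milliseconds

        def packetize(read_serial_bytes, send_udp_datagram):
            """Flush the serial buffer when it reaches PACKET_LENGTH bytes or when
            PACKET_TIMER elapses with pending data (simplified, runs forever)."""
            buffer = bytearray()
            deadline = None
            while True:
                chunk = read_serial_bytes()              # caller-supplied; returns b"" when idle
                if chunk:
                    buffer.extend(chunk)
                    deadline = deadline or time.monotonic() + PACKET_TIMER
                timed_out = deadline is not None and time.monotonic() >= deadline
                while len(buffer) >= PACKET_LENGTH:      # full packets go out immediately
                    send_udp_datagram(bytes(buffer[:PACKET_LENGTH]))
                    del buffer[:PACKET_LENGTH]
                if buffer and timed_out:                 # partial packet flushed on timer expiry
                    send_udp_datagram(bytes(buffer))
                    buffer.clear()
                if not buffer:
                    deadline = None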

    For traffic towards the Grid field devices (downwards), the MTU packet size depends on the Mesh routing mode configuration. For Storing Mode, the same principles as for upwards traffic apply.

    When the Mesh is configured for Non-Storing Mode, the FAR devices will insert a Source Routing Header whose size depends on the depth of the mesh, that is, the hop count to the destination, as shown in Figure 68.

    Figure 68 Non-Storing Mode MTU Considerations

     

    256072.jpg

    WAN MTU

    The substation WAN device's MTU depends on the WAN device connectivity interface type (Ethernet or Cellular) and the type of WAN transport service used (Private WAN, Managed MPLS service or Public Cellular Service).

    Regardless of the type of SCADA architecture used, the WAN device physical interface MTU should reflect the WAN transport service MTU.

    On the FlexVPN spokes, either the FAR or the substation WAN devices, the Tunnel interface MTU should be set to the lowest MTU of the physical interface or transport service minus the FlexVPN protocol overhead. The same configuration should be applied on the HER Virtual Tunnel interface. As an example, Figure 69 shows the Tunnel MTU size when the WAN supports the full Ethernet MTU of 1500 Bytes.

    Figure 69 WAN MTU Considerations

     

    256073.jpg

    Headend Infrastructure MTU

    In the Control Center, Ethernet is the de-facto standard for connectivity and most of the systems run full operating systems that support PMTUD functionality; therefore, customers can use the default Ethernet MTU of 1500 Bytes.

    Network Routing

    The Cisco FAN Distribution Automation solution uses multiple routing protocols to facilitate end-to-end communication between the Grid field devices, Substation Systems and the Control Center Grid Management Systems.

    Traffic from Grid devices directly attached to the DA Gateways radios is routed using IPv6 based on the shortest and most reliable path towards the FAR device, which is the gateway out of the radio Mesh network. The Resilient Mesh network also supports peer-to-peer communication within the mesh for the DA FLISR solution that might require recloser coordination.

    The Mesh routing protocol is dynamic and specifically designed to deal with RF changing conditions of the links and route traffic around links with poor performance. Cisco implemented innovative functionality (such as peer-to-peer communication, adaptive modulation, and adaptive data rate) and was the first in the industry to give utility customers the functionality required for Distribution Automation systems. These features are detailed in RF Design Development Process.

    In the WAN, customers will need to run static or dynamic routing protocols at the transport layer to allow the FAR routers to build a VPN tunnel back to the HER in the Control Center. The routing protocol selection depends on the WAN transport service, Layer 2 or Layer 3, but the Cisco FAN DA can work over any scenario. The VPN can be established over an IPv4 as well as an IPv6 transport service.

    Once the VPN overlay is established over the transport service, customers can leverage the FlexVPN IKEv2 Dynamic Routing feature to advertise the Mesh DA Gateway IPv6 prefixes and the IPv4 prefixes from the Substation network to the VPN aggregation router (HER). IKEv2 Dynamic Routing is light in traffic usage and can scale to a large number of FlexVPN spokes. Customers can also use dynamic routing protocols for implementations where bandwidth utilization is not an issue (for example, a private fiber WAN). For large-scale deployments, it is recommended that customers use the BGP routing protocol instead of EIGRP.

    Recommendation:

    OSPF is not recommended as an overlay routing protocol because of its scalability limitations. If one of the FAR backhaul connections flaps, it will create a large amount of control plane exchange since the entire OSPF database needs to be updated on all the other FAR devices.

    Traffic entering or leaving the radio mesh between IPv4 devices will always be routed through the MAP-T Border Router. For a centralized SCADA architecture, the HER is configured to be the MAP-T BR. For distributed SCADA deployments, the substation WAN router acts as the MAP-T BR.

    The DA Gateway and substation prefixes are only advertised within the VPN and are hidden and protected from the transport infrastructure, which improves the solution's security when the transport is untrusted. Customers can also enable multicast routing within the VPN layer even if the transport does not support it.

    For high availability deployments, customers can use different types of VPN designs as highlighted in 5.8.2 WAN Infrastructure section.

    Figure 70 shows the end-to-end routing architecture. Note that the second FlexVPN is optional and can be implemented if customers would like to have separate VPN failure domains between the Primary and Back-up Control Center.

    Within the Control Center, HERs participate in the site local IGP routing domain based on the existing implementation. The dynamic routing protocol typically is a customer choice based on their engineering team skills.

    Figure 70 FAN DA End-to-End Routing Architecture

     

    256075.jpg

    Mesh Routing (RPL)

    Routing in the Resilient Mesh (6LoWPAN) is done per PAN, where each IPv6 subnet leverages the distance vector routing protocol RPL: the IPv6 Routing Protocol for Low-Power and Lossy Networks, RFC 6550.

    The Cisco Resilient Mesh routing is autonomous and does not require user input. It is optimized to operate and support the DA applications that require more bandwidth, low latency, reliable transmission, and peer-to-peer communication. Administrators have only a few RPL parameters that can be configured on the FAR devices.

    Note: For more information on Resilient Mesh RPL configuration, use the Configuration Guide documentation.

    blank.gif https://www.cisco.com/c/en/us/td/docs/routers/connectedgrid/cgr1000/ios/modules/wpan_cgmesh/b_wpan_cgmesh_IOS_cfg.html

    DA Gateways (IR510/IR530) act as RPL nodes while the FAR device (CGR) acts as the RPL Directed Acyclic Graph (DAG) Root. The RPL protocol was designed to run on constrained devices with limited resources (memory, communication, and computation), in dynamic environments like radio networks, and at large scale: thousands of devices in a routing domain. RPL builds a forwarding tree topology similar in concept to Spanning Tree, where alternate links and paths are "pruned" from the forwarding plane and not used unless they become a better path towards the FAR than the current route or the active path fails. The topology is called a Destination Oriented Directed Acyclic Graph (DODAG); it is a directed graph, single-rooted at the CGR (the destination), with no cycles or loops. Each topology node has an associated Rank number that shows its position within the graph with respect to the root, the CGR. The Rank value is determined based on the ETX Path to the CGR. The DODAG is identified by the following information:

    blank.gifRPL Instance ID (potentially multiple DODAG, but one Objective Function)

    blank.gifDODAGID (set by the DAG Root: CGR)

    blank.gifDODAG Version Number (DODAG iteration number)

    The Objective Function (OF) is used by RPL to specify how the routing metric and constraints should be used to reach specific objectives. The metric could include link properties (such as bandwidth, latency, and reliability) and node properties (such as battery backup or not). For example, the OF may specify that the objective is to find the constrained shortest path where the constraint is related to the node power mode and the metric is the expected transmission count (ETX). Currently, the node OF is configured to find the most reliable path with the shortest distance to the CGR.

    Figure 71 Mesh Routing Protocol (RPL)

     

    256076.jpg

    RPL control messages are carried via ICMPv6 message. The following messages are available:

    blank.gifDAG Information Solicitation (DIS)

    blank.gifDAG Information Object (DIO)

    blank.gifDestination Advertisement Object (DAO)

    RPL DIO messages are sent for DODAG discovery and maintenance. DIOs are link-local multicast packets (all-RPL-nodes multicast address FF02::1A) sent according to trickle timers, and they contain the Routing Mode, the Rank information relative to the DODAG root, and the ETX Path. The CGR Rank and ETX Path are set to 0. Trickle timers are an adaptive mechanism for controlling the update frequency of control packets. DIO messages are sent more frequently when a DAG consistency issue is detected, to improve convergence time. As the DAG stabilizes, messages are sent less frequently.

    Network administrators can control the regular frequency (in minutes) at which the CGR (root) solicits DAO destination advertisement messages from the downstream nodes. The shorter the interval, the more up-to-date and accurate the CGR DODAG topology is. However, this comes at the expense of the network bandwidth available to the grid applications, since the RPL control messages utilize more bandwidth. Therefore, a balanced approach should be taken based on the PAN size (number of nodes) and the DA application bandwidth requirements.

    Since DIO messages are link-local, each node listens for DIO messages from neighbor nodes. Once a DIO message is received, the node will send ICMPv6 Neighbor Solicitation (NS) unicast messages to evaluate the link quality. Then it will select a preferred parent and alternate or backup parents based on the information within DIO messages and the link quality metric between the node and its candidate parents. Note that to avoid routing loops, a node must select only parents with a lower rank.

    Once a node joins the DODAG graph, it sends its own DIO messages downstream to other nodes, and so forth, until all nodes are part of the RPL tree. This results in building the upwards route towards the CGR.

    Figure 72 RPL Upward Routing

     

    256077.jpg

    RPL DIS messages are sent by nodes to solicit a DODAG Information Object from another RPL node when the DIO information has expired.

    RPL DAO messages are sent by nodes towards the root to inform the CGR of the available destination prefixes and to build the downwards routing from CGR to any DA Gateway in the mesh PAN.

    The Cisco solution supports two downwards routing models: Non-Storing Mode and Storing Mode. Nodes learn about the RPL routing mode through the DODAG root DIO message's Mode of Operations (MOP) field.

    Non-Storing Mode is used with constrained devices like AMI meters that do not have enough memory to store routing information. Nodes send DAO messages that include their parent list directly to the CGR by using double-header encapsulation; the outer header changes as the packet travels upwards while the original header remains intact. The CGR receives information from all nodes in the PAN and performs a recursive lookup to determine the hop-by-hop path to each destination, which is inserted in the Source Routing Header of a packet. Therefore, in Non-Storing Mode, nodes only have a default route towards the CGR, and the CGR uses source-based routing (listing all nodes in the path to the destination) so that intermediate nodes know the next-hop neighbor to which they need to forward the packet.
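
    The recursive lookup can be illustrated with a minimal sketch: given the parent that each node reported in its DAO messages, the root follows the chain of parents from the destination and reverses it to obtain the hop-by-hop source route. The node names below are hypothetical.

        # Parent relationships learned from DAO messages (child -> parent), hypothetical PAN.
        dao_parent = {
            "node1": "cgr", "node2": "cgr",
            "node4": "node1", "node5": "node1",
            "node7": "node4", "node8": "node4",
        }

        def source_route(root: str, destination: str) -> list[str]:
            """Recursive lookup: follow parents from the destination up to the root,
            then reverse the chain to get the hop-by-hop Source Routing Header."""
            path = [destination]
            while path[-1] != root:
                path.append(dao_parent[path[-1]])
            return list(reversed(path))

        print(source_route("cgr", "node8"))   # ['cgr', 'node1', 'node4', 'node8']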

    Figure 73 RPL Non-Storing Mode Downwards Routing

     

    256078.jpg

    Storing Mode is more appropriate for DA deployments since each parent stores all downstream routes towards their children and children's children.

    Note: Non-Storing Mode is the default downward routing mode; therefore, for DA network deployments, the FAR routers must be configured to use Storing Mode. Further, to enable peer-to-peer communication between the nodes, administrators must enable this functionality since it is disabled by default.

    Figure 74 RPL Storing Mode Downwards Routing

     

    256079.jpg

    Certain DA FLISR solutions require peer-to-peer communication between reclosers, so that the end-to-end delay between the grid devices is minimized by reducing the hop count. Traffic does not have to hairpin at the CGR unless the two devices are part of two different branches for which the CGR is the common parent.

    Note: Since each DA Gateway (IR510/IR530) maintains a routing table, the DA Gateways are limited to 300 routes.

    Note: Peer-to-peer traffic will go directly between a source and destination DA Gateway only when they have a parent-and-child relationship, for example, Node7 and Node8. In all other cases, the traffic will flow via the source and destination DA Gateways' common parent, for example, Node4 and Node5 via Node1, even if the two nodes can hear each other at Layer 1.

    Figure 75 RPL Storing Mode Peer-to-Peer Communication

     

    256080.jpg

    The RPL protocol uses control and data messages to categorize each neighbor link's quality based on the link's RF modulation rate, data rate, and packet loss rate over time. This information is reflected in the RPL ETX Link metric, which ties the RF physical layer to the routing layer. Each node calculates the path cost back to the FAR over each neighbor link and selects the shortest path, which indirectly reflects the most reliable path. The value is stored in the ETX Path protocol metric, which is advertised to all the downstream neighbors; they take the ETX Path value and add their local ETX Link value towards that neighbor to determine their own cost to the FAR.

    Figure 76 DA Gateway Path Cost Calculation

     

    256081.jpg

    Figure 76 shows an example of how each node (in this case, Node 13, Node 14, and Node 15) chooses its best path towards the CGR when multiple paths are available, based on the ETX Path metric. Note that the example uses ETX Link metrics expressed in multiples of ten for ease of calculation; in reality, the values are granular, at unit level, and will rarely be the same over two paths.

    For scenarios where two paths have the same ETX Path metric, the node will use the parent's Received Signal Strength Indicator (RSSI) as a tiebreaker to select the preferred path.
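
    Parent selection can thus be summarized as: among the lower-rank candidates, pick the one with the lowest total ETX Path (the candidate's advertised ETX Path plus the local ETX Link towards it), and use RSSI to break ties. The sketch below is a simplified model of that logic with hypothetical neighbors and values, not the Resilient Mesh implementation.

        from dataclasses import dataclass

        @dataclass
        class Candidate:
            name: str
            rank: int        # candidate's Rank in the DODAG
            etx_path: int    # candidate's advertised ETX Path to the CGR
            etx_link: int    # local link ETX towards the candidate
            rssi_dbm: int    # received signal strength from the candidate

        def select_parent(my_rank: int, candidates: list[Candidate]) -> Candidate:
            # Only parents with a lower Rank are eligible (loop avoidance);
            # prefer the lowest total ETX Path, and break ties with stronger RSSI.
            eligible = [c for c in candidates if c.rank < my_rank]
            return min(eligible, key=lambda c: (c.etx_path + c.etx_link, -c.rssi_dbm))

        neighbors = [
            Candidate("nbrA", rank=2, etx_path=100, etx_link=50, rssi_dbm=-70),
            Candidate("nbrB", rank=2, etx_path=100, etx_link=50, rssi_dbm=-62),
            Candidate("nbrC", rank=4, etx_path=90,  etx_link=30, rssi_dbm=-60),  # higher Rank: skipped
        ]
        print(select_parent(my_rank=3, candidates=neighbors).name)   # nbrB (RSSI tiebreak)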

    The next example, Figure 77, shows how a subset of the RPL topology changes when an ETX Link metric increases from 50 to 350 due to packet loss. Node 14 and Node 15 select Node 13 as their parent since the ETX Path to the CGR is shorter over that parent.

    Figure 77 Path Metric Change due to Link Metric Increased

     

    256082.jpg

    The higher the packet loss rate, the faster a node will switch to an alternate backup parent; in general, the parent selection algorithm aims to prevent packet loss and triggers a switch when the loss exceeds 10% over the current parent's link. The packet loss rate is averaged over time so that the mesh topology remains stable and constant flaps caused by short, temporary changes in the link RF conditions are prevented.

    A node keeps track of up to four parents and their metrics: one entry for the primary parent, two entries for backup parents, and a fourth entry as a candidate backup in case a backup parent fails, or its metric is not optimal anymore.

    Figure 78 DA Gateways Parent Table

     

    256083.jpg

    If the primary parent fails, the ETX Link will increase, making the path less preferred, and the node will promote a backup parent to primary parent. Since the node has already computed the ETX Path via the backup node, the next-hop switch will be fast. All the nodes in the Parent Table must have a lower Rank than the current node. The Parent Table is constantly updated as the node receives DIO messages from nearby neighbors.

    Recommendation:

    It is important that customers perform an RF survey and include enough signal degradation margin in the RF link budget to improve network RPL stability.

    In addition, it is important to plan the network in such a way that a node has at least two parents to choose from, which will increase the network availability.

    The FAR router's WPAN Global IPv6 prefix for the mesh is inserted in the router RIB table as a connected prefix with an Administrative Distance (AD) of 0 and a metric of 0. The DA Gateways' MAP-T IPv6 prefixes appear in the CGR routing table as host routes (/128) representing each DA Gateway's MAP-T BMR IPv6 address. If the Ethernet interface is configured with an IPv6 prefix, that prefix will also exist in the CGR RIB table. DA Gateway prefixes are inserted by RPL in the CGR routing table as connected routes with an AD of 0, with a metric of 19 on the primary HA CGR router and a metric of 29 on the backup HA CGR router.

    Table 44 Mesh Prefixes Administrative Distance and Metric Values

    Route Type | CGR Routing Table | Administrative Distance (AD) | Primary CGR Metric | Back-up CGR Metric (only HA setup) | Comments
    Mesh Global IPv6 Prefix | Prefix Length (variable) | Connected (0) | 19 | 29 | Depends on the size of the PAN
    IR510 MAP-T Address | /128 | Connected (0) | 19 | 29 | --
    IR510 Ethernet IPv6 Prefix | Prefix Length (/64) | Connected (0) | 19 | 29 | --

    Administrators will need to redistribute the routes (connected and WPAN interface) into the WAN layer by using the FlexVPN IKEv2 Dynamic Routing Updates feature or into a dynamic routing protocol (such as BGP or EIGRP) that runs within the FlexVPN tunnel. For High Availability designs with two tunnels (primary and backup), administrators will also need to ensure that the backup tunnel metric has a higher cost than the primary tunnel.

    WAN Routing

    In the WAN layer with a FlexVPN design, customers will configure routing at the transport layer (outer tunnel) as well as at the overlay layer (inner tunnel). At the transport layer for a Private WAN, the routing protocol is a customer choice in the case of dark fiber, or it depends on the Service Provider peering requirements: either static routing or dynamic routing, typically BGP.

    For the VPN overlay layer, customers can choose between the IKEv2 Dynamic Routing Update (DRU) feature and a dynamic routing protocol, based on the number of DA applications, the DA deployment size, the WAN size, the WAN transport services, and the specific convergence requirements for HA deployments. IKEv2 DRU, when compared with traditional dynamic routing protocols, is lighter in control plane usage and is therefore better suited for Cellular WAN connectivity where the monthly cellular data plan has a limit.

    Table 45 WAN Scalability Considerations

    Spoke Deployment Size | Overlay Routing Protocol | FAR Backhaul | VRF Support | HER Type | FlexVPN Max/Recommended Number of FAR per HER | Increase Scaling Options | HER Maximum IPv4 and IPv6
    Small | EIGRP | Private WAN | Yes | ASR1001-X with 8GB | 1,000/600 | Multiple Hubs | 1 Million
    Small | IKEv2 Dynamic Routing Update | Cellular or Private WAN | No | ASR1001-X with 8GB | 1,000/1,000 | Multiple Hubs | 1 Million
    Medium | EIGRP | Private WAN | Yes | ASR1002-X with 16GB | 4,000/1,000 | Multiple Hubs | 4 Million
    Large | BGP | Private WAN | Yes | ASR1006-X RSP3 with 32GB | 10,000/6,000 | Multiple Hubs | 7.5 Million

    Note: The performance numbers should be used as a reference. Since the maximum limit depends heavily on how many services run on the HER and on the throughput rate, customers should do their initial planning based on the recommended scale values and should work with the local Cisco account team for more up-to-date validated scalability numbers.

    Note: IKEv2 DRU will in the future support Virtual Routing and Forwarding (VRF) as well as a larger spokes-per-hub ratio, since FlexVPN today supports up to 10,000 spoke tunnels on certain Cisco router platforms.

    Design Guidance:

    For Mesh deployments that support DA Gateway migration between different PAN IDs (Inter-PAN), the WAN must support dynamic route advertisement of the DA Gateway MAP-T and Ethernet IPv6 prefix over the WAN so that traffic from the Control Center can be routed to the proper FAR device.

    The Resilient Mesh network prefixes and the Control Center prefixes should only be advertised into the Overlay routing protocol that runs over the tunnel interfaces. This will hide the DA network addresses from the transport layer or Service Provider routing domain.

    When running one or more dynamic routing protocols between the Transport and Overlay layers, make sure that the transport network prefixes are not learned via the overlay routing protocols and are filtered, to avoid tunnel failure due to a recursive lookup scenario.

    To scale and support a large number of FAR devices, the network administrator must implement routing optimization best practices for each routing protocol. For example, route summarization should be configured on the FlexVPN Hub routers (HERs) so that they only advertise a summary route of all the Control Center prefixes. Advertising a default route can also be an option, but it would attract any unknown traffic from the FAR devices towards the Control Center, which can consume unnecessary WAN bandwidth or become a security concern if a remote location is compromised.

    For all the remote traffic to go to the Primary Control Center, the summary route or routes from the Back-up Control Center should be advertised with a higher cost.

    The FAR devices should be configured to summarize all the local substation prefixes and the mesh prefixes to reduce the FlexVPN Hub routing table, since the Hub aggregates a large number of FAR devices. This also prevents route updates in the substation or mesh from being propagated to the Control Center, thereby minimizing routing control plane utilization. The only time DA Gateway host routes will be advertised as specific routes over the WAN is when a DA Gateway performs an Inter-PAN migration, assuming there is Layer 1 connectivity for the device to attach itself to another substation Mesh network. In that case, the MAP-T and Ethernet IPv6 prefixes have to be advertised as specific routes; the default behavior allows a grid device attached to the DA Gateway to maintain its configured IP address while traffic for it is routed to the new FAR device and mesh network.

    Figure 79 WAN Routing Summarization Boundaries

     

    256084.jpg

    Ideally, the two Control Centers should be interconnected by a dedicated link so that if the Primary Control Center's primary WAN link fails, the FAN traffic can be sent to the Back-up Control Center and through the dedicated link back to the Primary Control Center. This avoids the Grid Systems and FAN Management systems having to fail over to the Back-up Control Center.

    Control Center Routing

    Customers can choose to use the same routing protocol in the Control Center over the overlay VPN network as long as it scales and meets the DA deployment requirements.

    However, in general, it is best to create different routing failure domains so that if a major network outage occurs due to human error, the failure will not affect other parts of the network. In that case, each site will run an Interior Gateway Protocol (IGP) within the site. Each site will also be configured with its own Border Gateway Protocol (BGP) AS number, and external BGP (eBGP) peering will be configured between the sites over the FlexVPN networks, since BGP has better routing policy control capabilities such as route filtering, peer and route security, and route dampening.

    The Control Center WAN router will run both the IGP and BGP routing protocols and will advertise the local site IGP prefixes into BGP so that remote sites know how to route to these prefixes. The WAN routers then make the remote prefixes learned via BGP reachable to the IGP without the need for redistribution.

    Network Services

    Quality of Service

    Following the IETF Differentiated Service model (RFC 2474), the FAN solution will deliver a particular kind of service based on the QoS specified for each packet. This specification can be made in different ways, one of them being the IP Precedence bit settings in IP packets or source and destination addresses. The QoS specification can be used to classify, mark, shape, and police traffic, and to perform intelligent queuing.

    QoS refers to the ability of the network to provide priority service to selected network traffic, thus increasing predictability by:

    blank.gifSupporting dedicated bandwidth

    blank.gifReducing loss characteristics

    blank.gifAvoiding and managing network congestion especially with multi-services traffic

    blank.gifShaping network traffic

    blank.gifSetting traffic priorities across the network - multi-services capabilities

    QoS is a key feature when designing multi-services Field Area Networks, as the need exists to differentiate and prioritize between traffic from different DA systems (such as FLISR and Volt/VAR) and network management use cases, for example communication equipment upgrades or new Edge Compute applications. Estimated transport losses, delay, and jitter introduced by networking devices must be considered when forwarding sensitive data, particularly when a WAN backhaul link offers constrained bandwidth. In the case of dual-WAN interfaces with different bandwidth capabilities (such as Ethernet and cellular for backup), QoS policies must be applied to prioritize the traffic allowed to flow over these limited bandwidth links in order to determine which traffic can be dropped.

    QoS also needs to be considered over the substation WAN and Control Center infrastructure layers, especially during network congestion. Typically, the WAN and CC communication infrastructure does not have the mesh hardware constraint for low power consumption and supports more advanced QoS settings. Therefore, mapping of the QoS policies between the different FAN architecture layers is critical.

    Figure 80 End-to-End QoS Architecture

     

    256085.jpg

    FAN Infrastructure Layer

    The FAR performs QoS actions on Layer 3 interfaces, as documented in the CGR 1000 QoS configuration manual. The sequencing of QoS actions on egress traffic is as follows:

    1.blank.gif Classification

    2.blank.gif Marking

    3.blank.gif Strict Priority Queuing

    The general CG-Mesh QoS guidelines are:

    blank.gifAll traffic should have a Differentiated Services Code Point (DSCP) QoS value set

    blank.gifTraffic without a DSCP will be put in the Low queue

    blank.gifThe default queue for most traffic should be the Normal queue

    blank.gifTraffic placed in Medium or High queue should be justified.

    blank.gifIt must be recognized that traffic in a higher queue can cause traffic in lower queues to drop.

    The Resilient Mesh devices use Priority Queuing with four types of egress queues (Q1-Q4) using Strict Priority scheduling. The Q4 queue, which is reserved and cannot be configured, is used for the mesh control plane traffic such as Layer 3 RPL protocol packets, and neighbor discovery.

    Figure 81 FAN Device Queues

     

    256086.jpg

    Administrators can classify traffic based on traffic criticality: low, normal, and medium. The traffic is automatically mapped as follows: low priority traffic to the Q1 queue, normal priority traffic to Q2, and medium priority traffic to Q3.

    Note: Q4 is reserved and administrators cannot configure it.

    Table 46 Mesh QoS Classification Classes and DSCP Marking Values

    QoS Class | Mesh Classification | Drop Priority | DSCP (IPv4/IPv6) | IPP (IPv4) | Queue Mapping | Comments
    Best Effort | Low | - | 0 | 0 | Q1 | Mesh 6.0
    AF11 | Normal | Low Drop | 10 | 1 | Q2 | Mesh 6.0
    AF12 | Normal | Medium Drop | 12 | 1 | Q2 | Mesh 6.1
    AF13 | Normal | High Drop | 14 | 1 | Q2 | Mesh 6.1
    AF21 | Medium | Low Drop | 18 | 2 | Q3 | Mesh 6.0
    AF22 | Medium | Medium Drop | 20 | 2 | Q3 | Mesh 6.1
    AF23 | Medium | High Drop | 22 | 2 | Q3 | Mesh 6.1
    AF31 | High | Low Drop | 26 | 3 | Q4 | Reserved
    AF32 | High | Medium Drop | 28 | 3 | Q4 | Reserved
    AF33 | High | High Drop | 30 | 3 | Q4 | Reserved

    Note: Software version 6.0 only supports three types of markings: Low - DSCP 0, Normal - AF11/DSCP 10, and Medium - AF21/DSCP 18. The next software release, Mesh 6.1, will support all 10 DSCP markings listed in Table 46.

    blank.gif https://www.cisco.com/c/en/us/td/docs/routers/connectedgrid/cgr1000/ios/modules/wpan_cgmesh/b_wpan_cgmesh_IOS_cfg/wpan_cgmesg_IOS_cfg.html

    Note: For the remaining sections, the diagrams will use the convention shown in Figure 82 to capture the Layer 3 packet QoS marking. IEEE 802.15.4g/e does not specify a Layer 2 frame QoS field for unslotted CSMA.

    Figure 82 Marking Legend

     

    256358.jpg

    The queue scheduler will always empty the higher priority queues ahead of the lower priority queues. For example, before a device can empty the Q2 packets, the Q4 and Q3 queues must be empty. The scheduler uses a round-robin sequence that always starts at the highest priority queue.
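
    A minimal model of this strict priority behavior: the scheduler always drains the highest-priority non-empty queue before servicing a lower one. Queue names follow Figure 81; the packet labels are illustrative.

        from collections import deque

        # Q4 (mesh control plane) has the highest priority, Q1 (low) the lowest.
        queues = {"Q4": deque(), "Q3": deque(), "Q2": deque(), "Q1": deque()}

        def enqueue(queue_name: str, packet: str) -> None:
            queues[queue_name].append(packet)

        def dequeue() -> str | None:
            # Strict priority: restart from the highest queue on every scheduling round.
            for name in ("Q4", "Q3", "Q2", "Q1"):
                if queues[name]:
                    return queues[name].popleft()
            return None

        enqueue("Q2", "volt-var poll")
        enqueue("Q3", "flisr trip")
        enqueue("Q1", "firmware block")
        print(dequeue(), "|", dequeue(), "|", dequeue())
        # -> flisr trip | volt-var poll | firmware block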

    The three types of traffic marking can be applied at the following locations:

    blank.gifInbound Ethernet physical port (Eth0). All incoming packets will be marked with the same traffic class.

    blank.gifInbound virtual IOx switch, per source IPv4 address, under a device DSCP profile, for Edge Compute applications

    blank.gifPer serial interface, under Serial interface profile, for Raw Socket scenarios

    Figure 83 Ethernet Port Marking

     

    256087.jpg

    Figure 84 Serial Port Marking

     

    256088.jpg

    Figure 85 Edge Compute Application Markings

     

    256089.jpg

    Packets that are already marked by end devices or Edge Compute applications will be overwritten with a DSCP value of 0 or remarked with the assigned QoS class (Low, Normal, or Medium).

    Figure 86 is an example of how QoS works between two devices. Higher priority packets (Medium classification) will be sent before lower priority packets (Normal or Low). The same logic applies to upstream flows as well.

    Figure 86 QoS Queuing and Scheduling

     

    256090.jpg

    When multiple DA Gateways want to send traffic to their parent, there are scenarios where collisions can occur, or where Mesh nodes are far enough apart that the Clear Channel Assessment (CCA) cannot operate and packet retransmission will take place. The Resilient Mesh software implements an advanced QoS mechanism that dynamically adjusts the back-off timer each node uses for collisions or retransmissions based on the packet's QoS marking. Higher priority packets will have a shorter back-off interval; therefore, they will be sent before lower priority packets.

    Figure 87 Advanced MAC Layer QoS

     

    256091.jpg

    In Figure 87, each node sends a packet with a different QoS marking towards the common parent. When a collision occurs, each node will use a back-off timer value that is directly related to the packet's QoS marking. Packets with a higher priority classification (Medium) will experience a shorter back-off time than Normal or Low packets; therefore, they will arrive at the parent node ahead of the other packets. If retransmission takes place, packets with higher priority will have a shorter retransmission delay than lower priority packets, increasing the probability of being delivered ahead of the lower QoS classes.
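
    A simplified model of this priority-aware back-off: each retry draws a random back-off from a window that doubles with the retry count and shrinks for higher priority traffic. The scaling factors and slot time are illustrative assumptions, not the Resilient Mesh MAC parameters.

        import random

        # Smaller factor -> shorter back-off -> transmitted earlier (assumed scaling).
        PRIORITY_FACTOR = {"medium": 1, "normal": 2, "low": 4}

        def backoff_ms(priority: str, retry: int, base_slot_ms: float = 2.0) -> float:
            """Random back-off before retransmission; the window doubles on each
            retry and is scaled by the packet's QoS priority."""
            window = base_slot_ms * (2 ** retry) * PRIORITY_FACTOR[priority]
            return random.uniform(0, window)

        random.seed(1)
        for prio in ("medium", "normal", "low"):
            print(prio, [round(backoff_ms(prio, r), 2) for r in range(3)])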

    Design Guidance:

    Cisco recommends that critical FAN DA applications like FLISR be marked with Medium priority, whereas Volt/VAR and other applications should be marked as Normal.

    Table 47 Cisco Resilient Mesh QoS Design Recommendations

    Grid Applications | QoS Class | DSCP
    Resilient Mesh Control Plane (Reserved) | High | 26
    FLISR | Medium | 18
    Volt/VAR | Normal | 10
    Other (Firmware upgrade, etc) | Low | 0

    WAN Infrastructure Layer

    The Cisco FAR router supports two QoS models: one for the FAN Mesh, which runs on the WPAN module and has been described above, and another for the WAN and Control Center devices that run Cisco's full IOS or IOS-XE operating system. The CGR routers are the boundary between the two domains. There is a mismatch in the number of DSCP values that each domain supports; therefore, network administrators might need to aggregate the extra WAN markings into the Mesh markings when the number of grid applications exceeds the available Mesh QoS markings.

    The CGR IOS modular QoS CLI (MQC) also uses Priority Queuing and supports four queues, Q1 to Q4, where Q1 has the highest priority with a Strict Priority schedule. CGR IOS supports 64 DSCP values and allows network administrators to use class-map, policy-map, and service-policy configurations to apply QoS markings to each grid application type and to assign the packets to the proper interface queue for the proper service level.

    In general, the packets arriving to the FAR router should already be marked and should only be remarked if they are out of policy.

    Figure 88 FAR QoS Queuing Architecture

     

    256093.jpg

    Since the WAN uses a VPN overlay architecture, the marked packets (Inbound QoS) from the Mesh destined to the Control Center must have their markings preserved as they are encapsulated with the new tunnel IP headers (Outbound QoS).

    Cisco IOS supports QoS per Tunnel interface and the Inbound QoS DSCP values will be copied to the Outbound QoS DSCP field, which is important if the customer WAN supports QoS.

    Figure 89 VPN Per Tunnel QoS

     

    256094.jpg

    Notes:

    For additional information regarding the CGR QoS feature, please consult the Cisco CGR 1000 Configuration Guide by using the link below.

    blank.gif https://www.cisco.com/c/en/us/td/docs/routers/connectedgrid/cgr1000/1_0/software/configuration/guide/qos/cgr1000_Book/qos_stats_cgr1000.html

    The Distribution Automation - Secondary Substation Design Guide offers additional details about FlexVPN QoS design considerations:

    blank.gif https://www.cisco.com/c/en/us/td/docs/solutions/Verticals/Distributed-Automation/Secondary-Substation/DG/DA-SS-DG/DA-SS-DG-doc.html

    Headend Infrastructure Layer

    In the Control Center, customers can choose from a variety of network infrastructure platforms, either IoT or Enterprise based on their specific requirements. Each platform has its own unique QoS capabilities, which are well described in the product or solution configuration guides available at Cisco.com.

    The QoS trust boundary lies on the headend switching infrastructure; therefore, packet classification and marking should be performed as close as possible to the Grid Application servers. Once traffic has been classified, the transit network devices trust and honor the QoS service level assigned to each application.

    The Control Center WAN block contains the VPN aggregation devices; the HERs will also need to be configured to support QoS over the VPN tunnel interfaces.

    In general, the remote substations have a bandwidth mismatch between the physical interface speed and the WAN service provisioned for the substation connectivity. This is especially true for Cisco Off-Net Substation WAN designs where customers use a Service Provider offering such as a Managed MPLS Layer 3 service.

    Figure 90 Bandwidth Asymmetry between Central and Remote Substations

     

    256095.jpg

    Design Guidance:

    blank.gifNetwork administrators should configure the HER devices with an egress QoS policy that shapes traffic to the provisioned WAN service speed rather than the router's physical WAN interface speed if the two do not match.

    blank.gifSince all the traffic from the field devices is destined to the Control Centers, the WAN services will be provisioned with a higher service speed than at the remote substations or field locations where the CGR routers are deployed. To prevent traffic from Control Center systems from oversubscribing the substation WAN link, network administrators should configure on the HERs a per-spoke QoS policy based on the remote spoke's WAN service speed. To accommodate burst traffic, the QoS policy should shape temporary bursty traffic and police traffic that is outside of a typical substation QoS traffic profile.

    Note: The Cisco FAN Headend Implementation Guide offers additional details about the Control Center design considerations.

    blank.gif https://salesconnect.cisco.com/open.html?c=db570d3f-3212-4659-a306-5f65aeab862b

    Network Time Services

    Certain services running on the FAN require accurate time synchronization between the network elements. Many of these applications process a time-ordered sequence of events; therefore, the events must be time stamped to a level of precision such that individual events can be distinguished from one another and correctly ordered. A Network Time Protocol (NTP) version 4 server running over IPv4 and IPv6 network layer can act as a Stratum 1 timing source for the network.

    Over the FAN, the NTP might deliver accuracies of 10 to 100 milliseconds, depending on the characteristics of the synchronization source and network paths in the WAN.

    Some of the applications that require time stamping or precise synchronization are:

    blank.gifValidation of X.509 certificates used for device authentication, specifically to ensure that the certificates are not expired

    blank.gifTime stamps for meter readings, asynchronous notifications from meters, log entries, and so on

    blank.gifDying Gasp feature used for device outage notifications

    Design Guidance:

    The Cisco Headend infrastructure, which includes the FND, RA, TPS, NPS, and PKI servers, should have its clocks synchronized to the same source: ideally an NTP server that receives time from the Internet or is connected to a GPS appliance. The FAR routers can be configured to receive time information via NTPv4 from the HER or from the local GPS interface. FAR routers distribute their time information to other mesh nodes via Sync Beacon messages. The IR510 can also receive time information from its local GPS interface.

    Network Security

    The Cisco Resilient Mesh solution for FAN DA has the best industry security practices built in from physical security features to application and firmware integrity. All the solution's security protocols are open standards based.

    Figure 91 FAN Security Overview

     

    256096.jpg

    The most vulnerable area of the solution is the FAN infrastructure layer, since the grid devices and the Cisco DA Gateways are located on overhead or underground distribution assets and no good physical security exists besides an enclosure lock. The new IR510 was built with this real risk in mind; customers can use the dry-contact alarms with an external sensor to detect when the outdoor-rated enclosure housing the communication equipment is opened. This alarm can be correlated with the solution management system to alert operators of unauthorized access. Knowledge of unauthorized access does not, however, solve the immediate problem that somebody could connect to the communication infrastructure and try to reach other locations in the distribution network or the Control Center.

    Network administrators should disable the Ethernet port if grid devices are connected via the Serial port. When grid devices are connected via the Ethernet port, then the security feature called Auto-Disable Ethernet Port should be enabled so that when somebody tries to connect another device to the DA Gateway (for example, an unauthorized PC), the IR510's Ethernet port status will go into disabled state. This will prevent any packets entering the Ethernet port and therefore protect the mesh and the local Edge Compute applications.

    The DA Gateway runs a special operating system that was designed for constrained, low-power devices and does not use traditional IT management tools like Telnet or other readily available tools for device management access. Customers have to use the centralized management system (FND) or the local field tool, IoT Device Manager.

    Design Guidance:

    Device configuration management is centralized; therefore, even if someone were to make local device configuration changes, the changes would be overwritten by the FND configuration. One would have to compromise the FND in order to make system changes.

    Additionally, each DA Gateway motherboard is equipped with a dedicated security chip that provides:

    blank.gifSecure unique device identifier (802.1AR)

    blank.gifImmutable identity and certifiable cryptography

    blank.gifEntropy source with true randomization

    blank.gifMemory protection and image signing and validation

    blank.gifTamper-proof secure storage of configuration and data

    The software images are digitally signed with Cisco certificates to validate the authenticity and integrity of the software.

    Another good security feature is the Edge Compute application signature, which works with a Trusted Anchor located in the Control Center to enforce application identity (authentication) and authorization, that is, whether the application is allowed to run on the platform.

    Customers will have the ability to deploy third party security applications at the edge to inspect traffic coming from the grid devices to enforce edge security.

    The second area of concern with any radio network is that somebody could try to intercept the data or even join the mesh with 900MHz-capable devices. Cisco has implemented IEEE 802.1x as a method to securely authenticate a radio node before allowing it to join the mesh PAN or even send packets into the network. Authentication is based on X.509 certificates and ECC keys that are locally installed in a protected area during ZTD.

    Figure 92 First Rank Device Mesh Joining Process

     

    256097.jpg

    Figure 93 Multi-hop Service Mesh Join Process

     

    256098.jpg

    Figure 94 IEEE 802.1x Device Authentication

     

    256099.jpg

    The mesh security also performs authorization to ensure that a device is authorized to get access to the network and that it was not evicted by the administrator for security reasons (compromised or stolen).

    The DA Gateways perform data link layer encryption, where the data is encrypted hop-by-hop using the AES-128 symmetric encryption algorithm. The only packets that are not encrypted are the beacon broadcasts that announce the PAN information so that nodes can discover the mesh network.

    Design Guidance:

    When customers implement a security layer within each substation and enable node Inter-PAN migration, then network administrators need to be aware that an inline stateful firewall will break existing sessions between the Grid Systems and the field grid devices when a node migrates between adjacent PANs, especially for TCP protocols.

    If it is not acceptable for the grid device session to be torn down and re-established over the new path, network engineers can enable routing between substations so that grid device sessions enter the WAN through their home substation FAR.

    The WAN infrastructure layer also implements a secure VPN solution (FlexVPN), which is based on the open standards IKEv2 and IPSec, so that traffic traversing any type of transport infrastructure is encrypted and secure. FAR routers support additional types of encryption algorithms based on customer preferences.

    FAR routers use RSA certificates for authentication and authorization with the HER and the headend NMSs.

    Since FAR routers are not constrained devices and run full IOS operating systems, they can be used as a security enforcement point between the mesh radio network and the rest of the Distribution network. Administrators can configure the security feature: Zone-Based Firewall, which is a stateful firewall combined with IPS signatures. As with the DA gateways, the FAR devices support a dedicated Edge Compute module that will allow customers to install third party security applications.

    Typically, the FAR routers are installed within the premises of the substations and are protected by the substation yard fence and the video security system; therefore, physical security is not critical. For instances where the FAR is deployed in the field or in an unsecured location, the FAR router supports the same security features as the DA Gateway (such as dry-contact alarms and the Disable Ethernet Port feature). The FAR router is an IP67-rated device, and physical access inside the chassis is not a simple process due to the secure door bolts. This makes it extremely difficult for any rogue entity to open or uninstall the device from the pole-top mounting. The device generates NMS alerts if the router door or chassis is opened.

    As with the DA Gateways, the FAR router motherboard is equipped with a dedicated security chip and provides the same level of security.

    In order to ease local troubleshooting and device maintenance, the FAR routers have a Wi-Fi management radio interface that should be disabled during normal operations. To connect to the FAR, field engineers need to have an open trouble ticket and have the access credentials installed on their laptop to authenticate using the WPA2 standard.

    Note: The Cisco Secondary Substation Design Guide offers additional details about the Control Center design considerations. Refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/td/docs/solutions/Verticals/Distributed-Automation/Secondary-Substation/DG/DA-SS-DG/DA-SS-DG-doc.html

    The Grid Systems and the Resilient Mesh management system in the Control Center are protected by network security devices (Cisco Next-Generation firewalls and IPS), end-host security agents (such as Cisco FireAMP) running on servers and by the security access features running on the switching and routing infrastructure.

    The mesh management system control messages between the FND and the FAR or DA gateways are signed to authenticate the source of the message as well as to detect any replay attacks.

    Note: The Cisco FAN Headend Implementation Guide offers additional details about the Control Center design considerations. Refer to the following URL:

    blank.gif https://salesconnect.cisco.com/open.html?c=db570d3f-3212-4659-a306-5f65aeab862b

    Network Management System

    The Resilient Mesh Management System, FND is a software platform that manages network and security infrastructure for multi-service mesh networks and is the core component of the solution.

    Note: The Cisco FAN FND product overview is available at the following URL:

    blank.gif https://www.cisco.com/c/en/us/products/cloud-systems-management/iot-field-network-director/index.html#~stickynav=1

    The FND system has the following components:

    blank.gifFND Application Server —This is the core of mesh deployments. It runs on an RHEL server and allows administrators to control different aspects of the FAN DA solution deployment using its browser-based GUI. It also manages Edge Compute application deployment and configuration. FND Application Server HA deployments include two or more servers connected to a load balancer.

    blank.gifFND Database—This Oracle database stores all information managed by the Resilient Mesh solution, including all metrics received from the DA devices and all device properties such as firmware images, configuration templates, logs, and event information.

    blank.gifSoftware Security Module (SSM)—This is used for signing CoAP Simple Management Protocol (CSMP) messages sent to field devices.

    blank.gifTPS Proxy—Allows FARs to communicate with headend systems when they first start up in the field. After the FND provisions tunnels between the FARs and ASRs, the FARs communicate with the FND directly.

    The FND is responsible for the full life cycle network management tasks: Fault management, configuration management, accounting management, performance management, and security management (FCAPS). The FND uses the CSMP for remote configuration, monitoring and event generation over the IPv6 network.

    Features and capabilities of the FND include:

    blank.gifConfiguration Management—Facilitates configuration of large numbers of Cisco CGRs. They can be bulk-configured by placing them into configuration groups, editing settings in a configuration template, and then pushing the configuration to all devices in the group.

    blank.gifDevice and Event Monitoring—Displays easy-to-read tabular views of extensive information generated by devices, allowing monitoring the network for errors. Cisco CG-NMS provides integrated Geographic Information System (GIS) map-based visualization of FAN devices such as routers and smart meters. CG-NMS can be used to create CGR-specific work orders that include the required certificates to access the router.

    blank.gifFirmware Management—Serves as a repository for Cisco CGR and DA Gateway firmware images. Cisco FND can be used to upgrade the firmware running on groups of devices by loading the firmware image file onto the Cisco FND server, and then uploading the image to the devices in the group. Once uploaded, FND can be used to install the firmware image directly on the devices.

    blank.gifZero Touch Deployment—This easy-to-use feature automatically registers (enrolls) and distributes X.509 certificates and provisioning information over secure connections within a connected grid network.

    blank.gifODM File Upload and Hash Compatibility—Operational Data Model (ODM) files format commands that execute on Cisco IOS routers. The FND uses the formatted output for such things as periodic metrics collection, router version information, battery information, reading the Hypervisor (virtual machine monitor) version, and GPS information. ODM file hash compatibility and upload are performed while requesting a registration, during periodic inventory updates, or during the tunnel provisioning process.

    blank.gifTunnel Provisioning Between HERs and FARs—Protects data exchanged between Cisco HERs and Cisco CGRs and prevents unauthorized access to Cisco CGRs, to provide secure communication between devices. Cisco FND can execute CLI commands to provision secure tunnels between Cisco CGRs and Cisco HERs. The FND can be used to bulk-configure tunnel provisioning using groups.

    blank.gifIPv6 RPL Tree Polling—A node in the IPv6 Routing Protocol for Low power and Lossy Networks (RPL) tree discovers its neighbors and establishes routes using ICMPv6 message exchanges. RPL manages routes based on the relative position of the DA Gateway to the CGR that is the root of the routing tree. RPL tree polling is available through the mesh nodes and CGR periodic updates. The RPL tree represents the mesh topology, which is useful for troubleshooting. For example, the hop count information received from the RPL tree can determine the use of unicast or multicast for the firmware download process. The FND maintains a periodically updated snapshot of the RPL tree.

    blank.gifEdge Compute Support—For Cisco IOS CGR1000 devices that support the Cisco Compute Module (CGM), the FND allows approved users to manage applications running on the supported operating systems. The FND manages the application deployment and displays application status and the Hypervisor version running on the device.

    blank.gifDevice Location Tracking—For CGR 1000 devices, the FND displays real-time location and device location history.

    blank.gifSoftware Security Module (SSM)—This is a low-cost alternative to the Hardware Security Module (HSM) and is used for signing CSMP messages sent to meters.

    blank.gifDiagnostics and Troubleshooting—The CG-NMS rule engine infrastructure provides effective monitoring and triage-based troubleshooting. Device troubleshooting runs on-demand device path trace and ping on any CGR, range extender, or meter (mesh endpoints).

    blank.gifHigh Availability—To ensure uninterrupted network management and monitoring, the Cisco FND solution can be deployed in a High Availability configuration. By using clusters of load-balanced FND servers and primary and standby FND databases, the Cisco FND constantly monitors the health of the system, including connectivity within clusters and server resource usage. If a server cluster member or database becomes unavailable or a tunnel fails, another takes its place seamlessly.

    blank.gifPower Outage Notifications—DA Gateways implement a power outage notification service to support timely and efficient reporting of power outages. In the event of a power outage, nodes perform the necessary functions to conserve energy and notify neighboring nodes of the outage. The FARs relay the power outage notification to the FND, which then issues push notifications to customers to relay information about the outage.

    blank.gifMesh Upgrade Support—Over-the-air software and firmware upgrades to field devices such as Cisco CGRs and DA Gateways.

    blank.gifAudit Logging—Logs access information for user activity for audit, regulatory compliance, and Security Event and Incident Management (SEIM) integration. This simplifies management and enhances compliance by integrated monitoring, reporting, and troubleshooting capabilities.

    blank.gifNorth Bound APIs—Eases integration of existing utility applications such as outage management system (OMS), meter data management (MDM), trouble-ticketing systems, and manager-of-managers.

    blank.gifWork Orders for Device Manager—Credentialed field technicians can remotely access and update work orders.

    blank.gifRole-Based Access Controls (RBAC)—Integrates with enterprise security policies and role-based access control for DA Gateway network devices.

    blank.gifEvent and Issue Management—Fault event collection, filtering, and correlation for communication network monitoring. The FND supports a variety of fault-event mechanisms for threshold-based rule processing, custom alarm generation, and alarm event processing. Faults display on a color-coded GIS-map view for various endpoints in the utility network. This allows operator-level custom, fault-event generation, processing, and forwarding to various utility applications such as an outage management system. Automatic issue tracking is based on the events collected.

    FAN DA Device Onboarding: Device Registration and Configuration Processing

    A FAN DA solution makes it easy for customers to deploy and connect a large amount of communication equipment in a short period of time by leveraging a simplified device provisioning process. The ISM band devices and the cellular FAN DA devices can be automatically onboarded through the ZTD process.

    The FAN DA solution supports two ZTD models for the field router devices:

    blank.gifZTD 1.0-Device onboarding with manual bootstrap configuration

    blank.gifZTD 2.0-Complete automatic device onboarding including bootstrap configuration via PnP

    FAN field device onboarding refers to all the processes (device bootstrapping, device authentication and authorization, device certificate enrollment, configuration provisioning, etc.) that have to occur before a network device becomes operational and is fully managed by the network management (in this case, the Field Network Director or FND).

    Cisco offers two FAN DA solutions: one based on the license-free 900 MHz ISM band and the other based on managed 4G/LTE cellular service. The next section details the different device onboarding options available with each FAN DA solution.

    Unlicensed 900 MHz ISM FAN DA Solution Device Onboarding Overview

    For the FAN DA ISM-based solution, the infrastructure layer provisioning is done in a sequential two-step process. First, the FAR devices (CGR) must be provisioned since each CGR acts as a mesh PAN coordinator that advertises the mesh information for FD devices to join. Once the FAR devices are operational, then FD devices can also join the mesh network and go through the ZTD process.

    1.blank.gif Provision FAR devices for a specific area.

    2.blank.gif Provision FD devices for that area.

    Today, the FAN DA ISM solution supports the device onboarding methods shown in Table 48:

    Table 48 FAN DA ISM Device Onboarding Options

    Device Type        | Bootstrapping Config | Device Certificate Enrollment | Transport Supported
    CGR 1120 or 1240   | ZTD 1.0 or ZTD 2.0   | Manual or SCEP (RSA)          | Ethernet or Cellular
    IR510              | ZTD 1.0              | Manual or EST (ECC)           | WPAN
    IR530              | ZTD 1.0              | Manual or EST (ECC)           | WPAN

    ZTD 1.0 uses the original method to onboard devices onto the network, which requires that administrators apply a small basic device configuration as part of the device bootstrap configuration process before the remaining onboarding processes can take place without human intervention. Because network devices need to be manually configured, this type of deployment typically uses a staging facility. During the staging phase, in addition to applying the bootstrap configuration, customers or partners can perform local device firmware and software updates at a faster pace (which minimizes issues that could arise in the field), update the asset inventory system, and pre-provision device accounts in the network management systems.

    Figure 95 Device Onboarding using ZTD 1.0

     

    257040.jpg

    During this phase, it is recommended to run the ZTD process on a few devices in order to validate the bootstrap configuration and the success of the remaining processes (device certificate enrollment, etc.) and avoid additional work once the devices are deployed in the field.

    Device onboarding has the four stages shown in Figure 96:

    Figure 96 Device Onboarding Phases Overview

     

    256100.jpg

    blank.gifStaging can be done at a customer location or a partner warehouse location where the equipment order is shipped to for inventory, initial device configuration (bootstrapping), and temporary storage purposes until the deployment phase.

    blank.gifThe Pre-Provisioning stage is done by the customer or trusted partner once the headend infrastructure is built. Each device needs to have a pre-configured device account into the management systems (AD, FND, etc.) so that device authentication and authorization can be enforced during the device Provisioning phase.

    blank.gifThe Provisioning stage is an automatic process that occurs during the equipment field deployment phase when devices go through the ZTD process and transition to operation-ready state.

    blank.gifPost-Provisioning is the last stage in the device onboarding effort when administrators can address any security concerns, push firmware and software updates, remove temporary accounts, and change the device state from operation-ready to production.

    Field Area Routers (FAR) Device Onboarding using ZTD 1.0 (Manual Bootstrapping)

    Customers can use the ZTD feature in a secure way to automate the field device deployment process. During the bootstrapping configuration step, network administrators only need to manually apply the CGR router bootstrap configuration (WAN interface IP address, AAA information, certificate enrollment, etc.) so that the router can contact the RA and TPS and go through the certificate enrollment process.
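    A minimal bootstrap sketch is shown below. The addressing, credentials, trustpoint name, RA URL, and TPS URL are hypothetical placeholders, and the exact bootstrap template (including the CGNA profile details) should be taken from the FND/headend implementation documentation referenced in this guide.

    ! Hypothetical ZTD 1.0 bootstrap sketch - all addresses, names, and URLs are placeholders
    hostname FAR-CGR-01
    ip domain name example.com
    ip name-server 198.51.100.53
    aaa new-model
    aaa authentication login default local
    username admin privilege 15 secret <strong-password>
    interface GigabitEthernet2/1
     description WAN uplink toward the Substation LAN
     ip address 198.51.100.10 255.255.255.0
     no shutdown
    ip route 0.0.0.0 0.0.0.0 198.51.100.1
    ! SCEP enrollment toward the Registration Authority (hypothetical URL)
    crypto pki trustpoint LDevID
     enrollment url http://ra.example.com:80
     revocation-check none
     rsakeypair LDevID 2048
     auto-enroll 85
    ! CGNA profile pointing the router at the TPS proxy for tunnel provisioning (hypothetical URL)
    cgna profile cg-nms-tunnel
     url https://tps.example.com:9120/cgna/ios/tunnel
     interval 10
     active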

    Figure 97 provides an illustration of ZTD staging by the FND.

    Figure 97 FAR ZTD Process

     

    256101.jpg

    The FND is located in the Utility private network, while the Cisco Tunnel Provisioning Server proxy (TPS proxy) and Registration Authority (RA) are located in the DMZ.

    After the CGR 1000 is installed and powered on, it becomes active in the network and registers its certificate with the RA by employing the Simple Certificate Enrollment Protocol (SCEP). The RA, functioning as a CA proxy, obtains certificates for the Cisco CGR from the CA. The Cisco CGR then sends a tunnel provisioning request over HTTPS to the TPS proxy, which forwards it to the FND. Cisco FND then pushes the configuration to create a tunnel between the Cisco CGRs and the headend router (HER).

    FAR Device Onboarding using ZTD 2.0 (PnP)

    With FND version 4.3, customers no longer have to manually bootstrap the FAR configuration. They can take advantage of the router PnP functionality, where the router can receive the TPS or FND IP address information via different discovery mechanisms in order to download the router bootstrap configuration from the PnP server. FAR routers can use the manufacturer's device birth certificate, the Secure Unique Device Identifier (SUDI) certificate, to authenticate and establish a secure connection with the PnP Proxy or PnP server, depending on the deployment type.

    Figure 98 depicts the main stages of the process.

    Figure 98 Field Area Router Device Onboarding Overview

     

    257043.jpg

    Using ZTD 2.0, FAR devices can leverage Cisco PnP solution to automate the device bootstrap configuration step. This approach reduces the burden on customers by greatly simplifying the process of deploying new devices. An installer at the site can deploy a new device without any CLI knowledge, while a network administrator centrally manages device configuration.

    The Cisco Network PnP solution provides a simple, secure, unified, and integrated offering for customers to ease new field device roll-outs or for provisioning updates to an existing network. The solution provides a unified approach to provision the distribution automation communication infrastructure comprised of Cisco DA gateways, with a near ZTD experience.

    The Cisco PnP solution has the following components:

    blank.gifPnP Agent—The agent is embedded in Cisco devices and communicates to the Cisco Network Plug and Play application using the open plug and play protocol over HTTPS during device deployments. The PnP Agent, using DHCP, DNS, or other such methods, tries to acquire the IP address of the PnP server with which it wants to communicate. After a server is found and a connection has been established, the agent communicates with the PnP server to perform deployment-related activities.

    blank.gifPnP Proxy—An optional agent that runs on a Linux server. The FAN solution architecture uses the TPS server located in the DMZ as a PnP Proxy to increase the solution security. During device onboarding, all communication between field routers and FND located in the Control Center trusted zone takes place via TPS server.

    blank.gifPnP Server—The Cisco FND is a central server that functions as a PnP Server that encodes the logic of managing and distributing deployment information (images and configurations) for the devices being deployed. The FND PnP Server communicates with the PnP agent on the device using PnP protocol. The platform uses multiple Northbound REST APIs that drive core network automation solutions (Device Certificate Enrollment, etc.). The Cisco APIC-EM management tool can also be used as a PnP Server.

    blank.gifPnP Protocol—The PnP Protocol defines the transport bindings and schemas for various messages that are exchanged between the PnP Agent and PnP Server over HTTP or HTTPS (HTTPS being preferred for security reasons).

    blank.gifPnP Connect—An optional cloud component for automatic PNP server discovery if the DHCP or DNS methods are not available. The PNP Server is the backend part of the Cisco Network Plug and Play application in the APIC-EM. The Cisco network device contacts the Cisco Plug and Play Connect cloud service at devicehelper.cisco.com to obtain the IP address of the appropriate PNP server that is defined for your organization. The FAN solution Cisco PnP Connect Server information is the TPS IP address.

    blank.gifPnP Application—A mobile application for iOS and Android devices that helps configure Cisco devices with a bootstrap configuration and triggers remote branch deployments. The app communicates with the Cisco Network Plug and Play application over 3G/4G/Wi-Fi connections to get the predefined device bootstrap configuration, and delivers it to a Cisco network device by using a special serial cable that is physically connected to the device.

    When a Cisco FAR device boots up and no startup configuration is present, the PnP Agent will go through a PnP Server Discovery Phase to acquire the PnP Server IP address so that the FAR devices can download their bootstrap configuration. This bootstrap device configuration will prepare the device for the Device Certificate Enrollment step, VPN tunnel provisioning to the HER, and device registration with FND to download the final device configuration.

    PnP Agent supports the following PnP Discovery mechanisms:

    blank.gifDHCP Server Assist—Uses additional DHCP options 60 and 43 for IPv4 and option 9 for IPv6 to provide the PnP Server information during device DHCP IP address assignment (a configuration sketch follows this list).

    blank.gifDHCP Snooping—Used when the third party DHCP server does not support an additional option and where the Cisco network infrastructure is configured for DHCP Snooping and can insert DHCP options in the third party DHCP response.

    blank.gifDNS Server Lookup—The device queries the DNS servers provided during DHCP IP address assignment for a default FQDN: pnpserver.localdomain.

    blank.gifPnP Connect (Cloud Redirect)—Device connects to https://devicehelper.cisco.com/ portal to receive its bootstrap configuration or the PnP Server IP address for redirection.

    blank.gifUSB-based Bootstrapping—USB thumb drive with a bootstrap configuration using a specific file directory and name: router-confg/router.cfg/ciscortr.cfg.

    blank.gifStatic Configuration—Network administrator configures the PnP Server information on the device.
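    For the DHCP Server Assist option, the sketch below shows a Cisco IOS DHCP pool returning a hypothetical PnP Proxy (TPS) address of 192.0.2.10 on port 9125 in option 43. The vendor string fields follow the PnP format (5A1N = PnP string identifier, B2 = IPv4 address, K4 = HTTP transport, I = server address, J = port); the exact values to use should be confirmed against the PnP documentation referenced below.

    ! Hypothetical staging DHCP pool - addresses and port are placeholders
    ip dhcp pool PNP-STAGING
     network 192.0.2.0 255.255.255.0
     default-router 192.0.2.1
     dns-server 192.0.2.53
     option 43 ascii "5A1N;B2;K4;I192.0.2.10;J9125"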

    The PnP DHCP and DNS options are documented as part of the "Zero Touch Deployment" chapter in the Cisco Distribution Automation Secondary Substation Design Guide. This design guide release details the device onboarding using the PnP Connect option.

    Note: To view the Cisco Distribution Automation Secondary Substation Design Guide, please use the following link:

    blank.gif https://www.cisco.com/c/en/us/td/docs/solutions/Verticals/Distributed-Automation/Secondary-Substation/DG/DA- SS-DG/DA-SS-DG-doc.html

    Note: For detailed information on PnP Discovery Phase, please refer to the following device configuration guide:

    blank.gif https://www.cisco.com/c/en/us/td/docs/ios-xml/ios/pnp/configuration/xe-16-6/pnp-xe-16-6-book.html#reference_jdp_qgc_dy

    Automatic Device Bootstrapping Configuration using PnP Connect

    Customers can leverage the PnP Connect option to fully automate the device onboarding process. At a high level, a brand new device with no startup configuration in NVRAM will, by default, use the PnP agent running as a service on the Cisco device operating system (IOS or IOS-XE) to contact the Cisco.com PnP Connect service at a specific URL: https://devicehelper.cisco.com/

    Figure 99 Cisco PnP Connect Solution Overview

     

    257044.jpg

    The customer device ordering process in the Cisco Commerce Workspace (CCW) is integrated into the PnP solution. Customers can provide the company Smart Account during ordering to simplify the device entry process in the Software Central PnP Connect portal. For devices that were ordered without a Smart Account, customers can either manually enter each device or perform a bulk import.

    Network administrators will need to create a Controller Profile to specify the TPS server IP address and associate all the field devices with this Controller Profile.

    Figure 100 PnP Connect Controller Profile

    257056.jpg

    Table 49 Controller Profile Information

    Profile Name:         FAN-DMZ-TPS
    Deployment Type:      onPrem
    Primary IPv4 Address: 72.100.100.100
    Primary Protocol:     http
    Primary Port:         9125
    Controller Type:      PNP SERVER

    Figure 101 Device and Controller Mapping

     

    257057.jpg

    Cisco FAR ZTD is supported over Ethernet interfaces and over Cellular 4G/LTE network interfaces. Both scenarios require that the network infrastructure or the Service Provider Cellular network offer the following services:

    blank.gifIn the case of cellular, the cellular modem has a SIM inserted and the SIM has been activated on the SP network

    blank.gifDynamic IP address assignment is enabled

    blank.gifDNS information is provided during address assignment

    blank.gifInternet access is allowed

    blank.gifPnP agent can communicate with https://devicehelper.cisco.com

    Cisco FAR ZTD 2.0 over Ethernet Interface

    When the FAR router backhaul is an Ethernet interface, the CGR router is either directly connected to the Substation LAN or to a third-party private radio network, which could be Microwave or licensed cellular service. Both scenarios use private network service and do not have internet services enabled; therefore, it makes more sense to use the PnP Agent DHCP or DNS discovery options than the PnP Connect option.

    However, if the customer wants to use PnP Connect over their private network, they will need to allow the FAR to reach the internet and communicate with the PnP Connect Service using the devicehelper.cisco.com FQDN. This can be achieved by building a routing path over the IT WAN network and configuring the security appliances to allow the PnP traffic.

    Figure 102 FAR ZTD 2.0 over Ethernet

     

    257045.jpg

    Note: The device startup configuration must be empty in NVRAM for the PnP process to start. Also, since there is no startup configuration, the CGR router software image must be installed with a specific parameter that gets saved with the boot image name into the ROMMON boot variable. For more information, please read the following documentation:

    blank.gif https://www.cisco.com/c/en/us/td/docs/routers/access/800/829/15-7-3M1-Release-Note.html

    Cisco FAR ZTD 2.0 over Cellular 4G/LTE Interface

    If the router is located outside of the Substation yard or if Ethernet connectivity is not an option, the CGR can be provisioned using the PnP agent over a 4G cellular service interface. Service Providers offer two types of 4G cellular services: public or private.

    Typically, a public cellular service includes internet and dynamic device IP address assignment services. Because the service is public, Cisco modems come preloaded with specific firmware and a default profile configuration for each Service Provider. The pre-programmed profiles work only with the public cellular Access Point Name (APN) from AT&T, Verizon, T-Mobile, etc. for the North American market. This type of service is the easiest to use for FAR device onboarding since it's completely automated and does not need user interaction.

    For Public Cellular services, PnP Connect is the best option to use to onboard the devices since each device will receive a public-routable IP address from the provider. The field devices can reach out to Cisco PnP Connect to obtain the IP address of the Cisco FAN DA TPS server and complete the onboarding process.

    Since this service has internet connectivity, customers must ensure that the proper security infrastructure and policies are in place to prevent any external attacks. It's best to locate the Cisco FAN DA headend components (RA, HER, and TPS) that need to be accessed from the internet into the IT Corporate DMZ, as depicted in Figure 103.

    An alternative design for OT customers that don't want to depend on the IT group and have internet service within their Control Center, is to have the RA, HER, and TPS located in their Control Center DMZ.

    Figure 103 FAR ZTD 2.0 over Public Cellular

     

    257046.jpg

    On the other hand, Private Cellular Service uses a customer APN name that needs to be configured on the FAR modem so that the modem can join the correct cellular network. Since the device startup configuration needs to be empty, the APN name is configured directly in the modem's profile rather than in the router configuration, so it remains persistent even across device reboots.
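    For example, assuming a hypothetical private APN named utility.private and the generic slot notation used in Table 50, the profile is created from privileged EXEC mode and can then be verified. Because the profile is stored on the modem itself, it survives a reload even with an empty startup configuration.

    ! Hypothetical APN name; replace x with the actual cellular slot on the platform
    R1# cellular x lte profile create 1 utility.private
    R1# show cellular x profile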

    Figure 104 FAR ZTD 2.0 over Private APN without Internet Service

     

    257047.jpg

    When the Private APN service is provisioned with Internet Service, then customers can use the PnP Connect option to redirect the FAR devices to the TPS server located in the Control Center DMZ. The modem, in this case, still needs to be configured with the custom APN name that was defined when the cellular service order was placed.

    Using PnP Connect simplifies the user router configuration, which can be done during Staging or at the final destination as well.

    Figure 105 FAR ZTD 2.0 over Private APN with Internet Service

     

    257048.jpg

    If the Private APN is provisioned without internet services, customers can either manually configure the PnP Server information on the remote device or use the PnP DNS assist. When provisioning the Cellular services, the customer will configure the APN services to use their own DNS servers and create name records that resolve to the TPS server IP address.
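    Where the manual option is chosen, a minimal sketch of the static PnP profile on the field router is shown below, assuming a hypothetical TPS address of 198.51.100.20 reachable on port 9125 (matching the Controller Profile port used earlier in this guide). With the DNS option, no device configuration is needed because the customer-created name record resolves to the same TPS address.

    ! Hypothetical static PnP profile - address and port are placeholders
    pnp profile pnp-manual-tps
     transport http ipv4 198.51.100.20 port 9125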

    Table 50 Cellular Services Summary

    Cellular Service   | Modem APN Name Config | Internet Service | PnP Discovery Options    | APN Name    | Commands
    AT&T Public APN    | Not required          | Enabled          | PnP Connect              | broadband   | N/A
    Verizon Public APN | Not required          | Enabled          | PnP Connect              | VZWINTERNET | N/A
    Private APN        | Required              | Disabled         | Manual Config or PnP DNS | <APN_NAME>  | R1#cellular x lte profile create <PROFID> <APN_NAME>
    Private APN        | Required              | Enabled          | PnP Connect              | <APN_NAME>  | R1#cellular x lte profile create <PROFID> <APN_NAME>

    Note: The device startup configuration must be empty in NVRAM for the PnP process to start. Also, because no startup configuration exists, the CGR router software image must be installed with a specific parameter that gets saved with the boot image name into the Rommon Boot variable instead of the startup configuration file.

    For more information, please read the following documentation:

    blank.gif https://www.cisco.com/c/en/us/td/docs/routers/access/800/829/15-7-3M1-Release-Note.html

    Note: For more information on the Cisco Plug and Plug Solution, please refer to the following URL:

    blank.gif https://developer.cisco.com/docs/network-plug-n-play/#!network-plug-and-play

    Verizon Cellular service supports a Dual APN configuration (also referred to as Split Data Routing), where customers can use the Public APN for ZTD 2.0 device onboarding. During device onboarding, the second, Private APN can be configured, eliminating the need for manual APN name configuration. Once the device is onboarded and managed by FND, customers can disable the Public APN for improved security.

    Note: For more information on Cisco and Verizon Dual APN configuration guide, please refer to the following URL:

    blank.gif https://www.cisco.com/c/dam/en/us/td/docs/routers/access/interfaces/software/deployment/guide/guide-061217.pdf

    By automating the entire process, customers have the option to not use a staging facility; that reduces installation duration and cost by provisioning the devices at their final destination, as depicted in Figure 106.

    Figure 106 Device Provisioning using ZTD 2.0 (PnP)

     

    257049.jpg

    Note: It is highly recommended that customers test the process a few times before mass deploying all the FAR routers during the deployment phase.

    Field Device (FD) Device Onboarding using ZTD 1.0 (Manual Bootstrapping)

    Figure 107 FD Device Onboarding using ZTD 1.0

     

    257050.jpg

    IR510 DA Gateways can be provisioned using two methods. In the original method, the devices must have three certificates (utility device identity, etc.) installed on the IR510 together with the basic RF configuration, sometimes referred to as the manufacturing configuration. The newest method is described in the next section.

    In the original method, since the certificates are already provisioned, no certificate enrollment takes place; only device authentication and authorization are required to join the mesh network. The last step after the device joins the network is to register itself with the FND via the CSMP protocol and retrieve the final configuration from the FND repository. This method does not support certificate bootstrapping and re-enrollment. For a successful implementation, it is highly recommended that FDs are tested at the staging location before they are deployed, because a device with the wrong certificate information will not be able to join the mesh.

    FD Certificate Enrollment using EST

    The newest method, which is part of the Resilient Mesh solution, addresses the above limitations by allowing FDs to perform certificate enrollment to receive a new utility device certificate and to renew certificates that are about to expire. With FND 4.3 and Mesh 6.0, the DA Gateways can take advantage of the embedded Cisco SUDI certificate to start the IEEE 802.1x authentication process to join the mesh and communicate with a new Enrollment over Secure Transport (EST) RA component that runs on FND. The EST RA acts as a proxy between the FDs, the PKI infrastructure, and the AAA Server to authenticate, authorize, and enroll the FDs with the Utility's certificate infrastructure so that they can fully join the mesh network. The device enrollment is done using EST (RFC 7030) over a CoAP/DTLS session. Figure 108 depicts the process. This greatly simplifies the device certificate management process.

    Note: The EST RA functionality is enabled by default on FND, but it can also be deployed on an external server running Linux OS.

    Figure 108 IR510/IR530 New Enrollment Process

     

    257060.jpg

    The Cisco SUDI certificates are installed during the manufacturing process. At staging, only the Utility Trust Anchor certificate (FND RA public certificate), together with the basic RF configuration (SSID, etc.), needs to be provisioned on the Field Devices.

    Figure 109 FD Certificate Enrollment using EST Overview

     

    257061.jpg

     

    Note: For additional details about the Control Center Network Management design considerations, refer to the Cisco FAN Head-End Implementation Guide at the following URL:

    blank.gif https://salesconnect.cisco.com/open.html?c=db570d3f-3212-4659-a306-5f65aeab862b

    Once the device registration with FND is successful, the device receives its full configuration.

    Cellular FAN DA Solution Devices Onboarding Overview

    Figure 110 Cellular Gateways Onboarding Process

     

    257051.jpg

    Today, the FAN DA Cellular solution supports the device onboarding methods shown in Table 51:

    Table 51 FAN DA Cellular Device Onboarding Supported Options

    Device Type | Bootstrapping Config | Device Certificate Enrollment | Transport Supported
    IR1101      | ZTD 1.0 or ZTD 2.0   | Manual or SCEP (RSA)          | Ethernet or Cellular
    IR807       | ZTD 1.0 or ZTD 2.0   | Manual or SCEP (RSA)          | Ethernet or Cellular
    IR809       | ZTD 1.0 or ZTD 2.0   | Manual or SCEP (RSA)          | Ethernet or Cellular

    The FAN DA Cellular product can be managed On-Premise with two different management systems: Field Network Director (FND) or Cisco DNA Center (DNAC). FND is the preferred NMS for Cisco FAN DA. Cisco DNAC could be used in customer accounts where the utility enterprise IT team is already using DNAC for the enterprise network and would like to maintain the same look and feel for the OT network.

    Cisco also offers a cloud-based gateway management solution called Cisco Kinetic Gateway Management (GMM) for customers that prefer a subscription model.

    Reference Documentation:

    blank.gifCisco Field Network Director (FND) Overview:

    blank.gif https://www.cisco.com/c/en/us/products/cloud-systems-management/iot-field-network-director/index.html

    blank.gifCisco DNA Center Overview:

    blank.gif https://www.cisco.com/c/en/us/products/cloud-systems-management/dna-center/index.html

    blank.gifCisco Kinetic Gateway Management (Cloud Service):

    blank.gif https://developer.cisco.com/docs/kinetic/#!gmm-overview

    DA Cellular Gateway Device Onboarding using ZTD 1.0 (Manual Bootstrapping)

    The device onboarding process for Cellular Gateway products is similar to the Unlicensed 900 MHz FAN solution described in Field Area Routers (FAR) Device Onboarding using ZTD 1.0 (Manual Bootstrapping).

    The main difference between the products is the cellular interface naming:

    blank.gifIR1101 format is: interface Cellular 0/1/0 for SIM0 slot or Cellular 0/1/1 for SIM1 slot

    blank.gifIR1101 expansion module: interface Cellular 0/2/0 for SIM0 and Cellular 0/2/1 for SIM1 slot

    blank.gifIR807 format is: interface Cellular 0

    blank.gifIR809 format is: interface Cellular 0

    DA Cellular Gateway Device Onboarding using ZTD 2.0 (PnP)

    The preferred method for provisioning cellular products is the ZTD 2.0 because it simplifies deployment and allows for a large number of devices to be provisioned automatically. The same process described in Automatic Device Bootstrapping Configuration using PnP Connect applies to IR1101, IR807, and IR809.

    Figure 111 Device Onboarding using ZTD 2.0 (PnP)

     

    257049.jpg

    Devices can be onboarded at the staging facility or at the final destination. Provisioning devices at the final destination reduces costs associated with the staging facility, additional equipment shipments, and resource travel, and it improves solution security by removing the dependency on the security policies of a third-party staging facility. On the other hand, this type of deployment requires thorough testing during the Pre-deployment phase to ensure that all components work as expected and to avoid additional site visits. When devices are provisioned over cellular backhaul connectivity, the solution also depends on the Service Provider's service reliability.

    Note: It is highly recommended that customers test the process a few times before mass deploying all the DA Cellular Gateways during the deployment phase in order to avoid additional truck rolls post deployment.

    FAN Device Software Management

    FND manages the firmware upgrades for the FAR devices, DA Gateways, and Range Extenders. The first stage of the upgrade process is the firmware upload to local device storage. Upgrades can be scheduled so that they take place during times when grid network activity is low. The process can be canceled or paused at any time.

    Firmware upgrades can be done per group of devices and, for special cases, per device. FND first checks whether the targeted devices already have the firmware in flash, which reduces WAN bandwidth utilization. Upgrades are done in batches of 16 devices at a time from a particular group. Since flash space is limited, the FND will delete unused software and, if there is still not enough space, it will notify the administrator that the device requires manual upgrade intervention.

    Depending on the FAR backhaul bandwidth, the software load stage to the FAR could take anywhere from a few minutes to a few hours. The length of this process is also influenced by the number of Field Devices and the Control Center WAN interface bandwidth oversubscription ratio relative to the number of substations. To optimize the upload stage, firmware can be delivered over IPv6 multicast, which optimizes mesh bandwidth usage for a large group of devices. Unicast is also supported. Based on the number of targeted devices, the FND calculates whether multicast or unicast is more efficient for transmitting the firmware.

    Mesh devices have flash storage divided into an Uploading slot, a Running slot, and a Back-up slot. When FND is uploading a new firmware image, it overwrites the Uploading slot. The image is uploaded block by block and only missing blocks are retransmitted.

    With the new Resilient Mesh 6.0 software, the mesh uses the Cascade Firmware Upgrade feature to further improve the software delivery process and reduce the bandwidth utilization of the mesh. The feature reuses the Multicast Protocol for Low-Power and Lossy Networks (MPL), RFC 7731, data and control messages with slightly different behavior to improve functionality. Retransmissions now take place between a node and its parents rather than between nodes and the FND. Mesh devices use ICMPv6 packets to carry the MPL request control messages, which carry the seed ID and message bitmap and are sent to the neighbor's IPv6 unicast link-local address instead of to the multicast link-local address (e.g., FF02::FC). The multicast messages are sent in the broadcast channel based on the common broadcast PAN scheduler.

    The performance, or duration, of the firmware delivery is affected by the forwarder density and the number of hops to a destination. A higher node density within an area also improves the source multicast input rate into the mesh. Overall delivery time therefore depends on the size of the mesh (number of nodes), the number of hops, and the FND input rate.

    Administrators can set the source multicast input rate into the mesh (Fast, Medium, or Slow) on the FND management servers. The MPL messages are marked with a QoS DSCP value of 0 (low priority).

    Figure 112 Mesh Cascade Firmware Update Overview

     

    256104.jpg

    Figure 112 shows an example of how the FD firmware image is split into blocks that are encapsulated in IPv6 packets. The MPL messages are low priority and are transmitted in the Mesh Broadcast time slots. Each time slot is depicted by a different color to show how packets are transmitted over time. When a node has not received all of the firmware image blocks, it can request that its parents retransmit the missing blocks.

    Once all the firmware image blocks are received, the device image is loaded in the running slot and the device is rebooted.

    WAN Device Management

    The HER devices use the traditional NMS protocols like SNMP to be monitored and managed. Customers can use the existing management tools to perform device management functions. For new deployments, Cisco Prime Infrastructure is an advanced management tool and can be used to manage the WAN infrastructure.

    Note: Cisco Prime Infrastructure product overview. Refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/products/cloud-systems-management/prime-infrastructure/index.html

    Edge Compute Software Management

    Starting with FND 4.4, Edge Compute applications can be managed from a single GUI interface of the FND server. Customers can package their application in a Docker-style container and upload it to the FND for delivery and installation on the Field Device. FND uses an optimized process for software delivery similar to the FD firmware update.

    Figure 113 Edge Compute Develop and Deployment Overview

     

    256105.jpg

    Design Recommendation:

    Network designers should work closely with the utility software development team or third-party software companies. Since the mesh bandwidth is valuable and limited, software developers should optimize the way the application and its library dependencies are packaged in the EC container. Avoid monolithic packaging, because any change will require the entire container to be re-deployed over the mesh, consuming unnecessary bandwidth. A good container design should leverage Docker's layering capability and separate the elements that have a higher probability of changing. For example, the application code should be in its own layer, since software programmers will most likely release new application functionality more often than updates to the application's library dependencies. In addition, if the application requires multiple runtimes or libraries, such as Java and Python, separating each one into its own layer helps with critical software patch releases.

    Figure 114 FND EC Application Deployment

     

    256106.jpg

    Administrators can use the FND's graphical statistic page to check the status of the application running state, device resource utilization, and troubleshoot any application crashes. Application and device log analysis was optimized so that administrators can remotely look at the most recent events or retrieve events based on a customized time period.

    Figure 115 FND FD IOX Application Statistics

     

    256107.jpg

    Device Work Order Ticket

    FND has a built-in Work Order functionality so that only authorized Field Technicians can access the FAR and the FD Mesh devices. Cisco provides the IOT Device Manager (IOT-DM) graphical tool for local management of the mesh device using the out-of-band management Console or Ethernet port.

    Figure 116 IoT Device Manager Graphical Interface

     

    256108.jpg

    Access to these ports is restricted by enforcing network authentication and authorization for a period of time defined in the FND Work Order ticket form. The authentication is based on certificates, and no username and password need to be exchanged with the Field Technician, greatly increasing the solution security. Once the work order expires, the Field Technician's laptop will not be able to access the Field Device even with physical access. A new work order ticket will need to be created by the System Operation personnel at the Control Center.

    Figure 117 Work Order Process Flow

     

    256109.jpg

    Field technicians can perform local troubleshooting and upload new firmware using the graphical IOT-DM tool.

    Note: Cisco IOT-DM tool requires Windows 10 OS. Please refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/td/docs/routers/connectedgrid/iot_dm/guide/guide_5_0/b_iot-device-manager-5/b_iot-device-manager-5_chapter_00.html

    Network Availability and Resiliency

    FAN Infrastructure Layer

    The IEEE 802.15.4g standard supports only one coordinator per wireless mesh PAN; the FAR device acts as the PAN coordinator. In order to increase the network availability and resiliency and avoid Inter-PAN migration, Cisco has developed the High Availability feature. Customers can deploy FAR chassis WPAN module redundancy or FAR chassis redundancy based on the targeted network availability goals.

    Note: The High Availability feature requires an external Ethernet cable between the redundant chassis.

    In both implementation types, the WPAN modules work in an Active/Standby configuration. Only one WPAN, the Active WPAN module, sends and receives IEEE 802.15.4g frames. The Active WPAN synchronizes its state (Neighbor table, RPL, etc.) with the Standby WPAN over the FAR backplane when the two are installed in the same chassis or over a Layer 2 Ethernet link between the two chassis.

    Figure 118 HA WPAN Deployment: Normal and Abnormal Conditions

     

    256110.jpg

    For WPAN HA configuration, the network recovery time is within a few seconds.

    Figure 119 HA Chassis Deployment: Normal and Abnormal Conditions

     

    256111.jpg

    For Chassis HA, the recovery time is within a couple of minutes based on the failure detection timers.

    For highly available environments, Cisco recommends that customers deploy chassis redundancy because it offers additional protection against failures.

    Table 52 HA Deployment Benefits

     
    Attribute                   | WPAN HA                 | Chassis HA
    Cost                        | Lower                   | Higher
    WPAN Failure                | Yes                     | Yes
    Chassis Failure             | No                      | Yes
    Power Source Redundancy     | No                      | Yes
    Failure Replacement Process | Requires network outage | Non-disruptive

    To increase the overall PAN network availability, each node should have at least two upstream links to two diverse parents. Ideally, the higher the separation angle between the two links, the better, so that a single obstruction will not affect both paths. In addition, the two upstream links should have similar characteristics in terms of performance and rank position within the RPL routing tree. This helps minimize downstream tree topology changes during a link failure. In places where no redundancy exists, customers should consider installing additional Range Extenders (IR530) to increase the node network availability, as shown in Figure 120.

    Figure 120 Upstream Link and Node Redundancy: Normal Conditions with Upstream Redundancy and Link & Node Failure

     

    256112.jpg

    Customers should consider adding battery backup to the DA Gateway devices to increase the network uptime, especially during a distribution feeder power outage. The IR510 can be powered using a DC input voltage ranging from ±9.5V to ±60V. Depending on the geographical climate of the deployment area and the additional equipment installed in the enclosure housing the IR510, customers should ensure proper spacing between equipment or consider additional ventilation options to stay within the IR510's thermal requirements for proper operation.

    Note: The system shall operate in a thermal environment from -40 °C to 85 °C.

    Figure 121 DA Gateway Battery Back-up Example

     

    256113.jpg

    For customers looking to leverage a solar panel to charge the UPS battery attached to an IR510, the following data can be used as a reference for initial system planning to determine whether the equipment installation location can provide enough solar energy to meet the system availability target.

    IR510 power usage depends on the radio transmission modulation, data rate, and transmitter duty cycle. For a 50% duty cycle, the power usage is around 8W. Power usage increases when additional IOx applications are run on the IR510; when IOx CPU utilization is at 50%, the device consumes an additional 4-5W.

    Note: The power usage figures listed above are reference numbers for an ambient temperature of 77°F (25°C). For precise numbers, check the product data sheet.

    For disaster recovery scenarios, when a wide PAN failure has occurred (for example, the mesh coordinator has failed), the DA Gateways can perform Inter-PAN migration as a gateway of last resort. This migration process is slow and depends on a number of factors:

    blank.gifNode cache knowledge of the adjacent PAN information

    blank.gifNode security credential of the adjacent PAN

    blank.gifNumber of nodes that need to perform Inter-PAN migration

    blank.gifThe failed PAN ID topology (daisy-chained, etc.)

    Figure 122 Disaster Recovery via Inter-PAN Migration: Normal State and PAN ID 1 Failure

     

    256114.jpg

    Note: In order for Inter-PAN Migration to work, both mesh PANs must be configured with the same SSID.

    Design Guidance:

    Inter-PAN migration design should be carefully analyzed as part of the PAN Capacity planning and the PAN targeted performance since traffic volume will increase based on the number of nodes that will migrate and the type of DA applications they support.

    WAN Infrastructure Layer

    FlexVPN can be deployed in a highly available configuration to increase network availability and resiliency over the WAN. To protect against hub failures, customers can deploy FlexVPN in a Multi-Hub Stateless Failover or Multi-Hub Resiliency topology within the same Control Center or across the Primary and Back-up Control Centers.

    A FlexVPN Stateless Failover design has two or more hub routers configured in an active/standby HSRP configuration; the spoke (FAR) routers are configured with only a single VPN tunnel to the HSRP Virtual IP address.
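    A minimal hub-side sketch of this design, using hypothetical addressing, is shown below. Both HERs share the virtual IP 203.0.113.1, which is the single tunnel destination configured on every spoke; the second HER carries the same configuration with a lower standby priority.

    ! Hypothetical HER-1 WAN-facing interface - addresses and group number are placeholders
    interface GigabitEthernet0/0/0
     description WAN-facing interface on HER-1
     ip address 203.0.113.2 255.255.255.0
     standby version 2
     standby 1 ip 203.0.113.1
     standby 1 priority 110
     standby 1 preempt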

    The Multi-Hub Resiliency design uses the spoke backup peer IKEv2 feature where the FAR routers are configured with multiple backup HER IP addresses and only establish one VPN tunnel to the primary HER device. If a failure occurs, the FAR router will establish a new VPN tunnel to the secondary or tertiary HER IP address.
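    A spoke-side sketch of the backup peer configuration is shown below, with hypothetical primary, secondary, and tertiary HER addresses. When IKEv2 dead peer detection declares the active peer down, the FlexVPN client block re-establishes the single tunnel toward the next peer in the list.

    ! Hypothetical FlexVPN client backup peers - addresses and tunnel number are placeholders
    crypto ikev2 dpd 30 5 periodic
    crypto ikev2 client flexvpn FAN-BACKUP
     peer 1 203.0.113.1
     peer 2 203.0.113.65
     peer 3 203.0.113.129
     client connect Tunnel0
    interface Tunnel0
     tunnel destination dynamic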

    Table 53 WAN Resiliency Design Considerations

    Multi-Hub Stateless Failover
      Protection scenario: Hub failure
      Number of spoke VPN tunnels: Single
      Second HER state: Standby
      VPN tunnel status during failure: No impact
      Failure detection technology: HSRP keepalive
      Convergence: msec to seconds
      Special requirements: Layer 2 connectivity between WAN interfaces

    Dual-Hub Resiliency
      Protection scenario: Hub failure
      Number of spoke VPN tunnels: Single
      Second HER state: Standby
      VPN tunnel status during failure: VPN and routing convergence
      Failure detection technology: IKEv2 DPD keepalive
      Convergence: seconds to minutes
      Special requirements: Convergence depends on number of FAR devices

    Multi-Hub Clustering
      Protection scenario: Hub failure
      Number of spoke VPN tunnels: Single
      Second HER state: Active
      VPN tunnel status during failure: VPN and routing convergence
      Failure detection technology: HSRP keepalive and IKEv2 DPD keepalive
      Convergence: seconds to minutes
      Special requirements: Layer 2 connectivity between WAN interfaces; convergence depends on number of FAR devices

    Multi-Hub Resiliency with WAN transport redundancy
      Protection scenario: Hub failure and WAN transport failure
      Number of spoke VPN tunnels: Single, with Pivot Tunnel
      Second HER state: Standby
      VPN tunnel status during failure: VPN and routing convergence
      Failure detection technology: IKEv2 DPD keepalive
      Convergence: seconds to minutes
      Special requirements: Convergence depends on number of FAR devices

    Multi-Hub Resiliency with Spoke Dual Tunnel
      Protection scenario: Hub failure
      Number of spoke VPN tunnels: Two tunnels
      Second HER state: Standby
      VPN tunnel status during failure: Routing convergence
      Failure detection technology: Dynamic routing
      Convergence: msec to seconds
      Special requirements: Tunnel interface between the HER devices

    Multi-Hub Resiliency with Spoke Dual Tunnel per WAN transport service
      Protection scenario: Hub failure and WAN transport failure
      Number of spoke VPN tunnels: Two tunnels, one per WAN transport service
      Second HER state: Standby
      VPN tunnel status during failure: Routing convergence
      Failure detection technology: Dynamic routing
      Convergence: msec to seconds
      Special requirements: Tunnel interface between the HER devices

    Notes:

    blank.gifThe convergence of each HA design can be lowered to the milliseconds or seconds range by using additional failure detection mechanisms such as BFD or IP SLA, at the expense of WAN bandwidth utilization.

    blank.gifAdditional configuration enhancements can be made for designs with a spoke dual-tunnel configuration so that traffic is load-balanced across the two HERs.

    Design Guidance:

    If the Inter-PAN migration is desired and implemented, network administrators need to take into consideration the additional dynamic IPv6 host routes that are being learned by the FAR devices from the DA Gateways that join an adjacent PAN. The host route is inserted in the FAR routing table as a connected route, which will need to be advertised into the FlexVPN routing domain so that traffic from the Control Center can be routed toward the new FAR device.
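    As one hedged example, if a named-mode EIGRP instance runs over the FlexVPN tunnels (the routing protocol choice is deployment specific and is only assumed here for illustration), the migrated host routes that show up as connected routes on the adopting FAR can be advertised toward the Control Center by redistributing connected routes:

    ! Hypothetical IGP example - instance name and AS number are placeholders
    router eigrp FAN
     address-family ipv6 unicast autonomous-system 100
      topology base
       redistribute connected
      exit-af-topology
     exit-address-family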

    Figure 123 PAN MAP-T Prefix Advertisement

     

    256115.jpg

    Reference Documentation:

    For Dual-Hub Resiliency, please refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/support/docs/security/flexvpn/118888-configure-flexvpn-00.html

    For the FlexVPN with IKEv2 Load-Balancing Cluster Configuration Guide, please refer to the following URL:

    blank.gif https://www.cisco.com/en/US/docs/ios-xml/ios/sec_conn_ike2vpn/configuration/15-2mt/sec-cfg-clb-supp.html

    For Multi-Hub Resiliency with Spoke Dual Tunnel per WAN Transport Service, please refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/support/docs/security/flexvpn/116412-configure-flexvpn-00.html

    Headend Infrastructure Layer

    The FND is a critical application for monitoring and managing a Distribution Grid communication infrastructure. The FND was designed with High Availability in mind, and customers should deploy the management software in a redundant configuration to address the overall availability of the system.

    FND provides two main levels of HA: one for the front-end FND Server(s) and the other for the backend and database HA:

    blank.gifFND Server HA-This is achieved by connecting multiple FND servers to a network load balancer. Traffic originating from the FAN network goes to the load balancer, which uses a round-robin algorithm to distribute the load among the FND cluster servers.

    blank.gifFND Database HA-This is achieved by configuring two FND Database servers: a primary server and a standby (or secondary) server. When the primary database receives new data, it sends a copy to the standby database. A separate system runs the Observer (the Observer can also run on the standby server), which is a program that monitors the FND Database servers. If the primary database fails, the Observer configures the standby server as the new primary database. FND Database HA works in single and cluster IoT FND server deployments.

    Figure 124 FAN Management System High Availability

     

    256116.jpg

    HA Guidelines and Limitations:

    blank.gifAll FND nodes must be on the same subnet

    blank.gifAll FND nodes must run on similar hardware

    blank.gifAll IoT FND nodes must run the same software version

    Note: The WAN and FAN devices can continue to operate for days during a catastrophic FND failure, but since FND maintains the security mesh keys for the FAN devices, customers must recover the FND server to avoid any disruption in the monitoring and controlling of the DA grid devices.

    Design Guidance:

    Cisco recommends that customers consider deploying the FAN management infrastructure at a disaster recovery site as well (for example, a Back-up Control Center), especially for utilities located in areas prone to natural disasters.

    Besides the FND and FND Database, the other Headend Infrastructure components (Load-balancers, firewalls, DNS, DHCP, and Active Directory) should also be deployed in HA configuration to increase the availability of each service.

    Reference Documentation:

    For the Cisco FND HA Installation Guide, please refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/td/docs/routers/connectedgrid/iot_fnd/install/4_1/iot_fnd_install_4_1/high_availability.html

    For the Cisco Prime Network Register (CNR) HA - Administration Guide, please refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/td/docs/net_mgmt/prime/network_registrar/9-1/administration/guide/Admin-Guide/Admin-Guide_chapter_00.html

    For the Cisco Firewall HA Configuration Guide, please refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/td/docs/security/firepower/601/configuration/guide/fpmc-config-guide-v601/fpmc-config-guide-v601_chapter_01100110.pdf

    For F5 Big IP Load Balancer Product Information, please refer to the following URL:

    blank.gif https://www.f5.com/services/resources/glossary/load-balancer

    Equipment Mean Time Between Failures

    Table 54 shows the Equipment Mean Time between Failures.

    Table 54 Cisco Resilient Mesh MTBF Values

    Device Type
    MTBF (hours)
    Configuration Details

    CGR1240

    512,750

    1xCGM 802.15.4
    1xCellular
    3xBBU

    IR510

    758,700

    Predicted MTBF (pMTBF)

    IR530

    7,854,516

    Predicted MTBF (pMTBF)
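
    For planning purposes, the MTBF values in Table 54 can be converted into a rough availability and expected-failure estimate. The short Python sketch below illustrates the standard MTBF/(MTBF + MTTR) calculation; the 24-hour MTTR is an assumed illustrative value (actual repair times depend on the utility's truck-roll and sparing strategy), not a Cisco specification.

```python
# Rough availability estimate from the MTBF values in Table 54.
# The MTTR (repair time) is an assumed value for illustration only.
HOURS_PER_YEAR = 8760

def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability of a single device."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

DEVICES = {"CGR1240": 512_750, "IR510": 758_700, "IR530": 7_854_516}
MTTR_HOURS = 24.0  # assumed truck-roll plus replacement time (hypothetical)

for device, mtbf in DEVICES.items():
    a = availability(mtbf, MTTR_HOURS)
    failures_per_year = HOURS_PER_YEAR / mtbf
    print(f"{device}: availability={a:.5%}, expected failures/year={failures_per_year:.4f}")
```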

    Network Scalability

    FAN Infrastructure Layer

    Customers can connect multiple grid devices to the DA Gateways based on the device communication interface.

    Table 55 IR510 DA Gateway Supported Interfaces

    Grid Device Interface
    Number of devices
    Port
    Comments

    RS232

    2

    Serial 0 (DCE) or Serial 1 (DTE)

    Might require gender changer if grid devices are the same type

    RS485

    32

    Serial 0 (DCE)

    Multi-drop, half-duplex. Must be terminated externally.

    Ethernet

    254

    Ethernet 0

    For more than one device, customers must use a layer 2 switch

    Design Guidance:

    Currently, IR510 software version 6.0 supports up to 15 NAT44 entries; therefore, the number of IPv4 grid devices that can sit behind the IR510 is effectively lower than 254. This limit only applies when the application session towards the IPv4 grid device is initiated from a remote location and each IPv4 grid device uses many unique ports.

    Note: Out of the 15 NAT entries, three are used by the IOx module on the IR510.

    For DA deployments this is typically not a problem, since no more than two or three IPv4 grid devices exist on a given feeder pole or underground cabinet.

    IPv6 grid devices do not have any restrictions. Cisco recommends customers make IPv6 support a requirement during the purchase evaluation process for electric grid equipment.

    Note: The range extender (IR530) does not have any communication interface to connect grid devices

    Each mesh device can maintain a large number of Layer 2 neighbors. A node maintains connectivity status for up to four parents (upward) and has no limit on downward children when the Resilient Mesh operates in Non-Storing mode. When the mesh needs to support peer-to-peer communication, it must be configured for Storing mode, which limits the number of downward routes each node can store.

    Table 56 IR510 and IR530 Neighbor Scalability Limits

    Description
    Maximum Number
    Comments
    Layer 2 neighbors

    2048

    Parents and Children

    Layer 3 downwards routes

    300

    Children, in Storing Mode

    PAN ID info

    2

    PAN info cached

    The IR510 has a dedicated Edge Compute environment where customers can load grid applications to perform different local functions. The IR510 CPU is divided into CPU units that can be allocated and reserved to each application container.

    Table 57 IR510 IOx Hardware and Software Resources

    Description
    Physical
    Size
    Comments
    CPU

    ARM 1.0GHz

    3940 Units

     

    Memory (RAM)

    2 GB

    1.5 GB

    Volatile Memory

    Permanent Storage

    4 GB

    3.5 GB

    Flash Memory

    Serial Interfaces

    DCE and DTE

    2

    Serial interface cannot be shared

    Ethernet Interface

    ETH0

    1

    Access interface.

    Bridged or Routed Mode

     

     

     

    USB

    USB

    1

    Management file transfer

    Cisco CGR routers have four modular slots where customers can install a few combinations of modules based on the desired service.

    Table 58 CGR Module Limits

    Description
    Maximum Number
    Comments

    CGM WPAN 900Mhz

    1

    Two will be supported in the future. CGR1xxx Slot4

    CGM 4G LTE with 900 support

    1

    CGR1240 Slot3 CGR1120 Slot3

    CGM SRV (IOx)

    1

    CGR1240 Slot5 CGR1120 any slot

    Besides these modules, customers can attach serial as well as IP devices to the CGR.

    Table 59 CGR Maximum Number of Interfaces Supported

    Description
    Maximum Number
    Comments

    RS232

    2

    When in HA configuration

    RS485

    2

    Provider diversity and redundancy

    Gigabit Ethernet

    2

     

    Fast Ethernet

    2/4

    CGR1120 has 2 additional interfaces

    Cisco CGR routers act as PAN Coordinators and maintain additional information compared to IR510 or IR530. The following are device software limits, including the maximum PAN size.

    Table 60 CGR CGM (WPAN) Neighbor Scalability Limits

    Description
    Maximum Number
    Comments
    Layer 2 neighbors

    2048

    Parents and Children

    Layer 3 downwards routes

    300

    Children, in Storing Mode

    Maximum nodes per PAN

    Varies

    Based on application type and bandwidth requirements

    Design Guidance:

    When designing a PAN size, network administrators need to account for all nodes during normal operations, but also for any potential nodes from nearby PANs that can perform Inter-PAN migration.

    Initially, Edge Compute was implemented in the CGR routers using the native hardware. Since CGR routers act as aggregation devices, the onboard resources were not sufficient for today's edge processing requirements. For small applications and low resource utilization, customers can still leverage the native IOx functionality. More recently, Cisco has developed a dedicated Edge Compute module, the Cisco Server Module (CSM), which customers can install in one of the CGR router slots.

    Table 61 Cisco CGR CSM Module IOx Hardware and Software Resources

    Description
    Physical
    Size
    Comments
    CPU

    AMD x86, 4 cores, 800 MHz

    7318

    --

    Memory (RAM)

    4 GB

    3.8 GB

    Volatile Memory

    Permanent Storage

    64 and 128 GB

    50 and 110 GB

    There are 2 types of modules

    Serial Interfaces

    DCE and DTE

    2

    Serial interface cannot be shared

    Ethernet Interface

    ETH0

    2

    Internal and External

    USB

    USB

    2

    External storage, Management file transfer

    Cisco IOx

     

    Yes

     

    Windows 7/10 VM

     

    Yes

    KVM virtualization

    Linux Ubuntu 14.04 and up VM

     

    Yes

    KVM virtualization

    Design Guidance:

    The CGR onboard IOx resources should be disabled and allocated 100% to IOS when a CSM module is installed.

    Reference Documentation:

    For the Cisco CGR CGM Module Data Sheet, please refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/products/collateral/routers/1000-series-connected-grid-routers/datasheet-c78-739683.html

    For additional Edge Compute resources, especially for large Edge Compute deployments, customers can deploy one or more IC3000 hardened server appliances within the substation network.

    Table 62 Cisco IC3000 IOX Hardware and Software Resources

    Description
    Physical
    Size
    Comments
    CPU

    Intel x86, 4 cores, 1.25GHz

    10260 Units

     

    Memory (RAM)

    8 GB

    6.4 GB

    Volatile Memory

    Permanent Storage

    128 GB

    96 GB

    mSATA

    Gigabit Ethernet Interface

    3

    2

    One dedicated to management

    Serial Interfaces

    USB

    2

     

    Reference Documentation:

    For the Cisco IC3000 Appliance Data Sheet, please refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/products/collateral/routers/3000-series-industrial-compute-gateways/datasheet-c78-741204.html

    For the Cisco IOx Product Matrix, please refer to the following URL:

    blank.gif https://developer.cisco.com/docs/iox/ - !platform-support-matrix/platform-support-matrix

    WAN Infrastructure Layer

    FAR routers in the substation establish VPN tunnels with the HER devices in the Control Centers. Typically, only a pair of FAR devices is installed in the substation yard in a HA configuration.

    Design Guidance:

    Additional FAR devices or pairs of devices can be installed within the same substation yard, but additional RF considerations must be taken into account because they could have a negative impact on network performance. This topic is further explained in RF Design Considerations.

    Depending on the WAN design solution, the active CGR router initiates one or multiple FlexVPN tunnels based on the utility Control Centers and Disaster Recovery locations. A network administrator has no limitations to consider from a VPN tunnel scaling perspective, nor from a packet-forwarding rate perspective, because the maximum mesh radio physical interface rate of 1.2 Mbps is much lower than the FAR WAN interface rate when connected via a 10, 100, or 1000 Mbps Ethernet interface.

    Scaling considerations come into play for deployments with a large number of substations, where the HER device (FlexVPN server) needs to aggregate many VPN tunnels. The maximum number of tunnels an HER can terminate is greatly affected by how many protocols the device must maintain besides the VPN control plane. GRE, dynamic routing protocols, aggressive routing protocol timers, and security services all reduce the maximum number of tunnels supported on an HER device.

    Table 63 Headend Router Scalability Limits

    Platform
    QoS Enabled
    IKEv2 Max. Tunnels
    IKEv2 Dynamic Routing Max Tun.
    BGP Max Adj.
    EIGRP Max Adj.

    CSR 1000v with 4vCPU 4G

    Yes

    1000

    1000

    2000

    1000

    ISR 4451

    Yes, 10% performance impact

    2000

    2000

    2000

    1000

    ASR 1001-X

    Yes, 16K queues

    4000

    4000

    4000

    1000

    ASR 1002-X

    Yes, 128K queues

    10,000

    10,000

    6000

    1000
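
    As a rough illustration of how the limits in Table 63 translate into HER sizing, the following Python sketch estimates how many HERs a deployment would need if each substation FAR maintains one FlexVPN tunnel to each HER. The substation count and the 80% headroom factor are assumptions for the example only; the tunnel limits mirror the IKEv2 dynamic routing column above.

```python
# Back-of-the-envelope HER sizing based on the Table 63 limits,
# assuming one FlexVPN tunnel per substation FAR.
PLATFORM_TUNNEL_LIMITS = {
    "CSR 1000v (4 vCPU)": 1000,
    "ISR 4451": 2000,
    "ASR 1001-X": 4000,
    "ASR 1002-X": 10000,
}

def hers_required(substations: int, tunnel_limit: int, headroom: float = 0.8) -> int:
    """Number of HERs needed if each is loaded only to `headroom` of its limit."""
    usable = int(tunnel_limit * headroom)
    return -(-substations // usable)  # ceiling division

substations = 2500  # assumed deployment size (illustrative)
for platform, limit in PLATFORM_TUNNEL_LIMITS.items():
    print(platform, hers_required(substations, limit))
```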

    Headend Infrastructure Layer

    The FND can support a large number of devices as listed in Table 64 and Table 65:

    Table 64 Field Network Director Scalability Limits

    FND Version
    IOX Support
    Max. # of FAR
    Max. # of DA Gateways
    Comments

    FND 4.1

    No

    5000

    5 million

    --

    FND 4.3

    Yes

    5000

    6 million

    Requires integration with FD

    Table 65 Fog Director Scalability Limits

    FD Version
    FND Integration Required
    Max. # of IOx Nodes
    Default Update Timer
    Comments

    FD 1.5

    Yes

    1.5 million

    2 hours

    Fog Director in two node cluster

    Network Flexibility

    FAN Infrastructure Layer

    The Cisco Resilient Mesh solution can support any type of network topology. The number of nodes, their locations, and each node's RF coverage will dictate what the topology looks like. A node supports point-to-point and point-to-multipoint connectivity to other adjacent nodes. If multiple nodes share the same RF signal propagation area, then they can form a hub-and-spoke or star topology.

    Figure 125 Basic Network Topology

     

    256117.jpg

    The network can be extended beyond one RF area by using composite network topologies like daisy chain, tree, or cluster tree topologies that cover larger distances.

    Figure 126 Extended Network Topologies

     

    256118.jpg

    The solution supports dynamic changes to the network topology in response to changing RF conditions, referred to as node migration. Intra-PAN migration takes place when a node chooses another node as its parent based on the shortest path cost to the FAR router. Inter-PAN migration occurs when a node decides to join another adjacent PAN because it lacks additional parents in its original PAN or because the new PAN has better characteristics, such as lower node density and a better path cost to the new FAR.

    Figure 127 Node Migration

     

    256119.jpg

    Inter-PAN migration can be prevented by configuring a unique SSID per substation, practically creating two different mesh networks. For these types of designs, administrators should plan for additional redundancy within the PAN where each node has multiple candidate parents to avoid node isolation as shown in Figure 128:

    Figure 128 Prevent Inter-PAN Migration

     

    256381.jpg

    The RF mesh PAN topology will be largely determined by the physical layout of the electrical grid network and the available utility asset infrastructure (electrical poles, street lighting poles) where communication equipment can be installed. The following factors influence the overall topology:

    blank.gifNumber of DA grid devices

    blank.gifNumber of FAR devices based on the number of substations and areas with high density of nodes.

    blank.gifNumber of Range Extenders for signal coverage gaps or increased node redundancy

    One important aspect of a FAN DA solution is that the RF topology does not have to match the electrical grid feeder network unless specific peer-to-peer communication is required between two DA devices.

    Design Guidance:

    Cisco recommends not engineering every node and link for how data is forwarded through the mesh, since the solution was designed to be dynamic and to forward end DA device communication over the best (shortest) path through the mesh, independent of where the communication equipment is installed along the feeder.

    Figure 129 Feeder Agnostic Mesh Topology

     

    256378.jpg

    WAN Infrastructure Layer

    The Cisco FAN Distribution Automation solution was developed to fit different customer WAN environments.

    The solution is agnostic to the customer's WAN transport service (private dark fiber, managed Layer 2 Metro Service, Layer 3 MPLS service, or Private or Managed Cellular service) by using an overlay VPN design based on FlexVPN. The DA packets are encrypted and encapsulated with new IP headers, so the payload is hidden from service providers as it travels through the WAN infrastructure, protecting the utility grid infrastructure. For Private Cellular services where the utility has acquired licensed spectrum, the FAR devices must be connected to the utility's private modem.

    Figure 130 Supported WAN Transport Services

     

    256120.jpg

    The overlay VPN supports any-to-any or hub-and-spoke topologies at the transport layer. For FLISR scenarios where peer-to-peer is required between substations, customers should consider implementing or acquiring an any-to-any topology to achieve optimal traffic flow between substations.

    Customers can provision new services and connect the FAR router directly to the service via an Ethernet or Cellular interface. Security features can be configured to harden the security of the devices. Customers with an existing substation WAN can take the FAN Aggregation layer (FAR devices) and connect them to their existing Substation WAN router or LAN infrastructure. For NERC/CIP substations or for more secure designs, the FAR routers can be connected to the Substation DMZ to ensure that no unauthorized traffic enters the utility network from the FAN network.

    Figure 131 Substation Connectivity Options

     

    256121.jpg

    The solution can be customized to fit different SCADA architectures over existing WAN implementations as described in Utility SCADA Systems Architecture Overview. If the WAN infrastructure is not IPv6 or multicast ready, the overlay VPN implementation helps the customer deploy the DA solution without having to make changes to the existing WAN infrastructure.

    Cisco has offered different types of security products and VPN technologies such as remote client access and site-to-site VPN for quite some time. The VPN technologies evolved over time into different solutions like EasyVPN and DMVPN. With the release of FlexVPN, Cisco has consolidated the VPN implementation types and their configuration syntax under a single architecture to simplify the solution management.

    Cisco FlexVPN incorporates the remote client access functionality of EasyVPN, Cisco VPN Client, and Cisco AnyConnect as well as site-to-site VPN functionality of point-to-point VPN or FlexVPN.

    The FAN DA solution does not require any remote client access functionality. The VPN tunnels between the FAR and HER devices are site-to-site VPN tunnels.

    The most predominant VPN designs are either FlexVPN or DMVPN. Cisco recommends FlexVPN for the FAN DA solution because its control plane is lightweight out of the box compared to DMVPN. FlexVPN designs are more appropriate for Cellular service deployment or when Cellular is used as a backup solution since usage is metered and utilities typically acquire a shared data plan among all FAN devices. By default, customers can use FlexVPN IKEv2 routing functionality to advertise Control Center routes during the tunnel establishment phase without needing to run a dynamic routing protocol over the VPN tunnel. This approach will conserve additional bandwidth used by the control plane of a routing protocol over the Cellular network.

    Another benefit of FlexVPN over DMVPN is that the VPN hub servers (the HER devices) can implement a point-to-point tunnel per remote FAR device, cloned from a virtual-template interface, instead of a DMVPN multipoint GRE interface. This enables customers to configure QoS policies per remote location, which is especially useful when remote sites have different transport services. In Private WAN implementations, customers might have a higher-bandwidth WAN service for critical or larger substations than for others. In the case of Cellular, a remote location could be farther from the cell tower; therefore, its link bandwidth and performance characteristics would differ from other locations. Per-tunnel QoS prevents the VPN hub from oversubscribing the remote site backhaul link.

    FlexVPN and DMVPN support both any-to-any and hub-and-spoke topologies.

    Figure 132 FlexVPN Network Topologies

     

    256122.jpg

    Design Guidance:

    Implementing a hub-and-spoke topology does not mean that Substation A devices or mesh devices can't talk to devices located at Substation B. Additional configuration needs to be applied to either filter the substation routes or only advertise specific Control Center networks. FlexVPN IKEv2 routing simplifies this configuration by associating specific routes per remote location spoke identity.

    Customers that have already implemented DMVPN can modify the design to leverage FlexVPN instead; this can be easily done since DMVPN is a subset of FlexVPN.

    Headend Infrastructure Layer

    The Cisco FAN Headend infrastructure can be added to the existing Control Center LAN infrastructure as a separate block or part of the existing network.

    The headend infrastructure requires a few elements like RA, TPS, or HER to be located in the Control Center WAN block in a DMZ area. The FAR devices will need reachability over the WAN transport infrastructure, not the overlay VPN, to the three elements in the DMZ during the FAR device provisioning process.

    In addition, the FAN solution requires integration with the existing Active Directory and certificate (PKI) infrastructure. Customers can also choose to build a dedicated environment.

    Figure 133 FAN DA Headend Integration

     

    256123.jpg

    Customers that have deployed the Cisco FAN AMI solution can re-use the AMI Headend infrastructure (HER, RA, TPS, and FND) for the DA solution, as long as the infrastructure is upgraded to the FAN DA software version required.

    For small-scale deployments, customers can install multiple components in a virtual environment like VMWare vSphere.

    Table 66 Headend Infrastructure FAN Components

    FAN Component
    Virtualization Support
    Device Type
    Documentation Link
    FND

    Yes, VMware ESXi

    --

    https://www.cisco.com/c/en/us/td/docs/routers/connectedgrid/iot_fnd/install/ova/installation_ova.html

    TPS

    Yes, VMware ESXi

    --

    --

    HER

    Yes, VMware ESXi, Microsoft Hyper-V, KVM

    CSRv1000

    https://www.cisco.com/c/en/us/td/docs/routers/csr1000/software/configuration/b_CSR1000v_Configuration_Guide.html

    RA

    Yes, VMware ESXi

    CSRv1000

    https://www.cisco.com/c/en/us/td/docs/routers/csr1000/software/configuration/b_CSR1000v_Configuration_Guide.html

    Firewall

    Yes, VMware ESXi

    ASAv

    https://www.cisco.com/c/en/us/td/docs/security/asa/asa93/asav/quick-start/asav-quick/asav-vmware.html

    PKI

    Yes, any

    --

    --

    IPAM

    Yes, VMware ESXi

    CNR

    https://www.cisco.com/c/en/us/td/docs/net_mgmt/network_registrar/7-2/installation/guide/CNR72Install/IGvirtualappliance.html

    RF Design Considerations

    This section starts with an overview of the ISM band, regulatory requirements, and the IEEE 802.15.4 Layer 1 and Layer 2 standards used by the Cisco Resilient Mesh solution. Readers familiar with these topics can skip the introduction and delve into Cisco Resilient Mesh Release Overview.

    It is followed by a detailed review of the Cisco Resilient Mesh, based on the new IEEE 802.15.4 OFDM Option 2 modulation for increased performance, to support the Distribution Automation use cases highlighted in Use Cases.

    ISM Band Overview

    Cisco Resilient Mesh uses the 900Mhz Industrial, Scientific, and Medical (ISM) radio band, which is an unlicensed frequency band for transmission between the FAN communication equipment.

    The ISM bands for North America use the following frequency ranges:

    1.blank.gif 902-928 MHz, (26 MHz block)

    2.blank.gif 2400-2483.5 MHz, (83.5 MHz block)

    The following regulatory compliance institutions define the rules for the spectrum usage accordingly to their country policies:

    1.blank.gif USA—Federal Communication Commission (FCC), documented in the Code of Federal Regulations (CFR) Title 47 in Parts 15.247 and Part 15.249

    2.blank.gif Canada—Industry Canada (IC), documented in RSS-Gen, RSP-100 and RSS-210

    3.blank.gif Mexico—NORMA Oficial Mexicana, documented in NOM-121-SCT1-2009

    The key takeaway from the FCC documents is that although the ISM band uses unlicensed spectrum, it is still regulated and controlled, and equipment vendors must adhere to these rules. Cisco products are certified and compliant with these FCC rules.

    Table 67 FCC Part 15.247 Requirements List

    Description
    Limits

    Frequency Hopping

    Required

    Lower Modulation Max Avg. Occupied Channel Interval in a 20 seconds period

    400 msec

    Lower Modulation Min. # of Hopping Channels

    50 channels

    Digital Modulation minimum channel bandwidth required

    500 kHz

    Digital Modulation Max Avg. Occupied Channel Interval in a 10 seconds period

    400 msec

    Digital Modulation Min. # of Hopping Channels

    25 channels

    Hybrid Systems with freq. hopping and digital modulation Max. Avg Occupied Channel Interval is equal to the number of channels * 0.4

    400 msec

    Conducted Transmit Power

    1 Watt (+30dBm)

    Effective Isotropic Radiated Power (EIRP)

    4 Watts (+36dBm)

    Maximum Antenna Gain

    6 dBi

    The Cisco Resilient Mesh falls under the hybrid systems category; therefore, the window period is the number of channels multiplied by 0.4 seconds, which is 31 * 0.4 = 12.4 seconds.
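
    The duty-cycle arithmetic above can be expressed directly. The following Python sketch reproduces the hybrid-system calculation (channels x 0.4 seconds) and checks a per-channel on-air time against the 400 msec limit; the channel count of 31 matches the deployment described in this guide.

```python
# Duty-cycle check for a hybrid (frequency hopping + digital modulation)
# system per FCC Part 15.247: the averaging window equals
# number_of_channels * 0.4 s, and the per-channel on-air time within
# that window must not exceed 0.4 s. Values mirror the text above.
CHANNELS = 31
MAX_DWELL_S = 0.4

window_s = CHANNELS * MAX_DWELL_S            # 31 * 0.4 = 12.4 s
max_duty_cycle = MAX_DWELL_S / window_s      # roughly 3.2% per channel

def compliant(on_air_time_s: float) -> bool:
    """True if the per-channel transmit time within one window is within the limit."""
    return on_air_time_s <= MAX_DWELL_S

print(window_s, f"{max_duty_cycle:.2%}", compliant(0.35))
```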

    The 900 MHz band is optimal for last mile connectivity. The combination of transmit power and wavelength results in good signal propagation, covering distances up to a mile or a mile and a half in environments with line of sight (LoS) and a low noise floor, while offering enough bandwidth to support Utility Distribution Applications at a much lower cost than fiber, Wi-Fi, Cellular, or other radio technologies.

    The use of narrow band channels allows receivers to demodulate the signal at lower RSSI levels. Together with the frequency hopping technique, which spreads the signal over the ISM band, this gives the technology an operational advantage in environments with higher noise floor levels and interference, allows it to co-exist with other systems operating within the ISM band, and supports deployment of dense areas of DA Gateways.

    Figure 134 FCC 900MHz ISM Narrow Band Spread Spectrum

     

    256124.jpg

    Reference Documentation:

    For the CFR Title 47 - Part 15.247 and Part 15.249, please refer to the following URL:

    blank.gif https://www.ecfr.gov/cgi-bin/text-idx?SID=836adae133a1c7714e83c4db3eec1b2c&mc=true&node=pt47.1.15&rgn=div5 - se47.1.15_1247

    blank.gif https://www.ecfr.gov/cgi-bin/text-idx?SID=836adae133a1c7714e83c4db3eec1b2c&mc=true&node=pt47.1.15&rgn=div5#se47.1.15_1249

    For the IC RSS-Gen, please refer to the following URL:

    blank.gif https://www.ic.gc.ca/eic/site/smt-gst.nsf/eng/sf08651.html

    blank.gif https://www.ic.gc.ca/eic/site/smt-gst.nsf/eng/sf01320.html

    For the NORMA, please refer to the following URL:

    blank.gif http://dof.gob.mx/nota_detalle.php?codigo=5147409&fecha=21/06/2010

    ISM Interference Considerations

    Radio networks are prone to interferences and frame retransmission due to the inherent nature of the medium through which the signal must travel.

    In general, interference can be classified in the two categories:

    blank.gifAdjacent Band Interference (Out of band)

    blank.gifIn-Band Interference

    In the case of adjacent band interference, neighboring communications systems near the ISM band in North America could lead to poor link performance within a specific area, especially when the systems are co-located. Even though the FCC has allocated guard bands to limit the RF leakage between bands, this does not compensate for improper customer equipment installation like the one shown in Figure 135.

    Figure 135 Co-Site Interference Example

     

    256333.jpg

    The most relevant neighboring systems to ISM band are:

    blank.gifCellular Band 5, 26 operating on 850Mhz with high transmit power: up to 100 Watts per channel

    blank.gifLand Mobile Radios (LMR) systems, which operate in the 851-870 MHz band using up to 35 Watts of transmit power

    Cellular Band 5 is used by Verizon for LTE and for legacy 2G/3G services. Verizon's LTE service primary band in USA is Band 13 (700 MHz). AT&T also uses this band primarily for some 3G (HSPA+) services and for LTE service where the primary bands are not available: Band 12 and 17 operating in 700 MHz range.

    Cellular Band 26 is used by Sprint 4G in some rural areas to boost coverage or within buildings.

    Figure 136 ISM Adjacent Frequency Bands Considerations

     

    256125.jpg

    Cisco Resilient Mesh products have built-in filters to reduce the RF leakage from adjacent bands that would otherwise desensitize the Cisco DA Gateway receiver. Based on the interfering system type and location, customers might need to install Cisco DA Gateways away from interference sources. If that is not possible due to limited installation locations, then customers can use external filters that are designed to reduce interference on a given frequency range. The filter is installed between the Cisco DA Gateway RF port and the antenna.

    It is highly recommended that customers engage a professional RF company to determine which external filters would be required to resolve the external interference.

    There are two types of In-band interferences. The first one is Co-Channel interference, where two signals (desired and unwanted signal) exist within the same channel at the same time, but with different RSSI levels. This interference can usually be mitigated by relocating the receiving station away from the interferer or closer to the transmitting station to create enough signal level separation so that receiver can demodulate the desired signal.

    Figure 137 Co-Channel Interference

     

    256382.jpg

    Design Guidance:

    For the Cisco Resilient Mesh OFDM 800 kbps (Phymode 149) data rate, the delta between the desired signal and the unwanted signal should be at least 15 dB.

    The second type of in-band interference is Adjacent Channel Interference, which occurs when the desired and unwanted signals are in different adjacent channels but, because of their close proximity, the two signals interfere with each other.

    Figure 138 Adjacent Channel Interference

     

    256383.jpg

    The Co-Channel and Adjacent Channel interferences can be separated into two sub-categories based on system ownership. External in-band interference is caused by other systems operating in the same ISM band in the vicinity of your system, over which you have no authority. Internal or self-interference is caused by other devices within a customer system that are co-located in the same area. Since the ISM band is license free, customers should expect in-band interference in the field.

    IEEE 802.15.4, because of its robust design that leverages spread spectrum with channel hopping schema, allows multiple systems operating in the ISM band to co-exist as long as equipment manufacturers properly respect the FCC regulations and IEEE 802.15.4 standard and the customer follows common RF best practices for:

    blank.gifPhysical separation between transmitters

    blank.gifSystem traffic load design in terms of node density within an area and transmitter duty-cycle utilization

    Carrier Sense Multiple Access with Collision Avoidance (CSMA/CA) helps prevent frame corruption due to in-band interference, but when a collision does occur, the IEEE 802.15.4 MAC layer reliability services ensure that lost frames are retransmitted so that the utility applications are not affected.

    The more challenging scenario is when external systems are proprietary and do not adhere to industry standards that implement carrier sensing, or simply use a high duty-cycle transmission plan within the FCC limits. Some Fault Indicator systems are an example of such an implementation.

    Design Guidance:

    Customers should take a more holistic approach to deploying Smart Grid technologies by evaluating vendors and their products for interoperability and co-existence.

    Take into consideration the following external systems that operate within ISM band during the RF planning phase:

    blank.gifRadio location Services

    blank.gifSmart City Lighting

    blank.gifOther 900 MHz DA systems

    blank.gifAdvanced Metering Infrastructure (AMI)

    blank.gifFault Location Indicators Sensors

    blank.gifBaby Monitors

    blank.gifCordless Phones

    blank.gifGarage Door Openers

    blank.gifHome Automation: ZigBee

    Self-interference can be mitigated by a proper RF design since the customers own the infrastructure and have full control over the system, installation location, antenna types, etc.

    Recommendation:

    It is essential that customers work with a professional RF company, perform a proper site survey of the area, and model the RF signal propagation using advanced software designed for this type of task.

    PHY and MAC Layers (IEEE 802.15.4g/e) Standard Overview

    In 2011, the IEEE developed the 802.15.4 technical standard as a framework for low cost, low rate, and low power Wireless Personal Area Network (WPAN) for the Physical Layer (Layer 1) and the MAC or Data Link Layer (Layer 2). It is based on the unlicensed spectrum ISM bands.

    The main goals used to develop the standard were:

    1.blank.gif Thousands of devices in a small area

    2.blank.gif Low power wireless

    3.blank.gif Limited Range

    4.blank.gif Multi-Hop for communication beyond the device range

    5.blank.gif Self-Organization with dynamic routing

    6.blank.gif Coexist with other systems operating in the same unlicensed frequency

    In 2012, the IEEE 802.15.4g amendment was developed to add outdoor physical layer data rates and modulations for Smart Utility Networks (SUN) and to modify the MAC layer to support the new formats:

    1.blank.gif Multi-Rate and Multi-Regional (MR) Frequency Shift Keying (FSK)

    2.blank.gif MR Orthogonal Frequency-Division Multiplexing (OFDM)

    3.blank.gif MR Offset Quadrature Phase Shift Keying (QPSK)

    As part of the standard, the IEEE has defined multiple options for OFDM modulation. Cisco has implemented the OFDM Option 2 profile as part of the Resilient Mesh release to increase the physical link data rates.

    Note: Currently, only the 1200, 800, 400, and 200 kbps data rates are supported. The remaining data rates will be supported in the future through software updates.

    The standards define a general frame format for all packet transmissions. The frame format covers both the PHY and MAC layers and consists of three parts:

    blank.gifHeader

    blank.gifData portion

    blank.gifFooter

    Figure 139 IEEE 802.15.4 General Frame Format

     

    256127.jpg

    The PHY layer uses a Synchronization Header and a Physical Header. The Synchronization Header has a Preamble field that is used for RF symbol synchronization between the sender and receiver so that the receiver can correctly receive the following frame. The Start of Frame Delimiter (SFD) indicates to the receiver the end of the Preamble field and the beginning of the frame payload, or PHY Service Data Unit (PSDU).

    Initially, the maximum PSDU size was 127 bytes, but it was later increased to 2047 bytes.

    Figure 140 PHY Layer Frame Fields

     

    256128.jpg

    Note: The SHR and PHR are specific to each PHY mode of operations and can be found in Clause 6 of the standard. Please refer to the following URL:

    blank.gif https://standards.ieee.org/standard/802_15_4-2006.html

    Reference Documentation:

    For the IEEE 802.15.4g Standard, please refer to the following URL:

    blank.gif https://standards.ieee.org/standard/802_15_4-2011.html

    At the MAC layer, the IEEE 802.15.4e standard defines the following fields within the frame (MPDU): the MAC Header (MHR), MAC Payload, and MAC Footer.

    Figure 141 MAC Frame Format

     

    256129.jpg

    The four types of MAC frames are:

    blank.gifEnhanced Beacon (EB) Frame

    blank.gifData Frame

    blank.gifAcknowledgment Frame (ACK)

    blank.gifMAC Command Frame

    Note: The MAC Payload is not used for ACK frames.

    The MHR and the MFR are common to all MAC frames. The MHR has several subfields that are described in the standard; the ones of interest are the Frame Control subfields, which contain flags that tell the receiver how to interpret the rest of the header fields and whether the frame needs to be acknowledged.

    The Security Field indicates if the MAC frame contains the optional Security Header Field and if the MAC payload is protected.

    PAN ID Compression is used to indicate whether both the Source and Destination PAN ID fields must be present in the Address Field.

    The Destination and Source Mode fields specify whether the PAN ID is present within the Address Field and whether the Destination or Source address is in the native 64-bit format or the compressed 16-bit format.

    The Sequence Number field is used to track acknowledgments for data packets. The Address Field is followed by the Auxiliary Security Header, which is present only when the Security Enabled field is set to 1.

    Figure 142 MHR Fields

     

    256130.jpg

    The Address Field varies in size, based on the flags set in the Frame Control subfields.

    At the end of the MAC frame, the MAC Footer contains a Frame Check Sequence field that is calculated over the MHR and MAC Payload to detect frame corruption.

    The MAC Payload follows the MHR; its structure and size vary based on the MAC frame type, as briefly described below. If security is enabled, the payload is protected through encryption.

    Enhanced Beacon Frames

    Enhanced Beacon (EB) messages are used by the Cisco DA devices to disseminate useful information regarding the PAN to which they belong so that other devices that are looking to join a PAN can discover the available networks in the vicinity. Joining nodes are nodes that have not yet been granted access to the PAN. As such, joining nodes cannot communicate IPv6 datagrams with neighboring devices. The EB message is the only message sent in the clear that can provide useful information to joining nodes to select the best PAN and best neighbor within a PAN. The following information is sent in the EB frame:

    blank.gifSSID, which is used as a filter so new devices can avoid joining foreign networks.

    blank.gifGTK info: Includes the GTK ID and a SHA256 key hash. Mesh nodes use it during the join process to check whether they already have the GTK. This IE is also used when the GTK is renewed by the FAR. Each node can store up to four keys per PAN and keys for up to two different PANs.

    blank.gifNetwork Info.

    blank.gifPAN size: number of RPL nodes. Value only updated by the FAR/RPL root.

    blank.gifPath cost to the root: RPL Rank.

    blank.gifUnicast/listening Schedule: Used to implement the channel-hopping algorithm.

    Joining devices also use the RSSI value of the received EB message to determine if a neighbor is likely to provide a good link. The transceiver hardware provides the RSSI value of each frame. Neighbors that have an RSSI value below the minimum threshold during the course of receiving EB messages are not considered for PAN access requests.

    Figure 143 MAC Beacon Frame Fields

     

    256131.jpg

    MAC Command Frames:

    Nodes use MAC command frames to:

    blank.gifAssociation request and response

    blank.gifDisassociate with a particular network

    blank.gifRequest Beacon frames, GTS info

    blank.gifSend notification for PAN ID conflict or Orphan scenarios

    The Command Type field specifies the command type that is found in the Command Payload field. The command types include Association Request and Association Response, which can be found in the Standard Reference Documentation.

    Figure 144 MAC Command Frame Fields

     

    256132.jpg

    Data Frames

    Data frames can be sent as unicast or broadcast with a retransmission mechanism. If an ACK is not received, the node will try to retransmit the frame up to eight times at the MAC layer. If the packet is sent upstream, the node will also try to send it to a backup parent (retransmission at the network layer) up to eight times.

    Since version 6.0-19, the number of retransmissions is based on the QoS classification of the packet: packets with a higher Assured Forwarding (AF) class and low drop probability markings are retransmitted more times than packets with lower AF classes.

    Both types of packets always carry an IE describing the unicast listening schedule of the sender.

    Figure 145 MAC Data Frame Fields

     

    256133.jpg

    ACK Frames

    In the current implementation, any data packet must be acknowledged. Receiving an ACK packet will help to compute the ETX value for the link (used by RPL) as well as the forward RSSI for the sending node. The ACK packet is encrypted like any other Data frame.

    Figure 146 MAC ACK Frame Fields

     

    256134.jpg

    Reference Documentation

    For the IEEE 802.15.4e Standard, please refer to the following URL:

    blank.gif https://standards.ieee.org/standard/802_15_4-2011.html

    Around the same time, in 2012, the Wireless Smart Utility Networks (Wi-SUN) Alliance was formed to drive industry adoption of IEEE 802.15.4g, develop industry market requirements (MRD) and Technical Profile Specifications (TPS), and establish a certification program for multi-vendor equipment interoperability testing.

    Cisco has been a member of the Wi-SUN Alliance from the beginning and is a strong promoter of open standards and multi-vendor interoperability.

    At the time of this writing, Cisco was going through the Wi-SUN 1.0 certification process.

    Reference Documentation

    For Wi-SUN Alliance, please refer to the following URL:

    blank.gif https://www.wi-sun.org/

    Cisco Resilient Mesh Release Overview

    Cisco Resilient Mesh is the second-generation outdoor wireless solution based on the ISM band (900 MHz), targeted for the Americas region. It is based on the newly launched DA Gateway product, the IR510, the range extender IR530, and the CGR WPAN OFDM module. It leverages a new software release (CG-Mesh 6.0.19) that includes enhanced features like Adaptive Modulation and Adaptive Data Rate and new capabilities like Edge Compute to support Utility Customers' Distribution Automation use cases.

    General RF Design Considerations

    Each outdoor wireless mesh deployment is unique since each environment has its own local radio characteristics based on the equipment installation locations, obstructions, noise floor, and external interferences.

    Typically, the following factors have a major influence over the RF design:

    blank.gifGeographical location, terrain type: mountains, hills or plains

    blank.gifDemographics: urban, sub-urban or rural areas

    blank.gifClimate: tropical, polar, dry, temperate, etc

    blank.gifUtility profile: service territory and number of grid assets

    Figure 147 Geographical Locations and Demographics

     

    256301.jpg

    Mountainous or hilly areas typically affect the devices' LoS and the free space required between nodes for a good, reliable link with a low packet error rate (PER). Two adjacent devices could be at different elevations, which will influence how antennas are installed. Tree branches and leaves could obstruct part of the path between nodes, which leads to variation in the RSSI level between the nodes and requires good fading margins for the links to remain stable. Flat areas have more open space and allow for better signal propagation, resulting in greater link distances. In desert areas, foliage typically is not a problem, but differences in temperature between day and night will result in a couple of dB of change in the RSSI values.

    The demographics introduce different RF challenges. In urban areas, much more noise and interference exists because of the multitude of other RF systems and commercial businesses that have elements that produce noise. An RF design for an urban area will look different than a design for a rural area because the node density per area is different. In urban areas, utility assets are closer together (between a half mile and one mile apart), whereas in rural areas assets could be more than two miles apart, therefore requiring additional range extenders to extend the signal coverage. Urban areas are also more prone to interference due to the clutter diversity (such as airports and hospitals) that exists within the metropolitan area.

    Figure 148 Urban Clutter Diversity

     

    256302.jpg

    The utility service territory and the number of grid assets that need to be connected affect the number of mesh devices the solution requires. A large utility most likely will have a presence in different demographic regions. In urban areas, the Distribution Automation feeder length is only a couple of miles, since the substation density per area is greater than in rural areas. This allows the design to have multiple take out points. The PAN coverage area will be small, with a higher node density, allowing for higher capacity. The design will only use range extenders in areas where coverage is poor due to obstacles or RF interference. In sub-urban areas, the substations will be more spread out, with longer feeder lengths of up to 4 to 5 miles. This design will require additional range extenders to increase the coverage area and to increase the redundancy of the PAN by designing additional paths between the nodes.

    Rural deployments will cover long feeders of up to 10 miles, which will leverage a larger number of range extenders than urban areas. This type of design will lack redundancy since it has a daisy-chain topology. Customers should plan additional redundant paths using range extenders.

    Medium and small utilities will typically have only one or two RF designs covering urban and/or rural areas, with a lower number of substations and grid assets to connect. For designs with a small number of substations and FAR routers, in order to keep the mesh depth to a reasonable 4 to 5 hops, customers can use range extenders to aggregate nodes (thereby reducing the number of hops between nodes) or install additional FAR routers that use cellular for backhaul connectivity.

    Communication Equipment Placement

    In this design guide, the FARs or CGRs are assumed to be installed within the utility substation premise and to leverage the substation WAN for backhaul transport. This will keep the network operation costs down by reusing the existing WAN infrastructure. The solution does not require any changes to the substation WAN since it's using an overlay VPN design that makes the Cisco Resilient Mesh design agnostic to the WAN design.

    Utilities typically already have a radio communication infrastructure within the substations: RF towers or wooden poles where the CGR routers can be installed. The CGR is connected using fiber back to the substation communication hut, where it connects to the substation LAN or WAN router.

    Figure 149 Substation Communication Tower

     

    256303.jpg

    For urban substations that are surrounded by tall residential or commercial buildings that obstruct the CGR signal propagation, customers may have to install one or more CGR routers outside the substation yard. For that, customers can extend the fiber connectivity from the substation onto the feeder and install the CGR on the feeder itself. If the fiber extension is not an option, then customers can leverage Wi-Fi point-to-point or mesh links to extend the substation connectivity.

    The DA Gateway (IR510) is typically installed on the feeder overhead infrastructure, inside the grid device controller enclosure that is mounted on the pole.

    Figure 150 IR510 Overhead Line Installation

     

    256304.jpg

    However, some instances exist in the urban or residential area where the utility uses an underground distribution infrastructure. In that case, the communication equipment is installed on pad mounts or within manholes.

    Figure 151 IR510 Underground Line Installation

     

    256305.jpg

    These locations are fixed and are part of the Feeder Automation planning. Together with the substation locations, they are used as the initial data source for the RF design development. Customers will need to provide each location's GIS coordinates and the type of grid asset that will be connected. RF engineers will use professional tools to model the first RF design and determine if additional coverage is required. If range extenders (IR530) are required, then customers also need to provide the potential infrastructure asset list (pole IDs and locations where the additional communication equipment can be installed). The RF engineering team will rework the RF design model to take these additional locations into consideration.

    Figure 152 IR530 Street Lighting Pole Installation

     

    256306.jpg

    Equipment Antenna Considerations

    Cisco provides customers with a variety of antenna options that were specifically designed for this solution. Because the radio network is a mesh network, a node must communicate with more than two nodes, typically in opposite directions. This requirement leads to installations that use omni-directional antennas, unless a node is a leaf-node. In that case, customers can use a directional antenna.

    The following antennas are available for use:

    256386.jpg  

    256385.jpg  

    256387.jpg  

    Omni Low Gain +1.5 dBi (ANT-WPAN-OD-OUT-N)
    CGR1240 Omni +2.3 dBi Gain (ANT-MP2-I-OUT-M)
    Omni +5 dBi Gain (ANT-LPWA-DB-O-N-5)

    256388.jpg  

    256389.jpg  

    Omni +4 dBi Gain Vandal and impact resistant (IP67)
    Yagi +9 dBi Gain (ANT-WPAN-Y-OUT-N)

    Reference Documentation

    For the Cisco Omni Low Gain (ANT-WPAN-OD-OUT-N) Datasheet, please refer to the following URL:

    blank.gif http://www.cisco.com/c/en/us/td/docs/routers/connectedgrid/antennas/installing/cg_antenna_install_guide/ANT-WPAN-OD-OUT-N.html

    For the Cisco CGR 1240 Omni 2.3 dBi Gain (ANT-MP2-I-OUT-M) Datasheet, please refer to the following URL:

    blank.gif http://www.cisco.com/c/en/us/td/docs/routers/connectedgrid/antennas/installing/cg_antenna_install_guide/ANT-MP2-I-OUT-M-ANT-MP2-I-O-SS-M.html

    For the Cisco Omni 5 dBi Gain (ANT-LPWA-DB-O-N-5) Datasheet, please refer to the following URL:

    blank.gif https://www.cisco.com/c/en/us/td/docs/routers/connectedgrid/antennas/installing-combined/industrial-routers-and-industrial-wireless-antenna-guide/ANT-LPWA-DB-O-N-5.pdf

    For the Cisco Yagi 9 dBi Gain (ANT-WPAN-Y-OUT-N) Datasheet, please refer to the following URL:

    blank.gif http://www.cisco.com/c/en/us/td/docs/routers/connectedgrid/antennas/installing/cg_antenna_install_guide/ANT-WPAN-Y-OUT-N.html

    RF Design Development Process

    Developing a good RF design can be a complex process for engineering teams that lack the RF expertise and can lead to failed implementations. For that reason, it is critical that customers acquire the proper resources to assist with the RF design development. There is no good substitute for RF field experience and practice.

    Design Guidance:

    In order to successfully develop an RF design and deploy the Cisco Resilient Mesh solution, customers should engage a Cisco partner or Cisco Services to assist with the RF design phase for their environment.

    The RF design phase should encompass the following process steps:

    1.blank.gif Develop Utility application model

    2.blank.gif Initial Field site survey samples

    3.blank.gif Develop predictive RF Model using software tools

    4.blank.gif Validate RF Model in the Field

    Figure 153 RF Design Development Process

     

    256307.jpg

    In the Application Model step, customers should determine the application communication requirements based on the different use cases that the network must support. A device communication traffic profile should be developed for each grid device type (such as capacitor bank controller and recloser), identifying the bandwidth and latency required during steady state and normal grid operations, as well as the burst rate for the worst-case scenario when many events are happening in the electrical grid.

    With this information, a feeder traffic profile can be developed based on the feeder's device types and quantities. This profile is then used with the RF modeling output to determine the network mesh depth and the number of take out points (CGR routers) to ensure that the network supports the DA application traffic profiles.
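
    A feeder traffic profile is essentially the sum of the per-device-type profiles. The following Python sketch shows the arithmetic; the device counts and per-device steady-state and burst rates are hypothetical placeholders that would be replaced with the values gathered during the Application Model step.

```python
# Illustrative feeder traffic profile: aggregate per-device-type polling
# traffic into a per-feeder load. Device counts and per-device rates are
# hypothetical placeholders, not measured values.
feeder_devices = {
    # device type: (count on feeder, avg steady-state bps, burst bps)
    "capacitor_bank_controller": (6, 200, 2_000),
    "recloser_controller": (4, 300, 5_000),
    "voltage_sensor": (10, 100, 1_000),
}

steady_bps = sum(count * avg for count, avg, _ in feeder_devices.values())
burst_bps = sum(count * peak for count, _, peak in feeder_devices.values())
print(f"steady-state approx {steady_bps} bps, worst-case burst approx {burst_bps} bps")
```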

    The purpose of the Initial Field survey is to collect samples of the RF characteristics (noise floor readings, in-band, and external interference from a few locations) that would be used as input in the predictive RF Modeling tool so that the output model will be more accurate. In addition, it will allow for the discovery of unknowns since the RF modeling tools don't use real-time data and eliminate any surprises later in the deployment.

    Using advanced RF modeling tools, based on the following set of input data, a signal coverage map will be generated that depicts potential RF links between devices that meets the desired signal RSSI criteria:

    blank.gifGIS coordinates of the utility assets where the communication equipment can be installed

    blank.gifEquipment installation height and antenna type

    blank.gifInitial Field site survey data

    blank.gifDesired Link RSSI

    blank.gifDesired Link Modulation and Data Rate

    Figure 154 RF Heat-map with Potential Links

     

    256308.jpg

    An RF engineer would be able to identify the following:

    blank.gifSignal coverage gaps; isolated nodes

    blank.gifSingle points of failure; nodes with a single parent

    blank.gifOversubscribed nodes; parents with large number of children

    blank.gifLong paths; nodes exceeding normal hop count

    The engineer could then refine the RF Model by adding additional range extenders or CGR routers to increase the number of take out points.

    Based on the RF Model output, a second Field Site Survey is required to validate the predicted RF links and gather additional RF samples from other locations. The RF team should baseline the candidate links identified by the RF Model in terms of throughput, delay, and packet loss. The link test should be done as close as possible to the equipment final installation point.

    Using the results collected during this phase, the RF team can determine if the RF Model is good enough or if further refinements are required. Typically, it takes one to two iterations before finalizing the RF Model.

    Once the equipment is deployed in the field, the customer can use the Cisco Resilient FND management tools to monitor the link health over time between the DA gateways. The FND allows customers to define the link ETX range to categorize links as Good, Poor, and Fair.

    Network RF Segmentation

    The Cisco Resilient Mesh uses the concepts of an SSID and a PAN ID. The SSID is a unique global name that represents the utility network. Within the utility network SSID, each coordinator or CGR router forms a PAN that runs its own RPL instance. Mesh nodes must be configured with the SSID information and, based on the available PANs, nodes will join a specific PAN. In the case where the CGR routers are installed within the substation premises, each substation and its neighborhood area will have one PAN ID. This logical configuration allows nodes to perform Inter-PAN migration as long as they can hear nodes in adjacent PANs and can establish a stable link. In addition, nodes can be moved from one substation area to another if customers would like to redeploy a node.

    Note: Nodes will cache information for a maximum of two PANs. When redeploying a node, the node will first try to join the PANs for which it has cached information so joining a new PAN might take longer. It is recommended to reprogram the node configuration file so that the cached PAN info is erased.

    Figure 155 One PAN per Substation

     

    256309.jpg

    Capacity Planning

    The Cisco Resilient Mesh products' radio interfaces are half-duplex, meaning that communication between two devices is unidirectional at any given moment: sending and receiving data do not occur at the same time.

    Figure 156 Half-Duplex Communication

     

    256384.jpg

    Because of this mode of operation, the end-to-end application throughput is less than the physical data rate. If the application traffic flow needs to travel multiple hops, the throughput keeps decreasing at each hop. It is important that network administrators design the network depth to a reasonable number of hops in order to achieve the desired level of service. The Cisco recommendation is to keep the mesh depth to no more than 4 to 5 hops for DA applications.
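
    As a first-order way to reason about mesh depth, the sketch below divides a single-link goodput evenly across the hops a flow must traverse. This is a deliberately simple model that assumes the hops contend for the same airtime; real goodput also depends on MAC overhead, retransmissions, and per-link data rates, so treat the numbers as illustrative only.

```python
# Rough end-to-end goodput estimate over a half-duplex multi-hop path:
# each relay must receive and then re-transmit the same frame, so the
# single-link goodput is divided across the hops. Illustrative only.
def end_to_end_goodput(link_goodput_bps: float, hops: int) -> float:
    return link_goodput_bps / max(hops, 1)

single_link_bps = 500_000  # assumed single-link goodput after overhead
for hops in range(1, 6):
    print(hops, f"{end_to_end_goodput(single_link_bps, hops) / 1000:.0f} kbps")
```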

    Typically, a mesh network is designed for the highest link data rate between each pair of nodes, since the same link could carry traffic from other downstream nodes (unless that link connects a leaf node) and since the mesh topology is dynamic and can change due to changes in RF conditions. For example, a leaf node could become a parent; its link would then also be an uplink.

    Figure 157 Capacity Planning

     

    256311.jpg

    During the network capacity planning phase, network designers should not plan to load the network at more than 20% capacity during steady state. Each aggregation node should not transport more than 20% of the available goodput bandwidth determined during the RF Site Survey phase, so that the node is not oversubscribed. This approach allows the network to operate during peak times, especially when many events are happening in the grid or when the RF environment is experiencing poor performance and the mesh nodes need to perform packet retransmissions.

    Design Recommendation:

    Always design the first links between the CGR router and the first rank nodes to operate at the highest data rate possible and plan for at least 10 to 15 dB fading margin so that the links will remain stable during RSSI changes. The first links will have the heaviest utilization in the PAN since they are the trunk of the RPL tree.

    Figure 158 Design First Links for Highest Data Rate

     

    256312.jpg
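
    A simple link-budget check helps verify the 10 to 15 dB fading margin recommendation for the first links. The Python sketch below uses the FCC conducted-power limit from Table 67 and the 5 dBi omni antenna option; the path loss and receiver sensitivity values are assumed inputs that would come from the RF modeling tool or field measurements, not from this guide.

```python
# Link-budget / fading-margin check for a first-hop link.
# Path loss and receiver sensitivity are assumed example inputs.
def rx_rssi_dbm(tx_power_dbm: float, tx_gain_dbi: float,
                rx_gain_dbi: float, path_loss_db: float) -> float:
    """Received signal level from a simple free-space style budget."""
    return tx_power_dbm + tx_gain_dbi + rx_gain_dbi - path_loss_db

def fading_margin_db(rssi_dbm: float, sensitivity_dbm: float) -> float:
    """Margin between the received level and the receiver sensitivity."""
    return rssi_dbm - sensitivity_dbm

rssi = rx_rssi_dbm(tx_power_dbm=30, tx_gain_dbi=5, rx_gain_dbi=5, path_loss_db=120)
margin = fading_margin_db(rssi, sensitivity_dbm=-95)  # sensitivity is an assumption
print(rssi, margin, margin >= 15)  # target at least 10 to 15 dB per the guidance
```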

    Following this recommendation, network administrators can determine the ratio of children per parent to ensure that there is no oversubscription and that the network has enough remaining capacity to operate during critical electrical grid events. Figure 159 is an example of how to determine the ratio, assuming constant ingress traffic into the mesh from grid devices or downstream nodes. If the ingress traffic from all downstream nodes is not constant, then the children-per-parent ratio can be based on a statistical model that allows more children per parent, or each child could send ingress traffic into the mesh at a higher rate.

    Figure 159 Parent/Children Ratio Example

     

    256313.jpg
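The following minimal Python sketch illustrates the same calculation under the assumption of constant per-child ingress traffic. The goodput and ingress values are hypothetical examples; the 20% steady-state budget comes from the capacity planning guidance above.

# Illustrative sketch: size the children-per-parent ratio so that steady-state
# load on an aggregation node stays within ~20% of the goodput measured during
# the RF site survey. All input values are hypothetical examples.

def max_children(parent_goodput_kbps: float,
                 per_child_ingress_kbps: float,
                 steady_state_fraction: float = 0.20) -> int:
    """Largest number of children whose combined constant ingress traffic
    keeps the parent's uplink at or below the steady-state budget."""
    budget_kbps = parent_goodput_kbps * steady_state_fraction
    return int(budget_kbps // per_child_ingress_kbps)

# Example: a parent uplink with 450 kbps of measured goodput and children that
# each push a constant 10 kbps of SCADA traffic -> 90 kbps budget -> 9 children.
print(max_children(parent_goodput_kbps=450, per_child_ingress_kbps=10))  # 9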

    Channel Hopping Schema

Cisco Resilient Mesh nodes follow a global broadcast channel schedule that is randomly generated by the PAN coordinator, the CGR router. Mesh nodes learn the global broadcast channel schedule and synchronize their local clock from the Information Element (IE) frames.

Each node also randomly generates a local unicast channel schedule that complies with the FCC regulations and the IEEE standard described in ISM Band Overview. A node follows this schedule, listening on a specific channel for a given period of time (125 milliseconds) before hopping to another channel or to the broadcast channel. Adjacent nodes learn each other's unicast channel schedules and therefore know when, and on which channel, they can transmit data to a given node. This ensures that a node cannot transmit for more than 400 msec on any given channel within a 20-second period.

The dwell window and max-dwell parameters under the WPAN interface set the transmit limits in order to comply with the regional ISM band regulations. The dwell window defines the measurement period and the maximum dwell time represents the maximum transmit duration within that window. For the US, the Cisco Resilient Mesh dwell window period is the number of channels (31) multiplied by 0.4 seconds, which totals 12.4 seconds, and the max-dwell time is 0.4 seconds.

Because each node has a different unicast channel schedule, multiple nodes can co-exist within the same area and receive data at the same time (multiple simultaneous transmissions), provided the channels are not adjacent and, at the receiving node, the desired signal is at least 15 dB stronger than the undesired signal.

    Figure 160 Channel Hopping Schedules

     

    256314.jpg

Note: The transmit function has priority over the unicast listening schedule; therefore, when a node has data to transmit, it can skip its listening state until it reaches the FCC transmit limit. When a node skips the listening state, adjacent nodes have no way of knowing that the node is no longer listening and may attempt to send data on the channel the node was supposed to be listening on. This scenario leads to frame retransmissions until the targeted node switches back to its unicast listening schedule.

Channel notching is a feature that lets customers exclude certain channels from the unicast scheduler. This feature is useful when some channels perform poorly due to heavy utilization by nearby systems or interfering devices. The feature is configured per neighbor, on both sides of a link.

    Figure 161 Channel Notching List Feature

     

    256315.jpg

    Note: Based on FCC regulation, a node must be configured with at least 25 channels.
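As a planning aid (not the device configuration syntax), the short Python sketch below checks that a proposed notch list still leaves at least 25 usable channels out of the 31-channel OFDM plan; the notched channel numbers are hypothetical examples.

# Illustrative sketch: verify that a proposed channel-notching list still leaves
# at least 25 usable channels (per the FCC note above) out of the 31-channel
# OFDM plan.

OFDM_CHANNELS = set(range(31))          # channels 0-30
MIN_CHANNELS_REQUIRED = 25

def usable_channels(notched: set) -> set:
    return OFDM_CHANNELS - notched

def notch_list_is_valid(notched: set) -> bool:
    return len(usable_channels(notched)) >= MIN_CHANNELS_REQUIRED

# Example: notching 4 noisy channels is fine; notching 8 would violate the rule.
print(notch_list_is_valid({3, 7, 12, 20}))                     # True
print(notch_list_is_valid({3, 7, 12, 20, 21, 22, 23, 24}))     # False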

    Adaptive Data Rate and Adaptive Modulation Considerations

Cisco Resilient Mesh brought a unique, innovative solution to market that customers can leverage for their deployments. Since then, the mesh software has been enhanced so that each mesh node can be configured with more than a single, fixed data rate. Network administrators can plan for the network to support up to four data rates; if the RF link characteristics deteriorate, the software dynamically changes the link to a lower data rate, which has a better RSSI margin, in order to maintain connectivity with a neighbor node. The reverse is also true: if RF conditions improve, the software dynamically increases the link data rate.

This automatic process of down-shifting or up-shifting the data rate within a given modulation, such as OFDM, is called Adaptive Data Rate (ADR). It operates per packet: every packet carries the OFDM MCS data rate in its physical header, and the device interface driver can decode frames at different data rates. Because the RF characteristics of the link between two nodes are similar in both directions, the nodes will use the same data rate in both directions, so adaptive data rate can be thought of as a per-link (per-neighbor) feature.

    At a high level, the ADR feature uses the average link RSSI level and average packet loss rate as thresholds to determine when to change the data rate.

    Figure 162 Adaptive Rate Feature

     

    256316.jpg

    Table 68 OFDM Adaptive Rate RSSI Thresholds for Changing Data Rates

Modulation | PHY Mode | MCS Index | Data Rate (kbps) | Received Sensitivity (dBm) @10% PER | Down-shift RSSI Threshold (dBm) | Up-shift RSSI Threshold (dBm)
OFDM       | 150      | MCS6      | 1200             | -95                                 | -90                             | -85
OFDM       | 149      | MCS5      | 800              | -101                                | -96                             | -91
OFDM       | 147      | MCS3      | 400              | -102                                | -97                             | -92
OFDM       | 146      | MCS2      | 200              | -104                                | -99                             | -94
OFDM       | 144      | MCS0      | 50               | -112                                | -107                            | -102
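One plausible reading of these thresholds is sketched below in Python: the node down-shifts when the averaged RSSI falls below the current rate's down-shift threshold (or packet loss is excessive) and up-shifts when the averaged RSSI exceeds the current rate's up-shift threshold. The averaging windows and the packet-loss threshold are not published in this guide, so the loss value used here is a hypothetical parameter; this is an illustrative sketch, not the actual firmware algorithm.

# Illustrative ADR-style decision using the per-rate RSSI thresholds from Table 68.
# (data_rate_kbps, down_shift_rssi_dbm, up_shift_rssi_dbm), fastest rate first.
OFDM_RATES = [
    (1200, -90, -85),
    (800,  -96, -91),
    (400,  -97, -92),
    (200,  -99, -94),
    (50,  -107, -102),
]

def adr_next_rate(current_kbps: int, avg_rssi_dbm: float,
                  avg_loss: float, loss_threshold: float = 0.10) -> int:
    idx = [rate for rate, _, _ in OFDM_RATES].index(current_kbps)
    _, down_thr, up_thr = OFDM_RATES[idx]
    # Down-shift when the link degrades (weak RSSI or excessive packet loss).
    if (avg_rssi_dbm < down_thr or avg_loss > loss_threshold) and idx < len(OFDM_RATES) - 1:
        return OFDM_RATES[idx + 1][0]
    # Up-shift when the link comfortably clears the current rate's up-shift threshold.
    if avg_rssi_dbm >= up_thr and avg_loss <= loss_threshold and idx > 0:
        return OFDM_RATES[idx - 1][0]
    return current_kbps

print(adr_next_rate(800, avg_rssi_dbm=-98, avg_loss=0.02))   # 400 (down-shift)
print(adr_next_rate(800, avg_rssi_dbm=-88, avg_loss=0.01))   # 1200 (up-shift)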

     

This feature can come in handy for deployments that lack the initial node density and when customers are looking to quickly deploy DA applications in specific parts of an area. The initial RF network design can be done based on a lower data rate, for example 200 kbps, which allows greater link distances between nodes. This gives customers deployment agility to address feeders with poor performance first. Over time, the area will have enough density for the RF design to be changed so that the links operate at a higher data rate, and therefore higher capacity.

256317.jpg (Initial Deployment based on PHY Mode 146, 200 kbps)

DA-FA-DG-196.jpg (Final Design based on PHY Mode 149, 800 kbps)

Adaptive data rate can also help during a temporary physical obstruction between two nodes; for example, a delivery truck parked between them. The software will down-shift to a lower data rate in order to maintain connectivity, trading available performance for network availability. If the affected link is also an uplink, the nodes will experience congestion, but QoS ensures that priority traffic is sent ahead of non-critical traffic.

    Figure 163 Normal Conditions and Temporary Path Obstruction

     

    256319.jpg

Another scenario where customers can benefit from ADR is with leaf nodes that are connected to grid assets (for example, end-of-line meters) that do not require fast data rates. During the RF design phase, customers can engineer these links to operate at a lower data rate.

Figure 164 Lower Data Rate for Leaf Nodes

     

    256320.jpg

    Caution:

Customers should avoid mixing link data rates on aggregation nodes in order to avoid performance issues. When an aggregation node has some children operating at an 800 kbps data rate and others at 200 kbps, the faster children can get stuck behind longer transmissions from the slower children.

The Cisco Resilient Mesh also supports Adaptive Modulation, which allows devices to change a link's modulation from OFDM to FSK and vice versa.

    Figure 165 Adaptive Modulation Feature

     

    256321.jpg

This allows Cisco Resilient Mesh to be backward compatible with existing AMI deployments based on Cisco CG-Mesh and allows for a true multi-service solution for DA, AMI, DER, etc.

    Resilient Mesh Performance

    RF Link and Path Performance

The Cisco Resilient Mesh has been enhanced to support faster physical transmission data rates based on the IEEE 802.15.4g/e Option 2 profile using OFDM modulation. At the time of this writing, the software supports five out of the seven modulation data rates specified in the standard. Cisco will support the remaining data rates in upcoming software releases without requiring any hardware upgrades.

The transmit power varies across the different OFDM data rates; the higher the data rate, the less transmit power is used in order to comply with the FCC regulations. Therefore, the link area of coverage will be smaller for the higher data rates. This is an important aspect that should be considered during RF design planning, especially if customers are implementing Adaptive Data Rate and Adaptive Modulation, as described in Adaptive Data Rate and Adaptive Modulation Considerations.

    Note: The transmit power cannot be changed by customers.

Note: The RSSI for 10% Packet Error Rate (PER) was captured at a normal temperature of 77°F / 25°C; it will vary by a few units in regions with extreme temperatures.

    Table 69 OFDM Modulation Data Rates Maximum RSSI Values

Frequency Band: 902-928 MHz (North America and Mexico) / Modulation: OFDM / Channel Spacing: 800 kHz / Number of Channels: 31

Modulation Schema            | Data Rate (kbps) | Transmit Power (dBm) | Received Sensitivity (dBm) @10% PER | Configuration PHY Mode Number
QAM 16 rate 3/4              | 1200             | 23                   | -95                                 | 150
QAM 16 rate 1/2              | 800              | 25                   | -101                                | 149
QPSK rate 3/4                | 400              | 27                   | -102                                | 147
QPSK rate 1/2                | 200              | 28                   | -104                                | 146
BPSK rate 1/2 with 4x repeat | 50               | 28                   | -112                                | 144

    All OFDM physical data rates use the same channel plan structure with 31 channels, where the Channel Hopping Scheduler randomly hops across them based on the FCC regulations described in RF Design Considerations.
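The channel plan in Table 70 below follows a fixed raster, so the center frequencies can be derived rather than looked up. The short Python sketch here simply reproduces the table values from the 902.8 MHz starting channel and 800 kHz spacing; it is an illustrative calculation, not a device configuration.

# Illustrative sketch: OFDM channel center frequencies (Table 70) follow a
# fixed 800 kHz raster starting at 902.8 MHz (channel 0).

def ofdm_center_mhz(channel: int) -> float:
    if not 0 <= channel <= 30:
        raise ValueError("OFDM channel plan uses channels 0-30")
    return round(902.8 + 0.8 * channel, 1)

print(ofdm_center_mhz(0))    # 902.8
print(ofdm_center_mhz(17))   # 916.4
print(ofdm_center_mhz(30))   # 926.8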

    Table 70 OFDM Channels List

Frequency Band: 902-928 MHz (North America and Mexico) / Modulation: OFDM / Channel Spacing: 800 kHz

Channel | Center Freq (MHz) | Channel | Center Freq (MHz) | Channel | Center Freq (MHz) | Channel | Center Freq (MHz)
0       | 902.8             | 8       | 909.2             | 16      | 915.6             | 24      | 922.0
1       | 903.6             | 9       | 910.0             | 17      | 916.4             | 25      | 922.8
2       | 904.4             | 10      | 910.8             | 18      | 917.2             | 26      | 923.6
3       | 905.2             | 11      | 911.6             | 19      | 918.0             | 27      | 924.4
4       | 906.0             | 12      | 912.4             | 20      | 918.8             | 28      | 925.2
5       | 906.8             | 13      | 913.2             | 21      | 919.6             | 29      | 926.0
6       | 907.6             | 14      | 914.0             | 22      | 920.4             | 30      | 926.8
7       | 908.4             | 15      | 914.8             | 23      | 921.2             |         |

The Cisco Resilient Mesh also supports lower data rates based on Frequency Shift Keying (FSK) for backward compatibility with the initial AMI and DA deployments based on the IR509 and IR529.

    Table 71 2FSK Data Rates, Number of Channels, and Maximum RSSI Values

Freq. Band (MHz)                   | Modulation | FEC Enabled | Channel Spacing (kHz) | # of Channels | Data Rate (kbps) | Transmit Power (dBm) | Received Sensitivity (dBm) @10% PER | Configuration PHY Mode
902-928 (North America and Mexico) | 2FSK       | OFF         | 400                   | 64            | 150              | 30                   | -102                                | 66
902-928 (North America and Mexico) | 2FSK       | OFF         | 200                   | 129           | 50               | 30                   | -119                                | 64

Note: A lower OFDM data rate is better than a higher FSK data rate because of the advanced encoding scheme OFDM uses, which carries redundant data across sub-carriers within a channel. To obtain data redundancy with FSK modulation, administrators must enable FSK FEC, which cuts the data rate in half.

    Table 72 2FSK 150 Kbps Data Rate Channels List

Channel | Center Freq (MHz) | Channel | Center Freq (MHz) | Channel | Center Freq (MHz) | Channel | Center Freq (MHz)
0       | 902.400           | 16      | 908.800           | 32      | 915.200           | 48      | 921.600
1       | 902.800           | 17      | 909.200           | 33      | 915.600           | 49      | 922.000
2       | 903.200           | 18      | 909.600           | 34      | 916.000           | 50      | 922.400
3       | 903.600           | 19      | 910.000           | 35      | 916.400           | 51      | 922.800
4       | 904.000           | 20      | 910.400           | 36      | 916.800           | 52      | 923.200
5       | 904.400           | 21      | 910.800           | 37      | 917.200           | 53      | 923.600
6       | 904.800           | 22      | 911.200           | 38      | 917.600           | 54      | 924.000
7       | 905.200           | 23      | 911.600           | 39      | 918.000           | 55      | 924.400
8       | 905.600           | 24      | 912.000           | 40      | 918.400           | 56      | 924.800
9       | 906.000           | 25      | 912.400           | 41      | 918.800           | 57      | 925.200
10      | 906.400           | 26      | 912.800           | 42      | 919.200           | 58      | 925.600
11      | 906.800           | 27      | 913.200           | 43      | 919.600           | 59      | 926.000
12      | 907.200           | 28      | 913.600           | 44      | 920.000           | 60      | 926.400
13      | 907.600           | 29      | 914.000           | 45      | 920.400           | 61      | 926.800
14      | 908.000           | 30      | 914.400           | 46      | 920.800           | 62      | 927.200
15      | 908.400           | 31      | 914.800           | 47      | 921.200           | 63      | 927.600

The Resilient Mesh also supports a very low data rate based on Offset Quadrature Phase-Shift Keying (OQPSK) modulation that can be used for sensor data collection.

    Table 73 OQPSK Data Rate and Maximum RSSI Values

Frequency Band (MHz)               | Modulation | Channel Spacing (kHz) | # of Channels | Data Rate (kbps) | Transmit Power (dBm) | Received Sensitivity (dBm) @10% PER | Configuration PHY Mode
902-928 (North America and Mexico) | OQPSK      | 200                   | 129           | 6.26             | 30                   | -123                                | -

    One Link Performance Testing (Reference Results)

Link performance between two devices can vary greatly with RF conditions, so when testing performance it is important to document the conditions under which the test took place. For example, a test performed with the same link RSSI between two nodes will produce different results if the noise floor differs, if an interfering device is in the vicinity, or if one test is performed on a sunny day and another on a rainy day. Figure 166 shows the test topology used during internal lab validation, and Figure 167 gives network designers a sense of the achievable application throughput between two nodes. The testing was performed in conducted environments using coax cabling over a good quality link (RSSI of approximately -70 dBm), low noise floor conditions (-120 dBm), and no external interference. A UDP data stream was sent from one node to another using different packet sizes (128, 256, 512, and 1024 bytes), and the maximum throughput at which the Packet Error Rate (PER) remained below 1% was recorded.

    Figure 166 Link Performance Test Topology

     

    256391.jpg
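For readers who want to reproduce a similar measurement between two IP endpoints, the generic Python UDP probe below is a minimal sketch of the method; it is not the tool used during this validation, and the addresses, ports, rates, and durations are hypothetical examples.

# Minimal, generic UDP throughput probe (illustrative only). Run receive() on one
# endpoint and send() on the other; the receiver reports goodput for whatever
# arrives during the test window.
import socket
import time

def send(dst_ip: str, dst_port: int, payload_bytes: int = 512,
         rate_kbps: float = 200.0, duration_s: float = 10.0) -> None:
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    payload = b"\x00" * payload_bytes
    interval = (payload_bytes * 8) / (rate_kbps * 1000)   # seconds between packets
    end = time.time() + duration_s
    while time.time() < end:
        sock.sendto(payload, (dst_ip, dst_port))
        time.sleep(interval)

def receive(listen_port: int, duration_s: float = 12.0) -> float:
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("", listen_port))
    sock.settimeout(1.0)
    received_bits, start = 0, time.time()
    while time.time() - start < duration_s:
        try:
            data, _ = sock.recvfrom(2048)
            received_bits += len(data) * 8
        except socket.timeout:
            continue
    return received_bits / (duration_s * 1000)   # goodput in kbps

# Example: run receive(5001) on the far node, then send("10.0.0.2", 5001, 512, 200.0)
# from the near node; the values shown here are placeholders.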

    Figure 167 Uni-directional Link Throughput Reference Numbers

Modulation | PHY Mode | Data Rate (kbps) | Hop Count | Max Send Rate (kbps) @128 B | @256 B | @512 B | @1024 B | Test App. Type
OFDM       | 150      | 1200             | 1         | ~163                        | ~326   | ~585   | ~601    | UDP
OFDM       | 149      | 800              | 1         | ~163                        | ~322   | ~446   | ~466    | UDP
OFDM       | 147      | 400              | 1         | ~137                        | ~203   | ~270   | ~281    | UDP
OFDM       | 146      | 200              | 1         | ~83                         | ~117   | ~147   | ~153    | UDP
OFDM       | 144      | 50               | 1         | ~21                         | ~32    | ~38    | ~39     | UDP
FSK        | 66       | 150              | 1         | ~54                         | ~78    | ~103   | ~106    | UDP, FEC OFF
FSK        | 64       | 50               | 1         | ~16                         | ~23    | ~30    | ~31     | UDP, FEC OFF

The one-way latency across a link varies with the link data rate: the higher the data rate, the less time it takes for a frame to be transmitted over the air. Table 74 captures the average latency for sending a stream of packets of different sizes for 10 seconds from one node to another over the same link conditions as the previous throughput testing; a first-order serialization-delay sketch follows the table.

    Table 74 Uni-directional (One Way) Link Latency Reference Numbers

Modulation | PHY Mode | Data Rate (kbps) | Hop Count | 1W Avg. Latency (msec) @128 B | @256 B | @512 B | @1024 B | Test App. Type
OFDM       | 150      | 1200             | 1         | ~5.7                          | ~7     | ~9.2   | ~13     | UDP
OFDM       | 149      | 800              | 1         | ~6.5                          | ~8.2   | ~11.2  | ~17.4   | UDP
OFDM       | 147      | 400              | 1         | ~9                            | ~11.9  | ~17.3  | ~32.4   | UDP
OFDM       | 146      | 200              | 1         | ~13.8                         | ~19.1  | ~29.9  | ~51.4   | UDP
OFDM       | 144      | 50               | 1         | ~42                           | ~63.4  | ~105   | ~188    | UDP
FSK        | 66       | 150              | 1         | ~15                           | ~22.6  | ~36.8  | ~74.2   | UDP, FEC OFF
FSK        | 64       | 50               | 1         | ~41.5                         | ~66.1  | ~112   | ~215    | UDP, FEC OFF
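As a rough sanity check on these numbers, the short Python sketch below computes the first-order serialization delay of a single frame, ignoring MAC scheduling, channel-hop wait times, and retransmissions; this is why the measured averages in Table 74 are somewhat higher than the raw serialization time.

# Illustrative sketch: first-order serialization delay for a single frame,
# ignoring MAC/scheduling overhead, channel-hop wait times, and retransmissions.

def serialization_ms(packet_bytes: int, data_rate_kbps: float) -> float:
    return packet_bytes * 8 / data_rate_kbps   # bits / (kbit/s) -> milliseconds

print(round(serialization_ms(1024, 1200), 1))   # ~6.8 ms vs ~13 ms measured
print(round(serialization_ms(1024, 50), 1))     # ~163.8 ms vs ~188 ms measured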

The link latency will vary if a node has multiple parents, since the RF link characteristics differ for each parent link. Also, if one parent link is experiencing congestion or interference, the child node will retransmit the packet multiple times using different back-off timers based on the priority of the packet, as discussed in Figure 87; therefore, the latency to deliver that packet will increase.

    Path, Multi-hop Throughput Performance Testing (Reference Results)

The same RF environmental conditions used in the link performance testing were used to determine mesh path performance over multiple hops: OFDM over 5 links and FSK over 2 links.

This testing was done to baseline the mesh performance for the DA FLISR use case when the grid vendor solution requires peer-to-peer communication between reclosers.

    Figure 168 Path Performance Test Topology

     

    256390.jpg

    Table 75 Uni-directional Multi-Hop Throughput Reference Numbers

Modulation | PHY Mode | Data Rate (kbps) | Hop Count | Max Send Rate (kbps) @128 B | @256 B | @512 B | @1024 B | Test App. Type
OFDM       | 150      | 1200             | 5         | ~78                         | ~131   | ~194   | ~199    | UDP
OFDM       | 149      | 800              | 5         | ~73                         | ~109   | ~163   | ~173    | UDP
OFDM       | 147      | 400              | 5         | ~49                         | ~70    | ~99    | ~101    | UDP
OFDM       | 146      | 200              | 5         | ~30                         | ~43    | ~55    | ~57     | UDP
OFDM       | 144      | 50               | 5         | ~7                          | ~9     | ~11    | ~12     | UDP
FSK        | 66       | 150              | 2         | ~14                         | ~24    | ~38    | ~41     | UDP, FEC OFF
FSK        | 64       | 50               | 2         | ~5                          | ~8     | ~11    | ~12     | UDP, FEC OFF

    DA Feeder Automation using Cellular Service (3G/4G) Solution

    This chapter includes the following major topics:

    blank.gifImportant Features Supported by LTE Pluggable Modules

    blank.gifDistribution Automation Architecture using Cellular Backhaul

    blank.gifCellular Backhaul Design Considerations

Cellular backhaul will be the most prevalent deployment backhaul, and the new Cisco IR1101 platform is the recommended choice for deployment as a DA Gateway. However, the Cisco IR807 could be deployed in cases where lower cost and lower power consumption are the primary requirements. The CGR 1120 platform is suited for dual-LTE backhaul DA Gateway deployments.

Cisco DA Gateways support a cellular pluggable module that supports the following 4G/3G modes:

blank.gif4G LTE—4G LTE mobile specification provides multi-megabit bandwidth, a more efficient radio network, latency reduction, and improved mobility. LTE solutions target new cellular networks that initially support up to 100 Mb/s peak rates in the downlink and up to 50 Mb/s peak rates in the uplink. The throughput of these networks is higher than that of existing 3G networks.

    blank.gif3G Evolution High-Speed Packet Access (HSPA/HSPA+)—HSPA is a UMTS-based 3G network. It supports High-Speed Downlink Packet Access (HSDPA) and High-Speed Uplink Packet Access (HSUPA) data for improved download and upload speeds. Evolution High-Speed Packet Access (HSPA+) supports Multiple Input/Multiple Output (MIMO) antenna capability.

    blank.gif3G Evolution-Data Optimized (EVDO or DOrA) Mode—EVDO is a 3G telecommunications standard for the wireless transmission of data through radio signals, typically for broadband Internet access. DOrA refers to EVDO Rev-A. EVDO uses multiplexing techniques including Code Division Multiple Access (CDMA), as well as Time Division Multiple Access (TDMA), to maximize both individual users' throughput and the overall system throughput.

For more details about other supported TDD LTE, UMTS, HSPA+, and HSPA bands, please refer to the product specification documentation at the following URLs:

    blank.gif https://www.cisco.com/c/en/us/td/docs/routers/access/1101/hardware/installation/guide/1101hwinst/pview.html#12287

    blank.gif https://www.cisco.com/c/en/us/td/docs/routers/connectedgrid/cgr1000/ios/modules/4g_lte/b_4g_cgr1000.html

    blank.gif https://www.cisco.com/c/en/us/td/docs/routers/access/800/807/software/configuration/guide/b_IR807_config/b_IR807_config_chapter_011.html

    For details about wireless antenna positioning guidelines, you can refer to the following document.

    blank.gif https://www.cisco.com/c/en/us/td/docs/routers/access/1101/hardware/installation/guide/1101hwinst/pview.html#12287

    Important Features Supported by LTE Pluggable Modules

    blank.gifDual SIM, which allows SIM to be active in either slot; failover to the alternative SIM slot if the active SIM loses connectivity to the network

    blank.gifAuto SIM mode, which will automatically select the right carrier after a SIM slot switching and automatically reset the modem

    blank.gifSIM Online Insertion and Removal (OIR)

    blank.gifAssisted GPS (A-GPS)

    blank.gifShort Message Service (SMS)

    blank.gifModem Firmware Upgrade

    blank.gifSIM lock and unlock capabilities

    blank.gifIPv6 protocol is supported on the cellular interface to enable LTE IPv6 data connection

    Distribution Automation Architecture using Cellular Backhaul

    The DA application bidirectional flow can be classified as follows:

1.blank.gif SCADA <--> RTU <--> IEDs

2.blank.gif SCADA <--> IEDs

3.blank.gif IEDs <--> IEDs

Cellular backhaul is suited for the second flow, i.e., SCADA <--> IEDs.

Figure 169 depicts a solution architecture in which IEDs communicate directly with the centralized SCADA. In this design, DA Gateways connect directly to the HER in the regional control center via public WAN connectivity. For a redundant design, DA Gateways can have two active/active tunnels to two different regional control centers. DA application traffic and NMS control traffic can flow in the same FlexVPN tunnel.

    For more details, refer to the Distribution Automation - Secondary Substation Design Guide at the following URL:

    blank.gif https://www.cisco.com/c/en/us/td/docs/solutions/Verticals/Distributed-Automation/Secondary-Substation/DG/DASS-DG.pdf

    Figure 169 SCADA to IED Traffic Flow

     

    256323.jpg

     

    Cellular Backhaul Design Considerations

    blank.gifBandwidth is generally shared between many users (such as smartphones, smart meters, and M2M) when attached to the same base station. This makes it difficult to design a network with guaranteed bandwidth, latency, and QoS parameters for meeting any performance-based criteria.

    blank.gifBandwidth is asymmetric since the services are designed to offer greater download speed to smartphone users. Conversely, FAN traffic profiles have either symmetrical or greater upstream speed requirements, which requires evaluating the traffic load when designing the network. This means using a network protocol to understand the link capacity and potential costs (dependent on service subscription tariffs).

    blank.gifCoverage and network availability must be evaluated for rural zones with isolated devices.

blank.gifCellular deployments offer only native IPv4 services; if IPv6 connectivity is required, IPv6 traffic must be tunneled over GRE/IPv4.

    Glossary

The following table lists the acronyms and initialisms used in this document.

A
ACK: Acknowledgment Frame
AD: Administrative Distance
AF: Assured Forwarding
AMI: Advanced Meter Infrastructure

B
BGP: Border Gateway Protocol
BMR: Basic Mapping Rules
BR: Border Router

C
CBC: Capacitor Bank Controller
CCA: Clear Channel Assessment
CDN: Cisco Developer Network
CGM: Connected Grid Module
CGR: Connected Grid Router
CNR: Cisco Network Registrar
CoAP: Constrained Application Protocol
CSM: Cisco Compute Content Switching Module
CSMA/CD: Carrier Sense Multiple Access with Collision Detection
CSMP: CoAP Simple Management Protocol
CX: Cisco Customer Experience

D
DA: Distribution Automation
DAG: Directed Acyclic Graph
DAO: Destination Advertisement Object
DER: Distributed Energy Resources
DHCP: Dynamic Host Configuration Protocol
DIO: DAG Information Object
DIS: DAG Information Solicitation
DMR: Default Mapping Rule
DMS: Distribution Management System
DODAG: Destination-Oriented Directed Acyclic Graph
DR: Demand Response
DRU: Dynamic Routing Update
DSCP: Differentiated Services Code Point

E
EB: Enhanced Beacon
ECC: Elliptic Curve Cryptography
EOC: Energy Operations Center
EST: Enrollment over Secure Transport
ETX: Expected Transmission Count

F
FAN: Field Area Network
FAR: Field Area Routers
FCAPS: Fault, Configuration, Accounting, Performance, and Security
FLISR: Fault Location, Isolation, and Service Restoration
FMR: Forwarding Mapping Rules
FND: Cisco Field Network Director

G
GIS: Geographic Information System

H
HA: High Availability
HER: Headend Router
HSDPA: High-Speed Downlink Packet Access
HSM: Hardware Security Module
HSPA/HSPA+: High-Speed Packet Access
HSUPA: High-Speed Uplink Packet Access

I
IED: Intelligent Electronic Device
IGP: Interior Gateway Protocol
IID: Interface Identifier
IPS: Intrusion Prevention Systems
IR: Industrial Routers
ISE: Cisco Identity Services Engine

L
LIR: Local Internet Registry
LLN: Low-power and Lossy Networks
LMR: Land Mobile Radios
LoS: Line of Sight

M
MDM: Meter Data Management
MIMO: Multiple Input/Multiple Output
MOP: Mode of Operations
MPL: Multicast Protocol for Lossy Networks
MQC: Modular QoS CLI
MTBF: Equipment Mean Time Between Failures
MTU: Maximum Transmission Unit

N
NAN: Neighborhood Area Network
NAT-PT: Network Address Translation and Protocol Translation
NPS: Network Policy Service
NS: Neighbor Solicitation
NTP: Network Time Protocol

O
ODM: Operational Data Model
OF: Objective Function
OFDM: Orthogonal Frequency-Division Multiplexing
OIR: Online Insertion and Removal
OMS: Outage Management System

P
PAN: Personal Area Network
PER: Packet Error Rate
PKI: Public Key Infrastructure
PMTUD: Path Maximum Transmission Unit Discovery
PnP: Plug-and-Play
PSDU: Payload Service Data Unit
PSPU: Physical Service Protocol Unit

Q
QPSK: Quadrature Phase Shift Keying

R
RA: Registration Authority
RBAC: Role-Based Access Control
RFI: Remote Fault Indicator
RIR: Regional Internet Registries
RPL: Routing Protocol for Low-power and Lossy Networks
RSSI: Received Signal Strength Indicator
RTU: Remote Terminal Unit

S
SAIDI: System Average Interruption Duration Index
SAIFI: System Average Interruption Frequency Index
SCADA: Supervisory Control and Data Acquisition
SCEP: Simple Certificate Enrollment Protocol
SEIM: Security Event and Incident Management
SFD: Start of Frame Delimiter
SMS: Short Message Service
SNMP: Simple Network Management Protocol
SSM: Software Security Module
SUDI: Secure Unique Device Identifier
SUN: Smart Utility Networks

T
TDMA: Time Division Multiple Access
TPS: Tunnel Provisioning Server

V
VRF: Virtual Routing and Forwarding

W
WAN: Wide Area Network
Wi-SUN: Wireless Smart Utility Networks
WPAN: Wireless Personal Area Network

Z
ZTD: Zero Touch Deployment