Verification and supervision of communication networks for utility automation

Nowadays, communication networks are an integral part of utility automation systems. With the increased usage of non-conventional instrument transformers and IEC 61850 protection devices, more and more critical information, such as Sampled Values streams and GOOSE messages, is transmitted on these networks. All protection, automation, and control devices have to be online and communicating appropriately. Testing tools and techniques are needed to verify and supervise the operation of protection, automation, and control (PAC) systems. This starts in the commissioning phase, where configuration errors and communication problems have to be ruled out and the correct transmission of all signals has to be verified. Later on, during the operation phase of a digital substation, it is crucial that problems on the communication network are detected immediately, so that the operating personnel can react to them. The correct functioning of the communication network is an essential precondition for the optimal performance of a PAC system. Consequently, the performance of the communication network needs to be measured and assessed on its own. Depending on the communication architecture and technologies deployed, different approaches are applicable.

Verification of the IEC 61850 Communication
The description of the communication system in the standardised IEC 61850 substation configuration language (SCL) format serves as the basis for the verifications. It is verified that the IEC 61850 servers of all intelligent electronic devices (IEDs) are available and reachable over a client/server (C/S) connection, and that the substation real-time network traffic (GOOSE and Sampled Values) is actually present on the communication network as defined in the configuration files.
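As a rough illustration of how the SCL description can serve as the reference, the following minimal sketch extracts the expected GOOSE parameters from an SCL document using only the Python standard library. The embedded SCL fragment, the device names, and the function name `expected_goose` are hypothetical and heavily shortened; a real SCL file contains far more detail, and a real tool would also cover Sampled Values control blocks.

```python
import xml.etree.ElementTree as ET

NS = {"scl": "http://www.iec.ch/61850/2003/SCL"}

# Hypothetical, heavily shortened SCL fragment (not a complete, valid file).
SCL_TEXT = """
<SCL xmlns="http://www.iec.ch/61850/2003/SCL">
  <Communication>
    <SubNetwork name="StationBus">
      <ConnectedAP iedName="IED1" apName="AP1">
        <GSE ldInst="LD0" cbName="gcbTrip">
          <Address>
            <P type="MAC-Address">01-0C-CD-01-00-01</P>
            <P type="APPID">0001</P>
          </Address>
        </GSE>
      </ConnectedAP>
    </SubNetwork>
  </Communication>
  <IED name="IED1">
    <AccessPoint name="AP1">
      <Server>
        <LDevice inst="LD0">
          <LN0 lnClass="LLN0" inst="" lnType="LLN0_T">
            <GSEControl name="gcbTrip" appID="IED1_Trip" confRev="3" datSet="dsTrip"/>
          </LN0>
        </LDevice>
      </Server>
    </AccessPoint>
  </IED>
</SCL>
"""


def expected_goose(scl_text):
    """Return one dict per GOOSE control block that has a network address."""
    root = ET.fromstring(scl_text.strip())

    # Index all GOOSE control blocks by (IED name, logical device, block name).
    controls = {}
    for ied in root.iterfind("scl:IED", NS):
        for ld in ied.iterfind(".//scl:LDevice", NS):
            for cb in ld.iterfind("scl:LN0/scl:GSEControl", NS):
                controls[(ied.get("name"), ld.get("inst"), cb.get("name"))] = cb

    result = []
    for ap in root.iterfind(".//scl:ConnectedAP", NS):
        for gse in ap.iterfind("scl:GSE", NS):
            addr = {p.get("type"): (p.text or "").strip()
                    for p in gse.iterfind("scl:Address/scl:P", NS)}
            cb = controls.get((ap.get("iedName"), gse.get("ldInst"), gse.get("cbName")))
            if cb is None:
                continue  # address without control block: skipped in this sketch
            result.append({
                "ied": ap.get("iedName"),
                "mac": addr.get("MAC-Address"),
                "app_id": int(addr.get("APPID", "0"), 16),  # Application ID (hex in SCL)
                "go_id": cb.get("appID"),                   # GOOSE ID
                "conf_rev": int(cb.get("confRev", "0")),    # Configuration revision
            })
    return result


if __name__ == "__main__":
    for entry in expected_goose(SCL_TEXT):
        print(entry)
```

The resulting list is the kind of reference data against which the traffic actually observed on the network can be compared.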

A network analyser tool can verify, prove, and document that all protection and control devices are working and communicating properly. Such verifications are mainly done in factory and site acceptance tests (FAT, SAT) and during commissioning. In case of a malfunction, the network analyser tool has to provide detailed information for debugging. Figure 1 shows an example verification result of a system with two protection devices (IEDs) and two merging units (MUs).

The system verification provides the results for each IED in the system. If an IED is “checked green”, the complete IEC 61850 communication has been found as defined in the configuration files. A warning indicates that there is an issue, which can be related to the server in the IED or to Sampled Values streams or GOOSE messages that are not found on the network as expected. An error is shown if an IED or one of its services is not found during the verification process. Differences in found Sampled Values streams or GOOSE messages are visualised by showing the found values next to the defined ones. The GOOSE shown in Figure 2 as an example has different values for the Application ID, GOOSE ID, and the Configuration revision. If the values found on the network are the correct ones, the IED configuration file has to be updated; otherwise, the device needs to be reconfigured accordingly.
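The comparison of defined and found values can be sketched as follows. This is a simplified reading of the result states described above, not the logic of any particular tool: the class `GooseInfo`, the functions `diff_goose` and `verify_ied`, the choice of the destination MAC address as key, and the example values are all assumptions made for illustration.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class GooseInfo:
    app_id: int    # Application ID
    go_id: str     # GOOSE ID
    conf_rev: int  # Configuration revision


def diff_goose(expected, found):
    """Return {field: (defined value, found value)} for every differing field."""
    e, f = asdict(expected), asdict(found)
    return {name: (e[name], f[name]) for name in e if e[name] != f[name]}


def verify_ied(reachable, expected, found):
    """Classify one IED. expected/found map a key (here: destination MAC) to GooseInfo."""
    if not reachable:
        return "error", {}            # IED not found at all
    issues = {}
    for key, exp in expected.items():
        got = found.get(key)
        if got is None:
            issues[key] = "not found on network"
        else:
            delta = diff_goose(exp, got)
            if delta:
                issues[key] = delta
    return ("warning" if issues else "ok"), issues


if __name__ == "__main__":
    defined = {"01-0C-CD-01-00-01": GooseInfo(0x0001, "IED1_Trip", 3)}
    on_wire = {"01-0C-CD-01-00-01": GooseInfo(0x0002, "IED1_Trp", 4)}
    print(verify_ied(True, defined, on_wire))
```

Listing the defined value next to the found one for each field leaves the decision to the engineer whether the configuration file or the device has to be corrected.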

A system verification is often an interactive working process because the devices are put into operation one by one during commissioning. The verification steps can be performed incrementally without re-executing all the checks for devices already verified. If devices do not perform as desired, detailed information is provided for further investigation and debugging of the problem. Any other GOOSE messages or Sampled Values streams found on the network are listed as “orphans”. If these orphans are no longer used in the PAC system, they should be eliminated from the network by reconfiguring or removing the publisher devices; otherwise, the SCL configuration files have to be updated accordingly.
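In its simplest form, finding orphans is a set comparison between what the SCL files define and what is seen on the network. The sketch below assumes that a stream can be identified by the pair of destination MAC address and Application ID; this key and the function name `classify_streams` are illustrative choices.

```python
def classify_streams(configured, observed):
    """Split stream identifiers into verified, missing, and orphaned ones.

    Both arguments are iterables of hashable stream keys,
    here assumed to be (destination MAC, Application ID) tuples.
    """
    configured, observed = set(configured), set(observed)
    return {
        "verified": sorted(configured & observed),  # defined and present
        "missing": sorted(configured - observed),   # defined but not seen
        "orphans": sorted(observed - configured),   # seen but not defined
    }


if __name__ == "__main__":
    from_scl = [("01-0C-CD-01-00-01", 1), ("01-0C-CD-04-00-01", 0x4000)]
    on_wire = [("01-0C-CD-01-00-01", 1), ("01-0C-CD-01-00-7F", 9)]
    print(classify_streams(from_scl, on_wire))
```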

After the successful verification of the complete IEC 61850 communication, it is proven that all devices are available in the PAC system and are communicating correctly. The next step is to set up a supervision of the IEC 61850 communication so that any issue during the operation of the PAC system is detected immediately.

Supervision of the IEC 61850 Communication
During the normal operation of a PAC system, it is recommended to supervise the IEC 61850 network communication based on the SCL definition. This is achieved by constantly evaluating all network packets of the Sampled Values streams and GOOSE messages in the system. If the Precision Time Protocol (PTP) is used for time synchronisation in the PAC system, it is also important to supervise the PTP communication in the network. Figure 3 shows a possible setup with a network analyser which is tapped into a link to supervise the network traffic.

The network analyser detects the abnormalities in the real-time network traffic and automatically logs all events with the corresponding detailed information (e.g. lost samples, GOOSE timing problems, PTP time synchronisation issues) to a storage device. The event severity and category help to filter and analyse the entries in the event log.

The analyser works autonomously and can be connected to the substation network in passive TAP mode. Thus, it can obtain all traffic on a link without the need to configure traffic monitoring features such as port mirroring in the Ethernet switches. Events can trigger the recording of the relevant data for an in-depth investigation of the abnormalities that occurred. Additionally, notifications can be sent via email to inform the operating staff about the occurrence of an event.

Figure 4 shows the supervisor event list of the network analyser and the details of a selected “GOOSE time to live expired” event. In the example, the network analyser has detected that the time allowed to live of a GOOSE message expired because the expected repetition of the GOOSE packet was missing. After the GOOSE message was received again, a “GOOSE out of sequence” event was logged in the event list. This event provides information about how many repetition packets, or even status changes, of the GOOSE message were missed during the timeout period.
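The two event types from this example can be derived from the fields that every GOOSE message carries: the state number (stNum), the sequence number (sqNum), and the time allowed to live. The following minimal sketch assumes that stNum increments with every status change, that sqNum increments with every repetition, and that counter wrap-around can be ignored. The class `GooseFrame`, the function `supervise`, and the event names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GooseFrame:
    t: float      # capture time in seconds
    st_num: int   # state number, assumed to increment on every status change
    sq_num: int   # sequence number, assumed to increment on every repetition
    ttl_ms: int   # time allowed to live announced by this frame


def supervise(frames):
    """Return a list of (time, event name, details) for one GOOSE stream."""
    events = []
    prev = None
    for f in frames:
        if prev is not None:
            # The previous frame promised a successor within its time allowed to live.
            deadline = prev.t + prev.ttl_ms / 1000.0
            if f.t > deadline:
                events.append((deadline, "time_to_live_expired", {}))
            if f.st_num == prev.st_num:
                missed_repetitions = f.sq_num - prev.sq_num - 1
                missed_status_changes = 0
            else:
                missed_repetitions = 0
                missed_status_changes = f.st_num - prev.st_num - 1
            if missed_repetitions > 0 or missed_status_changes > 0:
                events.append((f.t, "out_of_sequence", {
                    "missed_repetitions": missed_repetitions,
                    "missed_status_changes": missed_status_changes,
                }))
        prev = f
    return events


if __name__ == "__main__":
    stream = [
        GooseFrame(0.0, 5, 10, 2000),
        GooseFrame(1.0, 5, 11, 2000),
        GooseFrame(6.0, 5, 16, 2000),  # four repetitions were lost in between
    ]
    for event in supervise(stream):
        print(event)
```

A live supervisor would raise the expiry event by a timer at the deadline rather than on arrival of the next frame; the offline form above is only meant to show the bookkeeping.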

Timeouts and out of sequence events of supervised Sampled Values streams are also detected and logged by the network analyser. If there are any malformed GOOSE or Sampled Values packets on the network, a parsing error event is logged. For configured Sampled Values streams and GOOSE messages that are never received by the network analyser, a “never seen” event is created in the log. A clock drift between a Sampled Values publisher and the network analyser is also detected as an event.
Depending on the type of event, it can be necessary to react to it immediately or to notify the responsible personnel. A flexible configuration of different actions for any kind of supervisor event type is required.
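For Sampled Values streams, the out of sequence detection mentioned above can be based on the sample counter carried in each packet. The sketch below assumes a counter that runs from 0 to 3999 and then wraps, as it would for a stream with 4000 samples per second; the function name and the default value are illustrative.

```python
def lost_samples(sample_counts, samples_per_second=4000):
    """Return (last counter before gap, first counter after gap, samples lost).

    sample_counts is the sequence of sample counter values in order of arrival.
    The counter is assumed to wrap to 0 after samples_per_second - 1.
    """
    events = []
    for prev, cur in zip(sample_counts, sample_counts[1:]):
        # Modulo arithmetic makes the regular wrap-around count as "no gap".
        gap = (cur - prev - 1) % samples_per_second
        if gap:
            events.append((prev, cur, gap))
    return events


if __name__ == "__main__":
    # Counters 3999 and 0 are missing across the wrap-around.
    print(lost_samples([3997, 3998, 1, 2]))
```

Note that this simple form cannot distinguish a duplicated packet from a very long gap, and it cannot see gaps longer than one full counter period; a real supervisor would combine the counter with packet time-stamps.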

After the setup is complete, the permanently installed network analyser constantly supervises the configured IEC 61850 network communication. Besides reacting with the defined actions, it is also possible to check the event log entries via a remote connection to the analyser device.

Verification of the Communication Infrastructure
The communication network is the underlying infrastructure for the information exchange between the devices. It is an integral part of the PAC system and its dependable performance is the precondition for the proper function of the applications running on top of it.

IEC 61850 makes no particular assumptions about the network infrastructure to be used. The requirements depend solely on the application. Thus, several topologies and technologies are in use today, each offering a different trade-off between effort, performance, and reliability.

Standard Ethernet allows networks to be built with different topologies, but it lacks redundancy at the level needed for a PAC system. To cover redundancy, IEC 61850-90-4 refers to the standard IEC 62439-3, where the redundancy mechanisms HSR and PRP are defined.

Standard Ethernet networks
Ethernet has developed impressively since the late 1970s, with increasing link speeds and enhanced features, while always maintaining backwards compatibility. For a PAC system, such networks are built with “substation hardened” Ethernet switches, which are managed, high-performance networking devices made for the environmental conditions (climate, EMC) in electrical installations.

Common ways to connect the networking devices are star topologies and so-called ring structures. Star topologies are easy to understand and, when well planned, keep the number of hops a message needs to pass through to reach its destination small.

On the other hand, ring structures appear to be attractive for several reasons. The networking devices are physically connected to form a “ring”. Several protection relays and bay controllers contain integrated Ethernet switches with two external ports, so they can be directly linked into such a ring. This reduces the number of dedicated Ethernet switches needed for building such a network. Furthermore, there is the desire that such “rings” would provide some redundancy.

But in an Ethernet network, real ring connections would establish circular paths for the Ethernet packets, so they must not exist. Otherwise, circulating packets would accumulate until the network is overloaded and breaks down. To avoid this, the Rapid Spanning Tree Protocol (RSTP) analyses the topology and breaks physically connected rings into tree structures, which are free of circulating packets. RSTP does this by disabling certain links and putting them in stand-by. If the tree structure is broken because an active link fails, RSTP reconfigures the tree by re-activating some of the inactive links. The way the tree will be structured can be influenced by setting the so-called bridge priorities in the switches.

This clearly underlines that RSTP is a topology management mechanism in the first place. In general, the performance of the reconfiguration is not guaranteed. Depending on the circumstances, a reconfiguration can take seconds. Vendor-specific flavours of fast spanning tree mechanisms are generally not interoperable between vendors. In an office environment, RSTP could well be considered a redundancy mechanism. Who cares if a printout appears five seconds later or if the loading of a file takes two seconds longer? But in a PAC environment, the requirements are much more stringent. RSTP is not really applicable as a redundancy mechanism there.

Nevertheless, there is the desire to assess the reconfiguration times of a network structure managed with RSTP, especially if the application relies on perceived short reconfiguration times.
Since there is no “reconfiguration complete” signal issued by the switches, such an assessment can only be done in an indirect way. It can be observed when traffic disappears due to a failing link and when it re-appears after the reconfiguration is complete. To obtain the results with a sufficient time resolution, the traffic must have an adequate frequency. Sampled Values can be suitable traffic, delivering packets every few hundred microseconds, which provides a reasonable resolution for the reconfiguration times to be measured.

The simplest way to verify the reconfiguration performance is to look at the packet rate on a link, checking the times when it goes to zero and when it returns to non-zero values. But because packet rates are calculated by averaging over a time window, this method loses resolution.

Another option is to record the traffic and measure the time between the last packet before the interruption and the first packet after reconfiguration. A supervisor function for Sampled Values will record events when the first packet is missing (Timeout) and when the stream re-appears (Out of sequence). When such events are mapped to binary traces, the times can be easily observed, recorded, and finally measured with cursors.
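Measuring the time between the last packet before the interruption and the first packet after it can be sketched as follows. The function name is hypothetical, and the example assumes a Sampled Values stream with one packet every 250 microseconds and time-stamps given as integers in microseconds.

```python
def longest_interruption(timestamps_us, nominal_interval_us):
    """Find the largest gap in a packet time-stamp series.

    Returns (last packet before gap, first packet after gap,
             lower bound, upper bound) in microseconds.
    The true interruption lies between the bounds, because the link may
    have failed right after the last packet or just before the next one
    was due. The measurement resolution is therefore one packet interval.
    """
    gaps = [(after - before, before, after)
            for before, after in zip(timestamps_us, timestamps_us[1:])]
    duration, last_before, first_after = max(gaps)
    return last_before, first_after, duration - nominal_interval_us, duration


if __name__ == "__main__":
    # Packets every 250 us, interrupted after t = 1000 us, resuming at t = 38250 us.
    stamps = list(range(0, 1001, 250)) + list(range(38_250, 39_001, 250))
    print(longest_interruption(stamps, 250))
```

In contrast, a packet rate calculated over windows of, say, 10 ms could only locate the same interruption to within one window, which illustrates the loss of resolution mentioned above.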

Besides the reconfiguration times, the different packet timings that result from the different RSTP tree configurations of a network can be verified as well.

High-availability Seamless Redundancy (HSR) Networks
HSR networks are used in protection and substation automation applications that require redundancy and zero recovery time in case of a failure. Each node in an HSR network is attached by two Ethernet ports and sends the same frame over both ports. The receiver gets the two identical frames, forwards the first one to the application, and discards the second one.
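The receive side of this mechanism can be illustrated with a small duplicate filter. The sketch assumes that duplicates are recognised by the source MAC address together with the sequence number carried in the redundancy tag; the class name and the simple size-limited table are illustrative simplifications, since real nodes typically age entries by time.

```python
from collections import OrderedDict


class DuplicateDiscard:
    """Forward the first copy of a frame and discard the second one."""

    def __init__(self, max_entries=1024):
        self.max_entries = max_entries
        self.seen = OrderedDict()  # (source MAC, sequence number) -> True

    def accept(self, src_mac, seq_nr):
        """Return True if the frame should be passed to the application."""
        key = (src_mac, seq_nr)
        if key in self.seen:
            return False            # second copy arriving from the other port
        self.seen[key] = True
        if len(self.seen) > self.max_entries:
            self.seen.popitem(last=False)  # forget the oldest entry
        return True


if __name__ == "__main__":
    rx = DuplicateDiscard()
    print(rx.accept("00-11-22-33-44-55", 100))  # first copy
    print(rx.accept("00-11-22-33-44-55", 100))  # duplicate from the other direction
    print(rx.accept("00-11-22-33-44-55", 101))  # next frame
```

If one direction of the ring fails, the copy from the other direction still arrives and is accepted, which is why no recovery time is needed.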

To measure packet delays in a combined Ethernet and HSR communication infrastructure, a network analyser is connected passively in TAP mode into the HSR ring topology, as shown in Figure 5. The delay of the communication between an IED in the Ethernet network and an HSR node can easily be measured and verified against the requirements of this communication path.

Almost all network analyser features, such as the verification and supervision of the IEC 61850 communication (as described in the first two sections), are supported with this test setup without requiring a built-in HSR node in the network analyser.

A drawback of HSR is that, like in any ring structure, the packets need to go through a considerable number of hops to reach their destination and each hop adds some delay. Cut-through packet forwarding provides advantages in this regard, but it is not a requirement and even with cut-through, delays can be caused by traffic injected by the nodes. To meet the requirements, the number of nodes will be limited in practice. An often mentioned maximum number is around 20 nodes for networks with a link speed of 100 Mbit/s. Therefore, larger systems have to be segregated into several HSR rings, which might be interconnected by other means, possibly PRP networks.
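A back-of-the-envelope calculation shows why the number of hops matters. The model below is an assumption made for illustration and not a figure from a standard: at every hop, the packet is assumed to wait at most for one maximum-size frame of 1522 bytes that the node has just started to send. Preamble, inter-frame gap, the HSR tag, and the internal latency of the nodes are ignored.

```python
def serialization_time_us(frame_bytes, link_mbit_s):
    """Time needed to put one frame on the wire, in microseconds."""
    # bits divided by (Mbit/s) yields microseconds
    return frame_bytes * 8 / link_mbit_s


def worst_case_delay_us(hops, frame_bytes=1522, link_mbit_s=100.0):
    """Upper bound under the stated model: one full-size frame of waiting per hop."""
    return hops * serialization_time_us(frame_bytes, link_mbit_s)


if __name__ == "__main__":
    per_hop = serialization_time_us(1522, 100.0)
    print(f"per hop:  {per_hop:.2f} us")
    print(f"19 hops:  {worst_case_delay_us(19) / 1000:.2f} ms")
    print(f"at 1 Gbit/s: {worst_case_delay_us(19, link_mbit_s=1000.0) / 1000:.2f} ms")
```

Under these assumptions, a single hop can add roughly 120 microseconds at 100 Mbit/s, so a path across a ring of about 20 nodes can accumulate a delay in the range of a few milliseconds. This is in the order of magnitude where typical protection applications become sensitive, which makes the often mentioned node limit plausible.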

Parallel Redundancy Protocol (PRP) Networks
PRP utilises two independent Ethernet networks and redundancy is achieved by connecting the devices to both networks. Such a device is called a doubly attached node (DAN).

PRP is preferred for connecting a large number of IEDs without the need for either physical network segregation or sub-application segregation. Non-PRP nodes can also be connected to PRP networks. Such a device is a so-called singly attached node (SAN), connected to only one of the networks. These devices do not have redundancy and can communicate only with other nodes attached to the same network. The network packets in PRP networks are tagged at the end of the frame with the redundancy control trailer in order to remain interoperable with devices that are not PRP-aware.

A possible test setup to measure and verify the communication in PRP networks is shown in Figure 6. Two network analysers are connected in passive TAP mode into the PRP network paths A and B at points P4 and P5. They measure the propagation asymmetry of the PRP network infrastructure between the points P4 and P5. The analysis software identifies the same Ethernet packets in the two network paths and works out the time differences by comparing the time-stamps of the packets. Because the PRP trailers of the two copies of a frame differ, they must be ignored by the packet matching algorithm.
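The packet matching can be sketched as follows. The example assumes that each capture is a list of time-stamp and frame pairs without the frame check sequence, that the redundancy control trailer occupies the last six bytes of each frame, and that both analysers share a common time base. The function name and the example frames are hypothetical.

```python
RCT_LEN = 6  # assumed length of the PRP redundancy control trailer in bytes


def match_prp(capture_a, capture_b):
    """Return the time differences t_B - t_A for frames seen on both paths.

    Each capture is a list of (time-stamp in microseconds, frame bytes).
    Frames are compared without their trailers, since the trailers of the
    two copies are not identical.
    """
    index_b = {}
    for t_b, frame in capture_b:
        index_b.setdefault(frame[:-RCT_LEN], []).append(t_b)

    deltas = []
    for t_a, frame in capture_a:
        candidates = index_b.get(frame[:-RCT_LEN])
        if candidates:
            deltas.append(candidates.pop(0) - t_a)  # pair copies in order of arrival
    return deltas


if __name__ == "__main__":
    body_1 = bytes.fromhex("010ccd040001") + b"sample 0001"
    body_2 = bytes.fromhex("010ccd040001") + b"sample 0002"
    trailer_a = bytes.fromhex("002aa04088fb")  # copy sent on path A
    trailer_b = bytes.fromhex("002ab04088fb")  # copy sent on path B
    path_a = [(1000, body_1 + trailer_a), (1250, body_2 + trailer_a)]
    path_b = [(1012, body_1 + trailer_b), (1265, body_2 + trailer_b)]
    print(match_prp(path_a, path_b))
```

The spread of the resulting differences indicates how asymmetric the two network paths are. Frames whose content repeats identically would need additional care, which this sketch handles only by pairing copies in order of arrival.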

For all network topologies mentioned above, a suitable measurement method for assessing the timing behaviour is available. Figure 5 even shows that such measurements can span network segments of different technologies.

Conclusion
The application of IEC 61850 communication in PAC systems brings many new challenges, but the new options and benefits have the potential to outweigh the difficulties by far.

The engineering concept and the resulting availability of configuration information in machine readable form (SCL files) greatly support the testability of the systems. The verification of the communication on the application layer as laid out in the first half of this article is essentially facilitated by these features. This applies through the whole life cycle of a PAC system. A configuration and a test setup worked out for a FAT can then again be used during commissioning. Configuration changes that may have happened in between are immediately detected and cleaned up. If the configuration remained unchanged, this is verified and ticked off even faster. And the same configuration information serves its purpose for the supervision during operation or during maintenance work later on.

Considering that the performance and reliability of the communication networks were often questioned by sceptics, it is surprising that only little effort has been invested in verifying these important aspects until now. Commissioning the communication infrastructure on its own should become a dedicated task to ensure a solid base for the PAC system communication on top of it. Electrical power engineers with a focus on protection, automation, and control can perform such tasks easily with state-of-the-art tools.

Authors
Matthias Wehinger, innovation manager
for power utility communication products at OMICRON

Fred Steinhauser,
Head of Power Utility Communication
at OMICRON
