One of the bright promises of real-time cloud computing should be redundancy. After all, people expect the cloud to give them 99.99% uptime or more. That’s a primary benefit. At the same time, many real-time control applications require robust, redundant solutions. So it’s a perfect match. The only real question is implementation: how do you create a redundant real-time system in the cloud?
Just to review, redundancy in a real-time control system means that some or all of the system is duplicated. The goal is to eliminate, as much as possible, any single point of failure. When a piece of equipment or a communication link goes down, a similar or identical component is ready to take over. There are three main kinds of redundancy: cold-standby, warm-standby, and hot-standby. Among these, hot-standby is the most valuable, as it ensures that the switchover from the failed system to the redundant system takes place very quickly, often in milliseconds.
Achieving hot-standby redundancy for a real-time data stream in a cloud system requires a unique approach. The conventional model of cloud computing is not suitable, because in that model servers act as end-points for data and offer formatted output, such as HTML, on request. As we discussed last week, a real-time cloud system needs to be data centric, allowing the raw data to flow quickly through the system and get converted to the desired format (HTML, XML, SQL, etc.) only when it reaches its final destination.
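To make the data-centric idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical and for illustration only: the `DataPoint` type, the `relay` function standing in for a hop through the cloud, and the two formatters. The point is simply that the relay forwards the raw value untouched, and formatting happens once, at the destination.

```python
from html import escape
from typing import Callable, Dict, NamedTuple


class DataPoint(NamedTuple):
    """A raw real-time value (hypothetical structure for this sketch)."""
    name: str
    value: float
    timestamp: float


def relay(point: DataPoint) -> DataPoint:
    """Stand-in for a cloud hop: forwards the raw point without formatting it."""
    return point


def to_html(point: DataPoint) -> str:
    """Format a point as an HTML table row, escaping the name."""
    return f"<tr><td>{escape(point.name)}</td><td>{point.value}</td></tr>"


def to_xml(point: DataPoint) -> str:
    """Format a point as an XML element, escaping the name for attribute use."""
    return f'<point name="{escape(point.name, quote=True)}" value="{point.value}"/>'


# Formatters are chosen by the final destination, not by the relay.
FORMATTERS: Dict[str, Callable[[DataPoint], str]] = {
    "html": to_html,
    "xml": to_xml,
}


def deliver(point: DataPoint, fmt: str) -> str:
    """Pass the raw point through the relay, then format it at the end point."""
    raw = relay(point)
    return FORMATTERS[fmt](raw)
```

Because the relay never touches the format, the same raw stream could feed a web page, an XML consumer, or a database writer without the cloud server having to know which.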
At the same time, your typical process control system architecture won’t really do the job properly either. Factory automation redundancy on a LAN is provided by “proxy” applications that maintain connections to redundant data paths and select the best path to feed data to the downstream client. This is important but not enough for cloud computing. The weak link in a cloud system is often the network connection to the cloud server, or the host of the cloud server itself. For example, consider the recent outage of the Amazon EC2 cloud, and the disruption that caused.
What is needed for a truly redundant, real-time cloud system is the ability to provide fully redundant data paths from inside the plant firewall to inside the client firewall. A user should be able to configure these identical data streams across two completely different cloud computing service providers. Thus, even if one whole cloud system goes down, you can function through a completely independent data path and continue accessing your data. If the system supports hot-standby redundancy, the switchover should be virtually seamless.
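As a rough illustration of hot-standby switchover between two independent data paths, consider the Python sketch below. It is not taken from any particular middleware product; the class name `HotStandbySelector`, the path labels, and the 0.5-second staleness timeout are all assumptions made for the example. Both paths stay connected and deliver the same stream; the client forwards data from the active path only, and switches to the other as soon as the active one has been silent longer than the timeout. For example, with paths labeled "cloud_a" and "cloud_b", if "cloud_a" stops delivering, the first "cloud_b" sample arriving after the timeout triggers the switch and is itself forwarded.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Sample:
    path: str         # label of the data path that delivered this value
    value: float
    timestamp: float  # arrival time in seconds


class HotStandbySelector:
    """Keeps two live paths and forwards data from whichever is active.

    Hypothetical sketch: a path counts as alive if it delivered a sample
    within `timeout` seconds (an assumed threshold).
    """

    def __init__(self, primary: str, standby: str, timeout: float = 0.5):
        self.primary = primary
        self.standby = standby
        self.timeout = timeout
        self.active = primary
        self.last_seen: Dict[str, Optional[float]] = {primary: None, standby: None}

    def _is_alive(self, path: str, now: float) -> bool:
        seen = self.last_seen[path]
        return seen is not None and (now - seen) <= self.timeout

    def check(self, now: float) -> str:
        """Switch to the other path if the active one is stale and the other is alive."""
        if not self._is_alive(self.active, now):
            other = self.standby if self.active == self.primary else self.primary
            if self._is_alive(other, now):
                self.active = other
        return self.active

    def on_sample(self, sample: Sample) -> Optional[Sample]:
        """Record the arrival; return the sample only if it came from the active path."""
        self.last_seen[sample.path] = sample.timestamp
        self.check(sample.timestamp)
        return sample if sample.path == self.active else None
```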
Is this actually possible? Yes, using properly designed middleware. Of course, achieving hot-standby redundancy on the cloud also depends on fast data rates, low latency, and a data-centric infrastructure. But implemented well, redundancy can become an integral part of a robust, cloud-based real-time system. Such a system can, in turn, support LAN-to-LAN bridging and synchronization, which we will discuss next week.
