Engineers have typically been enthusiastic early adopters of new computer technology, especially when it comes to real-time data processing. As PC technology advanced, they applied first analog and then digital technology, and they moved from isolated computers to wired and wireless networks for real-time data communication in their process control systems. Moving to the cloud would seem like the next logical step, but as with each previous step, this evolution will require changes in thinking and communications design.
To open a dialog about what these design changes might be, we list here what we understand to be nine core requirements for a cloud system to support the flow of real-time data for industrial, embedded and financial systems. Each of these is discussed in more detail in related Real-Time Cloud blog posts.
Data Rates and Latency
1. High-speed “push” data sources. The data should be pushed out to the cloud, and then pushed to the user. Polling requires too much time and uses too much bandwidth. Push technology also enables machine-to-machine communication.
2. Publish/subscribe data delivery. In an event-driven model, a user makes a request for data one time, then gets updates whenever they occur.
3. Low-latency data transmission. The data needs to flow quickly through the system by way of an in-memory real-time database. Relational databases of the kind typically used for business systems are too slow for this.
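To make requirements 1 to 3 concrete, here is a minimal Python sketch of push-based publish/subscribe delivery through an in-memory store. The class name `InMemoryHub`, its methods, and the point name `tank1.level` are hypothetical, chosen for illustration only; a real service would add networking, threading, and security.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Callback = Callable[[str, float], None]


class InMemoryHub:
    """Hypothetical in-memory hub: holds current values and pushes changes."""

    def __init__(self) -> None:
        self._values: Dict[str, float] = {}
        self._subscribers: Dict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, point: str, callback: Callback) -> None:
        # The subscriber asks once; it never polls afterwards.
        self._subscribers[point].append(callback)
        if point in self._values:
            # Deliver the current value immediately, if one exists.
            callback(point, self._values[point])

    def publish(self, point: str, value: float) -> None:
        # Push only when the value actually changes, to save bandwidth.
        if self._values.get(point) == value:
            return
        self._values[point] = value
        for callback in self._subscribers[point]:
            callback(point, value)


received = []
hub = InMemoryHub()
hub.subscribe("tank1.level", lambda point, value: received.append((point, value)))
hub.publish("tank1.level", 41.5)
hub.publish("tank1.level", 41.5)  # unchanged value, so nothing is pushed
hub.publish("tank1.level", 42.0)
```

After this runs, `received` holds two updates, one per actual change. The data source pushes into the hub, and the hub pushes out to every subscriber, so no party has to poll.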
Reversing Client/Server Relationship to Keep Firewalls Closed
4. Reverse the client/server relationship. The typical client/server thinking is to treat an in-plant control system as a server (it is, after all, the source of the data), and the cloud service as a client to that in-plant system. This means there needs to be an open firewall port directly into the in-plant control system. Secure cloud-based systems need to reverse the client/server relationship by having the in-plant system act as the client and the cloud service act as the server, even though the in-plant system is the data source. This allows the in-plant system to stream data to the cloud service without exposing itself to the Internet.
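The reversed relationship can be sketched with plain TCP sockets in Python. In this illustration, which runs both ends on the local machine, the cloud side is the only party that listens, while the in-plant side dials out and then streams its data over that outbound connection. The function names, the `name=value` line format, and the sample points are assumptions made for the example, not part of any particular product.

```python
import socket
import threading


def cloud_service(listener: socket.socket, received: list) -> None:
    """Cloud side: the only party that accepts inbound connections."""
    conn, _ = listener.accept()
    with conn, conn.makefile("r", encoding="utf-8") as stream:
        for line in stream:
            received.append(line.strip())


def plant_client(host: str, port: int, samples: list) -> None:
    """In-plant side: opens an outbound connection, then streams data."""
    with socket.create_connection((host, port)) as sock:
        for name, value in samples:
            sock.sendall(f"{name}={value}\n".encode("utf-8"))


# Stand-in for the cloud service, bound to a free local port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
port = listener.getsockname()[1]

received: list = []
server = threading.Thread(target=cloud_service, args=(listener, received))
server.start()

# The plant is the data source but acts as the network client.
plant_client("127.0.0.1", port, [("tank1.level", 41.5), ("pump2.rpm", 1480)])
server.join(timeout=5)
listener.close()
```

The in-plant side never calls `bind` or `listen`, so nothing on the plant network has to accept connections from outside. A production system would also encrypt and authenticate the connection, which this sketch omits.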
Data-Centric Infrastructure
5. Data-centric, not web-centric, design. The data stays in its simplest format, with no HTML or XML code, for lowest possible latency.
6. Raw data access at the cloud. The raw data flows from the source, through the cloud, to the user, and gets converted to other formats (such as HTML, XML, SQL, etc.) at the last instant.
7. Multiple user types. Different users, such as web browsers, databases, spreadsheets, and machine-to-machine systems, access a single data source.
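As a rough illustration of requirements 5 to 7, the Python sketch below keeps one raw sample in its simplest form and converts it only at the point of delivery, once per user type. The `Sample` type, the converter functions, and the `samples` table are hypothetical names used for this example.

```python
import json
from html import escape
from typing import NamedTuple, Tuple


class Sample(NamedTuple):
    """Raw value as it travels through the cloud: no markup attached."""
    point: str
    value: float
    timestamp: float


def to_json(sample: Sample) -> str:
    # For a machine-to-machine consumer.
    return json.dumps(sample._asdict())


def to_html(sample: Sample) -> str:
    # For a web browser, rendered only when the page is served.
    return f"<tr><td>{escape(sample.point)}</td><td>{sample.value}</td></tr>"


def to_sql(sample: Sample) -> Tuple[str, tuple]:
    # For a database: a parameterized statement plus its values.
    statement = "INSERT INTO samples (point, value, ts) VALUES (?, ?, ?)"
    return statement, tuple(sample)


sample = Sample("tank1.level", 41.5, 1700000000.0)
html_row = to_html(sample)
json_text = to_json(sample)
sql_statement, sql_params = to_sql(sample)
```

All three outputs come from the same `sample`, so every user type sees a consistent value, and the conversion cost is paid only by the consumer that needs that format.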
Redundancy
8. Independent, hot-standby, redundant cloud systems. It should be possible to provide fully redundant data paths from inside the plant firewall to inside the client firewall that can switch over in milliseconds in case of any service outage.
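One simple way to picture hot-standby switchover is a heartbeat check on each path. The Python sketch below is an assumption-laden illustration: the `CloudPath` type, the `select_active` function, and the 50 ms timeout are all hypothetical, and a real system would measure heartbeats with a monotonic clock and handle switch-back policy as well.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CloudPath:
    name: str
    last_heartbeat: float  # time of the last heartbeat seen, in seconds


def select_active(primary: CloudPath, standby: CloudPath,
                  now: float, timeout: float = 0.05) -> Optional[CloudPath]:
    """Prefer the primary; fall back to the standby if the primary is stale."""
    if now - primary.last_heartbeat <= timeout:
        return primary
    if now - standby.last_heartbeat <= timeout:
        return standby
    return None  # both paths are down


primary = CloudPath("cloud-a", last_heartbeat=10.00)
standby = CloudPath("cloud-b", last_heartbeat=10.18)

healthy = select_active(primary, standby, now=10.02)   # primary is fresh
failover = select_active(primary, standby, now=10.20)  # primary is stale
```

Because the standby path is already connected and receiving heartbeats, switching over is just a change in which path is read, which is what makes millisecond switchover plausible.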
LAN-to-LAN via the Cloud
9. LAN-to-LAN bridging and synchronization. The system maintains a complete copy of the data set on the source LAN, and sends it across to the user LAN, continuously updating it in real time for live replication of the data on both LANs. Should the cloud communication channel be lost, local clients and servers don’t need to respond to the network failure. Individual control areas within a distributed system can continue operating as “islands of automation” until the cloud connection recovers.
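The “islands of automation” behavior can be sketched as a local mirror of the data set that keeps answering reads while the cloud link is down. The `LanMirror` class and its method names are hypothetical, and the sketch assumes the bridge resends a full snapshot on reconnect.

```python
from typing import Dict, Optional


class LanMirror:
    """Hypothetical local copy of the remote data set on the user LAN."""

    def __init__(self) -> None:
        self._data: Dict[str, float] = {}
        self.connected = False

    def on_connect(self, snapshot: Dict[str, float]) -> None:
        # On (re)connection, replace the mirror with a complete snapshot.
        self._data = dict(snapshot)
        self.connected = True

    def on_update(self, point: str, value: float) -> None:
        # Live updates keep the mirror in step with the source LAN.
        self._data[point] = value

    def on_disconnect(self) -> None:
        # Keep the last known values; local clients are not interrupted.
        self.connected = False

    def read(self, point: str) -> Optional[float]:
        # Reads are always served locally, whatever the link state.
        return self._data.get(point)


mirror = LanMirror()
mirror.on_connect({"tank1.level": 41.5})
mirror.on_update("tank1.level", 42.0)
mirror.on_disconnect()
during_outage = mirror.read("tank1.level")

mirror.on_connect({"tank1.level": 43.0})
after_recovery = mirror.read("tank1.level")
```

Local clients only ever talk to the mirror, so a lost cloud connection shows up as a changed `connected` flag rather than as a failed read.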
Taken together, we feel these nine core requirements are needed to support a robust real-time cloud system. These are, of course, in addition to the need to find a good cloud hosting service that provides a favourable service level agreement (SLA), solid performance, and security.
I think the “Data Rates and Latency” aspects are critical enablers in a “real-time” context. Amazon’s AWS Direct Connect service, which offers dedicated and secure 10 Gbps network connections, would be very interesting to explore from this perspective in combination with other services like Amazon SQS and SNS.
Hi Sankar,
Of course, the faster your network connection, the better your system will run. We must assume a robust infrastructure to support any kind of real-time data flow. With that in place, you can look at the kind of services and middleware that offer the best performance.
Amazon SNS (Simple Notification Service) was designed to send notification messages to a number of subscribers over channels such as email, HTTP, and SMS, and Amazon SQS (Simple Queue Service) is a general-purpose message queue. Neither is a data-centric design, so they may not be capable of large-volume, high-speed, automated transmission of raw data values from financial, industrial, or embedded sources.
Best regards,
Bob